This assignment relates to motor insurance claim data. In every tutorial class we will spend some time working on this assignment and at the end of week 5 you will be expected to submit your part 1 assignment. There are two data sets used for this assignment.
The data file motor20pct.csv holds data for more than 73000 policies, together with the claims associated with them in a particular year. The variables recorded for each policy in this data set are explained below:
CAR_AGE measures the age of the insured car in years
DRIVERS measures the number of people who are specified as designated drivers
EXPOSURE measures the fraction of the year for which the policy was active
MILEAGE measures the expected mileage travelled in a single year
PRIMAGE gives the age of the primary driver in years
TOTAL gives the total amount claimed on the policy in the year
EXCESS = 0, 75 or 100 indicating the excess claim amount associated with each policy. The insurance company will not pay out claims below this excess amount.
USAGE specifies how the car is used (S=only social, SB=strictly business, SC=social and business, ST=social and taxi)
CLAIM=1 if there was at least one claim during the year, 0 otherwise.
Create a variable called CatClaim set equal to “Yes” when CLAIM=1 and “No” for CLAIM=0.
You will be working with a random sample of 10000 of these policies. Instructions for generating this sample are provided in Tutorial 1.
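The CatClaim variable and the 10000-policy sample can be sketched in a few lines of Python. This is only an illustration and does not replace the Tutorial 1 instructions: it assumes the CSV column names match the variable names listed above, and the seed value and the function names `add_cat_claim`, `sample_policies` and `load_policies` are made up for this example.

```python
import csv
import random


def add_cat_claim(row):
    """Return a copy of a policy row with CatClaim derived from CLAIM."""
    labelled = dict(row)
    # CLAIM is read from the CSV as text, so convert it before comparing.
    labelled["CatClaim"] = "Yes" if int(float(row["CLAIM"])) == 1 else "No"
    return labelled


def sample_policies(rows, n=10000, seed=42):
    """Draw a reproducible simple random sample of n rows without replacement.

    The seed is an arbitrary assumption; fixing it makes the sample repeatable.
    """
    if n >= len(rows):
        return list(rows)
    return random.Random(seed).sample(rows, n)


def load_policies(path):
    """Read the policy file (assumed to have a header row) and add CatClaim."""
    with open(path, newline="") as handle:
        return [add_cat_claim(row) for row in csv.DictReader(handle)]


# Hypothetical usage, assuming the file is in the working directory:
# policies = sample_policies(load_policies("motor20pct.csv"))
```

Fixing the seed means the same 10000 policies are drawn on every run, which keeps tables and graphs consistent between tutorial sessions.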
Suggest a list of questions that could be answered using this data in your assignment. Consider the CLAIM variable as a possible TARGET variable in your models and the total claim amount as a possible RISK variable, reflecting the risk associated with any claim.
Instructions for this question are provided in Tutorials 2 and 3.
Summarise your data using descriptive statistics and graphs. Some suggestions are provided below. All tables and graphs must be numbered/labelled and discussed/interpreted.
i) Produce summary statistics for your data
ii) Boxplots for numeric input variables for claim categories
iii) Pairs plot for all numeric input variables
iv) Correlation Plot for all numeric input variables
v) Hierarchical Correlation Plot for all numeric input variables
vi) Bar charts for the categorical variables Usage and the claim variable
vii) Other exciting plots
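As a rough illustration of the summary statistics behind the boxplots by claim category, the sketch below computes the five-number summary and the mean of one numeric variable for each CatClaim group. It assumes each policy is a dictionary that already contains a CatClaim entry, and the function name `summarise_by_claim` is hypothetical. The tutorials may use a different tool and a different quartile convention, so the quartiles here need not match other software exactly.

```python
import statistics
from collections import defaultdict


def summarise_by_claim(rows, variable):
    """Five-number summary and mean of `variable` for each CatClaim group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["CatClaim"]].append(float(row[variable]))

    summary = {}
    for label, values in groups.items():
        # statistics.quantiles needs at least two observations per group.
        q1, median, q3 = statistics.quantiles(values, n=4)
        summary[label] = {
            "n": len(values),
            "min": min(values),
            "q1": q1,
            "median": median,
            "q3": q3,
            "max": max(values),
            "mean": statistics.mean(values),
        }
    return summary


# Hypothetical usage on the sampled policies:
# print(summarise_by_claim(policies, "CAR_AGE"))
```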
Tutorial 5 provides guidelines for this question
Partition your data with 70% for training, 15% for validation and 15% for testing. Number and label all your tables and graphs and discuss/interpret the results.
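The 70/15/15 partition can be sketched as a shuffle followed by two cuts. This is a minimal illustration and not the Tutorial 5 procedure: the seed and the function name `partition` are assumptions made for the example.

```python
import random


def partition(rows, train=0.70, validate=0.15, seed=42):
    """Split rows into training, validation and test sets.

    The test set receives whatever remains after the first two cuts,
    so the three parts always cover the data exactly once.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # arbitrary seed, for repeatability
    n_train = round(len(shuffled) * train)
    n_valid = round(len(shuffled) * validate)
    training = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_valid]
    testing = shuffled[n_train + n_valid:]
    return training, validation, testing


# Hypothetical usage on the 10000 sampled policies:
# training, validation, testing = partition(policies)
```

With 10000 policies this gives 7000 rows for training and 1500 each for validation and testing.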
a) Produce a Tree to predict CatClaim. Then Draw your tree and ask for the Rules.
b) How do the results change when you re-run your tree assuming a loss matrix in which the loss for a false positive (predicting CatClaim=“Yes” when there was no claim) is half the loss for a false negative (predicting CatClaim=“No” when there was a claim)?
c) How do your results change when you re-run your tree assuming priors of 20% for CatClaim = Yes and 80% for CatClaim = No? These were the percentages for the original data file “motor20pct”.
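To see why the loss matrix in part b) can change the results, consider how a single leaf of the tree might be labelled. The sketch below picks the label with the smaller expected loss, using a loss of 1 for a false positive and 2 for a false negative, which matches the "half as big" ratio. It is a simplified illustration of the labelling step only; the tree software used in the tutorials may also use the losses when choosing splits, and the function name `predict_with_losses` is hypothetical.

```python
def predict_with_losses(p_yes, loss_fp=1.0, loss_fn=2.0):
    """Label a leaf by minimum expected loss.

    p_yes   : estimated proportion of CatClaim = "Yes" in the leaf
    loss_fp : loss when "Yes" is predicted but there was no claim
    loss_fn : loss when "No" is predicted but there was a claim
    """
    expected_loss_if_yes = (1.0 - p_yes) * loss_fp  # wrong when truth is "No"
    expected_loss_if_no = p_yes * loss_fn           # wrong when truth is "Yes"
    return "Yes" if expected_loss_if_yes < expected_loss_if_no else "No"


# With equal losses a leaf needs more than 50% claims to be labelled "Yes".
# With loss_fp=1 and loss_fn=2 the cut-off falls to 1/(1+2), about 33%.
print(predict_with_losses(0.40))                # "Yes" under the 1:2 losses
print(predict_with_losses(0.40, loss_fn=1.0))   # "No" under equal losses
```

Because false negatives cost more, leaves with a claim rate between one third and one half switch from “No” to “Yes”, so the tree is expected to flag more policies as likely claims.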
Description for the second data set
For this question consider the data set MBAmotor2.csv which was created using MBAmotor.csv. This file tells us what type of claim was posted by each of the policy holders during the year. There is at least one type of claim for all these policies.
WSCLMS = WS for windshield claims
ADCLMS = AD for accidental damage
FTCLMS = FT for fire or theft
PDCLMS = PD for personal damage claims
PICLMS = PI for personal injury claims
Tutorial 4 provides guidelines for this question.
Conduct an association analysis using these data and discuss your results. In particular, you should define the terms support and confidence and determine the strongest and the most common associations between the above types of claim. Number and label all tables and figures and discuss/interpret the results.
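As a reminder of the two terms, the support of a set of claim types is the proportion of policies that contain all of them, and the confidence of a rule A → B is the support of A and B together divided by the support of A. The sketch below computes both for a rule. The four policies in it are invented purely for illustration and are not taken from MBAmotor2.csv; the function names are also hypothetical.

```python
def support(transactions, items):
    """Proportion of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        raise ValueError("antecedent never occurs, confidence is undefined")
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base


# Invented example: each set holds the claim types made on one policy.
toy_policies = [
    {"WS"},
    {"WS", "AD"},
    {"AD", "PD"},
    {"AD", "PD", "PI"},
]

print(support(toy_policies, {"AD", "PD"}))       # 2 of 4 policies = 0.5
print(confidence(toy_policies, {"PD"}, {"AD"}))  # 0.5 / 0.5 = 1.0
print(confidence(toy_policies, {"AD"}, {"PD"}))  # 0.5 / 0.75, about 0.667
```

In this toy data the rule PD → AD has higher confidence than AD → PD even though both share the same support, which is the distinction between the strongest and the most common associations.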