# You are required to predict the used car price by using linear regression and write a report on the process.

INSTRUCTIONS TO CANDIDATES

4KG3 Assignment 1 – Linear regression for prediction

Total Marks: 15 points

You are required to predict the used car price by using linear regression and write a report on the process. The data set is ToyotaCorolla1000.jmp on Avenue. It contains 1000 records with 11 variables, including Price, Age, KM, HP and other specifications. The goal is to build a linear regression model to predict the price. Watch the assignment-1 video posted on Avenue for the steps. The objective of this assignment is to demonstrate that (i) you know how to use JMP Pro to carry out linear regression, (ii) you have an adequate understanding of the associated concepts and (iii) you understand its business value.

Objectives:

Learn how to use linear regression for prediction.

1. Using data visualization for initial variable investigation and selection (20%)

(a) What are the distributions for the dependent and independent variables?

(b) How do they relate to each other?

(c) What appears to be the three or four most important car specifications for predicting the used car price?
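
As an optional cross-check outside JMP Pro, the strength of the linear relationship between two variables in question 1(b) can be summarized with a Pearson correlation coefficient. The sketch below is a minimal, hedged illustration in plain Python. The `pearson` helper and the `age` and `price` values are made up for this example and are not taken from ToyotaCorolla1000.jmp.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Made-up values for illustration only, not taken from the assignment data.
age = [10, 20, 30, 40, 50, 60]
price = [18000, 16500, 14000, 12500, 10000, 9000]

r_age_price = pearson(age, price)
print(round(r_age_price, 3))
```

A value close to -1 or +1 suggests a strong linear relationship, while a value near 0 suggests a weak one. A scatterplot should still be inspected, because a correlation coefficient can hide nonlinear patterns.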

2. Data preparation and partition for training, validation, and testing (10%)

(a) Why do we need to convert fuel type to dummy variables?

(b) Why do we need to do data partition?
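
For intuition on question 2, the following sketch shows what dummy coding and random partitioning amount to when written out by hand. It is an illustration under stated assumptions, not the procedure JMP Pro uses internally: the fuel type levels (`Petrol`, `Diesel`, `CNG`), the choice of `Petrol` as the baseline, the 50/30/20 split, and the helper names `dummy_code` and `partition` are all assumptions made for this example.

```python
import random

def dummy_code(values, baseline):
    """Turn a categorical column into 0/1 columns, one per non-baseline level."""
    levels = sorted(set(values) - {baseline})
    return {f"Fuel_{lvl}": [1 if v == lvl else 0 for v in values] for lvl in levels}

def partition(n_rows, fractions=(0.5, 0.3, 0.2), seed=1):
    """Randomly split row indices into training, validation, and test sets."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)  # fixed seed so the split is reproducible
    n_train = round(fractions[0] * n_rows)
    n_valid = round(fractions[1] * n_rows)
    return idx[:n_train], idx[n_train:n_train + n_valid], idx[n_train + n_valid:]

# Made-up sample values; the level names are assumed, not read from the data set.
fuel = ["Petrol", "Diesel", "CNG", "Petrol", "Diesel"]
dummies = dummy_code(fuel, baseline="Petrol")

# Assumed 50/30/20 split of the 1000 records.
train, valid, test = partition(1000)
print(dummies)
print(len(train), len(valid), len(test))
```

One level is left out as the baseline because a full set of dummy columns would always sum to 1 and duplicate the intercept.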

3. Run a linear regression with all available variables (30%)

(a) What is the mathematical formula of the regression model obtained?

(b) How do you calculate the predicted price and the prediction error (residual) for each record?

(c) Show the error distributions for the training, validation and testing sets. What are the differences between them?
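
The calculation asked for in question 3(b) can be sketched in a few lines: the predicted price is the intercept plus the sum of each coefficient times the corresponding variable value, and the residual is the actual price minus the predicted price. The coefficients below are hypothetical placeholders chosen for round numbers, not estimates from the assignment data, and only three of the variables are used to keep the example short.

```python
# Hypothetical coefficients for illustration only; use the estimates from your own fit.
INTERCEPT = 10000.0
COEFS = {"Age": -100.0, "KM": -0.01, "HP": 20.0}

def predict(record):
    """Predicted price = intercept + sum of coefficient * variable value."""
    return INTERCEPT + sum(b * record[name] for name, b in COEFS.items())

def residual(record):
    """Prediction error = actual price - predicted price."""
    return record["Price"] - predict(record)

def rmse(records):
    """Root mean squared error over a set of records (e.g. one partition)."""
    return (sum(residual(r) ** 2 for r in records) / len(records)) ** 0.5

# A made-up record, not a row from ToyotaCorolla1000.jmp.
car = {"Price": 9500.0, "Age": 30, "KM": 50000, "HP": 90}
print(predict(car), residual(car))
```

Applying `rmse` separately to the training, validation, and testing records gives one number per partition that can be compared alongside the residual histograms.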

4. Automated variable selection (30%)

(a) What methods have you used for variable selection?

(b) What is the best set of variables you will use? What are the criteria used for selection?
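
To make the idea behind automated variable selection concrete, here is a minimal sketch of forward selection that adds one variable at a time and stops when the validation RMSE no longer improves. This is only one possible approach and one possible criterion. It is not necessarily what JMP Pro does, and the stopping rules offered there may differ. The data are synthetic (price depends on `age` and `km`, while `noise` is irrelevant), and all function names are hypothetical.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_ols(rows, y, names):
    """Least-squares coefficients (intercept first) via the normal equations."""
    x = [[1.0] + [row[name] for name in names] for row in rows]
    k = len(names) + 1
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)

def rmse(rows, y, names, beta):
    """Root mean squared prediction error of the fitted model on the given rows."""
    errors = [t - (beta[0] + sum(b * row[name] for b, name in zip(beta[1:], names)))
              for row, t in zip(rows, y)]
    return (sum(e * e for e in errors) / len(errors)) ** 0.5

def forward_select(train, y_train, valid, y_valid, candidates):
    """Add the variable that most lowers validation RMSE until nothing improves."""
    chosen, best = [], float("inf")
    while True:
        scores = {}
        for name in candidates:
            if name not in chosen:
                trial = chosen + [name]
                scores[name] = rmse(valid, y_valid, trial, fit_ols(train, y_train, trial))
        if not scores:
            break
        name = min(scores, key=scores.get)
        if scores[name] >= best:
            break
        chosen.append(name)
        best = scores[name]
    return chosen, best

# Synthetic data: km is in thousands; "noise" has no effect on price.
rng = random.Random(0)

def make_rows(n):
    rows, prices = [], []
    for _ in range(n):
        row = {"age": rng.uniform(1, 80), "km": rng.uniform(0, 200), "noise": rng.gauss(0, 1)}
        rows.append(row)
        prices.append(20000 - 150 * row["age"] - 40 * row["km"] + rng.gauss(0, 500))
    return rows, prices

train, y_train = make_rows(300)
valid, y_valid = make_rows(200)
chosen, best_rmse = forward_select(train, y_train, valid, y_valid, ["noise", "km", "age"])
print(chosen, round(best_rmse, 1))
```

The selection is scored on the validation set rather than the training set, because training error tends to keep falling as variables are added even when they carry no real signal.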

5. Possible real-world applications (10%)

(a) What is the potential business value for used car price prediction? For instance, you may consider its possible use for used car sellers or replacement cost evaluation for insurance companies.

(b) Search the web for potential uses of linear regression in other domains, such as finance and healthcare.

6. Report how much time you spent on this assignment, what problems you faced and what you learned.

Procedures:

1. Access JMP Pro from Vlab or install it on your PC.

2. Download and save the ToyotaCorolla1000.jmp from Avenue and open it with JMP Pro.

3. Data visualization and exploration.

Use Analyze > Distribution to view the distribution and basic statistics of each variable.
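
For reference, the kind of basic statistics reported for a numeric variable can be reproduced by hand as a sanity check. The sketch below uses Python's standard `statistics` module on a handful of made-up prices (not values from ToyotaCorolla1000.jmp). The `summarize` helper is hypothetical, and quartile values may differ slightly from JMP Pro's output because quantile conventions vary between tools.

```python
import statistics

def summarize(values):
    """Basic descriptive statistics for one numeric variable."""
    q1, median, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }

# Made-up prices for illustration only.
prices = [7500, 8950, 9900, 10500, 11250, 12950, 13500, 18950]
summary = summarize(prices)
print(summary)
```

A mean noticeably above the median, as in this made-up sample, is a typical sign of a right-skewed distribution.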

