(5/5)

Hire Me
(5/5)

Hire Me
(5/5)

Hire Me
(5/5)

Hire Me
(5/5)

# you\'re an analyst at Crankshaft List. Hundreds of free advertisements for vehicles are published on your site every day.

INSTRUCTIONS TO CANDIDATES

Project description: You're an analyst at Crankshaft List. Hundreds of free advertisements for vehicles are published on your site every day. You need to study data collected over the last few years and determine which factors influence the price of a vehicle.

Instructions for completing the project

Step 1. Open the data file and study the general information  vehicles_us.csv

Step 2. Data preprocessing

Identify and study missing values:

In some cases, there's an obvious way to replace missing values. For instance, if a Boolean field contains only True values, it's reasonable to assume that the missing values are False. There aren't such obvious fixes for other data types, and there are cases when the fact that a value is missing is significant. In such instances, don't fill in the values.

When appropriate, do fill in the values. Explain why you chose to do so and how you selected the replacement values.

Describe the factors that may have resulted in missing values.

— Convert the data to the required types:

Indicate the columns where the data types need to be changed and explain why.

Step 3. Calculate and add to the table the following:

Day of the week, month, and year the ad was placed

The vehicle's age (in years) when the ad was placed

The vehicle's average mileage per year

In the condition column, replace string values with a numeric scale:

new = 5

like new = 4

excellent = 3

good = 2

fair = 1

salvage = 0

Step 4. Carry out exploratory data analysis, following the instructions below:

Study the following parameters: price, vehicle's age when the ad was placed, mileage, number of cylinders, and condition. Plot histograms for each of these parameters. Study how outliers affect the form and readability of the histograms.

Determine the upper limits of outliers, remove the outliers and store them in a separate DataFrame, and continue your work with the filtered data.

Use the filtered data to plot new histograms. Compare them with the earlier histograms (the ones that included outliers). Draw conclusions for each histogram.

Study how many days advertisements were displayed (days_listed). Plot a histogram. Calculate the mean and median. Describe the typical lifetime of an ad. Determine when ads were removed quickly, and when they were listed for an abnormally long time.

Analyze the number of ads and the average price for each type of vehicle. Select the two types with the greatest number of ads. Plot a graph showing the dependence of the number of ads on the vehicle type.

What factors impact the price most? Take each popular type you detected at the previous stage and study whether the price depends on age, mileage, condition, transmission type, and color. For categorical variables (transmission type and color), plot box-and-whisker charts and create scatterplots for the rest. When analyzing categorical variables, note that the categories must have at least 50 ads; otherwise, their parameters won't be valid for analysis.

Step 5: Create a Regression model which can predict the resale value of the vehicle. Explain the whole process.

Step 5. Write an overall conclusion. Prepare PPT and Sumbit it on Team.

Description of the data

The dataset contains the following fields:

price

model_year

model

condition

cylinders

fuel — gas, diesel, etc.

odometer — the vehicle's mileage when the ad was published

transmission

paint_color

is_4wd — whether the vehicle has 4-wheel drive (Boolean type)

date_posted — the date the ad was published

days_listed — from publication to removal

How will your project be evaluated?

Here’s what project reviewers look for when assessing your project:

How you explain the problems identified in the data

Which methods you used to process missing values

How you use data slices

Which methods you use to plot graphs

Whether or not you automated graph plotting

Whether you calculate correlation figures for the data and how you explain them

Whether you follow the project structure and keep the code tidy

Whether you leave comments at each step

(5/5)

## Related Questions

##### . The fundamental operations of create, read, update, and delete (CRUD) in either Python or Java

CS 340 Milestone One Guidelines and Rubric  Overview: For this assignment, you will implement the fundamental operations of create, read, update,

##### . Develop a program to emulate a purchase transaction at a retail store. This  program will have two classes, a LineItem class and a Transaction class

Retail Transaction Programming Project  Project Requirements:  Develop a program to emulate a purchase transaction at a retail store. This

##### . The following program contains five errors. Identify the errors and fix them

7COM1028   Secure Systems Programming   Referral Coursework: Secure

##### . Accepts the following from a user: Item Name Item Quantity Item Price Allows the user to create a file to store the sales receipt contents

Create a GUI program that:Accepts the following from a user:Item NameItem QuantityItem PriceAllows the user to create a file to store the sales receip

##### . The final project will encompass developing a web service using a software stack and implementing an industry-standard interface. Regardless of whether you choose to pursue application development goals as a pure developer or as a software engineer

CS 340 Final Project Guidelines and Rubric  Overview The final project will encompass developing a web service using a software stack and impleme

Get Free Quote!

386 Experts Online