R Programming
In many businesses, identifying which customers will make a purchase (and when) is a critical exercise

INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS

1  Online retail sales prediction

 

In many businesses, identifying which customers will make a purchase (and when) is a critical exercise. This is true for both brick-and-mortar outlets and online stores.

The data provided in this assignment is website traffic data acquired from an online retailer and describes customers' website visit behavior. Customers may visit the store multiple times, on multiple days, with or without making a purchase.

Your goal is to predict how much sales revenue can be expected from each customer. The variable revenue lists the amount of money that a customer spends on a given visit. The quantity to predict is how much money a customer will spend, in total, across all visits to the website during the allotted one-year time frame (August 2016 to August 2017).

More specifically, you will need to predict a transformation of the aggregate customer-level sales value based on the natural log. That is, if customer i has k_i revenue transactions, then you should compute:

 

$$\text{custRevenue}_i = \sum_{j=1}^{k_i} \text{revenue}_{ij} \quad \forall i \in \text{customers}$$

 

And then transform this variable as follows:

$$\text{targetRevenue}_i = \ln(\text{custRevenue}_i + 1) \quad \forall i \in \text{customers}$$
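
As a minimal sketch of the two formulas above, written in Python and assuming the visit-level records arrive as (customer id, revenue) pairs: the names `visits` and `target_revenue`, the sample values, and the treatment of a missing revenue as zero are illustrative assumptions, not part of the assignment data.

```python
import math
from collections import defaultdict

def target_revenue(visits):
    """Sum revenue per customer, then apply ln(total + 1)."""
    totals = defaultdict(float)
    for cust_id, revenue in visits:
        # Assumption: a missing revenue (None) means no purchase on that visit.
        totals[cust_id] += revenue or 0.0
    # log1p(x) computes ln(1 + x) and stays accurate for small totals.
    return {cust_id: math.log1p(total) for cust_id, total in totals.items()}

# Hypothetical visit-level records: (customer id, revenue on that visit).
visits = [("c1", 10.0), ("c1", 5.5), ("c2", None), ("c3", 0.0), ("c3", 99.0)]
targets = target_revenue(visits)
```

Adding 1 before taking the log keeps customers with no purchases well defined: their total of 0 maps to a target of 0.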

You will be evaluated on how well you predict the target revenue on a test data set available on Kaggle.com (see the Canvas assignment page for the private competition URL).

 

(a) (50 points) Preparation and modeling.

i. (10 points) Data understanding. Generate a Data Quality Report. Also, choose at least two meaningful visualizations and/or analyses and explain their relevance.

ii. (10 points)  Data preparation.   Choose two of the most critical data preparation actions you took and explain the reasoning for these actions.

iii. (20 points) Modeling. Build an OLS model and 3 or more regression variant models (these may include robust regression, PLS, PCR, ridge regression, LASSO, elastic net, MARS, or SVR) and summarize their performance in a table (as shown in Table 1). Clearly state your resampling approach. Note: You may combine models, techniques, etc.

iv. (10 points) Debrief. For your best predictions, describe your approach. For example: Did you examine interactions? Did you use any type of model stacking? What was your secret sauce? Did you have any problems during the modeling process? If so, how did you overcome them?
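
One way to make the resampling approach in item iii explicit is k-fold cross-validation, reporting the held-out RMSE of each model in a small table. The Python sketch below is illustrative only: it uses a single hypothetical predictor, synthetic data, and a closed-form ridge fit (with OLS as the special case of zero penalty). The function names, fold count, and penalty value are assumptions, not the course's prescribed method.

```python
import math

def fit_ridge_1d(xs, ys, lam):
    """Closed-form one-predictor ridge fit; lam = 0 gives OLS.
    The intercept is left unpenalized by centering x and y."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / (sxx + lam)
    return slope, y_bar - slope * x_bar

def cv_rmse(xs, ys, lam, k=5):
    """k-fold cross-validated RMSE; fold membership is index modulo k."""
    sq_err = 0.0
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        test = [i for i in range(len(xs)) if i % k == fold]
        slope, intercept = fit_ridge_1d([xs[i] for i in train],
                                        [ys[i] for i in train], lam)
        sq_err += sum((ys[i] - (intercept + slope * xs[i])) ** 2 for i in test)
    return math.sqrt(sq_err / len(xs))

# Synthetic, exactly linear data (hypothetical): y = 2x + 1.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]

# Performance summary: one row per model, scored on held-out folds only.
results = {"OLS": cv_rmse(xs, ys, lam=0.0),
           "Ridge (lam=50)": cv_rmse(xs, ys, lam=50.0)}
for name, rmse in results.items():
    print(f"{name:<16} CV RMSE = {rmse:.4f}")
```

The folds here are deterministic to keep the example reproducible; with real data, rows would normally be shuffled (with a fixed seed) before assigning folds, and every model in the table should be scored on the same folds.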

(b) (50 points) Competition modeling.

       Upload your predictions to the Kaggle website and check the predictive performance on the “Public Leaderboard”.

       You may submit multiple times throughout the competition; however, there is a limit to the number of submissions per day.

       Score is based on ranked performance on the “Private Leaderboard”; extra credit is possible.

       All modeling approaches covered in lecture can be used (OLS, robust methods, dimension reduction methods, penalized methods, MARS, SVR, PCA, LDA, k-NN, t-SNE, transformations, missing value imputation, etc.). To be fair, approaches not yet discussed in detail (e.g., tree-based models, neural-network-based models, clustering) are not allowed at this time.

       You must outperform the benchmark model to receive any credit.

       May the odds be ever in your favor.

 

