I. Purpose
The purpose of this assignment is to familiarize students with all phases of predictive modelling. You have been hired by SuperApp, a fictional supermarket company, as a Data Analyst to help set up its marketing strategy for a new line of products. Your task is to analyse existing customer data and discover which customers are likely to purchase these products.
In particular, in this assignment, you will:
· Prepare a dataset for analysis purposes;
· Explore the data and understand the dataset and its main dimensions by highlighting key findings;
· Analyse the data using a range of predictive analytical techniques to reveal important insights and perhaps hidden patterns;
· Create a comprehensive business report that encompasses the key findings of all aforementioned parts.
II. Requirements
SuperApp is a supermarket that is offering a new line of products. The supermarket's management wants to determine which customers are likely to purchase these products. As an initial buyer incentive plan, the supermarket has provided coupons for the new line of products to all of the loyalty program participants and has collected data about whether these customers have purchased any related products recently.
In particular, the management of the supermarket has created a dataset that includes variables on customer demographics, loyalty status, and product purchases. The variables in the dataset are shown below with their default roles and levels.
Name | Description |
CustomerID | Customer loyalty identification number |
ProsperityClass | Prosperity class on a scale from 1 to 100, 100=highest prosperity class |
Age | Customer Age, in years |
ResType | Type of residential neighbourhood |
Gender | Customer Gender |
District | District |
TVReg | Television region |
CardClass | Loyalty status: Bronze, Silver, Gold, or Platinum |
AmountSpent | Total amount spent |
CustomerRetention | Total months as a customer |
CountProducts | Number of products purchased |
Target | Purchased new line of products recently: 1 = Yes, 0 = No |
The dataset above contains more than 22K observations. The data is aggregated at customer level, and each record is a consolidated view of all attributes/dimensions from a star schema. These attributes describe customer demographics, customer loyalty and purchases.
You are required to prepare, explore and analyse the data using multiple predictive modelling techniques, and create a business report that will summarize your findings. SAS Enterprise Miner will be used for all aspects of data preparation, exploration and analysis. Microsoft Word will be used for the compilation of the comprehensive report. In particular, you are required to complete the following parts:
Part | Description | Details | Requirements |
A | Business Report | A formal business report containing a cover page and an executive summary that summarizes the results of the analysis (i.e., the results of Parts B, C and D). | |
B | Data Preparation and Exploration | A document detailing the steps of the data preparation phase. | Tasks DPE1-DPE7 |
C | Predictive Modelling | A document describing the analysis of the data using a range of descriptive, predictive and prescriptive analytical techniques, which reveals important insights and perhaps hidden patterns. | Tasks PM1-PM5 |
D | Model Evaluation | A report that assesses the structure, performance, and resilience of the predictive models used in Part C. | Tasks ME1-ME2 |
E | Presentation | A critical discussion comparing the predictive modelling techniques used in Part C, including their advantages and disadvantages. | |
Tasks for Part B: Data Preparation and Exploration
DPE1. Create the data source and place it into a new diagram.
DPE2. Adjust the role and level of each variable. Justify your decisions for each variable.
DPE3. There are two target variables. Discuss how these can be used for predictive modelling. Discuss whether AmountSpent should be used as an input to a model that predicts Target.
DPE4. Discuss the distribution of the Target variable. Provide insight into the correlations of Target with the other attributes.
DPE5. Attach the StatExplore tool to the data source. Discuss the results with regard to Missing Values and Imputation.
DPE6. Partition the data source for Training 50% and Validation 50%.
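For illustration only (SAS Enterprise Miner remains the required tool), the 50/50 partition in DPE6 can be sketched in plain Python. The sketch assumes a stratified split on Target, so that both partitions keep roughly the same proportion of buyers; the function name, the seed, and the toy data are hypothetical and are not part of the SuperApp dataset.

```python
import random
from collections import defaultdict


def stratified_split(rows, target_key="Target", train_fraction=0.5, seed=42):
    """Split rows into training and validation lists, keeping the Target ratio similar."""
    rng = random.Random(seed)  # fixed seed so the partition is reproducible
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[target_key]].append(row)

    train, valid = [], []
    for members in by_class.values():
        shuffled = members[:]  # copy, so the caller's data is left untouched
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train.extend(shuffled[:cut])
        valid.extend(shuffled[cut:])
    return train, valid


# Hypothetical toy data: 100 customers, 25 of whom purchased (Target = 1).
customers = [{"CustomerID": i, "Target": 1 if i % 4 == 0 else 0} for i in range(100)]
train, valid = stratified_split(customers)
print(len(train), len(valid))
print(sum(r["Target"] for r in train), sum(r["Target"] for r in valid))
```

Splitting within each Target class separately is one way to prevent a rare outcome from ending up concentrated in a single partition.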
Tasks for Part C: Predictive Modelling
During this task you are requested to create a number of predictive models for predicting the Target attribute and assess their prediction accuracy.
For each predictive model, you will need to discuss the following elements:
i. Special data preparation requirements of the model
ii. Prediction accuracy of the model
iii. Interpretation of results of the model
Your analysis should include at least 3 models from each of the families listed below:
PM1. Decision Trees
PM2. Regressions
PM3. Clustering
PM4. Neural Networks
PM5. Support Vector Machines
Tasks for Part D: Model Evaluation and Scoring
ME1. Using Model Comparison, evaluate the predictive models with regard to Misclassification Rate. Use the ROC curve to demonstrate which predictive model is the best.
ME2. Use the model that was selected in the previous step to Score a fresh copy of the data source. Confirm the accuracy of the prediction.
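As a rough illustration of the two measures named in ME1, the sketch below computes a misclassification rate at a cut-off and the area under the ROC curve for two sets of predicted probabilities. It is a minimal plain-Python sketch, not the Model Comparison node's implementation; the 0.5 cut-off, the model names, and the scores are made-up assumptions.

```python
def misclassification_rate(actual, predicted_prob, cutoff=0.5):
    """Share of customers whose predicted class (prob >= cutoff) differs from Target."""
    wrong = sum(1 for y, p in zip(actual, predicted_prob) if (p >= cutoff) != (y == 1))
    return wrong / len(actual)


def roc_auc(actual, predicted_prob):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen buyer is scored above a randomly chosen non-buyer (ties count half)."""
    pos = [p for y, p in zip(actual, predicted_prob) if y == 1]
    neg = [p for y, p in zip(actual, predicted_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical validation data: Target values and scores from two made-up models.
actual = [1, 0, 1, 0, 0, 1, 0, 0]
scores = {
    "model_a": [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.3, 0.5],
    "model_b": [0.6, 0.7, 0.4, 0.2, 0.5, 0.8, 0.1, 0.3],
}

for name, probs in scores.items():
    print(name, misclassification_rate(actual, probs), roc_auc(actual, probs))
```

The misclassification rate depends on the chosen cut-off, whereas the area under the ROC curve summarizes ranking quality across all cut-offs, so the two measures can rank models differently.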
Task for Part E: Presentation
You should prepare a presentation that critically presents the results. Your presentation must (a) include a description of the historical data; (b) describe and interpret the most accurate decision tree; (c) describe and interpret another modelling technique; (d) explain the cut-off point in business terms; (e) describe the gain from using predictive modelling (cumulative lift) for the selected model.
Additionally, you will need to prepare one or two slides of conclusions/recommendations to present to the management team, focusing on which customers will buy the new line of products.
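To make the cumulative lift mentioned in point (e) concrete, the following plain-Python sketch computes it at a given depth: the purchase rate among the top-scored customers divided by the overall purchase rate. The function name, the depth values, and the data are illustrative assumptions only.

```python
def cumulative_lift(actual, predicted_prob, depth=0.2):
    """Purchase rate in the top `depth` share of customers (ranked by score),
    divided by the purchase rate over all customers."""
    ranked = sorted(zip(predicted_prob, actual), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, round(len(ranked) * depth))
    top_rate = sum(y for _, y in ranked[:top_n]) / top_n
    base_rate = sum(actual) / len(actual)
    return top_rate / base_rate


# Hypothetical scored customers: 3 buyers out of 10.
actual = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
probs = [0.9, 0.2, 0.8, 0.4, 0.1, 0.3, 0.35, 0.5, 0.15, 0.05]

for depth in (0.2, 0.5, 1.0):
    print(depth, cumulative_lift(actual, probs, depth))
```

A lift above 1 at a given depth means that contacting only the top-scored customers reaches buyers at a higher rate than contacting customers at random; at a depth of 1.0 the lift is 1 by construction.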