Scenario
Your company, the maker of StackBook laptops (you may have seen this scenario in MRT285, depending on if/when you took it), is interested in understanding customer loss and retention. The supplied file is a list of customers spanning ten years’ worth of activity, containing information about their purchase history. Each record represents the purchase history for one customer, including the number of purchases and the dates of the first and last purchases made. (Most of the fields are self-explanatory; see notes below for the others.)
You will use this data to compute each customer's tenure and whether that customer is censored or uncensored. The exact purchase dates are given in the file; we will be measuring tenure in days.
Calculating tenure would be as simple as subtracting dates, except for the fact that people do not buy laptops every day (or even every month!). The StackBook division is making the assumption that customers buy a new laptop every four years or less. Thus, a customer who makes a subsequent purchase within 1461 days (i.e., 365*4, plus one leap day) is a continuing customer, while after 1461 days without a purchase, a customer is considered to be lost. (Note: For simplicity, we have not included the timing of every purchase a customer might have made, and we are not considering the case where a customer might be ‘lost’ and then won back. In this data, you can assume that the entire time between the first purchase and the last was a continuous relationship; we are only concerned with loss after the last purchase.)
Question 1
To do survival analysis, you need to calculate the tenure and censored status of each customer. This will require new formula columns (some making use of if statements).
You can use the 'Date Difference' function built into JMP Pro to determine the number of days between two dates. You will need to do several calculations, each in a new column:
• The number of days between the first and last purchase is the 'Date Difference' between the dates, using 'Day' as the interval (i.e., Date Difference(<first date>, <last date>, "Day")).
• The days since the last purchase can be calculated using the Date Difference function in similar fashion. (Today's date is not in a column, but is available via the 'Today' function.)
• A customer will be considered as active (censored) if they have made a purchase within 1461 days preceding today. Add a Censored column that has a value of 1 if the days since the last purchase is less than 1461, 0 otherwise.
• Because we don’t consider a customer lost until 1461 days after their last purchase, the 1461 days are included in their tenure. Add a Tenure column that calculates tenure as Days between purchases + days since last purchase if the customer is censored, Days between purchases + 1461 otherwise.
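The column logic in the bullets above can be checked outside JMP with a minimal Python sketch. This is an illustration only, not the JMP formula you are asked to report: the function name `tenure_and_censored`, the fixed `today` value, and the two sample customers are all hypothetical.

```python
from datetime import date

LOSS_WINDOW_DAYS = 1461  # 365*4 plus one leap day, as stated in the scenario


def tenure_and_censored(first_purchase, last_purchase, today):
    """Return (days_between, days_since_last, censored, tenure) for one customer."""
    days_between = (last_purchase - first_purchase).days
    days_since_last = (today - last_purchase).days
    # Censored (still active) if the last purchase was less than 1461 days ago.
    censored = 1 if days_since_last < LOSS_WINDOW_DAYS else 0
    if censored:
        tenure = days_between + days_since_last
    else:
        # A lost customer is credited with the full 1461-day window.
        tenure = days_between + LOSS_WINDOW_DAYS
    return days_between, days_since_last, censored, tenure


# Hypothetical customers, evaluated against a fixed "today" for reproducibility.
today = date(2024, 1, 1)
active = tenure_and_censored(date(2018, 1, 1), date(2022, 1, 1), today)
lost = tenure_and_censored(date(2011, 1, 1), date(2012, 1, 1), today)
print(active)  # (1461, 730, 1, 2191)
print(lost)    # (365, 4383, 0, 1826)
```

Because the real calculation uses the current date, your own values will shift by one day for every day that passes between runs.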
Include in your report both the formulas you use for each column, as well as the first ten rows of your data table (including the new columns).
Before completing the rest of the assignment, you may send me an email with a screen capture of the first few rows of your table, to check that your values look correct. (I would suggest that you do this at least one week before the assignment is due, because I can’t guarantee turnaround time!) I will do this only because if you get Question 1 wrong, all of your other work will be incorrect as well.
Question 2
Using the resulting table from Question 1, do a survival analysis. Include the default output in your report.
Comment on the shape of the survival curve. What noteworthy elements (patterns, events, etc.) are seen as time progresses? Interpret the noteworthy elements in business terms.
What is the median tenure? What is the mean tenure?
Question 3
Using the results from Question 2 and the technique outlined in the exercises/tutorial, generate the hazard probabilities. Include only a graph in your report.
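The tutorial's exact procedure is not reproduced here. As a rough sketch of one common definition, the hazard probability at each time point is the fraction of customers still surviving at the previous point who are lost by the current one. The function name `hazard_probabilities` and the survival values below are made up for illustration.

```python
def hazard_probabilities(survival):
    """Convert a list of survival estimates into per-step hazard probabilities.

    survival[i] is the estimated probability of surviving past time point i.
    Survival before the first point is assumed to be 1.0.
    """
    hazards = []
    previous = 1.0
    for s in survival:
        # Share of those still surviving at the previous point who are lost now.
        hazards.append((previous - s) / previous if previous > 0 else 0.0)
        previous = s
    return hazards


# Made-up survival curve: 100%, 80%, 60%, 60%.
example = hazard_probabilities([1.0, 0.8, 0.6, 0.6])
print([round(h, 4) for h in example])  # [0.0, 0.2, 0.25, 0.0]
```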
Question 4
Re-run your analysis from Question 2, but this time, grouping by job. (Do not ‘test cross groups’.) Include only the survival curve plot and the summary table in your output.
Which job has the highest survival? Which job has the lowest? (You can gauge these from the average tenures in the summary table.)
Suppose that a customer spends, on average, about $1200 per year on our products. Calculate the customer lifetime value for each of the two groups you identified above.
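One simple way to turn a mean tenure into a lifetime value, assuming no discounting and a constant annual spend, is to convert tenure from days to years and multiply by $1200. The sketch below takes that approach; the function name `lifetime_value` and the 2922-day tenure are hypothetical, and 365.25 days per year is an assumption chosen to match the 1461-day (four-year) window.

```python
ANNUAL_SPEND = 1200.0      # dollars per year, from the question
DAYS_PER_YEAR = 1461 / 4   # 365.25, consistent with the four-year loss window


def lifetime_value(mean_tenure_days, annual_spend=ANNUAL_SPEND):
    """Undiscounted customer lifetime value: annual spend times tenure in years."""
    return annual_spend * mean_tenure_days / DAYS_PER_YEAR


# Hypothetical mean tenure of 2922 days (exactly eight years).
clv = lifetime_value(2922.0)
print(round(clv, 2))  # 9600.0
```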
Question 5
Now, run a Proportional Hazards analysis on the same data, using job as the only model effect. Is the model significant?
For the jobs you identified as the best and worst in Question 4, include in your report only the rows from the Risk Ratios table that compare the two. Use the risk ratios to interpret the relative risk of the two groups (in both directions). Does this seem consistent with what you observed in Question 4?
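When reading a risk ratio in both directions, it may help to remember that the two directions are reciprocals of each other. The sketch below illustrates this with invented numbers; the job labels, the value 1.25, and the function name `both_directions` are hypothetical and do not come from the data.

```python
def both_directions(group_a, group_b, ratio_a_over_b):
    """Return the risk ratio for A versus B and its reciprocal, B versus A."""
    return {
        (group_a, group_b): ratio_a_over_b,
        (group_b, group_a): 1.0 / ratio_a_over_b,
    }


# Invented example: job X has 1.25 times the hazard of job Y.
ratios = both_directions("job X", "job Y", 1.25)
for (numerator, denominator), value in ratios.items():
    print(f"{numerator} / {denominator}: {value:.2f}")
# job X / job Y: 1.25
# job Y / job X: 0.80
```

In this invented case, job X customers are lost at 1.25 times the rate of job Y customers, which is the same statement as job Y customers being lost at 0.80 times the rate of job X customers.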
Question 6
Is 'days' a good unit of measurement for tenure in this case, or would another unit be better? Answer based on what you have seen in your analysis.