A bank would like to understand the demographics and other characteristics associated with whether a customer accepts a credit card offer. Observational data is limited for this kind of problem because a company often sees only the customers who respond to an offer. To get around this, the bank designs a focused marketing study of 18,000 current bank customers. This focused approach lets the bank know who does and does not respond to the offer, and lets it use the demographic data already available on each customer.
The designed approach also allows the bank to control for other potentially important factors so that the offer combination isn’t confused or confounded with the demographic factors. Because of the size of the data and the possibility that there are complex relationships between the response and the studied factors, a decision tree is used to find out if there is a smaller subset of factors that may be more important and that warrant further analysis and study.
We want to build a model that will provide insight into why some bank customers accept credit card offers. Because the response is categorical (either Yes or No) and we have a large number of potential predictor variables, we use the Partition platform to build a classification tree for Offer Accepted. We are primarily interested in understanding characteristics of customers who have accepted an offer, so the resulting model will be exploratory in nature (see note 1).
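As a rough illustration of how a classification tree compares candidate splits, the sketch below scores two hypothetical splits with the likelihood-ratio chi-square statistic (G^2), one common measure of how strongly a split separates Yes from No. The counts are invented for illustration and are not taken from the study data, and the function name `g_squared` is an assumption of this sketch. It is not a reproduction of what the Partition platform does internally.

```python
import math

def g_squared(table):
    """Likelihood-ratio chi-square (G^2) for a branch-by-response table.

    `table` maps each candidate branch to a (yes_count, no_count) pair.
    Larger values mean the branches differ more in their Yes/No mix.
    """
    total_yes = sum(yes for yes, _ in table.values())
    total_no = sum(no for _, no in table.values())
    total = total_yes + total_no
    g2 = 0.0
    for yes, no in table.values():
        branch_total = yes + no
        for observed, column_total in ((yes, total_yes), (no, total_no)):
            # Expected count if the response were independent of the branch
            expected = branch_total * column_total / total
            if observed > 0:
                g2 += 2.0 * observed * math.log(observed / expected)
    return g2

# Hypothetical counts, invented for illustration only
candidate_splits = {
    "Mailer Type": {"Letter": (40, 960), "Postcard": (80, 920)},
    "Own Your Home": {"Yes": (61, 939), "No": (59, 941)},
}

scores = {name: g_squared(table) for name, table in candidate_splits.items()}
best = max(scores, key=scores.get)
print("Best first split under these made-up counts:", best)
```

With these made-up counts the split on Mailer Type scores higher, because its two branches differ much more in acceptance rate than the two Own Your Home branches do.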
The Data (Credit Card Marketing BBM.jmp)
The data set consists of information on the 18,000 current bank customers in the study.
Customer Number: A sequential number assigned to the customers (this column is hidden and excluded – this unique identifier will not be used directly).
Offer Accepted: Whether the customer accepted (Yes) or rejected (No) the offer.
Reward: The type of reward program offered for the card.
Mailer Type: Letter or postcard.
Income Level: Low, Medium or High.
# Bank Accounts Open: How many non-credit-card accounts are held by the customer.
Overdraft Protection: Does the customer have overdraft protection on their checking account(s) (Yes or No).
Credit Rating: Low, Medium or High.
# Credit Cards Held: The number of credit cards held at the bank.
# Homes Owned: The number of homes owned by the customer.
Household Size: Number of individuals in the family.
Own Your Home: Does the customer own their home? (Yes or No).
Average Balance: Average account balance (across all accounts over time).
Q1, Q2, Q3 and Q4 Balance: Average balance for each quarter in the last year.
1 In exploratory modeling, the goal is to understand the variables or characteristics that drive behaviors or particular outcomes. In predictive modeling, the goal is to accurately predict new observations and future behaviors, given the current information and situation.
We start by getting to know our data. We explore the data one variable at a time, two at a time, and many variables at a time to gain an understanding of data quality and of potential relationships. Since the focus of this case study is classification trees, only some of this work is shown here. We encourage you to thoroughly understand your data and take the necessary steps to prepare your data for modeling before building exploratory or predictive models.
Since we have a relatively large data set with many potential predictors, we start by creating numerical summaries of each of our variables using the Columns Viewer (see Exhibit 1). (Under the Cols menu select Columns Viewer, then select all variables and click Show Summary. To deselect the variables, click Clear Select).
Under N Categories, we see that each of our categorical variables has either two or three levels. N Missing indicates that we are missing 24 observations for each of the balance columns. (Further investigation indicates that these values are missing from the same 24 customers.) The other statistics provide an idea of the centering, spread and shapes of the continuous distributions.
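For readers working outside JMP, the same kind of per-column summary (missing counts, number of categories, basic statistics) can be sketched in plain Python. The rows below are a tiny invented sample, not the study data, and `summarize` is a hypothetical helper name. Missing values are assumed to be represented as `None`.

```python
def summarize(rows):
    """Summarize each column of a list of row dictionaries.

    Numeric columns get mean, min and max. All other columns get a
    count of distinct categories. Every column gets a missing count.
    """
    summary = {}
    for col in rows[0].keys():
        values = [row[col] for row in rows]
        present = [v for v in values if v is not None]
        info = {"n": len(present), "n_missing": len(values) - len(present)}
        if present and all(isinstance(v, (int, float)) for v in present):
            info["mean"] = sum(present) / len(present)
            info["min"] = min(present)
            info["max"] = max(present)
        else:
            info["n_categories"] = len(set(present))
        summary[col] = info
    return summary

# Tiny invented sample; None marks a missing balance
rows = [
    {"Offer Accepted": "No", "Income Level": "Low", "Average Balance": 1200.0},
    {"Offer Accepted": "Yes", "Income Level": "Medium", "Average Balance": None},
    {"Offer Accepted": "No", "Income Level": "High", "Average Balance": 800.0},
    {"Offer Accepted": "No", "Income Level": "Low", "Average Balance": 1000.0},
]

summary = summarize(rows)
for column, info in summary.items():
    print(column, info)
```

Checking whether the missing balances come from the same customers would amount to comparing, row by row, which balance columns are `None`.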
Next, we graph our variables one at a time. (Select the variables within the Columns Viewer and click on the Distribution button. Or, use Analyze > Distribution, select all of the variables as Y, Columns, and click OK. Click Stack from the top red triangle for a horizontal layout).
In Exhibit 2, we see that only around 5.68 percent of the 18,000 offers were accepted.
We select the Yes level in Offer Accepted and then examine the distribution of accepted offers (the shaded area) across the other variables in our data set (the first 10 variables are shown in Exhibit 3).
Our two experimental variables are Reward and Mailer Type. Offers promoting Points and Air Miles are more frequently accepted than those promoting Cash Back, while Postcards are accepted more often than Letters. Offers also appear to be accepted at a higher rate by customers with low to medium income, no overdraft protection and low credit ratings.
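The comparison made visually above, the share of accepted offers within each level of a factor, can also be computed directly. The sketch below is a minimal illustration in plain Python. The rows are invented, and `acceptance_rate_by` is a hypothetical helper, so the rates it prints say nothing about the actual study results.

```python
from collections import defaultdict

def acceptance_rate_by(rows, factor, response="Offer Accepted"):
    """Return {level: share of rows with response == 'Yes'} for one factor."""
    counts = defaultdict(lambda: [0, 0])  # level -> [accepted, total]
    for row in rows:
        level = row[factor]
        counts[level][1] += 1
        if row[response] == "Yes":
            counts[level][0] += 1
    return {level: accepted / total for level, (accepted, total) in counts.items()}

# Invented rows for illustration only
rows = (
    [{"Mailer Type": "Postcard", "Offer Accepted": "Yes"}] * 2
    + [{"Mailer Type": "Postcard", "Offer Accepted": "No"}] * 2
    + [{"Mailer Type": "Letter", "Offer Accepted": "Yes"}] * 1
    + [{"Mailer Type": "Letter", "Offer Accepted": "No"}] * 3
)

rates = acceptance_rate_by(rows, "Mailer Type")
for level, rate in sorted(rates.items()):
    print(f"{level}: {rate:.1%} accepted")
```

Comparing rates within each level, rather than raw counts, keeps the comparison fair when the levels contain different numbers of customers.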