Thera Bank recently saw a steep decline in the number of users of its credit card. Credit cards are a good source of income for banks because of the various fees they charge, such as annual fees, balance transfer fees, cash advance fees, late payment fees, and foreign transaction fees. Some fees are charged to every user irrespective of usage, while others are charged only under specified circumstances.
Customers leaving the credit card service would mean a loss for the bank, so the bank wants to analyze its customer data, identify the customers who are likely to leave the credit card service, and understand their reasons for leaving, so that it can improve in those areas.
As a data scientist at Thera Bank, you need to come up with a classification model that will help the bank improve its services so that customers do not give up their credit cards.
Objective
1. Explore and visualize the dataset.
2. Build a classification model to predict whether a customer will churn.
3. Optimize the model using appropriate techniques.
4. Generate a set of insights and recommendations that will help the bank.
* CLIENTNUM: Client number. Unique identifier for the customer holding the account
* Attrition_Flag: Internal event (customer activity) variable: 1 if the account is closed, otherwise 0
* Customer_Age: Age in Years
* Gender: Gender of the account holder
* Dependent_count: Number of dependents
* Education_Level: Educational Qualification of the account holder
* Marital_Status: Marital Status of the account holder
* Income_Category: Annual Income Category of the account holder
* Card_Category: Type of Card
* Months_on_book: Period of relationship with the bank, in months
* Total_Relationship_Count: Total no. of products held by the customer
* Months_Inactive_12_mon: No. of months inactive in the last 12 months
* Contacts_Count_12_mon: No. of Contacts in the last 12 months
* Credit_Limit: Credit Limit on the Credit Card
* Total_Revolving_Bal: Total Revolving Balance on the Credit Card
* Avg_Open_To_Buy: Open to Buy Credit Line (Average of last 12 months)
* Total_Amt_Chng_Q4_Q1: Change in Transaction Amount (Q4 over Q1)
* Total_Trans_Amt: Total Transaction Amount (Last 12 months)
* Total_Trans_Ct: Total Transaction Count (Last 12 months)
* Total_Ct_Chng_Q4_Q1: Change in Transaction Count (Q4 over Q1)
* Avg_Utilization_Ratio: Average Card Utilization Ratio
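As a starting point for the exploratory step, the sketch below computes the attrition rate within each level of a categorical column. It assumes the rows are already loaded as dictionaries keyed by the column names above and that Attrition_Flag is encoded as 1 (closed) or 0 (open), as the dictionary states. The helper name `attrition_rate_by`, the sample rows, and the category values in them are made up for illustration and are not taken from the dataset.

```python
from collections import defaultdict


def attrition_rate_by(rows, column):
    """Return {category: share of customers with Attrition_Flag == 1}."""
    closed = defaultdict(int)  # attrited customers per category
    total = defaultdict(int)   # all customers per category
    for row in rows:
        key = row[column]
        total[key] += 1
        closed[key] += int(row["Attrition_Flag"])
    return {key: closed[key] / total[key] for key in total}


# Hypothetical sample rows, for illustration only (not real data).
sample_rows = [
    {"Card_Category": "Blue", "Attrition_Flag": 1},
    {"Card_Category": "Blue", "Attrition_Flag": 0},
    {"Card_Category": "Blue", "Attrition_Flag": 0},
    {"Card_Category": "Blue", "Attrition_Flag": 0},
    {"Card_Category": "Silver", "Attrition_Flag": 1},
    {"Card_Category": "Silver", "Attrition_Flag": 0},
]

rates = attrition_rate_by(sample_rows, "Card_Category")
print(rates)
```

The same helper can be pointed at any other categorical column, such as Gender or Income_Category, to compare churn across groups before modeling.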
• The notebook should be well-documented, with inline comments explaining the functionality of code and markdown cells containing comments on the observations and insights.
• The notebook should be run from start to finish in a sequential manner before submission.
• It is preferable to remove all warnings and errors before submission.
• The notebook should be submitted as an HTML file (.html) and NOT as a notebook file (.ipynb).
As in real-world projects, the ultimate destination of any project is generally an executive or decision-making meeting, where you present your solution to the business problem based on the work you have done. The purpose of this presentation is to simulate that experience and to draw the attention of your audience (a business leader such as a CMO, COO, CFO, or CEO) to the key points of your project, which are:
• Business overview of the problem and solution approach
• Key findings and insights which can drive business decisions
• Model overview and performance summary
• Business recommendations
• Focus on explaining the takeaways in an easy-to-understand manner.
• Inclusion of the potential benefits of implementing the solution will give you the edge.
• Copying and pasting from the notebook is not a good idea; avoid showing code unless it is the focal point of your presentation.
• Please submit the presentation in PDF format only.
There are two parts to the submission:
1. A well-commented Jupyter notebook [format: .html]
2. A presentation as you would present it to top management or business leaders [format: .pdf] (export or save the .pptx file as .pdf)
Criteria | Points
Univariate analysis; bivariate analysis; use appropriate visualizations to identify the patterns and insights; any other exploratory deep dive | 4
Key meaningful observations on the relationship between variables | 4
Prepare the data for analysis: missing value treatment, outlier detection (treat if needed, and explain why or why not), feature engineering, prepare data for modeling | 4
Make a logistic regression model; improve model performance by up and downsampling the data; regularize the above models, if required | 5
Build decision tree, random forest, and bagging classifier models; build XGBoost, AdaBoost, and gradient boosting models | 8
Tune all the models using grid search; use pipelines in hyperparameter tuning | 8
Tune all the models using randomized search; use pipelines in hyperparameter tuning | 8
Compare the model performance of all the models; comment on the time taken by the grid and randomized search in optimization | 4
Business recommendations and insights | 3
Structure and flow; crispness; visual appeal; all key insights and recommendations covered | 8
Structure and flow; well-commented code | 4
Total points | 60
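The rubric asks for a comment on the time taken by grid search versus randomized search. One way to reason about this before running anything is to count model fits: grid search fits every combination in the grid, while randomized search fits only a fixed number of sampled combinations. The sketch below does that count with the Python standard library only. The hyperparameter names and values, the fold count, and `n_iter` are illustrative assumptions, not values prescribed by the assignment.

```python
import itertools
import random

# Hypothetical search space for a tree-based model (illustrative values only).
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [3, 5, 7, None],
    "min_samples_leaf": [1, 5, 10],
}
cv_folds = 5  # assumed number of cross-validation folds
n_iter = 10   # assumed candidate budget for randomized search

# Grid search: every combination of every listed value.
names = list(param_grid)
grid_candidates = [
    dict(zip(names, values))
    for values in itertools.product(*(param_grid[name] for name in names))
]

# Randomized search: a fixed-size sample of candidates, drawn here
# without replacement from the same finite grid.
rng = random.Random(0)
random_candidates = rng.sample(grid_candidates, n_iter)

# Each candidate is fitted once per cross-validation fold.
grid_fits = len(grid_candidates) * cv_folds
random_fits = len(random_candidates) * cv_folds
print(f"grid search:       {len(grid_candidates)} candidates, {grid_fits} fits")
print(f"randomized search: {len(random_candidates)} candidates, {random_fits} fits")
```

Under these assumptions randomized search performs fewer fits, so it will usually finish sooner, at the cost of possibly missing the best combination in the grid. Measured wall-clock times from the notebook should still be reported, since fit time also depends on the hyperparameter values themselves.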