The goal of the Predictive Analytics Case Study is to predict whether a customer is likely to become a loan delinquency

INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS

Task 1 Predictive Analytics Case Study (40 Marks)

The goal of the Predictive Analytics Case Study is to predict whether a customer of ACME Bank is likely to become delinquent and default on a loan (see Table 1, the data dictionary for the loan-delinq.csv data set, below). In completing Task 1 you will apply the business understanding, data understanding, data preparation, modelling and evaluation phases of the CRISP-DM data mining process. You need to understand this data set to complete Task 1 and its four subtasks.

 

Table 1 Data dictionary for loan-delinq.csv

Variable Name | Description | Data Type
Record_ID | Unique record id for customer | Integer
SeriousDlqin2yrs | Person experienced 90 days past due delinquency or worse | 1 = Yes, 0 = No
RevolvingUtilizationOfUnsecuredLines | Total balance on credit cards and personal lines of credit except real estate and no instalment debt like car loans divided by sum of credit limits | Percentage
Age | Age of borrower in years | Integer
NumberOfTime30-59DaysPastDueNotWorse | Number of times borrower 30-59 days past due but no worse in last 2 years | Integer
DebtRatio | Monthly debt payments, alimony, living costs divided by monthly gross income | Percentage
MonthlyIncome | Monthly income | Real
NumberOfOpenCreditLinesAndLoans | Number of open loans (instalment like car loan or mortgage) and lines of credit (e.g. credit cards) | Integer
NumberOfTimes90DaysLate | Number of times borrower has been 90 days or more past due | Integer
NumberRealEstateLoansOrLines | Number of mortgage and real estate loans including home equity lines of credit | Integer
NumberOfTime60-89DaysPastDueNotWorse | Number of times borrower has been 60-89 days past due but no worse in last 2 years | Integer
NumberOfDependents | Number of dependents in family excluding themselves (spouse, children etc.) | Integer

 

1.1 Exploratory data analysis and data preparation: Conduct an exploratory data analysis and data preparation of the loan-delinq.csv data set using RapidMiner to understand the characteristics of each variable and its relationship to the other variables. Summarise your findings in a table named Task 1.1 Results of Exploratory Data Analysis and Data Preparation. For each variable in the loan-delinq.csv data set, the table should describe key characteristics such as maximum and minimum values, average, standard deviation, most frequent value (mode), missing values and invalid values, together with relationships with other variables, any transformation of existing variables and any creation of new variables.

 

Hint: The Statistics Tab and Chart Tab in RapidMiner provide the descriptive statistics and charts, such as bar charts and scatterplots, required for Task 1.1. You might also run correlations and/or chi-square tests, depending on whether a variable is categorical or numeric. Indicate in Table 1.1 which variables contribute most to predicting whether a customer is likely to become delinquent and default on a loan. You could also consider transforming some variables, creating new variables, and converting the target/label variable into a binominal variable to facilitate analysis in Tasks 1.2, 1.3 and 1.4.

 

Briefly discuss the key findings of your exploratory data analysis and data preparation, and justify your choice of the variables most likely to predict whether a customer will become delinquent and default on a loan (10 marks, 500 words).
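
The per-variable summary requested for Task 1.1 can be cross-checked outside RapidMiner. The following is a minimal Python sketch, not part of the required RapidMiner workflow. The function name `summarise` and the sample values are hypothetical, and missing values are assumed to be represented as `None`.

```python
import statistics


def summarise(values):
    """Return basic descriptive statistics for one numeric column.

    Assumption: missing entries are represented as None.
    """
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "stdev": statistics.stdev(present),  # sample standard deviation
        "mode": statistics.mode(present),
    }


# Hypothetical MonthlyIncome values, not taken from loan-delinq.csv.
monthly_income = [2500, 4000, None, 4000, 6100, None, 3300]
summary = summarise(monthly_income)
print(summary)
```

Counting missing values separately from the other statistics matters here, because a variable with many missing entries may need imputation or filtering before modelling.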

 

1.2 Decision Tree Model: Build a Decision Tree model for predicting whether a customer is likely to become delinquent and default on a loan, on the loan-delinq.csv data set, using RapidMiner and a set of data mining operators determined in part by your exploratory data analysis in Task 1.1. Provide these outputs from RapidMiner: (1) Final Decision Tree Model process, (2) Final Decision Tree diagram and (3) Decision Tree rules. Briefly explain your Final Decision Tree Model process, and discuss the results of the Final Decision Tree Model, drawing on the key outputs (Decision Tree diagram, Decision Tree rules), for predicting whether a customer is likely to become delinquent and default on a loan, based on key contributing variables and relevant supporting literature on the interpretation of decision trees (10 marks, 150 words).
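
When interpreting which variables appear near the top of a decision tree, it can help to see how a split is scored. Decision tree learners commonly rank candidate splits with an impurity measure such as the Gini index. The sketch below is illustrative Python, not RapidMiner's implementation; the function names, labels and the split on NumberOfTimes90DaysLate are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_gini(left, right):
    """Size-weighted Gini impurity of the two child nodes of a split."""
    n = len(left) + len(right)
    return (len(left) / n) * gini(left) + (len(right) / n) * gini(right)


# Hypothetical labels (1 = delinquent, 0 = not delinquent) before and after
# splitting on NumberOfTimes90DaysLate > 0. Illustrative values only.
parent = [1, 1, 1, 0, 0, 0, 0, 0]
left = [1, 1, 1, 0]   # borrowers with at least one 90-day late event
right = [0, 0, 0, 0]  # borrowers with none

impurity_before = gini(parent)
impurity_after = split_gini(left, right)
print(impurity_before, impurity_after)
```

A split that lowers the weighted impurity by a large amount separates delinquent from non-delinquent customers well, which is why such variables tend to sit close to the root of the tree.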

 

1.3 Logistic Regression Model: Build a Logistic Regression model for predicting whether a customer is likely to become delinquent and default on a loan, using RapidMiner, the loan-delinq.csv data set and an appropriate set of data mining operators determined in part by your exploratory data analysis in Task 1.1. Provide these outputs from RapidMiner: (1) Final Logistic Regression Model process and (2) key outputs from the Logistic Regression Model. Hint for Task 1.3: you may need to change the data types of some variables. Briefly explain your Final Logistic Regression Model process and discuss the results of the Final Logistic Regression Model, drawing on the key outputs (Coefficients, Standardised Coefficients, Odds Ratios, P Values etc.), for predicting whether a customer is likely to become delinquent and default on a loan, based on key contributing variables and relevant supporting literature on the interpretation of logistic regression models (10 marks, 150 words).
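
The relationship between the coefficients and odds ratios mentioned in Task 1.3 can be illustrated with a short calculation: the odds ratio for a predictor is the exponential of its coefficient, and the predicted probability follows from the logistic function. This is a minimal Python sketch; the intercept, coefficients and feature values are hypothetical and are not results from loan-delinq.csv.

```python
import math


def odds_ratio(coefficient):
    """Odds ratio for a one-unit increase in the predictor."""
    return math.exp(coefficient)


def predict_probability(intercept, coefficients, features):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical fitted values, not results from loan-delinq.csv.
intercept = -3.0
coefficients = [0.5, -0.02]  # e.g. NumberOfTimes90DaysLate, Age
features = [2, 40]           # one hypothetical borrower

ratio = odds_ratio(coefficients[0])
probability = predict_probability(intercept, coefficients, features)
print(ratio, probability)
```

An odds ratio above 1 means the odds of delinquency rise as the predictor increases, and a value below 1 means they fall, which is the usual basis for discussing key contributing variables.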

 

1.4 Model Validation and Performance: You will need to validate your Final Decision Tree Model and Final Logistic Regression Model using the Cross-Validation Operator, Apply Model Operator and Performance Operator in your data mining processes. Discuss and compare the performance of the Final Decision Tree Model with that of the Final Logistic Regression Model for predicting whether a customer is likely to become delinquent and default on a loan, based on the key results of the confusion matrix presented in Table 1.4 Model Performance Metrics (Decision Tree vs Logistic Regression). Table 1.4 will compare the two models using the following model performance metrics: (1) accuracy, (2) sensitivity, (3) specificity and (4) F1 score (10 marks, 200 words).
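
The four metrics required for Table 1.4 can all be derived from the confusion matrix counts. The sketch below shows the standard formulas in Python as a way to check values reported by RapidMiner; the function name and the counts are hypothetical, with the delinquent class treated as the positive class.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute metrics from confusion matrix counts.

    tp/fn: delinquent customers predicted correctly/incorrectly.
    tn/fp: non-delinquent customers predicted correctly/incorrectly.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # recall, true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for one model, not results from loan-delinq.csv.
metrics = classification_metrics(tp=40, fp=10, tn=930, fn=20)
print(metrics)
```

With these illustrative counts the accuracy is high while the sensitivity is much lower, which shows why accuracy alone can be misleading when delinquent customers are a small minority of the data.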

 

Note: the important outputs from the data mining analyses conducted in RapidMiner for Task 1 must be included in your Assessment 3 Report to support the conclusions reached for each analysis conducted in 1.1, 1.2, 1.3 and 1.4. You can export important outputs from RapidMiner as jpg image files and include these screenshots in the relevant Task 1 parts of your Assessment 3 Report.

 

Note: you will find the North textbook and the RapidMiner tutorials useful references for the data mining process activities conducted in Task 1, namely the exploratory data analysis and data preparation, decision tree analysis, logistic regression analysis, and evaluation of the performance of the Final Decision Tree Model and the Final Logistic Regression Model.

These concepts are covered in Module RapidMiner Practicals and Chapters 3, 4, 9, 10 and 13 of North Textbook and RapidMiner Tutorials contained within RapidMiner.

 

Research and critically review the study materials and other relevant literature to provide a suitable written response to each of the following tasks 2, 3 and 4 supported with an appropriate level of in-text referencing:

 

Task 2 Social media analytics (15 marks 500 words)

2.1 Explain why social media analytics is such an important activity for business (7 Marks 250 words)

2.2 Choose and describe a widely used application of social media analytics and explain how the impact of social media can be measured in this application area using social media analytics (8 marks, 250 words).

 

Task 3 Big Data Technologies (15 marks, 500 words)

3.1 Explain why streaming analytics is such an important concept in big data management. Illustrate your answer with a real-world application of streaming analytics (8 marks, 250 words).

3.2 Discuss the key technology building blocks of IoT in the context of a real world application of IoT (7 marks 250 words).

Task 4 Artificial Intelligence: ChatGPT transforming work and ethical considerations of using ChatGPT (20 marks, 1000 words)

4.1 First, discuss how configurations of humans and artificial intelligence will evolve in workplaces as organisations increasingly drive automation and augmentation through the adoption of AI applications such as ChatGPT (10 marks, 500 words).

4.2 Second, identify and discuss the ethical implications for organisations of using ChatGPT to drive automation and augmentation of work, in relation to (1) privacy, (2) transparency, (3) bias and discrimination and (4) governance and accountability (10 marks, 500 words).

 

 
