
Decision trees are widely used in the banking industry due to their high accuracy and ability to formulate a statistical model in plain language.

INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS

Decision trees are widely used in the banking industry due to their high accuracy and ability to formulate a statistical model in plain language. Since government organizations in many countries carefully monitor lending practices, executives must be able to explain why one applicant was rejected for a loan while others were approved. This information is also useful for customers hoping to determine why their credit rating is unsatisfactory.
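To make the "plain language" point concrete, a decision tree can be read as a set of nested if/else rules. The sketch below is only an illustration: the function name approve_loan, the rules, and the 24-month threshold are hypothetical and were not learned from any data.

```r
# Hypothetical rules, invented for illustration only; a real tree would be
# learned from historical loan data rather than written by hand.
approve_loan <- function(checking_balance, months_loan_duration) {
  if (checking_balance == "unknown") {
    "approve"
  } else if (months_loan_duration > 24) {
    "reject"
  } else {
    "approve"
  }
}

# Each decision can be traced back to the rule that produced it.
approve_loan("< 0 DM", 48)
approve_loan("unknown", 48)
```

Because every prediction follows one path through such rules, the reason for a rejection can be stated in a single sentence.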

Automated credit scoring models are likely used to approve credit applications instantly over the telephone and on the web. In this section, we will develop a simple credit approval model using C5.0 decision trees. We will also see how the results of the model can be tuned to minimize errors that result in a financial loss for the institution.

Step 1 – collecting data

The idea behind our credit model is to identify factors that put an applicant at higher risk of default. Therefore, we need data on a large number of past bank loans, including whether each loan went into default and information about the applicant.

Data with these characteristics are available in a dataset donated to the UCI Machine Learning Data Repository (http://archive.ics.uci.edu/ml) by Hans Hofmann of the University of Hamburg. They represent loans obtained from a credit agency in Germany.

The data presented in this chapter has been modified slightly from the original in order to eliminate some preprocessing steps. To follow along with the examples, download the credit.csv file from Packt Publishing's website and save it to your R working directory.

The credit dataset includes 1,000 examples of loans, plus a combination of numeric and nominal features indicating characteristics of the loan and the loan applicant. A class variable indicates whether the loan went into default. Let's see if we can determine any patterns that predict this outcome.

Step 2 – exploring and preparing the data

As we have done previously, we will import the data using the read.csv() function. Because the majority of features in the data are nominal, we set stringsAsFactors = TRUE so that character columns are read as factors. This was the default before R 4.0.0, but from R 4.0.0 onward the default is FALSE, so the option must be given explicitly. We'll also look at the structure of the credit data frame we created:

> credit <- read.csv("credit.csv", stringsAsFactors = TRUE)

> str(credit)

The first several lines of output from the str() function are as follows:

'data.frame':   1000 obs. of  17 variables:
 $ checking_balance    : Factor w/ 4 levels "< 0 DM","> 200 DM",..
 $ months_loan_duration: int 6 48 12 ...
 $ credit_history      : Factor w/ 5 levels "critical","good",..
 $ purpose             : Factor w/ 6 levels "business","car",..
 $ amount              : int 1169 5951 2096 ...

We see the expected 1,000 observations and 17 variables (the features plus the class variable), which are a combination of factor and integer data types.
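As a minimal sketch of how the stringsAsFactors option affects the types that str() reports, the following reads a tiny inline CSV twice. The three rows are toy data standing in for credit.csv: the column names and values echo the output above, but the pairing of values into rows is invented for illustration.

```r
# Toy CSV text standing in for credit.csv (row contents are invented).
csv_text <- "checking_balance,months_loan_duration,amount
< 0 DM,6,1169
1 - 200 DM,48,5951
unknown,12,2096"

# Same text, read with and without conversion of strings to factors.
as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)
as_char   <- read.csv(text = csv_text, stringsAsFactors = FALSE)

class(as_factor$checking_balance)  # factor
class(as_char$checking_balance)    # character
class(as_factor$amount)            # integer in both cases
```

Setting the option explicitly keeps the result the same regardless of which default the installed version of R uses.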

Let's take a look at some of the table() output for a couple of features of loans that seem likely to predict a default. The checking_balance and savings_balance features indicate the applicant's checking and savings account balance, and are recorded as categorical variables:

> table(credit$checking_balance)

    < 0 DM   > 200 DM 1 - 200 DM    unknown
       274         63        269        394

> table(credit$savings_balance)

     < 100 DM     > 1000 DM  100 - 500 DM 500 - 1000 DM       unknown
          603            48           103            63           183

Since the loan data was obtained from Germany, the currency is recorded in Deutsche Marks (DM). It seems like a safe assumption that larger checking and savings account balances should be related to a reduced chance of loan default. 
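The raw counts are easier to compare as percentages. The sketch below re-enters the counts printed above as named vectors (the variable names checking and savings are our own) so that it runs without the data file. With the credit data frame loaded, prop.table(table(credit$checking_balance)) should give the same proportions.

```r
# Counts copied from the table() output above.
checking <- c("< 0 DM" = 274, "> 200 DM" = 63,
              "1 - 200 DM" = 269, "unknown" = 394)
savings  <- c("< 100 DM" = 603, "> 1000 DM" = 48, "100 - 500 DM" = 103,
              "500 - 1000 DM" = 63, "unknown" = 183)

# Convert each set of counts to a percentage of all loans.
checking_pct <- round(100 * prop.table(checking), 1)
savings_pct  <- round(100 * prop.table(savings), 1)

checking_pct
savings_pct
```

These percentages describe only how applicants are distributed across balance categories. Relating a category to default would require cross-tabulating it against the class variable.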
