logo Hurry, Grab up to 30% discount on the entire course
Order Now logo

Ask This Question To Be Solved By Our ExpertsGet A+ Grade Solution Guaranteed

expert
Jayden StewartEnglish
(4/5)

520 Answers

Hire Me
expert
Henry SimmonsEnglish
(5/5)

727 Answers

Hire Me
expert
Akshay SinglaResume writing
(5/5)

729 Answers

Hire Me
expert
Rex HuntGeneral article writing
(5/5)

772 Answers

Hire Me
R Programming
(5/5)

This Project is brand new, and may have typos, errors, etc, that I have missed

INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS

Predictive Analytics:  Estimation and Clustering

Use the cc.tr and cc.te data sets.   The datasets summarize the usage behavior of active credit card holders over six months.

The project instructions are shown in bold.  This is to distinguish the instructions from your work.  Your work should be in not-bold.

Work neatly.  Aim for a professional-looking presentation.  I will be grading your level of professionalism, as well as your English expression.

Make sure all graphs and tables fit neatly on the page.  

Neither add nor delete pages.  Use single-spacing.

For all text output, surround it with a text box.  In Word, select the text output, then Insert > Text Box > Draw Text Box.

Apart from this document, which you will save as a pdf and submit, you must submit your R script, containing the code you used to solve the problems.  The R script should be neat and easily understandable by people who are not you.  It should be well-annotated, describing what you are doing so that anyone could understand it.

This Project is brand new, and may have typos, errors, etc, that I have missed.  Please report these to me asap.  For this and other reasons, this Project is subject to change at any time (though of course I will be reasonable.)

I am aware that I am not asking about validating the regression assumptions here.  Of course, you should do so in the real world.  But this project is long enough, and I think you deserve a break.  

Standardize the predictors, but not the target, balance.  

Make sure you do not include the target balance when creating clusters.  

You may need to set the data sets to be data frames.  

Install and library the caret, psych, Kohonen, and NbClust packages.  

1. Insert your Executive Summary here.  (A strategy for this is given at the end.)

2. Import the cc.tr training data set.  Use the CH Criterion to determine the optimal number of clusters, for k = 2 … 8.  Provide the CH values for all candidate models.  Let the winning value of k be denoted as CHK1, and the runner-up as CKH2, Report CHK1 and CKH2.

3. We introduce a third measure of cluster goodness: the Predictive Clustering Criterion.  The Predictive Clustering Criterion selects the value of k that obtains the best predictive metrics, when the clusters are used as the sole predictors.  

The following rather inelegant code will work to get this done, for k = 4.  I would like to hear from an R expert who can show how to do this more elegantly.

k4 <- kmeans(ccnbs, centers = 4, nstart = 25)

k4cluster <- k4$cluster

k4cluster <- as.factor(k4cluster)

k4cluster <- data.frame(k4cluster)

dummy_model <- dummyVars(" ~ .", data = k4cluster)

k4cluster.d <- predict(dummy_model, newdata = k4cluster)

reg4 <- lm(CC$balance ~ k4cluster.d)

summary(reg4)

4. Thoroughly discuss your findings from the table in the previous problem.  Then, weighing each of the criteria, attempt to find a consensus as to the optimal value of k.  

5. Use k-means clustering to develop clusters, using k = MSK1. Then construct a nice table in Word containing the record count and the means for all variables and all clusters.  Highlight the max and min means using green and red font.  Then provide brief but comprehensive profiles of each cluster. 

6. Use Kohonen networks clustering to develop clusters, using k = MSK1, and hexagonal structure.  Then construct a nice table in Word containing the record count and the means for all variables and all clusters.  Highlight the max and min means using green and red font.  Copy the nice table from the k = MSK1, K means clustering model.

(5/5)
Attachments:

Related Questions

. The fundamental operations of create, read, update, and delete (CRUD) in either Python or Java

CS 340 Milestone One Guidelines and Rubric  Overview: For this assignment, you will implement the fundamental operations of create, read, update,

. Develop a program to emulate a purchase transaction at a retail store. This  program will have two classes, a LineItem class and a Transaction class

Retail Transaction Programming Project  Project Requirements:  Develop a program to emulate a purchase transaction at a retail store. This

. The following program contains five errors. Identify the errors and fix them

7COM1028   Secure Systems Programming   Referral Coursework: Secure

. Accepts the following from a user: Item Name Item Quantity Item Price Allows the user to create a file to store the sales receipt contents

Create a GUI program that:Accepts the following from a user:Item NameItem QuantityItem PriceAllows the user to create a file to store the sales receip

. The final project will encompass developing a web service using a software stack and implementing an industry-standard interface. Regardless of whether you choose to pursue application development goals as a pure developer or as a software engineer

CS 340 Final Project Guidelines and Rubric  Overview The final project will encompass developing a web service using a software stack and impleme