Use the cc.tr and cc.te data sets. The datasets summarize the usage behavior of active credit card holders over six months.
The project instructions are shown in bold to distinguish them from your work. Your work should not be in bold.
• Work neatly. Aim for a professional-looking presentation. I will be grading your level of professionalism, as well as your English expression.
• Make sure all graphs and tables fit neatly on the page.
• Neither add nor delete pages. Use single-spacing.
For all text output, surround it with a text box. In Word, select the text output, then Insert > Text Box > Draw Text Box.
Apart from this document, which you will save as a PDF and submit, you must submit your R script containing the code you used to solve the problems. The R script should be neat and easily understandable by other people. It should be well annotated, describing what you are doing so that anyone could understand it.
This Project is brand new and may have typos, errors, etc., that I have missed. Please report these to me as soon as possible. For this and other reasons, this Project is subject to change at any time (though of course I will be reasonable).
I am aware that I am not asking about validating the regression assumptions here. Of course, you should do so in the real world. But this project is long enough, and I think you deserve a break.
Standardize the predictors, but not the target, balance.
Make sure you do not include the target balance when creating clusters.
You may need to convert the data sets to data frames.
Install and load the caret, psych, kohonen, and NbClust packages.
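The setup steps above can be sketched as follows. This is only an illustration on a small made-up data frame standing in for cc.tr: the name balance comes from the project text, while cc, cc.z, cc.pred, and the predictor names purchases and payments are hypothetical.

```r
# Run once if the packages are missing:
# install.packages(c("caret", "psych", "kohonen", "NbClust"))

set.seed(1)
# Made-up stand-in for cc.tr; only the column name "balance" is from the project
cc <- data.frame(
  balance   = runif(200, 0, 5000),
  purchases = rexp(200, 1 / 800),    # hypothetical predictor
  payments  = rexp(200, 1 / 1200)    # hypothetical predictor
)
cc <- as.data.frame(cc)  # harmless if the object is already a data frame

# Standardize every predictor, leaving the target balance untouched
predictors <- setdiff(names(cc), "balance")
cc.z <- cc
cc.z[predictors] <- lapply(cc[predictors], function(x) as.numeric(scale(x)))

# Predictors only, for clustering (the target is excluded)
cc.pred <- cc.z[predictors]
```

Keeping the predictor-only data frame separate from the full one makes it harder to pass balance to a clustering function by accident.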
1. Insert your Executive Summary here. (A strategy for this is given at the end.)
2. Import the cc.tr training data set. Use the CH Criterion to determine the optimal number of clusters, for k = 2 … 8. Provide the CH values for all candidate models. Let the winning value of k be denoted as CHK1, and the runner-up as CHK2. Report CHK1 and CHK2.
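One possible way to tabulate the CH values is sketched below, assuming CH refers to the Calinski-Harabasz index, CH(k) = [B / (k - 1)] / [W / (n - k)], where B and W are the between-cluster and total within-cluster sums of squares. The data here are made up, and X, ch.table, and ranked are hypothetical names. The NbClust package also offers a CH index; this sketch computes it directly from the kmeans output so that every step is visible.

```r
set.seed(1)
# Made-up standardized predictors standing in for the cc.tr predictors
X <- scale(rbind(
  matrix(rnorm(200, mean = 0), ncol = 2),
  matrix(rnorm(200, mean = 4), ncol = 2),
  matrix(rnorm(200, mean = 8), ncol = 2)
))
n  <- nrow(X)
ks <- 2:8

# CH index for each candidate k, from the k-means sums of squares
ch <- sapply(ks, function(k) {
  km <- kmeans(X, centers = k, nstart = 25)
  (km$betweenss / (k - 1)) / (km$tot.withinss / (n - k))
})
ch.table <- data.frame(k = ks, CH = ch)

# Larger CH is better: the winner and the runner-up
ranked <- ch.table[order(-ch.table$CH), ]
CHK1 <- ranked$k[1]
CHK2 <- ranked$k[2]
print(ch.table)
```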
3. We introduce a third measure of cluster goodness: the Predictive Clustering Criterion. The Predictive Clustering Criterion selects the value of k that obtains the best predictive metrics, when the clusters are used as the sole predictors.
The following rather inelegant code will work to get this done, for k = 4. I would like to hear from an R expert who can show how to do this more elegantly.
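# Cluster the predictor data set ccnbs into k = 4 groups
k4 <- kmeans(ccnbs, centers = 4, nstart = 25)
# Store the cluster labels as a one-column data frame holding a factor
k4cluster <- k4$cluster
k4cluster <- as.factor(k4cluster)
k4cluster <- data.frame(k4cluster)
# One-hot encode the cluster labels (dummyVars is from the caret package)
dummy_model <- dummyVars(" ~ .", data = k4cluster)
k4cluster.d <- predict(dummy_model, newdata = k4cluster)
# Regress balance on the dummies; one dummy is redundant with the intercept, so lm() reports NA for it
reg4 <- lm(CC$balance ~ k4cluster.d)
summary(reg4)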
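A more compact variant of the k = 4 code above is sketched here, relying on the fact that lm() dummy-codes a factor predictor on its own, so the dummyVars step can be dropped. The data are made up, fit_k and pcc.table are hypothetical names, and the choice of R-squared, adjusted R-squared, and residual standard error as the "predictive metrics" is an assumption, since the project text does not name them.

```r
set.seed(1)
# Made-up data: two hypothetical standardized predictors and the target balance
n <- 300
X <- scale(matrix(rnorm(2 * n), ncol = 2))
balance <- 1000 + 500 * X[, 1] + rnorm(n, sd = 200)

# Fit k-means for one k, then regress balance on cluster membership alone
fit_k <- function(k) {
  km <- kmeans(X, centers = k, nstart = 25)
  cluster <- factor(km$cluster)
  reg <- lm(balance ~ cluster)   # lm() creates the dummy variables itself
  s <- summary(reg)
  data.frame(k = k,
             r.squared     = s$r.squared,
             adj.r.squared = s$adj.r.squared,
             rse           = s$sigma)
}

# One row of metrics per candidate k
pcc.table <- do.call(rbind, lapply(2:8, fit_k))
print(pcc.table)
```

Because the factor is coded against a reference cluster, no coefficient comes back as NA, and the same function serves every candidate k.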
4. Thoroughly discuss your findings from the table in the previous problem. Then, weighing each of the criteria, attempt to find a consensus as to the optimal value of k.
5. Use k-means clustering to develop clusters, using k = MSK1. Then construct a nice table in Word containing the record count and the means for all variables and all clusters. Highlight the max and min means using green and red font. Then provide brief but comprehensive profiles of each cluster.
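The record counts and cluster means for such a table might be assembled as in the sketch below. The data are made up, k = 3 is a placeholder rather than the actual value of MSK1, and cc, profile, and the predictor names are hypothetical. The sketch assumes the table reports means of the standardized predictors alongside the unstandardized balance.

```r
set.seed(1)
# Made-up stand-in for the training data; only "balance" is a name from the project
cc <- data.frame(
  balance   = runif(300, 0, 5000),
  purchases = as.numeric(scale(rexp(300))),  # hypothetical standardized predictor
  payments  = as.numeric(scale(rexp(300)))   # hypothetical standardized predictor
)
predictors <- c("purchases", "payments")

k <- 3  # placeholder value only; the project calls for k = MSK1
km <- kmeans(cc[predictors], centers = k, nstart = 25)
cc$cluster <- km$cluster

# Mean of every variable within each cluster, plus the record count
profile <- aggregate(. ~ cluster, data = cc, FUN = mean)
profile$n <- as.vector(table(cc$cluster)[as.character(profile$cluster)])

# Row positions of the largest and smallest mean for each variable
vars <- c("balance", predictors)
max.row <- sapply(profile[vars], which.max)
min.row <- sapply(profile[vars], which.min)
print(profile)
```

The max.row and min.row vectors indicate which cells to color when the table is rebuilt in Word.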
6. Use Kohonen networks clustering to develop clusters, using k = MSK1 and a hexagonal structure. Then construct a nice table in Word containing the record count and the means for all variables and all clusters. Highlight the max and min means using green and red font. Copy the nice table from the k = MSK1 k-means clustering model.
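A minimal sketch of the Kohonen step with the kohonen package follows. It assumes that each unit of the map is treated as one cluster, so the grid dimensions multiply to k. The value k = 4 is a placeholder rather than the actual MSK1, the data are made up, and Z, som.fit, som.profile, and the variable names v1 to v3 are hypothetical.

```r
library(kohonen)  # CRAN package; install.packages("kohonen") if needed

set.seed(1)
# Made-up standardized predictors; v1..v3 are hypothetical names
Z <- scale(matrix(rnorm(900), ncol = 3,
                  dimnames = list(NULL, c("v1", "v2", "v3"))))

k <- 4  # placeholder value only; the project calls for k = MSK1
# Assumption: one SOM unit per cluster, so a 2 x 2 hexagonal grid gives 4 units
som.fit <- som(Z, grid = somgrid(xdim = 2, ydim = 2, topo = "hexagonal"))
som.cluster <- som.fit$unit.classif   # winning unit for each record

# Record count and variable means per SOM cluster
som.df <- data.frame(Z, cluster = som.cluster)
som.profile <- aggregate(. ~ cluster, data = som.df, FUN = mean)
som.profile$n <- as.vector(table(som.cluster)[as.character(som.profile$cluster)])
print(som.profile)
```

For a value of k that does not factor neatly, a single row of units (xdim = k, ydim = 1) is one alternative layout.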