1. Setting up the project and exploratory analysis (10%)
a. Create a new project and create a data source based on the selected dataset. Set the role of Price to Target and make sure the Role and Level assigned to each variable are correct.
b. Carry out data exploration using a StatExplore node. Explain your findings with regard to your property dataset.
c. Create a Data Partition with 70% of the data for training and 30% for validation.
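For readers who want to see what the 70/30 partition in step c does, here is a minimal sketch in Python. It is an illustration only and is not how the Data Partition node is configured; the function name `partition`, the fixed seed, and the toy rows are all assumptions made for this example.

```python
import random

def partition(rows, train_fraction=0.7, seed=42):
    """Shuffle rows reproducibly, then split them into training and validation sets."""
    shuffled = list(rows)                    # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)    # fixed seed (assumed) keeps the split repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical stand-in for the property records: ten row identifiers.
rows = list(range(10))
train, valid = partition(rows)
print(len(train), len(valid))
```

Every record lands in exactly one of the two sets, so the validation set gives an estimate of model error on data the model did not see during training.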
2. Decision tree-based modeling and analysis (25%)
Carry out the following modeling tasks for the selected property value dataset.
a. Create two Decision Tree models: one using two-way splits and one using three-way splits.
For each decision tree,
I. How many leaves are in the optimal tree?
II. Which variable was used for the first split?
III. What were the competing splits for this first split?
b. Which of the decision tree models appears to be better? Justify your answer.
c. Refer to the selected decision tree model in part (b) and
I. Identify leaf nodes which have good predictive performance (two leaf nodes) and poor predictive performance (two leaf nodes).
II. Provide justifications for your selections.
III. Write down the rules for the pathways leading up to each selected leaf node.
a. In preparation for regression, is any imputation of missing values needed? If yes, should you do this imputation before generating the decision tree models? Why or why not?
b. Use an Impute node connected to the Data Partition node to handle missing values. Which variables have been imputed?
c. Are there any ordinal variables? Use the Replacement node to assign relevant values.
d. Conduct data exploration with the Variable Clustering node to select the best variables for the model. Explain your findings.
e. Create a Regression model using the set of variables you identified as suitable in part d. You can choose stepwise selection and use validation error as the selection criterion.
f. Run the Regression node and view the results.
I. Which variables are included in the final model? Explain what this means to the real estate company (very briefly).
II. What is the validation ASE? What does this mean in a predictive model?
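As a reminder of what ASE (average squared error) measures, the sketch below computes it by hand in Python. The function name `average_squared_error` and the price figures are hypothetical and are not taken from the dataset; the value to report for the assignment is the one shown in the Regression node results.

```python
def average_squared_error(actual, predicted):
    """Mean of the squared differences between actual and predicted target values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical validation prices and model predictions (illustrative numbers only).
actual = [200.0, 350.0, 500.0]
predicted = [210.0, 340.0, 520.0]

ase = average_squared_error(actual, predicted)
print(ase)  # (100 + 100 + 400) / 3 = 200.0
```

A lower ASE on the validation set means the predictions sit closer to the actual prices on data that was held out from training.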