Task 2.1) Conduct and report on an exploratory data analysis (EDA) of the academic-salaries.csv data set using the RapidMiner Studio data mining tool. Note that this will require the use of a number of RapidMiner operators.
Provide the following for Task 2.1:
(i) A screen capture of your final EDA process and a brief description of that process.
(ii) Summarise the key results of your exploratory data analysis in Table 2.1, Results of Exploratory Data Analysis for academic-salaries.csv. Table 2.1 should include the key characteristics of each variable in the academic-salaries.csv data set, such as maximum and minimum values, average, standard deviation, most frequent value (mode), missing values and invalid values.
(iii) Discuss the key results of the exploratory data analysis presented in Table 2.1 and provide a rationale for selecting the top 5 variables for predicting academic salaries (salary). Focus in particular on the relationships of the independent variables with each other and with the dependent variable, academic salaries (salary), drawing on the results of your EDA and on relevant literature on the determinants of academic salaries. (25 marks, 300 words)
Hint: The Statistics tab and Charts tab in RapidMiner Studio provide a lot of descriptive statistical information and the ability to create useful charts for EDA, such as bar charts, scatter plots and box plots. You might also like to run correlations and/or chi-square tests, as appropriate, to determine which variables contribute most to predicting academic salaries (salary).
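The statistics requested for Table 2.1 and the correlation check mentioned in the hint can be cross-checked outside RapidMiner. The following is a minimal Python sketch of that idea, not part of the required RapidMiner process. The column names (years_service, rank) and all values are hypothetical toy data and are not taken from academic-salaries.csv; the helper names summarise and pearson are likewise illustrative.

```python
import math
import statistics
from collections import Counter


def summarise(values):
    """Table 2.1-style summary of one column; None or '' counts as missing."""
    present = [v for v in values if v is not None and v != ""]
    summary = {
        "missing": len(values) - len(present),
        "mode": Counter(present).most_common(1)[0][0] if present else None,
    }
    # Numeric statistics only make sense when every present value is a number.
    if present and all(isinstance(v, (int, float)) for v in present):
        summary["min"] = min(present)
        summary["max"] = max(present)
        summary["mean"] = statistics.mean(present)
        summary["stdev"] = statistics.stdev(present) if len(present) > 1 else 0.0
    return summary


def pearson(xs, ys):
    """Pearson correlation over the rows where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


# Toy values for illustration only, NOT taken from academic-salaries.csv.
years_service = [1, 3, 5, None, 10]   # hypothetical numeric column
salary = [50, 60, 70, 80, 100]        # hypothetical, in thousands
rank = ["AsstProf", "AsstProf", "Prof", "Prof", "Prof"]  # hypothetical nominal column

years_summary = summarise(years_service)
rank_summary = summarise(rank)
r_years_salary = pearson(years_service, salary)
print(years_summary)
print(rank_summary)
print(round(r_years_salary, 3))
```

Nominal columns such as the hypothetical rank receive only a mode and a missing-value count, which mirrors how Table 2.1 would treat categorical variables. A chi-square test, not a correlation, would be the appropriate check between two nominal variables.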
Task 2.2) Build and report on your Linear Regression model for predicting academic salaries (salary) using a RapidMiner data mining process, an appropriate set of data mining operators and a reduced set of variables from the academic-salaries.csv data set, as determined by your exploratory data analysis in Task 2.1.
Provide the following for Task 2.2:
(i) A screen capture of your Final Linear Regression Model process and a brief description of that process.
(ii) Table 2.2, named Results of Final Linear Regression Model for Task 2.2, for the academic-salaries.csv data set.
(iii) Discuss the results of the Final Linear Regression Model for the academic-salaries.csv data set, drawing on its key outputs (coefficients, standardised coefficients, t-statistics, p-values, significance levels etc.) for predicting academic salaries (salary) and on relevant supporting literature on the interpretation of a Linear Regression Model. (20 marks, 200 words)
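To make the outputs listed in (iii) concrete, the sketch below computes a coefficient, its standard error, a t-statistic and a standardised coefficient by ordinary least squares for a single predictor. It is an illustration under stated assumptions only: the Task 2.2 model uses several predictors and must be built in RapidMiner, the function name simple_ols is hypothetical, and the toy numbers are not taken from academic-salaries.csv. The p-value is omitted because it requires the t distribution, which the Python standard library does not provide.

```python
import math


def simple_ols(xs, ys):
    """Ordinary least squares for one predictor: y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual sum of squares, then the standard error of the slope
    # using n - 2 degrees of freedom.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    se_slope = math.sqrt(sse / (n - 2) / sxx)

    return {
        "slope": slope,
        "intercept": intercept,
        "se_slope": se_slope,
        "t_stat": slope / se_slope,
        # Standardised coefficient: the slope rescaled by sd(x) / sd(y).
        "std_coef": slope * math.sqrt(sxx / syy),
    }


# Toy values for illustration only, NOT taken from academic-salaries.csv.
years_service = [1, 2, 3, 4, 5]   # hypothetical predictor
salary = [52, 58, 66, 69, 80]     # hypothetical, in thousands

model = simple_ols(years_service, salary)
for name, value in model.items():
    print(f"{name}: {value:.3f}")
```

The unstandardised slope is read in the units of the data (change in salary per additional unit of the predictor), whereas the standardised coefficient allows predictors measured on different scales to be compared. A large absolute t-statistic indicates that the coefficient is unlikely to be zero.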
Include in Report 2 all appropriate outputs, such as RapidMiner processes, graphs and tables, that support key aspects of your exploratory data analysis and linear regression model analysis of the academic-salaries.csv data set.
Note: export processes and graphs from RapidMiner using the File/Print/Export Image option and include them in the Task 2 section or in Appendix 2 of Report 2.
Report quality: structure, presentation, writing style and referencing (worth 10 marks)
Your Report 2 must be presented in report format, written in an appropriate style and supported where required with appropriate in-text references using the Harvard referencing style.
Your Report 2 must be structured in report format as follows:
Report 2 Cover/Title Page
Table of Contents
Body of Report 2 – Task 1 main heading with appropriate sub headings Task 1.1, Task 1.2, Task 2 main heading with appropriate sub headings Task 2.1 and Task 2.2…etc
List of References
List of Appendices
Submit one file for Report 2, covering Tasks 1 and 2, in Word document format with the extension .docx.
You must use the following file naming convention:
Student_no_Student_name_CIS8008_R2.docx