Calculate the correlation coefficient and conclude whether any relationship exists
INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS
Assignment:
- The data set cars contains paired observations of the speed and stopping distance of cars, arranged in a data frame. You are asked to do the following:
  - Draw a scatterplot of the two variables contained in the data frame cars. Discuss the relationship between the two variables. (2 marks)
  - Calculate the correlation coefficient and state whether any relationship exists between the two variables. (2 marks)
  - Draw a histogram of each of the two variables. Comment on the shape of the two distributions. (4 marks)
  - Draw a box plot for each of the two variables and explain the plot. (4 marks)
- Consider a population for which the height follows a normal distribution with a mean of 170 cm and a standard deviation of 9 cm.
  - Simulate the height of 1000 people and save the result in a vector called heights. (2 marks)
  - Plot the simulated heights against the theoretical density function for this distribution, and comment on the distribution. (2 marks)
  - What is the minimum height required to belong to the tallest 10% of this population? (2 marks)
  - What percentage of the population is shorter than 164 cm? (2 marks)
- Consider the built-in data set cars, which consists of two variables, speed and braking distance.
  - Fit a simple linear regression model of distance (dependent variable) on speed (independent variable) and check its summary. (4 marks)
  - Graphically explore the four assumptions of the linear regression model, perform relevant tests, and discuss the conclusions. (8 marks)
- Install the datarium package, which contains the data set marketing. This data set consists of four variables and 200 records: advertising expenditure in three media (YouTube, Facebook, and Newspaper) and the sales (which is the dependent variable). With the marketing data:
  - Discuss the correlations among the variables and explore the top ten rows of the data set. (2 marks)
  - Split the data into an in-sample set (the first 150 records) and an out-of-sample set (the last 50 records). (2 marks)
  - Using the in-sample data, build a simple linear regression model with the single independent variable "youtube". Report and discuss the summary of the model. (6 marks)
  - Based on the results above, build a multiple linear regression model with more than one independent variable. Compare and discuss the model results and choose the best model, with justification. (8 marks)
  - Use the chosen model to predict the sales for the expenditures in the out-of-sample data. (2 marks)
- In this task, you will be working with the “Default of Credit Card Clients” data set, for which you can find full details regarding the variables here. For convenience, the data is also attached below, named "default of credit card clients.xls". Once you load the data:
  - Randomly split the data into two sets: in-sample (75% of the data) and out-of-sample (the remaining 25%). (2 marks)
  - Using the in-sample data, fit a suitable logistic regression model (note that the solution might not be unique). (8 marks)
  - Produce the classification table (confusion matrix). (4 marks)
  - Use the fitted model and the out-of-sample data to make predictions. Produce the classification table (confusion matrix) for these predictions. (4 marks)
- Regression does not provide proof of a causal link between the dependent variable and an independent variable. Explain why this is the case. (10 marks)
- Write a short essay (500 words) that discusses your understanding of how business statistics could be beneficial in the post-COVID economic recovery, with statements and justifications. (20 marks)
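
For the height question above (normal distribution, mean 170 cm, standard deviation 9 cm), the simulation, the cutoff for the tallest 10%, and the share below 164 cm can be sketched as follows. This is only an illustrative sketch in Python using the standard library; the brief itself appears to expect R, where `rnorm`, `qnorm`, and `pnorm` play the same roles. The variable names other than `heights` and the random seed are assumptions made for the example.

```python
from statistics import NormalDist

# Population model stated in the brief: mean 170 cm, standard deviation 9 cm.
height_dist = NormalDist(mu=170, sigma=9)

# Simulate 1000 heights. The seed is an arbitrary choice for reproducibility.
heights = height_dist.samples(1000, seed=42)

# Minimum height to be among the tallest 10%: the 90th percentile (quantile 0.90).
tallest_10_cutoff = height_dist.inv_cdf(0.90)

# Proportion of the population shorter than 164 cm: the CDF evaluated at 164.
share_below_164 = height_dist.cdf(164)

print(f"Cutoff for tallest 10%: {tallest_10_cutoff:.2f} cm")
print(f"Share shorter than 164 cm: {share_below_164:.2%}")
```

Under this model the cutoff comes out at roughly 181.5 cm and the share below 164 cm at roughly 25%. The simulated vector `heights` is what would be plotted against the theoretical density in the second part of the question.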