Applied Statistics

INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS

Please read the instructions carefully.

Submit only one file in pdf format to the link on the Study Desk.

Assume that your report will be read by someone familiar with the data set but with limited statistical knowledge. Fully explain plots and when stating statistics or results explain what they mean statistically AND in context of the data.

Presentation should be neat, consistent, spell-checked and proofread. All questions should be clearly labelled, and all answers should clearly and concisely address the questions.

If you convert a Word document to pdf for submission check that all symbols, equations etc. have converted correctly, i.e. proof-read your work.

If you do not use Rmarkdown to compile your submission, where asked to provide R code, paste relevant code within the assignment document and italicise (or otherwise highlight or distinguish from other content). Do not include code in an appendix.

Do not include an appendix at all. Any work included in an appendix will not be marked.

Please note that referencing textbooks and other resources is not the goal of this assessment. This work requires students to demonstrate their understanding of the analysis and interpretation, not provide quotes from resources.

Use only statistical methods covered in this course and do not transform variables to try to normalise them.

When interpreting output you are expected to do so in context of the data and the method (i.e. ensure that you comment on aspects of the method that affect your interpretation with respect to the variables and sample).

A maximum of 10 marks will be deducted from your total marks for poor presentation.

Marks:

Question 1: 32

Question 2: 29

Question 3: 39

 

Data Files

Three data sets will be used for this assignment.

Question 1: film_2023.txt which contains data for the thickness of pieces of plastic film measured in different positions after being cut.

Question 2: iris_2023.txt which contains data for variables of flower characteristics for different species of iris.

Question 3: usair_2023.dat which contains data for air quality variables measured across different United States cities.

 

Question 1 [32 marks]

Provide R code, output and written interpretation for parts a) to d) of this question. Provide only output that is directly relevant to address each section.

Test for multivariate normality (MVN) by:

a). Provide output from the structure function (0.5 mark) and describe the structure of the ‘film_2023.txt’ data (2.5 marks). (3 marks total)

b). Produce (2 marks) and interpret (4 marks) univariate QQ plots, histograms and univariate Shapiro-Wilk tests of normality for each of the four film thickness variables. What is the default univariate test produced by the mvn function? (1 mark) (7 marks total)

 

c). Produce (2 marks) and interpret (4 marks) perspective and contour plots for the TopRight and TopLeft film thickness variables. What is an inherent problem with using these plots to assess MVN (1 mark)? (7 marks total)

d). Do the analysis necessary to provide the results of the Mardia, Henze-Zirkler and Royston tests of MVN based on all four film thickness variables. Include in your interpretation: (13 marks total)

The Chi-Square QQ plot (1 mark) and interpretation (2 marks).

Describe how the QQ plot is constructed and its relationship to the univariate normal QQ plots (4 marks).

Output (1 mark) and interpretation (3 marks) for the 3 tests.

What is a key limitation of these MVN statistical tests (2 marks)?
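As a minimal sketch of how a Chi-Square QQ plot can be constructed in base R: each observation's squared Mahalanobis distance from the sample mean vector is ordered and plotted against chi-square quantiles with degrees of freedom equal to the number of variables. The built-in iris data is used here only as a stand-in (an assumption; film_2023.txt is not reproduced in this document), and the object names X, d2 and q are illustrative.

```r
# Stand-in data: the four numeric columns of R's built-in iris data set.
# This is an assumption for illustration; substitute the film thickness variables.
X <- as.matrix(iris[, 1:4])
n <- nrow(X)  # number of observations
p <- ncol(X)  # number of variables

# Squared Mahalanobis distance of each observation from the sample mean vector
d2 <- mahalanobis(X, center = colMeans(X), cov = cov(X))

# Theoretical chi-square quantiles with p degrees of freedom
q <- qchisq(ppoints(n), df = p)

# Ordered distances against theoretical quantiles, with a y = x reference line
plot(q, sort(d2),
     xlab = "Chi-square quantiles",
     ylab = "Ordered squared Mahalanobis distances",
     main = "Chi-Square QQ plot")
abline(0, 1)
```

With a single variable the squared Mahalanobis distance reduces to a squared z-score, which is one way to relate this plot to the univariate normal QQ plots.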

 

e). If your data does not meet MVN, why might you need to consider the ratio of cases to variables? (This question does not necessarily relate specifically to this particular data set) (2 marks total)

 

 

 

 

Question 2 [29 marks]

Provide R code, output and written interpretation for parts a), b), c), e) and f) of this question. Assume the data meets the assumption of MVN (do not test for MVN).

a). Use the structure function and describe the structure of the ‘iris_2023.txt’ dataset (1 mark). In the context of MANOVA list the dependent and independent variables and define the relationship that the MANOVA would test (2 marks). What type of variable does SPECIES need to be for MANOVA (1 mark)? Make sure you have converted this variable if necessary before attempting the analysis in later parts of Question 2. (4 marks total)

b). Produce a draftsman display for four flower characteristic variables (2 marks). Use the function scatterplotMatrix (from week 2) for the draftsman and check the help documentation (?scatterplotMatrix) to help you produce the plot with observations grouped by species using different colours and include the associated legend. Your plot should not include smoothing, regression lines, or distribution curves in the diagonal panels of the plot (1 mark). Interpret these plots (3 marks). What are the y and x axes on plot [3,2] of the scatterplotMatrix (1 mark)? (7 marks total)

Hint: to move the legend in scatterplotMatrix try something like: legend=list(coords="bottomleft").

c). Using MANOVA in R, test for differences in ‘flower characteristics’ between the three species. Include results using all four test statistics (2 marks) covered in this course and interpret output (2 marks). Include a sentence about what these results mean in terms of within and between group variances in general (2 marks). (6 marks total)

d). Which of the four tests used in part c) would be the best to interpret if there are concerns about multivariate normality or covariance equality? (1 mark total)
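The following is a hedged sketch of how the four MANOVA test statistics available in base R can be requested from a single fitted model. It assumes the built-in iris data as a stand-in for iris_2023.txt (whose column names may differ) and assumes that the four statistics covered in the course are the ones offered by summary() for manova objects.

```r
# Stand-in data: built-in iris, where Species is already a factor.
# Column names in iris_2023.txt may differ (assumption).
fit <- manova(cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~ Species,
              data = iris)

# The four test statistics accepted by summary() for a manova fit
tests <- c("Pillai", "Wilks", "Hotelling-Lawley", "Roy")

# One summary per test statistic, stored in a named list
results <- lapply(tests, function(t) summary(fit, test = t))
names(results) <- tests

# Print each summary in turn
for (t in tests) print(results[[t]])
```

Fitting the model once and looping over the test names keeps the four outputs directly comparable, since they all come from the same fit.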

e). Perform analysis that specifically compares each of the species with each other (you should have 3 comparisons) using Hotelling’s T² test and a significance level of 0.05. Determine the multiple test corrected significance level (1 mark). Do not provide R output; instead reproduce and complete the following table for all comparisons (3 marks) and interpret (3 marks). (7 marks total)
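One common way to obtain a multiple test corrected significance level is the Bonferroni correction, which divides the overall level by the number of comparisons. The sketch below assumes that Bonferroni is the correction intended (the brief does not name one); the variable names are illustrative.

```r
# Overall significance level stated in the question
alpha <- 0.05

# Number of pairwise comparisons among three species
m <- choose(3, 2)

# Bonferroni-corrected per-comparison significance level (assumed method)
alpha_corrected <- alpha / m
alpha_corrected
```

Equivalently, the raw p-values can be adjusted with p.adjust(p, method = "bonferroni") and compared against the original 0.05.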
