
R Programming


set of data we’re going to have you work on is data from an analysis of the levels of an enzyme (HK) in butterfly populations

INSTRUCTIONS TO CANDIDATES

ANSWER ALL QUESTIONS

Part I Regression analysis (40mks)

Data exploration and transformation

All answers must be submitted as a .txt or .R file to the assignment link in module 7 of myelearning.

The first set of data we’re going to have you work on is data from an analysis of the levels of an enzyme (HK) in butterfly populations and a range of climatic conditions, including altitude (ALT), precipitation (PRECIP), minimum temperature (MINTEMP), and maximum temperature (MAXTEMP). You have been provided with these variables in the “butterflies.csv” file. You ought to know how to bring this data into R, so go ahead and do so now.
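As a minimal sketch of the import step: the real “butterflies.csv” is not reproduced here, so the code below first writes a stand-in file with made-up values to a temporary folder and then reads it back. The column names are taken from the description above; the object names `demo`, `csv_path` and `butterflies`, and all of the numbers, are assumptions for illustration only.

```r
# Illustration only: a stand-in "butterflies.csv" with invented values.
set.seed(1)
demo <- data.frame(
  HK      = round(rnorm(16, mean = 90, sd = 15)),
  ALT     = round(runif(16, 0.1, 3), 2),
  PRECIP  = round(runif(16, 10, 60)),
  MAXTEMP = round(runif(16, 70, 105)),
  MINTEMP = round(runif(16, 20, 60))
)
csv_path <- file.path(tempdir(), "butterflies.csv")
write.csv(demo, csv_path, row.names = FALSE)

# Read the file. With the real data, replace csv_path with the path
# to your own copy of butterflies.csv.
butterflies <- read.csv(csv_path, header = TRUE)

str(butterflies)             # variable names and types
print(summary(butterflies))  # quick range check for each column
```

Checking `str()` and `summary()` straight after import is a cheap way to catch a wrong separator or a column that was read as text.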

1) Produce scatter plots of the dependent variable (HK) against the 4 independent variables (ALT, PRECIP, MAXTEMP, MINTEMP). Put these graphs into one figure split into 4 quadrants, and put labels on all axes. Comment on the shape of each of these graphs. Is there a discernible pattern in any of these graphs? Judging by the graphs, which of the independent variables might have more of an effect on the levels of HK? (5mks)

2) Produce histograms of each of the variables, both dependent and independent, and put them in your answer (with labels). Judging from this, which of the variables shows an approximately normal distribution? Conduct basic statistics to give the skewness and kurtosis of each of these variables. Provide the values for kurtosis and skewness for each of these variables and interpret these values. You will need to do some investigation to figure out how these values should be interpreted. (Hint: look up the “moments” package). (6mks)
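The plotting layout and shape statistics asked for in questions 1 and 2 could be sketched as follows. The data frame is simulated (invented values, not the real butterfly data), and `skew()` and `kurt()` are hypothetical helper names written in base R from the standard moment definitions, so the sketch runs without extra packages. The “moments” package provides `skewness()` and `kurtosis()` functions for the same purpose; under these definitions a normal distribution has skewness 0 and kurtosis 3.

```r
# Simulated stand-in for the butterfly data (values are invented).
set.seed(1)
n <- 16
butterflies <- data.frame(
  ALT     = runif(n, 0.1, 3),
  PRECIP  = runif(n, 10, 60),
  MAXTEMP = runif(n, 70, 105),
  MINTEMP = runif(n, 20, 60)
)
butterflies$HK <- 120 - 10 * butterflies$ALT + rnorm(n, sd = 10)

# Question 1: four labelled scatter plots in a 2 x 2 grid.
old_par <- par(mfrow = c(2, 2))
for (v in c("ALT", "PRECIP", "MAXTEMP", "MINTEMP")) {
  plot(butterflies[[v]], butterflies$HK,
       xlab = v, ylab = "HK", main = paste("HK vs", v))
}
par(old_par)

# Question 2: one labelled histogram per variable.
old_par <- par(mfrow = c(2, 3))
for (v in names(butterflies)) {
  hist(butterflies[[v]], xlab = v, main = paste("Histogram of", v))
}
par(old_par)

# Moment-based skewness (third standardised moment) and
# kurtosis (fourth standardised moment, not excess kurtosis).
skew <- function(x) { m <- mean(x); mean((x - m)^3) / mean((x - m)^2)^1.5 }
kurt <- function(x) { m <- mean(x); mean((x - m)^4) / mean((x - m)^2)^2 }

shape <- sapply(butterflies, function(x) c(skewness = skew(x), kurtosis = kurt(x)))
print(round(shape, 3))
```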

Correlation analysis

3) Carry out a correlation on these four independent variables and include your results in your answer. Is there anything we should be concerned about with these correlations? (2mk)
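A correlation matrix for the four independent variables might be produced along these lines. The data are simulated, with MINTEMP deliberately built from MAXTEMP so that one strongly correlated pair shows up; that relationship is an assumption of the sketch and says nothing about the real file. The 0.8 cut-off used to flag pairs is an arbitrary rule of thumb, not a value given in the assignment.

```r
# Simulated predictors (invented values); MINTEMP is constructed from
# MAXTEMP purely so that the example contains a highly correlated pair.
set.seed(1)
n <- 16
MAXTEMP <- runif(n, 70, 105)
butterflies <- data.frame(
  ALT     = runif(n, 0.1, 3),
  PRECIP  = runif(n, 10, 60),
  MAXTEMP = MAXTEMP,
  MINTEMP = MAXTEMP - 40 + rnorm(n, sd = 3)
)

predictors <- butterflies[, c("ALT", "PRECIP", "MAXTEMP", "MINTEMP")]
cor_mat <- cor(predictors)   # Pearson correlation by default
print(round(cor_mat, 2))

# List each pair once (upper triangle) whose |r| exceeds the cut-off.
high <- which(abs(cor_mat) > 0.8 & upper.tri(cor_mat), arr.ind = TRUE)
flagged <- data.frame(
  var1 = rownames(cor_mat)[high[, "row"]],
  var2 = colnames(cor_mat)[high[, "col"]],
  r    = cor_mat[high]
)
print(flagged)
```

Strong correlations between predictors are worth noting at this stage because they foreshadow inflated standard errors in the regression that follows.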

Regression analysis

4) The first regression analysis to be conducted is a regression of HK on ALT, PRECIP, MAXTEMP, and MINTEMP. You should know how to do this based on what was demonstrated in class. Interpret the results for the regression analysis in terms of which variables were significant in the model (at an alpha level of 0.05), how well the model is explaining the variance in the levels of HK, and the significance of the model. Do we have a problem with the variance inflation factors? (Hint: install and run the car package, and also read the reference “An R Companion to Applied Regression – Chapter 6” provided for information about interpreting VIF values). If so, what is this problem and how does this affect our results? Comment on the diagnostic plots. (8mk)

5) Have a look at the anova of the model. Does this differ from the coefficients table of the regression analysis? Why is this? (2mk)

6) Given some of the problems above, you should know that you ought to remove one or more variables. Rerun the regression analysis with this/these variables removed, including the VIF (if necessary; VIF cannot be calculated for a simple linear regression) as well as the 4 diagnostic plots (in a single frame using the par function). Does this resolve the problem? Discuss which final model you have chosen, and provide the results for your model, interpreting the results of the model for the regression analysis in terms of which variable/s were significant in the model (at an alpha level of 0.05), how well the model is explaining the variance in the levels of HK, and the significance of the model. Briefly discuss the effect that each of the variable/s has in your model (i.e. whether an increase in the variable results in an increase or a decrease in the dependent variable). For your chosen model, comment on the diagnostic plots. You should also compare the anova table to the coefficients table in the regression result. (12mk)

7) Plug the values below into your final regression equation to derive a value for HK (this of course depends upon which variable/s you have in the final model). Comment on the utility of using regression equations to extrapolate data. Is this wise? Explain your answer. (5mk)

ALT = 4

PRECIP = 65

MAXTEMP = 110

MINTEMP = -20
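The workflow in questions 4 to 7 could be sketched as below, again on simulated data with invented coefficients. The helper `vif_manual()` is a hypothetical base-R function that computes 1 / (1 - R²) for each predictor regressed on the others; the car package's `vif()` is the tool the assignment points to. Dropping MINTEMP in the reduced model is purely illustrative and is not a recommendation for the real data.

```r
# Simulated data (invented); MINTEMP is built from MAXTEMP to induce collinearity.
set.seed(42)
n <- 16
ALT     <- runif(n, 0.1, 3)
PRECIP  <- runif(n, 10, 60)
MAXTEMP <- runif(n, 70, 105)
MINTEMP <- MAXTEMP - 40 + rnorm(n, sd = 3)
HK      <- 150 - 8 * ALT - 0.6 * MAXTEMP + rnorm(n, sd = 8)
butterflies <- data.frame(HK, ALT, PRECIP, MAXTEMP, MINTEMP)
preds <- c("ALT", "PRECIP", "MAXTEMP", "MINTEMP")

# Question 4: full model, coefficient tests, R-squared and overall F-test.
full <- lm(HK ~ ALT + PRECIP + MAXTEMP + MINTEMP, data = butterflies)
print(summary(full))

# Variance inflation factor for each predictor: 1 / (1 - R^2_j),
# where R^2_j comes from regressing predictor j on the other predictors.
vif_manual <- function(data, predictors) {
  sapply(predictors, function(p) {
    others <- setdiff(predictors, p)
    r2 <- summary(lm(reformulate(others, response = p), data = data))$r.squared
    1 / (1 - r2)
  })
}
vif_full <- vif_manual(butterflies, preds)
print(round(vif_full, 2))

# Question 5: anova() reports sequential sums of squares, so its tests
# depend on the order of terms, unlike the t-tests in summary().
print(anova(full))

# Question 6: an illustrative reduced model with its four diagnostic plots.
reduced <- lm(HK ~ ALT + PRECIP + MAXTEMP, data = butterflies)
old_par <- par(mfrow = c(2, 2))
plot(reduced)
par(old_par)

# Question 7: prediction at the values given in the assignment.
new_obs <- data.frame(ALT = 4, PRECIP = 65, MAXTEMP = 110, MINTEMP = -20)
print(predict(reduced, newdata = new_obs, interval = "prediction"))

# Are the new values inside the range of the data used to fit the model?
in_range <- sapply(preds, function(p) {
  new_obs[[p]] >= min(butterflies[[p]]) && new_obs[[p]] <= max(butterflies[[p]])
})
print(in_range)
```

In this simulated data the new values lie outside the observed ranges by construction; whether that is true of the real file has to be checked against the actual minima and maxima. A prediction made outside those ranges relies on the fitted relationship continuing to hold where there is no data to support it.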

Part II Logistic Regression (45mks)

In this assignment, you will need to carry out some of the diagnostic tests prior to use in logistic regression analyses, conduct the analyses, conduct diagnostic tests afterwards and interpret the results of your analysis. You will be provided the data which you should be able to bring into R.

Having demonstrated this in class, it is expected that you will know how to carry out the analyses being asked of you with minimal assistance. If you are having difficulty, I urge you to review the powerpoints as well as the recordings of the Webex sessions, as well as the example R code which has been made available to you. You may also use the forum on the myelearning site to interact with your peers and instructor if you are in dire need of assistance.

You have been provided with a dataset of species’ presence (incidence) given the area of the island (area), the degree of isolation (isolation), the quality of the habitat on the island (quality), an index for predators on the island (enemy) and an index for competitors on the island (competitor) entitled “island.txt”.
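Reading this file might look like the sketch below. The layout of the real “island.txt” is not shown in the assignment, so the code assumes a whitespace- or tab-delimited file with a header row, and it writes a stand-in file with invented values to demonstrate. The column names follow the description above; everything else is an assumption.

```r
# Illustration only: a stand-in "island.txt" with invented values.
set.seed(7)
n <- 50
demo <- data.frame(
  incidence  = rbinom(n, 1, 0.5),
  area       = round(runif(n, 0.1, 9), 3),
  isolation  = round(runif(n, 2, 10), 3),
  quality    = sample(1:9, n, replace = TRUE),
  enemy      = sample(1:3, n, replace = TRUE),
  competitor = sample(1:5, n, replace = TRUE)
)
txt_path <- file.path(tempdir(), "island.txt")
write.table(demo, txt_path, sep = "\t", row.names = FALSE, quote = FALSE)

# read.table() splits on any whitespace by default, which covers tabs.
# With the real data, replace txt_path with the path to your copy.
island <- read.table(txt_path, header = TRUE)

str(island)
print(table(island$incidence))  # counts of absences (0) and presences (1)
```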

8) Comment (using the hashtag #) on what you might expect to find with respect to the presence of this species on these island habitats. How might these explanatory variables be expected to influence species presence?

9) Plot normal qqplots of the explanatory variables. Do you see any problems with these? (3mks)

10) Investigate the explanatory variables to determine whether or not we have a problem with multicollinearity (5 mks).

11) Run a logistic model with incidence being the dependent variable, and with area, isolation, quality, enemy, and competitor being the explanatory variables. Check the variance inflation factors for this model and interpret them. (5mks)

12) Run a summary of the model and interpret the model in terms of what explanatory variables are significant, and, for those variables that are significant, interpret the log odds of this/these variables in how they change the dependent variable. (3mks)

13) Calculate the odds ratio for the model. Again interpret the odds ratio in the way that it affects the dependent variable for the explanatory variable/s that were significant in the model. (3mks)

14) Calculate whether or not the model is significant. (4mks)

15) Look at the influence measures of the model. Are there any data points we should be concerned about? (2mks)

16) Find the deviance residuals and the Pearson residuals for the model. Do any of these residual values conform with what you observed in question 15? Plot histograms of both the deviance and Pearson residuals, and comment on the shape of these histograms. (4mks)

17) Given the fact that some of the explanatory variables in question 12 didn’t come out as being significant, how would you rerun this model (i.e. what variable/s would you include in the model?). Run this model and interpret the results as you did before (steps 12 to 16). (16mks)
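The steps in questions 9 to 17 could be sketched as follows on simulated data. The coefficients used to generate `incidence` are invented, so nothing here indicates which variables matter in the real “island.txt”. The reduced model with area and isolation is an illustrative choice only. The call to `car::vif()` is skipped if the car package is not installed.

```r
# Simulated stand-in for island.txt (all values invented).
set.seed(7)
n <- 50
island <- data.frame(
  area       = runif(n, 0.1, 9),
  isolation  = runif(n, 2, 10),
  quality    = sample(1:9, n, replace = TRUE),
  enemy      = sample(1:3, n, replace = TRUE),
  competitor = sample(1:5, n, replace = TRUE)
)
island$incidence <- rbinom(n, 1, plogis(1 + 0.5 * island$area - 0.4 * island$isolation))

# Question 9: normal Q-Q plots of the explanatory variables.
old_par <- par(mfrow = c(2, 3))
for (v in c("area", "isolation", "quality", "enemy", "competitor")) {
  qqnorm(island[[v]], main = paste("Normal Q-Q:", v))
  qqline(island[[v]])
}
par(old_par)

# Questions 11 and 12: full logistic model and its summary (log-odds scale).
fit <- glm(incidence ~ area + isolation + quality + enemy + competitor,
           family = binomial, data = island)
print(summary(fit))
if (requireNamespace("car", quietly = TRUE)) print(car::vif(fit))

# Question 13: odds ratios with Wald 95% confidence intervals.
or_table <- exp(cbind(OR = coef(fit), confint.default(fit)))
print(round(or_table, 3))

# Question 14: likelihood-ratio test of the model against the null model.
lr_stat <- fit$null.deviance - fit$deviance
lr_df   <- fit$df.null - fit$df.residual
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
print(c(chi_sq = lr_stat, df = lr_df, p_value = lr_p))

# Question 15: observations flagged by at least one influence measure.
infl <- influence.measures(fit)
print(which(apply(infl$is.inf, 1, any)))

# Question 16: deviance and Pearson residuals with histograms.
dev_res  <- residuals(fit, type = "deviance")
pear_res <- residuals(fit, type = "pearson")
old_par <- par(mfrow = c(1, 2))
hist(dev_res, xlab = "Deviance residual", main = "Deviance residuals")
hist(pear_res, xlab = "Pearson residual", main = "Pearson residuals")
par(old_par)

# Question 17: an illustrative reduced model, compared with the full model.
fit2 <- update(fit, . ~ area + isolation)
print(anova(fit2, fit, test = "Chisq"))
```

Exponentiating a coefficient converts a change in log odds into an odds ratio: a value above 1 means the odds of presence rise as the variable increases, and a value below 1 means they fall.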
