R Programming

Fit a multiple logistic regression model with the census region, the logarithm of population density

INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS

Directions: Complete the following questions. The questions have been separated into 4 parts of similar material. Parts 1, 2, and 3 will only use the corona train data, while Part 4 will use the corona test data. Use the Markdown starter file here: hw5 starter.Rmd.

Part 1 - Odds 

1. Using the training dataset, compute the odds that a county has reported a Coronavirus-related death. (2pts) 
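As a minimal sketch of the odds calculation in question 1, the snippet below uses a small made-up data frame rather than the real training data. The column name `death` (1 if a county has reported a death, 0 otherwise) is an assumption; the actual dataset may name or code this variable differently.

```r
# Hypothetical toy data: `death` is 1 if a county has reported a death, 0 otherwise.
# The real column name and coding in the training data may differ.
toy <- data.frame(death = c(1, 1, 1, 0, 1, 0, 1, 1, 0, 1))

counts <- xtabs(~ death, data = toy)   # counts of 0s and 1s
n_yes  <- as.numeric(counts["1"])      # counties with a reported death
n_no   <- as.numeric(counts["0"])      # counties without one

prob <- n_yes / (n_yes + n_no)         # proportion of counties with a death
odds <- n_yes / n_no                   # odds = prob / (1 - prob)
```

The odds are the ratio of "yes" counts to "no" counts, which is the same quantity as prob / (1 - prob).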

2. Do the odds of a Coronavirus-related death vary by Census region? Compute the odds that a county has reported a Coronavirus-related death for each Census region within the United States. Compare these values to address the question. (3pts) 
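One possible way to get per-region odds, sketched on made-up data: cross-tabulate region against the outcome with xtabs() and divide the two columns. The column names `region` and `death` and the two regions shown are assumptions for illustration only.

```r
# Hypothetical toy data with two regions; the real data has more regions and rows.
toy <- data.frame(
  region = c(rep("Northeast", 4), rep("South", 4)),
  death  = c(1, 1, 1, 0,   1, 0, 0, 0)
)

tab <- xtabs(~ region + death, data = toy)   # rows: region, columns: death = 0 / 1

# Odds within each region: counties with a death divided by counties without.
odds_by_region <- tab[, "1"] / tab[, "0"]
```

Comparing the entries of `odds_by_region` side by side is one simple way to address whether the odds differ across regions.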

Part 2 - Simple Logistic Regression

3. Build a plot (or plots) to explore how the logarithm of the population density predicts whether a county has recorded a coronavirus-related death. Briefly discuss the results of your plot. (2pts) 

4. Build a simple logistic model to statistically determine if the logarithm of the population density predicts the probability a county has reported a Coronavirus-related death. Support your findings with an appropriate hypothesis test. (3pts) 
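For question 4, a rough sketch of a simple logistic fit with two common tests of the slope (a Wald test from summary() and a likelihood ratio test from anova()). The data are simulated, and the names `log_density` and `death` are assumptions, not the real column names.

```r
set.seed(1)

# Simulated stand-in for the training data (hypothetical names and effect sizes).
n <- 500
log_density <- rnorm(n, mean = 4, sd = 1.5)
death <- rbinom(n, size = 1, prob = plogis(-3 + 0.8 * log_density))
sim <- data.frame(death, log_density)

fit_null   <- glm(death ~ 1,           family = binomial, data = sim)
fit_simple <- glm(death ~ log_density, family = binomial, data = sim)

# Wald z-test for the slope (estimate, standard error, z value, p-value).
wald <- summary(fit_simple)$coefficients["log_density", ]

# Likelihood ratio test comparing the intercept-only model to the fitted model.
lrt <- anova(fit_null, fit_simple, test = "Chisq")
lrt_p <- lrt[2, "Pr(>Chi)"]
```

Either test addresses the null hypothesis that the slope on `log_density` is zero; the two usually agree when the sample is large.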

Part 3 - Multiple Logistic Regression Models 

5. Fit a multiple logistic regression model with the census region, the logarithm of population density, the cumulative coronavirus rate, the median county age, the median income, the percent with a college degree, the percent of the population that are veterans of the U.S. armed services, the percent with healthcare and the percent that voted for President Biden in the 2020 general election to predict the probability a county has reported a Coronavirus-related death. Conduct an appropriate test to determine whether this model significantly predicts the probability a county has reported a Coronavirus-related death. (3pts) 
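One way to test whether a multiple logistic model predicts the outcome at all is a likelihood ratio test against the intercept-only model, using the null and residual deviances stored in the glm object. The sketch below uses simulated data with only three of the predictors; all column names and effect sizes are hypothetical.

```r
set.seed(42)

# Simulated stand-in for the training data (hypothetical names, fewer predictors).
n <- 800
sim <- data.frame(
  region      = factor(sample(c("Midwest", "Northeast", "South", "West"), n, replace = TRUE)),
  log_density = rnorm(n, mean = 4, sd = 1.5),
  median_age  = rnorm(n, mean = 40, sd = 5)
)
eta <- -3 + 0.8 * sim$log_density + 0.5 * (sim$region == "South")
sim$death <- rbinom(n, size = 1, prob = plogis(eta))

fit_full <- glm(death ~ region + log_density + median_age,
                family = binomial, data = sim)

# Overall likelihood ratio test: drop in deviance from the null model,
# compared to a chi-squared distribution with (number of slopes) degrees of freedom.
lr_stat <- fit_full$null.deviance - fit_full$deviance
lr_df   <- fit_full$df.null - fit_full$df.residual
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
```

Here `lr_df` is 5: three dummy variables for the four-level region factor plus the two numeric predictors.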

6. Perform a backward selection procedure on the model from question 5. Which variable(s) has/have been removed from the model? (2pts) 
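A hedged sketch of backward selection with stats::step(), which by default removes terms as long as doing so lowers the AIC. The data are simulated and the column names are hypothetical; which terms get dropped on the real data will differ.

```r
set.seed(42)

# Simulated stand-in for the training data (hypothetical names).
n <- 800
sim <- data.frame(
  region      = factor(sample(c("Midwest", "Northeast", "South", "West"), n, replace = TRUE)),
  log_density = rnorm(n, mean = 4, sd = 1.5),
  median_age  = rnorm(n, mean = 40, sd = 5)   # pure noise in this simulation
)
eta <- -3 + 0.8 * sim$log_density + 0.5 * (sim$region == "South")
sim$death <- rbinom(n, size = 1, prob = plogis(eta))

fit_full <- glm(death ~ region + log_density + median_age,
                family = binomial, data = sim)

# Backward elimination by AIC; trace = 0 suppresses the step-by-step printout.
fit_step <- stats::step(fit_full, direction = "backward", trace = 0)

# Terms present in the full model but absent from the selected model.
full_terms <- attr(terms(fit_full), "term.labels")
step_terms <- attr(terms(fit_step), "term.labels")
removed    <- setdiff(full_terms, step_terms)
```

Inspecting `removed` answers the "which variables were removed" part of the question for whatever model is passed in.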

7. We will now continue a backward selection procedure, but this time using the likelihood ratio test. Using the drop1() function to determine which predictors are significant, iteratively remove all insignificant predictors from the model in question 6. That is, look at the drop1() output from the model in question 6, refit the model after removing all insignificant terms, look at the drop1() output, refit the model after removing all insignificant terms... Continue this process until all predictors are significant. What predictor variables remain in the model? Important: Use alpha = 0.01 for this problem. (4pts) 
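The iterative procedure described in question 7 could be sketched as a loop around drop1(..., test = "LRT"). This is only an illustration on simulated data with hypothetical column names; doing the refits by hand, as the question describes, gives the same result.

```r
set.seed(42)

# Simulated stand-in for the training data (hypothetical names).
n <- 800
sim <- data.frame(
  region      = factor(sample(c("Midwest", "Northeast", "South", "West"), n, replace = TRUE)),
  log_density = rnorm(n, mean = 4, sd = 1.5),
  median_age  = rnorm(n, mean = 40, sd = 5)
)
eta <- -3 + 0.8 * sim$log_density + 0.5 * (sim$region == "South")
sim$death <- rbinom(n, size = 1, prob = plogis(eta))

alpha <- 0.01
fit <- glm(death ~ region + log_density + median_age,
           family = binomial, data = sim)

repeat {
  # Stop if no predictors are left to test.
  if (length(attr(terms(fit), "term.labels")) == 0) break

  tests     <- drop1(fit, test = "LRT")
  terms_now <- rownames(tests)[-1]          # first row is "<none>"
  pvals     <- tests[["Pr(>Chi)"]][-1]

  drop_terms <- terms_now[pvals > alpha]    # all insignificant terms at this step
  if (length(drop_terms) == 0) break        # everything left is significant

  # Refit without the insignificant terms, then test again.
  new_formula <- as.formula(paste(". ~ . -", paste(drop_terms, collapse = " - ")))
  fit <- update(fit, new_formula)
}

remaining <- attr(terms(fit), "term.labels")
```

Note that drop1() tests a factor such as `region` as a whole (one test with several degrees of freedom) rather than one dummy variable at a time.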

8. The starter file contains some code to help you along on this problem. Build a table to compare the AIC, BIC and a Pseudo-R-squared for the models fit in questions 5, 6 and 7. Which model is best with respect to each metric? (3pts) 
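A possible shape for the comparison table, sketched on simulated data. The starter file's Pseudo-R-squared code is not shown here, so this example assumes McFadden's version, 1 - logLik(model) / logLik(null model); the starter file may use a different definition. The three models and all column names are hypothetical stand-ins for the models from questions 5, 6 and 7.

```r
set.seed(42)

# Simulated stand-in for the training data (hypothetical names).
n <- 800
sim <- data.frame(
  region      = factor(sample(c("Midwest", "Northeast", "South", "West"), n, replace = TRUE)),
  log_density = rnorm(n, mean = 4, sd = 1.5),
  median_age  = rnorm(n, mean = 40, sd = 5)
)
eta <- -3 + 0.8 * sim$log_density + 0.5 * (sim$region == "South")
sim$death <- rbinom(n, size = 1, prob = plogis(eta))

fit_null <- glm(death ~ 1, family = binomial, data = sim)
models <- list(
  q5 = glm(death ~ region + log_density + median_age, family = binomial, data = sim),
  q6 = glm(death ~ region + log_density,              family = binomial, data = sim),
  q7 = glm(death ~ log_density,                       family = binomial, data = sim)
)

# McFadden's pseudo R-squared (an assumed choice of pseudo R-squared).
mcfadden <- function(m) 1 - as.numeric(logLik(m)) / as.numeric(logLik(fit_null))

comparison <- data.frame(
  model     = names(models),
  AIC       = sapply(models, AIC),
  BIC       = sapply(models, BIC),
  pseudo_R2 = sapply(models, mcfadden)
)
```

Smaller AIC and BIC are better, while a larger pseudo R-squared is better; like ordinary R-squared, McFadden's value never decreases when predictors are added to a nested model.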

9. Code was supplied for a Pseudo-R-squared calculation in question 8. Explain how this value mimics that of the traditional R-squared value used in multiple linear regression. (2pts) 

10. For the model with the best BIC, of those fit in questions 5, 6, or 7, interpret the coefficient regionWest. Be sure to explain this coefficient in terms of odds (not log-odds, which do not provide a nice interpretation). How does this compare to the results in question 2? Why might they be similar/different? (3pts) 
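To illustrate how a coefficient such as regionWest converts to odds, the toy example below fits a model with region as the only predictor. In that special case exp(coefficient) equals the ratio of the raw odds in the West to the raw odds in the reference region, which is the link back to question 2. The data and the choice of "Midwest" as reference level are made up; in a model with other predictors the adjusted odds ratio will generally differ from the raw ratio.

```r
# Hypothetical toy data: Midwest has 2 deaths / 4 non-deaths, West has 6 / 3.
toy <- data.frame(
  region = factor(c(rep("Midwest", 6), rep("West", 9))),
  death  = c(1, 1, 0, 0, 0, 0,   1, 1, 1, 1, 1, 1, 0, 0, 0)
)

fit <- glm(death ~ region, family = binomial, data = toy)

# exp() turns the log-odds coefficient into an odds ratio (West vs. Midwest).
odds_ratio <- exp(coef(fit)[["regionWest"]])

# Raw odds per region, as in question 2, and their ratio.
tab       <- xtabs(~ region + death, data = toy)
raw_odds  <- tab[, "1"] / tab[, "0"]
raw_ratio <- unname(raw_odds["West"] / raw_odds["Midwest"])
```

An odds ratio of 4 here would be read as: the odds of a reported death in a West county are 4 times the odds in a Midwest county. With additional predictors in the model, the same statement holds only with those predictors held fixed.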

Part 4 - Prediction 

11. We will use three fitted models built above to predict whether a county in the testing dataset will have a Coronavirus-related death. Some code is supplied in the starter file, edit and replicate so it will make predictions using all three models. Briefly describe what this code is doing. (2pts) 
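The starter code is not reproduced here, but prediction from a fitted logistic model typically follows the pattern sketched below: predict() with type = "response" returns probabilities, which are then cut at 0.5 to give a 0/1 prediction. The data are simulated, and the split and column names are hypothetical.

```r
set.seed(7)

# Simulated stand-in for the training and testing data (hypothetical names).
n <- 600
sim <- data.frame(log_density = rnorm(n, mean = 4, sd = 1.5))
sim$death <- rbinom(n, size = 1, prob = plogis(-3 + 0.8 * sim$log_density))
train <- sim[1:400, ]
test  <- sim[401:600, ]

fit <- glm(death ~ log_density, family = binomial, data = train)

# Predicted probability of a death for each county in the test set.
test_prob <- predict(fit, newdata = test, type = "response")

# Classify as 1 (death predicted) when the probability exceeds the 0.5 threshold.
test_pred <- ifelse(test_prob > 0.5, 1, 0)
```

Without type = "response", predict() on a binomial glm returns values on the log-odds scale, which should not be compared against a probability threshold.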

12. Calculate and discuss the accuracy, sensitivity and specificity for all three models to predict if a county has reported a Coronavirus-related death. Which model appears to be the best model at predicting if a county has a Coronavirus-related death? Code is provided for the confusion matrix of the first model. Replicate this code to generate the confusion matrices for the other two models. (6pts) 
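A small worked example of accuracy, sensitivity and specificity computed from a confusion matrix built with xtabs(). The outcome and probability vectors are made up for illustration; the starter file's own confusion-matrix code may be laid out differently.

```r
# Hypothetical observed outcomes and predicted probabilities for 10 counties.
actual <- c(1, 1, 1, 1, 0, 0, 0, 0, 1, 0)
prob   <- c(0.92, 0.81, 0.63, 0.34, 0.22, 0.41, 0.72, 0.13, 0.56, 0.47)
predicted <- ifelse(prob > 0.5, 1, 0)

# Factors with fixed levels guarantee a 2 x 2 table even if a class is never predicted.
conf <- xtabs(~ actual + predicted,
              data = data.frame(actual    = factor(actual,    levels = c(0, 1)),
                                predicted = factor(predicted, levels = c(0, 1))))

TP <- as.numeric(conf["1", "1"])   # death observed and predicted
TN <- as.numeric(conf["0", "0"])   # no death observed, none predicted
FP <- as.numeric(conf["0", "1"])   # no death observed, death predicted
FN <- as.numeric(conf["1", "0"])   # death observed, none predicted

accuracy    <- (TP + TN) / (TP + TN + FP + FN)
sensitivity <- TP / (TP + FN)      # share of actual deaths correctly flagged
specificity <- TN / (TN + FP)      # share of non-deaths correctly cleared
```

In this toy example all three metrics equal 0.8: four of five deaths and four of five non-deaths are classified correctly.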

13. Using the best model from the previous question, compute the sensitivity and specificity if the probability threshold (the 0.5 provided in the code for question 11) were 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and 0.9. Use these values to complete the table in the starter file. Which threshold appears to be the best choice? (5pts) 
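The threshold sweep could be sketched as below, again on made-up outcomes and probabilities. The helper `metrics_at()` is a hypothetical name, and Youden's J (sensitivity + specificity - 1) is just one possible criterion for "best"; the question leaves the choice of criterion open.

```r
# Hypothetical observed outcomes and predicted probabilities for 10 counties.
actual <- c(1, 1, 1, 1, 0, 0, 0, 0, 1, 0)
prob   <- c(0.92, 0.81, 0.63, 0.34, 0.22, 0.41, 0.72, 0.13, 0.56, 0.47)

thresholds <- seq(0.1, 0.9, by = 0.1)

# Sensitivity and specificity when counties with prob > cutoff are predicted as 1.
metrics_at <- function(cutoff) {
  predicted <- prob > cutoff
  c(threshold   = cutoff,
    sensitivity = sum(predicted & actual == 1) / sum(actual == 1),
    specificity = sum(!predicted & actual == 0) / sum(actual == 0))
}

tab <- as.data.frame(t(sapply(thresholds, metrics_at)))

# One possible summary of the trade-off (an assumed criterion, not the only one).
tab$youden <- tab$sensitivity + tab$specificity - 1
best <- tab$threshold[which.max(tab$youden)]
```

Lower thresholds raise sensitivity at the cost of specificity, and higher thresholds do the reverse, so the "best" threshold depends on which kind of error matters more.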

NOTE: the ideas of sensitivity and specificity are VERY relevant in today's society as scientists develop tests for the COVID-19 Coronavirus, both for antibodies and for detection of the disease. We felt it prudent to introduce these topics under the current circumstances. 

Some Coding hints 

We have covered a lot this semester... In an effort to help you with some of the necessary coding, we provide the following hints, but note that additional code is needed for everything to work:
• xtabs() can be used in questions 1, 2, and 12
• ggplot() is needed in question 3
• glm(), drop1() and/or anova() are needed in questions 4, 5, 7 and 8
• stats::step() is needed in question 6
• summary() will provide output with model coefficients; you can also use coef()
