Skills being tested:
1. Knowledge and understanding of the course material:
- General skills: Distinction between prediction and hypothesis testing; Simulation inference; Transformation of regression output into useful quantities
- Logit Models: Model selection: choosing the best set of variables for prediction; Changes in predicted probabilities and average marginal effects; Hypothesis testing via simulation
- Panel and Multilevel Models: Random effects: varying intercepts and slopes; Solving the ecological fallacy problem; Interpretation, prediction and hypothesis testing: using the fixed and random effects; Application to Multilevel Regression and Post-Stratification
2. Writing up and presenting statistical results. Ability to:
- Apply the techniques in R
- Use tables, visualisations etc. to show your results
- Interpret your results clearly, succinctly and precisely
- Use results to test claims and understand social science questions
Presentation requirements for clarity and maximum points:
1. Presentation: R-code
R code that is messy or difficult to read makes it harder to award you marks. DOs and DON’Ts:
2. Presentation: Tables and Figures
Good tables and figures are vital. Marks are awarded for presentation. DOs and DON’Ts:
- Precision in language: test results
Bad Writing: “The coefficient is significant”; Good Writing: “The coefficient has a t-statistic of -2.1, meaning that it is statistically significant at the 5% significance level”
- Precision in language: exactly how big is the effect? Give a specific example to illustrate. Relate the size of the coefficient to the scale of the outcome variable. Make the results compelling and meaningful
Bad Writing: “The coefficient is 2, meaning that being male affects earnings a lot”; Good Writing: “The coefficient for the variable that indicates being male is 2. This means that being a man is predicted to increase earnings by £2,000 per year. This is a substantively large effect, equivalent to an extra two years of education.”
- Precision in language: terminology. Remember that words like “bias”, “test error rate” have very specific meanings in statistics. Always use those precise terms rather than looser terminology.
Bad Writing: “The logit model is impressively accurate”; Good Writing: “The logit model has a very low test error rate of 3.4%”
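The precise figures quoted in the Good Writing examples above have to come from somewhere. The following is a minimal sketch in base R of how two of them could be obtained: a two-sided p-value for a t-statistic of -2.1, and a test error rate from a hold-out sample. The data are simulated and the names `x`, `y` and `train_rows` are illustrative assumptions, not part of the course material.

```r
# (a) Two-sided p-value for a t-statistic of -2.1,
#     using the large-sample normal approximation.
t_stat  <- -2.1
p_value <- 2 * pnorm(-abs(t_stat))   # about 0.036, i.e. below 0.05

# (b) Test error rate of a logit model on held-out data.
#     Simulated data: one predictor x and a binary outcome y (assumptions).
set.seed(4)
n <- 1000
d <- data.frame(x = rnorm(n))
d$y <- rbinom(n, 1, plogis(-0.5 + 1.5 * d$x))

# Fit on 700 randomly chosen rows, evaluate on the remaining 300.
train_rows <- sample(n, 700)
fit  <- glm(y ~ x, data = d[train_rows, ], family = binomial)
test <- d[-train_rows, ]

# Classify as 1 when the predicted probability exceeds 0.5,
# then take the share of test observations classified wrongly.
pred_class      <- as.integer(predict(fit, newdata = test, type = "response") > 0.5)
test_error_rate <- mean(pred_class != test$y)
```

Reporting `p_value` and `test_error_rate` as numbers, rather than describing the result as "significant" or "accurate", is what the rules above ask for.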
5. Structuring a Statistical Report
It is not a ‘lab bench report’, a narrative of your journey, or an exhaustive summary of everything you did:
Beginning, middle, end. But not a mystery thriller
QUESTION A: Support for Longer Prison Sentences [30 points]
Many organizations working on criminal justice argue that longer prison sentences do not cut crime. However, many voters disagree and express strong support for increasing the length of prison sentences. For this question, suppose that a campaign group advocating shorter prison sentences asks for your help. They plan to run a campaign targeted at groups that most strongly support longer prison sentences for convicted criminals, in the hope that they can change their minds. Your job is to tell them which types of people are most supportive of longer prison sentences. To help measure the likely effectiveness of their campaign, they also want to know how much each characteristic matters in explaining support.
To answer their questions, you will use an excerpt from the 2018 British Social Attitudes Survey. You need to:
i) Choose a logit model that predicts support for longer prison sentences, carefully justifying your selection of variables for the model. You must use a minimum of three independent variables.
ii) Present the model’s findings in ways that clearly explain how much the variables matter in explaining support for longer sentences
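One possible way to justify a choice of variables for prediction is to compare candidate logit models by their cross-validated test error rate. The sketch below does this in base R on simulated data. The variable names (`age`, `degree`, `male`, `noise`, `support`) and the three candidate specifications are hypothetical and are not taken from the British Social Attitudes excerpt.

```r
# Simulated stand-in for the survey data (all variables are assumptions).
set.seed(3)
n <- 600
d <- data.frame(
  age    = runif(n, 18, 90),
  degree = rbinom(n, 1, 0.3),
  male   = rbinom(n, 1, 0.5),
  noise  = rnorm(n)            # unrelated to the outcome by construction
)
d$support <- rbinom(n, 1, plogis(-1 + 0.03 * d$age - 0.8 * d$degree + 0.3 * d$male))

# Assign every respondent to one of 5 folds, once, so that all
# candidate models are compared on the same splits.
folds <- sample(rep(1:5, length.out = n))

# Average test error rate of one model specification across the folds.
cv_error <- function(formula, data, folds) {
  outcome <- all.vars(formula)[1]
  fold_errors <- sapply(sort(unique(folds)), function(j) {
    train <- data[folds != j, ]
    test  <- data[folds == j, ]
    fit   <- glm(formula, data = train, family = binomial)
    p_hat <- predict(fit, newdata = test, type = "response")
    mean((p_hat > 0.5) != (test[[outcome]] == 1))
  })
  mean(fold_errors)
}

# Hypothetical candidate specifications, each with at least three variables.
candidates <- list(
  base        = support ~ age + degree + male,
  with_noise  = support ~ age + degree + male + noise,
  interaction = support ~ age * degree + male
)
errors <- sapply(candidates, cv_error, data = d, folds = folds)
```

The specification with the lowest entry in `errors` would be the preferred model for prediction under this criterion; other criteria are possible.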
3. Transforming the regression output
A core feature of all models so far in this module is that the regression outputs alone are not very meaningful
For logit models:
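As an illustration of what transforming logit output can look like, the sketch below converts coefficients into a change in predicted probability between two profiles, and uses simulation to attach a 95% interval to that change. It uses base R only. The data are simulated, and the variable names and the two profiles are assumptions chosen for the example.

```r
# Simulated data (assumption): support depends on age and being male.
set.seed(1)
n <- 500
d <- data.frame(age = runif(n, 18, 80), male = rbinom(n, 1, 0.5))
d$support <- rbinom(n, 1, plogis(-2 + 0.03 * d$age + 0.5 * d$male))

fit <- glm(support ~ age + male, data = d, family = binomial)

# Two profiles that differ only in sex: a 40-year-old woman and man.
profiles <- data.frame(age = c(40, 40), male = c(0, 1))
p_hat    <- predict(fit, newdata = profiles, type = "response")
diff_hat <- unname(p_hat[2] - p_hat[1])   # change in predicted probability

# Simulation inference: draw 2000 coefficient vectors from a multivariate
# normal centred on the estimates with the estimated covariance matrix.
k     <- length(coef(fit))
draws <- matrix(rnorm(2000 * k), nrow = 2000) %*% chol(vcov(fit))
draws <- sweep(draws, 2, coef(fit), "+")

# Predicted probability for each profile under each draw, then the difference.
X        <- model.matrix(~ age + male, data = profiles)
p_sim    <- plogis(draws %*% t(X))        # 2000 rows, one column per profile
diff_sim <- p_sim[, 2] - p_sim[, 1]
ci       <- unname(quantile(diff_sim, c(0.025, 0.975)))
```

If the interval `ci` excludes zero, the simulated draws are consistent with a difference between the two profiles at roughly the 5% level.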
For multilevel models:
· Important to distinguish between: fixed coefficients; coefficients that vary across level-2 units
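The distinction can be written out for a two-level linear model with a varying intercept and a varying slope. The notation below is an assumption made for illustration: $i$ indexes individuals, $j$ indexes level-2 units, $\beta_0$ and $\beta_1$ are the fixed coefficients, and $u_{0j}$ and $u_{1j}$ are the unit-specific deviations from them.

```latex
y_{ij} = (\beta_0 + u_{0j}) + (\beta_1 + u_{1j})\, x_{ij} + \varepsilon_{ij},
\qquad
(u_{0j}, u_{1j})^\top \sim \mathcal{N}(\mathbf{0}, \Sigma)
```

Under this notation, the slope for unit $j$ is $\beta_1 + u_{1j}$, so interpreting or predicting for a specific unit requires combining the fixed coefficient with that unit's deviation.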
4. Rules of statistical writing
- Never leave the reader in any doubt about how you did your analysis. By reading your prose (not the code) the reader should be able to reproduce your work from scratch.
Bad Writing: “I estimated a model showing what affects unemployment”; Good Writing: “I estimated a logit model for the probability of being unemployed, with x, y, and z as independent variables.”
- Don’t use the names of R packages and functions. Instead, explain to the reader what you did
Bad Writing: “Table 1 shows marginal effects from the mfx package in R with the atmean=F option”; Good Writing: “For each independent variable in the model, the marginal effect was first calculated for each individual in the sample, using their profile of independent variables. The means of those marginal effects across all individuals in the sample are shown in Table 1”
- Interpret, interpret, interpret! Never just present a figure or table without explaining in the report what it tells you. Relate it to the topic of your report and describe the specific evidence for your claim
Bad Writing: “Table 1 shows the regression results for the effect of sex on earnings”; Good Writing: “Table 1 shows that on average, men earn £2,000 per year more than women with the same level of education. This is clear from the coefficient on the dummy variable for being male, which is statistically significant at the 5% level”
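The average marginal effect described in the Good Writing example above (a marginal effect computed for each individual, then averaged across the sample) can be sketched directly in base R. The data are simulated, and the variables `educ`, `male` and `unemployed` are assumptions made for the example.

```r
# Simulated data (assumption): unemployment depends on education and sex.
set.seed(2)
n <- 400
d <- data.frame(educ = rnorm(n, 12, 3), male = rbinom(n, 1, 0.5))
d$unemployed <- rbinom(n, 1, plogis(1 - 0.2 * d$educ + 0.4 * d$male))

fit <- glm(unemployed ~ educ + male, data = d, family = binomial)

# Continuous variable: for each individual, the derivative of the predicted
# probability with respect to educ is beta * p * (1 - p).
p_i       <- predict(fit, type = "response")
me_educ_i <- unname(coef(fit)["educ"]) * p_i * (1 - p_i)
ame_educ  <- mean(me_educ_i)

# Binary variable: for each individual, the change in predicted probability
# when male is switched from 0 to 1, holding their other values fixed.
p_male1  <- predict(fit, newdata = transform(d, male = 1), type = "response")
p_male0  <- predict(fit, newdata = transform(d, male = 0), type = "response")
ame_male <- mean(p_male1 - p_male0)
```

Because each individual's own profile of independent variables is used, `ame_educ` and `ame_male` generally differ from the marginal effects evaluated at the sample means.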