Problem 1.
Consider the linear regression model without an intercept, $y_i = \beta x_i + u_i$. Assume $E(u_i \mid x_1, \ldots, x_n) = 0$.
a) Similarly to Slide 4 of Notes #5, write down the average squared error as a function of $\beta$. Take a derivative with respect to $\beta$ and solve for the least squares estimator, $\hat{\beta}$.
b) Substitute $y_i = \beta x_i + u_i$ into your formula for $\hat{\beta}$ as we did on Slide 11 of Notes #4 and use the resulting expression along with the Law of Iterated Expectations to show $\hat{\beta}$ is unbiased.
c) Suppose (for this part of the question only) that there are only two observations ($n = 2$). Write out the expression you found for $\hat{\beta}$ in part b) without summation notation (just involving the four random variables $u_1$, $u_2$, $x_1$, and $x_2$). Now carefully explain why you need strict exogeneity, i.e., $E(u_1 \mid x_1, x_2) = E(u_2 \mid x_1, x_2) = 0$, in order to show that $\hat{\beta}$ is unbiased.
d) Assume that $\mathrm{Var}(u_i \mid X) = \sigma^2$ and $\mathrm{Cov}(u_i, u_j \mid X) = 0$ for $i \neq j$ (i.e., Homoskedasticity). Write
down a formula for $\mathrm{Var}(\hat{\beta} \mid X)$ similar to Slide 9 of Notes #6.
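A numerical check can complement the algebra in part a). The sketch below is written in Python rather than Stata and uses made-up numbers (the lists `x` and `y` and the helper `golden_section_min` are assumptions for illustration, not course material). It minimizes the average squared error of the no-intercept model by a simple golden-section search, so you can compare its output with what your own closed-form formula gives on the same numbers.

```python
import math

# Made-up data for illustration only (not from the course files).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1]


def avg_sq_error(b):
    """Average squared error of the no-intercept model y_i = b * x_i + u_i."""
    n = len(x)
    return sum((yi - b * xi) ** 2 for xi, yi in zip(x, y)) / n


def golden_section_min(f, lo, hi, tol=1e-10):
    """Minimize a one-dimensional convex function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return (a + b) / 2.0


b_numeric = golden_section_min(avg_sq_error, -10.0, 10.0)
print(f"numerical minimizer of the average squared error: {b_numeric:.6f}")
```

Because the average squared error is a convex quadratic in the coefficient, any sufficiently wide search interval should lead to the same minimizer.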
Problem 2.
Again import the file mlbdata.xlsx from the course website into Stata. Generate the same variables
salaryk and pos1 (the dummy variable for first basemen) we used in Problem Set #2.
Note: This problem assumes that you know/remember the ‘two sample t-test’ from your ECON 121 or
Math 140 class. The TAs and I can go over this in office hours/recitation for those who don’t!
a) Repeat the t-tests in problems 4b) and 4c) from Problem Set #2 using regression (i.e., use the
reg command instead of ttest). Verify that the answers you get are numerically identical.
b) Using tabulate pos1, summarize(salaryk), how does the variance of salaries compare between first basemen and non-first basemen? Look back at the histograms from last week and see if your numeric results are supported by visual evidence. In Stata, the command sdtest performs a test where the null hypothesis is that the variances are equal. The syntax and usage of the ,by() option work the same way as for ttest. Use this command to test the null hypothesis that the variance of salaries is equal for first basemen vs non-first basemen.
c) Do these same t-tests using heteroskedasticity robust standard errors (the reg command with the
,robust option). Using the 5% significance level, do your conclusions change? How?
d) Re-do the same tests as part c) with the ttest command, but this time use the ,unequal
option. How do the t-statistics compare with part c)?
e) Re-do the regression in the final slide of Notes #5 with heteroskedasticity robust standard errors. Do the results change?
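To see what the comparisons in this problem involve, here is a small Python sketch on simulated data (not mlbdata.xlsx; the dummy `d`, the outcome `y`, and every other name are assumptions for illustration). It computes four standard errors for a difference in group means by hand: the classical OLS standard error from a regression on a dummy, the pooled two-sample t-test standard error, a heteroskedasticity robust standard error with the n/(n-2) scaling (often labelled HC1), and the unequal-variance (Welch) two-sample standard error.

```python
import math
import random

random.seed(0)
# Simulated data: d = 1 marks a hypothetical "group 1" with noisier outcomes.
d = [1] * 20 + [0] * 80
y = [random.gauss(12.0, 6.0) if di == 1 else random.gauss(10.0, 2.0) for di in d]

n = len(y)
y1 = [yi for yi, di in zip(y, d) if di == 1]
y0 = [yi for yi, di in zip(y, d) if di == 0]
n1, n0 = len(y1), len(y0)
m1, m0 = sum(y1) / n1, sum(y0) / n0
ssr1 = sum((v - m1) ** 2 for v in y1)  # within-group sums of squares
ssr0 = sum((v - m0) ** 2 for v in y0)

# OLS of y on a constant and d, using the simple-regression formulas.
dbar, ybar = sum(d) / n, sum(y) / n
sxx = sum((di - dbar) ** 2 for di in d)
slope = sum((di - dbar) * (yi - ybar) for di, yi in zip(d, y)) / sxx
intercept = ybar - slope * dbar
resid = [yi - intercept - slope * di for di, yi in zip(d, y)]

# Classical (homoskedastic) standard error of the slope.
se_classical = math.sqrt(sum(e ** 2 for e in resid) / (n - 2) / sxx)

# Equal-variance two-sample t-test standard error (pooled variance).
pooled_var = (ssr1 + ssr0) / (n - 2)
se_pooled = math.sqrt(pooled_var * (1 / n1 + 1 / n0))

# Heteroskedasticity robust standard error with the n/(n-2) scaling (HC1).
meat = sum(((di - dbar) ** 2) * e ** 2 for di, e in zip(d, resid))
se_robust = math.sqrt(n / (n - 2) * meat / sxx ** 2)

# Unequal-variance (Welch) two-sample standard error.
se_welch = math.sqrt(ssr1 / (n1 * (n1 - 1)) + ssr0 / (n0 * (n0 - 1)))

print(f"slope = {slope:.4f}, difference in means = {m1 - m0:.4f}")
print(f"classical SE = {se_classical:.4f}, pooled t-test SE = {se_pooled:.4f}")
print(f"robust SE    = {se_robust:.4f}, Welch SE         = {se_welch:.4f}")
```

The first two standard errors are algebraically the same object, while the last two use different degrees-of-freedom adjustments; keep that in mind when you compare your Stata output across parts a), c), and d).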
Problem 3.
Show that the OLS intercept estimator, $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$, is unbiased.
Begin by writing $y_i = \beta_0 + \beta_1 x_i + u_i$. Take an average of both sides to show $\bar{y} = \beta_0 + \beta_1 \bar{x} + \bar{u}$. (Careful, $u_i$ are the unobservables computed using the ‘true’ coefficients, $\beta_0$ and $\beta_1$, not the OLS estimates $\hat{\beta}_0$ and $\hat{\beta}_1$, so $\bar{u}$ is a random variable, not necessarily equal to zero!)
Now plug this expression for $\bar{y}$ into the expression for $\hat{\beta}_0$ and take expected values to show $E(\hat{\beta}_0) = \beta_0$. Be specific about where and/or whether you are using i) Strict Exogeneity, and ii) the Law of Iterated Expectations.
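A simulation is no substitute for the proof, but it can show what unbiasedness means in practice. The Python sketch below rests on assumed ingredients: the "true" coefficients, the sample size, and the distributions of x and u are all arbitrary choices, with u drawn independently of x so that strict exogeneity holds by construction. It draws many samples, computes the OLS intercept in each, and averages the estimates. It also records the sample average of u to illustrate that it is generally not zero in any single sample.

```python
import random

random.seed(1)
beta0, beta1 = 2.0, 0.5      # "true" coefficients, chosen arbitrarily
n, reps = 50, 2000


def ols_intercept(x, y):
    """Return beta0_hat = ybar - beta1_hat * xbar for a simple regression."""
    xbar, ybar = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return ybar - (sxy / sxx) * xbar


estimates = []
ubars = []
for _ in range(reps):
    x = [random.uniform(0.0, 10.0) for _ in range(n)]
    u = [random.gauss(0.0, 1.0) for _ in range(n)]   # drawn independently of x
    y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]
    ubars.append(sum(u) / n)
    estimates.append(ols_intercept(x, y))

mean_estimate = sum(estimates) / reps
share_nonzero_ubar = sum(1 for ub in ubars if ub != 0.0) / reps
print(f"average of beta0_hat over {reps} samples: {mean_estimate:.3f}")
print(f"share of samples with ubar != 0: {share_nonzero_ubar:.2f}")
```

Individual estimates vary from sample to sample; it is their average across samples that should sit close to the chosen value of `beta0`.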
Problem 4.
This time load the consump.dta file we talked about on Slide 7 of Notes #5. Instead of regressing consumption spending growth (gc) on the current year’s disposable income growth (gy), suppose we regress gc on the previous year’s disposable income growth, which in this dataset is called gy_1.
a) Run a regression of gc on gy_1. What is the p-value associated with a test of the null hypothesis $H_0: \beta_1 = 0$? How does it compare with the regression on Slide 6 of Notes #4?
b) Use the correlate command with the ,covariance option to display the sample variance-covariance matrix of gc and gy_1. Verify that ‘covariance over variance’ gives you the same answer as the slope coefficient from the regression you ran in part a).
c) Make a plot of the residuals of this regression vs gy_1 using rvpplot gy_1. Do the residuals appear to be mean zero and uncorrelated with x?
d) In Stata entering the command predict u, resid after using the reg command stores the residuals from the preceding regression in a variable called u. Do this for the regression from part a) and verify that the residuals have sample mean and sample correlation with gy_1 exactly equal to zero. Explain why we know this will always be the case for residuals computed using the OLS estimates.
e) Make a scatterplot of gc vs gy_1. Using the same syntax as shown on Slide 6, overlay the fitted regression line on the scatterplot. Visually, is the relationship stronger or weaker than that between gc and gy?
f) If this relationship is weaker, why bother with this regression at all? (Hint: Suppose we’re sitting in December 2019 and we’re thinking about a tax increase that will take effect in 2020. The tax revenue raised will depend on consumption next year…)
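The numerical facts in parts b) and d) can be previewed on simulated data before you check them in Stata. The Python sketch below does not use consump.dta; the series `x` and `y` are stand-ins generated from an assumed model. It fits the regression from the normal equations, compares the slope with the sample covariance divided by the sample variance, and then computes the sample mean of the residuals and their sample covariance with the regressor.

```python
import random

random.seed(2)
# Simulated stand-ins for a regressor and an outcome (not the consump.dta data).
n = 40
x = [random.gauss(2.0, 1.5) for _ in range(n)]
y = [1.0 + 0.3 * xi + random.gauss(0.0, 1.0) for xi in x]

xbar, ybar = sum(x) / n, sum(y) / n

# Sample variance and covariance with the usual n - 1 divisor.
var_x = sum((xi - xbar) ** 2 for xi in x) / (n - 1)
cov_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)

# OLS slope and intercept from the two normal equations, written with raw sums.
sx, sy = sum(x), sum(y)
sxy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum(xi * xi for xi in x)
slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
intercept = (sy - slope * sx) / n

# Residuals computed with the OLS estimates.
resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
resid_mean = sum(resid) / n
resid_cov_x = sum((xi - xbar) * (e - resid_mean) for xi, e in zip(x, resid)) / (n - 1)

print(f"slope = {slope:.6f}, covariance over variance = {cov_xy / var_x:.6f}")
print(f"mean of residuals = {resid_mean:.2e}")
print(f"covariance of residuals with x = {resid_cov_x:.2e}")
```

Any tiny nonzero values printed for the residual mean and covariance reflect floating-point rounding rather than a failure of the algebra.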