HW 2
Please do not use any built-in OLS commands (such as lm) in R to run regressions; use the OLS formulas instead!
1. Univariate OLS.
The file called “TA_NI.csv” contains data on 12,583 firms for the year 2018. Extract the following two variables from the data set: firm size (TA) and firm profitability (NI).
(a) Using the formulas provided in the lecture slides, estimate the coefficients $\beta_0$ and $\beta_1$ in the following model:
$\ln(1 + TA_i) = \beta_0 + \beta_1 NI_i + u_i$
(b) Using the results in (a), compute
(i) The average value of the estimated residual $\hat{u}$.
(ii) The correlation between NI and $\hat{u}$.
(c) Relate the results in (b) to the Normal Equations.
(d) Now alter the model as follows:
$\ln(1 + TA_i) = \gamma NI_i + e_i$
Estimate $\hat{\gamma}$ using the formula $\hat{\gamma} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}$, where $x_i = NI_i$ and $y_i = \ln(1 + TA_i)$. This is the OLS estimate for a model with the intercept restricted to be zero.
(e) Compare the results in (d) to those in (a).
(f) Using the results in (d), compute the average value of $\hat{e}$ as well as the correlation between NI and $\hat{e}$.
(g) Relate the results in (f) to the Normal Equations.
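The closed-form estimators in this problem can be sketched in a few lines of R. The sketch below is only an illustration under stated assumptions: it uses simulated stand-in data (the vectors x and y are hypothetical substitutes for NI and ln(1 + TA), and the true coefficients 1 and 0.5 are made up), so its numbers say nothing about the TA_NI.csv results.

```r
# Simulated stand-in data; the actual exercise reads TA and NI from TA_NI.csv.
set.seed(1)
n <- 500
x <- rnorm(n, mean = 2, sd = 1)   # plays the role of NI (assumption)
y <- 1 + 0.5 * x + rnorm(n)       # plays the role of ln(1 + TA) (assumption)

# OLS with an intercept: slope from centered cross-products,
# intercept from the sample means.
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)
u_hat <- y - b0 - b1 * x

# With an intercept, the normal equations impose
# sum(u_hat) = 0 and sum(x * u_hat) = 0, so both
# quantities below are zero up to rounding error.
mean_u <- mean(u_hat)
cor_xu <- cor(x, u_hat)

# OLS through the origin: only sum(x * e_hat) = 0 is imposed,
# so mean(e_hat) and cor(x, e_hat) need not be zero.
g_hat <- sum(x * y) / sum(x^2)
e_hat <- y - g_hat * x
mean_e <- mean(e_hat)
cor_xe <- cor(x, e_hat)
sum_xe <- sum(x * e_hat)
```

Comparing mean_u and cor_xu with mean_e and cor_xe shows which orthogonality conditions each specification enforces.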
2. Bias in the data.
(a) Create two correlated data vectors using the following procedure.
Write R code to draw three uniformly distributed random variables:
$v_1 \sim U(-1, 1)$, $v_2 \sim U(0, 1)$, and $v_3 \sim U(-1, 0)$
with 240 observations each.
Define $x_1 = 2v_1 + v_2$ and $x_2 = v_1 + v_3$.
You will use the data you just generated to do parts (b) through (d).
(b) Generate the response y using the following model:
$y = 1 + 2x_1 + 3x_2 + u$
where $u \sim N(0, 2^2)$.
Estimate the following two regression models:
Model A: $y = \beta_{A0} + \beta_{A1} x_1 + \beta_{A2} x_2 + u_A$
Model B: $y = \beta_{B0} + \beta_{B1} x_1 + u_B$
Record the coefficient estimates and their standard errors. Repeat the process 1,000 times: generate a new vector y using the same $x_1$ and $x_2$ but drawing new values for u; use it to estimate Models A and B; record the estimates and their standard errors.
(c) In (b), what are your average coefficient estimates? Are they biased?
(d) Compute the standard error for the coefficient estimators in (b) using your sample of 1,000 coefficient estimates.
(e) Compute the standard error for the coefficient estimators in (b) using your sample of 1,000 standard error estimates for the corresponding coefficients.
(f) Compare the results obtained in (d) and (e) for Model A. Do the same for Model B. For each model, discuss if the results are indicative of bias.
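One possible way to organize the simulation is sketched below. It is a minimal illustration, not the required solution: the helper ols() is a hypothetical name introduced here, the seed is arbitrary, and the standard errors use the classical formula $s^2 (X'X)^{-1}$ with $s^2 = \hat{u}'\hat{u}/(n-k)$, which is assumed to match the lecture slides.

```r
set.seed(42)                       # arbitrary seed (assumption)
n <- 240
v1 <- runif(n, -1, 1)
v2 <- runif(n, 0, 1)
v3 <- runif(n, -1, 0)
x1 <- 2 * v1 + v2
x2 <- v1 + v3

# Hypothetical helper: OLS coefficients and classical standard
# errors from the matrix formulas, without calling lm().
ols <- function(X, y) {
  XtX_inv <- solve(crossprod(X))                # (X'X)^{-1}
  beta <- XtX_inv %*% crossprod(X, y)           # (X'X)^{-1} X'y
  res <- y - X %*% beta
  s2 <- sum(res^2) / (nrow(X) - ncol(X))        # residual variance
  list(beta = drop(beta), se = sqrt(diag(s2 * XtX_inv)))
}

XA <- cbind(1, x1, x2)             # Model A design matrix
XB <- cbind(1, x1)                 # Model B omits x2
reps <- 1000
bA <- seA <- matrix(NA_real_, reps, 3)
bB <- seB <- matrix(NA_real_, reps, 2)

for (r in seq_len(reps)) {
  u <- rnorm(n, mean = 0, sd = 2)  # new errors, same x1 and x2
  y <- 1 + 2 * x1 + 3 * x2 + u
  fA <- ols(XA, y)
  fB <- ols(XB, y)
  bA[r, ] <- fA$beta; seA[r, ] <- fA$se
  bB[r, ] <- fB$beta; seB[r, ] <- fB$se
}

mean_bA <- colMeans(bA)            # average estimates, Model A
mean_bB <- colMeans(bB)            # average estimates, Model B
sd_bA   <- apply(bA, 2, sd)        # spread of the 1,000 estimates
avg_seA <- colMeans(seA)           # average reported standard error
sd_bB   <- apply(bB, 2, sd)
avg_seB <- colMeans(seB)
```

Because x1 and x2 share the component v1, the slope in Model B picks up part of the effect of the omitted x2, which is what the comparison of mean_bA and mean_bB is meant to expose.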
3. Multivariate Regression.
The file called PS2Wage.txt contains worker wage data for 39 demographic groups. Your goal here is to study the effect of a change in hourly wage (WPH) on the supply of labor measured in hours worked (HRS). The task is complicated by the fact that factors other than hourly wage affect the supply of labor. We will in particular consider two such additional factors: spouse’s annual income (ERSP) and the number of years of education (SCH).
(a) Assume the labor supply depends on both the wage paid and how much the spouse makes, as in the following multivariate regression model:
$HRS = \beta_0 + \beta_1 WPH + \beta_2 ERSP + u$
(i) Discuss what sign you expect for $\beta_1$ and $\beta_2$.
(ii) Estimate the regression using OLS. Are your coefficient estimates significant, and do they have the signs you hypothesized in (i)?
(iii) Provide an interpretation for the intercept term $\beta_0$. Does your estimate for $\beta_0$ make sense?
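The estimation and significance check in this part can be sketched with the matrix form of OLS. The example below is a hedged illustration on simulated data: the 39 rows, the value ranges, and the coefficients used to generate HRS are all made up, since the layout of PS2Wage.txt is not shown here. It also assumes the classical t-test with n - k degrees of freedom.

```r
# Simulated stand-in for PS2Wage.txt; every number here is invented.
set.seed(7)
n <- 39
WPH  <- runif(n, 2, 4)
ERSP <- runif(n, 500, 2000)
HRS  <- 2000 + 50 * WPH - 0.1 * ERSP + rnorm(n, sd = 30)

# Design matrix with a column of ones for the intercept.
X <- cbind(Intercept = 1, WPH = WPH, ERSP = ERSP)

XtX_inv  <- solve(crossprod(X))                   # (X'X)^{-1}
beta_hat <- drop(XtX_inv %*% crossprod(X, HRS))   # (X'X)^{-1} X'y
res      <- HRS - drop(X %*% beta_hat)

df <- n - ncol(X)                                 # degrees of freedom
s2 <- sum(res^2) / df                             # residual variance
se <- sqrt(diag(s2 * XtX_inv))                    # classical standard errors

t_stat <- beta_hat / se                           # test of H0: beta_j = 0
p_val  <- 2 * pt(-abs(t_stat), df = df)           # two-sided p-values

results <- cbind(estimate = beta_hat, se = se, t = t_stat, p = p_val)
```

Printing results gives one row per coefficient, so the sign and significance of each estimate can be read off together.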