1 Question 1
In this exercise, we will use data from Terza (2002) to investigate if abuse of alcohol has any impact on the employment status of men. This exercise is adapted from computer exercise 17.C15 in Wooldridge (2020).
Contains data from ./alcohol.dta
obs: 9,822
variable name variable label
abuse =1 if abuse alcohol
status out of workforce = 1; unemployed = 2, employed = 3
unemrate state unemployment rate
age age in years
educ years of schooling
married =1 if married
famsize family size
white =1 if white
exhealth =1 if in excellent health
vghealth =1 if in very good health
goodhealth =1 if in good health
fairhealth =1 if in fair health
northeast =1 if live in northeast
midwest =1 if live in midwest
south =1 if live in south
centcity =1 if live in central city of MSA
outercity =1 if in outer city of MSA
qrt1 =1 if interviewed in first quarter
qrt2 =1 if interviewed in second quarter
qrt3 =1 if interviewed in third quarter
beertax state excise tax, $ per gallon
cigtax state cigarette tax, cents per pack
ethanol state per-capita ethanol consumption
mothalc =1 if mother an alcoholic
fathalc =1 if father an alcoholic
livealc =1 if lived with alcoholic
inwf =1 if status > 1
employ =1 if employed
1. What fraction of the sample was employed at the time these men were interviewed? What fraction of the sample has abused alcohol?
2. Estimate a linear regression model for employ with the following variables as covariates: abuse, age, agesq, educ, educsq, married, famsize, white, northeast, midwest, south, centcity, outercity, qrt1, qrt2 and qrt3. Use heteroskedasticity-robust standard errors. [1]
[1] The fact that the dependent variable is binary does not violate our assumptions MLR.1–MLR.4. Such a model is called a linear probability model (LPM), and its errors are heteroskedastic. We will return to this model in a later exercise.
3. The variable abuse might be endogenous in this setting. Argue why mothalc and fathalc, indicating whether a man's mother or father, respectively, was an alcoholic, could be reasonable instruments. Estimate the LPM using the heteroskedasticity-robust GMM instrumental variables estimator.
4. Test
(1) if the instruments are weak,
(2) if abuse is endogenous, and
(3) if the instruments are valid.
5. Compare the new parameter estimate for abuse with the original one, and draw a conclusion about the effect of alcohol abuse on labor market participation.
6. (Optional.) Are there other good predictors for the abuse variable? Use k-fold cross-validation to select the best prediction model. Does the parameter estimate for abuse change when the chosen prediction specification is used in the GMM IV model? (See, among others, Athey and Imbens (2017) for a discussion of big data and machine learning for prediction in the first-stage equation.)
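The logic behind parts 3 to 5 can be illustrated with a minimal sketch on simulated data. It is not a solution to the exercise: it uses plain two-stage least squares rather than the heteroskedasticity-robust GMM estimator asked for above, a continuous regressor and outcome rather than binary ones, and no control variables. All names (x, z1, z2, u, simulate, compare_ols_2sls) and all parameter values are assumptions made for the illustration. An unobserved trait u raises x and lowers y, so OLS overstates the negative effect, while the two instruments affect y only through x.

```python
import random

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

def simulate(n=5000, beta=-0.5, seed=1):
    """Simulated data: (y, x, z1, z2) with a hidden confounder u."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z1 = 1.0 if rng.random() < 0.3 else 0.0  # instrument 1 (assumed valid)
        z2 = 1.0 if rng.random() < 0.3 else 0.0  # instrument 2 (assumed valid)
        u = rng.gauss(0, 1)                      # unobserved confounder
        x = z1 + z2 + u + rng.gauss(0, 1)        # endogenous regressor
        y = 1.0 + beta * x - u + rng.gauss(0, 1)
        rows.append((y, x, z1, z2))
    return rows

def compare_ols_2sls(rows):
    """Return the slope on x from OLS and from 2SLS."""
    y = [r[0] for r in rows]
    x = [r[1] for r in rows]
    b_ols = ols([[1.0, xi] for xi in x], y)[1]
    # First stage: regress x on a constant and both instruments.
    Z = [[1.0, r[2], r[3]] for r in rows]
    g = ols(Z, x)
    x_hat = [sum(gj * zj for gj, zj in zip(g, z)) for z in Z]
    # Second stage: regress y on a constant and the fitted values.
    # (Point estimates are correct; naive second-stage standard errors are not.)
    b_2sls = ols([[1.0, xh] for xh in x_hat], y)[1]
    return b_ols, b_2sls

b_ols, b_2sls = compare_ols_2sls(simulate())
print(f"OLS: {b_ols:.3f}  2SLS: {b_2sls:.3f}  (true effect: -0.500)")
```

In this setup the OLS slope should be well below the true value of -0.5, while the 2SLS slope should be close to it. In practice a dedicated IV routine would be used, since it also reports correct standard errors and the diagnostic tests requested in part 4.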
2 Question 2
This exercise draws heavily upon Hayashi (2000).
The relationship between the wage rate and schooling has been the subject of a large number of empirical and theoretical investigations following the pioneering study by Mincer (1958). This attention may seem puzzling because the explanation of the positive relationship seems to be obvious: education enhances the individual’s productivity.
There are, however, other explanations. In the job market signaling model of Spence (1973), more educated individuals receive higher wages only because education is used as a signal of higher ability. Although education does not increase the individual’s earning capacity, there is a correlation between the wage rate and schooling because both variables are influenced by a third variable, ability. One of the earliest attempts to isolate the effect of education on the wage rate from that of ability was the study by Griliches (1976). Well-known later studies include, among others, Blackburn and Neumark (1992) and Card (2001).
In this exercise we will estimate the type of wage equation estimated by Griliches using data from the Young Men’s Cohort of the National Longitudinal Survey (NLS-Y). This cohort was first surveyed in 1966 at ages 14–24. The dataset, in the file nls80, is an extract from the NLS-Y used by Blackburn and Neumark (1992). A special feature of this particular dataset is that it contains two measures of ability. One measure is the score on the Knowledge of the World of Work (KWW) test administered by the NLS interviewers in 1966. The other measure is the IQ score, a composite measure of various test scores obtained from the youths’ school records (from 1968).
The following variables are included in this dataset:
http://athene.umb.no/emner/pub/ECN301/data/nls80.dta
obs: 935
vars: 17
variable name variable label
wage monthly earnings
hours average weekly hours
iq IQ score
kww knowledge of world work score
educ years of education
exper years of work experience
tenure years with current employer
age age in years
married =1 if married
black =1 if black
south =1 if live in south
urban =1 if live in SMSA
sibs number of siblings
brthord birth order
meduc mother’s education
feduc father’s education
lwage log(wage)
The typical wage equation estimated in the literature is the semi-log form (Card, 1995):
$$\log(wage) = \beta_1 + \beta_2\, educ + \beta_3\, ability + \dots + u \qquad (1)$$
where wage is the wage rate for an individual, educ is schooling in years, and ability is some measure of ability; the remaining terms are observable characteristics such as experience, tenure and location dummies.
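As a brief aside on reading equation (1): because the dependent variable is in logs, the coefficient on educ measures the approximate proportional change in the wage for one more year of schooling, holding the other regressors and u fixed. A short derivation using only the symbols in (1):

```latex
\frac{\partial \log(wage)}{\partial educ} = \beta_2
\quad\Longrightarrow\quad
\%\Delta wage \approx 100\,\beta_2\,\Delta educ,
\qquad
\text{exact: } \%\Delta wage = 100\left(e^{\beta_2\,\Delta educ} - 1\right).
```

The approximation is close when the product of the coefficient and the change in educ is small, which is why this coefficient is usually reported as the "return to schooling".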
We will be using the same subsample as Blackburn and Neumark (1992), i.e. without any black individuals and only those for whom we have information about mother’s education. (Remove these observations and make sure your working dataset has N = 758.)
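The sample restriction can be sketched as follows, assuming the data have already been loaded as a list of dictionaries keyed by the variable names above, with a missing meduc stored as None. Both the in-memory layout and the helper name working_sample are assumptions for illustration, and the three records are made up.

```python
def working_sample(records):
    """Keep non-black individuals with non-missing mother's education."""
    return [r for r in records
            if r["black"] == 0 and r["meduc"] is not None]

# Tiny made-up illustration (not the real data):
toy = [
    {"black": 0, "meduc": 12},    # kept
    {"black": 1, "meduc": 10},    # dropped: black == 1
    {"black": 0, "meduc": None},  # dropped: meduc missing
]
sample = working_sample(toy)
print(len(sample))  # 1
```

On the real dataset, the corresponding check would be that the filtered sample contains 758 observations.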
1. Calculate means and standard deviations of all the provided variables and prepare a summary table. Also, calculate the correlations between IQ, KWW and educ.
2. Consider a wage equation with educ, exper, tenure, south, and urban as explanatory variables. However, we do not have a variable that matches the theoretical construct of ability. Thus, the model will either have an omitted variable problem or problems with a potentially poorly measured proxy variable, namely IQ.
Estimate the model using OLS both with and without the variable IQ.
3. If we include IQ in the model there is a potential problem with measurement errors. We can use instrumental variables regression to deal with that problem.
(1) Estimate the model using 2SLS with meduc, KWW, and age serving as instruments for IQ. Report both the first stage and the second stage results. Discuss the validity of the instruments.
(2) Test for endogeneity of IQ.
(3) Test for overidentifying restrictions.
4. If we omit IQ from the model there is a potential problem with omitted variables. We can use instrumental variables regression to deal with that problem as well.
(1) Estimate the model by 2SLS with meduc, KWW, and IQ as instruments for educ. Report both the first stage and the second stage results. Discuss the validity of the instruments.
(2) Test for endogeneity of educ.
(3) Test for overidentifying restrictions.
5. Summarize and give an overall assessment of your estimates for return to schooling in light of your