1. [20 Marks] In all parts to this question, choose one or more correct answer(s). Choosing a wrong answer loses points. If the number of negative points exceeds the number of positive points, your total for this question is zero. In part (f), you must also supply some additional reasoning.
(a) The OLS is consistent if
1. there is no correlation between the dependent variable and the error term
2. there is no multicollinearity
3. there is no correlation between the independent variable and the error term
(b) The dummy variable trap is an example of
1. imperfect multicollinearity
2. something that is of theoretical interest only
3. perfect multicollinearity
(c) Consider a single-regressor model with a binary regressor $d_i$ for $i = 1, \dots, n$, where $d_i = 1$ for individuals from the treatment group and $d_i = 0$ for individuals from the control group:
1. When $\sum_i d_i = n - \sum_i d_i$, the robust and the traditional standard errors coincide for a large sample
2. It is always safer to use the robust standard error
3. The estimated coefficient on the regressor is the difference between the sample average of the outcome variable for the treatment group and the same for the control group
(d) Which of the following statements are true?
1. The Block Bootstrap is a method for dealing with a clustered error structure
2. Clustered standard errors (Liang and Zeger) account for a clustered error structure but require some additional assumptions about the within-group correlation structure
3. If there is no within-cluster correlation for the regressor of interest, then the Moulton factor is equal to $1 + (N - 1)\rho_e$, where $\rho_e$ is the intraclass correlation coefficient and $N$ the fixed group size.
(e) Consider the linear regression equation $y = \beta_0 + \beta_1 x + u$ with $E[u \mid x] \neq 0$. Suppose that $z$ is a valid instrumental variable. Then the population value of the IV estimator of $\beta_1$ is:
1. $\mathrm{Cov}(z, x) / \mathrm{Cov}(z, y)$
2. $\mathrm{Cov}(z, y) / \mathrm{Cov}(z, x)$
3. $\mathrm{Cov}(x, y) / V(x)$
4. $\mathrm{Cov}(u, z)$
(f) Imagine you are interested in the effect of share of police force on crime rate in a city. Which of the following biases are likely?
1. Omitted variable bias
2. Measurement error bias
3. Simultaneity bias
Please explain. (220 Words max)
(g) Overidentification occurs when:
1. there are more regressors than instruments
2. one of the instruments is not valid, but we do not know which one
3. there are more instruments than regressors
(h) A constant treatment effect
1. means that the individual causal effect is the same for each individual
2. means that the expected value of the individual causal effects is constant
3. means that individual-level causal effects can vary from individual to individual
(i) In a differences-in-differences setup, $Y(d)$, $d \in \{0, 1\}$, are the potential outcomes of individual $i$, $D$ is a binary variable equal to 1 if individual $i$ is treated (and 0 otherwise), and treatment occurs between $t = 0$ and $t = 1$. One identifies the average treatment effect on the treated by assuming
1. $E[Y(0)_{t=1} \mid D = 1] - E[Y(0)_{t=0} \mid D = 1] = E[Y(0)_{t=1} \mid D = 0] - E[Y(0)_{t=0} \mid D = 0]$
2. $E[Y(0)_{t=1} \mid D = 1] + E[Y(0)_{t=0} \mid D = 0] = E[Y(0)_{t=0} \mid D = 1] - E[Y(0)_{t=0} \mid D = 0]$
3. $E[Y(0)_{t=0} \mid D = 1] - E[Y(0)_{t=0} \mid D = 0] = E[Y(0)_{t=1} \mid D = 1] - E[Y(0)_{t=1} \mid D = 0]$
(j) If the assumptions for the random effects estimator hold, then for a large sample and time varying regressors,
1. One should find similar point estimates using fixed effects, pooled OLS and random effects
2. One should find similar point estimates using fixed effects and random effects but random effects is more efficient and pooled OLS is inconsistent
3. One should find similar point estimates using fixed effects, random effects and using the least squares dummy variable approach
(k) In a sharp regression discontinuity design:
1. one would use an instrumental variable approach to estimate the average treatment effect
2. one identifies the average treatment effect near the cutoff
3. individuals who are close to the threshold can decide whether or not to be treated
4. the treatment assignment might be considered like random assignment immediately above and below the cutoff
(l) Consider the following subpopulations for which one can identify the different treatment effects resulting from
A. A fuzzy Regression Discontinuity Design (with two-sided imperfect compliance at the threshold)
B. An RCT with perfect compliance
C. An RCT with two-sided imperfect compliance
We can apply these methods on the same sample. Which of the following is true?
1. [B] > [C] > [A]
2. [A] > [C] > [B]
3. [B] > [A] > [C]
4. [C] > [B] > [A]
2. [8 Marks] Consider the following models:
$y_{ig} = \beta_0 + \beta_1 x_g + e_{ig}, \quad g = 1, \dots, G$ (M1)
$\bar{y}_g = \beta_0 + \beta_1 x_g + \bar{e}_g$ (M2)
where $y_{ig}$ represents an outcome of interest for individual $i$ from group $g$, $x_g$ a regressor of interest and $e_{ig}$ an error term. The number of individuals in group $g$ is $n_g$, and $\bar{y}_g = \frac{1}{n_g} \sum_{i=1}^{n_g} y_{ig}$.
Show that the same expression for $\hat{\beta}$ is obtained when estimating (M1) using OLS.
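The comparison between the two models can be checked numerically. The sketch below is only an illustration on simulated data, and it rests on an assumption not stated in the question as printed: that (M2) is estimated by weighted least squares with weights equal to the group sizes. All data, parameter values and the helper `ols` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 5 groups of unequal size; the regressor varies only by group.
n_g = np.array([3, 5, 2, 7, 4])
x_g = rng.normal(size=n_g.size)
group = np.repeat(np.arange(n_g.size), n_g)   # group index of each individual
x = x_g[group]
y = 1.0 + 2.0 * x + rng.normal(size=group.size)

def ols(X, y):
    """Least-squares coefficients for design matrix X and outcome y."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# (M1): OLS on the individual-level data.
X1 = np.column_stack([np.ones_like(x), x])
beta_m1 = ols(X1, y)

# (M2): regression on group means, weighted by group size (assumed weighting).
ybar_g = np.bincount(group, weights=y) / n_g
X2 = np.column_stack([np.ones(n_g.size), x_g])
w = np.sqrt(n_g)                               # WLS = OLS on sqrt(weight)-scaled data
beta_m2_wls = ols(X2 * w[:, None], ybar_g * w)

print(beta_m1, beta_m2_wls)
```

Under this weighting the two coefficient vectors agree up to floating-point error. A numerical check of this kind does not replace the algebraic argument the question asks for.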
3. [20 Marks] Imagine we are interested in the effect of reading a political newspaper on voting turnout in political elections. Denote Yi as an individual’s voting behaviour: it is equal to one if the individual votes and zero otherwise. Di is a binary variable for whether or not the individual reads the newspaper where Di = 1 indicates reading the newspaper. Both variables are measured without error.
(a) [2 Marks] Write down the formula for the observed outcome as a function of potential outcomes Yi(1) and Yi(0) and explain its meaning in this context. (100 Words max)
(b) [7 Marks] One potential estimand is:
E(Yi|Di = 1) − E(Yi|Di = 0) (2)
(i) Explain in words what this estimand means in this specific context.
(ii) We are typically interested in a causal relationship. Decompose this expression mathematically into two components: a causal effect and another term.
(iii) Explain the meaning of both terms in the decomposition in this context. (230 Words max)
(c) [5 Marks] The table below also contains the sample analogs of the underlying terms in your decomposition. Compute estimates of the 3 terms in your decomposition in part (b). Does the sign of each term make sense? Explain. (200 Words max)
$\hat{P}(Y_i(1) = 1 \mid D_i = 1) = 0.89$
$\hat{P}(Y_i(1) = 1 \mid D_i = 0) = 0.66$
$\hat{P}(Y_i(0) = 1 \mid D_i = 1) = 0.39$
$\hat{P}(Y_i(0) = 1 \mid D_i = 0) = 0.11$
(d) [2 Marks] What is the fundamental problem of causal inference in this context? Can we observe / estimate the two components of the decomposition without further information? Explain. (100 Words max)
(e) [4 Marks] Imagine one could manipulate the treatment (here, reading the newspaper) in such a way that treatment assignment is random and everyone complies with this randomization. Does your conclusion from part (d) change, and what implication does this have for equation (2)? Explain formally using the expressions derived in the previous parts. (40 Words max)
4. [20 Marks] This question is based on the study “Bringing Education to Afghan Girls: A Randomized Controlled Trial of Village-Based Schools” by Burde and Linden (2013).
BURDE, D. AND L. L. LINDEN (2013): “Bringing Education to Afghan Girls: A Randomized Controlled Trial of Village-Based Schools,” American Economic Journal: Applied Economics, 5, 27–40.
The authors motivate their research by claiming “primary school participation rates in Afghanistan are very low, particularly for girls”. They further explain that in rural areas, the distance between home and school is sometimes long and that “the lack of separate sanitation facilities, female teachers, and gender-segregated classrooms may also deter girls’ enrollment”.
The authors used a randomized controlled trial to assess the 1-year effect of village-based schools on children’s school enrollment and performance in rural northwestern Afghanistan.
The intervention is based on a sample of 31 villages which were grouped into 12 equally sized village groups based on political and cultural similarities (of which one dropped out). The authors randomly assigned 5 groups to obtain village-based schools a year before the other groups, starting in summer 2007. The final sample has data on 11 village groups, 31 villages and a total of 1,490 primary school-age children. Schools were placed in the treatment villages but not in the control villages during the year of the intervention.
The authors surveyed all available households in the fall of 2007 and in spring 2008. The survey collected basic demographic information, enrollment data and test scores of the children and the households.
(a) [3 Marks] Table 2 of the paper is reproduced below. The authors say that it is obtained by using “socio-demographic characteristics that would not have been affected by the presence of a closer school”. Columns (1) - (3) cover all eligible children and columns (4) - (6) cover the subsample of the children who were available to be tested at the time of the survey.
(i) Explain why the authors provide such a table.
(ii) Explain formally how the authors obtain columns (3) and (6).
(iii) What are your broad conclusions from this table? [Note: do not comment on each single row and do not discuss columns (7) - (8)]. (220 Words max)
(b) [1 Mark] For girls who were also tested in the fall 2007 survey, a simple regression of enrollment (which is binary and equals one if a girl is enrolled) on the treatment assignment results in a coefficient equal to 0.552146 with a standard error of (0.101982). Give a clear statement of what this estimate means. (25 Words max)
(c) [13 Marks] Causal effects. Ultimately the authors are also interested in the causal effect of enrollment on the fall 2007 test scores.
1. Explain why there is imperfect compliance. (80 Words max)
2. When running a simple regression of the fall 2007 test scores on the randomized treatment assignment for girls, one finds a coefficient of 0.74640 with a standard error of (0.17307). Which causal effect do you identify in light of part 1? Explain what this means in this context (you can interpret the coefficient directly as standard deviations). (100 Words max).
3. Can the LATE be estimated? Which causal effect does this estimand identify in this context? Explain. Briefly state all relevant assumptions in this setup and explain whether you think they hold. (450 Words max)
(d) [3 Marks] Show that within the LATE framework with a binary treatment Di, a binary treatment assignment Zi, and potential participation Di(z), for z = {0, 1}, the share of compliers P [Di(1) − Di(0) = 1] is equal to P [Di = 1|Zi = 1] − P [Di = 1|Zi = 0]. Clearly state any relevant assumptions.
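The identity in part (d) can be illustrated on simulated data before it is proved. The sketch below is not a proof. It assumes random assignment of Zi that is independent of the potential participation decisions, and no defiers (monotonicity). The type shares and sample size are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical compliance types: 0 = never-taker, 1 = complier, 2 = always-taker.
# Defiers are excluded by construction (monotonicity assumption).
types = rng.choice(3, size=n, p=[0.3, 0.5, 0.2])
d0 = (types == 2).astype(int)   # potential participation D_i(0)
d1 = (types >= 1).astype(int)   # potential participation D_i(1)

# Random assignment, independent of the compliance type.
z = rng.integers(0, 2, size=n)
d = np.where(z == 1, d1, d0)    # observed participation

share_true = np.mean(d1 - d0 == 1)                  # P[D_i(1) - D_i(0) = 1]
share_est = d[z == 1].mean() - d[z == 0].mean()     # P[D=1|Z=1] - P[D=1|Z=0]
print(share_true, share_est)
```

In this simulation the two quantities differ only by sampling noise. The potential participation values `d0` and `d1` are available here only because the data are simulated; in real data only `d` and `z` are observed.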
5. [15 Marks] Consider the following regression model estimated using a balanced panel with two time periods:
$y_{it} = \beta_0 + \beta_1 x_{it} + a_i + u_{it}, \quad t = 1, 2$
where $x_{it}$ is a binary variable, e.g. equal to 1 if married and 0 otherwise, $y_{it}$ is an outcome of interest, $a_i$ an unobserved individual effect and $u_{it}$ an error term. The Fixed Effects estimator is the same as estimating the following model using OLS:
$\ddot{y}_{it} = \beta_1 \ddot{x}_{it} + \ddot{u}_{it}, \quad t = 1, 2$
where $\ddot{x}_{it} = x_{it} - \bar{x}_i$ and $\bar{x}_i = \frac{1}{2} \sum_{t=1}^{2} x_{it}$. There are identical expressions for $\ddot{u}_{it}$ and $\ddot{y}_{it}$.
There are
$n_s$ stayers: individuals who do not change their marital status.
$n_q$ quitters: individuals who are married in $t = 1$ but divorced, and hence no longer married, in $t = 2$.
$n_j$ joiners: individuals who are not married in $t = 1$ but are married in $t = 2$.
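The within transformation defined above can be made concrete with a small numerical sketch. It uses one hypothetical individual per marital-status pattern and is only meant to show what the demeaned regressor looks like for each type, not to stand in for the algebra requested below.

```python
import numpy as np

# Columns are t = 1 and t = 2. Rows (hypothetical individuals):
# stayer (never married), stayer (always married), quitter, joiner.
x = np.array([
    [0, 0],
    [1, 1],
    [1, 0],
    [0, 1],
], dtype=float)

x_bar = x.mean(axis=1, keepdims=True)   # individual mean over the two periods
x_dd = x - x_bar                         # demeaned regressor x_it - x_bar_i

print(x_dd)
```

The same transformation is applied to the outcome, so any individual whose demeaned regressor is zero in both periods adds nothing to the cross-product terms of the OLS formula.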
(a) [9 Marks] Write down an algebraic expression for the pooled OLS estimate of $\ddot{y}_{it}$ on $\ddot{x}_{it}$ and show that the stayers do not contribute to this expression.
(b) [6 Marks] The First Difference estimator in this setting is:
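$\hat{\beta} = \frac{n_j \overline{\Delta y}_j - n_q \overline{\Delta y}_q}{n_j + n_q}$
where $\Delta y_i = y_{i2} - y_{i1}$, $\overline{\Delta y}_j = \frac{1}{n_j} \sum_{i \in \text{joiners}} \Delta y_i$ and $\overline{\Delta y}_q = \frac{1}{n_q} \sum_{i \in \text{quitters}} \Delta y_i$.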
Show that the First Difference and Fixed Effects estimators coincide.
6. [17 Marks] Consider the following model:
$Y = \beta_0 + \beta_1 D + \beta_2 d_{-1} + \beta_3 d_1 + \theta_{-1} D \cdot d_{-1} + \tau D \cdot d_1 + u, \quad t = -1, 0, 1$
where $Y$ is an outcome of interest with potential outcomes $Y(0)_t$ and $Y(1)_t$, and $D$ is a binary variable indicating whether the individual is ever subject to a treatment, which occurs between $t = 0$ and $t = 1$. $d_{-1}$ is a binary variable with $d_{-1} = 1$ in $t = -1$ and 0 otherwise, and $d_1$ is a binary variable with $d_1 = 1$ in $t = 1$ and 0 otherwise. $u$ is an error term that satisfies standard exogeneity assumptions.
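To make the structure of this regression concrete, the sketch below simulates data from the model and estimates it by OLS. It is a minimal illustration under stated assumptions: the parameter values, sample size and error distribution are hypothetical, and the errors are drawn independently of the regressors.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000   # hypothetical number of individuals, each observed at t = -1, 0, 1

# Hypothetical parameters: beta0, beta1, beta2, beta3, theta_{-1}, tau
beta = np.array([1.0, 0.5, -0.2, 0.3, 0.0, 2.0])

D = np.repeat(rng.integers(0, 2, size=n), 3)   # ever-treated indicator, constant over t
t = np.tile([-1, 0, 1], n)                     # period of each observation
d_m1 = (t == -1).astype(float)                 # dummy for t = -1
d_1 = (t == 1).astype(float)                   # dummy for t = 1

# Design matrix in the order of the model: 1, D, d_{-1}, d_1, D*d_{-1}, D*d_1
X = np.column_stack([np.ones(3 * n), D, d_m1, d_1, D * d_m1, D * d_1])
Y = X @ beta + rng.normal(scale=0.5, size=3 * n)

beta_hat = np.linalg.lstsq(X, Y, rcond=None)[0]
print(np.round(beta_hat, 3))
```

Because the model is saturated in the six group-by-period cells, each coefficient is a combination of cell means, and the estimates land close to the chosen parameter values.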
(a) [3 Marks] State the common trend assumption for t = −1 and t = 0 and for t = 0 and t = 1 and explain in words its meaning.
(b) [3 Marks] State all the conditional expectations as functions of the parameters.
(c) [2 Marks] Derive expressions for $\theta_{-1}$ and $\tau$ in terms of the 6 conditional expectations from part (b).
(d) [5 Marks] Show that $\tau$ recovers the average treatment effect on the treated in $t = 1$ under common trends.
(e) [4 Marks] Show that $\theta_{-1} = 0$ if we have common trends between $t = -1$ and $t = 0$.