This empirical exercise gives you an opportunity to estimate smoking equations with different regression models in Stata. I have posted smoke.dta, the cleaned source data for the following paper:
Kim, D. H. and Park, H. J. (2021). “Effects of a Cigarette Price Increase on the Smoking Behavior of Smokers and Non-smokers.” American Journal of Health Behavior, 45(2), 205-215.
The paper above estimates the effects of the 2015 cigarette price increase on smoking outcomes among Koreans. The draft posted on Blackboard is the pre-peer-review version (the submitted draft, not the final one). You can consult the draft for background on the policy. (Note: we do not replicate the paper's results.)
You will use smoke.dta to estimate the smoking rate and cigarette consumption after the cigarette price increase in January 2015. The data come from the Korea Health Panel (KHP), a nationally representative panel survey, and contain 49,752 observations in total. The analysis period runs from 2011 through 2016. The 2015 data represent individuals’ behavior right after the price increase. Thus, we have four pre-period years (2011-2014) and two post-period years (2015 and 2016) around the policy implementation. As you might expect, this should in principle be a panel data analysis, but we analyze the data in a repeated cross-sectional framework (you can simply use the “reg” command as before). Although this means we do not trace the same individuals’ behavior over time (as a panel data analysis would), we can still estimate how individuals’ responses differ across years. You SHOULD show your regression output for each question as applicable. NO credit will be given for answers without the appropriate Stata results attached.
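The period structure described above (four pre-period years and two post-period years) can be illustrated on a toy dataset. This is only a minimal sketch: it assumes the year variable is named year, as the year>=2015 condition in question 1 suggests, while the six-row dataset and the variable name post are made up for illustration and are not part of smoke.dta.

```stata
* Toy data: one row per survey year, 2011-2016 (illustrative only)
clear
set obs 6
generate year = 2010 + _n

* Post-period indicator: 1 in 2015-2016, 0 in 2011-2014.
* Stata treats a missing value as larger than any number, so the
* !missing() guard keeps a missing year from being coded as post-period.
generate post = (year >= 2015) if !missing(year)

* Cross-tabulate to confirm the four-year / two-year split
tabulate year post
```

The same logical-expression pattern works on the full data, where each year contributes many observations rather than one.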
It is strongly recommended that you submit your answers for this task as a Word or PDF file. Do not copy Stata tables or results onto paper by hand. All answers should be reasonably legible.
1. Generate a price-increase indicator variable for the period after the cigarette price increased in 2015 (year>=2015). Name it “price”. Also, generate the log of the household income variable. Finally, generate a linear time trend variable covering 2011 through 2016: =1 in year 2011, =2 in year 2012, …, =6 in year 2016. Name it “trend”. (5 pts)
2. Print out the descriptive statistics for all variables in the data by smoking status. Then briefly explain the results. (Hint: use the tabstat command with the options “by(smoke) statistics(mean sd n) longstub format(%9.1g)”.) (5 pts)
3. Run the standard OLS regression of smoke on price, marital status, age, educational level, working status, chronic disease status, number of family members, (log) household income, drinking status, and trend. Provide the regression output and interpret the estimated coefficients. (10 pts)
4. Run the same regression as above without the trend variable. How do the results differ? Do you think it is better to include the trend variable in the regression or not? (5 pts)
5. Test for heteroskedasticity using the White test. If heteroskedasticity is present, use robust standard errors for the remaining questions. (5 pts)
6. Run the same regression separately for the male and female samples. Do the results differ from the regression in question 3? Comment on the differences. (Hint: use the if qualifier.) (10 pts)
7. Because “smoke” is a binary dependent variable, the OLS regression above is a linear probability model (LPM). We can instead use the Logit or Probit model to examine the probability of smoking. Run the Logit model with the same specification as in question 3. Then report the average marginal effects of the explanatory variables and interpret them. (Hint: use the margins, dydx(*) command.) (10 pts)
8. Run the Probit model with the same specification as above and report the average marginal effects of the explanatory variables. Compare the average marginal effects across the LPM, Logit, and Probit models and comment on the comparison. (15 pts)
9. Now, we want to estimate smoking intensity by examining the number of cigarettes smoked per day. Run the standard OLS regression of numcig on the same explanatory variables used in question 3. Interpret the coefficients. (10 pts)
10. Run the same specification as above separately for the male and female samples. (10 pts)
11. All in all, did the price increase affect smoking behavior in Korea? Evaluate the Korean government’s 2015 tobacco policy. What policy would you suggest to reduce smoking prevalence further? (10 pts)
12. Compose a short paragraph that summarizes your work from #1 through #11. Imagine that you studied the 2015 Korean price increase for your term paper and are now writing the abstract for the project. Note: 100 words minimum. (5 pts)
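As a syntax reference for the estimation commands named in the questions above, the following sketch runs the LPM, the White test, a subsample regression, and the Logit and Probit average marginal effects on simulated data. Everything in it is an assumption made for illustration: the variable names (y, post, t, male, age), the sample size, and the data-generating coefficients are hypothetical and are unrelated to smoke.dta, so the output says nothing about the actual results.

```stata
* Simulated data: every variable and coefficient below is made up
clear
set seed 2015
set obs 3000
* Survey years 2011-2016, post-period indicator, and linear trend (1 to 6)
generate year = 2011 + mod(_n, 6)
generate post = (year >= 2015)
generate t = year - 2010
* Hypothetical covariates: a sex dummy and age between 20 and 69
generate male = (runiform() < 0.5)
generate age = 20 + floor(50 * runiform())
* Binary outcome drawn from a logistic model with arbitrary coefficients
generate y = (runiform() < invlogit(-2 + 1.5*male - 0.3*post + 0.01*age))

* Linear probability model, then White's test for heteroskedasticity
regress y post t male age
estat imtest, white

* Same model with heteroskedasticity-robust standard errors
regress y post t male age, vce(robust)

* Subsample regression using the if qualifier
regress y post t age if male == 1, vce(robust)

* Logit and Probit, each followed by average marginal effects
logit y post t male age
margins, dydx(*)
probit y post t male age
margins, dydx(*)
```

With the real data, the placeholder names would be replaced by the variables in smoke.dta and the full list of regressors from question 3. The LPM coefficients are already on the probability scale, which is why they are compared with the margins output rather than with the raw Logit or Probit coefficients.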