Introduction to this assignment
This assignment uses data from the European Values Study (EVS), which surveys life satisfaction across multiple European countries, focusing in particular on the individual’s employment status. It shows how categorical survey data are often used in numerical analyses. You will load, clean, and analyze these data to evaluate whether employment status has a substantial impact on an individual’s life satisfaction and whether this impact depends on the social norms of a particular country.
Instructions and questions
1. Import the data: EVSdata.xlsx
(a) Prepare your workspace
i. Create the usual folder structure for your work (you do not need to submit your folders - it will be evident from your code what your folder structure is.)
(b) Open the original data in Excel and examine the file.
• The first tab contains a blank table for the data dictionary. The other four tabs contain the survey data for four waves (periods). Complete the data dictionary using the data documentation file. Use the label in the documentation file as the variable description in the Excel file. Fill in descriptive names of your choice for each of the variables.
• Save this edited Excel file under a different name. (Never overwrite original files!)
(c) Write a loop that imports each of the tabs with the data and saves it as a Stata file. Append all the waves into a single dataset.
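One possible shape for the import loop in 1(c) is sketched below. It is only an illustration: the tab names (wave1 to wave4), the folder names and the name of the edited Excel file are assumptions, and it also assumes that the first row of each tab holds the variable code names. Run "import excel using <file>, describe" to see the actual tab names.

```stata
* Sketch only: tab names, folder names and file name are assumptions.
clear all
local file "data/raw/EVSdata_edited.xlsx"

* Import each wave tab as strings and save it as its own Stata file
foreach sheet in wave1 wave2 wave3 wave4 {
    import excel using "`file'", sheet("`sheet'") firstrow allstring clear
    save "data/temp/`sheet'.dta", replace
}

* Stack the four waves into a single dataset
use "data/temp/wave1.dta", clear
foreach sheet in wave2 wave3 wave4 {
    append using "data/temp/`sheet'.dta"
}
save "data/clean/EVS_all_waves.dta", replace
```

Importing with allstring keeps every column as a string in every wave, so append does not run into a variable that is numeric in one tab and string in another.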
2. Clean and prepare the data:
(a) Rename the variables in the data using the data dictionary you created in 1(b). (Hint: you can use the same method as in the last tutorial, i.e. save the data dictionary as a Stata file, then append it to the data and use a loop to rename the variables. You can also edit the Excel file where you completed the data dictionary.)
(b) Missing data are recorded as ".a". Replace all these values with blanks (""). Use a loop that goes over all variables and makes this replacement. (Hint: foreach var of varlist var1-var10 loops over all variables between var1 and var10 in the order they appear in the data.)
(c) All variables will be imported as strings, but we will convert them into numeric variables. Since only you know which names you picked for the variables, I will refer to them by their code names.
i. Life satisfaction, our main variable, is measured by A170. It contains mostly numbers but needs two corrections: "Satisfied" should be replaced by 10 and "Dissatisfied" by 1. It should also be made numeric (use destring).
ii. Apply the coding below to the variables C036, C037, C038, C039, C041 and convert them into numeric variables. (Using a loop will make this less tedious.)
iii. Apply the coding below to the variable health A009 and make it numeric.
iv. Encode the wave variable and label the waves 1-4, in chronological order. Also encode the variable X001.
v. Destring the income and age variables.
vi. The variable X011_1 reports the number of children. It is mostly populated by numbers but "No children" should be replaced by 0 and the variable converted to numeric (destring).
vii. The variable that records the degree of education needs to be split into its numeric portion and its string portion. You can generate a new variable that contains the numeric portion by using the function substr() when you generate the new variable (e.g. substr("abcdef",2,3) = "bcd"; type "help substr" in the Stata command window for instructions on how to use it).
viii. Create an indicator (dummy) variable called married that is 1 if the marital status (X007) is married or living together as married and zero otherwise.
(d) You will now create a variable that approximates the view of social norms in each country as related to employment. Create a new variable work_ethic, which is the average of the answers to the questions on attitudes towards employment: C036, C037, C038, C039, C041.
(e) Create three indicator (dummy) variables based on the employment status (X028): one for full-time employment, one for unemployed, and one for "other_employed" status, which is 1 if the employment status is either part-time or self-employed and zero otherwise. Make sure these indicator variables have missing values when the employment status variable has missing values. (Remember that missing numerical values are recorded with a period in Stata: var1 == .)
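Some of the cleaning steps in this section could look roughly like the sketch below. It is not a full solution and rests on several assumptions: the variables still carry their EVS code names, X028 is still a string variable, the category labels "Full time", "Unemployed", "Part time" and "Self employed" are placeholders for whatever labels appear in your data, and the education code is a single digit at the start of X025A. The new variable names educ_num, full_time, unemployed and other_employed are illustrative choices.

```stata
* Sketch only: code names, category labels and new variable names are assumptions.

* 2(b): blank out the ".a" missing code in every string variable
ds, has(type string)
foreach var of varlist `r(varlist)' {
    replace `var' = "" if `var' == ".a"
}

* 2(c)i: recode the two text answers, then convert A170 to numeric
replace A170 = "10" if A170 == "Satisfied"
replace A170 = "1"  if A170 == "Dissatisfied"
destring A170, replace   // no conversion happens if other text values remain

* 2(c)vii: keep the numeric portion of education
* (assumes a one-digit code in the first character)
generate educ_num = real(substr(X025A, 1, 1))

* 2(e): dummies that stay missing when employment status is missing
* (labels are placeholders; if X028 is encoded, compare to its numeric codes)
generate full_time      = (X028 == "Full time")  if !missing(X028)
generate unemployed     = (X028 == "Unemployed") if !missing(X028)
generate other_employed = inlist(X028, "Part time", "Self employed") if !missing(X028)
```

The "if !missing(X028)" qualifier is what keeps the dummies missing rather than zero for respondents without a recorded employment status.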
3. Data exploration: summary statistics
(a) Check the availability of the main variables by survey wave. One quick way to check for missing values in numeric variables is to use the tabstat command. Use it to display the number of observations for the following variables by wave: life satisfaction (A170), work ethic, the dummy for full-time employment, the numeric versions of A009, X001, X003, X011_01, and X025A. Which variables are present in all waves of the survey?
(b) Prepare a table that shows the means and standard deviations for the same variables listed in 3(a) by sex. Which variables show a difference in mean that stands out?
(c) Prepare a table that shows the mean and standard deviation of the life satisfaction variable (A170) over time (waves). Do you observe a trend?
(d) Drop all observations missing data for life satisfaction and work ethic. Which waves of the survey remain in the data?
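The summary tables in this section can be produced with tabstat along the lines sketched below. Only a subset of the variables is shown; full_time is an assumed name for the full-time dummy, and the code assumes A170, work_ethic and full_time are already numeric.

```stata
* Sketch only: variable names other than A170 and work_ethic are assumptions.

* 3(a): number of non-missing observations per wave
tabstat A170 work_ethic full_time, statistics(n) by(wave)

* 3(b): means and standard deviations by sex (X001)
tabstat A170 work_ethic full_time, statistics(mean sd) by(X001)

* 3(c): life satisfaction over time
tabstat A170, statistics(mean sd) by(wave)

* 3(d): drop observations missing either key variable, then check the waves left
drop if missing(A170, work_ethic)
tabulate wave
```

A count of 0 for a variable in a given wave indicates that the question was not asked, or not recorded, in that wave.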
4. Regression analysis: Export all your results using outreg2 to facilitate the comparison of coefficients for the different models.
(a) Run a regression that uses life satisfaction as the dependent variable and the following explanatory variables: work ethic, the three dummies based on employment status created in 2(e), the married dummy, X001, X003, X047D. How are the employment status dummies and the work ethic variable related to life satisfaction? Do you find the expected signs on the coefficients for your married dummy and age?
(b) Before you include additional explanatory variables that are present only in the last wave of the survey, re-run the regression above limited to that last wave. How many observations were lost? Is there a significant change in the regression results?
(c) Add the following explanatory variables to your regression: the numerical version of A009, number of children, and the numerical version of education (X025A). Has the relationship of the employment status dummies and the work ethic variable with life satisfaction changed?
(d) Many people claim that cultural differences make the comparison of life satisfaction across countries unreliable. Control for this problem by including dummy variables for each country. (In Stata you can just add "i.country" to your regression command.) How do your results change?
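A minimal sketch of how the models could be run and exported side by side follows. The dummy names, the output path, the column titles and the assumption that the last wave is coded 4 are all illustrative, and the additional controls from 4(c) are left out for brevity. Note that outreg2 is a community-contributed command and has to be installed once with "ssc install outreg2".

```stata
* Sketch only: dummy names, output path and wave coding are assumptions.
local controls work_ethic full_time unemployed other_employed married X001 X003 X047D

* 4(a): baseline model on all remaining waves
regress A170 `controls'
outreg2 using "output/life_satisfaction.doc", replace ctitle("All waves")

* 4(b): same model, last wave only (assumes the last wave is coded 4)
regress A170 `controls' if wave == 4
outreg2 using "output/life_satisfaction.doc", append ctitle("Last wave")

* 4(d): add country dummies (country must be numeric; encode it first if it is a string)
regress A170 `controls' i.country if wave == 4
outreg2 using "output/life_satisfaction.doc", append ctitle("Country dummies")
```

Using replace for the first model and append for the later ones puts every model in its own column of the same table, which makes the coefficients easy to compare.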