Please read this entire document before starting work on the project. Doing so will give you a sense of the project's scope and the steps involved, and let you gauge how much time you will need so you can plan accordingly. It is highly recommended that you work on this project incrementally throughout the semester so you can ask for help if needed; waiting until the end of the semester to begin is strongly discouraged. You may work together with your classmates on this project, provided everyone submits their own work.
1. Perform a regression with Stata predicting systolic blood pressure (bpsyst) (for people who are 21 to 79 years old and not pregnant), from BMI level (bmilevel), age in years (ridageyr), sex (riagendr), race (ridreth3), being currently married (married), and the interaction term of sex by being currently married (married). Make sure to consider the complex survey nature of the data.
2. Perform a regression with Stata predicting having hypertension (hyperten) (for people who are 21 to 79 years old and not pregnant), from BMI level (bmilevel), age in years (ridageyr), sex (riagendr), race (ridreth3), being currently married (married), and the interaction term of sex by being currently married (married). Make sure to consider the complex survey nature of the data.
3. Fit the same model as for objective 2, but include age group rather than continuous age in the model, treating the youngest age group as the reference group. Compare how the findings are similar to or different from the model you fit for objective 2. What do we learn about the relationship between age level and blood pressure from this model that we did not learn from the model you fit for objective 2?
Note: UCLA has a good page on complex survey data analysis (link). It will be very useful for you to review this prior to starting work on this project. Additionally, pay close attention to the “Analysis of subpopulations” section of that page, which covers the proper use of the “subpop” option. You should use this option when running any analyses (including descriptive statistics) on subpopulations in your dataset (e.g., specific age groups) instead of deleting observations to obtain that subset (the reason for this is explained in the link).
The National Health and Nutrition Examination Survey (NHANES) combines interviews and physical examinations to assess the health and nutritional status of adults and children in the United States. This annual survey examines a nationally representative sample of about 5,000 persons. Individuals of all ages are interviewed in their homes and complete the health examination component of the survey, which is conducted in a mobile examination center (MEC).
For this project, we will use data from the 2015-2016 survey, and will focus on the variables shown in Table 1. We are interested in looking only at participants who were 21 to 79 years old and not pregnant at the time of the survey.
As in most statistical projects, you must first process the data before it can be analyzed. First, open the code book and get familiar with the variables that we will use in this study. Notice how missing/don’t know/refused responses are coded and the range of possible values for each variable. Complete the following data processing steps before proceeding:
a. Keep only the variables that we will use in this analysis (see Reference Table 1).
b. Create a BMI level (bmilevel) variable from existing BMI data (bmxbmi). See Reference Table 2.
c. Create an age category (agecat) variable from existing age data (ridageyr). See Reference Table 2.
d. Create a currently married (married) variable from existing marital status data (dmdmartl). See Reference Table 2.
e. Notice that systolic and diastolic blood pressure were measured three to four times. Average those readings into a single value for systolic blood pressure (bpsyst) and a single value for diastolic blood pressure (bpdias). See Reference Table 2. The code to calculate the mean by row is:
egen newvar = rowmean(varlist)
Note: rowmean(varlist) creates the (row) means of the variables in varlist, ignoring missing values; for example, if three variables are specified and, in some observations, one of the variables is missing, in those observations newvar will contain the mean of the two variables that do exist. Other observations will contain the mean of all three variables. Where none of the variables exist, newvar is set to missing.
f. Create a new hypertension variable based on participants being told they have hypertension (bpq020) or having a high systolic or diastolic blood pressure reading (bpsyst ≥ 140 or bpdias ≥ 90). People who have high blood pressure readings but have not been told they have hypertension could have recently developed hypertension and it has not yet been diagnosed by a physician. Thus, we want to take into consideration all three variables (bpq020, bpsyst, and bpdias) when creating the new hypertension variable. See Reference Table 2.
g. To take into account the complex survey nature of the data, you will need the following svyset command:
svyset sdmvpsu [pw = wtmec2yr], strata(sdmvstra) singleunit(centered)
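Steps d through f can be sketched in Stata as follows. This is a minimal sketch under stated assumptions, not the required solution: the raw reading names (bpxsy1-bpxsy4, bpxdi1-bpxdi4), the category codes for dmdmartl and bpq020, and the handling of missing readings are all assumptions that must be checked against the code book and Reference Table 2. Note that Stata treats missing values as larger than any number, so a bare condition such as bpsyst >= 140 is also true when bpsyst is missing.

```stata
* Step d: currently married (assumes dmdmartl == 1 means "married" and
* 77/99 are refused/don't know; check the code book)
generate byte married = (dmdmartl == 1) if !missing(dmdmartl) & !inlist(dmdmartl, 77, 99)

* Step e: average the repeated readings (raw variable names are assumed)
egen bpsyst = rowmean(bpxsy1 bpxsy2 bpxsy3 bpxsy4)
egen bpdias = rowmean(bpxdi1 bpxdi2 bpxdi3 bpxdi4)

* Step f: hypertension flag (assumes bpq020 == 1 means "yes", 2 means "no")
generate byte hyperten = .
* Not hypertensive only when all three pieces of information say so
replace hyperten = 0 if bpq020 == 2 & bpsyst < 140 & bpdias < 90
* Hypertensive if told so, or if either averaged reading is high;
* !missing() keeps missing readings from counting as "high"
replace hyperten = 1 if bpq020 == 1
replace hyperten = 1 if bpsyst >= 140 & !missing(bpsyst)
replace hyperten = 1 if bpdias >= 90 & !missing(bpdias)

* Quick check of the result, including missing values
tabulate hyperten, missing
```

In this sketch, participants with incomplete information and no positive indicator are left as missing rather than coded 0; Reference Table 2 may call for a different rule.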
Before running any regressions, you should always calculate descriptive statistics. For this objective, fill in Table 4. Make sure that you calculate the descriptive statistics only for non-pregnant adults between 21 and 79 years old. As mentioned previously, do not drop observations to obtain the population subset; instead, use the subpop option as described in the UCLA tutorial.
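A minimal sketch of subpopulation descriptive statistics is shown here. It assumes a pregnancy indicator named ridexprg in which 1 means pregnant; that name and coding are assumptions taken from common NHANES conventions and should be confirmed in the code book. The subpopulation is passed as an if expression so that all observations stay in the dataset and the standard errors reflect the full design.

```stata
* Declare the survey design (same command as in step g)
svyset sdmvpsu [pw = wtmec2yr], strata(sdmvstra) singleunit(centered)

* Means for continuous variables within the subpopulation
* (ridexprg == 1 meaning "pregnant" is an assumption)
svy, subpop(if inrange(ridageyr, 21, 79) & ridexprg != 1): mean bpsyst ridageyr

* Subpopulation standard deviations for the means just estimated
estat sd

* Proportions for categorical variables within the same subpopulation
svy, subpop(if inrange(ridageyr, 21, 79) & ridexprg != 1): proportion bmilevel riagendr ridreth3 married
```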
1.1 Perform a regression with Stata predicting systolic blood pressure (bpsyst) for people who are 21 to 79 years old and not pregnant from BMI level (bmilevel), age in years (ridageyr), sex (riagendr), race (ridreth3), being currently married (married), and the interaction term of sex and being currently married. Given what you know about the outcome variable, make sure to run the correct type of regression. Make sure to consider the complex survey nature of the data. The reference categories for each variable should be:
• Sex: female
• BMI level: “healthy weight”
• Race: “non-Hispanic white”
• Married: no
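Reference categories can be set with Stata's ib#. factor-variable prefix. The sketch below assumes bpsyst is treated as a continuous outcome and assumes the following codes: 2 = healthy weight for bmilevel, 2 = female for riagendr, 3 = non-Hispanic white for ridreth3, and 0 = no for married. The pregnancy indicator ridexprg (1 = pregnant) is also an assumption. Verify every code against the code book and Reference Table 2 before relying on it.

```stata
* Survey-weighted linear regression in the subpopulation of interest.
* ib#. sets the base (reference) level; ## adds both main effects and
* the sex-by-married interaction. All level codes are assumptions.
svy, subpop(if inrange(ridageyr, 21, 79) & ridexprg != 1): ///
    regress bpsyst ib2.bmilevel c.ridageyr ib2.riagendr##ib0.married ib3.ridreth3
```

The /// line continuation works in a do-file; in the Command window, type the command on a single line.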
1.2 Write the equation of the regression from part 1.1 in terms of beta coefficients and the variable names. Do not plug in the values from the regression.
1.3 Present the results of the regression in the form of a table (Table 5). Make sure to include a descriptive title for your table. Then, describe your results in 1-2 paragraphs of text comparable to what would be found in the results section of a journal article. Make sure you mention the type of regression you ran, the variables in the regression, which results were statistically significant (main effects and interaction term), the magnitude of the main effects and interaction term (point estimate and CI), and the interpretation of the main effects and interaction term.
1.4 Create a graph of predicted systolic blood pressure based on the interaction of sex and being currently married for non-pregnant adults between 21 and 79 years old. Include the graph (Figure 1) in this document and describe what you observe in the graph. Make sure to add a title to your figure.
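One possible way to produce such a graph is with margins followed by marginsplot. The sketch below refits the model so it is self-contained; the level codes, the pregnancy indicator ridexprg, and the figure title are assumptions for illustration only.

```stata
* Refit the objective 1 model quietly (level codes are assumptions)
quietly svy, subpop(if inrange(ridageyr, 21, 79) & ridexprg != 1): ///
    regress bpsyst ib2.bmilevel c.ridageyr ib2.riagendr##ib0.married ib3.ridreth3

* Predicted systolic blood pressure for each sex-by-married combination,
* restricted to the same subpopulation; vce(unconditional) accounts for
* the survey design when averaging over the covariates
margins riagendr#married, ///
    subpop(if inrange(ridageyr, 21, 79) & ridexprg != 1) vce(unconditional)

* Plot the predictions with confidence intervals (title is illustrative)
marginsplot, title("Predicted systolic blood pressure by sex and marital status")
```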
Calculate descriptive statistics for objectives 2 and 3 and fill in Table 6. Just as for objective 1, make sure that you calculate the descriptive statistics only for non-pregnant adults between 21 and 79 years old. For this table, calculate descriptive statistics only for those participants who had no missing data. You can do this by adding this additional condition to the subpop option: !missing(hyperten) & !missing(married).
2.1 Perform a regression with Stata predicting hypertension (hyperten) for people who are 21 to 79 years old and not pregnant from BMI level (bmilevel), age in years (ridageyr), sex (riagendr), race (ridreth3), being currently married (married), and the interaction term of sex and being currently married. Given what you know about the outcome variable, make sure to run the correct type of regression. Make sure to consider the complex survey nature of the data. The reference categories for each variable should be the same as for objective 1.
2.2 Is the interaction term of sex and being currently married significant? If it is not, re-run the regression without the interaction term.
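Parts 2.1 and 2.2 might be approached as sketched below. The sketch assumes hyperten is coded 0/1, for which a logistic model is one natural choice, and it reuses assumed level codes (2 = healthy weight, 2 = female, 3 = non-Hispanic white, 0 = not married) and an assumed pregnancy indicator ridexprg. None of these are confirmed by this document.

```stata
* Model with the sex-by-married interaction; logistic reports odds ratios
svy, subpop(if inrange(ridageyr, 21, 79) & ridexprg != 1): ///
    logistic hyperten ib2.bmilevel c.ridageyr ib2.riagendr##ib0.married ib3.ridreth3

* Design-adjusted Wald test of the interaction term
testparm i.riagendr#i.married

* If the interaction is not significant, refit with main effects only
svy, subpop(if inrange(ridageyr, 21, 79) & ridexprg != 1): ///
    logistic hyperten ib2.bmilevel c.ridageyr ib2.riagendr ib0.married ib3.ridreth3
```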
2.3 Write the equation of the regression from part 2.2 in terms of beta coefficients. Do not plug in the values from the regression.
2.4 Present the results of the regression from part 2.2 in the form of a table (Table 7) and text that would be found in the results section of a journal article. Make sure to add a title to your table.
3.1 Perform the same model as for part 2.2, but include age group rather than continuous age in the model treating the youngest age group as the reference group.
3.2 Present the results of the regression from part 3.1 in the form of a table (Table 8). Make sure to add a title for the table. Compare how the findings are similar to or different from the model you fit for part 2.2. What do we learn about the relationship between age level and blood pressure from this model that we did not learn from the model you fit for part 2.2?
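The model in part 3.1 could be sketched as follows, assuming the interaction term was dropped in part 2.2 and assuming the same level codes and pregnancy indicator (ridexprg) used for illustration elsewhere. The ib(first). prefix makes the lowest-valued category of agecat the reference group, which is the youngest group only if agecat is coded in increasing age order.

```stata
* Same model as part 2.2, with categorical age in place of continuous age.
* ib(first).agecat uses the smallest agecat value as the base level.
svy, subpop(if inrange(ridageyr, 21, 79) & ridexprg != 1): ///
    logistic hyperten ib2.bmilevel ib(first).agecat ib2.riagendr ib0.married ib3.ridreth3
```

Each age-group odds ratio is then a comparison with the youngest group, so the pattern across groups can show whether the association with age is roughly steady or changes between groups.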