INSTRUCTIONS TO CANDIDATES

You may work in groups, but you must submit your own write-up of the homework. Create a Word or PDF document with text as if you were writing up a statistical analysis report. Embed the figures and tables in your document. You can copy a table by selecting it, right-clicking, and copying it as a picture, then pasting it into Word. Similarly for graphs: click on the window with the graph, press Ctrl-C (Command-C on Mac) to copy, then Ctrl-V (Command-V on Mac) to paste into Word. Important: it is unacceptable to answer with just a graph or a number. You must write a sentence or two describing what you observe and your interpretation. Please include units (mm, inches, pounds, mm/L, etc.) when reporting any numbers. Be professional in your homework report. For problems requiring probability calculations, please show all work to maximize partial credit. For questions involving hypothesis testing or confidence intervals, assume the 0.05 significance level unless otherwise specified.

For all homework problems, when performing any statistical test, please state the hypotheses and report the test statistic and the p-value (or critical value, if requested) as part of performing and interpreting the test.

Problems from Rosner 8th edition (with additions):

8.81. This problem uses the tennis data from Rosner, described in 8.81 and the preamble, and focuses on answering the research question: “Is there a difference in degree of pain during maximal activity while on Motrin compared to placebo?” The comments and question parts below walk you through the steps and analyses needed to find the answer.

Comments: This is a cross-over design with a washout period. That means all individuals are given both the placebo and the treatment, separated in time. In this data set, some individuals were given the placebo first and then the treatment (Motrin), and others were given Motrin first and then placebo. Part of this problem is the coding required to organize the data. Consider the following:

1) You need to organize the data so that you compare Motrin to placebo the same way across all individuals. For example, if you need to take a difference, make sure the difference is in the same direction (Motrin – placebo) for everyone.

2) You need to perform data cleaning (look for missing-value codes and recode them to Stata missing values if needed).

3) The data dictionary defines variables in terms of a “study period”. This is either period 2 (first course of drug/placebo), period 3 (washout, no one takes anything), or period 4 (second course of placebo/drug). The study period for a given variable is listed in the “period” column of the data dictionary.
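The logic of points 1) and 2) can be sketched on toy data. This is only an illustration in Python, not the Stata code the assignment asks for. The field names (`drug_order`, `pain_p2`, `pain_p4`), the order coding, and the missing-value code 99 are all hypothetical, so check the data dictionary for the real ones.

```python
# Toy cross-over records; every name and value here is made up for illustration.
MISSING_CODE = 99  # hypothetical missing-value code; check the data dictionary

# Assumed coding: drug_order 1 = Motrin in period 2 and placebo in period 4;
# drug_order 2 = placebo in period 2 and Motrin in period 4.
records = [
    {"id": 1, "drug_order": 1, "pain_p2": 2, "pain_p4": 4},
    {"id": 2, "drug_order": 2, "pain_p2": 5, "pain_p4": 3},
    {"id": 3, "drug_order": 1, "pain_p2": 99, "pain_p4": 4},
]


def clean(value):
    """Turn the missing-value code into None."""
    return None if value == MISSING_CODE else value


def motrin_minus_placebo(rec):
    """Return the Motrin - placebo difference, or None if either value is missing."""
    p2, p4 = clean(rec["pain_p2"]), clean(rec["pain_p4"])
    if p2 is None or p4 is None:
        return None
    # Pick which period was Motrin so the sign is the same for everyone.
    motrin, placebo = (p2, p4) if rec["drug_order"] == 1 else (p4, p2)
    return motrin - placebo


diffs = {rec["id"]: motrin_minus_placebo(rec) for rec in records}
print(diffs)
```

Subjects 1 and 2 took the drugs in opposite orders, yet both differences come out as Motrin minus placebo. Subject 3 is dropped from the comparison because one of the two values is missing.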

a) Is it reasonable to apply the central limit theorem to the variable(s) needed to answer the research question? (Be sure to consider the study design when deciding which variable(s) to assess.) Answer yes or no, and provide plots to support your answer.

Regardless of your answer to part a), assume the central limit theorem does apply, and continue to answer the research question by completing the following homework questions:

b) What test would we use to answer the research question: Is there a difference in degree of pain during maximal activity while on Motrin compared to placebo?

c) Perform the test in part b using Stata. Report the null and alternative hypotheses, test statistic, and p-value and interpret your result.

d) Report and interpret the 95% CI for the difference of the means.

e) Report and interpret the appropriate one-sided confidence interval for the difference of the means to answer the following research question: Is Motrin associated with a lower degree of pain during maximal activity compared to placebo?
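As a rough illustration of the arithmetic behind parts c) to e), the sketch below computes a one-sample t statistic on within-subject differences. It assumes that analyzing each subject's Motrin – placebo difference is appropriate for this design. The numbers are made up, and the Python is only meant to show the calculation, not to replace the Stata output the problem asks for.

```python
import math
import statistics

# Made-up Motrin - placebo differences (pain-score units), for illustration only.
diffs = [-2.0, -1.0, 0.0, -3.0, 1.0, -1.0]

n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)   # sample SD (n - 1 in the denominator)
se_diff = sd_diff / math.sqrt(n)    # standard error of the mean difference
t_stat = mean_diff / se_diff        # tests H0: mean difference = 0
df = n - 1                          # degrees of freedom for the t distribution

print(f"mean difference = {mean_diff:.3f}, SE = {se_diff:.3f}, "
      f"t = {t_stat:.3f} on {df} df")
```

With a critical value t* taken from a t table or from Stata, a two-sided 95% interval has the form mean_diff ± t*(df, 0.975) × se_diff. A one-sided 95% upper bound for the "lower pain on Motrin" question has the form mean_diff + t*(df, 0.95) × se_diff.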

8.139 (Comment: this problem requires data manipulation skills to compute the needed variables, as outlined in the problem.) In addition, please report the appropriate 95% confidence interval for the mean difference.

To summarize the data manipulation (hint: use commands covered in the slides on Data Wrangling):

1) calculate the average HgbA1c value for each subject across all visits

2) find the median of the per-boy average HgbA1c values (that is, the median across boys, not across all visits)

3) generate a categorical variable grouping boys into controlled (HgbA1c < median) or uncontrolled (HgbA1c > median)

4) create a variable for growth, defined as change in weight divided by change in age: (weight at last visit – weight at first visit)/(age at last visit – age at first visit), where age is in years and weight is in kg

5) compare growth in boys with controlled HgbA1c to growth in boys with uncontrolled HgbA1c using the appropriate test
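Steps 1) to 4) can be sketched on a toy long-format table (one row per visit). This is a hedged Python illustration of the wrangling logic only. The tuple layout and all values are invented, and the real work should use the Stata commands from the Data Wrangling slides.

```python
import statistics
from collections import defaultdict

# Made-up visits: (boy id, age in years, weight in kg, HgbA1c in %).
visits = [
    (1, 10.0, 30.0, 7.0), (1, 12.0, 38.0, 8.0),
    (2, 9.0, 28.0, 10.0), (2, 11.0, 34.0, 11.0),
    (3, 8.0, 25.0, 9.0), (3, 10.0, 33.0, 9.0),
]

# Collect each boy's visits together.
by_boy = defaultdict(list)
for boy_id, age, weight, a1c in visits:
    by_boy[boy_id].append((age, weight, a1c))

summary = {}
for boy_id, rows in by_boy.items():
    rows.sort()  # order visits by age so first and last visit are well defined
    (age0, wt0, _), (age1, wt1, _) = rows[0], rows[-1]
    summary[boy_id] = {
        "mean_a1c": statistics.mean(r[2] for r in rows),  # step 1
        "growth": (wt1 - wt0) / (age1 - age0),            # step 4, kg per year
    }

# Step 2: median across boys of the per-boy averages (not across visits).
median_a1c = statistics.median(s["mean_a1c"] for s in summary.values())

# Step 3: a boy exactly at the median fits neither definition, so he gets None.
for s in summary.values():
    if s["mean_a1c"] < median_a1c:
        s["group"] = "controlled"
    elif s["mean_a1c"] > median_a1c:
        s["group"] = "uncontrolled"
    else:
        s["group"] = None

for boy_id, s in sorted(summary.items()):
    print(boy_id, s)
```

A boy with a single visit would have a zero change in age, so the growth variable is undefined for him. The sketch assumes every boy has at least two visits at different ages.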

Use the LEAD.DAT.dta data in tabular form (available on Canvas) to construct your own plots using software (Stata preferred) and answer the questions below. Include your plots in your report as part of your assessment of outliers, and describe your analysis strategy for dealing with any outliers you detect.

b) Assess whether there are any outliers for full-scale IQP (Performance IQ) in the unexposed lead group and list all the outliers with IDs.

c) Assess whether there are any outliers for full-scale IQP (Performance IQ) in the exposed lead group and list all the outliers with IDs.
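One common screening rule for outliers, which the assignment does not prescribe and is assumed here only for illustration, is the box-plot (Tukey) fence: flag values more than 1.5 × IQR below the first quartile or above the third. The sketch below applies it to made-up (ID, score) pairs for a single group. Quartile definitions differ between software packages, so the fences may not match Stata's exactly.

```python
import statistics

# Made-up (id, IQ score) pairs for one exposure group; illustration only.
scores = [(101, 95), (102, 100), (103, 98), (104, 102), (105, 97),
          (106, 104), (107, 99), (108, 101), (109, 150)]

values = [v for _, v in scores]

# Quartiles via one of several conventions (linear interpolation, inclusive).
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # Tukey fences

# Keep the IDs so flagged observations can be listed in the report.
outliers = [(sid, v) for sid, v in scores if v < lower or v > upper]
print(f"fences: [{lower}, {upper}]  outliers: {outliers}")
```

The same check would be run separately within each exposure group, since a value can be unusual in one group but not in the other.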

