Instructions: The objectives of this project are to form research questions, choose a number of appropriate variables from a real data set for subsequent hypothesis testing, manipulate and clean the selected data, conduct the appropriate tests, and summarize the results substantively.

You will work on the project in a group of 3 students. Each group should work independently. You may ask the instructor or other students for help with manipulating the data set, but not with interpreting the results. The project report must be typed and double-spaced in MS Word; PDF format is not accepted. All figures and tables should be of good quality. Each group submits only one report, and its first page must show every group member’s name and an email address for receiving feedback.

Students can use the datasets from their mentor’s research project.

Structured report:

1. Method:

a. Measures: define the dependent variable, the primary independent variable, and other covariates. How was each variable measured, classified or recoded? Each variable should be associated with a meaningful variable name (e.g. lung cancer, smoking status, etc.), value label (e.g. 1=smoke 2=non-smoke, etc.), and corresponding question from the questionnaire. You can use a table to present the above information.

2. Results: All independent variables included in the final model must be statistically significant. Students may take several rounds of variable selection to meet this criterion.

a. Present the analysis results using tables or figures (not required).

A table of statistical results for the proportional odds model or the multinomial logit model.

Any figures, if appropriate.

b. Summarize EACH test result in text (e.g. Wald χ2=4.52, df=1, p=0.034; Odds ratio=1.2, 95% CI: 1.01-1.42, etc.).

c. Fully interpret test results by

i. stating a null hypothesis for each variable (e.g. the proportion of subjects with healthy diet behavior is the same in the cancer patients and the normal controls.)

ii. explicitly stating whether the null hypothesis is rejected according to the p-value of the corresponding statistical test.

iii. explaining odds ratios and 95% confidence intervals in the context of the association using plain English.

iv. summarizing the results substantively: What does the model tell you about any associations between the variables? Do these results conform to your expectations?

4. Appendix: Include the SAS syntax, along with edited copies of the relevant parts of your computer printouts.

Parts 1-4 should be in one electronic file in MS Word.

Statistical method requirements

A. Variable selection:

1. One ordinal or nominal dependent variable (at least 3 categories), a primary independent variable, and at least three other independent variables with a mix of categorical and continuous variables for either a proportional odds model (with ordinal dependent variable) or a multinomial logit model (with nominal dependent variable).

Note: Sociodemographic variables cannot be used as dependent variables in the regression models.
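For reference, the two candidate models are commonly written as shown below. This is a sketch in standard textbook notation, not something specified by the assignment: Y is the dependent variable with J categories, x is the vector of independent variables, and the symbols for the intercepts and coefficients are introduced here only for illustration.

```latex
% Proportional odds (cumulative logit) model, ordinal Y with J categories.
% A single coefficient vector \beta is shared across all J-1 cut points.
\log\frac{P(Y \le j \mid \mathbf{x})}{P(Y > j \mid \mathbf{x})}
  = \alpha_j + \mathbf{x}^{\top}\boldsymbol{\beta},
  \qquad j = 1, \dots, J-1

% Multinomial (generalized) logit model, nominal Y with reference category J.
% Each non-reference category j has its own coefficient vector \beta_j.
\log\frac{P(Y = j \mid \mathbf{x})}{P(Y = J \mid \mathbf{x})}
  = \alpha_j + \mathbf{x}^{\top}\boldsymbol{\beta}_j,
  \qquad j = 1, \dots, J-1
```

The shared coefficient vector in the first model is the proportional odds assumption. The second model estimates a separate set of odds ratios for each non-reference category.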

B. Data management: All variables should be coded or recoded in a meaningful way. For example, if the data set provides education as years of education, a categorical education variable should be coded as less than high school, high school graduate, college, and graduate. No variable in the final data set should have "unknown" or "refused to answer" categories; recode those categories as missing values. Each category of a variable should be sufficiently large (>30 subjects). For each variable, the report should include the meaningful variable name (e.g. Education instead of var1), the value labels (e.g. Education 1=less than high school, 2=high school, 3=college or higher), and the corresponding question(s) (e.g. What is the highest education you have ever received?).
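The recoding rules above can be sketched in a few lines. The project itself must be done in SAS; the Python below only illustrates the logic. The special codes (77, 99), the year cutoffs, and the function names are all hypothetical and should be replaced with the values from the actual codebook.

```python
from collections import Counter

# Hypothetical special codes; check the survey codebook for the real ones.
MISSING_CODES = {77, 99}   # assumed: 77 = refused, 99 = unknown
MIN_CATEGORY_SIZE = 30     # each category needs more than 30 subjects


def recode_education(years):
    """Map years of education to a category code, or None for missing."""
    if years is None or years in MISSING_CODES:
        return None
    if years < 12:
        return 1  # less than high school
    if years == 12:
        return 2  # high school graduate
    if years <= 16:
        return 3  # college
    return 4      # graduate


def small_categories(values, min_size=MIN_CATEGORY_SIZE):
    """Return {category: count} for categories that are not larger than min_size."""
    counts = Counter(v for v in values if v is not None)
    return {cat: n for cat, n in counts.items() if n <= min_size}


# Made-up input, for illustration only.
raw = [10] * 40 + [12] * 35 + [14] * 50 + [18] * 5 + [77, 99, None]
recoded = [recode_education(y) for y in raw]
print(small_categories(recoded))  # {4: 5}
```

A category flagged this way (here, the graduate group with only 5 subjects) would typically be merged with a neighboring category, as in the "college or higher" value label.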

C. Statistical Tests to be conducted are:

• For proportional odds model/cumulative logit models,

o Score chi-square test for checking the proportional odds assumption

o Wald chi-square test for regression coefficients for each independent variable

o Odds ratio (OR) and 95% confidence interval (CI) for each independent variable

o Write and test the null hypothesis for each variable

o Proc surveylogistic in SAS must be used

• For multinomial logit model

o Model checking with the deviance statistic, using Proc logistic

o Wald chi-square test for regression coefficients for each independent variable

o Odds ratio (OR) and 95% confidence interval (CI) for each independent variable

o Write and test the null hypothesis for each variable

o Proc surveylogistic in SAS must be used
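As a check on how the reported quantities relate to each other, the sketch below computes a Wald chi-square, its p-value, the odds ratio, and a 95% confidence interval from one coefficient and its standard error. It assumes a single-coefficient test (df=1, as for a continuous or binary independent variable); a categorical variable with more than two levels has a multi-degree-of-freedom test that this does not cover. The function name and the input values are hypothetical. In the project, the estimate and standard error come from the SAS output.

```python
import math


def wald_summary(beta, se, z_crit=1.959964):
    """Wald test and odds ratio for one regression coefficient (df = 1)."""
    chi_sq = (beta / se) ** 2
    # Upper-tail probability of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi_sq / 2.0))
    odds_ratio = math.exp(beta)
    # The interval is built on the log-odds scale, then exponentiated.
    ci_low = math.exp(beta - z_crit * se)
    ci_high = math.exp(beta + z_crit * se)
    return chi_sq, p_value, odds_ratio, ci_low, ci_high


# Hypothetical estimate and standard error, not taken from any data set.
chi_sq, p, odds, low, high = wald_summary(beta=0.50, se=0.20)
print(f"Wald chi-square={chi_sq:.2f}, df=1, p={p:.3f}; "
      f"OR={odds:.2f}, 95% CI: {low:.2f}-{high:.2f}")
```

Because the interval is computed on the log-odds scale, it is not symmetric around the odds ratio. It excludes 1 exactly when the Wald p-value is below 0.05.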

Each student is required to email his or her answers to the following questions directly to the instructor:

I. How much did you _____________ (Name) contribute to this group project in terms of the proportion of the overall efforts?

II. How much do you think team member ________________ (Name) contributed to this group project in terms of the proportion of the overall efforts?

III. Do you believe that the distribution of workloads for this project in your group is fair?
