• Obtain a dataset of your choosing. The dataset needs to have at least 50 observations and five variables. You need to have at least two categorical and at least two quantitative variables.
• Apply some of the concepts learned in class. These include:
o Exploratory data analysis (summary statistics, histograms, box plots, tables, etc.)
o Confidence interval estimation
o Hypothesis tests regarding proportions or means
o Correlation/regression
• Explain your dataset, your analyses, and your results/interpretations in a 5+ page report.
o Reports should be written using proper English grammar, and the presentation should be clean, neat, and professional. On the homework assignments, I don’t particularly care how polished the figures and tables look. In the report, the “look” of the figures is important: axis titles should be in plain English rather than raw variable names from the code, and each figure should be sized appropriately (recommended dimensions: 6.5 inches wide by 3 inches tall).
o Provide logical graphical displays as needed to complement your analysis.
Tests should be logical applications that suit the dataset. Students should be able to clearly articulate the relevant hypotheses and explain how the results support or fail to support them. Analysis code should be attached as an appendix.
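As a rough illustration of the exploratory step listed above, the sketch below computes the usual summary statistics for one quantitative variable and a frequency table for one categorical variable. It uses only the Python standard library. The variable names (`gpa`, `major`) and their values are made up for the example and are far smaller than the 50 observations the project requires. The class may use different software, so treat this as one possible approach rather than the expected one.

```python
from collections import Counter
import statistics

# Hypothetical sample for illustration only; a real project needs
# at least 50 observations.
gpa = [3.1, 3.4, 2.8, 3.9, 3.0, 3.6, 2.5, 3.3]
major = ["Math", "Biology", "Math", "History",
         "Biology", "Math", "History", "Math"]


def summarize(values):
    """Return common summary statistics for a quantitative variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "sd": statistics.stdev(values),  # sample standard deviation
    }


def frequency_table(labels):
    """Return {category: (count, relative frequency)} for a categorical variable."""
    counts = Counter(labels)
    total = len(labels)
    return {cat: (n, n / total) for cat, n in counts.items()}


gpa_summary = summarize(gpa)
major_table = frequency_table(major)

for name, value in gpa_summary.items():
    print(f"{name:>6}: {value:.3f}")
for cat, (n, share) in major_table.items():
    print(f"{cat:<8} {n:>3} {share:.1%}")
```

The same two helpers can be reused for each variable in the dataset, which keeps the tables in the Data Overview section consistent.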
SUGGESTED OUTLINE
Here is a suggested outline for the report:
1. Abstract. Should be about 150 words. This is the tl;dr version of your report – just a statement about the data and the key findings.
2. Introduction. Background on the dataset – where it came from, why you selected it, and a brief description of the variables you used in your analysis. No summary statistics or graphs in this section.
3. Data Overview. Provide some relevant descriptive statistics and visualizations for the variables used in your analysis. This section should serve to inform the reader about the distribution of your variables.
a. For quantitative variables, you should show summary statistics like the mean, median, minimum, maximum, and standard deviation (similar to the table from problem 1.3 in HW1). You should also include figures like histograms or box plots that highlight the distribution of the data.
b. For categorical variables, you should show frequency (or relative frequency) tables and possibly a bar chart or pie chart.
4. Analysis and Results.
a. Analysis 1 – Create a one-sample or two-sample confidence interval for one of your variables. Discuss the results of the analysis. Discuss the meaning of the results. For a two-sample analysis, include any relevant figures comparing the two distributions. (For example, if you are looking at the average GPA by gender, then include a grouped box plot that shows the distribution of GPA by gender. If you are looking at admissions status by gender, then a grouped bar chart would be informative.)
b. Analysis 2 – Run a one-sample or two-sample hypothesis test for one of your variables. (If you did one-sample for Analysis 1, then Analysis 2 must be two-sample. If you did two-sample for Analysis 1, then Analysis 2 can be either one-sample or two-sample.) Discuss the results of the analysis. Discuss the meaning of the results. For a two-sample analysis, include any relevant figures comparing the two distributions.
c. Analysis 3 – Use your data to create a regression (or multiple regression) model. Discuss the results of the analysis. Discuss the meaning of the results. This section should touch on and/or include:
i. What is the correlation coefficient between the dependent and independent variable(s)?
ii. Scatter plot comparing the dependent and independent variable(s)
iii. If you use any indicator variables, be sure to include a sentence or two on how they are calculated.
iv. Regression output for your model
v. An interpretation of the slope(s)
vi. What does the R² of your model tell you?
5. Conclusions. Statements regarding the overall findings, any frustrations regarding analyses (is there anything you wanted to do but it was not covered in the class), and suggested next steps for your analysis. You won’t need to take any “next steps” – but what else would you look into if you had the time/interest?
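The three analyses in the outline can be sketched in a few lines each. The example below is a minimal illustration using only the Python standard library, under several assumptions: the data (`gpa_a`, `gpa_b`, `hours_a`) are invented, the interval and test use the large-sample normal (z) approximation rather than the t-based procedures the class may expect, and the regression has a single predictor. With a sample as small as the one shown, the normal approximation would not be appropriate in a real report.

```python
import math
from statistics import NormalDist, mean, stdev

# Hypothetical data: GPA for two groups, plus weekly study hours for group A.
gpa_a = [3.1, 3.4, 2.8, 3.9, 3.0, 3.6]
gpa_b = [2.9, 3.2, 2.7, 3.5, 3.1, 2.6]
hours_a = [10, 14, 8, 20, 9, 15]


def mean_ci(values, level=0.95):
    """Large-sample (z) confidence interval for a population mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * stdev(values) / math.sqrt(len(values))
    center = mean(values)
    return center - half_width, center + half_width


def two_sample_z(x, y):
    """Two-sided z test of H0: the two population means are equal."""
    se = math.sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    z = (mean(x) - mean(y)) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def simple_regression(x, y):
    """Least-squares slope, intercept, and R^2 for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)  # correlation coefficient
    return slope, intercept, r ** 2


low, high = mean_ci(gpa_a)
z_stat, p_value = two_sample_z(gpa_a, gpa_b)
slope, intercept, r_squared = simple_regression(hours_a, gpa_a)

print(f"95% CI for mean GPA (group A): ({low:.2f}, {high:.2f})")
print(f"Two-sample z = {z_stat:.2f}, p-value = {p_value:.3f}")
print(f"GPA = {intercept:.2f} + {slope:.3f} * hours, R^2 = {r_squared:.2f}")
```

In the report, each printed number still needs an interpretation in words, for example what a one-hour increase in study time is associated with, and what share of the variation in GPA the model accounts for.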