
INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS

Term Project: 

During the course, you will work on a term project that takes you through an exploratory data analysis. You will pose a question and then evaluate a dataset to answer it, in the same way the author of our book attempts to answer the question, “Do first babies tend to arrive late?” While you will gain experience in the course with the datasets the author selected, you will also be required to select a dataset of your own, form a question or hypothesis about that dataset, and then use the techniques we learn in the course to work towards proving or disproving your hypothesis. You will be required to provide a summary of your results along the way, along with a written and visual presentation of your findings. The term project will be graded at the end of the course; however, each week you should follow along with the book and apply the methods learned to your own dataset. 

The first step is selecting a dataset and forming your question/hypothesis. 

Some helpful places to find datasets include: 

Data Sets, Tableau Community Forums 

Datasets, Kaggle 

Data.gov, U.S. General Services Administration  

Science.gov, Office of Scientific and Technical Information  

Data.gov.uk, Government Digital Service  

The General Social Survey, University of Chicago 

The European Social Survey, ESS-ERIC  

There are no restrictions on what dataset you use, except that you cannot use the specific datasets used primarily in the book, which come from the following sources: the National Survey of Family Growth and the Behavioral Risk Factor Surveillance System. 

You will turn in your entire term project during the final week of class; however, here are some milestones to help you gauge where you should be during each week of class. 

Milestone 1 (Weeks 1-2) – Evaluate datasets, start thinking of statistical questions 

Milestone 2 (Weeks 3-4) – Select a dataset, solidify your statistical question, and begin describing the single variables in your dataset to determine which variables are relevant to your question (Distributions, PMFs, CDFs).  You should know the statistical question you are trying to answer no later than Week 3. 

Milestone 3 (Weeks 5-6) – Start identifying relationships between the variables you have identified, rather than looking at just one variable at a time. 

**Week 6 will require you to post your topic to the discussion board, along with the analysis you plan to perform on your dataset.** 

Milestone 4 (Weeks 7-10) – Start evaluating whether the results you see in your sample would hold in the larger population, and start testing the results and hypotheses you have formed up to this point. 

Milestone 5 (Weeks 11-12) – Wrap up your PowerPoint presentation and the summary of your analysis. 

The following items are due, submitted to your GitHub repository.  Submit a link to your repository through the assignment link during the final week of class: 

Your dataset 

A PowerPoint presentation outlining your statistical question/hypothesis 

o A minimum of 5 variables in your dataset used during your analysis (for help with selecting, the author made his selection on page 6 of your book).  Consider what you think could have an impact on your question – remember this is never perfect, so don’t be worried if you miss one (Chapter 1). 

o Describe what the 5 variables mean in the dataset (Chapter 1). 

o Include a histogram of each of the 5 variables – in your summary and analysis, identify any outliers, explain why you consider them outliers, and describe how you believe they should be handled (Chapter 2).  

o Include the other descriptive characteristics about the variables: Mean, Mode, Spread, and Tails (Chapter 2). 

o Using pg. 29 of your text as an example, compare two scenarios in your data using a PMF.  As a reminder, this is not a comparison of two variables against each other; it is the same variable under two different scenarios, almost like a filter.  The example in the book is first babies compared to all other babies: it is still the same variable, but the data is broken out based on the criteria we are exploring (Chapter 3). 

o Create 1 CDF with one of your variables, using pages 41-44 as your guide.  What does this tell you about your variable, and how does it address the question you are trying to answer (Chapter 4)? 

o Plot 1 analytical distribution and provide your analysis on how it applies to the dataset you have chosen (Chapter 5). 

o Create two scatter plots comparing two variables and provide your analysis on correlation and causation.  Remember, covariance, Pearson’s correlation, and nonlinear relationships should also be considered during your analysis (Chapter 7). 

o Conduct a test on your hypothesis using one of the methods covered in Chapter 9. 

o For this project, conduct a regression analysis on either one dependent and one explanatory variable, or multiple explanatory variables (Chapters 10 & 11). 
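As a rough illustration of the descriptive, PMF, and CDF items above, here is a minimal Python sketch. The `records` data, the group labels "a" and "b", and the helper names `pmf` and `cdf` are all hypothetical placeholders, not taken from the book or from any particular dataset. It uses only the standard library so it runs anywhere; in a notebook you would more likely load your data with a library such as pandas and plot the results.

```python
from collections import Counter
from statistics import mean, mode, pstdev

# Hypothetical (group, value) records standing in for ONE variable of your
# dataset, split by the scenario you want to compare (the "filter").
records = [
    ("a", 38), ("a", 39), ("a", 39), ("a", 40), ("a", 41),
    ("b", 37), ("b", 39), ("b", 39), ("b", 39), ("b", 40),
]

def pmf(values):
    """Map each distinct value to its relative frequency."""
    counts = Counter(values)
    n = len(values)
    return {v: counts[v] / n for v in sorted(counts)}

def cdf(values):
    """Map each distinct value x to the fraction of values <= x."""
    counts = Counter(values)
    n = len(values)
    running = 0
    result = {}
    for v in sorted(counts):
        running += counts[v]
        result[v] = running / n
    return result

group_a = [v for g, v in records if g == "a"]
group_b = [v for g, v in records if g == "b"]
all_values = group_a + group_b

# Descriptive characteristics of the variable as a whole.
print("mean:", mean(all_values))
print("mode:", mode(all_values))
print("spread (population std dev):", pstdev(all_values))

# Same variable, two scenarios: difference between the two PMFs per value.
pmf_a, pmf_b = pmf(group_a), pmf(group_b)
for v in sorted(set(pmf_a) | set(pmf_b)):
    print(v, round(pmf_a.get(v, 0) - pmf_b.get(v, 0), 3))

# CDF of the full variable.
print(cdf(all_values))
```

The PMF comparison works on a single variable filtered two ways, which is the point of the Chapter 3 item; the per-value differences are what you would then chart.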

Using Python, submit your results via your notebook or export your code and submit via the assignment link.  You must show your code and work for full credit. 
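To give a sense of the kind of code and work a notebook might show for the correlation, hypothesis test, and regression items, here is a hedged sketch in plain Python. The function names and the toy data are hypothetical. The test shown is a permutation test on the difference in means, chosen here as one common resampling approach; check it against the methods your Chapter 9 actually covers before relying on it.

```python
import random

def pearson(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def least_squares(xs, ys):
    """Intercept and slope of a simple (one explanatory variable) linear fit."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

def permutation_p_value(group1, group2, iters=1000, seed=0):
    """Share of random relabelings whose absolute difference in means is
    at least as large as the observed one."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    n = len(group1)
    observed = abs(sum(group1) / n - sum(group2) / len(group2))
    pooled = list(group1) + list(group2)
    hits = 0
    for _ in range(iters):
        rng.shuffle(pooled)
        first, second = pooled[:n], pooled[n:]
        diff = abs(sum(first) / len(first) - sum(second) / len(second))
        if diff >= observed:
            hits += 1
    return hits / iters

# Hypothetical toy data; replace with two variables from your own dataset.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]

print("Pearson r:", pearson(xs, ys))
print("intercept, slope:", least_squares(xs, ys))
print("p-value:", permutation_p_value([38, 39, 39, 40, 41], [37, 39, 39, 39, 40]))
```

A high correlation or a good linear fit does not by itself establish causation, and Pearson's coefficient can miss nonlinear relationships, so the scatter plots remain part of the analysis.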

A 250-500-word paper summarizing the following: 

o Statistical/Hypothetical Question 

o Outcome of your EDA 

o What do you feel was missed during the analysis? 

o Were there any variables you felt could have helped in the analysis? 

o Were there any assumptions made you felt were incorrect? 

o What challenges did you face, what did you not fully understand? 
