Background and overview
Childhood obesity is a serious medical condition that affects children and adolescents. Children who are obese are above the normal weight for their age, height and gender. Obesity is prevalent worldwide; a report by the World Health Organization (WHO) revealed that over 340 million children and adolescents aged 5 to 19 were overweight or obese in 2016.
Obesity is a major risk factor for a number of health problems, including diabetes, cardiovascular disease and cancer, as well as psychological problems such as isolation and low self-esteem.
Obese children are likely to remain obese in adulthood and are more likely to develop a variety of health problems as adults. A study tracking child obesity (Starc and Strel, 2011), among others, found that the probability of overweight children becoming overweight adults increases with a child’s age.
In the UK, childhood obesity is monitored via the National Child Measurement Programme (NCMP). Under this programme, measurements are taken of the height and weight of children in Reception class (aged 4 to 5) and Year 6 (aged 10 to 11), to assess overweight and obesity levels in children within primary schools.
The NCMP runs every school year. Parents or carers of children eligible for measurement (i.e., attending state-maintained schools in Reception Year and Year 6) receive a letter from local authorities informing them of the programme. They may choose to withdraw their child from the process. The height and weight of all eligible children (with consent) are then measured and submitted to NHS Digital and Public Health England (PHE) for further analysis.
The NCMP uses the accepted method for diagnosing obesity in children: measuring their Body Mass Index (BMI), expressed as a percentile for their weight given a reference weight distribution (the WHO Growth Standards) of children of the same age, gender and height. A child is classified as overweight if their weight exceeds the 91st percentile and obese if it exceeds the 98th.
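The two thresholds can be illustrated with a short R sketch. The function name `classify_bmi_centile` and the example centiles are hypothetical, and the sketch assumes that the centile has already been computed against the reference distribution. It also treats the categories as mutually exclusive (a child above the 98th centile is labelled only "obese"), which is an interpretation rather than something stated above.

```r
# Hypothetical helper: classify children from a BMI centile on a 0-100 scale,
# using the 91st and 98th centile thresholds described in the text.
classify_bmi_centile <- function(centile) {
  stopifnot(is.numeric(centile),
            all(centile >= 0 & centile <= 100, na.rm = TRUE))
  # Intervals are right-closed: (-Inf, 91], (91, 98], (98, Inf),
  # so a centile of exactly 91 does not "exceed" the 91st centile.
  cut(centile,
      breaks = c(-Inf, 91, 98, Inf),
      labels = c("not overweight", "overweight", "obese"),
      right = TRUE)
}

# Made-up centiles, including values exactly on each threshold
centiles <- c(50, 91, 91.5, 98, 99.2)
status <- classify_bmi_centile(centiles)
print(data.frame(centiles, status))
```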
The aim of the NCMP data collection is to study the factors associated with childhood obesity and to identify appropriate strategies to tackle it. Interventions may be both direct and indirect. For example, a direct intervention may be to improve the nutritional value of school meals; an indirect intervention may address underlying factors which are associated with obesity, such as socioeconomic indicators. A detailed understanding of the complex socioeconomic factors involved in child obesity is therefore paramount to any successful intervention. In this assignment, your aim is to build statistical models that will give you such an understanding.
The data provided for the analysis are a subset of the NCMP data, together with relevant socioeconomic indicators, spanning the period 2011 to 2017. The data have been obtained from the PHE database and contain annual information on numbers of Year 6 obese children in each of 326 Unitary Authorities (UAs — these are administrative districts) in England, along with corresponding socioeconomic indicators and the population size. PHE allows the use of these data free of charge in any format or medium, under the terms of the Open Government Licence v.3.0.
The data are provided as three separate files on the In-course assessment 2 tab of the STAT0023 Moodle page. The first, obesity.csv, contains a “cleaned” and anonymised version of the original obesity data: 2 232 observations (each row is an annual observation for a single UA) of Year 6 obesity counts and 14 additional covariates. Full details, including the anonymisation procedure (which includes small linear transformations of most variables), can be found in the Appendix to these instructions. The first 1 785 rows are complete, i.e., contain all values of obesity and covariates. The last 447 rows contain all values of the covariates but no obesity counts. The second file, UARegions.csv, lists all the UA codes together with the wider region each UA belongs to (for example, UA EE01 is in the “East of England” region). The final file, IndicatorNames.csv, gives information about the variables in the PHE database.
Citation: Starc G, Strel J. Tracking excess weight and obesity from childhood to young adulthood: a 12-year prospective cohort study in Slovenia. Public Health Nutrition 2011;14:49-55. doi:10.1017/S1368980010000741. A copy of this paper can also be downloaded from the ‘In-course assessment 2’ tab of the STAT0023 Moodle page.
Your task in this assessment is to carry out some data preprocessing to combine the datasets and then to use the data from the first 1 785 records to build a statistical model that will help you to:
• Understand the social, demographic and economic factors associated with variation in annual Year 6 obesity counts in each UA; and
• Estimate the Year 6 obesity counts for each of the 447 records where you don’t have this information.
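The preprocessing described above might look roughly like the following R sketch. It uses small made-up data frames in place of obesity.csv and UARegions.csv so that it runs on its own. All column names (`UA`, `Year`, `Obese`, `Pop`, `Region`), the second UA code and its region are hypothetical, and the missing-value code of -1 is an assumption that you should check against the Appendix.

```r
# Toy stand-ins for obesity.csv and UARegions.csv (hypothetical column names).
# With the real files you would start from something like
#   obesity <- read.csv("obesity.csv"); regions <- read.csv("UARegions.csv")
obesity <- data.frame(
  UA    = c("EE01", "EE01", "XX02", "XX02"),
  Year  = c(2011, 2012, 2011, 2012),
  Obese = c(120, 131, 85, -1),   # -1 is ASSUMED to flag a missing count
  Pop   = c(900, 950, 700, 720)
)
regions <- data.frame(
  UA     = c("EE01", "XX02"),
  Region = c("East of England", "Hypothetical Region")
)

# Recode the assumed missing-value flag to NA in numeric columns only
MISSING_CODE <- -1
num_cols <- vapply(obesity, is.numeric, logical(1))
obesity[num_cols] <- lapply(obesity[num_cols], function(x) {
  replace(x, which(x == MISSING_CODE), NA)
})

# Left join: keep every obesity row and attach its region
combined <- merge(obesity, regions, by = "UA", all.x = TRUE)
stopifnot(nrow(combined) == nrow(obesity))   # no rows lost or duplicated

# merge() may reorder rows, so split on missingness rather than row position
train   <- combined[!is.na(combined$Obese), ]
to_pred <- combined[is.na(combined$Obese), ]
print(train)
print(to_pred)
```

Splitting on `is.na()` rather than on row numbers is a deliberate choice here: it keeps working if the merge changes the row order.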
Detailed instructions
You may use either R or SAS for this assessment.
1. Read the data into your chosen software package, define appropriate variable names (see the Appendix) and carry out any necessary recoding (e.g. to deal with the fact that ‘-1’ represents a missing value).
2. Combine the data from obesity.csv and UARegions.csv into a single dataset (a data frame if you’re using R, a dataset in SAS), so that each row contains both the original data and the wider region that each observation corresponds to.
3. Carry out an exploratory analysis that will help you to start building a sensible statistical model to understand and predict annual obesity counts for each UA. This analysis should aim to identify an appropriate set of candidate variables to take into the subsequent modelling exercise, as well as to identify any important features of the data that may have implications for the modelling. You will need to consider the context of the problem to guide your choice of exploratory analysis. See the ‘Hints’ below for some ideas.
4. Using your exploratory analysis as a starting point, develop a statistical model that enables you to predict annual obesity counts for each UA based on (a subset of) the other variables in the dataset, and also to understand the variation of obesity counts between different UAs and across different years. To be convincing, you will need to consider a range of models and to use an appropriate suite of diagnostics to assess them. Ultimately, however, you are required to recommend a single model that is suitable for interpretation, and to justify your recommendation. Your chosen model should be either a linear model, a generalized linear model or a generalized additive model.
5. Use your chosen model to predict the obesity counts for each UA and year where this information is missing, and also to estimate the standard deviation of your prediction errors.
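As an illustration of steps 4 and 5, the following R sketch fits one possible model to simulated data and then produces predictions with an approximate standard deviation for the prediction errors. It is not a recommended model. The Poisson family, the log-population offset, the covariate `Deprive` and all simulated values are assumptions made purely for illustration. The error standard deviation uses one common approximation, Var(Y) + Var(fitted mean), which treats a new observation as independent of the fitted model.

```r
set.seed(1)

# Simulated stand-in for the real data (every name and value is hypothetical)
n <- 200
sim <- data.frame(
  Pop     = round(runif(n, 500, 3000)),  # hypothetical Year 6 population size
  Deprive = rnorm(n)                     # hypothetical socioeconomic indicator
)
sim$Obese <- rpois(n, lambda = sim$Pop * exp(-1.7 + 0.15 * sim$Deprive))

train   <- sim[1:160, ]
to_pred <- sim[161:200, ]   # pretend the counts in these rows are unknown

# One candidate model: Poisson GLM for counts, with log(Pop) as an offset
# so that the linear predictor describes the obesity rate per child
fit <- glm(Obese ~ Deprive + offset(log(Pop)), family = poisson, data = train)

# Predicted mean counts and their standard errors on the response scale
pred   <- predict(fit, newdata = to_pred, type = "response", se.fit = TRUE)
mu_hat <- pred$fit

# Approximate prediction-error variance: Var(Y) + Var(mu_hat),
# where Var(Y) = mu under the Poisson assumption
sd_err <- sqrt(mu_hat + pred$se.fit^2)

out <- data.frame(predicted = mu_hat, sd_error = sd_err)
print(head(out))
```

If your diagnostics suggest overdispersion, the Poisson variance term in `sd_err` would understate the spread, and the calculation would need to reflect whichever variance function your chosen model assumes.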