Summary
For this assignment, you will work with several types of data sets. Your goal is to understand and characterize one or more of the data sets, pose data mining questions against the four data groups, and propose additional questions you could ask if you had more data.
The Data
Attached to this assignment are four groups of data sets as listed below.
Data Group 1: Ice Cream Data
Data Group 2: 100 Years of Beach Erosions in New Zealand
Data Group 3: Nutrition (There are 2 datasets. McDonald’s Link and Nut. values for common foods & products)
Data Group 4: Where to Attend College
If you follow the given links above, you can find out more about each data group.
You may need to do some research to better understand each data set. Quite often when you want to data mine a data set, it is from a domain you are not familiar with. Thus, you need to do some research to understand the attributes within each data group.
Part 1: Ice Cream Data group (20 points)
A) Provide descriptive statistics for each attribute as they apply. This may include minimum, maximum, average, etc.
B) For each attribute, explain whether it is descriptive, discrete, continuous, or discontinuous.
C) Is this a supervised or unsupervised data set? If supervised, what is the class variable(s)?
D) Is this data set time-series, temporal, spatial data, sequence, some combination, or none of these?
E) On what measurement scale is the dependent variable measured (if one exists, it is usually the right-most column)?
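As a starting point for item A, the following is a minimal sketch of per-attribute descriptive statistics using only the Python standard library. The sample rows and the column names (flavor, temperature_f, sales) are hypothetical placeholders, not the actual Ice Cream data; substitute the real file and attributes. Numeric columns get minimum, maximum, mean, and median, while non-numeric columns get a distinct-value count and the mode.

```python
import csv
import io
import statistics

# Hypothetical sample; the real Ice Cream file and its column names may differ.
SAMPLE_CSV = """flavor,temperature_f,sales
vanilla,70,120
chocolate,85,200
vanilla,90,250
strawberry,60,80
"""


def describe(rows):
    """Summarize each attribute.

    Columns whose values all parse as numbers get min/max/mean/median;
    all other columns get a distinct-value count and the mode.
    """
    summary = {}
    for name in rows[0].keys():
        values = [row[name] for row in rows]
        try:
            numbers = [float(v) for v in values]
        except ValueError:
            # At least one value is not numeric: treat the column as categorical.
            summary[name] = {
                "type": "categorical",
                "distinct": len(set(values)),
                "mode": statistics.mode(values),
            }
        else:
            summary[name] = {
                "type": "numeric",
                "min": min(numbers),
                "max": max(numbers),
                "mean": statistics.mean(numbers),
                "median": statistics.median(numbers),
            }
    return summary


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = describe(rows)
for name, stats in summary.items():
    print(name, stats)
```

To run this against a real file, replace io.StringIO(SAMPLE_CSV) with an open file handle. Note that the sketch does not handle missing values; empty cells would cause a numeric column to be treated as categorical.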
Part 2: Formulating questions for the 4 data groups (80 Points Total: 10 points per question per data group)
F) Formulate three specific questions against each of the data sets. Try to vary the question types (e.g. classification, regression, clustering). If this is not possible, explain why. For each question, describe any preprocessing that needs to be done to get the data into a format that lets you ask the question. Make sure to ask data mining questions, not database-type questions.
G) If you could collect additional data (other data files), describe what that data would look like. What additional questions would you be able to ask? I am looking for you to be creative and not just repeat what I have done on the next page.
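To illustrate the kind of preprocessing item F asks you to describe, here is a small sketch of two common steps: min-max scaling of a numeric attribute (often useful before distance-based clustering) and one-hot encoding of a categorical attribute. The values and category names are invented for illustration and do not come from any of the four data groups; which steps are appropriate depends on the question you pose.

```python
def min_max_scale(values):
    """Rescale numeric values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode categorical values as 0/1 indicator vectors.

    Returns the sorted category list and one vector per input value.
    """
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded


# Hypothetical attribute values, for illustration only.
calories = [250, 540, 300]
item_type = ["burger", "salad", "burger"]

scaled = min_max_scale(calories)
categories, encoded = one_hot(item_type)

print(scaled)
print(categories, encoded)
```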
If you work as a group of one, do everything above.
If you work as a group of two, do everything above, plus apply the tasks in Part 1 to the Where to Attend College data group.
Grading criteria:
1) How well did you describe the data group from Part 1?
2) Do you show a good understanding of the data set?
3) Are your questions reasonable and insightful?
4) How much effort did you put into the assignment?
5) How creatively did you extend a given data group and propose additional analysis?