Weka


INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS

1. (10pt) Choose the area of your preference and create a dataset, for example actresses/actors, food, movies, sports, music bands, or anything you would like. Create a data file in .arff format (please attach the dataset with your submission) containing at least 50 instances, each described by at least 4 attributes, with the last attribute containing your preference (the class attribute), e.g.:

@relation food

@attribute calories numeric
@attribute taste {sweet, sour, bitter, salty}
@attribute course {appetizer, main, dessert, drink}
@attribute vegetarian {yes, no}
@attribute like_it {yes, no}

@data
100, sweet, dessert, yes, yes%icecream
80, bitter, drink, yes, yes%beer
2, sweet, dessert, yes, no%cake
...
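If the dataset is first collected in a script or spreadsheet export, a small helper can write it in the layout shown above. The following is a minimal sketch, not part of Weka: the function names `to_arff` and `meets_requirements` are hypothetical, and the size check assumes that the class attribute counts toward the "at least 4 attributes" requirement.

```python
def to_arff(relation, attributes, rows):
    """Render rows as ARFF text.

    attributes: list of (name, kind), where kind is the string "numeric"
    or a list of allowed nominal values. The last attribute is the class.
    """
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        if kind == "numeric":
            lines.append(f"@attribute {name} numeric")
        else:
            lines.append(f"@attribute {name} {{{', '.join(kind)}}}")
    lines += ["", "@data"]
    for row in rows:
        if len(row) != len(attributes):
            raise ValueError(f"wrong number of values in row: {row}")
        for value, (name, kind) in zip(row, attributes):
            # Nominal values must be one of the declared labels.
            if kind != "numeric" and value not in kind:
                raise ValueError(f"{value!r} is not a declared value of {name}")
        lines.append(", ".join(str(v) for v in row))
    return "\n".join(lines) + "\n"


def meets_requirements(attributes, rows, min_instances=50, min_attributes=4):
    # Assumption: the class attribute is counted among the attributes.
    return len(rows) >= min_instances and len(attributes) >= min_attributes


attributes = [
    ("calories", "numeric"),
    ("taste", ["sweet", "sour", "bitter", "salty"]),
    ("course", ["appetizer", "main", "dessert", "drink"]),
    ("vegetarian", ["yes", "no"]),
    ("like_it", ["yes", "no"]),
]
rows = [
    (100, "sweet", "dessert", "yes", "yes"),
    (80, "bitter", "drink", "yes", "yes"),
    (2, "sweet", "dessert", "yes", "no"),
]
arff_text = to_arff("food", attributes, rows)
```

With only the three example rows, `meets_requirements(attributes, rows)` is false; the real dataset needs at least 50 rows before it is saved with a `.arff` extension.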

In your own words, please describe the dataset. Use data mining to explore and create models to explain the dataset.

Create and compare at least 3 algorithms on your dataset (e.g., decision trees, a classification or association rule learner, naive Bayes). For each algorithm, evaluate the model and discuss your findings: What was the performance? Is the model relevant? Which algorithm best explains your personal liking? Do the generated rules tell you anything interesting? If the model is not good, discuss why and describe some techniques you might use to improve it.
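When comparing performance, it can help to look beyond a single accuracy figure. The sketch below computes a confusion matrix, accuracy, precision, and recall from two lists of labels; the `actual` and `predicted` values are made up for illustration and do not come from any real model.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    # Maps (actual label, predicted label) -> number of instances.
    return Counter(zip(actual, predicted))


def accuracy(actual, predicted):
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def precision_recall(actual, predicted, positive):
    cm = confusion_matrix(actual, predicted)
    tp = cm[(positive, positive)]
    fp = sum(n for (a, p), n in cm.items() if p == positive and a != positive)
    fn = sum(n for (a, p), n in cm.items() if a == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels for the like_it class attribute.
actual = ["yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "yes", "no"]

acc = accuracy(actual, predicted)
precision, recall = precision_recall(actual, predicted, positive="yes")
```

A model with high accuracy but poor recall on the class you care about may still be a poor explanation of your preferences, which is worth noting in the discussion.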

2. On the labor_neg_nominal.arff file, use the learners 1) IBk, 2) J48 with -M 2 (minimum number of objects per leaf = 2), and 3) J48 with -M 3. For each learner, create two models: one evaluated on the training set and one evaluated with 10-fold stratified cross-validation. Analyze the scores of each model created. There should be six models in total. Answer the following:

A) What does the training set evaluation score tell you?

B) What does the cross-validation score evaluate?

C) What did you learn from the models about the data?

D) Which one of these models would you say is the best? Why?
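The difference between the two evaluation modes can be illustrated outside Weka. The sketch below uses a 1-nearest-neighbour classifier with a simple mismatch-count distance as a simplified stand-in for IBk (it is not Weka's implementation) and evaluates it both on its own training data and with stratified 10-fold cross-validation. The dataset is generated in the script and is purely hypothetical; it is not labor_neg_nominal.arff.

```python
from collections import Counter, defaultdict
from itertools import product


def mismatches(a, b):
    # Distance between two nominal feature tuples: number of differing values.
    return sum(x != y for x, y in zip(a, b))


def predict_1nn(train, features):
    # Label of the closest training instance (first one wins on ties).
    return min(train, key=lambda row: mismatches(row[0], features))[1]


def accuracy(train, test):
    hits = sum(predict_1nn(train, feats) == label for feats, label in test)
    return hits / len(test)


def stratified_folds(data, k):
    # Deal instances out class by class so each fold keeps the class mix.
    by_class = defaultdict(list)
    for row in data:
        by_class[row[1]].append(row)
    folds = [[] for _ in range(k)]
    i = 0
    for label in sorted(by_class):
        for row in by_class[label]:
            folds[i % k].append(row)
            i += 1
    return folds


def cross_val_accuracy(data, k):
    folds = stratified_folds(data, k)
    hits = 0
    for i, test in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        hits += sum(predict_1nn(train, f) == label for f, label in test)
    return hits / len(data)


# Hypothetical data: 32 unique feature combinations with some label noise.
tastes = ["sweet", "sour", "bitter", "salty"]
courses = ["appetizer", "main", "dessert", "drink"]
data = []
for i, feats in enumerate(product(tastes, courses, ["yes", "no"])):
    label = "yes" if feats[0] == "sweet" or feats[1] == "dessert" else "no"
    if i % 7 == 0:  # deterministic noise: flip every seventh label
        label = "no" if label == "yes" else "yes"
    data.append((feats, label))

train_acc = accuracy(data, data)        # model tested on the data it memorised
cv_acc = cross_val_accuracy(data, 10)   # every instance tested while held out
```

Because every instance here has a unique feature tuple, the 1-nearest-neighbour model finds each training instance as its own nearest neighbour, so `train_acc` is 1.0 regardless of the label noise. The cross-validated figure is the one that estimates performance on unseen instances.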

3. Use the following learning schemes to analyze the Titanic data (in titanic.arff).

C4.5 - weka.classifiers.j48.J48

Association rules - weka.associations.Apriori

Decision list - weka.classifiers.PART

 

A) What is the most important descriptor (attribute) in titanic.arff, and how can you tell?

B) How well were these methods able to learn the patterns in the dataset? Quantify your answer.

C) Compare the training set and 10-fold cross-validation scores of the methods.

D) Would you trust these models? Did they really learn what was important to survive the Titanic disaster?

E) Which one would you trust more, even if just very slightly? Why?
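For question A, one common indicator is which attribute a decision tree places at its root. C4.5 selects splits using gain ratio, which builds on information gain. The sketch below computes plain information gain only (not gain ratio, and not Weka's code) on a handful of made-up rows that reuse the attribute names from the food example; it does not use titanic.arff.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attr_index, class_index=-1):
    # Entropy of the class minus the weighted entropy after splitting.
    labels = [row[class_index] for row in rows]
    groups = defaultdict(list)
    for row in rows:
        groups[row[attr_index]].append(row[class_index])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical rows: (taste, vegetarian, like_it)
rows = [
    ("sweet", "yes", "yes"),
    ("sweet", "no", "yes"),
    ("sweet", "yes", "yes"),
    ("bitter", "yes", "no"),
    ("bitter", "no", "no"),
    ("sour", "yes", "no"),
    ("salty", "no", "yes"),
    ("salty", "yes", "no"),
]

gain_taste = information_gain(rows, 0)
gain_vegetarian = information_gain(rows, 1)
```

On these toy rows `taste` has the larger gain, so a tree learner would be expected to split on it first. The same reasoning applies when reading the root of the J48 tree built on the Titanic data.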

4. Choose one of the following files: soybean.arff, autoprice.arff, hungarian, zoo.arff, or zoo2_x.arff, and use any two learning schemes of your choice to build and compare models. Evaluate and discuss the models. What was learned by the models (be specific to the dataset)? Which one of the models would you keep? Why?
