Weka

INSTRUCTIONS TO CANDIDATES

ANSWER ALL QUESTIONS

The goal of this assignment is to gain practical experience in using Weka and applying it to storytelling with data. You can work on this assignment in teams of up to three students. The standard academic honesty rules apply.

First Deliverable – Dataset:

You will be selecting and downloading a dataset of your choice for the purpose of this assignment. Your selected dataset should satisfy the following criteria:

• Contain at least 5 dimensions/features, including at least one categorical and one numerical dimension.

• Contain a clear class label attribute (binary or multi-class).

• Have a simple tabular structure (i.e., no time series, multimedia, etc.).

• Be of reasonable size, with at least 2K tuples.

While the assignment is open-ended, you are expected to select an interesting dataset that tells an interesting story. Many such datasets are available in public repositories such as UCI, Kaggle, and KDnuggets. Some suggested datasets to select from are attached.

Second Deliverable – Video Presentation:

Data Exploration Tasks:

• The name and source of the dataset.

• A description of how the dataset was collected or created.

• A summary of the purpose of each column in your dataset, including the class label.

• An overview of the dataset's .arff file, and how you created it (if needed).

• Explain any data quality problems you faced, and how they were handled.

• Provide a visual overview of all the attributes in your dataset.

• Discuss the most distinctive categorical attribute, i.e., the one most strongly correlated with the class label. Support your discussion with a visualization of that attribute.

• Discuss the most distinctive numerical attribute, i.e., the one most strongly correlated with the class label. Support your discussion with a discretized visualization of that attribute.

• Identify and discuss one attribute that clearly has no impact on the class label. Support your discussion with a visualization of that attribute.
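If your dataset comes as a CSV file, the .arff file mentioned above can be produced by hand or with a small script. The following is a minimal sketch only, not part of the assignment: the function name `csv_to_arff` and the toy data are hypothetical, and it assumes attribute names and values contain no spaces or commas (ARFF would require quoting them otherwise). It infers `numeric` versus nominal attributes from the column contents and writes missing values as `?`.

```python
import csv
import io

def csv_to_arff(csv_text, relation):
    """Convert CSV text (header row first) into ARFF text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]

    def is_number(value):
        try:
            float(value)
            return True
        except ValueError:
            return False

    lines = [f"@relation {relation}", ""]
    for col, name in enumerate(header):
        # Ignore empty cells when deciding the attribute type.
        values = [row[col] for row in data if row[col] != ""]
        if values and all(is_number(v) for v in values):
            lines.append(f"@attribute {name} numeric")
        else:
            # Nominal attribute: list the distinct values in sorted order.
            nominal = ",".join(sorted(set(values)))
            lines.append(f"@attribute {name} {{{nominal}}}")
    lines += ["", "@data"]
    for row in data:
        # ARFF marks a missing value with '?'.
        lines.append(",".join(v if v != "" else "?" for v in row))
    return "\n".join(lines) + "\n"

# Hypothetical toy data: one numeric column, one categorical, one class label.
sample = "age,color,label\n25,red,yes\n,blue,no\n40,red,yes\n"
arff_text = csv_to_arff(sample, "toy")
print(arff_text)
```

Weka's Explorer can also open CSV files directly and save them as ARFF, so a script like this is mainly useful when you want explicit control over attribute types.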

Data Analytics Tasks:

For the following tasks, always use the K-nearest neighbor classification algorithm.

Task 1. Using the default settings of K-nearest neighbor, report on the performance of your classifier (e.g., accuracy, precision, recall, etc.).
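Weka reports these measures for you, but it helps to know how they are derived. The sketch below is illustrative only: the function name `binary_metrics` and the toy label lists are hypothetical. It computes accuracy, precision, and recall for one chosen positive class from lists of actual and predicted labels.

```python
def binary_metrics(actual, predicted, positive):
    """Return (accuracy, precision, recall) with respect to one positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)  # true positives
    fp = sum(a != positive and p == positive for a, p in pairs)  # false positives
    fn = sum(a == positive and p != positive for a, p in pairs)  # false negatives
    correct = sum(a == p for a, p in pairs)

    accuracy = correct / len(pairs)
    # Guard against division by zero when the class is never predicted or never occurs.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical toy predictions.
actual = ["yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "yes", "no"]
accuracy, precision, recall = binary_metrics(actual, predicted, positive="yes")
print(accuracy, precision, recall)
```

For a multi-class label, the same computation can be repeated once per class, treating that class as positive and all others as negative.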

Task 2. Now try different values of K, and report on the obtained performance for those different values.
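To see why K matters, the following pure-Python sketch implements a bare-bones K-nearest neighbor classifier with Euclidean distance and majority voting. It is not Weka's implementation; the names `knn_predict` and `accuracy_for_k` and the tiny dataset are hypothetical. The training set deliberately contains one noisy point, which misleads K = 1 but is outvoted for larger K.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Predict the majority label among the k training points nearest to query."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    labels = [label for _, label in by_distance[:k]]
    return Counter(labels).most_common(1)[0][0]

# Hypothetical 2-D training data with two clusters.
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((3.0, 3.0), "b"), ((3.2, 2.9), "b"), ((2.8, 3.1), "b"),
    ((2.9, 3.0), "a"),  # a deliberately noisy point inside the "b" cluster
]
test = [((1.1, 1.0), "a"), ((3.0, 3.05), "b"), ((2.92, 3.0), "b")]

def accuracy_for_k(k):
    """Fraction of test points classified correctly with k neighbors."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)

results = {k: accuracy_for_k(k) for k in (1, 3, 5)}
for k, acc in results.items():
    print(f"K={k}: accuracy={acc:.2f}")
```

Three test points are far too few to draw conclusions from; on your real dataset, compare the values of K using the evaluation output that Weka produces.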

Task 3. In your opinion, what is the most suitable setting of K for your dataset?

Task 4. Given your answer to the previous task, try different settings for the split ratio and report on the obtained performance.

Task 5. Compare the performance of Task 4 to that of k-fold cross-validation data partitioning.
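The difference between the two partitioning schemes can be sketched as follows. This is an illustration under stated assumptions, not Weka's code: the function names `percentage_split` and `k_fold_splits` are hypothetical, and the folds here are not stratified by class, whereas Weka's cross-validation stratifies when the class is nominal. A percentage split evaluates on a single held-out portion, while k-fold cross-validation uses every instance for testing exactly once.

```python
import random

def percentage_split(n, train_ratio, seed=1):
    """Shuffle n instance indices and cut them once into train and test parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = round(n * train_ratio)
    return indices[:cut], indices[cut:]

def k_fold_splits(n, folds, seed=1):
    """Yield (train, test) index lists; each index is tested in exactly one fold."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    for fold in range(folds):
        test = indices[fold::folds]  # every folds-th index, starting at offset fold
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test

# 20 hypothetical instances: a 66% split versus 5-fold cross-validation.
train_idx, test_idx = percentage_split(20, 0.66)
fold_list = list(k_fold_splits(20, 5))
print(len(train_idx), len(test_idx))
print([len(test) for _, test in fold_list])
```

Because a single split depends on which instances happen to land in the test portion, its estimate usually varies more from run to run than an average over folds.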

For the following tasks, always use the decision tree classification algorithm.

Task 6. Using the default settings of the decision tree, report on the performance of your classifier (e.g., accuracy, precision, recall, etc.).

Task 7. Inspect the visualization of the obtained decision tree, and discuss: 1) the most distinctive features of your dataset, and 2) any interesting observations learned from the tree structure.
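Attributes near the root of a decision tree are the ones that best separate the classes. One common way to measure this is information gain, sketched below. Note the assumptions: Weka's J48 is based on C4.5, which uses the related gain ratio criterion rather than plain information gain, and the function names and toy data here are hypothetical.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(values, labels):
    """Reduction in class entropy after splitting on a categorical attribute."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    # Weighted average entropy of the groups produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy data: 'outlook' separates the classes perfectly, 'windy' barely helps.
labels = ["yes", "yes", "no", "no", "yes", "no"]
outlook = ["sun", "sun", "rain", "rain", "sun", "rain"]
windy = ["t", "f", "t", "f", "t", "f"]

gain_outlook = information_gain(outlook, labels)
gain_windy = information_gain(windy, labels)
print(f"outlook: {gain_outlook:.3f}, windy: {gain_windy:.3f}")
```

An attribute with a gain close to zero, like `windy` here, is the kind of attribute a tree tends to place far from the root or leave out entirely.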

Task 8. Compare the observations obtained from Task 6 to your findings in the Data Exploration tasks of the assignment. That is, discuss how the features you identified in the exploration phase are similar to or different from the ones from Task 6.

Task 9. Adjust the decision tree parameters to allow overfitting, and compare to the results obtained in the previous task in terms of: 1) tree structure, and 2) classifier performance.

Task 10. In your opinion, what would be the best settings for the decision tree classifier for your dataset?
