
# Implementing classification algorithms: the optimal Bayesian classifier, Decision Trees (DTs), and Dependence Trees

INSTRUCTIONS TO CANDIDATES

1 Bayesian, Decision Tree and Dependence Tree Classifiers

1.1 Introduction

In this assignment, you will be implementing a few classification algorithms including the optimal Bayesian classifier, one for Decision Trees (DTs), and one for Dependence Trees, and using them to classify several different data sets.

1.2 Binary-valued Artificial Data Sets

1.2.1 Data Generation

Use the scheme below to generate the data sets you need:

1. You are dealing with a d-dimensional feature space with c = 4 classes. You can assume that d = 10.

2. Assume that the vector components obey a Dependence Tree structure between the various features. This Dependence Tree must be arbitrarily assigned and unknown to the classification (i.e., training and testing) algorithms.

3. For each of the c classes and for each of the d features, randomly generate the probabilities of the feature taking the value 0 or 1. Thus, for class $j = 1, \ldots, c$ and for feature indices $i = 1, \ldots, d$, you must randomly assign the value $v_{i,j} = \Pr[x_i = 0 \mid \omega = \omega_j]$. These values must be based on the Dependence Tree that you have chosen.

4. Generate 2,000 samples for each class based on the above probabilities.
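The generation scheme above can be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed solution: it assumes a single tree shared by all classes, reads "based on the Dependence Tree" as meaning that each non-root feature gets a probability of being 0 conditioned on its parent's value, and draws those probabilities uniformly from [0.05, 0.95]. The helper names (`random_tree`, `random_class_params`, `sample_class`) are hypothetical.

```python
import numpy as np

def random_tree(d, rng):
    """Random tree over d features: parent[i] is the parent of feature i (-1 for the root)."""
    order = rng.permutation(d)            # order[0] is the root
    parent = np.full(d, -1)
    for k in range(1, d):
        # attach each new feature to a randomly chosen earlier one
        parent[order[k]] = order[rng.integers(0, k)]
    return parent, order

def random_class_params(d, rng):
    """Column 0: P[x_i = 0 | parent = 0]; column 1: P[x_i = 0 | parent = 1].
    The root has no parent and uses column 0 only (an assumption of this sketch)."""
    return rng.uniform(0.05, 0.95, size=(d, 2))

def sample_class(n, parent, order, p0, rng):
    """Draw n binary vectors for one class by sampling features from root to leaves."""
    X = np.zeros((n, len(parent)), dtype=np.int8)
    for i in order:                        # parents always precede children in `order`
        if parent[i] < 0:
            prob_zero = np.full(n, p0[i, 0])
        else:
            prob_zero = p0[i, X[:, parent[i]]]
        X[:, i] = (rng.random(n) >= prob_zero).astype(np.int8)
    return X

rng = np.random.default_rng(0)
d, c, n = 10, 4, 2000
parent, order = random_tree(d, rng)
params = [random_class_params(d, rng) for _ in range(c)]
X = np.vstack([sample_class(n, parent, order, p, rng) for p in params])
y = np.repeat(np.arange(c), n)
```

Keeping `parent` separate from the samples makes it easy to hide the true tree from the training and testing code and to compare it with the estimated tree later.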

1.2.2 Training and Testing

With regard to training and testing, do the following:

1. Use a 5-fold cross-validation scheme for training and testing.

2. Using estimates of the $v_{i,j}$'s, estimate the true but unknown Dependence Tree. Record how good your estimate of the true but unknown Dependence Tree is.

3. Perform a Bayesian classification¹ assuming that all the random variables are independent. Notice that in this case, you must not assume a Gaussian distribution for the features, but a binary-valued one.

4. Perform a Bayesian classification assuming that all the random variables are dependent based on the dependence tree that you have obtained.

5. Perform the classification based on a DT algorithm. For the DT algorithm, have your program output the resulting tree. The output² should be neatly indented for easy viewing.
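For step 2, the assignment does not name an estimation method. One common choice, assumed here, is the Chow-Liu procedure: compute the empirical mutual information between every pair of features and take a maximum-weight spanning tree. The sketch below uses Prim's algorithm and checks the result on a small three-feature chain built only for the demonstration; `mutual_information`, `estimate_tree`, and `edge_overlap` are hypothetical helper names.

```python
import numpy as np

def mutual_information(a, b, eps=1e-9):
    """Empirical mutual information (in nats) between two binary columns."""
    mi = 0.0
    for u in (0, 1):
        for v in (0, 1):
            p_uv = np.mean((a == u) & (b == v)) + eps   # eps avoids log(0)
            p_u = np.mean(a == u) + eps
            p_v = np.mean(b == v) + eps
            mi += p_uv * np.log(p_uv / (p_u * p_v))
    return mi

def estimate_tree(X):
    """Maximum-weight spanning tree over pairwise mutual information (Prim's algorithm).
    Returns a set of undirected edges (i, j) with i < j."""
    d = X.shape[1]
    w = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            w[i, j] = w[j, i] = mutual_information(X[:, i], X[:, j])
    in_tree, edges = [0], set()
    while len(in_tree) < d:
        best = None
        for i in in_tree:
            for j in range(d):
                if j not in in_tree and (best is None or w[i, j] > w[best[0], best[1]]):
                    best = (i, j)
        edges.add((min(best), max(best)))
        in_tree.append(best[1])
    return edges

def edge_overlap(true_edges, estimated_edges):
    """Fraction of true edges recovered: one simple way to score the estimate."""
    return len(true_edges & estimated_edges) / len(true_edges)

# Demonstration on a chain x0 - x1 - x2, where each feature copies its parent 90% of the time.
rng = np.random.default_rng(1)
x0 = rng.integers(0, 2, 5000)
x1 = np.where(rng.random(5000) < 0.9, x0, 1 - x0)
x2 = np.where(rng.random(5000) < 0.9, x1, 1 - x1)
demo_edges = estimate_tree(np.column_stack([x0, x1, x2]))
demo_score = edge_overlap({(0, 1), (1, 2)}, demo_edges)
```

In the assignment setting, the tree would be estimated from the training folds only and then compared with the hidden true tree, for example through the fraction of recovered edges.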

¹ Each data set has more than two classes. In each case, you must do the classification using a pairwise classification on all the classes and assign the testing sample to the most appropriate winning class. This paradigm must be followed for the other classification tasks too.
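The pairwise paradigm described in this footnote, combined with 5-fold cross-validation and the independence assumption of step 3, might look like the sketch below. It assumes equal class priors, Laplace smoothing of the estimated probabilities, and majority voting over all class pairs; none of these choices is specified by the assignment, and the toy data and function names are made up for the illustration.

```python
import numpy as np
from itertools import combinations

def fit_bernoulli(X, y, classes):
    """Per-class estimates of P[x_i = 0], with Laplace smoothing (a choice made here)."""
    return {k: ((X[y == k] == 0).sum(axis=0) + 1) / ((y == k).sum() + 2) for k in classes}

def log_likelihood(X, p0):
    """log P[x | class] when all features are treated as independent binary variables."""
    return np.where(X == 0, np.log(p0), np.log(1 - p0)).sum(axis=1)

def pairwise_predict(X, p0_by_class, classes):
    """Run every two-class contest and assign each sample to the class with most wins."""
    ll = {k: log_likelihood(X, p0_by_class[k]) for k in classes}
    votes = np.zeros((X.shape[0], len(classes)), dtype=int)
    for a, b in combinations(range(len(classes)), 2):
        a_wins = ll[classes[a]] >= ll[classes[b]]
        votes[a_wins, a] += 1
        votes[~a_wins, b] += 1
    return np.asarray(classes)[votes.argmax(axis=1)]

# Toy data (not from the assignment): 3 classes, 20 independent binary features.
rng = np.random.default_rng(0)
classes = [0, 1, 2]
y = np.repeat(classes, 300)
true_p0 = np.array([0.9, 0.5, 0.1])[y]          # P[x_i = 0] for each sample's class
X = (rng.random((len(y), 20)) >= true_p0[:, None]).astype(int)

# 5-fold cross-validation: each fold is the test set exactly once.
folds = np.array_split(rng.permutation(len(y)), 5)
accuracies = []
for f, test_idx in enumerate(folds):
    train_idx = np.concatenate([folds[g] for g in range(5) if g != f])
    model = fit_bernoulli(X[train_idx], y[train_idx], classes)
    pred = pairwise_predict(X[test_idx], model, classes)
    accuracies.append(float(np.mean(pred == y[test_idx])))
mean_accuracy = sum(accuracies) / len(accuracies)
```

When every contest compares the same per-class likelihood scores, the class with the highest score wins all of its contests, so the vote agrees with a plain argmax. The voting loop becomes relevant once the two-class decisions come from separately trained pairwise classifiers.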

² An excellent program to draw decision trees is Graphviz, available at: http://www.graphviz.org/.
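As an illustration of the requested output, the sketch below prints a small decision tree with indentation and also emits Graphviz DOT text for the same tree. The nested-dictionary node layout (`feature`, `zero`, `one`, `label`) is a hypothetical representation chosen for this example, and the tree itself is made up.

```python
def format_tree(node, depth=0):
    """Return the tree as a list of lines, indented two spaces per level."""
    pad = "  " * depth
    if "label" in node:                              # leaf node
        return [f"{pad}-> class {node['label']}"]
    lines = []
    for value, key in ((0, "zero"), (1, "one")):
        lines.append(f"{pad}x{node['feature']} = {value}:")
        lines.extend(format_tree(node[key], depth + 1))
    return lines

def to_dot(node):
    """Return Graphviz DOT source describing the tree."""
    lines = ["digraph DT {"]
    counter = [0]

    def visit(n):
        my_id = counter[0]
        counter[0] += 1
        text = f"class {n['label']}" if "label" in n else f"x{n['feature']}"
        lines.append(f'  n{my_id} [label="{text}"];')
        if "label" not in n:
            for value, key in ((0, "zero"), (1, "one")):
                child_id = visit(n[key])
                lines.append(f'  n{my_id} -> n{child_id} [label="{value}"];')
        return my_id

    visit(node)
    lines.append("}")
    return "\n".join(lines)

# Made-up tree: split on feature 3, then on feature 0.
tree = {
    "feature": 3,
    "zero": {"label": 1},
    "one": {"feature": 0, "zero": {"label": 2}, "one": {"label": 4}},
}
text_lines = format_tree(tree)
dot_source = to_dot(tree)
print("\n".join(text_lines))
```

Saving `dot_source` to a file such as `tree.dot` and running `dot -Tpng tree.dot -o tree.png` renders the picture, assuming Graphviz is installed.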

1.3 Binary-valued Real-life Data Sets

In this section you will deal with one real-life data set.

1.3.1 Data

The Glass Identification data set is to be used to classify the type of glass, given the following features, specified in this order:

1. Class: In this case there are 7 possible types, which can be further split into 2 categories: windowed and non-windowed glass

2. Id: Number

3. RI: Refractive index

4. Na: Sodium (the unit of measurement is weight percent in the oxide, as for attributes 5-11)

5. Mg: Magnesium

6. Al: Aluminum

7. Si: Silicon

8. K: Potassium

9. Ca: Calcium

10. Ba: Barium

11. Fe: Iron

You may ignore all the features that are non-numeric. Whenever you need binary features (i.e., for training and classifying using the Dependence Tree and Decision Tree), render the features to be binary by adopting a thresholding mechanism.
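The thresholding mechanism is left open by the assignment. One simple option, assumed here, is to compare each feature with its median over the training data, so that the thresholds are learned from the training folds and reused unchanged on the test fold. The numbers below are made up for the illustration and are not taken from the Glass data set.

```python
import numpy as np

def fit_thresholds(X_train):
    """Per-feature median of the training data (the median is an illustrative choice)."""
    return np.median(X_train, axis=0)

def binarize(X, thresholds):
    """Map each value to 1 if it exceeds its feature's threshold, else 0."""
    return (X > thresholds).astype(np.int8)

# Made-up values standing in for two numeric features.
X_train = np.array([[1.51, 13.0],
                    [1.52, 14.0],
                    [1.53, 12.0]])
X_test = np.array([[1.50, 13.5]])

thresholds = fit_thresholds(X_train)       # medians: 1.52 and 13.0
B_train = binarize(X_train, thresholds)
B_test = binarize(X_test, thresholds)
```

Fitting the thresholds on the training folds only keeps the test fold from influencing the binarization, which matters when comparing accuracies under cross-validation.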

1.3.2 Techniques to be Implemented

Perform all the tasks given in Section 1.2.2 on this real-life data set.

2 Report

1. Write a 2-3 page report summarizing all your results. The report should be relatively

2. Compare the classification accuracy of the Dependence Trees you have obtained for the artificial and real-life data sets.

3. Compare the classification accuracy of the four algorithms for the artificial data sets. Do some seem to outperform others? Discuss the possible reasons for these results.

4. Compare the classification accuracy of the four algorithms ((a) Bayes, (b) Naive Bayes, (c) using Dependence Trees, and (d) using Decision Trees) for the real-life data sets. Do some seem to outperform others? Again, discuss the possible reasons for these results.
