Ask This Question To Be Solved By Our ExpertsGet A+ Grade Solution Guaranteed

Drop Files Here Or Click to Upload

Or Get Complete Course Help

Sachin RanaComputer science

(5/5)

730 Answers

Hire Me

Chris EldredgriggMathematics

(5/5)

883 Answers

Hire Me

Steven HughesEducation

(5/5)

878 Answers

Hire Me

Joan DomettEngineering

(4/5)

919 Answers

Hire Me

Mathematics

(5/5)

Which feature is your target and what type of feature is it? If this is a categorical/nominal feature be sure it’s coded as a Factor in R.

INSTRUCTIONS TO CANDIDATES

ANSWER ALL QUESTIONS

Classification using kNN (k nearest-neighbors)

We will be looking at the “Breast Cancer Wisconsin Diagnostic” data set, which is donated by researchers of the University of Wisconsin and includes measurements from digitized images of fine-needle aspirate of a breast mass. The data includes 30 numeric features which measure various characteristics related to shape and size of the mass, and a feature identifying the cancer diagnosis. The diagnosis feature is coded B for benign mass and M for malignant mass. Each example represents an independent cancer biopsy. In effect, we have 569 labeled examples. We would like to employ a classification algorithm that can learn from the data that we have and predict the diagnosis of new unlabeled examples, i.e., classify undiagnosed breast masses as either benign or malignant. Use R and present all your answers/explanations/visualizations in a Word document, using font ‘Courier New’.

The data file was recorded in comma-separated-variable form in file “wisc_bc_data”.

You will need to watch the tutorials to complete some of these task.

1. Which feature is your target and what type of feature is it? If this is a categorical/nominal feature be sure it’s coded as a Factor in R. What do we call the process of predicting a target variable of this type?

2. How many examples do we have in this data set? In your own words what does each example represent here?

3. Construct a table of frequencies and a separate table of relative frequencies for the target feature. We should have a good mix of examples at each level of our target for our algorithm to work well. Using the summary function quickly peruse the 30 other features.

4. The kNN algorithm is ‘scale-variant’, which means that features with larger values will have undue influence on the outcome. I have already ‘Normalized’ the numerical features for you in this data set, you should have noticed that when you looked at the summaries of the 30 numerical features in the previous question. (nothing to do on this question other than to confirm that the range of each potential predictor in the data set is 0 to 1)

5. Let’s split the data into a Training set and a Test set. First, we should mix up the examples to protect against any type of nonrandom ordering in the rows of the data. I have done the randomizing the rows step for you already. Knowing that they are well mixed, take the first 469 rows for our Training set and the last 100 for our Test set.

Tip: This will require you to subset the raw data set into 4 parts for eventual use in the kNN function, carefully selecting particular rows and columns by using square brackets. You must watch the video tutorials where I do this and explain.

6. With the Training and Test sets successfully split in the previous step, you can now use the kNN algorithm to predict the classifications, i.e., the diagnoses of the Test examples. You need a few items here:

1. Choose an integer for k, close to the square root of the number of examples in the Training data set.

2. Load the “class” package into your current R environment using code: library(class)

3. Define model with the knn function (template provided here): m1 <- knn(train= , test= , cl= , k=

7. Create a Confusion Matrix with the predicted diagnoses of the Test examples on the horizontal axis and their actual diagnoses on the vertical axis. Comment on the performance of our kNN algorithm. What was the Accuracy Rate expressed as a proportion? What was the Error Rate expressed as a proportion? Of the incorrectly classified examples which ones do you think are most dangerous in this context?

(5/5)

Hurry, Grab up to 30% discount on the entire course

Ask This Question To Be Solved By Our ExpertsGet A+ Grade Solution Guaranteed

Our Experts

Sachin RanaComputer science

Chris EldredgriggMathematics

Steven HughesEducation

Joan DomettEngineering

Mathematics

Which feature is your target and what type of feature is it? If this is a categorical/nominal feature be sure it’s coded as a Factor in R.

ANSWER ALL QUESTIONS

Attachments:

Instructions Files

Related Questions

. The fundamental operations of create, read, update, and delete (CRUD) in either Python or Java

. Develop a program to emulate a purchase transaction at a retail store. This program will have two classes, a LineItem class and a Transaction class

. The following program contains five errors. Identify the errors and fix them

. Accepts the following from a user: Item Name Item Quantity Item Price Allows the user to create a file to store the sales receipt contents

. The final project will encompass developing a web service using a software stack and implementing an industry-standard interface. Regardless of whether you choose to pursue application development goals as a pure developer or as a software engineer

Other Services

Ask This Question To Be Solved By Our ExpertsGet A+ Grade Solution Guaranteed

Our Experts

Sachin RanaComputer science

Chris EldredgriggMathematics

Steven HughesEducation

Joan DomettEngineering

Mathematics

Which feature is your target and what type of feature is it? If this is a categorical/nominal feature be sure it’s coded as a Factor in R.

ANSWER ALL QUESTIONS

Attachments:

Instructions Files

Related Questions

. The fundamental operations of create, read, update, and delete (CRUD) in either Python or Java

. Develop a program to emulate a purchase transaction at a retail store. This program will have two classes, a LineItem class and a Transaction class

. The following program contains five errors. Identify the errors and fix them

. Accepts the following from a user: Item Name Item Quantity Item Price Allows the user to create a file to store the sales receipt contents

. The final project will encompass developing a web service using a software stack and implementing an industry-standard interface. Regardless of whether you choose to pursue application development goals as a pure developer or as a software engineer