Python Programming

INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS

Overview:

 

The purpose of this assignment is for you to gain experience with the KNN and PCA algorithms, as well as to assess your ability to construct a well-organized and well-documented notebook. In this assignment you will not receive a notebook template.

 

Upon submission, be sure that your notebook is well documented. Good documentation is required for code development and collaboration in industry, and is needed for grading your work. You will be heavily graded on the clarity of your notebook. How easy is it to understand the meaning of your computations? Do the names of your variables clearly communicate their meaning? Is your notebook documented completely? Document and comment your code so that someone unfamiliar with the assignment could follow what you are doing at every step.

 

At the beginning of your notebook, you must include an overview of your work. You should offer a detailed explanation of all steps taken in the notebook, and you should explain your process thoroughly. Give special attention to any deviations from the outline below; there may be steps you need to add!

 

You will be working with the mushroom data set. You should download the necessary files at the supplied link. I have also posted the materials to Brightspace (zip file called mushrooms), in case you have difficulties downloading the data for yourself. Begin by inspecting all of the materials and understanding the data set (note that you will not use all of the supplied files).

 

Directions:

 

The following outlines, roughly, what you aim to accomplish in this assignment:

 

1. Import the Mushroom data set.

 

2. Use the KNN algorithm to impute missing values in the dataset. Note: You are not permitted to use the KNNImputer class from Scikit-learn. Instead, you must explicitly write the code to perform the imputation using the KNeighborsClassifier algorithm (use the default settings).

 

The first step is to think through what will be your feature data and what will be your response data for this imputation step. You will want to one-hot-encode your feature data and label encode your response data. Next, you should train your KNN model on those instances that don’t have missing values, then have the model make predictions for those instances that have missing values. This is how you will have the KNN model impute missing data.
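The train-then-predict split described above can be sketched roughly as follows. This is only an illustration on a made-up toy frame, not the mushroom data: the function name predict_missing, the column names, and the assumption that missing entries are marked with "?" are all hypothetical and should be adapted to what you find in the actual files.

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelEncoder


def predict_missing(df, target_col, missing_marker="?"):
    """Predict the missing entries of target_col; returns a Series keyed by row index."""
    # Assumption: missing entries are stored as the string missing_marker.
    is_missing = df[target_col] == missing_marker

    # Feature data: every other column, one-hot encoded on the whole frame so the
    # training rows and the rows to fill share the same dummy columns.
    features = pd.get_dummies(df.drop(columns=[target_col]), dtype=float)
    known_rows = features[~is_missing]
    rows_to_fill = features[is_missing]

    # Response data: the known values of the target column, label encoded.
    encoder = LabelEncoder()
    known_labels = encoder.fit_transform(df.loc[~is_missing, target_col])

    model = KNeighborsClassifier()  # default settings
    model.fit(known_rows, known_labels)

    # Decode the predictions back to the original letters.
    letters = encoder.inverse_transform(model.predict(rows_to_fill))
    return pd.Series(letters, index=rows_to_fill.index, name=target_col)


# Toy data (hypothetical columns): the last two rows are missing "root".
toy = pd.DataFrame({
    "cap":  ["a"] * 6 + ["c"] * 6 + ["a", "c"],
    "odor": ["x"] * 6 + ["y"] * 6 + ["x", "y"],
    "root": ["b"] * 6 + ["e"] * 6 + ["?", "?"],
})
imputed = predict_missing(toy, "root")
print(imputed)
```

One-hot encoding the full frame before splitting it is a design choice in this sketch: it guarantees that the rows used for training and the rows used for prediction have identical columns.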

 

When you have computed the missing values, create a data structure (i.e. a list) called missing_values that contains all of the imputed values (in terms of the original categorical/letter data and not the encoded numeric data) in order of increasing index from the original data set. You must then print the first 10 instances of missing_values to the screen so we can check your work. Finally, you will impute the missing values back into the original data set before continuing, so that the next step starts fresh with a complete data set in terms of the raw data values.
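As a rough sketch of this bookkeeping step, the snippet below assumes you already hold the decoded predictions in a pandas Series keyed by the original row index. The helper name fill_in_imputed, the toy column, and the example predictions are hypothetical; only the name missing_values comes from the assignment.

```python
import pandas as pd


def fill_in_imputed(df, column, imputed):
    """Write decoded predictions back into df[column] and list them by row index."""
    # Sort by the original row index so the list is in increasing index order.
    ordered = imputed.sort_index()
    missing_values = ordered.tolist()

    # Work on a copy so the raw frame stays available for comparison.
    completed = df.copy()
    completed.loc[ordered.index, column] = ordered.to_numpy()
    return completed, missing_values


# Toy frame with "?" marking the missing entries (an assumption about the encoding).
raw = pd.DataFrame({"root": ["b", "?", "e", "?", "c"]})
# Hypothetical decoded KNN output, deliberately out of index order.
predictions = pd.Series(["e", "b"], index=[3, 1])

completed, missing_values = fill_in_imputed(raw, "root", predictions)
print(missing_values[:10])  # first 10 imputed values, as original letters
```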

 

Graded Concept Question #1 (include a section in your notebook): Would it still be possible to train the KNN model if you one-hot encoded the response data instead? Why or why not?

 

3. Train a RandomForestClassifier as well as a LogisticRegression model to predict whether a mushroom is edible or poisonous given this data set of nominally-valued characteristics. Train both models on the feature data supplied to you after you’ve one-hot encoded it. You should label encode the response data.
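A minimal sketch of this step, on a made-up toy frame rather than the real data, might look like the following. The column names, the "e"/"p" class letters, and the fixed random_state are assumptions made for illustration. The accuracies shown are training accuracies only, so a real notebook would also need some form of held-out evaluation.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder

# Toy stand-in for the completed data set (hypothetical columns and values).
toy = pd.DataFrame({
    "class": ["p"] * 10 + ["e"] * 10,
    "odor":  ["f"] * 10 + ["n"] * 10,
    "cap":   ["x"] * 10 + ["y"] * 10,
})

# Feature data: one-hot encode every nominal column except the response.
X = pd.get_dummies(toy.drop(columns=["class"]), dtype=float)

# Response data: label encode the edible/poisonous letters.
label_encoder = LabelEncoder()
y = label_encoder.fit_transform(toy["class"])

# random_state is fixed only to make this sketch reproducible.
forest = RandomForestClassifier(random_state=0).fit(X, y)
logit = LogisticRegression().fit(X, y)

forest_accuracy = forest.score(X, y)
logit_accuracy = logit.score(X, y)
print(forest_accuracy, logit_accuracy)
```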

 
