logo Hurry, Grab up to 30% discount on the entire course
Order Now logo

Ask This Question To Be Solved By Our ExpertsGet A+ Grade Solution Guaranteed

expert
Sikhar AggarwalMathematics
(5/5)

767 Answers

Hire Me
expert
Mayank ThakurComputer science
(5/5)

770 Answers

Hire Me
expert
Bidit SadhukhanStatistics
(/5)

737 Answers

Hire Me
expert
Vishal FogatComputer science
(5/5)

829 Answers

Hire Me
Python Programming
(5/5)

The purpose of this assignment is for you to gain experience with the KNN and PCA algorithms,

INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS

Overview:

 

The purpose of this assignment is for you to gain experience with the KNN and PCA algorithms, as well as to assess your ability to construct a well-organized and well-documented notebook. In this assignment you will not receive a notebook template.

 

Upon submission, be sure that your notebook is well documented - this is required for code development and collaboration in industry, and is needed for grading your work. You will be heavily graded on the clarity of your notebook. How easy is it to understand the meaning of your computations? Do the names of your variables clearly communicate their meaning? Is your notebook documented completely? Document/Comment your code so that someone who didn’t know about your assignment would be able to follow what you were doing every step of the way.

 

At the beginning of your notebook, you must include an overview of your work. You should offer a detailed explanation of all steps taken in the notebook, and you should explain your process thoroughly. Give special attention to any deviations from the below outline - there may be steps you need to add!

 

You will be working with the mushroom data set. You should download the necessary files at the supplied link - I have also posted the materials to Brightspace (zip file called mushrooms), in case you have difficulties downloading the data for yourself. Begin by

 

inspecting all of the materials and understanding the data set (note that you will not use all of the supplied files).

 

Directions:

 

The following outlines, roughly, what you aim to accomplish in this assignment:

 

1. Import the Mushroom data set.

 

2. Use the KNN algorithm to impute missing values in the dataset. Note: You are not permitted to use the KNNImputer class from Scikit-learn. Instead, you must explicitly write the code to perform the imputation using the KNeighborsClassifier algorithm (use the default settings).

 

The first step is to think through what will be your feature data and what will be your response data for this imputation step. You will want to one-hot-encode your feature data and label encode your response data. Next, you should train your KNN model on those instances that don’t have missing values, then have the model make predictions for those instances that have missing values. This is how you will have the KNN model impute missing data.

 

When you have computed the missing values, create a data structure (i.e. a list) called missing_values that contains all of the imputed values (in terms of the original categorical/letter data and not the encoded numeric data) in order of increasing index from the original data set. You must then print the first 10 instances of missing_values to the screen so we can check your work. Finally, you will impute the missing values back into the original data set before continuing, so that the next step starts fresh with a complete data set in terms of the raw data values.

 

Graded Concept Question #1 (include a section in your notebook): Would it still be possible to train the KNN model if you one-hot encoded the response data instead? Why or why not?

 

3. Train a RandomForestClassifier as well as a LogisticRegression model to predict whether a mushroom is edible or poisonous given this data set of nominally-valued characteristics. Train the model on the feature data supplied to you after you’ve one-hot encoded it. You should label encode the response data.

 

(5/5)
Attachments:

Related Questions

. The fundamental operations of create, read, update, and delete (CRUD) in either Python or Java

CS 340 Milestone One Guidelines and Rubric  Overview: For this assignment, you will implement the fundamental operations of create, read, update,

. Develop a program to emulate a purchase transaction at a retail store. This  program will have two classes, a LineItem class and a Transaction class

Retail Transaction Programming Project  Project Requirements:  Develop a program to emulate a purchase transaction at a retail store. This

. The following program contains five errors. Identify the errors and fix them

7COM1028   Secure Systems Programming   Referral Coursework: Secure

. Accepts the following from a user: Item Name Item Quantity Item Price Allows the user to create a file to store the sales receipt contents

Create a GUI program that:Accepts the following from a user:Item NameItem QuantityItem PriceAllows the user to create a file to store the sales receip

. The final project will encompass developing a web service using a software stack and implementing an industry-standard interface. Regardless of whether you choose to pursue application development goals as a pure developer or as a software engineer

CS 340 Final Project Guidelines and Rubric  Overview The final project will encompass developing a web service using a software stack and impleme

January
January
February
March
April
May
June
July
August
September
October
November
December
2025
1950
1951
1952
1953
1954
1955
1956
1957
1958
1959
1960
1961
1962
1963
1964
1965
1966
1967
1968
1969
1970
1971
1972
1973
1974
1975
1976
1977
1978
1979
1980
1981
1982
1983
1984
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
2026
2027
2028
2029
2030
2031
2032
2033
2034
2035
2036
2037
2038
2039
2040
2041
2042
2043
2044
2045
2046
2047
2048
2049
2050
SunMonTueWedThuFriSat
29
30
31
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
1
00:00
00:30
01:00
01:30
02:00
02:30
03:00
03:30
04:00
04:30
05:00
05:30
06:00
06:30
07:00
07:30
08:00
08:30
09:00
09:30
10:00
10:30
11:00
11:30
12:00
12:30
13:00
13:30
14:00
14:30
15:00
15:30
16:00
16:30
17:00
17:30
18:00
18:30
19:00
19:30
20:00
20:30
21:00
21:30
22:00
22:30
23:00
23:30