logo Hurry, Grab up to 30% discount on the entire course
Order Now logo

Ask This Question To Be Solved By Our ExpertsGet A+ Grade Solution Guaranteed

expert
Martin DoeStatistics
(3/5)

943 Answers

Hire Me
expert
Barbara ElseMathematics
(5/5)

811 Answers

Hire Me
expert
Milan JoshiSocial sciences
(5/5)

987 Answers

Hire Me
expert
Kevin BatesTechnical writing
(5/5)

698 Answers

Hire Me
R Programming
(5/5)

Given the definition of noise, derive the corresponding mean and variance parameters of the normal distribution for y|x1,x2; 0.Write also down its probability density function.

INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS

Problem 1:  Linear Regression and Gradient Learning [30 points]

In class, we derived linear regression and various learning algorithms based on gradient descent. In addition to the least square objective, we also learned its probabilistic perspective where each observation is assumed to have a Gaussian noise. (i.e., Noise of each example is an independent and identically distributed sample from a normal distribution) In this problem, you are supposed to deal with the following regression model that includes two linear features and one quadratic feature.y = θ0 + θ1x1 + θ2x2 + θ3x2 + ϵ where ϵ ∼ N (0, σ2)

Your goal is to develop a gradient descent learning algorithm that will estimate the best pa- rameters θ = {θ0, θ1, θ2, θ3}.

(a) Given the definition of noise, derive the corresponding mean and variance parameters of the normal distribution for y|x1, x2; θ. Write also down its probability density function. 

(b) You  are  provided  with  a  training  observations  D  = {(x(i), x(i), y(i))|1 ≤ i ≤ m}.  Derive

the conditional log-likelihood that will be later maximized to make D most likely.

(c) If you omit all the constant that does not relate to our parameters θ, what will be the objective  function  J(θ0, θ1, θ2, θ3)  that  you  are  going  to  perform  Maximum  Likelihood Estimation? Does J look similar to the Least Square objective for this problem?

(d) Compute the gradient of J(θ) with respect to each parameter. (Hint: You should evaluate the partial derivatives of J(θ) with respect to each θj  for 0 ≤ j ≤ 3)

(e) [Coding] Develop two learning algorithms: batch and stochastic gradient descent for this problem on the Auto dataset given in Problem 4 in Homework 1. Try to find the two best input features for predicting the output mpg. Compare and report the difference between your full coding and R’s built-in function call: lm. (Hint: At least your stochastic gradient algorithm must learn parameters θ comparable to the result from calling the R’s built-in function. Otherwise try to tune the learning rate α < 1.0)

Problem  2:  Logistic  Regression  and  Evaluations [30 points]

Recall the grading problem to predict pass/fail in the class. Suppose you collect data for a group of students in the class that consist of two input features X1 = hours studied and X2 = undergrad GPA. Your goal is to predict the output Y  ∈ {pass, fail }.  Suppose that you fit a logistic regression, learning its parameter (θ0, θ1, θ2) = (−6, 0.05, 1).

(a) What will be the probability for a student who studies for 40 hours and has a GPA of

3.5 to pass the class?

(b) How many hours would the student in part (a) needs to study in order to have at least 50% chance of passing the class?

The following questions must be answered using Weekly dataset in ISLR package. It contains 1,089 weekly returns for 21 years from the beginning of 1990 to the end of 2010.  You will use its 1990-2008 as a training data and 2009-2010 as a test data.

(c) [Coding] Given the training data, you are to perform a logistic regression where the input features are five of Lag variables and Volume, and the binary output is Direction. Use the summary function to print our the results. Report the confusion matrix and the accuracy on both training and test data given the learned model. (Hint: As nothing is specified, your default threshold to decide Up/Down must be 0.5)

(d) [Coding] Now you will run logistic regression five times with only one input features Lagj(1 ≤ j ≤ 5) for each time. Compute the  confusion  matrix  and  the  accuracy  on both training and test data given each of the learned models. Which are the best models among the five models here and the earlier model in part (c) in terms of the accuracy and F-score, respectively? Does the best model also achieve the best accuracy or F-score

on the training data? (Hint: The best model must be chosen based on the test data, not the training data!)

(e) [Coding] Try to draw six ROC curves for 6 models from the part (c) and (d) with varying thresholds. Determine the best model in terms of the Area Under Curve (AUC). (Note: 6 curves must be plotted at the same time as a single graph. You can use R’s built-in function to compute the AUC)

(f)  [Coding] Try to draw six Precision-Recall curves for 6 models from the part (c) and (d) with varying thresholds. Determine the best model in terms of the Area Under Curve (AUC). (Note: 6 curves must be plotted at the same time as a single graph. You can use R’s builtin function to compute the AUC)

(g)  Report your observation about different evaluation measures, and pick the one that you think the most appropriate for this problem. Explain why.

(5/5)
Attachments:

Related Questions

. The fundamental operations of create, read, update, and delete (CRUD) in either Python or Java

CS 340 Milestone One Guidelines and Rubric  Overview: For this assignment, you will implement the fundamental operations of create, read, update,

. Develop a program to emulate a purchase transaction at a retail store. This  program will have two classes, a LineItem class and a Transaction class

Retail Transaction Programming Project  Project Requirements:  Develop a program to emulate a purchase transaction at a retail store. This

. The following program contains five errors. Identify the errors and fix them

7COM1028   Secure Systems Programming   Referral Coursework: Secure

. Accepts the following from a user: Item Name Item Quantity Item Price Allows the user to create a file to store the sales receipt contents

Create a GUI program that:Accepts the following from a user:Item NameItem QuantityItem PriceAllows the user to create a file to store the sales receip

. The final project will encompass developing a web service using a software stack and implementing an industry-standard interface. Regardless of whether you choose to pursue application development goals as a pure developer or as a software engineer

CS 340 Final Project Guidelines and Rubric  Overview The final project will encompass developing a web service using a software stack and impleme