Please follow these submission instructions carefully so that I can focus on grading and providing helpful feedback. Not following these instructions will result in a 5-point penalty, and I may ask you to resubmit in the correct format. Upload your submission to Moodle as a single PDF comprising photos of your (legible!) handwritten work or a document that has been typeset in TeX. This PDF should be named <last_name>.pdf, replacing <last_name> with your last name(s).
Working in groups is allowed (and even encouraged), provided you do all of the following:
1. clearly include the names of the people you worked with (inside the write-up, not in the file name)
2. do all of the write-up yourself, in your own words
3. use the group for discussing ideas and not just sharing answers
For any questions, email me (Alex) at alexander.markham@univie.ac.at, or post on the Question discussion forum on Moodle.
You might find this section easier if you wait until gradient descent is covered in lecture (still well before the due date of this assignment). Nevertheless, all the necessary information is here and you should be able to complete these tasks before the lecture.
vector of parameters (this replaces w and b from the notes), returns the probability of the sample being from a certain class. So, you are given a matrix $X \in \mathbb{R}^{m \times n}$ of samples (with $m$ rows of samples and $n$ columns of features) and a vector $y \in \{0, 1\}^m$ of labels to train the model.
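As a rough illustration of the model described above, the sketch below assumes that the hypothesis is the standard logistic (sigmoid) function of the dot product of θ and x; the text here does not spell out that definition, so treat it as an assumption. The names `h`, `X_demo`, `y_demo`, and `theta_demo` are hypothetical, and the data are made up.

```python
import math

def h(theta, x):
    """Assumed hypothesis: logistic (sigmoid) function of the dot product theta . x."""
    z = sum(t_j * x_j for t_j, x_j in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy data: m = 2 samples (rows), n = 2 features (columns).
X_demo = [[2.0, -1.0], [0.5, 4.0]]
y_demo = [1, 0]                    # one label in {0, 1} per row of X_demo
theta_demo = [0.0, 0.0]            # one parameter per column of X_demo

# One probability per sample; with all-zero parameters each one is exactly 0.5.
probs = [h(theta_demo, x_i) for x_i in X_demo]
```

Every returned value lies strictly between 0 and 1, which is what allows it to be read as a class probability.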
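Do not forget to add a column of 1s to $X$ for the bias term in $\theta$, so that the decision boundary is not constrained to pass through the origin. Using the trained model, you can classify a sample $x^{(i)}$ in class 1 if $P(y = 1 \mid x^{(i)}) = h_\theta(x^{(i)}) \geq \frac{1}{2}$, and in class 0 otherwise. We define the loss function as
$$l_{l.r.}(\theta, X, y) = \sum_{i=1}^{m} \left[ -y^{(i)} \log h_\theta(x^{(i)}) - (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right],$$
and we can thus compute the gradient (over the entire training set; it could analogously be defined for any subset of samples) as follows:
$$\frac{\partial}{\partial\theta} l_{l.r.}(\theta, X, y) = \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) {x^{(i)}}^{T}.$$
Data: Training samples $X = \begin{bmatrix} x^{(1)} \\ \vdots \\ x^{(m)} \end{bmatrix}$ and corresponding labels $y = \begin{bmatrix} y^{(1)} \\ \vdots \\ y^{(m)} \end{bmatrix}$
Parameters: $T$, number of iterations; $\eta_{init}$, initial learning rate; and $k$, batch size
initialize $\hat\theta = 0$;
for $t$ in $(1, \ldots, T)$ do
    $\eta \leftarrow \eta_{init} / \sqrt{t}$;
    randomly draw $k$ samples from $(X, y)$ to get $X_k$ and $y_k$;
    $\hat\theta \leftarrow \hat\theta - \eta \cdot \frac{1}{k} \cdot \frac{\partial}{\partial\theta} l(\hat\theta, X_k, y_k)$;
end
Result: $\hat\theta$ that minimizes $l$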
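The loss, gradient, and update loop above can be sketched in plain Python, without any automatic differentiation. This is only a sketch under stated assumptions: `h` is taken to be the standard logistic function, the learning rate is taken to decay as the initial rate divided by the square root of the iteration number, and the batch is drawn without replacement. The function names, the `seed` parameter, and the toy data are hypothetical and are not the assignment's data.

```python
import math
import random

def h(theta, x):
    """Assumed hypothesis: logistic (sigmoid) function of theta . x."""
    z = sum(t_j * x_j for t_j, x_j in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

def loss(theta, X, y):
    """Negative log-likelihood summed over all samples."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        p = h(theta, x_i)
        total += -y_i * math.log(p) - (1 - y_i) * math.log(1.0 - p)
    return total

def gradient(theta, X, y):
    """Sum over samples of (h(x_i) - y_i) * x_i, one entry per parameter."""
    grad = [0.0] * len(theta)
    for x_i, y_i in zip(X, y):
        err = h(theta, x_i) - y_i
        for j, x_ij in enumerate(x_i):
            grad[j] += err * x_ij
    return grad

def sgd(X, y, T, eta_init, k, seed=0):
    """Mini-batch SGD; returns the learned parameters and the final learning rate."""
    rng = random.Random(seed)                # hypothetical seed, for repeatability
    theta = [0.0] * len(X[0])                # initialize theta_hat = 0
    eta = eta_init
    for t in range(1, T + 1):
        eta = eta_init / math.sqrt(t)        # assumed decay schedule
        idx = rng.sample(range(len(X)), k)   # assumption: draw without replacement
        X_k = [X[i] for i in idx]
        y_k = [y[i] for i in idx]
        grad = gradient(theta, X_k, y_k)
        theta = [th - eta * g / k for th, g in zip(theta, grad)]
    return theta, eta

# Hypothetical toy data with the bias column already included.
X_demo = [[1.0, 2.0], [1.0, -1.0]]
y_demo = [1, 0]
theta_hat, eta_final = sgd(X_demo, y_demo, T=1, eta_init=0.1, k=2)
```

Because the batch here covers the whole toy data set, a single step with this small learning rate lowers the loss relative to the all-zero starting point. Note that `math.log` raises an error if a predicted probability saturates to exactly 0 or 1, which this sketch does not guard against.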
For each of the following tasks, show your detailed calculations at every step. Simply providing the value without showing how it was calculated will not earn any credit. Round to at least three decimal places. You may use a scientific calculator, and you can represent the sigma notation summation as matrix multiplication to (slightly) condense your calculations, but you may not use more advanced calculators or features of e.g. Python or GNU Octave that automatically compute gradients or perform gradient descent.
1. Using an initial learning rate of ηinit = 0.1, perform T = 3 iterations of stochastic gradient descent. Assume we have a large data set X, from which we draw k = 3 samples:
x(1) = (1, 3); y(1) = 0
x(2) = (3, 1); y(2) = 1
x(3) = (1, 4); y(3) = 0
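or equivalently, written in matrix notation:
$$X_k = \begin{bmatrix} x^{(1)} \\ x^{(2)} \\ x^{(3)} \end{bmatrix} = \begin{bmatrix} 1 & 3 \\ 3 & 1 \\ 1 & 4 \end{bmatrix} \quad \text{and the corresponding labels} \quad y = \begin{bmatrix} y^{(1)} \\ y^{(2)} \\ y^{(3)} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$$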
Provide the learned values for $\hat\theta$ and the final value for $\eta$. Note that you should use all of $X_k$ for each iteration and that $X_k \subset X$ is the same for each iteration, even though in practice it would likely be a different subset each iteration, because it is randomly drawn from $X$. (Tip: make sure you understand the dimensions of $\theta$, $X$, $y$, $x^{(i)}$, $y^{(i)}$, and how they are all related.)
2. What is the loss $l_{l.r.}$ of your learned $\hat\theta$ (computed over the three samples) after each iteration?
3. Using the $\hat\theta$ parameters learned in the first task, classify the following data point and state the probability that it belongs in that class (i.e., estimate $y^{(4)}$ and state the corresponding posterior): $x^{(4)} = (2, 2)$
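To help with the dimension tip and the classification step, the sketch below adds the bias column to the three samples from task 1 and applies the threshold rule to the new point. It assumes that the column of 1s is prepended (it could equally be appended, as long as θ is ordered to match) and that `h` is the standard logistic function. The parameter vector `theta_placeholder` is deliberately all zeros: it is a placeholder, not the answer to task 1. The helper names are hypothetical.

```python
import math

def h(theta, x):
    """Assumed hypothesis: logistic (sigmoid) function of theta . x."""
    z = sum(t_j * x_j for t_j, x_j in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

def add_bias_column(X):
    """Prepend a constant 1 to every sample (row) of X."""
    return [[1.0] + [float(v) for v in row] for row in X]

def classify(theta, x, threshold=0.5):
    """Return the predicted class and the posterior probability of that class."""
    p1 = h(theta, x)                           # P(y = 1 | x)
    label = 1 if p1 >= threshold else 0
    posterior = p1 if label == 1 else 1.0 - p1
    return label, posterior

# Samples and labels from task 1.
X_k = [[1, 3], [3, 1], [1, 4]]
y_k = [0, 1, 0]
X_aug = add_bias_column(X_k)                   # 3 rows, 3 columns after the bias
m, n = len(X_aug), len(X_aug[0])

# Placeholder parameters (all zeros), NOT the learned values from task 1.
theta_placeholder = [0.0] * n                  # theta needs one entry per column

x4 = add_bias_column([[2, 2]])[0]              # the new point with its bias entry
label, posterior = classify(theta_placeholder, x4)
```

The posterior reported for class 0 is one minus the model output, since the model output is the probability of class 1.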