Copyright statement: All the materials of this project—including this specification and code skeleton—are copyright of the University of Melbourne. These documents are licensed for the sole purpose of your assessment in COMP90051. You are not permitted to share or post these documents.
Academic misconduct: You are reminded that all submitted work is to be your own individual work. Automated similarity checking software will be used to compare all submissions. It is University policy that academic integrity be enforced. For more details, please see the policy at http://academichonesty.unimelb.edu.au/policy.html. Academic misconduct hearings can determine that students receive zero for an assessment, a whole subject, or are terminated from their studies. You may not use software libraries or code snippets found online, from friends/private tutors, or anywhere else. You can only submit your own work.
The Support Vector Machine (SVM) is a powerful framework for classification, formulated around the idea of maximum margin of separation, which ensures strong generalisation performance. We have seen the equivalent primal and dual formulations of the SVM in lectures, how these lead to different time complexities of inference, and how the dual formulation allows kernels to be incorporated. In this project, you will work individually to implement several SVM algorithms. These go beyond what was covered in class, and include real research papers that you will have to read and understand yourself.
By the end of the project you should have developed:
ILO1. A deeper understanding of the SVM and its primal and dual formulations;
ILO2. An appreciation of how SVMs are applied;
ILO3. Demonstrable ability to implement ML approaches in code; and
ILO4. Ability to read research papers, understanding their focus, contributions, and algorithms well enough to implement and apply them (while ignoring details not needed for your task).
Overview
The project consists of four parts.
1. Solving the primal problem with stochastic online updates [5 marks]
2. Solving the dual problem with Kernel-Adatron [7 marks]
3. Incorporating kernels [5 marks]
4. Implementing the sequential minimal optimization (SMO) algorithm [8 marks]
All parts are to be completed in the provided Python Jupyter notebook proj1.ipynb.1 Detailed instructions for each task are included in this document.
1 We appreciate that while some entered COMP90051 feeling less confident with Python, many workshops so far have exercised and built up basic Python and Jupyter knowledge. Both are industry-standard tools for the data sciences.
SVM algorithms. The project’s tasks require you to implement SVM training algorithms by completing provided skeleton code in proj1.ipynb. All of the SVM training algorithms must be implemented as sub-classes of the provided base SVM class. This ensures all SVM training algorithms inherit the same interface, and your implementations must conform to this interface. You may implement functionality in the base SVM class if you desire, e.g., to avoid duplicating common functionality in each sub-class. Your classes may also use additional private methods to better structure your code, and you may decide how to use class inheritance.
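For illustration only, a sub-class might be organised along the following lines. This is a hedged sketch: the real base class and its method signatures are defined in the skeleton notebook, and the attribute and helper names below (w, b, _margin) are assumptions, not part of the provided interface.

import numpy as np

class SVM:
    # stand-in for the provided base class; the real one lives in proj1.ipynb
    def predict(self, X):
        raise NotImplementedError

class PrimalSVM(SVM):
    def __init__(self, eta, lambda0):
        self.eta = eta          # learning rate (assumed constructor argument)
        self.lambda0 = lambda0  # regularisation strength (assumed)
        self.w = None           # weights, set by fit (assumed attribute name)
        self.b = 0.0            # bias (assumed attribute name)

    def _margin(self, x, y):
        # private helper method: functional margin of one instance
        return y * (self.w @ x + self.b)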
Python environment. You must use the Python environment used in workshops to ensure markers can reproduce your results if required. We assume you are using Python ≥ 3.8, numpy ≥ 1.19.0, scikit-learn ≥ 0.23.0 and matplotlib ≥ 3.2.0.
Other constraints. You may not use functionality from external libraries/packages beyond what is imported in the provided Jupyter notebook. You must preserve the structure of the skeleton code; please only insert your own code where specified. You should not add new cells to the notebook. You may discuss the SVM learning slide deck or Python at a high level with others, including via Piazza, but do not collaborate with anyone on direct solutions. You may consult resources to understand SVM conceptually, but do not make any use of online code whatsoever. (We will run code comparisons against online partial implementations to enforce these rules. See the ‘academic misconduct’ statement above.)
Submission Checklist
You must complete all your work in the provided proj1.ipynb Jupyter notebook. When you are ready to submit, follow these steps. You may submit multiple times; we will mark your last attempt. Hint: it is a good idea to submit early as a backup. Try to complete Part 1 in the first week and submit it; it will help you understand the other tasks and be a fantastic start!
1. Restart your Jupyter kernel and run all cells consecutively.
2. Ensure outputs are saved in the ipynb file, as we may choose not to run your notebook when grading.
3. Rename your completed notebook from proj1.ipynb to username.ipynb where username is your university central username.2
4. Upload your submission to the Project 1 Canvas page.
Marking
Projects will be marked out of 25. Overall approximately 60%, 20%, 20% of available marks will come from correctness, code structure & style, and experimentation. Markers will perform code reviews of your submissions with indicative focus on the following. We will endeavour to provide (indicative not exhaustive) feedback.
1. Correctness: Faithful implementation of the algorithm as specified in the reference, or as clarified in the specification with possible updates in the Canvas changelog. It is important that your code performs other basic functions, such as raising errors if the input is incorrect, and working for any dataset that meets the requirements (i.e., not hard-coded).
2. Code structure and style: Efficient code (e.g., making use of vectorised functions, avoiding recalculation of expensive results); self-documenting variable names and/or comments; avoiding inappropriate data structures, duplicated code and illegal package imports.
2LMS/UniMelb usernames look like tcohn, not to be confused with email such as trevor.cohn.
3. Experimentation: Each task you choose to complete directs you to perform some experimentation with your implementation, such as evaluation, tuning, or comparison. You will need to choose a reasonable approach to your experimentation, based on your acquired understanding of the SVM learners.
Late submission policy. Late submissions will be accepted up to 4 days late, at a penalty of −2.5 marks per day or part day. Weekends and holidays also count towards the late penalty.
Part Descriptions
Part 1: Primal problem with stochastic gradient update [5 marks total]
Your first task is to implement a soft-margin SVM in its primal formulation, using stochastic gradient descent (SGD) for training. This is an online algorithm similar to the perceptron training algorithm (see the week 6 tutorial): training involves an outer loop run several times, and an inner loop over each instance in the training set, where the parameters are updated using the gradient of the loss for a single example. The stopping criterion for the outer loop is simply to complete iterations iterations (i.e., a fixed number of passes).
At each step, the algorithm will update the weights w and bias b, with update rules:
$$w_t = w_{t-1} - \eta \nabla_w L_i(w, b), \qquad b_t = b_{t-1} - \eta \nabla_b L_i(w, b)$$
where η is the learning rate, and $L_i(w, b)$ is the loss function for the $i$-th example. You will first have to figure out the per-instance loss, $L_i(w, b)$, based on the total loss of the training set,

$$L(w, b) = \sum_{i=1}^{n} \max\left(0,\ 1 - y_i (w' x_i + b)\right) + \frac{1}{2}\lambda\|w\|^2,$$

and then derive the relevant gradients needed for the SGD updates. You will then need to implement a PrimalSVM class with the following functions:
1. fit, which trains using SGD as described above (see the sketch after this list)
2. predict, which classifies a new instance x based on sign of w′x + b
3. __init__, to retain state or precompute values to support the above methods
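To make the training loop concrete, here is a minimal sketch of per-instance SGD on the primal objective. It assumes one particular decomposition of the total loss, in which each instance carries an equal 1/n share of the regulariser; part of your task is to derive your own per-instance loss and gradients, which may differ. The function name and signature are illustrative, not the required interface.

import numpy as np

def sgd_primal_fit(X, y, eta=0.1, lam=0.1, iterations=100):
    # Sketch only: assumes L_i(w, b) = hinge_i + (lam / (2n)) * ||w||^2.
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(iterations):          # outer loop: fixed number of passes
        for i in range(n):               # inner loop: one update per instance
            if y[i] * (X[i] @ w + b) < 1:        # hinge term is active
                grad_w = -y[i] * X[i] + (lam / n) * w
                grad_b = -y[i]
            else:                                # hinge term is zero
                grad_w = (lam / n) * w
                grad_b = 0.0
            w -= eta * grad_w
            b -= eta * grad_b
    return w, b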
Experiments. Once your PrimalSVM class is implemented, it is time to perform some basic experimentation.
(a) Include and run an evaluation on the given dataset:
psvm = PrimalSVM(eta = 0.1, lambda0 = 0.1)
psvm.fit(X, y, iterations = 100)
print(f"Accuracy is {round(psvm.evaluate(X,y),4)}")
psvm.visualize(X, y)
Print the training accuracy and plot the dataset and decision surface.
(b) Now you will need to tune your PrimalSVM’s hyperparameters. Based on your understanding of the λ value, test a range of values (consider equally spaced points in logarithmic space) and run experiments to find the best value.3 You must set eta to 0.1 and iterations to 100. Output the result of this strategy, which could be a graph, number, etc. of your choice (an illustrative sketch of one such sweep follows the footnote below).
3Best here means the setting with the best training accuracy. This is used here as a sanity check to test whether the model can fit the data, but of course it is not a good measure of the model’s generalisation accuracy.
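As a hedged sketch of one such tuning strategy, assuming the PrimalSVM interface from experiment (a) above; the grid bounds below are illustrative choices, not prescribed values:

import numpy as np

best_lam, best_acc = None, -np.inf
for lam in np.logspace(-4, 2, 13):       # equally spaced points in log space
    psvm = PrimalSVM(eta = 0.1, lambda0 = lam)
    psvm.fit(X, y, iterations = 100)
    acc = psvm.evaluate(X, y)            # training accuracy (see footnote 3)
    if acc > best_acc:
        best_lam, best_acc = lam, acc
print(f"Best lambda: {best_lam}, training accuracy: {round(best_acc, 4)}")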
Part 2: Dual SVM formulation with stochastic online update [7 marks total]
In this part, you are to implement the dual formulation of the soft-margin SVM. Your implementation should be done in the provided DualSVM skeleton code, and be based on the SGD training algorithm in Table 7.1 of Chapter 7 of the following book:
Cristianini, N., & Shawe-Taylor, J. (2000). An Introduction to Support Vector Machines and Other Kernel- based Learning Methods (pp. 125-148). Cambridge: Cambridge University Press.4
You will need to implement the following functions:
1. __init__
2. fit, with the training algorithm described above. For the training loop, use the same stopping criterion as in Part 1, that is, stopping after a given number of iterations. The algorithm listed in the reference assumes a fixed bias; however, in this project you should implement a get_bias function to compute b. The standard method for doing so uses the equation

$$b = y_i - \sum_{j=1}^{n} \alpha_j y_j k(x_i, x_j)$$

for an arbitrary instance i with 0 < αi < C. However, this assumes training has converged and the KKT conditions have been satisfied, which may not be true with SGD training. For this reason, you should compute the bias estimate for all candidate instances i, and use the average estimate as the bias.5 (A sketch of this averaging appears after this list.)
3. predict, which should classify a new instance x based on the sign of $b + \sum_{i=1}^{n} \alpha_i y_i k(x_i, x)$, where k is the kernel function (linear for now, but your implementation should be general enough to support Part 3, below).
4. primal_weights, which computes the equivalent primal weights under a linear kernel, i.e., $w = \sum_{i=1}^{n} \alpha_i y_i x_i$.
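As a minimal sketch of the averaged bias estimate described in item 2 above, assuming a precomputed kernel matrix K with K[i, j] = k(x_i, x_j); the function name, signature, and the zero-bias fallback are assumptions for illustration:

import numpy as np

def get_bias_estimate(alpha, y, K, C):
    # candidate instances: those with 0 < alpha_i < C
    candidates = np.where((alpha > 0) & (alpha < C))[0]
    if len(candidates) == 0:
        return 0.0                        # assumed fallback when no candidates exist
    # one bias estimate per candidate: b_i = y_i - sum_j alpha_j y_j k(x_i, x_j)
    estimates = y[candidates] - K[candidates] @ (alpha * y)
    return float(np.mean(estimates))      # average over all candidate estimates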
Experiments. Evaluate your model on the given dataset, using η = 0.1, C = 100, iterations = 100 and a linear kernel. Perform the following experiments with dsvm = DualSVM(eta = 0.1), keeping the other hyperparameter values unchanged except for the C value:
(a) Tune the value of C , to optimise for training accuracy.
(b) Report the equivalent weights for the dual solution, and compare to the primal trained with an equivalent loss. Hint: consider how C and λ are related, so that you compare equivalent models. Test that the weights are similar, using the norm of the difference (see the sketch after this list).
(c) Identify the support vectors, and points where α = C . Display your results either on a plot, or by printing the index of the relevant instances.
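As a hedged sketch of experiment (b), assuming the interfaces above; matched_lambda stands for whatever λ value your C-to-λ correspondence yields (deriving it is part of the task), and the psvm.w attribute name is an assumption:

import numpy as np

dsvm = DualSVM(eta = 0.1)                 # C and iterations as in the spec above
dsvm.fit(X, y, iterations = 100)
w_dual = dsvm.primal_weights()            # equivalent primal weights from the dual

psvm = PrimalSVM(eta = 0.1, lambda0 = matched_lambda)  # matched_lambda: hypothetical
psvm.fit(X, y, iterations = 100)

diff = np.linalg.norm(w_dual - psvm.w)    # norm of the weight difference
print(f"Norm of weight difference: {round(diff, 4)}")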