This assignment is an integral part of this module and contributes 20% to the overall mark.
Imagine that you are a member of a research group in a company. Your group leader has asked you to understand a training algorithm fully by implementing the method, testing it out, exploring its behaviour, explaining it, and perhaps even suggesting ways it might be improved.
To do this assignment you will need a copy of the paper by Mosca, entitled "Training Convolutional Networks with Weight-wise Adaptive Learning Rates", published in the ESANN 2017 proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, Bruges (Belgium), 26-28 April 2017. You will implement the training algorithm described in the paper, called WAME, and you will test it on a real-world data set. You have to describe how the method works, adjust its free parameters, discuss your experimental set-up and explain your experiments. Lastly, you have to evaluate how the algorithm solves the problem, and discuss your results and their significance.
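For orientation, the weight-wise update at the heart of WAME can be sketched in a few lines of NumPy. This is a minimal rendering of our reading of the paper's update equations, where zeta is the per-weight acceleration factor, z is its moving average, and theta is a moving average of the squared gradient; the constants (eta+ = 1.2, eta- = 0.1, zeta in [0.01, 100], alpha = 0.9) should be verified against the paper itself:

import numpy as np

ETA_PLUS, ETA_MINUS = 1.2, 0.1      # acceleration / deceleration factors (check paper)
ZETA_MIN, ZETA_MAX = 0.01, 100.0    # clipping range for the per-weight factor
ALPHA, LR, EPS = 0.9, 1e-4, 1e-12   # EMA decay, learning rate, numerical guard

def wame_step(w, grad, prev_grad, zeta, z, theta):
    """One WAME update; all arguments are arrays of the same shape as w."""
    sign = grad * prev_grad
    # Rprop-style per-weight acceleration: grow zeta while the gradient keeps
    # its sign, shrink it on a sign change, clipped to [ZETA_MIN, ZETA_MAX].
    zeta = np.where(sign > 0, np.minimum(zeta * ETA_PLUS, ZETA_MAX), zeta)
    zeta = np.where(sign < 0, np.maximum(zeta * ETA_MINUS, ZETA_MIN), zeta)
    z = ALPHA * z + (1 - ALPHA) * zeta              # moving average of zeta
    theta = ALPHA * theta + (1 - ALPHA) * grad**2   # moving average of squared gradient
    w = w - LR * z * grad / (theta + EPS)           # weight-wise adaptive step
    return w, grad.copy(), zeta, z, theta           # grad becomes next prev_grad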
By doing this coursework you will get experience with implementing, running, adapting, and evaluating machine learning methods on real data. You will need to write, reuse or change code, run it on some data, make some figures, read a few background papers, present your results, and write a report describing the problem you tackled, the machine learning algorithm you used and the results you obtained.
You have to choose a real-world data set that contains more than a couple of thousand data points, for example:
• Landsat: this dataset was originally part of the StatLog project datasets, and is a collection of 3×3 pixel neighbourhoods from images taken by the Landsat satellite, for the purpose of classifying the terrain type into one of 7 available classes. The dataset comprises 6435 instances, contains no instances of class 6, and is heavily imbalanced. You can find online several papers that have used this data (this could be very useful, as it provides a point of reference), as well as more information at the UCI Machine Learning Repository, Statlog (Landsat Satellite) data set, at http://archive.ics.uci.edu/ml/datasets/Statlog+(Landsat+Satellite)
The above data set is just an example. You can choose any data you wish. You can use data from your own MSc project, or data from a public repository, as the example above. However, you need to make sure that the data set is large enough (as specified above).
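If you do pick the Landsat example, loading it in Python is straightforward. The sketch below assumes the sat.trn and sat.tst files as distributed on the UCI page (36 space-separated attribute values per row, followed by the class label):

import numpy as np

def load_landsat(path):
    data = np.loadtxt(path)
    X, y = data[:, :36], data[:, 36].astype(int)
    return X / 255.0, y   # attributes are in the range 0-255; scale to [0, 1]

X_train, y_train = load_landsat("sat.trn")
X_test, y_test = load_landsat("sat.tst")
print(X_train.shape, np.unique(y_train))  # labels should be {1, 2, 3, 4, 5, 7}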
You can use any programming language or software library for this assignment. In the labs, we use MATLAB, which provides well-tested functions for neural network design that are appropriate for the UCI dataset mentioned above.
The assignment is further explained in Section 2 below. Section 3 of this document gives you an example of how to structure your report and explains the marking scheme. Section 4 presents the deadlines and submission instructions. Section 5 explains the penalties for late submission.
You can do your WAME implementation in MATLAB, write your own code, or build on a package/library from the internet. You are not tested on programming, so the coding style does not have to be perfect and your code does not have to be optimal, but it should obviously work correctly. Since the focus is on implementing WAME, I wouldn't recommend implementing all the methods required for training neural networks (e.g. backpropagation, derivative calculations, etc.) from scratch unless you are very experienced with Java, C++, Python or some other programming language or platform. No matter what you do or use, make sure that all sources and code taken from others or from the internet are cited properly in your Report; otherwise, you may be accused of plagiarism.
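As one illustration of the "build on a library" route, the WAME update can be wrapped as a custom optimiser in a framework that already provides backpropagation. The sketch below does this in PyTorch; it is a starting point under our reading of the paper, not a tested reference implementation, and the default constants should again be checked against the paper:

import torch
from torch.optim import Optimizer

class WAME(Optimizer):
    """Sketch of WAME as a PyTorch optimiser (verify constants against the paper)."""
    def __init__(self, params, lr=1e-4, alpha=0.9, eta_plus=1.2,
                 eta_minus=0.1, zeta_min=0.01, zeta_max=100.0):
        defaults = dict(lr=lr, alpha=alpha, eta_plus=eta_plus,
                        eta_minus=eta_minus, zeta_min=zeta_min, zeta_max=zeta_max)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self):
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is None:
                    continue
                g = p.grad
                state = self.state[p]
                if len(state) == 0:   # lazy per-weight state on first use
                    state["prev_grad"] = torch.zeros_like(p)
                    state["zeta"] = torch.ones_like(p)
                    state["z"] = torch.zeros_like(p)
                    state["theta"] = torch.zeros_like(p)
                sign = g * state["prev_grad"]
                zeta = state["zeta"]
                zeta = torch.where(sign > 0,
                                   (zeta * group["eta_plus"]).clamp(max=group["zeta_max"]),
                                   zeta)
                zeta = torch.where(sign < 0,
                                   (zeta * group["eta_minus"]).clamp(min=group["zeta_min"]),
                                   zeta)
                state["zeta"] = zeta
                a = group["alpha"]
                state["z"].mul_(a).add_(zeta, alpha=1 - a)       # EMA of zeta
                state["theta"].mul_(a).add_(g * g, alpha=1 - a)  # EMA of grad^2
                p.add_(-group["lr"] * state["z"] * g / (state["theta"] + 1e-12))
                state["prev_grad"] = g.clone()

An optimiser written this way drops into the usual training loop: construct it with opt = WAME(model.parameters()), then call loss.backward() followed by opt.step() each iteration.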
Some packages provide techniques for determining the optimal structure of machine learning models (e.g. the model architecture or the model hyperparameters) automatically as part of the training. In that case, instead of performing experimental tests varying the number of free parameters of the model, these techniques can be used to find the appropriate structure for your model. Still, some of these methods have their own parameters, which require fine-tuning.
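For instance, scikit-learn's GridSearchCV automates this kind of structure search. The sketch below is illustrative only: WAME itself is not available in scikit-learn, so a stock MLPClassifier stands in for your model, and note that the search itself introduces parameters (the grid, the number of folds) that still need justification:

from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# X_train, y_train: your training data, e.g. as loaded in the earlier sketch.
grid = GridSearchCV(
    MLPClassifier(max_iter=500),
    param_grid={"hidden_layer_sizes": [(10,), (25,), (50,)],  # structures to try
                "alpha": [1e-4, 1e-3]},                       # L2 penalty strengths
    cv=5, scoring="accuracy", n_jobs=-1)
grid.fit(X_train, y_train)
print(grid.best_params_, grid.best_score_)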
Note that the performance results of your approach will be more meaningful if a validation technique is used, such as k-fold cross-validation (k=7 or k=10 is typically used), leave-one-out cross-validation, or some form of Monte Carlo simulation. Lastly, the use of regularisation, provided in some software packages and in MATLAB, normally helps to obtain better results.
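A minimal stratified k-fold loop (k=10) might look as follows; train_and_score is a hypothetical placeholder for your own WAME training and evaluation routine:

import numpy as np
from sklearn.model_selection import StratifiedKFold

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = []
for train_idx, test_idx in skf.split(X_train, y_train):
    # train_and_score is hypothetical: train on one split, return test accuracy
    acc = train_and_score(X_train[train_idx], y_train[train_idx],
                          X_train[test_idx], y_train[test_idx])
    scores.append(acc)
print(f"mean accuracy {np.mean(scores):.3f} +/- {np.std(scores):.3f}")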
You are expected to test your implementation of WAME using relevant data and store the results of each experiment that you conduct in ASCII format. Results are typically in the form of: the number of successfully recognised patterns per class; the number of unsuccessfully recognised patterns per class; the overall average classification success in training and in testing; and the average error in training and in testing.
The results of your experiments should be stored in ASCII format, in a Jupyter notebook, or in notebook documents produced by other web-based interactive computational environments, specifying whether each result is from the training or the testing phase, and should be submitted together with your report (Moodle allows you to submit additional files; up to 5 files in total can be submitted). Check that these files can be opened and read correctly. Results should be presented using figures and tables and discussed in your Report; it is not enough just to submit a Python notebook or files with results, as these are not accepted as a Report submission.
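One simple way to produce such files is sketched below; save_results and its arguments are hypothetical names, with conf a confusion matrix (rows indexed by true class) from either the training or the testing phase:

import numpy as np

def save_results(fname, phase, classes, conf, avg_success, avg_error):
    """Write per-class counts and averages for one experiment as plain ASCII."""
    with open(fname, "w") as f:
        f.write(f"phase: {phase}\n")   # 'training' or 'testing'
        for i, c in enumerate(classes):
            correct = int(conf[i, i])
            missed = int(conf[i].sum() - correct)
            f.write(f"class {c}: recognised {correct}, not recognised {missed}\n")
        f.write(f"average classification success: {avg_success:.2f}%\n")
        f.write(f"average error: {avg_error:.4f}\n")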
Your work will be presented in a Report (notebook documents are not accepted as a Report). It is important that your Report is properly structured. Sections like the ones shown below could be included in your report to ensure good coverage of the topic. Around 2000-3000 words are expected to cover all aspects of the assignment in sufficient depth, but our marking is not based on the number of words used in the Report. Also, you are not just being marked on how good the results of your neural model are. What primarily matters is that you describe your design of the training algorithm, justify any choices you made, explain how things work, and make the model work with data. You will also need to provide insight on how to (pre)process the data before feeding it into the model and doing the training (if necessary), how to debug the learned model (not only the training algorithm), and how to measure model performance and demonstrate its significance.
1. Methodology and design (50% of the mark): the appropriate use and sophistication of design methods and the overall methodology are marked here.
1.1 This part should normally describe clearly the approach used in your design and implementation and any relevant parameters (e.g. for neural network models this includes the number of hidden nodes and layers, the type of activation functions, etc.). If you are using a particular library or tool, you still need to describe how the methods/functions that you are using operate. Citing the library, tool, etc., and just listing the library functions is not enough to get a good mark.
1.2 This part should describe any special techniques/methods or parameters used in the stages of your methodology, e.g. in data preprocessing, in WAME initialisation or during training. Also, this part should describe any normalisation techniques used, or other pre-processing or balancing methods, and whether you have used some form of cross-validation, regularisation, or weight decay, providing details of the particular method. Citing the library, tool, etc., and mentioning the library functions that you used is not enough to get a high mark.
2. Experiments, findings and discussion (40% of the mark): the experimental design and how systematically the investigation was conducted are marked here. You must present and discuss your experimental design and results. You are expected to run several experiments, explore the behaviour of the training algorithm under different conditions, and calculate basic statistics to summarise performance in training and testing. You also need to discuss the significance of your findings. Your report must include at least two figures which graphically illustrate quantitative aspects of your results, such as training/testing errors, performance metrics for sets of learned parameters, algorithm outputs, descriptive statistics, etc.
For example, in this part you can use Excel or other packages to provide charts with error bars (Box and Whisker Charts in Excel) to show the performance of your trained model in terms of generalisation; for instance, a chart of generalisation with respect to the number of hidden nodes used in a neural network-based solution. Alternatively, one could use a table to provide the same information, giving for each number of hidden nodes the average, minimum, and maximum generalisation performance (as a percentage of successfully recognised patterns) in the tests.
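The same kind of chart can be produced in Python with matplotlib. In the sketch below, results is a hypothetical dict mapping the number of hidden nodes to the list of generalisation scores (% of patterns correctly recognised in testing) obtained over repeated runs:

import matplotlib.pyplot as plt

hidden_nodes = sorted(results)
plt.boxplot([results[h] for h in hidden_nodes],
            labels=[str(h) for h in hidden_nodes])
plt.xlabel("number of hidden nodes")
plt.ylabel("generalisation (% of patterns correctly recognised)")
plt.savefig("generalisation_vs_hidden_nodes.png", dpi=150)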