1 Question Description
If Music is a Place — then Jazz is the City, Folk is the Wilderness, Rock is the Road, Classical is a Temple.
Vera Nazarian
Many of us use a music streaming app to listen to music. These apps usually build personalized playlists to cater to each user’s needs. But what is the logic behind a personalized playlist? One general example is a Music Genre Classification System: a machine learning model that classifies music samples into different genres using various audio features.
The overall aim of this assignment is to develop the best possible machine learning system to predict the genres of music. The task is to classify each music track into one of ten genres based on the provided features. The data for each track includes textual features (e.g. artist and track names), numerical descriptors (e.g. duration) and various audio features. The hope is that your model will identify the relationship between music genres and the audio features.
We have set up a Kaggle InClass Competition (see footnote 1) to facilitate finding the best machine learning system for officials to use. You are expected to analyse the provided data, design and improve your own machine learning pipeline, and consider the consequences of applying your pipeline to this data.
Note the data is real. Thus, you could attempt to find the original dataset and create a look-up table. This is not permitted as it misses the point of the course. We want to see your analyses, rather than see perfect results.
2.1 Preliminary: Accessing the Kaggle In Class Competition
To access the class competition, you must use the URL below. Please do not share it publicly, as that would allow anybody to access our competition and make the experience less enjoyable for your classmates. Deliberate cheating is a disciplinary matter, so please don’t go there either.
Competition link: https://www.kaggle.com/t/59de28f43fe94576bb7044d28b3d5965
1 https://www.kaggle.com/about/inclass/overview
You will need to register a Kaggle account. It is perfectly fine (and expected) to use a pseudonym as your Kaggle username so your classmates do not know your real-life identity. However, you will need to fill out the following form so that the lecturers and tutors can link your Kaggle result to your ECS account. No other people will have access to this information! Each time you change your Username, please update the form (it would make sense to do this only once, but past experience suggests that students will change their Username a few times). We cannot give credit to top-scoring students if we cannot link a username to an actual/course name! Usernames must be suitable for work and respectful.
Please fill out the following form:
https://forms.office.com/r/EgdbpuNe5E
Please submit as part of your report.
Once you have completed the above steps, please verify that you can access the following page: https://www.kaggle.com/competitions/comp309-2022/overview (when logged in).
Once successful, please accept the terms of the competition so that you may proceed to the rest of the assignment!
2.2 Exploring and understanding the Data [40 marks]
We have created a processed version of the music genres data, to be used in the classification competition. We have split the data into a training set and a test set.
The training set is to be used to create your model. You can use any machine learning tool, e.g. Orange, Scikit-learn. Your model will need to be able to predict the class of ‘unseen’ test data (i.e. features, but not class, are provided).
It is often much more effective to first learn about the properties of a dataset (business and data understanding) before applying machine learning to it. You should begin by familiarising yourself with the dataset by reading the “Overview” and “Data” tabs of the Kaggle competition. Please download the dataset from the Data tab (in .csv format). You should now spend some time examining the data and taking notes of any interesting patterns you find.
Requirements
Using any tools you find useful, you should explore and analyse the dataset. You should draw upon your previous experiences and what you have learnt in this course to find a number of interesting patterns. You may wish to start by examining the quality, completeness and representation of individual features.
In your report, you should spend no more than 2.5 pages describing the following regarding the Core part:
• (20 marks) Highlight the findings of your dataset exploration. You should identify four important patterns (e.g. large correlation between variables), and discuss the potential consequence this may have on your results. To achieve a high mark, you should consider more complicated patterns, such as feature interactions. Use your judgement and justify what is an important pattern.
• (20 marks) Visualisation is an important aspect of this task. Please illustrate at least one important finding of your work using visualisation. For full marks, you are expected to use more than a simple scatter plot.
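As an illustration of the kind of pattern mentioned above (a large correlation between variables), the following is a minimal sketch that flags strongly correlated pairs of numerical features. It uses only the Python standard library so that it runs anywhere; with pandas, `DataFrame.corr()` would give the same coefficients. The column names and values are made up for illustration and are not taken from the competition data.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlated_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every column pair with |r| >= threshold."""
    pairs = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            pairs.append((a, b, round(r, 3)))
    return pairs


# Toy values; the column names are hypothetical, not the real feature names.
toy = {
    "energy": [0.2, 0.4, 0.6, 0.8, 0.9],
    "loudness": [-20.0, -14.0, -9.0, -6.0, -4.0],
    "duration_ms": [210000, 180000, 240000, 200000, 230000],
}
strong = correlated_pairs(toy, threshold=0.8)
print(strong)
```

A pair flagged this way is a candidate for discussion in the report: two nearly redundant features can, for example, make the coefficients of a linear model unstable, and a heatmap of the full correlation matrix is one way to show the finding with more than a simple scatter plot.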
2.3 Developing and testing your machine learning system [50 marks]
You may use any ML tools you wish, but a good solution will consider a number of factors, such as pre-processing steps, the properties of the dataset and generalisation/over-fitting. How you split your labelled data into training, testing and cross-validation sets is your choice; these decisions are important and should be explained.
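One possible way to split the labelled data is a stratified hold-out, which keeps each genre's share roughly equal in the training and validation parts. The sketch below shows the idea in plain Python under made-up assumptions: the function name `stratified_split`, the toy labels and the 20% validation fraction are illustrative, not part of the assignment. In scikit-learn, `train_test_split(..., stratify=y)` and `StratifiedKFold` provide the same behaviour.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Split row indices into train/validation with similar class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Classes with a single row stay entirely in the training part.
        if len(indices) > 1:
            n_val = max(1, round(len(indices) * val_fraction))
        else:
            n_val = 0
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Hypothetical, imbalanced toy labels (not the real class distribution).
labels = ["Rock"] * 50 + ["Jazz"] * 30 + ["Folk"] * 20
train_idx, val_idx = stratified_split(labels, val_fraction=0.2, seed=42)
print(len(train_idx), len(val_idx))
```

Whichever scheme is chosen, the validation rows should not be used for fitting or for tuning pre-processing steps, otherwise the validation score will overstate how well the model generalises to the unseen test set.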