Introduction
In this assignment, you are given a specific data science problem and a related research paper. You are required to present a critical analysis of how the techniques in the paper can be applied to the given problem, and then implement them.
The “Practical Data Science” Canvas contains further announcements and a discussion board for this assignment. Please check these on a regular basis; it is your responsibility to stay informed of any announcements or changes. Log in through https://rmit.instructure.com/.
Where to Develop Your Code
You are encouraged to develop and test your code in two environments: Jupyter Notebook on the Lab PCs, and Anaconda 3 installed on your own computer.
Jupyter Notebook on Lab PCs
On a Lab PC, you can find Jupyter Notebook via:
Start → All Programs → Anaconda3 (64-bit) → Jupyter Notebook
Then,
• Select New → Python 3
• The newly created ‘*.ipynb’ file is saved at the following location:
– C:\Users\sXXXXXXX
– where sXXXXXXX should be replaced with a string consisting of the letter “s” followed by your student number.
Academic integrity and plagiarism (standard warning)
Academic integrity is about honest presentation of your academic work. It means acknowledging the work of others while developing your own insights, knowledge and ideas. You should take extreme care that you have:
• Acknowledged words, data, diagrams, models, frameworks and/or ideas of others you have quoted (i.e. directly copied), summarised, paraphrased, discussed or mentioned in your assessment through the appropriate referencing methods
• Provided a reference list of the publication details so your reader can locate the source if necessary. This includes material taken from Internet sites. If you do not acknowledge the sources of your material, you may be accused of plagiarism because you have passed off the work and ideas of another person without appropriate referencing, as if they were your own.
RMIT University treats plagiarism as a very serious offence constituting misconduct. Plagiarism covers a variety of inappropriate behaviours, including:
• Failure to properly document a source
• Copyright material from the internet or databases
• Collusion between students
For further information on our policies and procedures, please refer to the following: https://www.rmit.edu.au/students/student-essentials/rights-and-responsibilities/ academic-integrity.
General Requirements
This section contains information about the general requirements that your assignment must meet. Please read all requirements carefully before you start.
• You must include a plain text file called “readme.txt” with your submission. This file should include your name and student ID, and instructions on how to execute your submitted script files. This is important because automation is part of the sixth step of the data science process, and it will be assessed strictly.
• Please ensure that your submission follows the file naming rules specified in the tasks below. File names are case sensitive, i.e. if it is specified that the file name is gryphon, then that is exactly the file name you should submit; Gryphon, GRYPHON, griffin, and anything else but gryphon will be rejected.
Overview
It is well-known that missing values are one of the biggest challenges in data science projects.
You might know that k-nearest-neighbour-based Collaborative Filtering is also called “memory-based” Collaborative Filtering, and that it suffers from this problem because most user-item ratings are missing. Luckily, data scientists and researchers have been working hard on the missing value problem in k-neighbourhood-based Collaborative Filtering, and have proposed solutions to it.
In this assignment, you are required to tackle the missing value problem in Collaborative Filtering by predicting the missing values. Specifically, an existing solution for predicting the missing values in Collaborative Filtering is provided in a report named “Effective Missing Data Prediction for Collaborative Filtering”. Please read this report carefully, then complete the following tasks.
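To make the setting concrete, the following is a minimal sketch of plain memory-based (user-based) Collaborative Filtering on a rating matrix with missing entries. It is not the method of the provided report, and it is not part of the provided framework. The function names `pearson_similarity` and `predict_rating`, the toy matrix `R`, the use of `NaN` to mark missing ratings, and the choice to keep only positively correlated neighbours are all assumptions made for illustration. Only NumPy is used.

```python
import numpy as np


def pearson_similarity(ratings, a, u):
    """Pearson correlation between users a and u over their co-rated items.

    Assumption: missing ratings are stored as NaN, and means are taken
    over the co-rated items only.
    """
    both = ~np.isnan(ratings[a]) & ~np.isnan(ratings[u])
    if both.sum() < 2:
        # Not enough overlap to measure a correlation.
        return 0.0
    da = ratings[a, both] - ratings[a, both].mean()
    du = ratings[u, both] - ratings[u, both].mean()
    denom = np.sqrt((da ** 2).sum() * (du ** 2).sum())
    if denom == 0:
        # One of the users gave identical ratings to all co-rated items.
        return 0.0
    return float((da * du).sum() / denom)


def predict_rating(ratings, a, i, k=2):
    """Predict the missing rating of user a for item i.

    Uses the k most similar users (positive similarity only) who have
    rated item i, and falls back to user a's mean rating if none exist.
    """
    mean_a = np.nanmean(ratings[a])
    candidates = []
    for u in range(ratings.shape[0]):
        if u == a or np.isnan(ratings[u, i]):
            continue
        s = pearson_similarity(ratings, a, u)
        if s > 0:
            candidates.append((s, u))
    if not candidates:
        return float(mean_a)
    top = sorted(candidates, reverse=True)[:k]
    # Similarity-weighted average of each neighbour's deviation from
    # that neighbour's own mean rating.
    num = sum(s * (ratings[u, i] - np.nanmean(ratings[u])) for s, u in top)
    den = sum(s for s, _ in top)
    return float(mean_a + num / den)


# Toy user-item matrix (rows: users, columns: items); NaN marks a missing rating.
R = np.array([
    [5.0, 4.0, np.nan, 1.0],
    [4.0, 5.0, 4.0, 1.0],
    [1.0, 2.0, 5.0, 5.0],
    [5.0, 5.0, 5.0, 2.0],
])

print(predict_rating(R, a=0, i=2, k=2))
```

In this toy matrix, user 0 has not rated item 2. Users 1 and 3 rate the other items similarly to user 0, so they drive the prediction, while user 2 has the opposite taste and is ignored. When few users share co-rated items, such predictions become unreliable or impossible, which is the missing value problem this assignment addresses.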
Tasks
Task 1: Implementation
In this task, you are required to implement the solution in the provided report so as to predict the missing values in Collaborative Filtering.
Note that you are required to write your own implementation. Please do not use any libraries related to Recommender Systems or Collaborative Filtering. If you use any such library, your implementation part will be invalid.
We provide Python framework code (named assignment3 framework.ipynb) to help you get started, and this will also automate the correctness marking. The framework also includes the training data and the test data.
Please only put your own code in the provided cell in the framework, as shown in Figure 1. Please DO NOT CHANGE anything in the remaining cells of the framework, otherwise you might cause errors during the automatic marking.
Please provide detailed comments to explain your implementation. What level of detail should you provide? Please take the comments in the ipynb files from Week 10 (knn based cf updated.zip) as examples of the level of detail expected in your solution. You might find the following information useful: https://www.w3schools.com/python/python_comments.asp
Figure 1: Where to put your implementation in the provided framework (assignment3 framework.ipynb)
Task 2: Presentation
• The presentation should
– Explain, clearly and completely and in your own words, how the solution in the provided report predicts the missing values in Collaborative Filtering.
– Explain, clearly and completely, why the solution in the provided report can tackle the missing value problem in Collaborative Filtering.
– Explain, clearly and completely, how you implemented the solution.
• The presentation should be no more than 10 minutes.
• Your presentation slides should be:
– Microsoft PowerPoint slides (with audio inserted for each slide by using: Insert → Audio → Record Audio).
– Alternatively, slides in another format (e.g. PDF), submitted together with your own recording of the presentation (in mp4 or avi format).