DSCI553 Foundations and Applications of Data Mining
Spring 2022
Assignment 3
1. Overview of the Assignment
In Assignment 3, you will complete two tasks. The goal is to familiarize you with Locality Sensitive Hashing (LSH) and different types of collaborative-filtering recommendation systems. The dataset you will use is a subset of the Yelp dataset used in the previous assignments.
4. Tasks
Note: This assignment has been divided into two parts on Vocareum to provide more computational resources.
4.1 Task 1: Jaccard-based LSH (2 points)
In this task, you will implement the Locality Sensitive Hashing algorithm with Jaccard similarity. The focus is on “0 or 1” ratings rather than the actual ratings/stars from the users. Specifically, if a user has rated a business, the user’s entry in the characteristic matrix is 1; if the user has not rated the business, the entry is 0. You need to identify pairs of businesses whose Jaccard similarity is >= 0.5.
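As an illustration of the similarity measure, the sketch below treats each business as the set of users who rated it, which is equivalent to a 0/1 column of the characteristic matrix. The business and user IDs are made-up toy values, not taken from the Yelp data.

```python
def jaccard(users_a, users_b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = users_a | users_b
    if not union:
        return 0.0
    return len(users_a & users_b) / len(union)


# Hypothetical toy data: business -> set of users who rated it.
ratings = {
    "b1": {"u1", "u2", "u3"},
    "b2": {"u2", "u3", "u4"},
    "b3": {"u5"},
}

sim_b1_b2 = jaccard(ratings["b1"], ratings["b2"])  # 2 shared users out of 4 distinct users
sim_b1_b3 = jaccard(ratings["b1"], ratings["b3"])  # no shared users
```

Here b1 and b2 share two of four distinct users, so they sit exactly on the 0.5 threshold and would be reported.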
You can define any collection of hash functions that you think would result in a consistent permutation of the row entries of the characteristic matrix. Some potential hash functions are:
f(x) = (ax + b) % m    or    f(x) = ((ax + b) % p) % m
where p is any prime number and m is the number of bins. Please carefully design your hash functions.
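One possible way to generate a family of such hash functions is sketched below. The function name make_hash_functions, the choice p = 1,000,000,007, the random seed, and the way a and b are drawn are all assumptions for illustration; the assignment leaves these design choices to you.

```python
import random


def make_hash_functions(n, m, p=1_000_000_007, seed=0):
    """Return n functions of the form f(x) = ((a*x + b) % p) % m.

    p is a prime (assumed value); a and b are drawn at random per function.
    """
    rng = random.Random(seed)
    funcs = []
    for _ in range(n):
        a = rng.randrange(1, p)  # a must be non-zero so x actually affects the result
        b = rng.randrange(0, p)
        # Bind a and b as default arguments so each lambda keeps its own pair.
        funcs.append(lambda x, a=a, b=b: ((a * x + b) % p) % m)
    return funcs


hash_funcs = make_hash_functions(n=4, m=100)
values = [f(42) for f in hash_funcs]  # one bin index per hash function
```

The input x is assumed to be an integer row index, for example a user ID mapped to 0, 1, 2, ... beforehand.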
After you have defined all the hash functions, you will build the signature matrix. Then you will divide the matrix into b bands of r rows each, where b x r = n (n is the number of hash functions). Select a good combination of b and r in your implementation (b > 1 and r > 1). Remember that two items are a candidate pair if their signatures are identical in at least one band.
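The signature and banding steps can be sketched in plain Python as follows. This is a minimal single-machine illustration, not a required structure: the helper names, the (a, b) parameter pairs, and the toy signatures are assumptions, and a real submission would pick its own n, b, and r.

```python
from collections import defaultdict
from itertools import combinations


def minhash_signature(user_ids, hash_params, p, m):
    """One signature column: for each (a, b), the minimum hash over the item's rows."""
    return [min(((a * x + b) % p) % m for x in user_ids) for a, b in hash_params]


def candidate_pairs(signatures, bands, rows):
    """Pairs of items whose signatures agree on every row of at least one band."""
    candidates = set()
    for band in range(bands):
        buckets = defaultdict(list)
        for item, sig in signatures.items():
            key = tuple(sig[band * rows:(band + 1) * rows])
            buckets[key].append(item)
        for items in buckets.values():
            for pair in combinations(sorted(items), 2):
                candidates.add(pair)
    return candidates


# Hypothetical example: rows {1, 2}, two hash functions, p = 11, m = 5.
example_sig = minhash_signature({1, 2}, [(1, 0), (2, 1)], p=11, m=5)

# Hypothetical signatures with n = 4, split into 2 bands of 2 rows.
toy_signatures = {"x": [1, 2, 3, 4], "y": [1, 2, 9, 9], "z": [7, 7, 7, 7]}
toy_candidates = candidate_pairs(toy_signatures, bands=2, rows=2)
```

In the toy signatures, x and y agree on the first band only, which is enough to make them a candidate pair; z matches neither.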
Your final results will be the candidate pairs whose original Jaccard similarity is >= 0.5. You need to write the final results into a CSV file according to the output format below.
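Because banding can produce false positives, each candidate pair still has to be checked against the original data. The sketch below shows one way to do that check and write the survivors with Python's csv module. The header names are placeholders only; use the columns given in the assignment's output format. The toy ratings and candidates are made up.

```python
import csv
import io


def verify_candidates(candidates, ratings, threshold=0.5):
    """Keep candidate pairs whose exact Jaccard similarity meets the threshold."""
    rows = []
    for a, b in candidates:
        users_a, users_b = ratings[a], ratings[b]
        union = users_a | users_b
        sim = len(users_a & users_b) / len(union) if union else 0.0
        if sim >= threshold:
            rows.append((a, b, sim))
    return sorted(rows)


# Hypothetical toy data: business -> set of users who rated it.
ratings = {
    "b1": {"u1", "u2", "u3"},
    "b2": {"u2", "u3", "u4"},
    "b3": {"u1", "u9"},
}
candidates = {("b1", "b2"), ("b1", "b3")}
rows = verify_candidates(candidates, ratings)

# Written to an in-memory buffer here; a real run would open the output file instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["item_a", "item_b", "similarity"])  # placeholder header, not the required one
writer.writerows(rows)
```

The pair (b1, b3) shares one of four distinct users, so it is dropped even though it was a candidate.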