• Type: Project report
• Learning Objectives Assessed: 1, 2, 3, 4, 5
• Deliverables:
o Written report submitted via Turnitin,
o RapidMiner process, and
o Oral presentation and Q&A
• Weight: 70%
This is an individual assignment. The aim is to provide experience in the steps involved in text processing; in creating, evaluating, and improving models; and finally in presenting and interpreting a model in a business report. You are strongly encouraged to commence this assignment by the end of week 7 of the semester, and you should progress thoughtfully through the steps. Hasty decisions made early in the design process may result in much more work later.
Feel free to discuss concepts and ideas with peers, but remember your submission must be your own work. Be careful not to allow anyone to copy your work.
Online reviews help consumers reduce uncertainty and risks faced in purchase decision-making by providing information about products and services. However, the overwhelming amount of data continually being produced in online review platforms introduces a challenge for customers to read and judge the reviews. Reviews may appear on the company's websites, social media, or review platforms. Companies are aware of the impacts of online reviews on consumer attitudes and behaviors. Consumers also expect reviews to be HELPFUL to assist them in making more informed purchasing decisions.
On Amazon.com, product reviews are available, and customers can read them before making a purchase (see Figure 1).
Figure 1 Sample reviews on a product in Amazon.com
When customers read a review, they can indicate whether it was HELPFUL (see Figure 2).
Figure 2 Voting if a review is HELPFUL in Amazon.com
Before starting the assignment and going through the rest of the assignment specifications, you need to read some reviews on Amazon.com and familiarize yourself with how the review platform works.
Business problem: Some online reviews are not read by any customers. Amazon.com sorts reviews either based on recency or top score (see Figure 3).
Figure 3 Sorting reviews in Amazon.com
Top reviews are those that are read by the customers and have received more HELPFUL votes.
Under this sorting mechanism, a fair and in-depth review that is not recent and has not been scored by readers won't be very visible on the platform. If the product has hundreds of reviews, customers are likely to miss such a review, simply because they do not have enough time to read all the reviews and the platform sorts them only by recency or top score.
Importance and motivation: The model you develop will predict whether reviews would be HELPFUL. To develop the model, you will use the HELPFUL values of existing reviews. This model can be used as an assistive tool on Amazon.com in several ways. For example, Amazon.com can add another metric for sorting reviews, named 'projected helpfulness'. Readers can then sort the reviews based on the predicted helpfulness value provided by your model. This way, Amazon.com ensures that no valuable review is missed among hundreds of reviews on a product.
The A2 dataset consists of reviews of different products on Amazon.com. Reviews include product and user information, ratings, and a plain text review. Table 1 shows the regular attributes of the dataset.
Table 1 The regular attributes of review dataset from Amazon.com
Regular attribute | Description
Product_Score | Rating from 1 to 5 given by the reviewer for the quality of the product on which the review has been written
Average_Product_Score |
Product_Category | The category of the product on which the review has been written
Reviewer_Helpfulness | An index showing how helpful the reviewer who has written the review has been
Reviewer_Activity | An index showing how active the reviewer who has written the review has been
Review_Order | The order of the review among the reviews written on the product
Review_Count | Total number of reviews on the product on which the review is written
Review_Summary | A summary of the review written by the reviewer in the title of the review
Review_Text | The body of the review
There are three ID attributes, namely, Review_ID, Reviewer_ID, and Product_ID. You don't need to use these attributes in model building. Table 2 shows the label attributes of the dataset.
Table 2 The target attributes of review dataset from Amazon.com
Special attribute | Description
Total_Reads | Total number of people who have read the review. You need to use this attribute as the label in developing the prediction model.
Helpfulness_Label | A label showing if the review has been helpful. You need to use this attribute as the label in developing the classification model.
Important note: You should not use the above special attributes as regular attributes. You won't get any marks for developing a model that uses either Total_Reads or Helpfulness_Label as a regular attribute.
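The rule above can be illustrated outside RapidMiner with a minimal Python sketch. It assumes each review is held as a dictionary keyed by the attribute names from Table 1 and Table 2; the function name split_record and the sample row values (including the label value "helpful") are hypothetical and are not taken from the A2 dataset.

```python
# Attribute names come from the assignment text; everything else is illustrative.
ID_ATTRIBUTES = {"Review_ID", "Reviewer_ID", "Product_ID"}
LABEL_ATTRIBUTES = {"Total_Reads", "Helpfulness_Label"}


def split_record(record):
    """Separate one review into model inputs and labels.

    ID attributes are dropped, and the two special attributes are kept
    only as labels so they can never leak into the inputs.
    """
    excluded = ID_ATTRIBUTES | LABEL_ATTRIBUTES
    features = {k: v for k, v in record.items() if k not in excluded}
    labels = {k: v for k, v in record.items() if k in LABEL_ATTRIBUTES}
    return features, labels


# Hypothetical row, invented for illustration only.
row = {
    "Review_ID": "R1",
    "Product_Score": 4,
    "Review_Text": "Works well",
    "Total_Reads": 12,
    "Helpfulness_Label": "helpful",
}
features, labels = split_record(row)
```

In RapidMiner the same separation is normally achieved by assigning attribute roles rather than by writing code; the sketch only shows which columns belong on which side.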
Your task is to:
• develop one model to predict the Total_Reads of the newly posted reviews, and
• develop another model to assign a value to the Helpfulness_Label attribute of newly posted reviews.
Your approach should involve the following tasks:
• Data exploration and preparation
o Structured data preparation. You need to discuss the basic statistics of the available data, including whether or not it is balanced.
o Unstructured data (text) preparation. It should at least include TF-IDF & SVD generation and sentiment analysis. You may also consider topic modeling.
• Model building (classification), improvement, and evaluation
o You can use any of the classification techniques and ensemble methods to develop a classification model that assigns a Helpfulness_Label value to newly posted reviews.
• Model building (prediction) and evaluation
o You can use NN to develop a model for predicting Total_Reads. You can use the same input variables in both the prediction and classification models.
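To make the TF-IDF step of the text preparation task concrete, here is a minimal sketch in plain Python. It is not the RapidMiner process required for submission. It assumes one common weighting, tf = term count / document length and idf = log(N / document frequency); a tool may use a different variant. The function names and the three sample review texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(documents):
    """Return one {term: weight} dictionary per document.

    Assumed weighting: (count / document length) * log(N / df).
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical review texts, invented for illustration only.
reviews = [
    "Great battery, great screen",
    "Battery died after a week",
    "Screen is sharp and bright",
]
vectors = tf_idf(reviews)
```

A term that is frequent in one review but rare across the collection, such as "great" in the first text, receives a higher weight than a term shared by several reviews. The SVD step would then reduce the resulting term-weight matrix to a smaller number of dimensions before modeling.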