Python Programming

This assignment is to scrape consumer reviews from a set of web pages and evaluate the performance of text classification on the data.

INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS

The objective of this assignment is to scrape consumer reviews from a set of web pages and evaluate the performance of text classification on the data. The reviews have been divided into five categories here:

http://mlg.ucd.ie/modules/yalp

 

Each review has a star rating. For this assignment, we will assume that 1-star to 3-star reviews are “negative” and 4-star to 5-star reviews are “positive”.
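The labelling rule above can be sketched as a small helper. The function name `label_from_stars` is hypothetical and only illustrates the rule; it assumes the rating is already available as an integer from 1 to 5.

```python
def label_from_stars(stars: int) -> str:
    """Map a 1-5 star rating to the assignment's binary class label."""
    if not 1 <= stars <= 5:
        raise ValueError(f"star rating out of range: {stars}")
    # 4 and 5 stars count as positive; 1, 2 and 3 stars count as negative.
    return "positive" if stars >= 4 else "negative"


labels = [label_from_stars(s) for s in (1, 3, 4, 5)]
```

Raising on out-of-range values is a design choice here: it makes scraping mistakes visible early instead of silently producing a wrong label.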

The assignment should be implemented as a single Jupyter Notebook (not a script). Your notebook should be clearly documented, using comments and Markdown cells to explain the code and results. The assignment can be completed either individually or in pairs.

 

Tasks:

In this assignment you should complete all of the following tasks:

Select two review categories of your choice. Scrape all reviews for each category and store them as two separate datasets. For each review, you should store the review text and a class label (i.e. whether the review is “positive” or “negative”). 
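One possible shape for the parsing half of the scraping step is sketched below using BeautifulSoup. The page markup is not described in the brief, so the selectors here (`div.review`, an `img` whose `alt` reads like `4-star`, and `p.review-text`) are assumptions made up for illustration; inspect the real pages and adjust them. In practice the HTML string would come from something like `requests.get(url).text`, which is left out so the sketch runs offline.

```python
import re

from bs4 import BeautifulSoup


def parse_reviews(html: str) -> list[tuple[str, str]]:
    """Extract (review_text, label) pairs from one page of reviews.

    ASSUMPTION: each review is a <div class="review"> holding an
    <img alt="N-star"> rating and a <p class="review-text"> body.
    These selectors are hypothetical, not taken from the real site.
    """
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for block in soup.find_all("div", class_="review"):
        rating_img = block.find("img", alt=re.compile(r"^\d-star$"))
        body = block.find("p", class_="review-text")
        if rating_img is None or body is None:
            continue  # skip entries that do not match the assumed layout
        stars = int(rating_img["alt"][0])
        label = "positive" if stars >= 4 else "negative"
        rows.append((body.get_text(strip=True), label))
    return rows


# Made-up sample markup, used only to exercise the parser.
SAMPLE_HTML = """
<div class="review"><img alt="5-star"><p class="review-text">Great food.</p></div>
<div class="review"><img alt="2-star"><p class="review-text">Slow service.</p></div>
"""
reviews = parse_reviews(SAMPLE_HTML)
```

Running the parser over every page of a category and collecting the pairs into a Pandas DataFrame would give one dataset per category, as the task requires.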

 

For both category datasets: 

From the reviews in this category, apply appropriate preprocessing steps to create a numeric representation of the data, suitable for classification.

Build a classification model using a classifier of your choice, to distinguish between “positive” and “negative” reviews.

Test the predictions of the classification model using an appropriate evaluation strategy. Report and discuss the evaluation results in your notebook.
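The three steps above could be combined roughly as follows. This is a minimal sketch, not a prescribed solution: TF-IDF features, logistic regression, and stratified k-fold cross-validation are one reasonable combination among many, and the review texts are toy placeholders standing in for a scraped category dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline


def build_model():
    """Preprocessing plus classifier in one pipeline, so the vectorizer
    is re-fitted on the training part of every fold (no test leakage)."""
    return make_pipeline(
        TfidfVectorizer(lowercase=True, stop_words="english"),
        LogisticRegression(max_iter=1000),
    )


# Toy placeholder data (NOT real reviews), only to make the sketch runnable.
texts = [
    "excellent meal and friendly staff",
    "wonderful room with a lovely view",
    "fantastic service, highly recommended",
    "great value and very clean",
    "terrible food and rude waiter",
    "dirty room and awful smell",
    "poor service, would not return",
    "overpriced and disappointing visit",
]
labels = ["positive"] * 4 + ["negative"] * 4

# Stratified folds keep the positive/negative ratio similar in every split.
cv = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)
scores = cross_val_score(
    build_model(), texts, labels, cv=cv, scoring="balanced_accuracy"
)
mean_score = scores.mean()
```

Balanced accuracy is used here on the assumption that the two classes may be unevenly sized; plain accuracy, F1, or a confusion matrix are equally valid things to report and discuss.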

Evaluate how well your two classification models transfer between categories. That is, run experiments to:

Train a classification model on the data from “Category A”, and evaluate its performance on the data from “Category B”.

Train a classification model on the data from “Category B”, and evaluate its performance on the data from “Category A”.
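The two transfer experiments can be sketched with a single helper that trains on one category and scores on the other. The helper name `transfer_accuracy`, the choice of TF-IDF with logistic regression, and the toy texts are all assumptions for illustration. The point to notice is that the vectorizer's vocabulary is learned from the training category only.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline


def transfer_accuracy(train_texts, train_labels, test_texts, test_labels):
    """Fit on one category, then score on a different category."""
    model = make_pipeline(
        TfidfVectorizer(stop_words="english"),
        LogisticRegression(max_iter=1000),
    )
    model.fit(train_texts, train_labels)  # vocabulary comes from training data only
    predictions = model.predict(test_texts)
    return accuracy_score(test_labels, predictions)


# Toy placeholder datasets (NOT real reviews) for two hypothetical categories.
cat_a_texts = ["great meal", "awful meal", "lovely dinner", "terrible dinner"]
cat_a_labels = ["positive", "negative", "positive", "negative"]
cat_b_texts = ["great room", "awful room", "lovely hotel", "terrible hotel"]
cat_b_labels = ["positive", "negative", "positive", "negative"]

a_to_b = transfer_accuracy(cat_a_texts, cat_a_labels, cat_b_texts, cat_b_labels)
b_to_a = transfer_accuracy(cat_b_texts, cat_b_labels, cat_a_texts, cat_a_labels)
```

Comparing `a_to_b` and `b_to_a` against the within-category results gives a direct view of how much performance is lost when the model meets vocabulary from an unfamiliar category.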

 

Guidelines:

The assignment can be completed either individually or in pairs. Any evidence of plagiarism will result in a 0 grade.

For the assignment, only these third-party packages can be used: NumPy, Pandas, Scikit-learn, NLTK, SciPy, Requests, BeautifulSoup, Matplotlib, Seaborn, Gensim.
