Implement a FeatureCreator for an individual BoW for each sentence. The features for the classifier are then the union of both feature sets.

INSTRUCTIONS TO CANDIDATES

ANSWER ALL QUESTIONS

Assignment 2: Question Pair Identification

On websites like Quora, users can ask any question and other users will try to answer this question. If new questions are asked, it would be helpful to automatically determine whether a question with the same intent has already been asked. This will be your task for this assignment.

You are given manually labeled data consisting of pairs of questions. Each pair is labeled with a class (0 or 1) that indicates whether the two questions have the same intent or a different one. In the first step, you create the features for a classifier; in the second step, you use a classifier to classify the examples. You should create the feature set yourself, but you can use an existing classifier from sklearn for the classification.

Use the training data to train your model and report the scores on the test data. If you need to try different hyperparameters, use the validation set to select the best ones.

Framework

The basic framework already exists. You can find the code in a git repository. Clone the code with: git clone

In addition, you will need the libraries sklearn and scipy.

The data is in asgmt2/data

There are three sets: train.*, test.* and valid.*. Each set consists of three files, and each file has one example per line.

$set.q1.txt: The first question of a question pair

$set.q2.txt: The second question of a question pair

$set.class.txt: Class for this question pair

You find the code in asgmt2/src
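
The three files of one set can be read into parallel lists. The following is a minimal sketch, not part of the provided framework: the function name load_split is hypothetical, and it assumes UTF-8 text files and an integer label (0 or 1) on every line of the class file.

```python
from pathlib import Path


def load_split(data_dir, split):
    """Read one set (e.g. "train") as parallel lists: q1, q2, labels.

    Hypothetical helper; assumes files named <split>.q1.txt,
    <split>.q2.txt and <split>.class.txt with one example per line.
    """
    base = Path(data_dir)

    def read_lines(name):
        with open(base / f"{split}.{name}.txt", encoding="utf-8") as handle:
            return [line.rstrip("\n") for line in handle]

    q1 = read_lines("q1")
    q2 = read_lines("q2")
    labels = [int(value) for value in read_lines("class")]

    # The three files must describe the same examples, line by line.
    if not (len(q1) == len(q2) == len(labels)):
        raise ValueError(f"files of split '{split}' differ in length")
    return q1, q2, labels
```

A call such as load_split("asgmt2/data", "valid") would then return the validation questions and their classes.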

Files:

- train.py: The main file for training and evaluating the classifier

- FeatureCreator.py: Template of how to create features. One method takes the training data and creates all possible features. The other creates the features for a specific example.
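
The two-method structure described for FeatureCreator.py can be illustrated with a small uni-gram bag-of-words class. This is only a sketch under assumptions: the class name BowFeatureCreator, the method names fit and features, lowercasing, and whitespace tokenization are all hypothetical and may differ from the template in the repository.

```python
from collections import Counter


class BowFeatureCreator:
    """Hypothetical single BoW over both questions of a pair."""

    def __init__(self):
        self.index = {}  # word -> feature id

    def fit(self, q1_list, q2_list):
        """Create all possible features from the training data."""
        for text in list(q1_list) + list(q2_list):
            for token in text.lower().split():
                self.index.setdefault(token, len(self.index))
        return self

    def features(self, q1, q2):
        """Create the features {feature id: count} for one example."""
        counts = Counter()
        for token in (q1 + " " + q2).lower().split():
            if token in self.index:  # words unseen in training are dropped
                counts[self.index[token]] += 1
        return dict(counts)
```

Returning a sparse dictionary per example keeps memory use low, because most vocabulary words do not occur in a given question pair.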

Tasks:

1. In FeatureCreator, a single BoW uni-gram feature set is implemented. Extend this implementation in two ways:

a. Implement a FeatureCreator for an individual BoW for each sentence. The features for the classifier are then the union of both feature sets.

b. Implement a filter on the word features based on their frequency in the training data. In one experiment, take only the most frequent words; in a second experiment, ignore the most frequent words.
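
Both extensions of task 1 can be sketched together. The code below is one possible reading, not the reference solution: build_vocab and pair_features are hypothetical names, tokenization is plain lowercased whitespace splitting, and the cut-offs keep_top and skip_top are illustrative parameters whose values would have to be chosen on the validation set.

```python
from collections import Counter


def tokenize(text):
    return text.lower().split()


def build_vocab(q1_list, q2_list, keep_top=None, skip_top=0):
    """Map words to ids, filtered by training-data frequency.

    skip_top=N ignores the N most frequent words;
    keep_top=N keeps only the N most frequent of the remaining words.
    """
    freq = Counter()
    for text in list(q1_list) + list(q2_list):
        freq.update(tokenize(text))
    # Most frequent first; ties broken alphabetically for reproducibility.
    ranked = [w for w, _ in sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))]
    ranked = ranked[skip_top:]
    if keep_top is not None:
        ranked = ranked[:keep_top]
    return {word: i for i, word in enumerate(ranked)}


def pair_features(q1, q2, vocab):
    """Union of two BoW sets: ids [0, V) for q1 and [V, 2V) for q2."""
    size = len(vocab)
    feats = Counter()
    for token in tokenize(q1):
        if token in vocab:
            feats[vocab[token]] += 1
    for token in tokenize(q2):
        if token in vocab:
            feats[size + vocab[token]] += 1
    return dict(feats)
```

Offsetting the ids of the second question by the vocabulary size keeps the two feature sets disjoint, so the same word gets a separate weight depending on which question it appears in.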

2. Train a classifier on your different features and evaluate them on the test data. (To get some initial results faster, it might be helpful to also run some experiments where you use the validation set as your training set.)

a. Use a Naïve Bayes classifier (MultinomialNB())

b. Use a logistic regression (LogisticRegression())

c. Discuss the results in the report
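
Training and comparing the two classifiers could look like the sketch below. It assumes that each example is already a dictionary {feature id: count}; the helpers to_matrix and evaluate are hypothetical names, accuracy is an assumed choice of score (the assignment does not name a metric), and max_iter=1000 is an illustrative setting, not a tuned value.

```python
from scipy.sparse import csr_matrix
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB


def to_matrix(feature_dicts, n_features):
    """Stack sparse feature dictionaries into a scipy CSR matrix."""
    rows, cols, vals = [], [], []
    for row, feats in enumerate(feature_dicts):
        for col, value in feats.items():
            rows.append(row)
            cols.append(col)
            vals.append(value)
    shape = (len(feature_dicts), n_features)
    return csr_matrix((vals, (rows, cols)), shape=shape, dtype=float)


def evaluate(train_feats, y_train, test_feats, y_test, n_features):
    """Fit both classifiers and return their accuracy on the test data."""
    x_train = to_matrix(train_feats, n_features)
    x_test = to_matrix(test_feats, n_features)
    models = {
        "naive_bayes": MultinomialNB(),
        "logistic_regression": LogisticRegression(max_iter=1000),
    }
    scores = {}
    for name, model in models.items():
        model.fit(x_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(x_test))
    return scores
```

MultinomialNB expects non-negative feature values, which holds for word counts. For quick experiments, the validation features can be passed in place of the training features.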

3. Create a feature set that looks at the difference between the questions instead of at each question individually

a. It should use the following features for each word of the vocabulary

i. How often does the word occur in both questions

ii. How much more often does the word occur in one question than in the other

b. Discuss your results. Take into consideration that you are using linear models.
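
The two per-word features of task 3 can be sketched as follows. This is one possible interpretation, stated as an assumption: "occurs in both questions" is read as the minimum of the two counts, and "more often in one question" as the absolute difference of the counts. The function name difference_features is hypothetical, and vocab is assumed to be a dictionary from word to id.

```python
from collections import Counter


def difference_features(q1, q2, vocab):
    """Per-word overlap and surplus counts for one question pair.

    ids [0, V): shared occurrences, min(count in q1, count in q2)
    ids [V, 2V): surplus occurrences, |count in q1 - count in q2|
    """
    c1 = Counter(q1.lower().split())
    c2 = Counter(q2.lower().split())
    size = len(vocab)
    feats = {}
    for word in set(c1) | set(c2):
        if word not in vocab:
            continue
        shared = min(c1[word], c2[word])
        surplus = abs(c1[word] - c2[word])
        if shared:
            feats[vocab[word]] = shared
        if surplus:
            feats[size + vocab[word]] = surplus
    return feats
```

A point that may be relevant for the discussion: with a separate BoW per question, a linear model only adds up one weight for a word in the first question and one for the same word in the second, so it cannot express that the word appears in both. The overlap feature makes this interaction an explicit input.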

4. Improve the results by using a different representation of the questions
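
The assignment leaves the choice of representation open. As one illustrative option among many, each question could be represented as a set of character n-grams, which is less sensitive to inflection and spelling variants than whole words. The sketch below is an assumption, not a prescribed method: char_ngrams and jaccard are hypothetical names, and n=3 is an arbitrary default.

```python
def char_ngrams(text, n=3):
    """Set of character n-grams of the lowercased, space-padded text."""
    padded = f" {' '.join(text.lower().split())} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(q1, q2, n=3):
    """Jaccard similarity of the two questions' character n-gram sets."""
    a, b = char_ngrams(q1, n), char_ngrams(q2, n)
    if not a and not b:
        return 0.0  # both questions too short to yield any n-gram
    return len(a & b) / len(a | b)
```

The similarity value could be added as an extra feature, or the n-grams themselves could replace words in the feature sets of the earlier tasks.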

Submission

The submission to Canvas should respect the following guidelines:

• Submit a pdf file with the report and a zip file with the code

• Write your first and last name at the beginning of the report

• Submit the code as a zip file, not as a rar file

• When using Jupyter notebooks, also add the report as a PDF

• Name the zip file with the code lastname_firstname.zip and the pdf file with the report lastname_firstname.pdf
