Description: This program will extract headlines from a news web page and perform some text analysis on them.
Input: No user-provided input. Data will be collected from a news website of your choice.
Output:
Print the headlines
Generate a wordcloud for the words/bigrams in the headlines
Calculate the sentiment
See details in the Procedure.
Procedure:
1. Import the needed libraries
These are some of the libraries you will likely need:
# import the libraries
import bs4 as bs
import requests
import matplotlib.pyplot as plt   # needed to display the wordcloud (step 11)
from wordcloud import WordCloud
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
2. Define the target URL and open it
# defining the target source
url = 'https://www.example-news-site.com'   # replace with the news site of your choice
# getting the content
body = requests.get(url)
3. Load the page into your “soup” (assuming you are using BeautifulSoup)
soup = bs.BeautifulSoup(body.content,'html.parser')
4. Create an empty list to host the list of words from the headlines
5. Loop through the “soup”, looking for the section that contains the headlines
6. Transform each story heading/headline into a string first and then into a list of words (see the example below)
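This is just a sketch of how steps 4–6 might look; the 'h3' tag is only an assumption, so inspect your target site to find the right tag or class for the headlines
# create an empty list to host the words from the headlines
headline_words = []
headlines = []

# loop through the soup, looking for the headline tags
# NOTE: 'h3' is only an assumption; check the HTML of the site you chose
for tag in soup.find_all('h3'):
    headline = tag.get_text().strip()
    if headline:
        headlines.append(headline)
        # transform the headline into a string first and then into a list of words
        headline_words.extend(headline.lower().split())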
7. Remove all the non-semantically relevant words (the “stopwords”) from the list you created, using the attached file “stopwords_en.txt” as the list of stopwords. Feel free to update the list, adding words that may be too frequent and, in your opinion, not very relevant (explaining why you want to remove them). Also filter out non-alphabetical elements and perform any other preliminary cleaning of the text that you may require
# stopwords file is attached
with open('stopwords_en.txt') as f:
    stopwords = f.read().split()

# adding words to the stopwords
stopwords.extend(['dallas', 'texas', 'city'])
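This is just an example of how the cleaning could look, assuming the word list from steps 4–6 is called headline_words (an assumption)
# keep only alphabetical words that are not in the stopword list
clean_words = [w for w in headline_words if w.isalpha() and w not in stopwords]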
8. Loop through the list of clean headlines and print the headlines with the highest and the lowest sentiment (3 each)
This is just an example of how to do the Sentiment Analysis
##### Sentiment Analysis #####
# calculating the sentiment using the vader library
analyzer = SentimentIntensityAnalyzer()
# vader needs strings as input. Transforming the list into a string
clean_text_str_pro = ' '.join(Pro_words)
vad_sentiment = analyzer.polarity_scores(clean_text_str_pro)
pos_pro = vad_sentiment['pos']
neg_pro = vad_sentiment['neg']
neu_pro = vad_sentiment['neu']
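The example above only computes the overall scores. This is just a sketch of how you might rank the individual headlines for step 8, assuming the headlines are stored in a list called headlines (an assumption); vader's compound score is used as the overall measure
# score each headline and sort by the compound (overall) score
scored = [(analyzer.polarity_scores(h)['compound'], h) for h in headlines]
scored.sort(reverse=True)

print('--- 3 headlines with the highest sentiment ---')
for score, headline in scored[:3]:
    print(f'{score:+.3f}  {headline}')

print('--- 3 headlines with the lowest sentiment ---')
for score, headline in scored[-3:]:
    print(f'{score:+.3f}  {headline}')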
9. Extract the bigrams, generating a separate list. Consider as bigrams two words that appear together more than 2 times in the whole text. Bigrams will look like “word1_word2”, i.e. you will create a new string composed of the 2 words, separated by an underscore (“_”)
10. Merge the list of single words with the list of bigrams
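This is just an example of how steps 9 and 10 could be done, assuming the cleaned word list is called clean_words (an assumption)
from collections import Counter

# count how often each pair of consecutive words appears in the whole text
pair_counts = Counter(zip(clean_words, clean_words[1:]))

# keep only the pairs appearing more than 2 times, joined by an underscore
bigrams = [f'{w1}_{w2}' for (w1, w2), count in pair_counts.items() if count > 2]

# merge the list of single words with the list of bigrams
all_words = clean_words + bigrams
The merged list (here all_words) is what you would then feed into the wordcloud in step 11, e.g. as ' '.join(all_words)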
11. Create a wordcloud with the resulting list. If wordcloud is not available on your computer, either use an online option (see previous assignments) or calculate the sentiment as in previous assignments
This is an example of how to do the Word Cloud
##### Word cloud #####
print('\n\n--- Generating the wordcloud')

# Transforming the list of words into a string
Pro_words_string = ' '.join(Pro_words)

# Defining the wordcloud parameters
wc = WordCloud(background_color="white", max_words=2000, stopwords=stopwords)

# Generating the wordcloud from the string
wc.generate(Pro_words_string)

# Store to file
#wc.to_file('Pro.png')

# Show the cloud
plt.imshow(wc)
plt.axis('off')
plt.show()
12. Submit the .py file