R Programming
create a table with all these variables in order (date, text) and call it df. Each document’s data should be a row in the table

INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS

Instructions

The purpose is to use the provided text file to create the table shown below.

Section 1) Split the large string on a pattern

Section 2) Extract the dates

Section 3) Create a tibble

The assignment should take 15 to 30 minutes.

## # A tibble: 131 x 2
##    date           text
##    <date>         <chr>
##  1 2015-02-01     "MSNBC Febru~
##  2 2015-02-01     "MSNBC Febru~
##  3 2015-02-02     "MSNBC Febru~
## # ... with 128 more rows

First, download msnbc_text.TXT and load it with the readr package. Save it to a single string variable called text.
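A minimal sketch of the loading step. To keep it runnable without the download, it writes a tiny made-up stand-in file to a temporary path (the file contents and the `path` variable are assumptions for illustration); for the assignment, `path` would point at the downloaded msnbc_text.TXT instead.

```r
library(readr)

# Hypothetical stand-in for msnbc_text.TXT so the sketch runs anywhere;
# for the real assignment, set `path` to the downloaded file.
path <- tempfile(fileext = ".TXT")
write_file("\n1 of 2 DOCUMENTS\n\nMSNBC\n\n2 of 2 DOCUMENTS\n\nMSNBC\n", path)

# read_file() returns the whole file as one string
# (a character vector of length 1), not one element per line.
text <- read_file(path)

length(text)        # number of strings read in
is.character(text)  # confirm it is a character vector
```

Checking `length(text)` right away is a cheap way to confirm the file came in as a single string rather than line by line.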

1) Split the string based on pattern 

Right now, the text variable should be a single giant string that contains multiple documents.

If you open msnbc_text, notice how each document starts with something like 1 of 131 DOCUMENTS, 2 of 131 DOCUMENTS, and so on. This is the pattern that separates the documents in the file.

Instead of one big string, split the string (which should be in a variable called text at this point) on the pattern that separates each document and save it as a character vector.

You can do this by writing a regular expression that captures this pattern and then using str_split(text, pattern) %>% unlist() to split the single string you read in with readr::read_file() into separate documents.

Check the length of your new character vector (make sure you have a character vector and not a list). You should have 132 items in your vector, which is strange because we have 131 documents. If you did this correctly, R will have created a string containing only whitespace (newlines and spaces count as whitespace characters) as the first element. Check to make sure this is the case. If not, you did something wrong. If so, subset the vector so that it includes only items 2 onward, and save it back into the variable text.

Lastly, trim whitespace from both sides of each document in the vector
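The steps in this section can be sketched as follows. The sample string is invented to mimic the layout described above (two documents instead of 131), and the regular expression is one plausible way to capture the "N of M DOCUMENTS" separator, not necessarily the intended one.

```r
library(stringr)

# Made-up miniature of the file: leading whitespace, then two documents.
text <- paste0(
  "\n\n1 of 2 DOCUMENTS\n\nMSNBC\n\nFebruary 1, 2015 Sunday\n\nFirst transcript.\n\n",
  "2 of 2 DOCUMENTS\n\nMSNBC\n\nFebruary 2, 2015 Monday\n\nSecond transcript.\n"
)

# One or more digits, " of ", one or more digits, " DOCUMENTS"
pattern <- "\\d+ of \\d+ DOCUMENTS"

# str_split() returns a list; unlist() flattens it to a character vector.
text <- str_split(text, pattern) %>% unlist()

n_pieces <- length(text)             # one more than the number of documents
first_is_blank <- str_trim(text[1]) == ""

# Drop the whitespace-only first element, then trim both sides of each document.
text <- text[-1]
text <- str_trim(text, side = "both")
```

Because the separator appears before every document, including the first, the split always yields one extra leading piece; that is why the real file gives 132 elements for 131 documents.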

2) Extract the dates

You should notice another pattern in the text of each document: the date appears near the top in a specific format. Use this pattern to extract the date from each document and save the result in a variable called dates.
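A hedged sketch of the date extraction. It assumes the date is written as "Month D, YYYY" (for example "February 1, 2015"), which is a guess based on the "MSNBC Febru~" preview in the target table; check the actual file and adjust the pattern if the format differs. The sample documents are made up.

```r
library(stringr)

# Made-up documents, already split and trimmed.
text <- c(
  "MSNBC\n\nFebruary 1, 2015 Sunday\n\nFirst transcript.",
  "MSNBC\n\nFebruary 2, 2015 Monday\n\nSecond transcript."
)

# ASSUMPTION: capitalized month name, day, comma, four-digit year.
date_pattern <- "[A-Z][a-z]+ \\d{1,2}, \\d{4}"

# str_extract() returns the first match in each document (NA if none).
date_strings <- str_extract(text, date_pattern)

# Convert to Date so the tibble column prints as <date>.
# %B matches the full month name in the current locale (English assumed here).
dates <- as.Date(date_strings, format = "%B %d, %Y")
```

Any `NA` in `dates` means the pattern missed that document, so `sum(is.na(dates))` is a quick sanity check before building the table.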

3) Create a table with the data

Create a table with these variables in order (date, text) and call it df. Each document’s data should be a row in the table.
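As a rough illustration of the final step, the two vectors can be combined with tibble(). The `dates` and `text` values here are made-up placeholders standing in for the results of the earlier sections.

```r
library(tibble)

# Placeholder inputs; in the assignment these come from sections 1 and 2.
dates <- as.Date(c("2015-02-01", "2015-02-02"))
text <- c(
  "MSNBC\n\nFebruary 1, 2015 Sunday\n\nFirst transcript.",
  "MSNBC\n\nFebruary 2, 2015 Monday\n\nSecond transcript."
)

# Column order follows the argument order: date first, then text.
# Both vectors must have the same length, one element per document.
df <- tibble(date = dates, text = text)

df
```

With the real data, printing `df` should show 131 rows and the `<date>` and `<chr>` column types from the target output.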
