How to analyze the source of tweets on Twitter
INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS
8 Project Description
Due Week 13, Friday 11:59 pm
The company "Old School Business", also known as OSB, wants to start using social media to promote its business. They have approached your team with a request to find out what other businesses have done successfully using social media. OSB is particularly interested in Twitter and has asked your group to perform the following analysis. To begin, find a company that has a Twitter handle with more than 10,000 followers and 1,500 tweets, then perform the following tasks using the chosen Twitter handle.
8.1 Analysing the source of the tweets
The company wants to know how other companies and the public post their tweets. They want to use this information to understand if there is a relationship between the source of a tweet and the retweeting behavior.
- Use the rtweet library to download 1000 tweets that the company posted. Save these tweets as “tweets.company”.
- Use the rtweet library to download 1000 tweets about the company you selected. Save these tweets as “tweets.public”.
- Examine the source column of both the company and the public tweets to see the source of tweets. Find out how many different levels of sources exist in the public and company tweets.
- Draw a bar plot of the top 10 most frequent tweet sources for both company tweets and the public tweets. Label each bar with the source name.
- Comment on your bar plots.
- By using an appropriate statistical test, test whether retweeting is independent of the source of the tweets that the public posted. Use the “source” and “is_retweet” columns to get the source and retweet information. Group the sources as: “Salesforce - Social Studio”, “Twitter for Android”, “Twitter for iPad”, “Twitter for iPhone”, “Twitter Web App”, “Twitter Web Client” and “Other”.
- What is the conclusion of the test? Interpret your results.
- Calculate a 95% confidence interval for the mean text width of the tweets that the company posted. Use the “display_text_width” column to get this information.
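The independence test and the confidence interval in the list above can be illustrated on toy data. The brief requires the submitted analysis to be written in R (where `chisq.test` covers the first calculation); the sketch below is plain Python and only shows the arithmetic. It assumes Pearson's chi-square test is the "appropriate statistical test" and that the interval is for the mean width using a large-sample normal approximation. The function names and the toy rows are hypothetical, not part of the assignment.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist, mean, stdev

# Source groups named in the brief; labels must match the data's spelling exactly.
GROUPS = {
    "Salesforce - Social Studio", "Twitter for Android", "Twitter for iPad",
    "Twitter for iPhone", "Twitter Web App", "Twitter Web Client",
}

def group_source(source):
    """Map a raw source label to one of the listed groups, else "Other"."""
    return source if source in GROUPS else "Other"

def chi_square_independence(rows):
    """rows: iterable of (source, is_retweet). Returns (statistic, df)."""
    counts = Counter((group_source(s), bool(r)) for s, r in rows)
    sources = sorted({s for s, _ in counts})
    outcomes = sorted({r for _, r in counts})
    n = sum(counts.values())
    row_tot = {s: sum(counts[(s, r)] for r in outcomes) for s in sources}
    col_tot = {r: sum(counts[(s, r)] for s in sources) for r in outcomes}
    stat = 0.0
    for s in sources:
        for r in outcomes:
            # Expected count if source and retweeting were independent.
            expected = row_tot[s] * col_tot[r] / n
            stat += (counts[(s, r)] - expected) ** 2 / expected
    df = (len(sources) - 1) * (len(outcomes) - 1)
    return stat, df

def mean_ci(values, level=0.95):
    """Normal-approximation confidence interval for the mean (assumes large n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * stdev(values) / sqrt(len(values))
    centre = mean(values)
    return centre - half_width, centre + half_width

# Hypothetical toy data standing in for the "source" and "is_retweet" columns.
toy_rows = (
    [("Twitter for iPhone", True)] * 30 + [("Twitter for iPhone", False)] * 20
    + [("Twitter Web App", True)] * 10 + [("Twitter Web App", False)] * 40
)
stat, df = chi_square_independence(toy_rows)
print(f"chi-square = {stat:.3f}, df = {df}")
print(mean_ci([100, 120, 140, 120]))
```

The statistic is then compared with a chi-square distribution on `df` degrees of freedom to obtain a p-value. The chi-square approximation is usually considered reliable only when the expected counts are not too small, which is one reason for merging rare sources into "Other".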
8.2 Themes in public and company tweets
To be successful on Twitter, a company needs to provide useful information to its followers and encourage customers to talk about their posts. We will examine this information so that we can suggest what OSB can tweet about. We do not want to present all tweets to OSB, so we must identify if there is a set of common tweet themes between the public and company tweets. This process involves:
- Combine tweets.public and tweets.company and save as tweets.
- Clean and pre-process the data (use TFIDF weights in your analysis).
- Compute the most appropriate number of clusters for the combined tweets using the elbow method with cosine distance.
- Cluster the tweets using the most appropriate clustering method.
- Visualize your clustering in 2-dimensional vector space. Show each cluster in a different colour and the tweets in tweets.public and tweets.company with different symbols in your visualization.
- Comment on your visualization.
- Compute the proportion of tweets.public in each cluster. Print these proportions.
- Which clusters are dominated by the public and which are dominated by the company?
- Draw a word cloud and a dendrogram of these two clusters to understand the theme of the clusters.
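Three of the calculations in the list above (TF-IDF weighting, cosine distance, and the public share of each cluster) can be sketched on toy data. This is plain Python for illustration only, since the brief asks for R. It assumes the weight is raw term frequency times log(N / document frequency), which is one common TF-IDF variant and may differ from the one taught in the unit. The function names, toy tokens and cluster labels are hypothetical.

```python
from collections import Counter
from math import log, sqrt

def tfidf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return [
        {t: tf * log(n / doc_freq[t]) for t, tf in Counter(doc).items()}
        for doc in docs
    ]

def cosine_distance(u, v):
    """1 - cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # convention chosen here for all-zero vectors
    return 1.0 - dot / (norm_u * norm_v)

def public_share(cluster_ids, is_public):
    """Proportion of public tweets in each cluster."""
    totals, public = Counter(cluster_ids), Counter()
    for cluster, flag in zip(cluster_ids, is_public):
        public[cluster] += flag  # True counts as 1, False as 0
    return {c: public[c] / totals[c] for c in totals}

# Hypothetical, already-cleaned tweets (lower-cased, stop words removed).
toy_docs = [
    ["new", "store", "opening"],
    ["store", "opening", "sale"],
    ["great", "service", "today"],
]
weights = tfidf(toy_docs)
print(cosine_distance(weights[0], weights[1]))  # shared terms: distance below 1
print(cosine_distance(weights[0], weights[2]))  # no shared terms: distance 1
print(public_share([1, 1, 2, 2, 2], [True, False, True, True, True]))
```

For vectors scaled to unit length, the squared Euclidean distance equals 2 × (1 − cosine similarity), so normalising each tweet vector before running a Euclidean method such as k-means is one way to cluster by cosine distance.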
8.3 Following friends
We are unsure if friending leads to an increase in popularity. To examine this, we will (you may use the twitteR package in this section):
- Find the 10 most popular friends of the chosen Twitter handle.
- Obtain a 1.5-degree egocentric graph centered at the chosen Twitter handle and plot the graph. The egocentric graph should contain the 10 most popular friends of the chosen Twitter handle.
- Compute the betweenness centrality score for each Twitter handle in your graph. List the top 3 most central people in your graph according to the betweenness centrality.
- Comment on your results.
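The betweenness step above can be illustrated on a small 1.5-degree egocentric graph, meaning the ego, its friends, and any links among those friends. The sketch below is plain Python for illustration (the brief asks for R) and uses Brandes' algorithm for an unweighted graph. It assumes links are treated as undirected, which is a simplification because following on Twitter is directed. The handles and edges are hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph.

    adj: {node: set of neighbours}. Returns unnormalised betweenness scores.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate path dependencies, farthest nodes first.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: c / 2 for v, c in score.items()}

def to_adjacency(edges):
    """Build a symmetric adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

# Hypothetical ego graph: "ego" has four friends, and friend_a is linked to friend_b.
toy_edges = [
    ("ego", "friend_a"), ("ego", "friend_b"),
    ("ego", "friend_c"), ("ego", "friend_d"),
    ("friend_a", "friend_b"),
]
scores = betweenness(to_adjacency(toy_edges))
top3 = sorted(scores, key=scores.get, reverse=True)[:3]
print(scores)
print(top3)
```

In a graph built this way the ego lies on most shortest paths between its friends, so it will typically rank first; the more informative comparison is among the friends themselves.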
Note that in Section 8.3, depending on the friends of the chosen Twitter handle, you may reach the rate limit of the Twitter API. I strongly recommend that you save your objects as an RData file once you download friends, so you can continue downloading friends the following day or with a different authentication key. For more information on how to save your objects see: https://stackoverflow.com/questions/19967478/how-to-save-data-file-into-rdata.
See https://developer.twitter.com/en/docs/basics/rate-limiting.html for more information on the rate limit.
It is also recommended that you save tweets.public and tweets.company once you have downloaded them. Otherwise, you will get different tweets each time you run your script and you will need to change your clustering.
The company wants the above three-part analysis to be written up as a professional report. Each part should have its own section of the report and all questions should have thoughtful answers. Include all the R code along with its output in your assignment. Output without the code, or code without the output, will result in zero marks for the relevant section.