A Scala-based program that reads both Twitter batch data and streaming data and runs sentiment analysis on them.
Almost all businesses today run social media accounts, hoping to win love and popularity among users. Yet an interesting question has been posed: do businesses actually benefit from running social media accounts, or do they do nothing but damage themselves? This program looks into businesses' Twitter accounts and runs sentiment analysis on each tweet and response generated by the businesses.
- sbt
- Apache Spark
- Spark SQL
- Spark Streaming
- Docker
- Twitter API v2
- Apache Parquet
- Subjectivity Lexicon (Link)
- Read Twitter batch data of a selected business account for the past 7 days.
- Load the batch data using Apache Spark and convert the data into Spark DataFrame.
- Transform the converted DataFrame so that only the tweet texts are fed to the sentiment analysis program.
- Run sentiment analysis on each tweet and response generated by businesses and return one of the following results: Positive, Negative, or Mixed.
- Read and process live Twitter stream data using Spark Structured Streaming in order to find the most popular topics of discussion on Twitter at a given moment.
- Read Twitter streaming data of a selected business account in real time and save every 10 new lines as a CSV file in Datalake1.
- Read newly generated CSV files in Datalake1 in real time using Spark Streaming, convert them into a DataFrame, extract only the tweet texts, and save them as a Parquet file in Datalake2.
- Load the Parquet files using Apache Spark and run sentiment analysis on each tweet and response generated in real time, returning one of the following results: Positive, Negative, or Mixed.
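The CSV-to-Parquet step above can be sketched with Spark Structured Streaming roughly as follows. The paths (`/tmp/datalake1`, `/tmp/datalake2`), the checkpoint location, and the CSV schema (in particular the `text` column) are assumptions for illustration, not the project's actual layout:

```scala
import org.apache.spark.sql.SparkSession

object TweetTextExtraction {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("TweetTextExtraction")
      .master("local[4]")
      .getOrCreate()
    import spark.implicits._

    // Watch Datalake1 for new CSV files produced by the stream reader.
    // Column names here are assumed; adjust to the files actually written.
    val raw = spark.readStream
      .option("header", "true")
      .schema("created_at STRING, id STRING, text STRING")
      .csv("/tmp/datalake1")

    // Keep only the tweet text and persist it as Parquet in Datalake2.
    val query = raw.select($"text")
      .writeStream
      .format("parquet")
      .option("path", "/tmp/datalake2")
      .option("checkpointLocation", "/tmp/checkpoints/tweet_text")
      .start()

    query.awaitTermination()
  }
}
```

Structured Streaming's file source picks up each newly arrived CSV file automatically, so this job can run continuously alongside the stream reader.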
In order to run this program properly, you will need to complete the following prerequisites:
- Be sure to create a Twitter API v2 key.
- Be sure to download Subjectivity Lexicon from the link above and upload it to the cluster where you will run your jar files.
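The Subjectivity Lexicon is a plain-text file of `key=value` fields per line. A minimal sketch of loading it into a word-to-polarity map is shown below; it assumes the MPQA clue format (`word1=...` and `priorpolarity=...` fields), which may differ from the exact file you download:

```scala
object LexiconLoader {
  // Parse one lexicon line into (word, polarity), e.g.
  //   type=weaksubj len=1 word1=abandon pos1=verb stemmed1=y priorpolarity=negative
  // yields Some(("abandon", "negative")). Malformed lines yield None.
  def parseLine(line: String): Option[(String, String)] = {
    val fields = line.split("\\s+").flatMap { token =>
      token.split("=", 2) match {
        case Array(k, v) => Some(k -> v)
        case _           => None
      }
    }.toMap
    for {
      word     <- fields.get("word1")
      polarity <- fields.get("priorpolarity")
    } yield word -> polarity
  }

  // Load the whole lexicon file into a word -> polarity map.
  def load(path: String): Map[String, String] =
    scala.io.Source.fromFile(path).getLines().flatMap(parseLine).toMap
}
```

Broadcasting the resulting map to Spark executors keeps the per-tweet lookup cheap.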
If all of the prerequisites above are met, go ahead and clone this repo by using the command below:
git clone https://github.com/spark131008/Twitter_Account_Sentiment_Analysis_Program.git
In order to create a jar file of each program, use the command below:
sbt package
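For `sbt package` to produce Spark-compatible jars, the build must target Scala 2.12 (matching the `/target/scala-2.12` path below) and declare Spark as a provided dependency. A minimal `build.sbt` sketch is shown here; the exact Spark and Scala patch versions are assumptions:

```scala
// build.sbt -- minimal sketch; version numbers are assumptions
ThisBuild / scalaVersion := "2.12.15"

libraryDependencies ++= Seq(
  // "provided" keeps Spark out of the packaged jar, since the
  // cluster's spark-submit supplies it at runtime.
  "org.apache.spark" %% "spark-sql"       % "3.2.1" % "provided",
  "org.apache.spark" %% "spark-streaming" % "3.2.1" % "provided"
)
```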
Once all jar files are created, copy the files located within /target/scala-2.12 directory and paste them to JVM or a local cluster. If you are running your cluster in a Docker container, use the command below:
docker cp ./target/scala-2.12/<Name of the jar file>.jar spark-master:/<Name of the jar file>.jar
In order to run a jar file using Apache Spark in a Docker container, use the command below:
docker exec spark-master bash -c "./spark/bin/spark-submit --class <Name of the class> --master local[4] /<Name of the jar file>.jar"
If you want to run sentiment analysis on filtered Twitter streaming data, please submit the jar files with spark-submit in the order below.
1. docker exec spark-master bash -c "./spark/bin/spark-submit --class TwitterStreamingDataProcessing --master local[4] /filtered_twitter_stream.jar"
2. docker exec spark-master bash -c "./spark/bin/spark-submit --class SparkStreaming --master local[4] /spark_streaming.jar"
3. docker exec spark-master bash -c "./spark/bin/spark-submit --class SentimentAnalysis --master local[4] /sentiment_analysis.jar"
Sundoo, Chase, Trenton, Josh
https://docs.google.com/presentation/d/1vG7IgBXfc0gUOD-RylH3TJpXcLts6diCY8UIR92Tm0M/edit?usp=sharing