SparkStream-for-meetup

Spark Streaming of RSVPs from the meetup.com API using Kafka. meetup.com provides a stream of RSVP data in JSON format. The stream is accessible at http://stream.meetup.com/2/rsvps

Getting this streaming data into Apache Spark Streaming is the first step toward performing analytics, recommendations, or visualizations on the data.

Technologies Used

Kafka 0.9.0.1
Spark 2.0.1

The Kafka Python API is used to interact with the Kafka cluster. PySpark is used to write the Spark Streaming jobs.

Execute the Application

Assuming appropriate versions of Kafka and Spark are installed, the following commands run the application.

Spark Streaming integration with Kafka 0.10.0.0 and above is still experimental, hence Kafka 0.9 is used (http://spark.apache.org/docs/latest/streaming-kafka-integration.html).

Run ZooKeeper, which Kafka relies on for coordination. Run this command from the Kafka root directory:

bin/zookeeper-server-start.sh config/zookeeper.properties

Start the Kafka server. Additional servers can be added as required.

bin/kafka-server-start.sh config/server.properties

Start Producer.py to read data from the meetup stream and store it in the "meetup" Kafka topic.
Start Consumer.py to consume the stream from the "meetup" topic.
Submit the Spark job spark_meetup.py to read the data from Kafka into Spark Streaming.
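
As a rough illustration of the producer step, here is a minimal sketch of reading the line-delimited JSON stream and forwarding each RSVP to the "meetup" topic. It is not the contents of Producer.py. The function names, the use of the `requests` package, and the broker address `localhost:9092` are all assumptions made for this example.

```python
import json


def iter_rsvps(lines):
    """Yield one dict per well-formed JSON line; skip blank or malformed lines."""
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        raw = raw.strip()
        if not raw:
            continue  # blank line, nothing to parse
        try:
            rsvp = json.loads(raw)
        except ValueError:
            continue  # partial or malformed line
        if isinstance(rsvp, dict):
            yield rsvp


def forward_rsvps(lines, send, topic="meetup"):
    """Pass each RSVP to send(topic, value) as UTF-8 JSON bytes; return the count sent."""
    sent = 0
    for rsvp in iter_rsvps(lines):
        send(topic, json.dumps(rsvp).encode("utf-8"))
        sent += 1
    return sent


def main():
    # Hypothetical wiring: needs the `requests` and `kafka-python` packages
    # and a broker listening on localhost:9092 (an assumed address).
    import requests
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    response = requests.get("http://stream.meetup.com/2/rsvps", stream=True)
    forward_rsvps(response.iter_lines(), producer.send)
    producer.flush()
```

Calling `main()` starts the forwarding loop. Keeping the parsing separate from the Kafka client means the same functions can be exercised with a plain list of lines and any callable in place of `producer.send`.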

Spark depends on an external package for Kafka integration:

bin/spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.0.1 spark_meetup.py localhost:2181 meetup

An analysis of the number of RSVPs per city in the "US" region is performed on the RSVP stream.
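
The per-city count can be sketched as below. This is an illustration, not the code in spark_meetup.py. It assumes each RSVP carries a `group` object with `group_country` and `group_city` fields; these field names are an assumption about the stream's JSON. The consumer group name `meetup-consumer-group` and the function names are hypothetical too. The arguments `localhost:2181` and `meetup` in the spark-submit command are read here as the ZooKeeper address and the topic, which matches the receiver-based `KafkaUtils.createStream` call in the 0-8 integration package.

```python
import json
from collections import Counter


def us_city(rsvp_json):
    """Return the group's city for a US RSVP, or None for anything else."""
    try:
        rsvp = json.loads(rsvp_json)
    except (TypeError, ValueError):
        return None  # not a JSON string
    group = rsvp.get("group") if isinstance(rsvp, dict) else None
    if not isinstance(group, dict):
        return None
    # Field names below are assumed, not confirmed by this README.
    if str(group.get("group_country", "")).lower() != "us":
        return None
    return group.get("group_city")


def count_us_cities(messages):
    """Count RSVPs per US city in a plain iterable of JSON strings."""
    return Counter(city for city in map(us_city, messages) if city)


def city_counts_dstream(ssc, zk_quorum, topic):
    """Build a DStream of (city, count) pairs per batch (Spark 2.x only)."""
    from pyspark.streaming.kafka import KafkaUtils  # from the 0-8 package

    # createStream yields (key, value) pairs; the value is the RSVP JSON.
    stream = KafkaUtils.createStream(
        ssc, zk_quorum, "meetup-consumer-group", {topic: 1}
    )
    return (
        stream.map(lambda kv: us_city(kv[1]))
        .filter(lambda city: bool(city))
        .map(lambda city: (city, 1))
        .reduceByKey(lambda a, b: a + b)
    )
```

`count_us_cities` applies the same filter-and-count logic to an ordinary list, which is handy for checking the field handling without a running cluster.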

//TODO

Visualize the data using Bokeh
Explore Elasticsearch and Kibana for visualization
