A simple MapReduce application that uses Hive to analyze very large data sets
- Scala 2.13.3
- Hadoop 3.2.1
- Hive
- YARN
- sbt 1.4.4
- Docker container
- InputStream - Retrieves the Twitter stream with a Spark session
- dataMapper - Maps every key in the DataFrame to a value
- dataReduce - Reduces the dataset so that all keys are distinct
- MapReduce
- Install and configure Git
- Install Xcode for easy access
- sbt assembly to package files
- sbt compile to build
- sbt run to run the application and print its output
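
The dataMapper and dataReduce steps listed above can be illustrated with a minimal sketch. It is not the project's actual code: plain Scala collections stand in for the Spark DataFrame and Hive tables, the sample records are made up, and the signatures of `dataMapper` and `dataReduce` are assumptions chosen for illustration. It relies on `groupMapReduce`, which is available from Scala 2.13.

```scala
// Hypothetical sketch: plain Scala collections stand in for the DataFrame.
object MapReduceSketch {

  // Map step: split each record into words and pair every word (the key) with the value 1.
  def dataMapper(records: Seq[String]): Seq[(String, Int)] =
    records
      .flatMap(_.split("\\s+"))
      .filter(_.nonEmpty)
      .map(word => (word.toLowerCase, 1))

  // Reduce step: merge pairs that share a key, so every key appears exactly once.
  def dataReduce(pairs: Seq[(String, Int)]): Map[String, Int] =
    pairs.groupMapReduce(_._1)(_._2)(_ + _)

  def main(args: Array[String]): Unit = {
    // Made-up sample input standing in for the Twitter stream.
    val records = Seq("hive on hadoop", "Hadoop and Hive", "spark")
    val counts = dataReduce(dataMapper(records))
    // Print keys in sorted order, one "key<TAB>count" pair per line.
    counts.toSeq.sortBy(_._1).foreach { case (key, count) => println(s"$key\t$count") }
  }
}
```

In a real Hadoop or Spark job, the same two phases run in parallel across partitions of the data, with a shuffle between them that brings equal keys together before they are reduced.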
Zeshawn Manzoor