Kundera with Spark

## Apache Spark

Apache Spark is a fast, general-purpose cluster computing system. Spark has an advanced DAG execution engine that supports acyclic data flow and in-memory computing. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools, including [Spark SQL](http://spark.apache.org/docs/latest/sql-programming-guide.html) for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming. Spark runs on Hadoop, Mesos, standalone, or in the cloud. It can access diverse data sources including HDFS, Cassandra, HBase, and S3.

## Support

Being a JPA provider, Kundera provides support for Spark. It lets you write data to databases, HDFS, and CSV or JSON files, and run read and query operations over that data using the JPA specification.

Kundera provides three modules for Spark:

- spark-core: Handles HDFS and file-system (CSV and JSON) storage. You can read, write, and query data there.
- spark-cassandra: Designed for Cassandra. You can read, write, and query data there in the same way.
- spark-mongodb: Designed for MongoDB. You can read, write, and query data there in the same way.
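
The module list above amounts to a lookup from storage target to module. The following is a minimal Java sketch of that mapping, assuming only the three module names and targets listed on this page. `KunderaSparkModule`, `forTarget`, and `ModuleLookupDemo` are hypothetical names introduced for illustration; they are not part of Kundera's API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper, not part of Kundera: it only encodes the module list above.
enum KunderaSparkModule {
    SPARK_CORE("spark-core", Arrays.asList("hdfs", "csv", "json")),
    SPARK_CASSANDRA("spark-cassandra", Arrays.asList("cassandra")),
    SPARK_MONGODB("spark-mongodb", Arrays.asList("mongodb"));

    private final String moduleName;
    private final List<String> targets;

    KunderaSparkModule(String moduleName, List<String> targets) {
        this.moduleName = moduleName;
        this.targets = targets;
    }

    // Module name exactly as written in the list above.
    String moduleName() {
        return moduleName;
    }

    // Case-insensitive lookup; empty when no listed module covers the target.
    static Optional<KunderaSparkModule> forTarget(String target) {
        if (target == null) {
            return Optional.empty();
        }
        String key = target.trim().toLowerCase(Locale.ROOT);
        return Arrays.stream(values())
                .filter(module -> module.targets.contains(key))
                .findFirst();
    }
}

class ModuleLookupDemo {
    public static void main(String[] args) {
        // "redis" is included to show the empty case: no Spark module is listed for it.
        for (String target : Arrays.asList("csv", "Cassandra", "MongoDB", "redis")) {
            String module = KunderaSparkModule.forTarget(target)
                    .map(KunderaSparkModule::moduleName)
                    .orElse("no Spark module listed");
            System.out.println(target + " -> " + module);
        }
    }
}
```

Returning an `Optional` rather than throwing keeps the unsupported case explicit, which suits a page that lists only three targets.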
