Sample program to populate Neo4J from Impala using Cloudera VM

davidfauth/ImpalaNeo4J

###Cloudera Impala JDBC Example

This example shows how to build and run a Maven-based project that executes SQL queries on Cloudera Impala using JDBC and then populates a Neo4J database with the query results. Cloudera Impala is a native Massively Parallel Processing (MPP) query engine that enables users to perform interactive analysis of data stored in HBase or HDFS.
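
As a rough illustration of the "query, then populate" flow, the sketch below turns one result row into a parameterised Cypher `CREATE` statement. It is not taken from this project's source: the class `RowToCypherSketch`, the `Organization` label, and the column names `name` and `city` are all hypothetical, and it uses the `$props` parameter syntax of more recent Neo4J releases. How the statement is actually sent to Neo4J (embedded API, REST, or a driver) is left out on purpose.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of this project: maps one query row to Cypher.
class RowToCypherSketch {

    /** A parameterised Cypher statement plus the values bound to it. */
    static final class CypherCall {
        final String statement;
        final Map<String, Object> parameters;

        CypherCall(String statement, Map<String, Object> parameters) {
            this.statement = statement;
            this.parameters = parameters;
        }
    }

    /** Builds a CREATE statement whose node properties come from the row. */
    static CypherCall createNode(String label, Map<String, Object> row) {
        // Labels cannot be passed as parameters, so validate before concatenating.
        if (!label.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Unsafe label: " + label);
        }
        Map<String, Object> props = new LinkedHashMap<>();
        for (Map.Entry<String, Object> column : row.entrySet()) {
            // Neo4J does not store null property values, so skip them.
            if (column.getValue() != null) {
                props.put(column.getKey(), column.getValue());
            }
        }
        Map<String, Object> parameters = new LinkedHashMap<>();
        parameters.put("props", props);
        return new CypherCall("CREATE (n:" + label + " $props)", parameters);
    }

    public static void main(String[] args) {
        // Assumed columns; the real table layout is not shown in this README.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("name", "Example Org");
        row.put("city", null);
        CypherCall call = createNode("Organization", row);
        System.out.println(call.statement + " " + call.parameters);
    }
}
```

Passing the column values as a parameter map, rather than concatenating them into the statement text, avoids quoting problems with values that contain quotes or backslashes.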

Here are links to more information on Cloudera Impala:

Neo4J is a graph database that stores data as nodes and relationships.

Here are links to more information on Neo4J:

To use the Cloudera Impala JDBC driver in your own Maven-based project, you can copy the `<dependency>` and `<repository>` elements from this project's pom to your own instead of manually downloading the JDBC driver jars.

####Dependencies

To build the project you must have Maven 2.x or higher installed. Maven info is here.

To run the project you must have access to a Hadoop cluster running Cloudera Impala with at least one populated table defined in the Hive Metastore. Neo4J must also be installed. You can download [Neo4J](http://www.neo4j.org/download).

####Configure the example

To configure the example you must:

  • Select or create the table(s) to query against.
  • Set the query and impalad host in the example source file.

These steps are described in more detail below.

#####Select or create the table(s) to run the example with

For this example I created my own table and populated it.

#####Set the query and impalad host

Edit these two settings in the ImpalaNeo4JImporter.java source file:

  • Set the SQL Statement

private static final String SQL_STATEMENT = "SELECT * from organizations limit 10";

  • Set the host for the impalad you want to connect to:

private static final String IMPALAD_HOST = "MyImpaladHost";
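
To show how these two settings might be used together, here is a minimal sketch that builds a JDBC URL from the host and runs the statement. It is an assumption-laden illustration, not the contents of ImpalaNeo4JImporter.java: the class name `ImpalaQuerySketch`, port 21050, the `jdbc:hive2://` URL form, and the `auth=noSasl` option all assume an unsecured impalad reached through the Hive JDBC driver, so adjust them for your cluster.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical sketch, not the project's ImpalaNeo4JImporter class.
class ImpalaQuerySketch {
    private static final String SQL_STATEMENT = "SELECT * from organizations limit 10";
    private static final String IMPALAD_HOST = "MyImpaladHost";
    // Assumption: impalad accepts HiveServer2-protocol JDBC clients on port 21050.
    private static final int IMPALAD_JDBC_PORT = 21050;

    /** Builds the connection URL; auth=noSasl assumes a cluster without Kerberos. */
    static String connectionUrl(String host, int port) {
        return "jdbc:hive2://" + host + ":" + port + "/;auth=noSasl";
    }

    /** Runs the query, prints each row tab-separated, and returns the row count. */
    static int printRows(String url, String sql) throws SQLException {
        int rows = 0;
        // Older drivers may first need Class.forName("org.apache.hive.jdbc.HiveDriver").
        try (Connection con = DriverManager.getConnection(url);
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            int columns = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                StringBuilder line = new StringBuilder();
                for (int i = 1; i <= columns; i++) {
                    if (i > 1) {
                        line.append('\t');
                    }
                    line.append(rs.getString(i));
                }
                System.out.println(line);
                rows++;
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        String url = connectionUrl(IMPALAD_HOST, IMPALAD_JDBC_PORT);
        try {
            printRows(url, SQL_STATEMENT);
        } catch (SQLException e) {
            // Reached when no driver is on the classpath or the host is unreachable.
            System.err.println("Could not query Impala at " + url + ": " + e.getMessage());
        }
    }
}
```

The try-with-resources block closes the result set, statement, and connection in reverse order even when the query fails, which matters for long-running impalad sessions.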

####Building the project

To build the project, run the command:

mvn clean compile

from the root of the project directory. There is a build.sh script for your convenience.
