Optiq is a dynamic data management framework.
Optiq requires git, maven (3.0.4 or later), and JDK 1.6 or later (JDK 1.7 preferred).
$ git clone git://github.com/julianhyde/optiq.git
$ cd optiq
$ mvn install
Use -DskipTests if you do not want to execute the tests.
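For example, to build and install without running the tests:
$ mvn install -DskipTests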
The spark module is excluded from the build by default. To enable it, activate the spark Maven build profile:
$ mvn install -Pspark
Optiq makes data anywhere, of any format, look like a database. For example, you can execute a complex ANSI-standard SQL statement on in-memory collections:
public static class HrSchema {
  public final Employee[] emps = ...;
  public final Department[] depts = ...;
}

// Load the Optiq JDBC driver and open a connection.
Class.forName("net.hydromatic.optiq.jdbc.Driver");
Connection connection = DriverManager.getConnection("jdbc:optiq:");
OptiqConnection optiqConnection =
    connection.unwrap(OptiqConnection.class);

// Register the in-memory HrSchema object as a schema named "hr".
ReflectiveSchema.create(optiqConnection,
    optiqConnection.getRootSchema(), "hr", new HrSchema());

// Query the "emps" and "depts" collections as tables.
Statement statement = optiqConnection.createStatement();
ResultSet resultSet = statement.executeQuery(
    "select d.\"deptno\", min(e.\"empid\")\n"
    + "from \"hr\".\"emps\" as e\n"
    + "join \"hr\".\"depts\" as d\n"
    + "  on e.\"deptno\" = d.\"deptno\"\n"
    + "group by d.\"deptno\"\n"
    + "having count(*) > 1");
print(resultSet);
resultSet.close();
statement.close();
connection.close();
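The print helper is not defined in the snippet above; a minimal sketch using only standard JDBC metadata might look like this:

private static void print(ResultSet resultSet) throws SQLException {
  final ResultSetMetaData metaData = resultSet.getMetaData();
  final int columnCount = metaData.getColumnCount();
  while (resultSet.next()) {
    final StringBuilder buf = new StringBuilder();
    for (int i = 1; i <= columnCount; i++) {
      if (i > 1) {
        buf.append("; ");
      }
      // Label and value of each column in the current row.
      buf.append(metaData.getColumnLabel(i))
          .append("=")
          .append(resultSet.getObject(i));
    }
    System.out.println(buf);
  }
}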
Where is the database? There is no database. The connection is completely empty until ReflectiveSchema.create registers a Java object as a schema and its collection fields emps and depts as tables.
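The Employee and Department classes are plain Java classes whose public fields Optiq exposes as columns. A minimal sketch consistent with the query above (only empid and deptno are implied by the query; the name fields are illustrative assumptions):

public static class Employee {
  public int empid;
  public int deptno;
  public String name;   // illustrative; not referenced by the query
}

public static class Department {
  public int deptno;
  public String name;   // illustrative; not referenced by the query
}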
Optiq does not want to own your data; it does not even have a favorite data format. This example used in-memory data sets, and processed them using operators such as groupBy and join from the linq4j library. But Optiq can also process data in other data formats, such as JDBC. In the first example, replace
ReflectiveSchema.create(optiqConnection,
    optiqConnection.getRootSchema(), "hr", new HrSchema());
with
Class.forName("com.mysql.jdbc.Driver");
BasicDataSource dataSource = new BasicDataSource();
dataSource.setUrl("jdbc:mysql://localhost");
dataSource.setUsername("sa");
dataSource.setPassword("");
JdbcSchema.create(optiqConnection, dataSource, rootSchema, "hr", "");
and Optiq will execute the same query in JDBC. To the application, the data and API are the same, but behind the scenes the implementation is very different. Optiq uses optimizer rules to push the JOIN and GROUP BY operations to the source database.
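Under that plan, the statement sent to MySQL would be roughly equivalent to the original query, along the lines of the following (the exact SQL Optiq generates may differ):

SELECT d."deptno", MIN(e."empid")
FROM "hr"."emps" AS e
JOIN "hr"."depts" AS d ON e."deptno" = d."deptno"
GROUP BY d."deptno"
HAVING COUNT(*) > 1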
In-memory and JDBC are just two familiar examples. Optiq can handle any data source and data format. To add a data source, you need to write an adapter that tells Optiq what collections in the data source it should consider "tables".
For more advanced integration, you can write optimizer rules. Optimizer rules allow Optiq to access data of a new format, allow you to register new operators (such as a better join algorithm), and allow Optiq to optimize how queries are translated to operators. Optiq will combine your rules and operators with built-in rules and operators, apply cost-based optimization, and generate an efficient plan.
Optiq also allows front-ends other than SQL/JDBC. For example, you can execute queries in linq4j:
final OptiqConnection connection = ...;

// Expression tree for the predicate "c.customer_id < 5".
ParameterExpression c = Expressions.parameter(Customer.class, "c");

for (Customer customer
    : connection.getRootSchema()
        .getSubSchema("foodmart")
        .getTable("customer", Customer.class)
        .where(
            Expressions.<Predicate1<Customer>>lambda(
                Expressions.lessThan(
                    Expressions.field(c, "customer_id"),
                    Expressions.constant(5)),
                c))) {
  System.out.println(customer);
}
Linq4j understands the full query parse tree, and the Linq4j query provider for Optiq invokes Optiq as a query optimizer. If the customer table comes from a JDBC database (based on this code fragment, we really can't tell) then the optimal plan will be to send the query

SELECT *
FROM "customer"
WHERE "customer_id" < 5

to the JDBC data source.
The optiq-csv project provides a CSV adapter, which is fully functional for use in applications but is also simple enough to serve as a good template if you are writing your own adapter.
See the optiq-csv tutorial for information on using optiq-csv and writing adapters.
See the HOWTO for more information about using other adapters, and about using Optiq in general.
The following features are complete.
- Query parser, validator and optimizer
- Support for reading models in JSON format
- Many standard functions and aggregate functions
- JDBC queries against Linq4j and JDBC back-ends
- Linq4j front-end
- SQL features: SELECT, FROM (including JOIN syntax), WHERE, GROUP BY (and aggregate functions including COUNT(DISTINCT ...)), HAVING, ORDER BY (including NULLS FIRST/LAST), set operations (UNION, INTERSECT, MINUS), sub-queries (including correlated sub-queries), windowed aggregates, LIMIT (syntax as Postgres)
For more details, see the Reference guide.
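For instance, a single query exercising several of the SQL features listed above (aggregates with COUNT(DISTINCT ...), HAVING, ORDER BY with NULLS LAST, and Postgres-style LIMIT) against the hr schema from the earlier example might look like this:

SELECT d."deptno", COUNT(DISTINCT e."empid") AS "c"
FROM "hr"."emps" AS e
JOIN "hr"."depts" AS d ON e."deptno" = d."deptno"
GROUP BY d."deptno"
HAVING COUNT(*) > 1
ORDER BY "c" DESC NULLS LAST
LIMIT 10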
Adapters:
- Apache Drill adapter
- Cascading adapter (Lingual)
- CSV adapter (optiq-csv)
- JDBC adapter (part of optiq-core)
- MongoDB adapter (optiq-mongodb)
- Spark adapter (optiq-spark)
- Splunk adapter (optiq-splunk)
- Eclipse Memory Analyzer (MAT) adapter (optiq-mat-plugin)
More information:
- License: Apache License, Version 2.0.
- Author: Julian Hyde
- Blog: http://julianhyde.blogspot.com
- Project page: http://www.hydromatic.net/optiq
- Source code: http://github.com/julianhyde/optiq
- Developers list: http://groups.google.com/group/optiq-dev
- HOWTO
- Reference guide
- Release notes