This repository has been archived by the owner on Feb 12, 2022. It is now read-only.

Roadmap

jtaylor-sfdc edited this page Apr 11, 2013 · 15 revisions

## Roadmap

Our roadmap is driven by our user community. Other than adding miscellaneous built-in functions, some of the bigger ticket items under consideration include:

  1. Secondary Indexes. Allow users to create indexes through a new CREATE INDEX DDL command, and then, behind the scenes, build multiple projections of the table (i.e. a copy of the table using re-ordered or different row key columns). Phoenix will take care of maintaining the indexes when DML commands are issued and will choose the best table to use at query time.
  2. IN/OR/LIKE Optimizations. When an IN (or the equivalent OR) and a LIKE appear in a query on the leading row key columns, compile them into a skip-scanning filter to retrieve the query results more efficiently.
  3. Support ASC/DESC declaration of primary key columns. Allow a primary key column to be declared as ascending (the default) or descending such that the row key order can match the desired sort order (thus preventing an extra sort).
  4. Salting Row Key. To prevent hot spotting on writes, the row key may be "salted" by inserting a leading byte into the row key which is a mod over N buckets of the hash of the entire row key. This ensures even distribution of writes when the row key is a monotonically increasing value (often a timestamp representing the current time).
  5. TopN Queries. Support a query that returns the top N rows, through support for derived tables and implementation of a server-side coprocessor that keeps the top N rows.
  6. COUNT DISTINCT. Although COUNT is currently supported, supporting COUNT DISTINCT will require returning more state to the client for the final merge operation.
  7. TABLESAMPLE. Implement a filter that uses a skip next hint based on the region boundaries of the table to only return n rows per region.
  8. Hash Joins. Support hash joins, where one side of the join is small enough to fit into memory.
  9. Dynamic Columns. For some use cases, it's difficult to model a schema up front. You may have columns that you'd like to specify only at query time. This is possible in HBase, in that every row (and column family) contains a map of values with keys that can be specified at run time. So, we'd like to support that.
  10. Nested Children. Unlike standard relational databases, HBase gives you the flexibility of dynamically creating as many key values in a row as you'd like. Phoenix could leverage this by providing a way to model child rows inside of a parent row. The child row would be composed of the set of key values whose column qualifier is prefixed with a known name and appended with the primary key of the child row. Phoenix could hide all this complexity, and allow querying over the nested children through joining to the parent row.
  11. Schema evolution. Phoenix supports adding and removing columns through the [ALTER TABLE](http://forcedotcom.github.com/phoenix/index.html#alter_table) DDL command, but changing the data type of, or renaming, an existing column is not yet supported.
  12. CREATE SEQUENCE. Surface the HBase put-and-increment functionality through the standard SQL sequence support.
  13. OLAP extensions. Support window functionality such as WINDOW, OVER (PARTITION BY ...), and RANK.
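
To make the row key salting idea from item 4 concrete, here is a minimal sketch in Python. It is not Phoenix code: the function names `salt_row_key` and `unsalt_row_key` are hypothetical, and CRC32 is only an assumed stand-in for whatever hash an implementation would actually use. It shows a leading byte, computed as the hash of the entire row key mod N buckets, being prepended to the key.

```python
import struct
import zlib


def salt_row_key(row_key: bytes, num_buckets: int) -> bytes:
    """Prefix row_key with one salt byte: hash(row_key) mod num_buckets.

    CRC32 is an assumption made for this sketch, not the hash Phoenix uses.
    """
    if not 1 <= num_buckets <= 256:
        raise ValueError("num_buckets must fit in a single byte (1..256)")
    salt = zlib.crc32(row_key) % num_buckets
    return bytes([salt]) + row_key


def unsalt_row_key(salted_key: bytes) -> bytes:
    """Drop the leading salt byte to recover the original row key."""
    return salted_key[1:]


# Monotonically increasing keys (e.g. timestamps) encoded as 8-byte big-endian
# integers. Unsalted, they would all sort next to each other.
keys = [struct.pack(">q", 1365638400000 + i) for i in range(1000)]
salted = [salt_row_key(k, 8) for k in keys]

# Collect the set of buckets the sequential keys were assigned to.
buckets_used = {s[0] for s in salted}
```

Because the salt is derived from the key itself, a point lookup can recompute it. A range scan over the original key order would have to fan out across all N buckets and merge the results.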