Cloudant Query is inspired by MongoDB's query implementation, so users of MongoDB should feel at home using Cloudant Query in their mobile applications.
The aim is that the query you use on our cloud-based database works for your mobile application.
These notes assume familiarity with Cloudant Sync Datastore.
Cloudant Query uses indexes explicitly defined over the fields in the document. Multiple indexes can be created for use in different queries, the same field may end up indexed in more than one index.
Query offers a powerful way to find documents within your datastore. There are a couple of restrictions on field names you need to be aware of before using query:
- A dollar sign (
$
) cannot be the first character of any field name. This is because, when querying, a dollar sign tells the query engine to handle the object as a query operator and not a field. - A field with a name that contains a period (
.
) cannot be indexed nor successfully queried. This is because the query engine assumes dot notation refers to a sub-object.
These come from Query's MongoDB heritage where these characters are not allowed in field names, which we don't share. Hopefully we'll work around these restrictions in the future.
Querying is carried out by supplying a query in the form of a map which describes the query.
For the following examples, assume these documents are in the datastore:
{ "name": "mike",
"age": 12,
"pet": { "species": "cat" },
"comment": "Mike goes to middle school and likes reading books." };
{ "name": "mike",
"age": 34,
"pet": {"species": "dog"},
"comment": "Mike is a doctor and likes reading books." };
{ "name": "fred",
"age": 23,
"pet": {"species": "cat"},
"comment": "Fred works for a startup out of his home office." };
In order to query documents, creating indexes over the fields to be queried against will typically enhance query performance. Currently we support a JSON index, which is used by query clauses containing comparison operators like $eq
, $lt
, and $gt
. Query clauses containing these operators are based on standard SQLite indexes to provide query results.
Basic querying of fields benefits but does not require a JSON index. For example, { "name" : { "$eq" : "mike" } }
would benefit from a JSON index on the name
field but would succeed even if there isn't an index on name
.
Use the following method to create a JSON index:
datastore.ensureIndexed(fieldNames, indexName)
These indexes are persistent across application restarts as they are saved to disk. They are kept up to date as documents change; there's no need to call the ensureIndexed(...)
method each time your applications starts, though there is no harm in doing so.
The first argument, fieldNames
is a list of fields to put into the index. The second argument, indexName
is a name for the index. This is used to delete indexes at a later stage.
A field can appear in more than one index. The query engine will select an appropriate index to use for a given query. However, the more indexes you have, the more disk space they will use and the greater the overhead in keeping them up to date.
To index values in sub-documents, use dotted notation. This notation puts the field names in the path to a particular value into a single string, separated by dots. Therefore, to index the species
field of the pet
sub-document in the examples above, use pet.species
.
// Create an index over the name, age, and species fields.
datastore.ensureIndexed(["name", "age", "pet.species"], "basic")
.then(function (indexName) {
// index created with indexName = "basic"
});
####Changing and removing indexes
If an index needs to be changed, first delete the existing index by calling deleteIndexNamed(indexName)
where the argument is the index name, then call the appropriate ensureIndexed(...)
method with the new definition.
The document ID and revision ID are automatically indexed under _id
and _rev
respectively. If you need to query on document ID or document revision ID,
use these field names.
Indexing of array fields is supported. See "Array fields" below for the indexing and querying semantics.
Query documents are JSON objects that use the Cloudant Query selector
syntax. Several features of Cloudant Query are not yet supported in this implementation.
See below for more details.
To query for all documents where pet.species
is cat
:
var query = {
selector: {
"pet.species": "cat"
}
};
If you don't specify a condition for the clause, equality ($eq
) is used. To use other conditions, supply them explicitly in the clause.
To query for documents where age
is greater than twelve use the $gt
condition:
var query = {
selector: {
age: { $gt: 12 }
}
};
See below for supported operators (Selections -> Conditions).
Using the $mod
operator in queries allows you to select documents based on the value of a field divided by an integer yielding a specific remainder.
To query for documents where age
divided by 5 has a remainder of 4, do the following:
var query = {
selector: {
age: { $mod: [ 5, 4 ] }
}
};
A few things to keep in mind when using $mod
are:
- The list argument to the
$mod
operator must contain two number elements. The first element is the divisor and the second element is the remainder. - Division by zero is not allowed so the divisor cannot be zero.
- The dividend (field value), divisor, and the remainder can be positive or negative.
- The dividend, divisor, and the remainder can be represented as whole numbers or by using decimal notation. However internally, prior to performing the modulo arithmetic operation, all three are truncated to their logical whole number representations. So, for example, the selector
{ "age": { "$mod": [ 5.6, 4.2 ] } }
will provide the same result as the selector{ "age": { "$mod": [ 5, 4 ] } }
.
Compound queries allow selection of documents based on more than one criteria. If you specify several clauses, they are implicitly joined by AND.
To find all people named fred
with a cat
use:
var query = {
selector: {
name: "fred",
pet.species: "cat"
}
};
Use $or
to find documents where just one of the clauses match.
To find all people with a dog
or who are under thirty:
var query = {
selector: {
$or: [
{ pet.species: { $eq: "dog" } },
{ age: { $lt: 30 } }
]
}
};
Using a combination of AND and OR allows the specification of complex queries.
This selects documents where either the person has a pet dog
or they are
both over thirty and named mike
:
var query = {
selector: {
$or: [
{ pet.species: { $eq: "dog" }},
{ $and: [
{ age: { $gt: 30 } },
{ name: { $eq: "mike" } }
]}
]
}
};
To find documents matching a query, use the Datastore
object's find(query)
method. This returns an array containing your query results.
datastore.find(query)
.then(function (results) {
results.forEach(function (result) {
// Do something with result
});
}).done();
There are options for the find
method which support:
- Sorting results.
- Projecting fields from documents rather than returning whole documents.
- Skipping results.
- Limiting the number of results returned.
For any of these, use
var query = {
selector: selectorClause,
skip: numberOfResultsToSkip,
limit: numberOfResultsToReturn,
fields: anArrayOfFieldNamesToProject,
sort: anArrayOfSortOptions
});
Provide a sort document to the find
method to sort the results of a query.
The sort document is a list of fields to sort by. Each field is represented by a map specifying the name of the field to sort by and the direction to sort.
The sort document must use fields from a single index.
As yet, you can't leave out the sort direction. The sort direction can be asc
(ascending) or desc
(descending).
var query = {
selector: {},
sort: [ { "name": "asc" },
{ "age": "desc" }
]
};
Projecting fields is useful when you have a large document and only need to use a subset of the fields for a given view.
To project certain fields from the documents included in the results, pass a list of field names to the fields
argument. These field names:
- Must be top level fields in the document.
- Cannot use dotted notation to access sub-documents.
For example, in the following document the name
, age
and pet
fields could be projected, but the species
field inside pet
cannot:
{
"name": "mike",
"age": 12,
"pet": { "species": "cat" }
}
To project the name
and age
fields of the above document:
var query = {
selector: {},
fields: [ "name", "age" ]
};
Skip and limit allow retrieving subsets of the results. Amongst other things, this is useful in pagination.
skip
skips over a number of results from the result set.limit
defines the maximum number of results to return for the query.
To display the twenty-first to thirtieth results:
var query = {
selector: {},
skip: 20,
limit: 10
};
Indexing and querying over array fields is supported by this query engine, with some caveats.
Take this document as an example:
{
_id: 'mike32'
pet: [ 'cat', 'dog', 'parrot' ],
name: 'mike',
age: 32
}
You can create an index over the pet
field:
datastore.ensureIndexed(["name", "age", "pet"], "basic")
.then(function (indexName) {
console.log('Successfuly created index: ' + indexName);
});
Each value of the array is treated as a separate entry in the index. This means that a query such as:
{ pet: { $eq: 'cat' } }
Will return the document mike32
. Negation such as:
{ pet: { $not: { $eq: 'cat' } } }
Will not return mike32
because negation returns the set of documents that are not in the set of documents returned by the non-negated query. In other words the negated query above will return all of the documents that are not in the set of documents returned by { pet: { $eq: cat } }
.
Only one field in a given index may be an array. This is because each entry in each array
requires an entry in the index, causing a Cartesian explosion in index size. Taking the
above example, this document wouldn't be indexed because the name
and pet
fields are
both indexed in a single index:
{
_id: 'mike32'
pet: [ 'cat', 'dog', 'parrot' ],
name: [ 'mike', 'rhodes' ],
age: 32
}
If this happens, an error will be emitted into the log but the indexing process will be successful.
However, if there was one index with pet
in and another with name
in, like this:
datastore.ensureIndexed(["name", "age"], "index_one");
datastore.ensureIndexed(["age", "pet"], "index_two");
The document would be indexed in both of these indexes: each index only contains one of the array fields.
Also see "Unsupported features", below.
Error reporting is somewhat lacking right now. Presently a null
return value from the find
methods or the ensureIndexed(fieldNames, String indexName)
method indicates that something went wrong. Any errors that are encountered are logged but exceptions are not thrown as of yet.
Right now the list of supported features is:
- Create compound indexes using dotted notation that index JSON fields.
- Delete index by name.
- Execute nested queries.
- Limiting returned results.
- Skipping results.
- Queries can include unindexed fields.
- Queries can include a text search clause, although if they do no unindexed fields may be used.
Selectors -> combination
$and
$or
Selectors -> Conditions -> Equalities
$lt
$lte
$eq
$gte
$gt
$ne
Selectors -> combination
$not
Selectors -> Condition -> Objects
$exists
Selectors -> Condition -> Misc
$text
in combination with$search
$mod
Selectors -> Condition -> Array
$in
$nin
$size
Implicit operators
- Implicit
$and
. - Implicit
$eq
.
Arrays
- Indexing individual values in an array.
- Querying for individual values in an array.
As this is an early version of Query on this platform, some features are not supported yet. We're actively working to support features -- check the commit log :)
Overall restrictions:
- Cannot use covering indexes with projection (
fields
) to avoid loading documents from the datastore.
- Using non-dotted notation to query sub-documents.
- That is,
{"pet": { "species": {"$eq": "cat"} } }
is unsupported, you must use{"pet.species": {"$eq": "cat"}}
.
- That is,
- Cannot use multiple conditions in a single clause,
{ field: { $gt: 7, $lt: 14 } }
.
Selectors -> combination
$nor
(unplanned)$all
(unplanned)$elemMatch
(unplanned)
Selectors -> Condition -> Objects
$type
(unplanned)
Selectors -> Condition -> Misc
$regex
(unplanned, waiting on filtering)
Arrays
- Dotted notation to index or query sub-documents in arrays.
- Querying for exact array match,
{ field: [ 1, 3, 7 ] }
. - Querying to match a specific array element using dotted notation,
{ field.0: 1 }
. - Querying using
$all
. - Querying using
$elemMatch
.
Not carried out yet.
To help, I've tried to write a grammar/schema for the Query language.
Here:
- Bold is used for the JSON formatting (or to indicate the representation of the use of Map, List etc. in Java).
- Italic is variables in the grammar-like thing.
- Quotes enclose literal string values.
query := { } { many-expressions } many-expressions := expression ("," expression)* expression := compound-expression comparison-expression text-search-expression compound-expression := { ("$and" | "$nor" | "$or") : [ many-expressions ] } // nor not implemented comparison-expression := { field : { operator-expression } } negation-expression := { "$not" : { operator-expression } } operator-expression := negation-expression { operator : simple-value } { "$regex" : Pattern } // not implemented { "$mod" : [ non-zero-number, number ] } { "$elemMatch" : { many-expressions } } // not implemented { "$size" : positive-integer } { "$all" : array-value } // not implemented { "$in" : array-value } { "$nin" : array-value } { "$exists" : boolean } { "$type" : type } // not implemented text-search-expression := { "$text" : { "$search" : string-value } } operator := "$gt" | "$gte" | "$lt" | "$lte" | "$eq" | "$ne" // Obviously List, but easier to express like this array-value := [ simple-value ("," simple-value)+ ] // Java mappings of basic types field := String // a field name simple-value := String | Number string-value := String number := NSNumber non-zero-number := NSNumber positive-integer := Integer boolean := Boolean type := Class