dynamic-api-query-language-syntax.md

File metadata and controls

236 lines (182 loc) · 10.1 KB

Used terms

implicit container
A JSON object with all possible constraints defined as properties of that JSON object. All of these constraints are linked by logical [AND](https://en.wikipedia.org/wiki/Logical_conjunction) during processing.

A query, although custom for each collection, follows a predefined structure. It is a tree of individual constraints that ultimately defines how the queried data will be filtered, ordered and returned. The basic constraints are the same as in the evitaDB query language, but they are written differently, and not every one of them may be available for your domain-specific data types. There are two main reasons for this. First, we want to help the user as much as possible with code completion, so we use dynamically generated queries. Second, JSON objects have limitations, mainly that they don't have names to reference them by.

Therefore, we came up with the following syntax for constraints. Each constraint consists of:

  • key - basically a property of a parent JSON object
    • defines targeted property type
    • possible classifier of targeted data
    • and a constraint name
  • value - an object/value of that property
    • defines arguments for a constraint
    • can be a scalar, an array or an object of arguments

Thanks to the API schemas, you don't have to worry about the details of each part of the syntax, because the API schema will offer you only those constraints that are valid.

Constraint key syntax in detail

If you want to know more about the underlying syntax, read on. Each key consists of 3 parts as mentioned above:

  • a property type
  • a classifier of data
  • a constraint name

Only the constraint name is required for all supported constraints; the rest depends on the context and the type of constraint.

The property type defines where the query processor will look for data to compare. There is a finite set of possible property types:

  • generic - generic constraint that typically doesn't work with concrete data; it is mostly used for containers like `and`, `or` or `not`
    • for simplicity, this property type is not written in the key of an actual constraint
  • entity - handles properties directly accessible from an entity like the primary key
  • attribute - can operate on an entity’s attribute values
  • associatedData - can operate on an entity’s associated data values
  • price - can operate on entity prices
  • reference - can operate on entity references
  • hierarchy - can operate on an entity’s hierarchical data (the hierarchical data may be even referenced from other entities)
  • facet - can operate on referenced facets to an entity

In some special cases, the property type can be omitted even though the constraint belongs to a property type other than generic. This usually happens with child constraints that are valid only inside a specific parent constraint of the same domain. It exists only to provide a better developer experience.

The classifier specifies exactly which data of the specified property type the constraint will operate on, if the property type supports it. This is used e.g. for attributes, where the property type alone doesn't tell us which attribute we want to compare. Conversely, without the property type we wouldn't know what the classifier represents, so we need both. In cases like price comparison, however, the constraints operate on a single computed price, so the evitaDB query processor implicitly knows which price to compare.

Finally, the constraint name actually defines what the query processor will do with the target data (i.e., how to compare the passed data with data in the database).

All possible combinations of the parts are:

{constraint name} -> `and` (only usable for generic constraints)
{property type}{constraint name} -> `hierarchyWithinSelf` (in this case the classifier of the used hierarchy is implicitly defined by the rest of the query)
{property type}{classifier}{constraint name} -> `attributeCodeEquals` (key with all metadata)

Example of a simple constraint

A single constraint that returns only entities whose `deviceType` attribute equals the string `phone` would look like this:

attributeDeviceTypeEquals: "phone"
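The key in the example above can be assembled mechanically from its three parts. The following is a minimal sketch of that naming rule, assuming plain camelCase concatenation; `buildConstraintKey` and `capitalize` are hypothetical helpers written for illustration and are not part of any evitaDB library.

```javascript
// Hypothetical helper: uppercases the first character of a key part.
function capitalize(text) {
  return text.charAt(0).toUpperCase() + text.slice(1);
}

// Hypothetical helper: joins the optional property type, the optional
// classifier and the required constraint name into a single key.
// The "generic" property type is never written into the key.
function buildConstraintKey({ propertyType, classifier, constraintName }) {
  if (!constraintName) {
    throw new Error("constraintName is required");
  }
  const parts = [];
  if (propertyType && propertyType !== "generic") {
    parts.push(propertyType);
  }
  if (classifier) {
    parts.push(classifier);
  }
  parts.push(constraintName);
  // The first part stays as written; every following part is capitalized.
  return parts
    .map((part, index) => (index === 0 ? part : capitalize(part)))
    .join("");
}

const deviceTypeKey = buildConstraintKey({
  propertyType: "attribute",
  classifier: "deviceType",
  constraintName: "equals",
});

// Use the generated key as a property of the parent container.
const simpleFilter = { [deviceTypeKey]: "phone" };
```

In practice the API schema already lists every valid key, so a helper like this is useful mainly for understanding how the keys are put together.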

As mentioned above, JSON objects don't have names, and we can't define the constraint key in the body of a generic JSON object because we would lose the strictly-typed query language backed by the API schema. Instead, the key is defined as a property in the parent container (parent JSON object). Such containers contain all possible constraints in a given context. These containers also contain some generic constraints such as `and`, `or` or `not` that accept inner containers to combine constraints into trees of complex queries.

However, this complicates things when you need to pass child constraints as arguments of another constraint. You cannot simply pass the object representing the constraint; you need to wrap it in the above-mentioned container with all available constraints to be able to define the constraint key. We call these necessary wrapping containers implicit containers, and they can look like this:

{
  and: ...,
  or: ...,
  attributeCodeEquals: ...,
  ...
}

Unfortunately, this means that you can define multiple constraints at once in such a container, so we need to define the relational logic between these child constraints. In filter containers, we chose to have all implicit containers define a logical AND between the passed child constraints, ultimately resulting in the `and` constraint under the hood. In order containers, the children behave the same as they would if they were passed separately as an array of implicit containers. However, if multiple constraints are passed in a single container, there is no guarantee that the order of the constraints will be preserved. This is a "limitation" of JSON objects, which don't have a defined order of properties. Therefore, you should always wrap each order constraint in a separate implicit container and pass them as an array to the parent order constraint.
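The recommendation for order constraints can be sketched in JavaScript as follows. `toOrderContainers` is a hypothetical helper, and the constraint keys and values used here are assumed for illustration only; the real ones come from your API schema.

```javascript
// Hypothetical helper: turns an ordered list of [key, value] pairs into
// an array of implicit containers with exactly one constraint each.
// Arrays keep their element order in JSON, unlike object properties.
function toOrderContainers(orderedConstraints) {
  return orderedConstraints.map(([key, value]) => ({ [key]: value }));
}

// The keys and values below are assumptions made for this example.
const orderBy = toOrderContainers([
  ["attributeNameNatural", "ASC"],
  ["attributeCodeNatural", "DESC"],
]);
```

Because every container holds a single constraint, the position in the array is the only thing that determines the ordering priority.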

Unfortunately, there is another small drawback if you need to define the same constraint multiple times in a single list with different arguments. In such a case, you need to wrap each such constraint in a separate implicit container and pass them as an array to the parent constraint, like so:

or: [
  {
    attributeCodeStartsWith: "ipho"
  },
  {
    attributeCodeStartsWith: "sams"
  }
]

This is mainly because JSON objects don't support multiple properties with the same name.
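A short JavaScript sketch shows both the problem and the workaround. `repeatConstraint` is a hypothetical helper introduced only for this example.

```javascript
// Parsing JSON with a repeated property name keeps only the last value,
// so the first prefix is silently lost.
const collapsed = JSON.parse(
  '{"attributeCodeStartsWith": "ipho", "attributeCodeStartsWith": "sams"}'
);

// Hypothetical helper: wraps each value in its own implicit container
// so the same constraint key can be used several times.
function repeatConstraint(key, values) {
  return values.map((value) => ({ [key]: value }));
}

const prefixFilter = {
  or: repeatConstraint("attributeCodeStartsWith", ["ipho", "sams"]),
};
```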

Example of a complex constraint

A complex constraint tree that combines simple constraints, containers, and implicit containers to return only entities that either have specific primary keys or satisfy a combination of other constraints:

filterBy: {
   or: [
      {
         entityPrimaryKeyInSet: [100, 200]
      },
      {
         attributeCodeStartsWith: "ipho",
         hierarchyCategoryWithin: {
            ofParent: {
                entityPrimaryKeyInSet: [20]
            }
         },
         priceBetween: ["100.0", "250.0"]
      }
   ]
}

Handling null values

Each constraint value is defined as nullable, and passing null as a constraint value has special meaning. Such a constraint is excluded from the query as if it were never there. So if you write the following query:

filterBy: {
  entityPrimaryKeyInSet: [100, 200],
  attributeCodeEquals: null
}

evitaDB will exclude `attributeCodeEquals` from the query because its value is null, and will use the following modified query instead:

filterBy: {
  entityPrimaryKeyInSet: [100, 200]
}

This is supported for all constraint value types: scalars, containers, and wrapper objects.
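To make the effect concrete, the following sketch imitates the described behavior on the client side for a single container. It is only an illustration of the outcome, not evitaDB's implementation; `dropNullConstraints` is a hypothetical helper, and it is deliberately shallow, so it does not touch inner properties of nested objects.

```javascript
// Hypothetical helper: returns a copy of the container without the
// constraints whose value is null. Only the top level is inspected.
function dropNullConstraints(container) {
  return Object.fromEntries(
    Object.entries(container).filter(([, value]) => value !== null)
  );
}

const effectiveFilter = dropNullConstraints({
  entityPrimaryKeyInSet: [100, 200],
  attributeCodeEquals: null,
});
```

You don't need to do this yourself before sending a query; the sketch only shows which constraints remain in effect.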

Note: inner properties of wrapper objects that aren't nested constraint containers don't behave this way. Instead, they are defined with proper nullability based on the constraint definition in the evitaDB core.

Why is this useful?

Imagine you construct queries dynamically in a JavaScript application and pass them as variables into GraphQL queries. Typically, some constraints should be part of the final query only if certain application-specific conditions are met. Thanks to this null behavior, you can easily turn individual constraints on and off in a query without having to specify multiple separate queries:

// desiredPriceRange is expected to be declared elsewhere in the application;
// it is either a price range or undefined/null when no range was chosen
let filterBy = {
  entityPrimaryKeyInSet: [100, 200],
  // falls back to null, which removes the constraint from the query
  priceBetween: (desiredPriceRange != undefined ? desiredPriceRange : null)
}

Want to know more about the decisions behind the query language design?

We have written a blog post about how we approached representing the evitaDB query language in the APIs, including the possible syntax variants, the limitations of JSON, etc.