rdelbru edited this page Nov 16, 2011 · 11 revisions

SIREn Query Components

SIREn provides a set of query components for performing operations over the content and structure of the tuple table. These query components are the building blocks for writing semi-structured search queries.

Searching Content using Primitive Queries

SIREn currently provides three primitive query operators to access (and search) the content of a tuple table. These operators provide the following basic operations:

  • SirenTermQuery: performs a term lookup, similarly to the original Lucene TermQuery;
  • SirenPhraseQuery: performs a phrase query, similarly to the original Lucene PhraseQuery;
  • SirenBooleanQuery: performs a boolean query by combining primitive queries with unary boolean operators. The interface is similar to the Lucene BooleanQuery with the possibility of adding multiple clauses using the SirenBooleanQuery.add(SirenPrimitiveQuery query, Occur occur) method.
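The clause semantics of such a boolean operator can be sketched with a small toy model. This is not SIREn code: the class OccurSketch and its members are hypothetical, a cell is reduced to a set of terms, and the sketch assumes that SirenBooleanQuery follows the usual Lucene convention (every MUST clause matches, no MUST_NOT clause matches, and at least one SHOULD clause matches when there is no MUST clause).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy model, not SIREn code: a cell is reduced to the set of terms it contains.
final class OccurSketch {

  enum Occur { MUST, SHOULD, MUST_NOT }

  // Hypothetical stand-in for a term clause added to a boolean query.
  static final class Clause {
    final String term;
    final Occur occur;

    Clause(String term, Occur occur) {
      this.term = term;
      this.occur = occur;
    }
  }

  // Assumed Lucene-style semantics: all MUST clauses match, no MUST_NOT
  // clause matches, and without any MUST clause at least one SHOULD matches.
  static boolean matches(Set<String> cellTerms, Clause... clauses) {
    boolean hasMust = false;
    boolean shouldHit = false;
    for (Clause clause : clauses) {
      final boolean present = cellTerms.contains(clause.term);
      switch (clause.occur) {
        case MUST:
          hasMust = true;
          if (!present) {
            return false;
          }
          break;
        case MUST_NOT:
          if (present) {
            return false;
          }
          break;
        case SHOULD:
          shouldHit |= present;
          break;
      }
    }
    return hasMust || shouldHit;
  }

  public static void main(String[] args) {
    final Set<String> cell = new HashSet<>(Arrays.asList("renaud", "delbru"));
    // One of the two SHOULD clauses matches, so the cell matches.
    System.out.println(matches(cell,
        new Clause("renaud", Occur.SHOULD),
        new Clause("http://renaud.delbru.fr/rdf/foaf#me", Occur.SHOULD)));
    // The MUST_NOT clause matches, so the cell is rejected.
    System.out.println(matches(cell,
        new Clause("renaud", Occur.MUST),
        new Clause("delbru", Occur.MUST_NOT)));
  }
}
```

The first call prints true and the second prints false. A query made only of MUST_NOT clauses matches nothing in this model, which mirrors the Lucene behaviour.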

In future SIREn releases, more advanced primitive operators, such as fuzzy or prefix queries, will be available.

These operators can then be combined with higher level operators, such as SirenCellQuery and SirenTupleQuery presented next, in order to create semi-structured queries.

Restricting Search within a Cell

The SirenCellQuery wraps a SIREn primitive query component (e.g., SirenTermQuery, SirenPhraseQuery or SirenBooleanQuery) and provides a method, SirenCellQuery.setConstraint(int index), to add a cell index constraint. For example, in the N-Triples tuple table example, the cell index of a subject is always 0. When trying to match the subject cell, all matching cells with an index other than 0 should be discarded. This is illustrated in the example below. The constraint is not limited to a single index: it can be expressed as an interval using SirenCellQuery.setConstraint(int start, int end) in order to search multiple cells at the same time.

// Create a cell query matching either the keyword "renaud" or the full URI "http://renaud.delbru.fr/rdf/foaf#me" at the subject position (cell 0)
// DEFAULT_FIELD is assumed to be a String constant holding the name of the indexed field
final SirenBooleanQuery bq = new SirenBooleanQuery();
bq.add(new SirenTermQuery(new Term(DEFAULT_FIELD, "renaud")), SirenBooleanClause.Occur.SHOULD);
bq.add(new SirenTermQuery(new Term(DEFAULT_FIELD, "http://renaud.delbru.fr/rdf/foaf#me")), SirenBooleanClause.Occur.SHOULD);
// Constrain the cell index to 0 (first column: subject position)
final SirenCellQuery cq = new SirenCellQuery(bq);
cq.setConstraint(0);
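The effect of a cell index constraint, including the interval form, can be illustrated with a toy model. This is not SIREn code: the class CellConstraintSketch and the method matchingCells are hypothetical, substring containment stands in for term matching, and both interval bounds are assumed to be inclusive, which the text above does not state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model, not SIREn code: a tuple is a list of cell strings.
final class CellConstraintSketch {

  // Returns the indexes of the cells that contain `term` and whose index
  // lies in the interval [start, end] (bounds assumed inclusive).
  static List<Integer> matchingCells(List<String> tuple, String term, int start, int end) {
    final List<Integer> hits = new ArrayList<>();
    for (int i = 0; i < tuple.size(); i++) {
      final boolean inRange = i >= start && i <= end;
      if (inRange && tuple.get(i).contains(term)) {
        hits.add(i);
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    final List<String> triple = Arrays.asList(
        "http://renaud.delbru.fr/rdf/foaf#me", "name", "renaud delbru");
    // Single index 0: only the subject cell is kept.
    System.out.println(matchingCells(triple, "renaud", 0, 0)); // [0]
    // Interval 0..2: the object cell matches as well.
    System.out.println(matchingCells(triple, "renaud", 0, 2)); // [0, 2]
  }
}
```

With the constraint set to 0, the match in the object cell is discarded even though that cell contains the keyword.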

Combining Cells into Tuples

A SirenCellQuery allows one to express a search over the content of a cell. Multiple cell query components can be combined to form a "tuple query" using the SirenTupleQuery component. A tuple query retrieves tuples matching a boolean combination of the cell queries. The SirenTupleQuery provides an interface similar to the Lucene BooleanQuery: clauses are added using the SirenTupleQuery.add(SirenCellQuery query, Occur occur) method.

Since 0.2, the SirenTupleQuery provides a method, SirenTupleQuery.setConstraint(int index), to add a tuple index constraint. As with the SirenCellQuery, the constraint is not limited to a single index: it can be expressed as an interval using SirenTupleQuery.setConstraint(int start, int end) in order to restrict the search to multiple tuples at the same time.

// Simple tuple query that looks up the triple pattern (*, name, "renaud delbru")

// Create a cell query matching "name"
final SirenBooleanQuery bq1 = new SirenBooleanQuery();
bq1.add(new SirenTermQuery(new Term(DEFAULT_FIELD, "name")), SirenBooleanClause.Occur.MUST);
// Constrain the cell index to 1 (second column: predicate position)
final SirenCellQuery cq1 = new SirenCellQuery(bq1);
cq1.setConstraint(1);

// Create a cell query matching the phrase "renaud delbru"
final SirenPhraseQuery pq = new SirenPhraseQuery();
pq.add(new Term(DEFAULT_FIELD, "renaud"));
pq.add(new Term(DEFAULT_FIELD, "delbru"));
final SirenBooleanQuery bq2 = new SirenBooleanQuery();
bq2.add(pq, SirenBooleanClause.Occur.MUST);
// Constrain the cell index to 2 (third column: object position)
final SirenCellQuery cq2 = new SirenCellQuery(bq2);
cq2.setConstraint(2);

// Create a tuple query that combines the two cell queries
final SirenTupleQuery tq = new SirenTupleQuery();
tq.add(cq1, SirenTupleClause.Occur.MUST);
tq.add(cq2, SirenTupleClause.Occur.MUST);
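What it means for a tuple to match a combination of cell queries can be sketched as follows. This is a toy model, not SIREn code: TupleMatchSketch and CellCondition are hypothetical names, substring containment stands in for term and phrase matching, and every clause is treated as MUST. The point it shows is that all cell conditions are evaluated against the same tuple.

```java
import java.util.Arrays;
import java.util.List;

// Toy model, not SIREn code: a tuple is a list of cell strings.
final class TupleMatchSketch {

  // Hypothetical stand-in for a constrained cell query:
  // the cell at position `index` must contain `text`.
  static final class CellCondition {
    final int index;
    final String text;

    CellCondition(int index, String text) {
      this.index = index;
      this.text = text;
    }

    boolean matches(List<String> tuple) {
      return index < tuple.size() && tuple.get(index).contains(text);
    }
  }

  // All conditions are treated as MUST clauses evaluated against one tuple.
  static boolean tupleMatches(List<String> tuple, List<CellCondition> conditions) {
    for (CellCondition condition : conditions) {
      if (!condition.matches(tuple)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Pattern (*, name, "renaud delbru")
    final List<CellCondition> pattern = Arrays.asList(
        new CellCondition(1, "name"),
        new CellCondition(2, "renaud delbru"));
    System.out.println(tupleMatches(
        Arrays.asList("me", "name", "renaud delbru"), pattern)); // true
    System.out.println(tupleMatches(
        Arrays.asList("me", "nick", "renaud delbru"), pattern)); // false
  }
}
```

The second tuple is rejected because its predicate cell does not match, even though its object cell does.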

Combining Tuples, Cells and Primitive Queries with Lucene Operators

SirenTupleQuery, as well as SirenCellQuery and SirenPrimitiveQuery, can be combined using the Lucene BooleanQuery, which allows one to express more advanced queries, e.g., for matching entities. The example query below retrieves all entities that work at "DERI" and have a property labeled "name" with the value "Renaud Delbru".

// Complex query that matches: (*, name, "renaud delbru") AND (*, workplace, deri)

// Create a cell query matching "workplace"
final SirenBooleanQuery bq1 = new SirenBooleanQuery();
bq1.add(new SirenTermQuery(new Term(DEFAULT_FIELD, "workplace")), SirenBooleanClause.Occur.MUST);
// Constrain the cell index to 1 (second column: predicate position)
final SirenCellQuery cq1 = new SirenCellQuery(bq1);
cq1.setConstraint(1);

// Create a cell query matching "deri"
final SirenBooleanQuery bq2 = new SirenBooleanQuery();
bq2.add(new SirenTermQuery(new Term(DEFAULT_FIELD, "deri")), SirenBooleanClause.Occur.MUST);
// Constrain the cell index to 2 (third column: object position)
final SirenCellQuery cq2 = new SirenCellQuery(bq2);
cq2.setConstraint(2);

// Create a tuple query that combines the two cell queries
final SirenTupleQuery tq = new SirenTupleQuery();
tq.add(cq1, SirenTupleClause.Occur.MUST);
tq.add(cq2, SirenTupleClause.Occur.MUST);

// Combine two tuple queries with a Lucene boolean query
final BooleanQuery q = new BooleanQuery();
q.add(tq, Occur.MUST);
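// Add the tuple query (*, name, "renaud delbru") from the previous example;
// getQuery2() is assumed to be a helper method that builds and returns it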
q.add(this.getQuery2(), Occur.MUST);
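The difference between matching inside a tuple and matching at the entity level can be sketched with a toy model. This is not SIREn code: EntityMatchSketch and its methods are hypothetical, an entity is modeled as a list of tuples, and substring containment stands in for term matching. Both tuple patterns are required, but each one may be satisfied by a different tuple of the same entity.

```java
import java.util.Arrays;
import java.util.List;

// Toy model, not SIREn code: an entity is a list of tuples,
// and a tuple is a list of cell strings.
final class EntityMatchSketch {

  // True when the tuple has `predicate` in cell 1 and `object` in cell 2.
  static boolean tupleMatches(List<String> tuple, String predicate, String object) {
    return tuple.size() > 2
        && tuple.get(1).contains(predicate)
        && tuple.get(2).contains(object);
  }

  // True when at least one tuple of the entity matches the pattern.
  static boolean anyTuple(List<List<String>> entity, String predicate, String object) {
    for (List<String> tuple : entity) {
      if (tupleMatches(tuple, predicate, object)) {
        return true;
      }
    }
    return false;
  }

  // Both patterns are required (MUST), but each may be satisfied
  // by a different tuple of the same entity.
  static boolean entityMatches(List<List<String>> entity) {
    return anyTuple(entity, "name", "renaud delbru")
        && anyTuple(entity, "workplace", "deri");
  }

  public static void main(String[] args) {
    final List<List<String>> entity = Arrays.asList(
        Arrays.asList("me", "name", "renaud delbru"),
        Arrays.asList("me", "workplace", "deri"));
    System.out.println(entityMatches(entity));               // true
    System.out.println(entityMatches(entity.subList(0, 1))); // false
  }
}
```

The second call fails because the entity, reduced to its first tuple, no longer has a tuple matching the workplace pattern.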