-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Star Tree Request/Response structure #227
Conversation
// Check if the query builder is an instance of TermQueryBuilder | ||
if (queryBuilder instanceof TermQueryBuilder) { | ||
TermQueryBuilder tq = (TermQueryBuilder) queryBuilder; | ||
String field = tq.fieldName(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be converted to dimension field name of star tree?
long inputQueryVal = Long.parseLong(tq.value().toString()); | ||
|
||
// Get or create the list of predicates for the given field | ||
List<Predicate<Long>> predicates = predicateMap.getOrDefault(field, new ArrayList<>()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we use predicates, we can't use binary search during star tree traversal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't really put much thought into this yet, let me revisit the part on how to better store information to use filters. For request/response parsing I had just inspired changes from previous POCs.
@Override | ||
public void collect(int doc, long bucket) throws IOException { | ||
// TODO: Fix the response for collecting star tree sum | ||
sums = bigArrays.grow(sums, bucket + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we extract out the default implementation getDefaultLeafCollector
and reuse the same logic.
I really like the approach of reusing the existing aggregators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'll try and do that, I am inclined to on refactoring & re-using same implementations wherever possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
final SortedNumericDoubleValues values = valuesSource.doubleValues(ctx);
we can check if we are able to get sortedNumericDoubleValues , otherwise we need to convert to double for each doc
|
||
protected StarTreeValues getStarTreeValues(LeafReaderContext ctx, CompositeIndexFieldInfo starTree) throws IOException { | ||
SegmentReader reader = Lucene.segmentReader(ctx.reader()); | ||
if (!(reader.getDocValuesReader() instanceof CompositeIndexReader)) return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to see if its better to load them as doubleValuesSource similar to how existing fields are loaded. And that too load the specific fields requested instead of loading the entire star tree values. ( for example in sum aggregator, we can fetch the doubleFieldData of a particular field of star tree metric , for eg : sum_status_metric can be loaded in )
Then you don't need to worry about conversion either.
// Can be marked false for majority cases for which star-tree cannot be used | ||
// Will save checking the criteria later and we can have a limit on what search requests are supported | ||
// As we increment the cases where star-tree can be used, this can be set back to true | ||
boolean canUseStarTree = context.mapperService().isCompositeIndexPresent(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be based on dimensions and metric rather ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just initialization, will toggle this off for multiple cases. (TODO-in this PR iteself)
Signed-off-by: Sandesh Kumar <[email protected]>
if (queryBuilder instanceof TermQueryBuilder) { | ||
TermQueryBuilder tq = (TermQueryBuilder) queryBuilder; | ||
String field = tq.fieldName(); | ||
long inputQueryVal = Long.parseLong(tq.value().toString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Convert to sortable long
Signed-off-by: Sandesh Kumar <[email protected]>
kahanSummation.reset(sum, compensation); | ||
|
||
for (int i = 0; i < valuesCount; i++) { | ||
double value = Double.longBitsToDouble(dv.nextValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we do this ?
public static Double sortableLongtoDouble(Long value) {
return NumericUtils.sortableLongToDouble(value);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also lets see if we can get double sorted numeric dv ahead
@Override
public SortedNumericDoubleValues getDoubleValues() {
try {
SortedNumericDocValues raw = DocValues.getSortedNumeric(reader, field);
return FieldData.sortableLongBitsToDoubles(raw);
} catch (IOException e) {
throw new IllegalStateException("Cannot load doc values", e);
}
}
Closing this PR in lieu of opensearch-project#15289 |
Description
For an index supporting star-tree composite index, this changes tries to achieve resolving a metric aggregation with/without a numeric terms query with the help of star-tree.
This PR in present state primarily focuses on flow of information request to response. Therefore, the code pieces around calculating the correct response values are still inaccurate and are in WIP.
This PR depends on opensearch-project#14809, therefore the depending unmerged changes have been utilized for now in my private fork to discuss the changes in parallel.
Approach
A new StarTreeQuery is introduced which helps resolve to star-tree documents. This star-tree query is formed (if it can be) at the shard level, this is not done at coordinator level to avoid node to node transportation. Also, all the information is present at shard level and OpenSearch does majority of query rewrite at shard level itself. This star tree query is encapsulated in an OriginalOrStarTreeQuery which helps preserve the original query alongwith the new star tree query. This encapsulation is done so as to preserve both the queries and decision whether to use which query can be taken at a segment level.
Example query shape:
Request:
Response:
Star Tree Flow Response:
Approach:
Related Issues
opensearch-project#15257
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.