Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GITHUB-11915] [Draft / Discussion Only] Add benchmark tasks for GITHUB-11915 changes to facilitate discussion #213

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

zacharymorn
Copy link
Contributor

This is a draft PR to provide benchmark result for apache/lucene#12194

@zacharymorn zacharymorn marked this pull request as draft March 20, 2023 06:36
@@ -14409,8 +14409,6 @@ BrowseDayOfYearTaxoFacets: *:* +facets:DayOfYear.taxonomy
BrowseDateSSDVFacets: *:* +facets:Date.sortedset
BrowseMonthSSDVFacets: *:* +facets:Month.sortedset
BrowseDayOfYearSSDVFacets: *:* +facets:DayOfYear.sortedset
BrowseRandomLabelTaxoFacets: *:* +facets:RandomLabel.taxonomy
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to remove these two tasks as index name was getting too long when I tried to include sorting

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought @zhaih had made a change to stop building such massive filenames -- maybe it's opt-in (you have to provide your own index name) or so?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mikemccand for the suggestion! I'm not aware of this feature and it does seems like an opt-in to me, but I'll dig around and report back.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's actually not added by me, it's there for long time: https://github.com/mikemccand/luceneutil/blob/master/src/python/competition.py#L155

I think just set the name will avoid the overlong auto-generated name.

@@ -698,6 +700,23 @@ public Document nextDoc(DocState doc, boolean expected) throws IOException {
doc.title.setStringValue(title);
doc.randomLabel.setStringValue(randomLabel);
if (addDVFields) {
doc.doc.removeField("quarter");
doc.doc.removeField("quarterBDV");
int randomInt = RANDOM.nextInt(13);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this used by multiple threads? do we really want them synchronizing on the same random instance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rmuir for the review! I was focusing on generating quarter field randomly earlier, and did not realize this is used by multiple threads, but it makes sense to not have threads contention here. I have removed this and related changes to use month for sorting instead per suggestion from apache/lucene#12194 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants