-
Notifications
You must be signed in to change notification settings - Fork 296
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update db covering indexes to have columns likely to be unique leftmost #2712
base: master
Are you sure you want to change the base?
Conversation
77fde5c
to
d9cb381
Compare
to make them covering indexes
d9cb381
to
295cefb
Compare
fdf02e1
to
5a83d72
Compare
5a83d72
to
1e7cf59
Compare
database.execSQL( | ||
"CREATE INDEX IF NOT EXISTS `index_TokenIndexEntity_index_value_resourceType_index_name_resourceUuid` ON `TokenIndexEntity` (`index_value`, `resourceType`, `index_name`, `resourceUuid`);", | ||
) | ||
database.execSQL( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we refactor the code by grouping operations logically and separating concerns into helper functions? For example, create separate functions to drop and create the indices
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, sure. Do you mean grouping by the type of operation or the table? Which one would be better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @LZRS thanks for this draft PR - this is really important!
Some comments on the draft PR: it would be much easier to review smaller changes because this touches so many tables - maybe we can have 1 pr to change the indices and another one to change the queries.
secondly, can you use query planner in android studio to check if the indices are being hit? and do you have actual numbers with regards to time taken for these queries?
@@ -28,7 +28,7 @@ import org.hl7.fhir.r4.model.ResourceType | |||
@Entity( | |||
indices = | |||
[ | |||
Index(value = ["resourceType", "index_name", "index_system", "index_value", "resourceUuid"]), | |||
Index(value = ["index_value", "resourceType", "index_name", "resourceUuid"]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you look at this file https://github.com/google/android-fhir/blob/fbeeb0b4abff0ecf0b133d57326bec37ac1cc920/engine/src/test/java/com/google/android/fhir/search/SearchTest.kt
and search for TokenIndexEntity, you'll find some of the queries generated by our search api. a lot of them have resource type, index name, and index value - so changing this index here will probably make those queries worse.
but - we probably should fix the index, to move up the index value column.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the index_value
to be the leftmost column within the index since it should be the most distinguishable column, and removed the index_system
column since it never gets used within the index because the queries generated use the IFNULL statement that forces evaluation from the actual rows
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given query
SELECT a.*
FROM ResourceEntity a
WHERE a.resourceType = 'Task'
AND a.resourceUuid IN (SELECT resourceUuid
FROM TokenIndexEntity
WHERE resourceType = 'Task'
AND index_name = 'identifier'
AND index_value = 'test-task')
AND a.resourceUuid IN (SELECT resourceUuid
FROM TokenIndexEntity
WHERE resourceType = 'Task'
AND index_name = 'status'
AND (
(index_value = 'ready' AND IFNULL(index_system, '') = 'http://hl7.org/fhir/task-status') OR
((index_value = 'in-progress' AND
IFNULL(index_system, '') = 'http://hl7.org/fhir/task-status') OR
(index_value = 'requested' AND
IFNULL(index_system, '') = 'http://hl7.org/fhir/task-status'))))
The query generated for previous indexes were
QUERY PLAN
|--SEARCH a USING INDEX index_ResourceEntity_resourceType_resourceId (resourceType=?)
|--LIST SUBQUERY 1
| `--SEARCH TokenIndexEntity USING COVERING INDEX index_TokenIndexEntity_resourceType_index_name_index_system_index_value_resourceUuid (resourceType=? AND index_name=?)
`--LIST SUBQUERY 2
`--SEARCH TokenIndexEntity USING COVERING INDEX index_TokenIndexEntity_resourceType_index_name_index_system_index_value_resourceUuid (resourceType=? AND index_name=?)
With the changes,
QUERY PLAN
|--SEARCH a USING INDEX index_ResourceEntity_resourceUuid_resourceType (resourceUuid=? AND resourceType=?)
|--LIST SUBQUERY 1
| `--SEARCH TokenIndexEntity USING COVERING INDEX index_TokenIndexEntity_index_value_resourceType_index_name_resourceUuid (index_value=? AND resourceType=? AND index_name=?)
`--LIST SUBQUERY 2
`--MULTI-INDEX OR
|--INDEX 1
| `--SEARCH TokenIndexEntity USING INDEX index_TokenIndexEntity_index_value_resourceType_index_name_resourceUuid (index_value=? AND resourceType=? AND index_name=?)
|--INDEX 2
| `--SEARCH TokenIndexEntity USING INDEX index_TokenIndexEntity_index_value_resourceType_index_name_resourceUuid (index_value=? AND resourceType=? AND index_name=?)
`--INDEX 3
`--SEARCH TokenIndexEntity USING INDEX index_TokenIndexEntity_index_value_resourceType_index_name_resourceUuid (index_value=? AND resourceType=? AND index_name=?)
Okay, I will do that and share some of the related query plans |
based on https://www.sqlite.org/queryplanner.html#_multi_column_indices and https://www.sqlite.org/optoverview.html
IMPORTANT: All PRs must be linked to an issue (except for extremely trivial and straightforward changes).
Fixes #[issue number]
Description
Refactor db covering indexes to have their leftmost columns to be those most likely to be unique (also mostly likely to be used)
Based on https://www.sqlite.org/queryplanner.html#_multi_column_indices
And https://www.sqlite.org/optoverview.html
Alternative(s) considered
Have you considered any alternatives? And if so, why have you chosen the approach in this PR?
Type
Choose one: (Bug fix | Feature | Documentation | Testing | Code health | Builds | Releases | Other)
Screenshots (if applicable)
Checklist
./gradlew spotlessApply
and./gradlew spotlessCheck
to check my code follows the style guide of this project../gradlew check
and./gradlew connectedCheck
to test my changes locally.