Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow legacy deduplication #741

Open
wants to merge 430 commits into
base: master
Choose a base branch
from
Open

Allow legacy deduplication #741

wants to merge 430 commits into from

Conversation

jh-bate
Copy link
Contributor

@jh-bate jh-bate commented Jun 18, 2024

back-37 - Migrate all existing jellyfish data to add required Platform deduplication hash using the existing generated jellyfish _id
back-2773 - Update platform to validate certain fields in pumpSettingsOverride and overridePresets
back-2728 - Support tandem pumpSettings in the platform
back-2520 - Data model changes to support Control-IQ sleep schedules
BACK-2607 - Bolus includes deliveryContext
BACK-3074 - basal/suspend to remove the duration maximum validation

@tidebot
Copy link
Collaborator

tidebot commented Oct 8, 2024

jh-bate updated values.yaml file in qa3

@tidebot
Copy link
Collaborator

tidebot commented Oct 8, 2024

jh-bate updated flux policies file in qa3

@tidebot
Copy link
Collaborator

tidebot commented Oct 8, 2024

jh-bate deployed platform jf_deduplication branch to qa3 namespace

@jh-bate
Copy link
Contributor Author

jh-bate commented Oct 8, 2024

/deploy qa3 data

@tidebot
Copy link
Collaborator

tidebot commented Oct 8, 2024

jh-bate updated values.yaml file in qa3

@tidebot
Copy link
Collaborator

tidebot commented Oct 8, 2024

jh-bate updated flux policies file in qa3

@tidebot
Copy link
Collaborator

tidebot commented Oct 8, 2024

jh-bate deployed platform jf_deduplication branch to qa3 namespace

Copy link
Contributor

@darinkrauss darinkrauss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of questions and comments, but it is a lot of code. :)

data/blood/glucose/test/glucose.go Outdated Show resolved Hide resolved
dexcom/fetch/translate.go Outdated Show resolved Hide resolved
data/types/common/day.go Outdated Show resolved Hide resolved
data/types/common/day.go Outdated Show resolved Hide resolved
data/types/common/day_test.go Outdated Show resolved Hide resolved
data/datum.go Outdated Show resolved Hide resolved
data/deduplicator/deduplicator/hash.go Outdated Show resolved Hide resolved

func (m *DataMigration) writeErrors() {
if !m.settings.writeToDisk {
m.groupedErrors = map[string][]ErrorData{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this return here?

return nil
}

func (c MigrationStats) report() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe put this with the rest of MigrationStats?

@@ -0,0 +1,175 @@
package utils
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What all are we doing here? Just adding the deduplicator name, version, and hash, right? Can we just replace this with a direct Mongo update?

Copy link
Contributor Author

@jh-bate jh-bate Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I am beginning to wonder if it is even required

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after discussion with Jimmy and how he ran the test I don't think this is required any longer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't you keep this code in a separate saved feature branch, just in case.

@jh-bate
Copy link
Contributor Author

jh-bate commented Oct 30, 2024

/deploy qa3 data

@tidebot
Copy link
Collaborator

tidebot commented Oct 30, 2024

jh-bate updated values.yaml file in qa3

@tidebot
Copy link
Collaborator

tidebot commented Oct 30, 2024

jh-bate updated flux policies file in qa3

@tidebot
Copy link
Collaborator

tidebot commented Oct 30, 2024

jh-bate deployed platform jf_deduplication branch to qa3 namespace

Copy link
Contributor

@darinkrauss darinkrauss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more changes and clarifications. Plus, one thing we should chat about regarding the deduplication version versus the identity fields version.

Comment on lines 281 to 286

if filter.IsLegacy != nil && *filter.IsLegacy {
selector["_id"] = bson.M{"$not": bson.M{"$type": "objectId"}}
if filter.LegacyOnly != nil {
if *filter.LegacyOnly {
selector["_id"] = bson.M{"$not": bson.M{"$type": "objectId"}}
} else {
selector["_id"] = bson.M{"$type": "objectId"}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, it was either rename it to LegacyOnly OR change the code. I'd revert this code back to what it was deleted here.

data/types/settings/pump/sleep_schedule.go Show resolved Hide resolved
data/types/settings/pump/sleep_schedule.go Show resolved Hide resolved
Comment on lines 71 to 73
if b.DeliveryContext != nil {
validator.String("deliveryContext", b.DeliveryContext).Exists().OneOf(DeliveryContext()...)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should just change to:

Suggested change
if b.DeliveryContext != nil {
validator.String("deliveryContext", b.DeliveryContext).Exists().OneOf(DeliveryContext()...)
}
validator.String("deliveryContext", b.DeliveryContext).OneOf(DeliveryContext()...)

There is no need for the b.DeliveryContext != nil. The validator handles that via the Exists test, which we also don't want because this field is optional.

Comment on lines 16 to 18
DeviceDeactivateHashVersionUnknown string = ""
DeviceDeactivateHashVersionCurrent string = types.IdentityFieldsVersion
DeviceDeactivateHashVersionLegacy string = types.LegacyIdentityFieldsVersion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You actually don't need the string type as it is implicit.

Suggested change
DeviceDeactivateHashVersionUnknown string = ""
DeviceDeactivateHashVersionCurrent string = types.IdentityFieldsVersion
DeviceDeactivateHashVersionLegacy string = types.LegacyIdentityFieldsVersion
DeviceDeactivateHashVersionUnknown = ""
DeviceDeactivateHashVersionCurrent = types.IdentityFieldsVersion
DeviceDeactivateHashVersionLegacy = types.LegacyIdentityFieldsVersion

Comment on lines 73 to 75
if *dataSet.Deduplicator.Version != "" {
return *dataSet.Deduplicator.Version
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need this conditional. It is checked two lines above.

Suggested change
if *dataSet.Deduplicator.Version != "" {
return *dataSet.Deduplicator.Version
}
return *dataSet.Deduplicator.Version

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think may need to modify this. Since we are just returning the *dataSet.Deduplicator.Version here it is very possible is not either either the current or the legacy and the logic would break. I think we may need to break the intertwined DeviceDeactivateHashVersionLegacy and types.LegacyIdentityFieldsVersion.

Let's chat about this next week.

}
}

var missingLegacyGroupIdErr = errors.New("missing required legacy groupId for the device deactive hash legacy version")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't really want to create static error variables since it loses the context of where the error actually originated from (the caller). Just take the right side and replace it in the two locations of missingLegacyGroupIdErr.

return missingLegacyGroupIdErr
}
case DeviceDeactivateHashVersionCurrent:
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to ensure there is no LegacyGroupID specified here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it does no harm

data/types/bolus/bolus.go Outdated Show resolved Hide resolved
Copy link
Contributor

@darinkrauss darinkrauss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need to make some changes to the versions. Let's chat tomorrow.

@@ -24,8 +24,6 @@ func NewBase(name string, version string) (*Base, error) {
}
if version == "" {
return nil, errors.New("version is missing")
} else if !net.IsValidSemanticVersion(version) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why removed?

DeviceDeactivateHashVersionUnkown DeviceDeactivateHashVersion = ""
DeviceDeactivateHashVersionCurrent DeviceDeactivateHashVersion = "1.1.0"
DeviceDeactivateHashVersionLegacy DeviceDeactivateHashVersion = "0.0.0"
DeviceDeactivateHashVersionUnknown = "unknown-hash"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I think we have a disconnect. I may not have explained myself very well. These version strings need to remain the same.

There need to be two separate version groups.

First, the version of the DeviceDeactivationHash deduplicator. That would be "0.0.0" (for Jellyfish) and "1.1.0" (for current Platform).

Second, the version of the identity fields, that would be, for simplicity, "0" (for legacy) and "1" (for current). This is what is passed into the datum.IdentityFields function.

The hash function and HashOptions, since it is not specific to the DeviceDeactivateHash deduplicator, should use the identity fields version.

Or, for complete separation of concerns, you could add a third version, the hash version, either "0" or "1".

So, then the code maps from deduplicator version to hash version to identity fields version.

The DeviceDeactivateHash code looks at its version, maps its version to a hash version and creates the HashOptions with that version. Then, the hash code looks at its version, maps its version to an identity field versions and calls IdentityFields with that identity field version.

Then, you've completely isolated the DeviceDeactivationHash deduplicator from the hash functionality from the identity fields functionality. In practice, we definitely have to do the first step (deduplicator version to hash version) because there will be another duplicator for continuous data sets that is not based off the device id, but the data set id, and doesn't deactivate, but deletes.

Technically, we don't actually need the hash version and could just use the identity fields version for both hash function and IdentityFields function.

Does that make sense? Let's chat about this tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants