Add multi-source data principle #5763

sffc · 2024-11-01T23:40:22Z

I have talked about this principle before, but I couldn't find it written down.

It came up in #5759.

robertbastian · 2024-11-03T15:03:03Z

documents/design/principles.md

@@ -58,6 +58,23 @@ ICU4C/ICU4J exposes certain pieces of data through user-facing APIs such as Date

 Runtime customizability of locale data can sometimes come at a performance or memory cost.

+## Locale data from multiple sources works seamlessly
+
+*What:* If data is available for a particular constructor and locale, the resulting behavior should not change based on where the data was sourced, with a narrow exception for data that primarily impacts performance characteristics.


This basically says that behaviour can never change, even if CLDR data changes. If I'm sourcing from a more recent CLDR, behaviour should often change.

Yeah good point; that obviously wasn't the intention. Is this better?

Manishearth · 2024-11-04T17:58:04Z

documents/design/principles.md

@@ -58,6 +58,23 @@ ICU4C/ICU4J exposes certain pieces of data through user-facing APIs such as Date

 Runtime customizability of locale data can sometimes come at a performance or memory cost.

+## Locale data from multiple sources works seamlessly
+
+*What:* If data is available for a particular constructor and locale, the correctness of behavior should not change based on where the data was sourced, with a narrow exception for data that primarily impacts performance characteristics.


I had the same reaction as @robertbastian on reading this updated text so I think it still needs work. CLDR updates fix correctness all the time.

Perhaps just have an explicit exception for outdated/buggy data that was fixed in new data source versions?

Also, should this be constructor, locale, and key attributes? My understanding was that we were open to having people filter out e.g. unit data based on key attributes.

Though this means that the "don't do this" example below could still be done by having a "common" and "extended" key attribute. That's a bit of a misuse of key attributes, maybe.

One issue is that the word "source" is overloaded. It has long meant "the runtime data provider that ultimately reads data", such as a blob provider or bake provider. However, now we also have icu_provider_source, which means "the data provider that reads from CLDR/ICU/LSTM". In this principle, I'm primarily referring to the first version: if you mix a baked, blob, and fs provider, which were built with different datagen settings, there should not be an observable difference in behavior.

I don't think that was the ambiguity that was tripping me up; it is more that when you talk about "where the data is sourced", that can absolutely involve data that is outdated, etc.

One of the main reasons to load data on demand will be in cases where data is more likely to become outdated.
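For illustration, the "runtime data provider" reading of the principle described above can be sketched as follows. This is a minimal sketch, not the `icu_provider` API: the `Provider` trait, `MapProvider`, `load_from_chain`, and `agree_on_overlap` are hypothetical names invented here, and the sample strings are made up. It also assumes both providers were generated from the same upstream data version; as noted in this thread, a newer CLDR can legitimately change the output.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a runtime data provider (baked, blob, fs, ...).
// This is NOT the icu_provider API; all names here are invented.
trait Provider {
    fn load(&self, locale: &str) -> Option<String>;
}

// A toy provider backed by a map from locale to payload.
struct MapProvider {
    data: HashMap<&'static str, &'static str>,
}

impl MapProvider {
    fn new(entries: &[(&'static str, &'static str)]) -> Self {
        Self { data: entries.iter().copied().collect() }
    }
}

impl Provider for MapProvider {
    fn load(&self, locale: &str) -> Option<String> {
        self.data.get(locale).map(|s| s.to_string())
    }
}

// Try each provider in order and return the first one that has data.
fn load_from_chain(chain: &[&dyn Provider], locale: &str) -> Option<String> {
    chain.iter().find_map(|p| p.load(locale))
}

// True if every locale covered by BOTH providers yields identical data.
fn agree_on_overlap(a: &dyn Provider, b: &dyn Provider, locales: &[&str]) -> bool {
    locales.iter().all(|loc| match (a.load(loc), b.load(loc)) {
        (Some(x), Some(y)) => x == y,
        _ => true, // only one (or neither) has data: nothing to compare
    })
}

fn main() {
    // Two providers built with different locale sets (hypothetical data).
    let baked = MapProvider::new(&[("en", "January"), ("fr", "janvier")]);
    let blob = MapProvider::new(&[("fr", "janvier"), ("de", "Januar")]);
    let locales = ["en", "fr", "de"];

    assert!(agree_on_overlap(&baked, &blob, &locales));

    // If the providers agree on their overlap, chain order is unobservable.
    let baked_first: [&dyn Provider; 2] = [&baked, &blob];
    let blob_first: [&dyn Provider; 2] = [&blob, &baked];
    for &loc in locales.iter() {
        assert_eq!(
            load_from_chain(&baked_first, loc),
            load_from_chain(&blob_first, loc)
        );
    }
}
```

In this sketch, the "don't do this" case corresponds to two providers that both claim a locale but return different payloads for it, which makes `agree_on_overlap` return `false` and makes the result depend on chain order.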

Manishearth · 2024-11-04T18:01:12Z

I'm in favor of the principle, but I think the wording needs more work.

Add multi-source data principle

9370d9c

sffc requested a review from a team as a code owner November 1, 2024 23:40

sffc mentioned this pull request Nov 1, 2024

Deduplicate tz locations against root #5759

Merged

sffc requested review from Manishearth and robertbastian November 2, 2024 00:16

robertbastian reviewed Nov 3, 2024

Update principles.md

3ea0f1c

robertbastian approved these changes Nov 4, 2024

Manishearth reviewed Nov 4, 2024
