Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use pattern in icu_decimal #5385

Closed
wants to merge 4 commits into from

Conversation

robertbastian
Copy link
Member

use core::{
convert::Infallible,
fmt::{self, Write},
marker::PhantomData,
};
use writeable::{adapters::TryWriteableInfallibleAsWriteable, PartsWrite, TryWriteable, Writeable};

#[allow(clippy::unwrap_used, clippy::indexing_slicing)]
impl<Store> SinglePlaceholderPattern<Store> where Store: Deref<Target = str> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sffc how do I do this nicely? Are there encoding helpers?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use Pattern::try_from_items and then Pattern::take_store

#[cfg_attr(feature = "serde", serde(borrow))]
pub minus_sign_affixes: AffixesV1<'data>,
pub minus_sign_pattern: Option<SinglePlaceholderPattern<Cow<'data, str>>>,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strictly speaking this should use a pattern type that requires exactly one placeholder

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you make .suffix() return an Option, you can debug_assert that it is Some.

@robertbastian robertbastian marked this pull request as ready for review August 19, 2024 14:48
Copy link
Member

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also #4437; I had been thinking that #4437 would be resolved and also solve the Pattern problem at the same time, but this is fine as a first step.

#[cfg_attr(feature = "serde", serde(borrow))]
pub minus_sign_affixes: AffixesV1<'data>,
pub minus_sign_pattern: Option<SinglePlaceholderPattern<Cow<'data, str>>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: why did you make it an Option? The value is only 2 bytes long. Seems like unnecessary indirection? You already have a Cow so the niche is already being used.

Comment on lines +27 to +35
impl SinglePlaceholderPattern<str> {
pub fn encode_store(prefix: &str, suffix: &str) -> alloc::boxed::Box<str> {
alloc::format!(
"{}{prefix}{suffix}",
char::from_u32(prefix.len() as u32 + 1).unwrap()
)
.into_boxed_str()
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: I would prefer to avoid adding methods directly to SinglePlaceholderPattern<str>. Methods should be added to Pattern<Store>.

Besides that, this code duplicates logic that lives in the pattern constructor. You should build the pattern using the standard APIs and then use take_store.

The constructor you probably want is Pattern::try_from_items.

#[allow(clippy::unwrap_used, clippy::indexing_slicing)]
impl<Store> SinglePlaceholderPattern<Store>
where
Store: Deref<Target = str>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: The bound I use everywhere else is AsRef<str>.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like that bound.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should open a discussion about whether we want the backend to be

  1. AsRef<str>
  2. Deref<Target = str>
  3. Borrow<str>
  4. str

But for now please remain consistent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Backend::Store

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, for all of those I mean a substitution of str with Backend::Store

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to start the commentary a bit:

I went with AsRef<Backend::Store> because it is the most flexible.

Why AsRef: The Store parameter worked out well for me in LiteMap and ZeroTrie before embarking on Pattern. It makes certain impls a pain to write, but once they are written, things "just work" and are very easy to use and very flexible.

Why not Deref: I don't care that the store dereferences uniquely to Backend::Store. For instance, at some point we might get Box<AsciiStr> which derefs to AsciiStr before str.

Why not Borrow: I don't care about the additional invariants on Borrow like Hash and Eq. If I want to implement those traits on Pattern, I can; I don't need Borrow's help.

Why not DSTs: I didn't go with the dynamically sized type because I dislike the amount of unsafe code required for the DSTs to work. I have made multiple attempts to write safe abstractions for ref casts and box casts, but they keep being shot down (#5101). I would like to block the pure DST approach on first solving #2778 / #2310 / #5127, but I'm open to it assuming it can be written in a way that minimizes unsafe.

if let Some(index) = index.checked_sub(1) {
&chars.as_str()[index..]
} else {
chars.as_str()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: this returns the same string for both prefix and suffix if there is no placeholder. I think suffix() should return Option<&str>.

utils/pattern/src/frontend/mod.rs Show resolved Hide resolved
Comment on lines +44 to +49
let index = chars.next().unwrap() as usize;
if let Some(index) = index.checked_sub(1) {
&chars.as_str()[..index]
} else {
chars.as_str()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: this code is not panic-proof. There are constructors that are not unsafe; you can make a Pattern pointing to an invalid store, and the docs say "unexpected behavior may occur" but not "panics may occur".

Please resolve the two fallible array getters in this function to debug-assert and then return "".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, that's why I asked how to do this properly

Comment on lines +54 to +56
let index = chars.next().unwrap() as usize;
if let Some(index) = index.checked_sub(1) {
&chars.as_str()[index..]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: Please change these two fallible array getters to debug-assert and then return either None or Some("").

#[cfg_attr(feature = "serde", serde(borrow))]
pub minus_sign_affixes: AffixesV1<'data>,
pub minus_sign_pattern: Option<SinglePlaceholderPattern<Cow<'data, str>>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you make .suffix() return an Option, you can debug_assert that it is Some.

@robertbastian robertbastian marked this pull request as draft August 19, 2024 20:50
@sffc
Copy link
Member

sffc commented Aug 20, 2024

It's also worth noting that the "correct" way to use Pattern is to use its own interpolation and construction APIs. Right now you are going out of your way to hack around the functions that Pattern offers to you on a silver platter. You wrote your own constructor and your own accessor functions (prefix and suffix), instead of using try_from_items and interpolate, so you are in effect using Pattern only as a storage mechanism.

Pattern, like ListFormatter, makes use of nested writeables. Kartavya did a good job with nested writeables in #5383. You can do this here by making a private type UnsignedFormattedFixedDecimal and then passing it to interpolate.

@robertbastian
Copy link
Member Author

so you are in effect using Pattern only as a storage mechanism.

I don't think there's anything wrong with this.

You can do this here by making a private type UnsignedFormattedFixedDecimal and then passing it to interpolate.

I can, but this will increase code complexity and very likely also code size.

@sffc
Copy link
Member

sffc commented Aug 20, 2024

I think you're overestimating the code complexity of nested writeables. I'm quite happy with how it went in Kartavya's open PR and it reduces the total number of lines of code.

I don't think it will have an impact on code size, assuming everything is sufficiently inlined.

But this is purely a code cleanup thing, not a data representation thing, so it doesn't block the PR which is more about the data representation.

@robertbastian
Copy link
Member Author

Discussion: we'll pursue stuffing all the strings in a multi-field-VarULE instead

@robertbastian robertbastian deleted the decimal branch October 17, 2024 00:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants