Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enums derive #71

Open
elpiel opened this issue May 3, 2020 · 10 comments
Open

Enums derive #71

elpiel opened this issue May 3, 2020 · 10 comments

Comments

@elpiel
Copy link

elpiel commented May 3, 2020

Hello everyone!

I am working on a project in which we have enums that whey (de)serialized add the Enum::Case(CaseStruct) as a u8 or u16 before (de)serializing the CaseStruct.
Currently we've implemented custom TryFromCtx & TryIntoCtx, but it would be nice to be able to derive it and add something like serde enum tag attribute and specify the type per case(maybe?!).

#[derive(Pread, Pwrite)]
#[scroll(tag = "u16")]
enum MyEnum {
   #[scroll(tag_value = "0")]
   Case(CaseStruct),
   #[scroll(tag_value = "1")]
   CaseTwo(CaseTwoStruct),
}
@m4b
Copy link
Owner

m4b commented May 4, 2020

Hi @elpiel yea this seems like a pretty cool idea! I will note I'm worried it's turning into a kind of serialization format, but your usecase seems reasonable; especially with the more C-ish enum types.

Maybe it might help to give a list of enum use-cases you're interested in supporting (e.g., will this do full blown algebraic data type serialization/deserialization?)

For simple #[repr(usize)] enums, I think it's straightforward and useful; for more complicated enums, we can think about that too, but might be good to have a minor proposal here for what you think a reasonable first attempt at what this:

  1. binary serialization/deserialization format
  2. scroll derive api/annotation

will look like. Looks like you've already got some interesting ideas for tagging, etc. Would you like to expand a bit more, maybe we can flesh out something that would make a PR more easier to evaluate and review whether it closes and/or "fixes" this issue :)

And glad this library is working out for you in your project, can you share it? (Or is it the backlinked issue referencing it above ?)

@elpiel
Copy link
Author

elpiel commented May 4, 2020

Hey @m4b 👋
First of all: Thank you for the quick reply ❤️ !
Yes, the backlinked PR is our use case. It is a Parrot drone SDK we are building that uses a specific binary format. Main reason I opened this issue was Enums like this one - Feature, that can contain information, but also require (de)serialization for the Enum type first (as either u8, u16, etc.)

especially with the more C-ish enum types.
Yes, it is a C SDK we are looking those enums from and trying to impl them in Rust.

will this do full blown algebraic data type serialization/deserialization?
If I understand correctly - yes.

The idea was just that 😅 - an idea. I like how serde has this tag attribute you can use (since I use serde a lot for json).

Regarding:

binary serialization/deserialization format
scroll derive api/annotation
If there is a tag use it, before (de)serialization of the actual enum data.
If there is no tag - just try to (de)serialize the data inside.

What we need for our use case is, smth like #[scroll(tag_value = "0")] and it to be required if you have a tag attr on the enum. Or maybe even mimic how Rust is doing it? I.e.:

enum MyEnum {
  Case1 = 1,
  Case2, // 2
  Case3, // 3
  Case56 = 56,
  Case57, // 57

I haven't thought about that much, so it would be nice to hear your thoughts on this as well.

@m4b
Copy link
Owner

m4b commented May 9, 2020

So looking at your Feature enum, it carries data. To summarize, hopefully correctly, it looks like effectively you check for tag, then if it matches known lookup-table of tags, then goto branch for that tag, recursively. E.g., in the case of Common and ArDrone, you recursively descend into the data type and return with an offset after consuming whatever was in there. Makes sense.

It might be useful to study this wire format: https://github.com/mfp/extprot/blob/master/doc/encoding.md#sum-types-disjoint-unions

Iirc it is similar style for OCaml serialization of tagged unions and the data they can carry. I recall encountering a kind of deserialization ambiguity in some cases for tagged union wire formats, but I can't remember if it was just regular issues, or something else that doesn't matter that much.

Anyway, might be good to consider above link, then perhaps for extended version of your example:

#[derive(Pread, Pwrite)]
#[scroll(tag = "u16")]
enum MyEnum {
   #[scroll(tag_value = "0")]
   Case(CaseStruct),
   #[scroll(tag_value = "1")]
   CaseTwo(CaseTwoStruct),
  #[scroll(tag_value = "2")]
   CaseThree,
}

we can start discussion about implementation details, e.g. what does the wire/binary format look like here? What error messages do we expect if we omit the `#[scroll(tag_value= "0")`` annotation (will it compile?). Can we omit the tag values and have a sensible default (this could/would be nice?)? Etc.

But I think we could come up with something sensible perhaps, so you could theoretically slap a derive on an enum with complicated structure, and have it do something more or less meaningful; at the very least i think we could expect that a regular:

#[repr(usize)]
#[derive(Pread, Pwrite)]
enum CThing {
  Foo = 0,
  Bar = 1,
  Baz = 2,
}

would do more or less precisely what you expect it to. (Probably we have to enforce that all numeric values are given, but yea, one gets the idea)

@elpiel
Copy link
Author

elpiel commented May 9, 2020

I will take a look and read the wire format link you've provided! And yes, you are correct on your summary.

we can start discussion about implementation details, e.g. what does the wire/binary format look like here? What error messages do we expect if we omit the #[scroll(tag_value= "0") annotation (will it compile?). Can we omit the tag values and have a sensible default (this could/would be nice?)? Etc.

Regarding this part. It might be sensible to stick to how Rust handles this. Even if you don't specify the value Foo = 1, Rust will automatically assign them (Playground example)

For me this is a reasonable default and in this case you'll only need to specify that your Enum should be tagged. I think this is nice!

In our case, the usize is not useful in particular, since it's platform specific and we need the protocol message size instead (which are u8, u16, u32 and are per case specific) otherwise it will read wrong amount of bytes.

Thank you for your fast response and help! <3

@m4b
Copy link
Owner

m4b commented May 9, 2020

Regarding this part. It might be sensible to stick to how Rust handles this. Even if you don't specify the value Foo = 1, Rust will automatically assign them (Playground example)

Yes so the issue here (unless I'm missing something) is that when you go to deserialize in the derive implementation for i.e. Pread, which numeric tag equals which variant (if they're unspecified)? For example in:

#[repr(u16)]
enum Specified {
  Foo = 0,
  Bar = 6,
  Baz,
  Fizz,
}

if we encounter 17, is it Baz, Fizz, or neither (a bad parse); in order to know this we would have to know which numeric tag rustc assigned (maybe this is knowable in the derive macro, but i would assume this is unstable, and may vary from rustc release to release potentially) Hence my suggestion the user may have to specify all numeric tag variants explicitly, or if none are specified perhaps it proceeds upwards from 0.

@philipc
Copy link
Contributor

philipc commented May 9, 2020

@m4b
Copy link
Owner

m4b commented May 9, 2020

I didn't see anything about it being stable, though I suppose the reference makes it so?

And from link:

Under the default representation, the specified discriminant is interpreted as an isize value although the compiler is allowed to use a smaller type in the actual memory layout.

How do we know when the compiler used a smaller type, and did not? Probably the derive would have to require repr(C) or other repr I guess.

@m4b
Copy link
Owner

m4b commented May 9, 2020

In any event, we could just by fiat make the un-explicitly tagged variants behave similar/identical to the reference posted (auto-incrementing +1 from last variant), for the derive serialization format of simple enums; this is probably what @elpiel was suggesting.

@m4b
Copy link
Owner

m4b commented May 9, 2020

Also this brings up question of what size the tag should be serialized as; e.g., in example above, we have:

#[repr(u16)]
enum Foo {
  Bar,
  Baz
}

I assume we'd serialize the tag here as a u16; but i'm wondering now if not fixing the tag size could cause any issues when recursively deserializing, or other scenarios haven't fully invisioned? Probably I should just reread what extprot does ; )

@elpiel
Copy link
Author

elpiel commented May 9, 2020

How do we know when the compiler used a smaller type, and did not? Probably the derive would have to require repr(C) or other repr I guess.

Disclaimer: I don't fully understand repr!

We don't actually need this right? How Rust is representing the enum can be different than what we (de)serialize from/into. Plus this can't apply for enums containing fields only for Fieldless enums.
This is why I proposed to specify what type you would like to (de)serialize from/into.

Maybe I don't fully understand it, so feel free to correct me 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants