Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IntoCtx/TryIntoCtx on borrowed values #46

Open
willglynn opened this issue Oct 19, 2018 · 8 comments
Open

IntoCtx/TryIntoCtx on borrowed values #46

willglynn opened this issue Oct 19, 2018 · 8 comments

Comments

@willglynn
Copy link
Contributor

The trait signatures are:

/// Writes `Self` into `This` using the context `Ctx`
pub trait IntoCtx<Ctx: Copy = (), This: ?Sized = [u8]>: Sized {
    fn into_ctx(self, &mut This, ctx: Ctx);
}

/// Tries to write `Self` into `This` using the context `Ctx`
pub trait TryIntoCtx<Ctx: Copy = (), This: ?Sized = [u8]>: Sized {
    type Error;
    fn try_into_ctx(self, &mut This, ctx: Ctx) -> Result<usize, Self::Error>;
}

These traits mirror Into which consumes self. However, these might be the wrong function signatures for how scroll is used in practice; both traits are primarily used on borrowed values.

scroll implements IntoCtx and TryIntoCtx on borrowed primitive types, suggesting that [Try]IntoCtx needn't consume its value after all.

scroll_derive generates code implementing IntoCtx and TryIntoCtx on Self by borrowing and delegating to the implementation on &'a Self.

Should these traits take &self instead of self?

@m4b
Copy link
Owner

m4b commented Oct 23, 2018

So I considered this a while ago since we don’t technically consume the object, but I made it take an owned value because it seemed “right”.

That being said are there any downsides to switching to a non owned value ? I don’t think we can do very many optimizations unless it’s a copy value, but then we don’t need the owned value anyway?

Anyone else have thoughts ? (Besides this being another breaking change, willglynn is busting up scroll 🤣)

For the record I’m pretty much ok with this I think, unless I’m missing something.
This will be a pretty annoying refactor though in goblin. I’ll also see how it affects faerie (which is the primary consumer of writing goblin structs), but I even recall remembering having to clone because of scroll (eg I wrote the same thing multiple times)

@luser @philipc anyone else thoughts ?

@m4b
Copy link
Owner

m4b commented Oct 23, 2018

An alternative to note is to add another tryinto method but I suppose we can always do that after (and add another pwrite method); but we can’t do the reverse after the crate is 1.0

@willglynn
Copy link
Contributor Author

I've pondered this a bit since my initial writeup, and I can now distill my argument:

  1. The benefit of Into consuming self is that into() can move fields from the old type the new type.
  2. [Try]IntoCtx always goes into a &mut [u8] – a mutable reference to a caller-provided slice – so even in the ideal case when the originating type was a [u8] we'd need to copy it into the caller's slice.
  3. Since this trait must always make a copy, it should take &self instead of consuming self.

The breakage of this change is worth considering, but it doesn't seem like it'll be too bad. scroll_derive changes do most of the work, so it's just adding a & to whichever pwrite() calls the compiler doesn't like. I found many pwrite()s already writing borrowed values since that works right now, so it wasn't even every call needing to get touched.

@luser
Copy link
Contributor

luser commented Oct 23, 2018

scroll implements IntoCtx and TryIntoCtx on borrowed primitive types, suggesting that [Try]IntoCtx needn't consume its value after all.

I added those in #35 in service of m4b/scroll_derive#17. I think this is another one of those design decisions that makes sense when your use cases are "structs full of primitive types" where everything is Copy, but it stops working well when you have non-Copy types. I ran into this in my minidump crate because I removed #[derive(Copy)] from a bunch of structs since they were actually fairly large and it didn't feel right.

The generated TryIntoCtx impls in scroll_derive don't deconstruct the self argument anyway, they just use it field-by-field, so I don't know that this will effectively make a difference. It should be easy enough to build the example sources before and after making this change and run them through something like cargo asm to see what the generated code looks like.

@philipc
Copy link
Contributor

philipc commented Oct 24, 2018

I agree that these traits should take &self, for the reasons that @willglynn gives. It should only consume self if there are benefits from reusing parts of self for the conversion, but writing to &mut [u8] never does that.

@willglynn
Copy link
Contributor Author

I'm working on a PR and hit this:

error: borrow of packed field is unsafe and requires unsafe function or block (error E0133)
   --> tests/api.rs:263:27
    |
263 |         bytes.cwrite_with(&self.foo, 0, ctx);
    |                           ^^^^^^^^^
    |
note: lint level defined here
   --> tests/api.rs:5:9
    |
5   | #![deny(safe_packed_borrows)]
    |         ^^^^^^^^^^^^^^^^^^^
    = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
    = note: for more information, see issue #46043 <https://github.com/rust-lang/rust/issues/46043>
    = note: fields of packed structs might be misaligned: dereferencing a misaligned pointer or even just creating a misaligned reference is undefined behavior

Context:

#[repr(packed)]
struct Bar {
    foo: i32,
    bar: u32,
}

// …

impl scroll::ctx::IntoCtx<scroll::Endian> for Bar {
    fn into_ctx(self, bytes: &mut [u8], ctx: scroll::Endian) {
        use scroll::Cwrite;
        bytes.cwrite_with(&self.foo, 0, ctx);
        bytes.cwrite_with(&self.bar, 4, ctx);
    }
}

Like IntoCtx, my local Cwrite trait now takes &self. Both traits no longer copy while writing, and borrowing potentially-misaligned packed fields isn't safe, so there's a new tension here.

In this case, the #[repr(packed)] struct will have the same structure whether it's packed or not, so we could remove the annotation. In other cases, the user might need to Copy their fields for serialization:

impl scroll::ctx::IntoCtx<scroll::Endian> for Bar {
    fn into_ctx(&self, bytes: &mut [u8], ctx: scroll::Endian) {
        use scroll::Cwrite;
        let Self{foo, bar} = *self;
        bytes.cwrite_with(&foo, 0, ctx);
        bytes.cwrite_with(&bar, 4, ctx);
    }
}

Backing up a bit more: how should #[repr(packed)] and scroll fit together?

My thoughts: we're working with bytes a field at a time, so the layout of the Rust data structure doesn't affect the binary input/output. If you're on a platform that can't do unaligned access, it seems like we'd want to handle that during the serialization step. This makes me think the Rust struct should always have optimal layout (i.e. not be #[repr(packed)]), pushing all non-aligned accesses into scroll's read/write operations which already support that.

@luser
Copy link
Contributor

luser commented Nov 1, 2018

I agree that #[repr(packed)] is unnecessary for scroll usage and we should advise against it. I was originally using packed and transmuting bytes in my minidump crate, but I removed that and switched everything to scroll and it's been working well. Since I changed the scroll_derive SizeWith impl to call itself recursively for every struct member it should produce the right size as if the struct was packed.

@willglynn
Copy link
Contributor Author

Hypothesis: #[repr(packed)] is only useful with Copy types.

#[derive] imposes such a requirement on the struct itself:

#[derive(Debug,PartialEq,Hash)]
#[repr(packed)]
struct Bar {
    foo: i32,
    bar: u32,
}
error: #[derive] can't be used on a #[repr(packed)] struct that does not derive Copy (error E0133)
   --> tests/api.rs:209:10
    |
209 | #[derive(Debug,PartialEq,Hash)]
    |          ^^^^^
    |
error: #[derive] can't be used on a #[repr(packed)] struct that does not derive Copy (error E0133)
   --> tests/api.rs:209:16
    |
209 | #[derive(Debug,PartialEq,Hash)]
    |                ^^^^^^^^^
    |
error: #[derive] can't be used on a #[repr(packed)] struct that does not derive Copy (error E0133)
   --> tests/api.rs:209:26
    |
209 | #[derive(Debug,PartialEq,Hash)]
    |                          ^^^^
    |
    = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
    = note: for more information, see issue #46043 <https://github.com/rust-lang/rust/issues/46043>

If we're willing to require Copy on the struct fields, then there's a clear way forward. We need to copy each fields out of the #[repr(packed)] struct so we can safely & the copies. Turns out we need only some curly braces:

impl scroll::ctx::IntoCtx<scroll::Endian> for Bar {
    fn into_ctx(&self, bytes: &mut [u8], ctx: scroll::Endian) {
        use scroll::Cwrite;
        bytes.cwrite_with(&{self.foo}, 0, ctx); // look ma, anonymous locals!
        bytes.cwrite_with(&{self.bar}, 4, ctx);
    }
}

This is easy enough to handle in scroll_derive – check if the struct is #[repr(packed)], emit extra braces if so. This leaves us three cases for #[derive(Pwrite,IOwrite)]:

  1. Normal structs get each fields written by reference. ✅
  2. #[repr(packed)] structs with Copy fields get each field copied as we write them. ✅
  3. #[repr(packed)] structs with non-Copy fields fail to compile. ❌ These types already complain when you try to #[derive] anything else, so… maybe that's fine? ❓

I would love to produce specialized error messages for this third case, but we can't know if the field types are all Copy within the macro invocation. (Primitives, sure, but in general…) I guess we could emit type bounds into the derived code? impl IntoCtx for Bar where i32: Copy, u32: Copy? Eh… I'm inclined to leave this unhandled and just let #[derive(Pwrite,IOwrite)] fail.

willglynn added a commit to willglynn/scroll that referenced this issue Nov 2, 2018
This assumes that #[repr(packed)] types are also Copy, which is a thing
we can't verify in this context. At least we're in good company: the
built-in #[derive]s all require #[derive(Copy)] for packed structs.

Further discussion:
  m4b#46 (comment)
willglynn added a commit to willglynn/scroll that referenced this issue Nov 2, 2018
This assumes that #[repr(packed)] types are also Copy, which is a thing
we can't verify in this context. At least we're in good company: the
built-in #[derive]s all require #[derive(Copy)] for packed structs.

Further discussion:
  m4b#46 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants