Pread/Pwrite redefinitions #79

nickbclifford · 2021-09-14T18:08:33Z

This PR adds a new type, DynamicBuffer, that automatically expands when Pwriteing types into it. Details in the doc comments.

It also includes a relaxed definition of the Pwrite trait that, while technically an API break, is source-compatible and makes implementing custom containers easier.

Resolves #78.

nickbclifford · 2021-09-14T18:11:51Z

I was considering adding a TryIntoCtx<_, DynamicBuffer> impl to the Pwrite derive macro as well, but (no offense intended!) the proc macro doesn't seem to be documented well and I didn't really fully understand the specifics of how it worked, so I ultimately decided not to.

nickbclifford · 2021-09-24T01:11:30Z

Could I get a review @m4b?

m4b

This is pretty awesome! I'm quite surprised that tests pass, without any changes, honestly, although it is technically a breaking change (which is fine).

So, some thoughts:

I think the associated type Buffer is unavoidable? (which is fine if so, this is the technically breaking change)
Should Pread be updated (I don't think your DynamicBuffer works with pread? It might be nice from a symmetric viewpoint to get that working, but I don't think its' necessary)
There's also the argument to be made that this type could be in a third party crate (assuming the Pwrite changes go in)

The one thing I don't like is the asymmetry now between Pwrite/Pread; one of them has an associated type and the other doesn't. Is there a way to rectify this, you think?

Also what are you thoughts on DynamicBuffer being in a third party crate? (I think it's possible, I don't see an orphan rule violation anywhere). This would also punt on deciding on whether it should be under a feature flag (I generally don't like feature flags if can be avoided).

Lastly, w.r.t. the scroll_derive issues, I have to think about it, but I don't think we should generate extra implementations for DynamicBuffer in the default derive, since for my guess is majority of usecases won't require a DynamicBuffer, and it's just extra code bloat having them compile it.

Anyway, I don't see any immediate issues preventing this from being merged; it's quite cool, and thanks for the PR; as I said, will need to be a little patient getting this in, don't have the most time right now, but I don't see any major blocking issues, however I would like to play with it a bit see if something comes up.

Anyway, feel free to add your thoughts here, etc. :) really great stuff :)

m4b · 2021-09-27T01:14:47Z

src/buffer.rs

+
+        if let Some(diff) = index.checked_sub(self.buffer.len()) {
+            self.buffer
+                .extend(core::iter::repeat(0).take(core::cmp::max(self.alloc_increment, diff + 1)));


this is a bit hard to read; needs a comment at least

m4b · 2021-09-27T01:16:43Z

src/buffer.rs

+                .extend(core::iter::repeat(0).take(core::cmp::max(self.alloc_increment, diff + 1)));
+        }
+
+        self.write_end = self.write_end.max(index);


shouldn't this only be updated if the buffer gets resized? if yes, I'd suggest to put it into that block

m4b · 2021-09-27T01:23:59Z

src/buffer.rs

+        buf.pwrite_with(0x1234u16, 0, Endian::Little).unwrap();
+        buf.pwrite_with(0x5678i16, 2, Endian::Big).unwrap();
+
+        assert_eq!(buf.get(), [0x34, 0x12, 0x56, 0x78]);


i haven't thought deeply, but be careful to think about how this test would pass on big endian machine. In other places in scroll there sometimes need to be cfg(target_endian = "little") etc.; I think you're ok here but something to think about for a minute, because in my experience easy to make subtle mistake with endianness in tests and it fails years later when someone tries to build it on some big endian machine :)

m4b · 2021-09-27T01:30:00Z

src/buffer.rs

+        offset: usize,
+        ctx: Ctx,
+    ) -> Result<usize, E> {
+        self.start_offset += offset;


hmm, why do you increment start offset by offset then subtract below? might be good to comment this since it isn't immediately clear

m4b · 2021-09-27T01:32:19Z

src/buffer.rs

+    pub fn get(&self) -> &[u8] {
+        &self.buffer[..=self.write_end]
+    }
+}


I'm on fence about adding:

pub fn into_inner(self) -> Vec<u8> { self.buffer }

on one hand it's nice and typical api of this sort; on other hand it forever forces the implementation to either use a Vec internally or create one on the fly if the user calls this method. It's probably fine?

src/pwrite.rs

src/lib.rs

m4b · 2021-09-27T01:42:57Z

src/lib.rs

@@ -219,6 +219,8 @@ pub use scroll_derive::{IOread, IOwrite, Pread, Pwrite, SizeWith};
 #[cfg(feature = "std")]
 extern crate core;

+#[cfg(feature = "std")]
+mod buffer;


we may want to add a feature flag for dynamic buffers, not sure.

m4b · 2021-09-27T01:44:43Z

src/buffer.rs

@@ -0,0 +1,294 @@
+use crate::ctx::TryIntoCtx;


I think this entire module could actually be a third party crate; it is a self-defined type which implements a foreign trait, so ti won't violate orphan rules. (assuming we merge your Pwrite/Pread changes).

m4b · 2021-09-27T01:52:48Z

also it doesn't look like there's any CI anymore on scroll which is terrifying? (it used to run on travis, but it is probably dead?)

nickbclifford · 2021-09-28T18:03:32Z

I'd be okay with moving DynamicBuffer into a separate crate, I believe all that would be necessary are the changes for Pwrite.

However, I do agree that the asymmetry with Pread is annoying, so I'll change that as well.

nickbclifford · 2021-09-28T18:32:00Z

Let me know what you think about the trait changes, and I'd be happy to move the buffer stuff out of this PR and change scope.

m4b · 2021-10-17T05:11:14Z

I have not forgotten about this! I will try to review tomorrow, thanks for your patience; always feel free to ping me if you aren't hearing from me, also :)

m4b

This looks good to me, at least on paper. I am more convinced the buffer module does not belong in scroll, but is better as a third party library, since it's more niche.
I'm wondering now what (if any?) advantage there was to defining all Pread/Pwrite logic generically, with the blanket implementation, as opposed to this approach, where all logic is dispatched to a single unimplemented method, and implement it for &[u8].

I can't see any disadvantages at the moment, and in fact, it looks like the entire purpose of MeasureWith, which effectively is our root implementation on &[u8] which allows us to pread/pwrite on [u8] can be deleted now.

Some other thoughts: will scroll be able to implement this logic for const generics now? I suspect it will, and may generate better code, since the generic bound would be what we compare instead of self.len(), etc.

m4b · 2021-10-18T04:07:38Z

src/pwrite.rs

-            }
-            err => err,
-        }
+        self.pwrite_with(n, o, ctx).map(|size| {


m4b · 2021-10-18T04:13:22Z

src/pwrite.rs

        &mut self,
        n: N,
        offset: usize,
        ctx: Ctx,
    ) -> result::Result<usize, E> {
-        let len = self.measure_with(&ctx);
-        if offset >= len {
+        if offset >= self.len() {


so measure_with is gone now; which makes me wonder what relevance the trait has anymore.

I believe MeasureWith as a bound was required because we needed a way to return the size of something w.r.t. offset logic in a generic context.
Because there was no such trait provided by std we added one here.
It seems like it could be promising to remove this trait entirely, but I'm wondering if there's some effect I'm not foreseeing; specifically, if MeasureWith isn't used anymore, then the Context cannot effect the len of Self in a generic way; maybe this is fine/it was only required before because the implementation defaulted to implementation logic inside the trait itself, instead of inside the implementation of the trait.

it's quite interesting; I'm not sure which is more flexible, allowing the user to implement these external traits and we implement everything for them in the Pread trait, or effectively allow the trait to be implemented / specialized.

Anyway, kind of rambling a bit :)

m4b · 2021-10-18T04:15:26Z

src/pread.rs

-        }
-        N::try_from_ctx(&self[offset..], ctx).and_then(|(n, _)| Ok(n))
+    ) -> result::Result<N, E> {
+        let mut ignored = offset;


this is elegant way to dispatch all methods to a single implementation gread/gwrite, I'm not sure why it didn't occur to me before!
My only worry is whether the optimizer will see through this and actually ignore doing the write to the offset when it statically dispatches to gread/gwrite

That's the problem with Pread vs Pwrite, pwrite is already supposed to return the written offset so we can do the writing for you. However, Pread doesn't have any methods that return both the result type and the new offset, and I didn't want to add another method that's only used for Pread implementations, so my only option was to make gread a required trait method.

m4b · 2021-10-18T04:17:13Z

src/pwrite.rs

 use core::result;

-use crate::ctx::{MeasureWith, TryIntoCtx};


I'm wondering if we can delete MeasureWith now?

m4b · 2021-10-18T05:09:44Z

Ok I'm glad I played with this a bit, I do not think this design is implementable anymore:

impl<Ctx: Copy, E: From<error::Error>, const LEN: usize> Pwrite<Ctx, E> for [u8; LEN] {
    fn pwrite_with<N: TryIntoCtx<Ctx, Self, Error = E>>(
        &mut self,
        n: N,
        offset: usize,
        ctx: Ctx,
    ) -> result::Result<usize, E> {
        if offset >= LEN {
            return Err(error::Error::BadOffset(offset).into());
        }
        let dst = &mut self[offset..];
        n.try_into_ctx(dst, ctx)
    }
}

specifically, because TryIntoCtx<Ctx, Dst> has Dst constrained to Self due to the Pwrite trait definition. We could potentially get over this problem by writing something like:

impl<Ctx: Copy, E: From<error::Error>, const LEN: usize> Pwrite<Ctx, E> for [u8; LEN] {
    fn pwrite_with<N: TryIntoCtx<Ctx, Self, Error = E>>(
        &mut self,
        n: N,
        offset: usize,
        ctx: Ctx,
    ) -> result::Result<usize, E> {
        if offset >= LEN {
            return Err(error::Error::BadOffset(offset).into());
        }
        let dst: &mut [u8; LEN] = &mut self[offset..].try_into().unwrap();
        n.try_into_ctx(dst, ctx)
    }
}

But this is incorrect; specifically, we don't want a dst size of LEN, but actually std::mem::size_of::<N>; alas however, const generics don't allow writing std::mem::size_of::<N> in a generic context just yet it seems :(

Nevertheless, even if we could, the definition of Pwrite won't let us I believe, since it will again diverge from the trait definition, since std::mem::size_of::<N> != LEN.

This leads me to believe, ironically, since I suggested to remove it, that the associated type Buffer might have to return, as this allows the implementee to diverge from the TryInto/From arguments. Specifically, I believe it would allow implementing for [u8; LEN] but the byte sync is either &mut [u8] or &mut [u8; size_of::<N>].

It's also possible the original design of scroll would allow this as well, but we would have to implement MeasureWith, etc.

Currently I'm more positive towards your current design as it has the following attractive properties:

It simplifies the generic bounds all over (no RangeWithExclusive etc.)
It makes it straightforward to implement for different Source buffers (implementees)
It appears to drop the use of MeasureWith (which always seemed like an unnecessary hack).

But again, would like to think through various scenarios here, as as we've seen, it's quite tricky to imagine various usecases (I would not have guessed the [u8; LEN] implementation would fail in that manner unless I tried it) but I think we're getting close to the right, semi-future proof design.

Anyway, thanks for being patient here.

m4b · 2021-10-18T05:20:13Z

Ah, and quite sadly, even with associated type Buffer, we cannot ever write a const generic dst buffer (e.g., if future const generics allow generics in a const position), because the required size, N, is not in scope yet :(

impl<Ctx: Copy, E: From<error::Error>, const LEN: usize> Pwrite<Ctx, E> for [u8; LEN] {
    type Buffer = [u8; std::mem::size_of::<N>()]; // N is not in scope yet...
    fn pwrite_with<N: TryIntoCtx<Ctx, Self::Buffer, Error = E>>(
        &mut self,
        n: N,
        offset: usize,
        ctx: Ctx,
    ) -> result::Result<usize, E> {
        if offset >= LEN {
            return Err(error::Error::BadOffset(offset).into());
        }
        let dst: &mut [u8; std::mem::size_of::<N>()] = &mut self[offset..].try_into().unwrap();
        n.try_into_ctx(dst, ctx)
    }
}

I've just tested, and it is sufficient to write into a slice however, e.g., the following does work:

impl<Ctx: Copy, E: From<error::Error>, const LEN: usize> Pwrite<Ctx, E> for [u8; LEN] {
    type Buffer = [u8];
    fn pwrite_with<N: TryIntoCtx<Ctx, Self::Buffer, Error = E>>(
        &mut self,
        n: N,
        offset: usize,
        ctx: Ctx,
    ) -> result::Result<usize, E> {
        if offset >= LEN {
            return Err(error::Error::BadOffset(offset).into());
        }
        let dst = &mut self[offset..];
        n.try_into_ctx(dst, ctx)
    }
}

nickbclifford · 2021-10-18T15:49:55Z

I'm not quite sure I understand the usecase behind implementing for an array buffer, can't it already use the [u8] impl from a &mut reference?

nickbclifford · 2021-10-18T16:02:55Z

If you'd like to discuss this more synchronously outside of GitHub as well, feel free to reach out over email and we can set up messaging or something.

m4b · 2021-10-24T19:15:01Z

So the idea behind targeting arrays is that the compiler might be able to elide some checks, etc., especially if it sees the type(s) always being smaller than array target, etc. but that could be wishful thinking 🤷

i think it would be nice to at least have the option to be able to implement arrays, but maybe I could be convinced this isn't useful?

Anyway, so w.r.t. this PR i think ti would be good to remove the dynamic buffer portion so we can concentrate on any remaining issues with the design of the traits, then we can think of merging.

also feel free to email me anytime :)

nickbclifford · 2021-10-24T20:14:49Z

I think I see your point, but I worry that it might not be sound and/or possible to implement such logic because of types without a constant bytesize. AFAIK, unless we had some sort of marker or auto trait that allowed us to discriminate between those and primitives or PODs, I don't think there'd be a way to write a coherent impl.

Apologies if I didn't explain that well, I just think it might be a premature optimization.

nickbclifford · 2021-11-02T21:51:28Z

@m4b Any thoughts about merging soon?

m4b · 2021-11-06T22:42:17Z

@nickbclifford ok, so here's what we should do:

if you can squash all your commits to single, and force push, that will make for a nice final review
I am inclined to merge this breaking change, without the type Buffer associated type; although this prevents some uses of const buffers, this may not ever matter, it would be an interesting case study to examine the assembly output to see if rustc is able to do anything with array vs slice

Unrelated to your patch, but you probably want a 0.11 crates release at some point:

i will likely remove the usize impl for pread and pwrite remove usize pread/pwrite #56
i will possibly look at making pwrite take & references, which is something i wanted for some time, though it may be a huge change, which is related to Design: boxed preads #51 and was the PR that never got merged here Make [Try]IntoCtx borrow instead of copying #47

I would really want to revisit pwrite taking & before releasing 0.11, which at this stage would be effectively a beta for scroll 1.0. Before 1.0 i think pwrite taking references is good future proof design.

So in summary, if you can squash your changes, I think i'm ok to merge, and then i'd like to see at least 1. (but probably not 2., i'm not sure who would do it and i don't have time right now) before releasing 0.11

Does that sound reasonable? And once again, thanks for your patience here :)

nickbclifford · 2021-11-06T22:47:55Z

Yep, sounds all good to me! Thank you very much for being accommodating with my first contribution 😁

m4b

LGTM! So thinking more about doing:

impl<Ctx: Copy, E: From<error::Error>, const LEN: usize> Pwrite<Ctx, E> for [u8; LEN] {

the problem here is that we want the buffer we write into to be at most the size of the type; but we don't know the type until pwrite is called, e.g.:

fn pwrite_with<N: TryIntoCtx<Ctx, Self, Error = E>>(

which means we'd want something like:

fn pwrite_with<N: TryIntoCtx<Ctx, [u8; size_of::<N>()], Error = E>>(

which means the pwrite dispatches from a buffer of some size N, into a buffer that can contain the type.

If we just dispatch to the a slice, the value of const generics is effectively nil, since it is just a check on some constant N instead of self.len().

What is interesting here is I don't think this pattern is expressible in scroll as defined, nor will an associated type Buffer help, since it's size is given in the impl block, but the size is effectively only known at pwrite definition time.

I think the only solution here would be to add another generic type to Pwrite which is a marker for the written to buffer? I'm not sure, have to think about it, but I've convinced myself that adding type Buffer adds no real benefit, other than allowing to dispatch from const arrays to slices, which has minimal benefit.

Anyway, great PR, great discussion, and thanks for your patience!

m4b reviewed Sep 27, 2021

View reviewed changes

nickbclifford requested a review from m4b October 7, 2021 14:45

m4b reviewed Oct 18, 2021

View reviewed changes

nickbclifford changed the title ~~Dynamic buffer~~ Pread/Pwrite redefinitions Oct 24, 2021

Rework definitions for Pread/Pwrite traits

1d047b6

nickbclifford force-pushed the master branch from f4d7f85 to 1d047b6 Compare November 6, 2021 22:47

m4b approved these changes Nov 6, 2021

View reviewed changes

m4b merged commit 966710a into m4b:master Nov 6, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pread/Pwrite redefinitions #79

Pread/Pwrite redefinitions #79

nickbclifford commented Sep 14, 2021 •

edited

Loading

nickbclifford commented Sep 14, 2021

nickbclifford commented Sep 24, 2021

m4b left a comment

m4b Sep 27, 2021

m4b Sep 27, 2021

m4b Sep 27, 2021

m4b Sep 27, 2021

m4b Sep 27, 2021

m4b Sep 27, 2021

m4b Sep 27, 2021

m4b commented Sep 27, 2021 •

edited

Loading

nickbclifford commented Sep 28, 2021

nickbclifford commented Sep 28, 2021

m4b commented Oct 17, 2021

m4b left a comment •

edited

Loading

m4b Oct 18, 2021

m4b Oct 18, 2021

m4b Oct 18, 2021

nickbclifford Oct 19, 2021

m4b Oct 18, 2021

m4b commented Oct 18, 2021

m4b commented Oct 18, 2021 •

edited

Loading

nickbclifford commented Oct 18, 2021 •

edited

Loading

nickbclifford commented Oct 18, 2021

m4b commented Oct 24, 2021

nickbclifford commented Oct 24, 2021 •

edited

Loading

nickbclifford commented Nov 2, 2021

m4b commented Nov 6, 2021

nickbclifford commented Nov 6, 2021 •

edited

Loading

m4b left a comment

Pread/Pwrite redefinitions #79

Pread/Pwrite redefinitions #79

Conversation

nickbclifford commented Sep 14, 2021 • edited Loading

nickbclifford commented Sep 14, 2021

nickbclifford commented Sep 24, 2021

m4b left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

m4b commented Sep 27, 2021 • edited Loading

nickbclifford commented Sep 28, 2021

nickbclifford commented Sep 28, 2021

m4b commented Oct 17, 2021

m4b left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

m4b commented Oct 18, 2021

m4b commented Oct 18, 2021 • edited Loading

nickbclifford commented Oct 18, 2021 • edited Loading

nickbclifford commented Oct 18, 2021

m4b commented Oct 24, 2021

nickbclifford commented Oct 24, 2021 • edited Loading

nickbclifford commented Nov 2, 2021

m4b commented Nov 6, 2021

nickbclifford commented Nov 6, 2021 • edited Loading

m4b left a comment

Choose a reason for hiding this comment

nickbclifford commented Sep 14, 2021 •

edited

Loading

m4b commented Sep 27, 2021 •

edited

Loading

m4b left a comment •

edited

Loading

m4b commented Oct 18, 2021 •

edited

Loading

nickbclifford commented Oct 18, 2021 •

edited

Loading

nickbclifford commented Oct 24, 2021 •

edited

Loading

nickbclifford commented Nov 6, 2021 •

edited

Loading