Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pread/Pwrite redefinitions #79

Merged
merged 1 commit into from
Nov 6, 2021
Merged

Pread/Pwrite redefinitions #79

merged 1 commit into from
Nov 6, 2021

Conversation

nickbclifford
Copy link
Contributor

@nickbclifford nickbclifford commented Sep 14, 2021

This PR adds a new type, DynamicBuffer, that automatically expands when Pwriteing types into it. Details in the doc comments.

It also includes a relaxed definition of the Pwrite trait that, while technically an API break, is source-compatible and makes implementing custom containers easier.

Resolves #78.

@nickbclifford
Copy link
Contributor Author

I was considering adding a TryIntoCtx<_, DynamicBuffer> impl to the Pwrite derive macro as well, but (no offense intended!) the proc macro doesn't seem to be documented well and I didn't really fully understand the specifics of how it worked, so I ultimately decided not to.

@nickbclifford
Copy link
Contributor Author

Could I get a review @m4b?

Copy link
Owner

@m4b m4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty awesome! I'm quite surprised that tests pass, without any changes, honestly, although it is technically a breaking change (which is fine).

So, some thoughts:

  1. I think the associated type Buffer is unavoidable? (which is fine if so, this is the technically breaking change)
  2. Should Pread be updated (I don't think your DynamicBuffer works with pread? It might be nice from a symmetric viewpoint to get that working, but I don't think its' necessary)
  3. There's also the argument to be made that this type could be in a third party crate (assuming the Pwrite changes go in)

The one thing I don't like is the asymmetry now between Pwrite/Pread; one of them has an associated type and the other doesn't. Is there a way to rectify this, you think?

Also what are you thoughts on DynamicBuffer being in a third party crate? (I think it's possible, I don't see an orphan rule violation anywhere). This would also punt on deciding on whether it should be under a feature flag (I generally don't like feature flags if can be avoided).

Lastly, w.r.t. the scroll_derive issues, I have to think about it, but I don't think we should generate extra implementations for DynamicBuffer in the default derive, since for my guess is majority of usecases won't require a DynamicBuffer, and it's just extra code bloat having them compile it.

Anyway, I don't see any immediate issues preventing this from being merged; it's quite cool, and thanks for the PR; as I said, will need to be a little patient getting this in, don't have the most time right now, but I don't see any major blocking issues, however I would like to play with it a bit see if something comes up.

Anyway, feel free to add your thoughts here, etc. :) really great stuff :)

src/buffer.rs Outdated

if let Some(diff) = index.checked_sub(self.buffer.len()) {
self.buffer
.extend(core::iter::repeat(0).take(core::cmp::max(self.alloc_increment, diff + 1)));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a bit hard to read; needs a comment at least

src/buffer.rs Outdated
.extend(core::iter::repeat(0).take(core::cmp::max(self.alloc_increment, diff + 1)));
}

self.write_end = self.write_end.max(index);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this only be updated if the buffer gets resized? if yes, I'd suggest to put it into that block

src/buffer.rs Outdated
buf.pwrite_with(0x1234u16, 0, Endian::Little).unwrap();
buf.pwrite_with(0x5678i16, 2, Endian::Big).unwrap();

assert_eq!(buf.get(), [0x34, 0x12, 0x56, 0x78]);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i haven't thought deeply, but be careful to think about how this test would pass on big endian machine. In other places in scroll there sometimes need to be cfg(target_endian = "little") etc.; I think you're ok here but something to think about for a minute, because in my experience easy to make subtle mistake with endianness in tests and it fails years later when someone tries to build it on some big endian machine :)

src/buffer.rs Outdated
offset: usize,
ctx: Ctx,
) -> Result<usize, E> {
self.start_offset += offset;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, why do you increment start offset by offset then subtract below? might be good to comment this since it isn't immediately clear

src/buffer.rs Outdated
pub fn get(&self) -> &[u8] {
&self.buffer[..=self.write_end]
}
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on fence about adding:

pub fn into_inner(self) -> Vec<u8> {
  self.buffer
}

on one hand it's nice and typical api of this sort; on other hand it forever forces the implementation to either use a Vec internally or create one on the fly if the user calls this method. It's probably fine?

src/pwrite.rs Show resolved Hide resolved
src/pwrite.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated
@@ -219,6 +219,8 @@ pub use scroll_derive::{IOread, IOwrite, Pread, Pwrite, SizeWith};
#[cfg(feature = "std")]
extern crate core;

#[cfg(feature = "std")]
mod buffer;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we may want to add a feature flag for dynamic buffers, not sure.

src/buffer.rs Outdated
@@ -0,0 +1,294 @@
use crate::ctx::TryIntoCtx;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this entire module could actually be a third party crate; it is a self-defined type which implements a foreign trait, so ti won't violate orphan rules. (assuming we merge your Pwrite/Pread changes).

@m4b
Copy link
Owner

m4b commented Sep 27, 2021

also it doesn't look like there's any CI anymore on scroll which is terrifying? (it used to run on travis, but it is probably dead?)

@nickbclifford
Copy link
Contributor Author

I'd be okay with moving DynamicBuffer into a separate crate, I believe all that would be necessary are the changes for Pwrite.

However, I do agree that the asymmetry with Pread is annoying, so I'll change that as well.

@nickbclifford
Copy link
Contributor Author

Let me know what you think about the trait changes, and I'd be happy to move the buffer stuff out of this PR and change scope.

@nickbclifford nickbclifford requested a review from m4b October 7, 2021 14:45
@m4b
Copy link
Owner

m4b commented Oct 17, 2021

I have not forgotten about this! I will try to review tomorrow, thanks for your patience; always feel free to ping me if you aren't hearing from me, also :)

Copy link
Owner

@m4b m4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, at least on paper. I am more convinced the buffer module does not belong in scroll, but is better as a third party library, since it's more niche.
I'm wondering now what (if any?) advantage there was to defining all Pread/Pwrite logic generically, with the blanket implementation, as opposed to this approach, where all logic is dispatched to a single unimplemented method, and implement it for &[u8].

I can't see any disadvantages at the moment, and in fact, it looks like the entire purpose of MeasureWith, which effectively is our root implementation on &[u8] which allows us to pread/pwrite on [u8] can be deleted now.

Some other thoughts: will scroll be able to implement this logic for const generics now? I suspect it will, and may generate better code, since the generic bound would be what we compare instead of self.len(), etc.

}
err => err,
}
self.pwrite_with(n, o, ctx).map(|size| {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

&mut self,
n: N,
offset: usize,
ctx: Ctx,
) -> result::Result<usize, E> {
let len = self.measure_with(&ctx);
if offset >= len {
if offset >= self.len() {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so measure_with is gone now; which makes me wonder what relevance the trait has anymore.

I believe MeasureWith as a bound was required because we needed a way to return the size of something w.r.t. offset logic in a generic context.
Because there was no such trait provided by std we added one here.
It seems like it could be promising to remove this trait entirely, but I'm wondering if there's some effect I'm not foreseeing; specifically, if MeasureWith isn't used anymore, then the Context cannot effect the len of Self in a generic way; maybe this is fine/it was only required before because the implementation defaulted to implementation logic inside the trait itself, instead of inside the implementation of the trait.

it's quite interesting; I'm not sure which is more flexible, allowing the user to implement these external traits and we implement everything for them in the Pread trait, or effectively allow the trait to be implemented / specialized.

Anyway, kind of rambling a bit :)

}
N::try_from_ctx(&self[offset..], ctx).and_then(|(n, _)| Ok(n))
) -> result::Result<N, E> {
let mut ignored = offset;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is elegant way to dispatch all methods to a single implementation gread/gwrite, I'm not sure why it didn't occur to me before!
My only worry is whether the optimizer will see through this and actually ignore doing the write to the offset when it statically dispatches to gread/gwrite

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the problem with Pread vs Pwrite, pwrite is already supposed to return the written offset so we can do the writing for you. However, Pread doesn't have any methods that return both the result type and the new offset, and I didn't want to add another method that's only used for Pread implementations, so my only option was to make gread a required trait method.

use core::result;

use crate::ctx::{MeasureWith, TryIntoCtx};
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we can delete MeasureWith now?

@m4b
Copy link
Owner

m4b commented Oct 18, 2021

Ok I'm glad I played with this a bit, I do not think this design is implementable anymore:

impl<Ctx: Copy, E: From<error::Error>, const LEN: usize> Pwrite<Ctx, E> for [u8; LEN] {
    fn pwrite_with<N: TryIntoCtx<Ctx, Self, Error = E>>(
        &mut self,
        n: N,
        offset: usize,
        ctx: Ctx,
    ) -> result::Result<usize, E> {
        if offset >= LEN {
            return Err(error::Error::BadOffset(offset).into());
        }
        let dst = &mut self[offset..];
        n.try_into_ctx(dst, ctx)
    }
}

specifically, because TryIntoCtx<Ctx, Dst> has Dst constrained to Self due to the Pwrite trait definition. We could potentially get over this problem by writing something like:

impl<Ctx: Copy, E: From<error::Error>, const LEN: usize> Pwrite<Ctx, E> for [u8; LEN] {
    fn pwrite_with<N: TryIntoCtx<Ctx, Self, Error = E>>(
        &mut self,
        n: N,
        offset: usize,
        ctx: Ctx,
    ) -> result::Result<usize, E> {
        if offset >= LEN {
            return Err(error::Error::BadOffset(offset).into());
        }
        let dst: &mut [u8; LEN] = &mut self[offset..].try_into().unwrap();
        n.try_into_ctx(dst, ctx)
    }
}

But this is incorrect; specifically, we don't want a dst size of LEN, but actually std::mem::size_of::<N>; alas however, const generics don't allow writing std::mem::size_of::<N> in a generic context just yet it seems :(

Nevertheless, even if we could, the definition of Pwrite won't let us I believe, since it will again diverge from the trait definition, since std::mem::size_of::<N> != LEN.

This leads me to believe, ironically, since I suggested to remove it, that the associated type Buffer might have to return, as this allows the implementee to diverge from the TryInto/From arguments. Specifically, I believe it would allow implementing for [u8; LEN] but the byte sync is either &mut [u8] or &mut [u8; size_of::<N>].

It's also possible the original design of scroll would allow this as well, but we would have to implement MeasureWith, etc.

Currently I'm more positive towards your current design as it has the following attractive properties:

  1. It simplifies the generic bounds all over (no RangeWithExclusive etc.)
  2. It makes it straightforward to implement for different Source buffers (implementees)
  3. It appears to drop the use of MeasureWith (which always seemed like an unnecessary hack).

But again, would like to think through various scenarios here, as as we've seen, it's quite tricky to imagine various usecases (I would not have guessed the [u8; LEN] implementation would fail in that manner unless I tried it) but I think we're getting close to the right, semi-future proof design.

Anyway, thanks for being patient here.

@m4b
Copy link
Owner

m4b commented Oct 18, 2021

Ah, and quite sadly, even with associated type Buffer, we cannot ever write a const generic dst buffer (e.g., if future const generics allow generics in a const position), because the required size, N, is not in scope yet :(

impl<Ctx: Copy, E: From<error::Error>, const LEN: usize> Pwrite<Ctx, E> for [u8; LEN] {
    type Buffer = [u8; std::mem::size_of::<N>()]; // N is not in scope yet...
    fn pwrite_with<N: TryIntoCtx<Ctx, Self::Buffer, Error = E>>(
        &mut self,
        n: N,
        offset: usize,
        ctx: Ctx,
    ) -> result::Result<usize, E> {
        if offset >= LEN {
            return Err(error::Error::BadOffset(offset).into());
        }
        let dst: &mut [u8; std::mem::size_of::<N>()] = &mut self[offset..].try_into().unwrap();
        n.try_into_ctx(dst, ctx)
    }
}

I've just tested, and it is sufficient to write into a slice however, e.g., the following does work:

impl<Ctx: Copy, E: From<error::Error>, const LEN: usize> Pwrite<Ctx, E> for [u8; LEN] {
    type Buffer = [u8];
    fn pwrite_with<N: TryIntoCtx<Ctx, Self::Buffer, Error = E>>(
        &mut self,
        n: N,
        offset: usize,
        ctx: Ctx,
    ) -> result::Result<usize, E> {
        if offset >= LEN {
            return Err(error::Error::BadOffset(offset).into());
        }
        let dst = &mut self[offset..];
        n.try_into_ctx(dst, ctx)
    }
}

@nickbclifford
Copy link
Contributor Author

nickbclifford commented Oct 18, 2021

I'm not quite sure I understand the usecase behind implementing for an array buffer, can't it already use the [u8] impl from a &mut reference?

@nickbclifford
Copy link
Contributor Author

If you'd like to discuss this more synchronously outside of GitHub as well, feel free to reach out over email and we can set up messaging or something.

@m4b
Copy link
Owner

m4b commented Oct 24, 2021

So the idea behind targeting arrays is that the compiler might be able to elide some checks, etc., especially if it sees the type(s) always being smaller than array target, etc. but that could be wishful thinking 🤷

i think it would be nice to at least have the option to be able to implement arrays, but maybe I could be convinced this isn't useful?

Anyway, so w.r.t. this PR i think ti would be good to remove the dynamic buffer portion so we can concentrate on any remaining issues with the design of the traits, then we can think of merging.

also feel free to email me anytime :)

@nickbclifford nickbclifford changed the title Dynamic buffer Pread/Pwrite redefinitions Oct 24, 2021
@nickbclifford
Copy link
Contributor Author

nickbclifford commented Oct 24, 2021

I think I see your point, but I worry that it might not be sound and/or possible to implement such logic because of types without a constant bytesize. AFAIK, unless we had some sort of marker or auto trait that allowed us to discriminate between those and primitives or PODs, I don't think there'd be a way to write a coherent impl.

Apologies if I didn't explain that well, I just think it might be a premature optimization.

@nickbclifford
Copy link
Contributor Author

@m4b Any thoughts about merging soon?

@m4b
Copy link
Owner

m4b commented Nov 6, 2021

@nickbclifford ok, so here's what we should do:

  1. if you can squash all your commits to single, and force push, that will make for a nice final review
  2. I am inclined to merge this breaking change, without the type Buffer associated type; although this prevents some uses of const buffers, this may not ever matter, it would be an interesting case study to examine the assembly output to see if rustc is able to do anything with array vs slice

Unrelated to your patch, but you probably want a 0.11 crates release at some point:

  1. i will likely remove the usize impl for pread and pwrite remove usize pread/pwrite #56
  2. i will possibly look at making pwrite take & references, which is something i wanted for some time, though it may be a huge change, which is related to Design: boxed preads #51 and was the PR that never got merged here Make [Try]IntoCtx borrow instead of copying #47

I would really want to revisit pwrite taking & before releasing 0.11, which at this stage would be effectively a beta for scroll 1.0. Before 1.0 i think pwrite taking references is good future proof design.

So in summary, if you can squash your changes, I think i'm ok to merge, and then i'd like to see at least 1. (but probably not 2., i'm not sure who would do it and i don't have time right now) before releasing 0.11

Does that sound reasonable? And once again, thanks for your patience here :)

@nickbclifford
Copy link
Contributor Author

nickbclifford commented Nov 6, 2021

Yep, sounds all good to me! Thank you very much for being accommodating with my first contribution 😁

Copy link
Owner

@m4b m4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! So thinking more about doing:

impl<Ctx: Copy, E: From<error::Error>, const LEN: usize> Pwrite<Ctx, E> for [u8; LEN] {

the problem here is that we want the buffer we write into to be at most the size of the type; but we don't know the type until pwrite is called, e.g.:

fn pwrite_with<N: TryIntoCtx<Ctx, Self, Error = E>>(

which means we'd want something like:

fn pwrite_with<N: TryIntoCtx<Ctx, [u8; size_of::<N>()], Error = E>>(

which means the pwrite dispatches from a buffer of some size N, into a buffer that can contain the type.

If we just dispatch to the a slice, the value of const generics is effectively nil, since it is just a check on some constant N instead of self.len().

What is interesting here is I don't think this pattern is expressible in scroll as defined, nor will an associated type Buffer help, since it's size is given in the impl block, but the size is effectively only known at pwrite definition time.

I think the only solution here would be to add another generic type to Pwrite which is a marker for the written to buffer? I'm not sure, have to think about it, but I've convinced myself that adding type Buffer adds no real benefit, other than allowing to dispatch from const arrays to slices, which has minimal benefit.

Anyway, great PR, great discussion, and thanks for your patience!

@m4b m4b merged commit 966710a into m4b:master Nov 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Writing into a dynamic container
2 participants