Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Distribution::sample_iter changes #758

Merged
merged 3 commits into from
Apr 19, 2019
Merged

Distribution::sample_iter changes #758

merged 3 commits into from
Apr 19, 2019

Conversation

dhardy
Copy link
Member

@dhardy dhardy commented Mar 23, 2019

Implements #602, #725. (In fact, the former makes the latter irrelevant.)

I am happy with the first commit (#725).

The second (#602) is a little controversial, and unfortunately does not match up with fn sample. See the comments.

@fizyk20 @huonw @burdges review please?

@burdges
Copy link
Contributor

burdges commented Mar 23, 2019

It looks good. You might want dist.sample(&mut *rng) to appear in some doc comment, so that people see that trick for reborrowing without going to &mut &mut R etc. It need not be explained per se, just so that readers see it.. I suppose rng.sample(dist) achieves the same thing, no? If so, maybe I'd even mention that somehow, but not sure how.

It's not rand's job to teach people Rust of course, so this should not be over emphasized, but &mut *rng is a nice idiom that keeps error messages more understandable.

@dhardy
Copy link
Member Author

dhardy commented Apr 5, 2019

I added some little hints in the docs.

@dhardy dhardy mentioned this pull request Apr 9, 2019
22 tasks
@dhardy
Copy link
Member Author

dhardy commented Apr 15, 2019

The first example can be modified slightly: sample_iter can now consume an RNG, hence we can write distr.sample_iter(rng). In the case of RNGs supporting Copy (which OsRng now does and ThreadRng probably should since both are just handles), this is all that's ever needed; otherwise explicit borrow/reborrow is required (I guess we should show examples of this too).

The first caveat is that for rng: &mut R types where R: Rng, it's no longer possible to write rng.sample_iter(...); instead one must write something like (&mut *rng).sample_iter(...).
This appears to work in current Rust.

The second caveat is that there is no Reborrow bound. One can use fn func<R: Rng>(rng: &mut R) fine, or fn func<R: Rng + Copy>(rng: R), or fn func<R: BorrowMut<RngCore>>(mut rng: R) (clunky and only works on Sized inputs of target type), but none accurately express "borrow or reborrow a type supporting this trait bound as necessary".

The third caveat is the generic code which now doesn't know whether its internal RNG is a reference type, handle or full RNG, and thus must create a new reference to what may already be a reference. Quite likely the compiler can eliminate the double reference anyway, though it's hard to verify (was trying with godbolt.org but the whole usage was dead code; in practice it may depend on the usage).


I'd like to accept this PR and make similar changes to Distribution::sample and other Rng functions, however the caveats may out-weigh the advantages. A possible compromise is to accept the changes to Distribution::sample_iter, modify Rng::sample to fn sample_iter<T, D>(&mut self, distr: D) ...(takeselfby ref) and ... maybe leaveDistribution::sample` alone for now.

Note that SeedableRng::from_rng already takes an rng: R where R: Rng by value.

@dhardy
Copy link
Member Author

dhardy commented Apr 17, 2019

I tried adjusting Distribution::sample as follows:

fn sample<R: Rng>(&self, rng: R) -> T;

First consequence: we have a lot of implementations of Distribution; okay, we can fix these. There would be a significant number of dependent crates similarly affected, but I think this is manageable.

Second consequence: functions generic over any type of RNG and calling sample, e.g. DistIter::next (for sample_iter) and fn seq::index::sample_rejection<R: Rng + ?Sized>(rng: &mut R, ...) have to pass a borrow or re-borrow of their rng. Those using the new R: Rng, rng: R pattern cannot re-borrow, so if index::sample_rejection is updated to this pattern, then its caller index::sample also needs to borrow its rng argument, hence syntactically we can have several layers of borrows.

Third consequence: functions like ziggurat, which take rng: R along with an argument zero_case: Z of type Z: FnMut(R, f64) -> f64, now don't work without changes since passing a borrow of rng to zero_case makes this a distinct type. Instead we need Z: FnMut(&mut R, f64) -> f64, and the passed closure/fn needs to be typed accordingly.

The question then becomes: should we rely on the optimiser to avoid creating pointers to pointers unnecessarily? (Note this also applies the other way: currently thread_rng().gen() passes a pointer to the ThreadRng (itself just a pointer) into gen; by switching to this syntax and making ThreadRng: Copy we can avoid that.)

Fourth consequence: lots of "use of moved value" errors within generic functions to solve by replacing rng with (&mut rng) and some args need to be marked mut. More tedious breakage...


To sum this up: I don't think we should do this without an addition to the language like R: Rng + Reborrow where Reborrow implies the type either supports Copy or is a reference type which can be reborrowed. As RFC 2364 showed, this is unlikely to happen soon.

Regarding sample_iter, consequences of the change are less significant, though I believe it makes sense to treat rng arguments the same way; I think the following signature still satisfies the motivating issues (e.g. ten_dice_rolls_other_than_five works):

fn sample_iter<'a, R>(self, rng: &'a mut R) -> DistIter<'a, Self, R, T>
where R: Rng + ?Sized, Self: Sized

@huonw and @burdges may be interested to comment but I believe this analysis wraps up the issue for now.

@burdges
Copy link
Contributor

burdges commented Apr 17, 2019

Right, fair enough. Could sample_iter still take an Rng though? I'd think distributions should normally be Copy so would this work?

pub struct DistIter<D, R, T> 
where D: Distribution<T>, R: Rng,
{
    distr: D,
    rng: R,
    phantom: ::core::marker::PhantomData<T>,
}

pub trait Distribution<T> {
    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> T;
    fn sample_iter<R>(self, rng: R) -> DistIter<Self, R, T>
        where Self: Sized, R: Rng
    {
        DistIter {
            distr: self,
            rng: rng,
            phantom: ::core::marker::PhantomData,
        }
    }
}

It's rarely useful but occasionally someone might pass around an owned DistIter without doing anything fancy. I'd expect this more in numerical setting so you'd know better than I if this actually helps much.

@dhardy
Copy link
Member Author

dhardy commented Apr 17, 2019

You mean such that Standard.sample_iter(thread_rng()) is a valid object without lifetime bounds? I guess the merit of this may outweigh the oddity of differences vs Distribution::sample. 👍

@dhardy dhardy merged commit 9828cdf into rust-random:master Apr 19, 2019
@dhardy dhardy deleted the sized branch May 16, 2019 11:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants