Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes in ArbitraryGenerator #502

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

purusang
Copy link
Contributor

@purusang purusang commented Dec 1, 2024

Description

  • Used 128 bytes of default buffer size vs 2^24 bytes buffer size earlier.
  • Used thread_rng to generate random number in each call as compared to using slices from pre-initialized huge buffer
  • Updated unit tests accordingly to work with new changes.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature/Enhancement (non-breaking change which adds functionality or enhances an existing one)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactor
  • New or updated tests
  • Dependency Update

Notes to Reviewers

Checklist

  • I have performed a self-review of my code.
  • I have commented my code where necessary.
  • I have updated the documentation if needed.
  • My changes do not introduce new warnings.
  • I have added tests that prove my changes are effective or that my feature works.
  • New and existing tests pass with my changes.

Related Issues

STR-551

- Used 128 bytes of default buffer size vs 2^24 bytes buffer size earlier.
- Used thread_rng to generate random number in each call as compared to using slices from pre-initialized huge buffer
- Updated unit tests accordingly to work with new changes.
Copy link

codecov bot commented Dec 1, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 56.48%. Comparing base (0ea3c38) to head (5dd9049).
Report is 4 commits behind head on main.

@@            Coverage Diff             @@
##             main     #502      +/-   ##
==========================================
+ Coverage   56.45%   56.48%   +0.03%     
==========================================
  Files         273      275       +2     
  Lines       29079    29190     +111     
==========================================
+ Hits        16416    16489      +73     
- Misses      12663    12701      +38     
Files with missing lines Coverage Δ
bin/strata-client/src/extractor.rs 93.26% <100.00%> (-3.27%) ⬇️
crates/bridge-relay/src/relayer.rs 33.88% <100.00%> (ø)
crates/consensus-logic/src/duty/block_assembly.rs 47.63% <100.00%> (ø)
crates/evmexec/src/engine.rs 65.52% <100.00%> (ø)
crates/primitives/src/l1.rs 80.00% <100.00%> (-0.11%) ⬇️
crates/primitives/src/relay/util.rs 96.94% <100.00%> (ø)
crates/rocksdb-store/src/bridge/db.rs 84.93% <100.00%> (ø)
crates/rocksdb-store/src/bridge_relay/db.rs 97.91% <100.00%> (ø)
crates/rocksdb-store/src/l1/db.rs 97.29% <100.00%> (ø)
crates/rocksdb-store/src/l2/db.rs 99.43% <100.00%> (ø)
... and 2 more

... and 20 files with indirect coverage changes

Comment on lines 22 to +38
pub fn new() -> Self {
Self::new_with_size(ARB_GEN_LEN)
ArbitraryGenerator {
rng: thread_rng(),
buf: vec![0u8; ARB_GEN_LEN],
}
}

pub fn new_with_size(n: usize) -> Self {
let mut buf = vec![0; n];
OsRng.fill_bytes(&mut buf); // 128 wasn't enough
let off = AtomicUsize::new(0);
ArbitraryGenerator { buf, off }
pub fn new_with_size(s: usize) -> Self {
ArbitraryGenerator {
rng: thread_rng(),
buf: vec![0u8; s],
}
}

pub fn generate<'a, T: Arbitrary<'a> + Clone>(&'a self) -> T {
// Doing hacky atomics to make this actually be reusable, this is pretty bad.
let off = self.off.load(Ordering::Relaxed);
let mut u = Unstructured::new(&self.buf[off..]);
let prev_off = u.len();
let inst = T::arbitrary(&mut u).expect("failed to generate arbitrary instance");
let additional_off = prev_off - u.len();
self.off.store(off + additional_off, Ordering::Relaxed);
inst
pub fn generate<'a, T: Arbitrary<'a> + Clone>(&'a mut self) -> T {
self.rng.fill_bytes(&mut self.buf);
let mut u = Unstructured::new(&self.buf);
T::arbitrary(&mut u).expect("Failed to generate arbitrary instance")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this is written kinda makes having the random buf be a field pointless. Since we sample new randomness on each call it might as well just be a local and then we can just have this take &self. Also what happens if we consume more than the length of randomness in a call to generate? I'm not sure the best design approach here since it'd be nice to be able to make it deterministic if desired.

Also should verify if it makes sense to include the rng as a field as opposed to just calling thread_rng() on every invocation. I'm not sure if there's a practical difference in terms of speed/determinism there.

Copy link
Contributor

@bewakes bewakes Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the buf field is required because of lifetime issues. The random buf is generated inside the method which can't leave the method scope.

I don't really like having mut everywhere though. Maybe a RefCell can be used for buffer(and for rng?), which should be fine since we probably won't be using a generator instance across multiple threads. What are your thoughts on this?

Copy link
Contributor Author

@purusang purusang Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this is written kinda makes having the random buf be a field pointless.

The reason for having buf field is due to lifetime requirement, like @bewakes said.

Since we sample new randomness on each call it might as well just be a local and then we can just have this take &self.

We needed &mut due to having to update randomness on every call.

Also what happens if we consume more than the length of randomness in a call to generate? I'm not sure the best design approach here since it'd be nice to be able to make it deterministic if desired.

It's a good concern, but I think having new_with_size() serves that purpose.

Also should verify if it makes sense to include the rng as a field as opposed to just calling thread_rng() on every invocation. I'm not sure if there's a practical difference in terms of speed/determinism there.

Both approaches definitely work, and don't see either way to be an issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the buf field is required because of lifetime issues. The random buf is generated inside the method which can't leave the method scope.

That's what I observed too.

I don't really like having mut everywhere though. Maybe a RefCell can be used for buffer(and for rng?), which should be fine since we probably won't be using a generator instance across multiple threads. What are your thoughts on this?

Yeah, initially I did not like the idea of having to change multiple files to pass mutable arbitrary generator but due to lifetime requirement using RefCell was not helpful there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh christ I'm realizing how this api kinda sucks. I didn't see that Arbitrary has a lifetime param.

Taking &mut self means I think you can have multiple invocations now can you? Since you'd overwrite the buffer that we have a hypothetical ref to in the value returned in the previous invocation.

Is this an issue that we can deal with by generating the randomness upfront?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can have multiple invocations now can you?

Multiple arb.generate()? Yes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant can't. Since you'd need to take an exclusive borrow while there was a shared borrow active.

@purusang purusang self-assigned this Dec 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants