Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes in ArbitraryGenerator #502

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 8 additions & 8 deletions bin/strata-client/src/extractor.rs
Original file line number Diff line number Diff line change
Expand Up @@ -337,7 +337,7 @@ mod tests {
max_txs_per_block: usize,
needle: (Vec<u8>, ProtocolOperation),
) -> usize {
let arb = ArbitraryGenerator::new();
let mut arb = ArbitraryGenerator::new();
assert!(
num_blocks.gt(&0) && max_txs_per_block.gt(&0),
"num_blocks and max_tx_per_block must be at least 1"
Expand Down Expand Up @@ -397,7 +397,7 @@ mod tests {

/// Create a known transaction that should be present in some block.
fn get_needle() -> (Vec<u8>, ProtocolOperation, DepositInfo) {
let arb = ArbitraryGenerator::new();
let mut arb = ArbitraryGenerator::new();
let network = Network::Regtest;

let el_address: [u8; 20] = arb.generate();
Expand Down Expand Up @@ -469,20 +469,20 @@ mod tests {
/// So, you can assume that this flag represents whether the pair is valid (true) or invalid
/// (false).
fn generate_mock_tx() -> (Vec<u8>, ProtocolOperation, bool) {
let arb = ArbitraryGenerator::new();
let mut arb = ArbitraryGenerator::new();

let should_be_valid: bool = arb.generate();

if should_be_valid.not() {
let (invalid_tx, invalid_protocol_op) = generate_invalid_tx(&arb);
let (invalid_tx, invalid_protocol_op) = generate_invalid_tx(&mut arb);
return (invalid_tx, invalid_protocol_op, should_be_valid);
}

let (valid_tx, valid_protocol_op) = generate_valid_tx(&arb);
let (valid_tx, valid_protocol_op) = generate_valid_tx(&mut arb);
(valid_tx, valid_protocol_op, should_be_valid)
}

fn generate_invalid_tx(arb: &ArbitraryGenerator) -> (Vec<u8>, ProtocolOperation) {
fn generate_invalid_tx(arb: &mut ArbitraryGenerator) -> (Vec<u8>, ProtocolOperation) {
let random_protocol_op: ProtocolOperation = arb.generate();

// true => tx invalid
Expand All @@ -509,7 +509,7 @@ mod tests {
(tx_with_invalid_script_pubkey, random_protocol_op)
}

fn generate_valid_tx(arb: &ArbitraryGenerator) -> (Vec<u8>, ProtocolOperation) {
fn generate_valid_tx(arb: &mut ArbitraryGenerator) -> (Vec<u8>, ProtocolOperation) {
let (tx, spend_info, script_to_spend) = generate_mock_unsigned_tx();

let random_hash = *spend_info
Expand Down Expand Up @@ -565,7 +565,7 @@ mod tests {
let empty_deposits = empty_chain_state.deposits_table_mut();
let mut deposits_table = DepositsTable::new_empty();

let arb = ArbitraryGenerator::new();
let mut arb = ArbitraryGenerator::new();

let mut operators: Vec<OperatorIdx> = arb.generate();
loop {
Expand Down
2 changes: 1 addition & 1 deletion crates/bridge-relay/src/relayer.rs
Original file line number Diff line number Diff line change
Expand Up @@ -245,7 +245,7 @@ mod tests {
let processed_msgs = RecentMessageTracker::new();

// Create valid BridgeMsgId instances for testing
let ag = ArbitraryGenerator::new();
let mut ag = ArbitraryGenerator::new();
let cur_message_id: BridgeMsgId = ag.generate();
let old_message_id: BridgeMsgId = ag.generate();
assert_ne!(cur_message_id, old_message_id);
Expand Down
12 changes: 6 additions & 6 deletions crates/consensus-logic/src/duty/block_assembly.rs
Original file line number Diff line number Diff line change
Expand Up @@ -444,7 +444,7 @@ mod tests {

#[test]
fn test_find_pivot_noop() {
let ag = ArbitraryGenerator::new_with_size(1 << 12);
let mut ag = ArbitraryGenerator::new_with_size(1 << 12);

let blkids: [L1BlockId; 10] = ag.generate();
eprintln!("{blkids:#?}");
Expand All @@ -467,7 +467,7 @@ mod tests {

#[test]
fn test_find_pivot_noop_offset() {
let ag = ArbitraryGenerator::new_with_size(1 << 12);
let mut ag = ArbitraryGenerator::new_with_size(1 << 12);

let blkids: [L1BlockId; 10] = ag.generate();
eprintln!("{blkids:#?}");
Expand All @@ -490,7 +490,7 @@ mod tests {

#[test]
fn test_find_pivot_simple_extend() {
let ag = ArbitraryGenerator::new_with_size(1 << 12);
let mut ag = ArbitraryGenerator::new_with_size(1 << 12);

let blkids1: [L1BlockId; 10] = ag.generate();
let mut blkids2 = Vec::from(blkids1);
Expand Down Expand Up @@ -522,7 +522,7 @@ mod tests {

#[test]
fn test_find_pivot_typical_reorg() {
let ag = ArbitraryGenerator::new_with_size(1 << 16);
let mut ag = ArbitraryGenerator::new_with_size(1 << 16);

let mut blkids1: Vec<L1BlockId> = Vec::new();
for _ in 0..10 {
Expand Down Expand Up @@ -559,7 +559,7 @@ mod tests {

#[test]
fn test_find_pivot_cur_shorter_reorg() {
let ag = ArbitraryGenerator::new_with_size(1 << 16);
let mut ag = ArbitraryGenerator::new_with_size(1 << 16);

let mut blkids1: Vec<L1BlockId> = Vec::new();
for _ in 0..10 {
Expand Down Expand Up @@ -600,7 +600,7 @@ mod tests {

#[test]
fn test_find_pivot_disjoint() {
let ag = ArbitraryGenerator::new_with_size(1 << 16);
let mut ag = ArbitraryGenerator::new_with_size(1 << 16);

let mut blkids1: Vec<L1BlockId> = Vec::new();
for _ in 0..10 {
Expand Down
2 changes: 1 addition & 1 deletion crates/evmexec/src/engine.rs
Original file line number Diff line number Diff line change
Expand Up @@ -518,7 +518,7 @@ mod tests {

let el_payload = random_el_payload();

let arb = strata_test_utils::ArbitraryGenerator::new();
let mut arb = strata_test_utils::ArbitraryGenerator::new();
let l2block: L2Block = arb.generate();
let accessory = L2BlockAccessory::new(borsh::to_vec(&el_payload).unwrap());
let l2block_bundle = L2BlockBundle::new(l2block, accessory);
Expand Down
2 changes: 1 addition & 1 deletion crates/primitives/src/l1.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1314,7 +1314,7 @@ mod tests {

#[test]
fn test_bitcoin_txid_serialize_deserialize() {
let generator = ArbitraryGenerator::new();
let mut generator = ArbitraryGenerator::new();
let txid: BitcoinTxid = generator.generate();

let serialized_txid =
Expand Down
2 changes: 1 addition & 1 deletion crates/primitives/src/relay/util.rs
Original file line number Diff line number Diff line change
Expand Up @@ -187,7 +187,7 @@ mod tests {

#[test]
fn test_message_signer_serde() {
let generator = ArbitraryGenerator::new();
let mut generator = ArbitraryGenerator::new();
let txid: BitcoinTxid = generator.generate();
let scope = Scope::V0PubNonce(txid);
let payload: Musig2PubNonce = generator.generate();
Expand Down
4 changes: 2 additions & 2 deletions crates/rocksdb-store/src/bridge/db.rs
Original file line number Diff line number Diff line change
Expand Up @@ -233,7 +233,7 @@ mod tests {
fn test_bridge_duty_status_db() {
let db = setup_duty_db();

let arb = ArbitraryGenerator::new();
let mut arb = ArbitraryGenerator::new();

let duty_status: BridgeDutyStatus = arb.generate();
let txid: BitcoinTxid = arb.generate();
Expand Down Expand Up @@ -315,7 +315,7 @@ mod tests {
fn test_bridge_duty_index_db() {
let db = setup_bridge_duty_index_db();

let arb = ArbitraryGenerator::new();
let mut arb = ArbitraryGenerator::new();

let checkpoint: u64 = arb.generate();

Expand Down
2 changes: 1 addition & 1 deletion crates/rocksdb-store/src/bridge_relay/db.rs
Original file line number Diff line number Diff line change
Expand Up @@ -118,7 +118,7 @@ mod tests {
}

fn make_bridge_msg() -> (u128, BridgeMessage) {
let arb = ArbitraryGenerator::new();
let mut arb = ArbitraryGenerator::new();

let msg: BridgeMessage = arb.generate();

Expand Down
2 changes: 1 addition & 1 deletion crates/rocksdb-store/src/l1/db.rs
Original file line number Diff line number Diff line change
Expand Up @@ -239,7 +239,7 @@ mod tests {
db: &L1Db,
num_txs: usize,
) -> (L1BlockManifest, Vec<L1Tx>, CompactMmr) {
let arb = ArbitraryGenerator::new();
let mut arb = ArbitraryGenerator::new();

// TODO maybe tweak this to make it a bit more realistic?
let mf: L1BlockManifest = arb.generate();
Expand Down
2 changes: 1 addition & 1 deletion crates/rocksdb-store/src/l2/db.rs
Original file line number Diff line number Diff line change
Expand Up @@ -117,7 +117,7 @@ mod tests {
use crate::test_utils::get_rocksdb_tmp_instance;

fn get_mock_data() -> L2BlockBundle {
let arb = ArbitraryGenerator::new();
let mut arb = ArbitraryGenerator::new();
let l2_block: L2BlockBundle = arb.generate();

l2_block
Expand Down
2 changes: 1 addition & 1 deletion crates/test-utils/src/l2.rs
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ use strata_state::{
use crate::{bitcoin::get_btc_chain, ArbitraryGenerator};

pub fn gen_block(parent: Option<&SignedL2BlockHeader>) -> L2BlockBundle {
let arb = ArbitraryGenerator::new();
let mut arb = ArbitraryGenerator::new();
let header: L2BlockHeader = arb.generate();
let body: L2BlockBody = arb.generate();
let accessory: L2BlockAccessory = arb.generate();
Expand Down
42 changes: 18 additions & 24 deletions crates/test-utils/src/lib.rs
Original file line number Diff line number Diff line change
@@ -1,46 +1,40 @@
use std::sync::atomic::{AtomicUsize, Ordering};

use arbitrary::{Arbitrary, Unstructured};
use rand::{rngs::OsRng, RngCore};
use rand::{thread_rng, RngCore};

pub mod bitcoin;
pub mod bridge;
pub mod evm_ee;
pub mod l2;

const ARB_GEN_LEN: usize = 1 << 24; // 16 MiB
// Smaller buffer size as compared to 2^24
const ARB_GEN_LEN: usize = 128;

pub struct ArbitraryGenerator {
buf: Vec<u8>,
off: AtomicUsize,
rng: rand::rngs::ThreadRng, // Thread-local RNG
buf: Vec<u8>, // Persistent buffer
}

impl Default for ArbitraryGenerator {
fn default() -> Self {
Self::new()
}
}

impl ArbitraryGenerator {
pub fn new() -> Self {
Self::new_with_size(ARB_GEN_LEN)
ArbitraryGenerator {
rng: thread_rng(),
buf: vec![0u8; ARB_GEN_LEN],
}
}

pub fn new_with_size(n: usize) -> Self {
let mut buf = vec![0; n];
OsRng.fill_bytes(&mut buf); // 128 wasn't enough
let off = AtomicUsize::new(0);
ArbitraryGenerator { buf, off }
pub fn new_with_size(s: usize) -> Self {
ArbitraryGenerator {
rng: thread_rng(),
buf: vec![0u8; s],
}
}

pub fn generate<'a, T: Arbitrary<'a> + Clone>(&'a self) -> T {
// Doing hacky atomics to make this actually be reusable, this is pretty bad.
let off = self.off.load(Ordering::Relaxed);
let mut u = Unstructured::new(&self.buf[off..]);
let prev_off = u.len();
let inst = T::arbitrary(&mut u).expect("failed to generate arbitrary instance");
let additional_off = prev_off - u.len();
self.off.store(off + additional_off, Ordering::Relaxed);
inst
pub fn generate<'a, T: Arbitrary<'a> + Clone>(&'a mut self) -> T {
self.rng.fill_bytes(&mut self.buf);
let mut u = Unstructured::new(&self.buf);
T::arbitrary(&mut u).expect("Failed to generate arbitrary instance")
}
Comment on lines 22 to +38
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this is written kinda makes having the random buf be a field pointless. Since we sample new randomness on each call it might as well just be a local and then we can just have this take &self. Also what happens if we consume more than the length of randomness in a call to generate? I'm not sure the best design approach here since it'd be nice to be able to make it deterministic if desired.

Also should verify if it makes sense to include the rng as a field as opposed to just calling thread_rng() on every invocation. I'm not sure if there's a practical difference in terms of speed/determinism there.

Copy link
Contributor

@bewakes bewakes Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the buf field is required because of lifetime issues. The random buf is generated inside the method which can't leave the method scope.

I don't really like having mut everywhere though. Maybe a RefCell can be used for buffer(and for rng?), which should be fine since we probably won't be using a generator instance across multiple threads. What are your thoughts on this?

Copy link
Contributor Author

@purusang purusang Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this is written kinda makes having the random buf be a field pointless.

The reason for having buf field is due to lifetime requirement, like @bewakes said.

Since we sample new randomness on each call it might as well just be a local and then we can just have this take &self.

We needed &mut due to having to update randomness on every call.

Also what happens if we consume more than the length of randomness in a call to generate? I'm not sure the best design approach here since it'd be nice to be able to make it deterministic if desired.

It's a good concern, but I think having new_with_size() serves that purpose.

Also should verify if it makes sense to include the rng as a field as opposed to just calling thread_rng() on every invocation. I'm not sure if there's a practical difference in terms of speed/determinism there.

Both approaches definitely work, and don't see either way to be an issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the buf field is required because of lifetime issues. The random buf is generated inside the method which can't leave the method scope.

That's what I observed too.

I don't really like having mut everywhere though. Maybe a RefCell can be used for buffer(and for rng?), which should be fine since we probably won't be using a generator instance across multiple threads. What are your thoughts on this?

Yeah, initially I did not like the idea of having to change multiple files to pass mutable arbitrary generator but due to lifetime requirement using RefCell was not helpful there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh christ I'm realizing how this api kinda sucks. I didn't see that Arbitrary has a lifetime param.

Taking &mut self means I think you can have multiple invocations now can you? Since you'd overwrite the buffer that we have a hypothetical ref to in the value returned in the previous invocation.

Is this an issue that we can deal with by generating the randomness upfront?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can have multiple invocations now can you?

Multiple arb.generate()? Yes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant can't. Since you'd need to take an exclusive borrow while there was a shared borrow active.

}
Loading