Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce HashedPostStateProvider #11872

Closed

Conversation

frisitano
Copy link
Contributor

@frisitano frisitano commented Oct 18, 2024

Dependencies

This PR builds on #11867 and should be reviewed after #11867 is merged.

Overview

This PR introduces the HashedPostStateProvider trait, adds it toStateProvider super trait, and implements it on StateProvider objects and parent objects. This allows for HashedPostState to be instantiated using the StateCommitment::KeyHasher type available in the providers via the following methods which have had the KH: KeyHasher generic added:

HashedPostState::from_bundle_state<'a, KH: KeyHasher>(state: impl IntoParallelIterator<Item = (&'a Address, &'a BundleAccount)>) -> Self;
HashedPostState::from_cache_state<'a, KH: KeyHasher>(state: impl IntoParallelIterator<Item = (&'a Address, &'a CacheAccount)>,) -> Self

supports: #11830

Copy link
Collaborator

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for the delay, we were working on some other provider related stuff so I delayed this. this now likely has a few conflicts, sorry about that.

this looks pretty good to me, but need @rkrasiuk and @joshieDo to take a look at this as well.

@frisitano
Copy link
Contributor Author

sorry for the delay, we were working on some other provider related stuff so I delayed this. this now likely has a few conflicts, sorry about that.

this looks pretty good to me, but need @rkrasiuk and @joshieDo to take a look at this as well.

No problem at all. I will be out of office until Monday of next week. If @rkrasiuk and @joshieDo can provide review / feedback then I can rebase upstream changes and address any feedback.

#[auto_impl::auto_impl(&, Box, Arc)]
pub trait HashedPostStateProvider {
/// Returns the `HashedPostState` of the provided `BundleState`.
fn hashed_post_state_from_bundle_state(&self, bundle_state: &BundleState) -> HashedPostState;
Copy link
Collaborator

@joshieDo joshieDo Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unless I'm missing something, this actually does not require &self right? It only requires bundle_state. Could we not move it to BundleState

feels a bit strange having it here and forcing every provider to implement it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to minimize the number of places that we introduce the StateCommitment generic to gain access to its associated types. With this in mind the current solution aims to keep the StateCommitment generic encapsulated in the providers and then expose traits with methods that use the StateCommitment associated types, HashedPostStateProvider being one of these traits. If we implemented it directly on BundleState then in any part of the code that we want to convert from BundleState to HashedPostState we also need to make sure the StateCommitment type is available, probably by adding a generic on the struct the logic is implemented on. In all cases that we want to do this conversion there is already a provider available in the environment so it makes sense to delegate to the provider as opposed to making the StateCommitment type available through a generic.

As for the existence of &self in the trait this also benefits us in regards to minimising the number of places that we introduce the StateCommitment generic as in some instances the type we want to implement HashedPostStateProvider on owns a provider that already implements this trait and hence we don't need to introduce a generic on the object. See below for an example:

/// A state provider that stores references to in-memory blocks along with their state as well as
/// the historical state provider for fallback lookups.
#[allow(missing_debug_implementations)]
pub struct MemoryOverlayStateProvider {
    /// Historical state provider for state lookups that are not found in in-memory blocks.
    pub(crate) historical: Box<dyn StateProvider>,
    /// The collection of executed parent blocks. Expected order is newest to oldest.
    pub(crate) in_memory: Vec<ExecutedBlock>,
    /// Lazy-loaded in-memory trie data.
    pub(crate) trie_state: OnceLock<MemoryOverlayTrieState>,
}

impl HashedPostStateProvider for MemoryOverlayStateProvider {
    fn hashed_post_state_from_bundle_state(
        &self,
        bundle_state: &reth_execution_types::BundleState,
    ) -> HashedPostState {
        self.historical.hashed_post_state_from_bundle_state(bundle_state)
    }

    fn hashed_post_state_from_reverts(
        &self,
        block_number: BlockNumber,
    ) -> ProviderResult<HashedPostState> {
        self.historical.hashed_post_state_from_reverts(block_number)
    }
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to minimize the number of places that we introduce the StateCommitment generic to gain access to its associated types.

yeah, if you ask me, the BundleState type is a bit horrific so I don't really want to add any additional logic in there

Copy link
Member

@Rjected Rjected Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't really understand why hashed_post_state_from_bundle_state requires anything more than BundleState, ie why it needs to be stateful. hashed_post_state_from_reverts makes more sense to require &self. Is there an implementation of this trait that makes use of the actual self for that method? Otherwise it can just be a stateless trait method

Copy link
Contributor Author

@frisitano frisitano Nov 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added HashedPostStateProvider to the StateProvider super trait. As such all StateProviders must implement HashedPostStateProvider (I still need to asses if is a hard requirement but logically I think it makes sense). For the case of MemoryOverlayStateProvider I wanted to avoid making it generic over StateCommitment and instead leverage it's field pub(crate) historical: Box<dyn StateProvider> which implements HashedPostStateProvider. As such the implementation of HashedPostStateProvider for MemoryOverlayStateProvider is defined using this delegation which requires &self as seen here. Similar situation for the implementation of HashedPostStateProvider on StateProviderTraitObjWrapper here.

To conclude for types that have a StateProvider as one of its fields we delegate to that provider for the implementation of HashedPostStateProvider instead of introducing a StateCommitment on the type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for those implementations, if it were stateless, you could do <Type as HashedPostStateProvider>::hashed_post_state_from_bundle_state(bundle_state) if it were stateless. In the macro it could be <$historical_type as ...>. The point is that none of these actually require any state, so it does not make sense to be &self

@frisitano frisitano force-pushed the feat/hashed-post-state-provider branch from 3209833 to 5ac282e Compare October 30, 2024 12:26
Comment on lines 47 to 54
impl<TX: DbTx, SC: StateCommitment> AccountReader for LatestStateProviderRef<'_, TX, SC> {
/// Get basic account information.
fn basic_account(&self, address: Address) -> ProviderResult<Option<Account>> {
self.tx.get::<tables::PlainAccountState>(address).map_err(Into::into)
}
}

impl<TX: DbTx> BlockHashReader for LatestStateProviderRef<'_, TX> {
impl<TX: DbTx, SC: StateCommitment> BlockHashReader for LatestStateProviderRef<'_, TX, SC> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this bound here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I've refactored to remove and loosen bounds where possible.

Comment on lines +172 to +179
HashedPostState::from_bundle_state::<SC::KeyHasher>(&bundle_state.state)
}

fn hashed_post_state_from_reverts(
&self,
block_number: BlockNumber,
) -> ProviderResult<HashedPostState> {
HashedPostState::from_reverts::<SC::KeyHasher>(self.tx, block_number).map_err(Into::into)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the only location where we need to access SC ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now we only use SC in the HashedPostStateProvider and bellow in HistoricalStateProviderRef:

impl<'b, TX: DbTx, SC: StateCommitment> HistoricalStateProviderRef<'b, TX, SC> {
...
    /// Retrieve revert hashed state for this history provider.
    fn revert_state(&self) -> ProviderResult<HashedPostState> {
        if !self.lowest_available_blocks.is_account_history_available(self.block_number) ||
            !self.lowest_available_blocks.is_storage_history_available(self.block_number)
        {
            return Err(ProviderError::StateAtBlockPruned(self.block_number))
        }

        if self.check_distance_against_limit(EPOCH_SLOTS)? {
            tracing::warn!(
                target: "provider::historical_sp",
                target = self.block_number,
                "Attempt to calculate state root for an old block might result in OOM"
            );
        }

        Ok(HashedPostState::from_reverts::<SC::KeyHasher>(self.tx, self.block_number)?)
    }
...
}

However, we will integrate all types as described in my reply to your review.

@Rjected Rjected added C-enhancement New feature or request A-trie Related to Merkle Patricia Trie implementation A-sdk Related to reth's use as a library labels Oct 30, 2024
Copy link
Collaborator

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

finally want to complete all of this.

can't fix conflicts here myself because opened from org fork

https://github.com/paradigmxyz/reth/compare/feat/provider-state-commitment?expand=1

@frisitano
Copy link
Contributor Author

finally want to complete all of this.

can't fix conflicts here myself because opened from org fork

https://github.com/paradigmxyz/reth/compare/feat/provider-state-commitment?expand=1

I'm sorry I've been at an offsite for the past week so I've been a little distant in responding to this work. I can fix the conflicts if that works? Are we happy to bypass the intermediary PR (#11867) and use this as the next merge target?

I had also looked into enabling permissions for upstream maintainers to modify open PR's (as described here) but the option doesn't seem to be available.

@mattsse
Copy link
Collaborator

mattsse commented Nov 14, 2024

yeah, same issue on the other

these prs are conflict magnets, perhaps easier to submit from different fork?

@frisitano
Copy link
Contributor Author

yeah, same issue on the other

these prs are conflict magnets, perhaps easier to submit from different fork?

I'll do this first thing in the morning.

@frisitano
Copy link
Contributor Author

Sorry for the delay but I've reopened this PR and the intermediary PR in the links below on a user fork. I had tried to address feedback of removing self from the HashedPostStateProvider but rust complains about it's ability to create a vtable.

#12602
#12607

@frisitano
Copy link
Contributor Author

superseded by #12607

@frisitano frisitano closed this Nov 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sdk Related to reth's use as a library A-trie Related to Merkle Patricia Trie implementation C-enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants