Conversation
@@ -1062,12 +1097,14 @@ pub(crate) async fn handle_received_txns<Types: NodeType>(
    let time_in = Instant::now();
    for tx in txns.into_iter() {
        let commit = tx.commit();
        let len = bincode::serialized_size(&tx).unwrap_or_default();
This is the only method we have of estimating a transaction's size short of actually serializing it. We should add one in HotShot.
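For context, a standalone sketch of this kind of estimation; the ExampleTransaction type is made up purely for illustration, and bincode 1.x with serde is assumed:

```rust
use serde::Serialize;

// Hypothetical transaction type, only for illustration.
#[derive(Serialize)]
struct ExampleTransaction {
    payload: Vec<u8>,
    nonce: u64,
}

// Returns the number of bytes bincode would use to serialize the value,
// without producing the serialized bytes themselves.
fn estimated_size<T: Serialize>(tx: &T) -> u64 {
    bincode::serialized_size(tx).unwrap_or_default()
}

fn main() {
    let tx = ExampleTransaction { payload: vec![0u8; 100], nonce: 7 };
    println!("estimated size: {} bytes", estimated_size(&tx));
}
```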
src/service.rs
Outdated
// Increase max block size by 10%
let mut global_state = self.global_state.write_arc().await;
global_state.max_block_size = std::cmp::min(
    global_state.max_block_size + global_state.max_block_size.div_ceil(10),
Maybe this isn't that big a deal, but I think the steady state here is to end up thrashing between increasing and decreasing the block size, so that we are always right around the maximum VID we can process in time, but never hitting it exactly. Not sure if there's a great way to fix that, though.
I considered that, but this oscillation would only happen in a case where we are constantly overwhelmed by transactions.
I have since thought of an easy way to address it: perform this increment only if we not only served VID in time after unnecessarily truncating the block, but also had some threshold amount of time left over.
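A rough sketch of that idea; VID_TIME_MARGIN and the timing argument are hypothetical, not part of the current code:

```rust
use std::time::Duration;

// Hypothetical threshold: only grow the limit if VID finished with at
// least this much of the allotted time to spare.
const VID_TIME_MARGIN: Duration = Duration::from_millis(500);

struct BlockSizeLimits {
    max_block_size: u64,
}

impl BlockSizeLimits {
    /// Called after a proposal round completes. `truncated` means we cut the
    /// block short; `time_left` is how much of the VID deadline remained.
    fn maybe_increment(&mut self, truncated: bool, time_left: Duration) {
        if truncated && time_left >= VID_TIME_MARGIN {
            // Grow by ~10%, mirroring the increment in the diff above.
            self.max_block_size += self.max_block_size.div_ceil(10);
        }
    }
}
```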
Left some comments about the size increment and decrement logic. Also had some questions about block size.
src/builder_state.rs
Outdated
// This way we will include one transaction over our target block length.
// This is done on purpose; otherwise we'd have a possible failure state
// where a single transaction larger than the target block size is stuck in
// the queue and we just build empty blocks.
Two questions that don't necessarily need modifications:
- Intuitively I'd assume that we'd skip a transaction larger than the max size, i.e., we'd check if tx.len > max_block_size and, if so, just skip it and try the next transaction. I understand that we may instead want to build for all transactions, but I wanted to bring this up in case this option hasn't been discussed yet.
- Would it be better to include a transaction over the target length only at the beginning? E.g., if the target length is 10 and the length of the first transaction is 12, we include it. But if we've already added a transaction with length 4 and the second transaction has length 12, we don't include it in this block (it can be added to the next block). I don't think this is strictly better than the current setting, but it feels closer to the purpose of avoiding empty blocks without making the length too large (sketched below).
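To make the second bullet concrete, a self-contained sketch of that selection policy with simplified types (not the builder's actual queue handling):

```rust
/// Greedily fill a block up to `target_size`, allowing a single oversized
/// transaction only when it would be the first one in the block.
fn select_txs(queue: &[Vec<u8>], target_size: usize) -> Vec<Vec<u8>> {
    let mut block = Vec::new();
    let mut used = 0usize;
    for tx in queue {
        if block.is_empty() {
            // The first transaction is always included, even if it alone
            // exceeds the target, so we never build empty blocks.
            used += tx.len();
            block.push(tx.clone());
        } else if used + tx.len() <= target_size {
            used += tx.len();
            block.push(tx.clone());
        } else {
            // Leave it for the next block instead of overshooting now.
            break;
        }
    }
    block
}
```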
I don't think we should silently drop transactions on the basis that they're too big for us to handle. What we can do is immediately reject them when they're added to the mempool; I'll add a check for that.
As for dropping transactions too big for the sequencer to handle, see my response to your comment below.
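A minimal sketch of rejecting oversized transactions at submission time instead of dropping them silently later; the error type and function shape are illustrative, not the actual service API:

```rust
#[derive(Debug)]
enum SubmitError {
    TooBig { size: u64, max_block_size: u64 },
}

/// Hypothetical admission check run when a transaction enters the mempool.
fn check_tx_size(tx_size: u64, max_block_size: u64) -> Result<(), SubmitError> {
    if tx_size > max_block_size {
        // Reject up front and tell the submitter why, rather than letting the
        // transaction sit in the queue and get dropped during block building.
        return Err(SubmitError::TooBig {
            size: tx_size,
            max_block_size,
        });
    }
    Ok(())
}
```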
I still think the second point above is an improvement we could make, but I don't think it's urgent. And your replies to other questions/comments all make sense! I'll approve once the increment logic is fixed and will add this discussion to our tech debt.
I agree on the second point, addressed it
src/builder_state.rs
Outdated
// The payload is empty despite us checking that tx_queue isn't empty earlier.
// This indicates that the block was truncated due to sequencer block length
// limits and the transaction at the head of our queue is too big to ever be
// included. We need to drop it and mark it as "included" so that if we receive
// it again we don't even bother with it.
Is the "sequencer block length limits" the same as max_block_size
? The comment and code here make sense to me, but see my first question above--I'm wondering whether we can skip the large transaction there so that we don't need the list updates here.
Sequencer block length limits are different from our max_block_size. max_block_size is our estimate of what we can calculate VID for in time, while the sequencer's limits are determined by its chain config.

Unfortunately, we can't check whether a transaction is over the sequencer's limit immediately, because there's no way for us to access the chain config: we don't know anything about the sequencer's types. Moreover, even if we did, the sequencer's maximum block size is not the same as a maximum transaction size, because blocks have overhead for namespaces and other things. And even if we had a way to get a clearly defined transaction size limit from the sequencer, we can't even know the actual length of a TYPES::Transaction!

So this code block detects whether a single transaction is big enough to single-handedly exceed the sequencer's block size limit by observing the following scenario:
- we have some transactions in the queue
- we build a block from them by calling <TYPES::BlockPayload as BlockPayload<TYPES>>::from_transactions
- the sequencer's implementation of that method silently truncates the block at the first transaction that would push the block over the sequencer's limit
- <TYPES::BlockPayload as BlockPayload<TYPES>>::from_transactions returns a block, but calling num_transactions on it shows that it's completely empty

This means that the first transaction we tried to include is too big, so we drop it from our queue.
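A simplified sketch of that detection path, using placeholder types for queue entries and commitments (the real builder state differs):

```rust
use std::collections::{HashSet, VecDeque};

struct TxQueueState {
    tx_queue: VecDeque<Vec<u8>>,
    // Commitments we treat as already handled so we never retry them.
    included_txs: HashSet<u64>,
}

impl TxQueueState {
    /// Called after from_transactions was invoked on a non-empty queue.
    /// `num_transactions` is what the returned payload reports.
    fn handle_built_payload(&mut self, num_transactions: usize, head_commit: u64) {
        if num_transactions == 0 {
            // The sequencer truncated the block at the very first transaction,
            // so the head of our queue can never fit in any block: drop it and
            // remember it so we skip it if it's ever submitted again.
            let _dropped = self.tx_queue.pop_front();
            self.included_txs.insert(head_commit);
        }
    }
}
```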
Thanks for the detailed explanation. Makes sense now!
global_state.max_block_size = std::cmp::min(
    global_state.max_block_size
        - global_state
            .max_block_size
            .div_ceil(MAX_BLOCK_SIZE_CHANGE_DIVISOR),
    MAX_BLOCK_SIZE_FLOOR,
);
IIUC MAX_BLOCK_SIZE_FLOOR is the minimum block size we accept, so I think this should be cmp::max to make sure we don't set a size lower than MAX_BLOCK_SIZE_FLOOR.
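In code, the suggested fix would look roughly like this (the constant values are placeholders):

```rust
const MAX_BLOCK_SIZE_CHANGE_DIVISOR: u64 = 10;
const MAX_BLOCK_SIZE_FLOOR: u64 = 5_000;

/// Shrink the block size estimate, but never go below the floor.
fn decrement_max_block_size(max_block_size: u64) -> u64 {
    std::cmp::max(
        max_block_size - max_block_size.div_ceil(MAX_BLOCK_SIZE_CHANGE_DIVISOR),
        MAX_BLOCK_SIZE_FLOOR,
    )
}
```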
src/service.rs
Outdated
        + global_state
            .max_block_size
            .div_ceil(MAX_BLOCK_SIZE_CHANGE_DIVISOR),
    MAX_BLOCK_SIZE_FLOOR,
Shouldn't this be MAX_BLOCK_SIZE_CEIL (not sure if we have such a value)?
Oh, good catch! There's no reason to have such a ceiling; the whole min call is a copy-paste error.
LGTM
Closes #0000
This PR:
Implements a heuristic where we try not to build blocks over a certain size.
The current behavior can lead to an issue where we accumulate too many transactions to respond to a VID request in time and get stuck in a loop: our queue size never decreases, because we are too slow to answer and our blocks aren't included.
This PR introduces a limit on the size of the blocks we build; blocks over this size aren't built even if we have more transactions in the queue. We start with an assumption of how big a block we can calculate VID for in time, and this assumption is revised down every time we fail to calculate the VID in time, and up every time we truncate a block but still manage to calculate VID in time.
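For reviewers, a condensed sketch of the two adjustments described above; the constant values and method names are illustrative, not the actual code:

```rust
const MAX_BLOCK_SIZE_CHANGE_DIVISOR: u64 = 10;
const MAX_BLOCK_SIZE_FLOOR: u64 = 5_000;

struct GlobalState {
    max_block_size: u64,
}

impl GlobalState {
    /// VID wasn't computed in time: our estimate was too optimistic.
    fn revise_down(&mut self) {
        self.max_block_size = std::cmp::max(
            self.max_block_size
                - self
                    .max_block_size
                    .div_ceil(MAX_BLOCK_SIZE_CHANGE_DIVISOR),
            MAX_BLOCK_SIZE_FLOOR,
        );
    }

    /// We truncated a block but still computed VID in time: we can
    /// probably afford bigger blocks.
    fn revise_up(&mut self) {
        self.max_block_size += self.max_block_size.div_ceil(MAX_BLOCK_SIZE_CHANGE_DIVISOR);
    }
}
```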
This PR does not:
Key places to review: