feat(l1): process blocks in batches when syncing and importing#2174
Merged
MarcosNicolau merged 72 commits into main on Mar 25, 2025
mpaulucci
reviewed
Mar 7, 2025
mpaulucci
reviewed
Mar 7, 2025
mpaulucci
reviewed
Mar 7, 2025
mpaulucci
reviewed
Mar 7, 2025
mpaulucci
reviewed
Mar 7, 2025
crates/blockchain/blockchain.rs
Outdated
    // todo only execute transactions
    // batch account updates to merge the repeated accounts
    self.storage
        .apply_account_updates_to_trie(&account_updates, &mut state_trie)?;
Collaborator
Ahh, I understand now. I was hoping we could just call `get_state_transitions` (https://github.com/lambdaclass/lambda_ethereum_rust/blob/0acc5e28b861f88c30cebb6cbfe0230970df25ed/crates/vm/backends/revm.rs#L96) only once.
We would need to add an `execute_blocks` function inside the vm.
Collaborator
we can discuss this later.
Contributor
Author
I've tried this approach but it won't work without making larger modifications to the vm backend.
we were not setting the canonical block hash for the number
fmoletta
reviewed
Mar 11, 2025
fmoletta
reviewed
Mar 11, 2025
fmoletta
reviewed
Mar 11, 2025
fmoletta
reviewed
Mar 11, 2025
mpaulucci
reviewed
Mar 21, 2025
mpaulucci
approved these changes
Mar 21, 2025
fmoletta
reviewed
Mar 21, 2025
fmoletta
reviewed
Mar 21, 2025
fmoletta
reviewed
Mar 21, 2025
fmoletta
reviewed
Mar 21, 2025
fmoletta
reviewed
Mar 21, 2025
fmoletta
reviewed
Mar 21, 2025
fmoletta
reviewed
Mar 21, 2025
fmoletta
reviewed
Mar 25, 2025
fmoletta
reviewed
Mar 25, 2025
Co-authored-by: fmoletta <99273364+fmoletta@users.noreply.github.com>
fmoletta
approved these changes
Mar 25, 2025
19e3593 to
6c037c7
Compare
pedrobergamini pushed a commit to pedrobergamini/ethrex that referenced this pull request on Aug 24, 2025
…aclass#2174)

**Motivation**

Accelerate syncing!

**Description**

This PR introduces block batching during full sync:

1. Instead of storing and computing the state root for each block individually, we now maintain a single state trie for the entire batch and commit it only at the end. This results in one state trie per `n` blocks instead of one per block, which also reduces storage.
2. The new full sync process:
   - Request 1024 headers.
   - Request 1024 block bodies and collect them.
   - Once all blocks are received, process them in batches using a single state trie, which is attached to the last block.
3. Blocks are now stored in a single transaction.
4. State root, receipts root, and request root validation are only required for the last block in the batch.
5. The new `add_blocks_in_batch` function includes a flag, `should_commit_intermediate_tries`. When set to true, it stores the tries for each block. This functionality was added to make the hive tests pass. Currently, this is handled by checking whether the block is within the `STATE_TRIES_TO_KEEP` range. In a real syncing scenario, my intuition is that it would be better to wait until we are fully synced, then start storing the state of new blocks and pruning once we reach `STATE_TRIES_TO_KEEP`.
6. Throughput when syncing is now measured per batch.
7. A new command was added to import blocks in batches.

Considerations:

1. ~Optimize account updates: Instead of inserting updates into the state trie after each block execution, batch them at the end, merging repeated accounts to reduce insertions and improve performance (see lambdaclass#2216)~ Closes lambdaclass#2216.
2. Improve transaction handling: Avoid committing storage tries to the database separately. Instead, create a single transaction for storing receipts, storage tries, and blocks. This would require additional abstractions for transaction management (see lambdaclass#2217).
3. This does not yet work with the `levm` backend: it needs to cache the execution state and persist it between blocks, since nothing is stored until the end of the batch (see lambdaclass#2218).
4. In lambdaclass#2210 a new CI job is added to run a benchmark comparing `main` and the `head` branch using `import-in-batch`.

Closes None

---------

Co-authored-by: Martin Paulucci <martin.c.paulucci@gmail.com>
Co-authored-by: fmoletta <99273364+fmoletta@users.noreply.github.com>
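For readers who want a concrete picture of the batching idea in the description above, here is a minimal, self-contained sketch. It is not the ethrex implementation: `Block`, `State`, `toy_root`, `merge_updates` and `add_blocks_in_batch_sketch` are all hypothetical names invented for illustration, the "state" is a plain `HashMap` rather than a trie, and the "root" is a toy checksum rather than a Merkle root. It only shows the shape of the approach: merge repeated account updates across the batch, validate the root of the last block only, and commit once.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily simplified block (not the ethrex `Block` type).
#[derive(Clone, Debug)]
struct Block {
    number: u64,
    /// (account, new balance) pairs that executing this block would produce.
    updates: Vec<(String, u64)>,
    /// Toy stand-in for the state root stored in the block header.
    expected_state_root: u64,
}

/// Toy state: account -> balance. Stands in for the state trie.
type State = HashMap<String, u64>;

/// Toy, order-independent checksum of the state. NOT a real Merkle root.
fn toy_root(state: &State) -> u64 {
    state.iter().fold(0u64, |acc, (account, balance)| {
        let key_sum: u64 = account.bytes().map(u64::from).sum();
        acc.wrapping_add(key_sum.wrapping_mul(31).wrapping_add(*balance))
    })
}

/// Merge the updates of every block so each account is written once per batch.
/// Later blocks overwrite earlier ones for the same account.
fn merge_updates(blocks: &[Block]) -> HashMap<String, u64> {
    let mut merged = HashMap::new();
    for block in blocks {
        for (account, balance) in &block.updates {
            merged.insert(account.clone(), *balance);
        }
    }
    merged
}

/// Apply a whole batch to `state`, checking only the last block's root.
/// Returns the number of account writes performed for the batch.
/// On a root mismatch, `state` is left untouched.
fn add_blocks_in_batch_sketch(state: &mut State, blocks: &[Block]) -> Result<usize, String> {
    let Some(last) = blocks.last() else {
        return Ok(0); // empty batch: nothing to do
    };
    let merged = merge_updates(blocks);
    let writes = merged.len();

    // Stage the changes on a copy so a failed validation commits nothing.
    let mut staged = state.clone();
    staged.extend(merged);

    // Intermediate roots are never computed; only the last block is validated.
    if toy_root(&staged) != last.expected_state_root {
        return Err(format!("state root mismatch at block {}", last.number));
    }

    *state = staged; // single commit for the whole batch
    Ok(writes)
}
```

In this sketch a batch of two blocks that both touch the same account results in one write for that account instead of two, and a wrong `expected_state_root` on an intermediate block goes unnoticed by design, which mirrors the trade-off described in point 4.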