ci(core): benchmark for batch block import #2210
Closed
MarcosNicolau wants to merge 2 commits into main
Lines of code report
fc0a960 to e439e62
66f40d7 to e439e62
Benchmark Block Execution Results Comparison Against Main
Benchmark Block Batch Execution Results Comparison Against Main
mpaulucci reviewed on Mar 19, 2025
hyperfine --setup "./bin/ethrex-base removedb" -w 5 -N -r 10 --show-output --export-markdown "bench_pr_comparison.md" \
  -L bin "$BINS" -n "{bin}" \
  "./bin/ethrex-{bin} --network test_data/genesis-l2-ci.json import ./test_data/l2-1k-erc20.rlp --removedb"
echo -e "## Benchmark Block Batch Execution Results Comparison Against Main\n\n$(cat bench_pr_comparison.md)" > bench_pr_comparison.md
Collaborator
nit: maybe we could move the "Benchmark Block Batch Execution Results Comparison Against Main" heading and the "bench_pr_comparison.md" file name into variables/constants, so that each is defined in one place and changing it doesn't break the workflow
Collaborator
You should remove https://github.com/lambdaclass/ethrex/blob/main/.github/workflows/ci_bench_block_execution.yaml and add yours. Comparing against base is more accurate.
mpaulucci requested changes on Mar 21, 2025
Collaborator
Not sure we want to have a separate job just for this use case, but let's keep this open anyway.
github-merge-queue bot pushed a commit that referenced this pull request on Mar 25, 2025
**Motivation**
Accelerate syncing!
**Description**
This PR introduces block batching during full sync:
1. Instead of storing and computing the state root for each block
individually, we now maintain a single state tree for the entire batch,
committing it only at the end. This results in one state trie per `n`
blocks instead of one per block (we'll need less storage also).
2. The new full sync process:
- Request 1024 headers
- Request 1024 block bodies and collect them
- Once all blocks are received, process them in batches using a single
state trie, which is attached to the last block.
3. Blocks are now stored in a single transaction.
4. State root, receipts root, and request root validation are only
required for the last block in the batch.
5. The new `add_blocks_in_batch` function includes a flag,
`should_commit_intermediate_tries`. When set to true, it stores the
tries for each block. This functionality is added to make the hive test
pass. Currently, this is handled by verifying if the block is within the
`STATE_TRIES_TO_KEEP` range. In a real syncing scenario, my intuition is
that it would be better to wait until we are fully synced and then we
would start storing the state of the new blocks and pruning when we
reach `STATE_TRIES_TO_KEEP`.
6. Throughput when syncing is now measured per batch.
7. A new command was added to import blocks in batch.
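The batching idea in the list above can be sketched with a toy model. This is only an illustration under stated assumptions: `Block`, `Store`, and the signature of `add_blocks_in_batch` below are hypothetical stand-ins, a plain `BTreeMap` plays the role of the state trie, and pushing block numbers into a `Vec` stands in for the single database transaction. None of it is ethrex's actual API.

```rust
use std::collections::BTreeMap;

// Toy stand-ins for illustration only; these are not ethrex's real types.
type Address = u64;
type Balance = u64;

#[derive(Debug, Clone)]
struct Block {
    number: u64,
    /// (account, new balance) writes produced by executing this block.
    writes: Vec<(Address, Balance)>,
}

#[derive(Default)]
struct Store {
    /// Block number -> state snapshot committed for that block.
    committed_states: BTreeMap<u64, BTreeMap<Address, Balance>>,
    /// Numbers of the blocks that have been stored.
    blocks: Vec<u64>,
}

/// Hypothetical signature, loosely modelled on the function described above.
/// One in-memory state is carried across the whole batch and committed only
/// for the last block, unless `should_commit_intermediate_tries` is set.
fn add_blocks_in_batch(
    store: &mut Store,
    parent_state: &BTreeMap<Address, Balance>,
    blocks: &[Block],
    should_commit_intermediate_tries: bool,
) {
    let mut state = parent_state.clone();
    for (i, block) in blocks.iter().enumerate() {
        // "Execute" the block by applying its writes to the shared state.
        for &(address, balance) in &block.writes {
            state.insert(address, balance);
        }
        let is_last = i + 1 == blocks.len();
        if is_last || should_commit_intermediate_tries {
            store.committed_states.insert(block.number, state.clone());
        }
    }
    // All blocks are stored together, standing in for one transaction.
    store.blocks.extend(blocks.iter().map(|b| b.number));
}
```

In this sketch, importing four blocks with the flag set to `false` leaves a single committed state, attached to the last block, while setting it to `true` commits one state per block, which is the behaviour the description says is needed for the hive tests.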
Considerations:
1. ~Optimize account updates: Instead of inserting updates into the
state trie after each block execution, batch them at the end, merging
repeated accounts to reduce insertions and improve performance (see
#2216)~ Closes #2216.
2. Improve transaction handling: Avoid committing storage tries to the
database separately. Instead, create a single transaction for storing
receipts, storage tries, and blocks. This would require additional
abstractions for transaction management (see #2217).
3. This doesn't work with the `levm` backend yet: it needs to cache the
execution state and persist it between blocks, since we don't store
anything until the end of the batch (see #2218).
4. In #2210, a new CI job is added to run a benchmark comparing `main` and
the `head` branch using `import-in-batch`.
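The account-update optimization mentioned in the first consideration (merging repeated accounts so each one is inserted into the state trie once per batch) could look roughly like the following. It is a minimal sketch, not the implementation tracked in #2216: `AccountUpdate`, its fields, and `merge_account_updates` are hypothetical names chosen for illustration.

```rust
use std::collections::HashMap;

// Hypothetical types for illustration; not ethrex's real definitions.
type Address = [u8; 20];

#[derive(Debug, Clone, PartialEq, Default)]
struct AccountUpdate {
    /// `None` means the field was not touched by the block.
    new_balance: Option<u64>,
    new_nonce: Option<u64>,
    /// Storage slot -> new value.
    storage: HashMap<u64, u64>,
}

/// Folds the per-block updates of a batch into one update per account.
/// Later blocks win, so the result reflects the state after the last block.
fn merge_account_updates(
    per_block: Vec<Vec<(Address, AccountUpdate)>>,
) -> HashMap<Address, AccountUpdate> {
    let mut merged: HashMap<Address, AccountUpdate> = HashMap::new();
    for block_updates in per_block {
        for (address, update) in block_updates {
            let entry = merged.entry(address).or_default();
            if update.new_balance.is_some() {
                entry.new_balance = update.new_balance;
            }
            if update.new_nonce.is_some() {
                entry.new_nonce = update.new_nonce;
            }
            // Later writes to the same slot overwrite earlier ones.
            entry.storage.extend(update.storage);
        }
    }
    merged
}
```

With this shape, an account touched in every block of a batch yields a single merged entry, so the number of trie insertions scales with the number of distinct accounts rather than with the number of blocks.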
Closes None
---------
Co-authored-by: Martin Paulucci <martin.c.paulucci@gmail.com>
Co-authored-by: fmoletta <99273364+fmoletta@users.noreply.github.com>
962c790 to a09a2f1
Collaborator
Closed since the
pedrobergamini pushed a commit to pedrobergamini/ethrex that referenced this pull request on Aug 24, 2025
**Motivation**
Benchmark batch block execution.
**Description**
#2174 introduces batch block import. This PR:
Note: this CI job will only work once #2174 is merged into main.
Closes None