Skip to content

[Storehouse] Background storehouse indexing#8255

Merged
j1010001 merged 43 commits intomasterfrom
leo/background-storehouse-indexing
Jan 26, 2026
Merged

[Storehouse] Background storehouse indexing#8255
j1010001 merged 43 commits intomasterfrom
leo/background-storehouse-indexing

Conversation

@zhangchiqing
Copy link
Member

@zhangchiqing zhangchiqing commented Dec 12, 2025

Working towards: #8244

This PR adds a flag --enable-background-storehouse-indexing=<default_false> to enable a background worker to store register updates to storehouse for each executed and finalized blocks. Note, It can not be enabled if --enable-storehouse=true.

Enabled background storehouse indexing on Testnet EN3. It took about three days to fully catch up, processing roughly 56 blocks per second.

Summary by CodeRabbit

  • New Features
    • Optional background indexing of on-disk register data, configurable via new CLI flags and adjustable indexing rate; indexing can be triggered on block execution or finalization.
    • New block-executed notification mechanism allowing components to subscribe to execution events.
  • Tests
    • Comprehensive lifecycle and end-to-end tests covering bootstrap, optional checkpoint import, engine startup, and indexing progress.
  • Refactor
    • Improved initialization and shutdown wiring for register storage, notifier wiring, and background indexing for safer startup/shutdown.

✏️ Tip: You can customize this high-level summary in your review settings.

@github-actions
Copy link
Contributor

github-actions bot commented Dec 12, 2025

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@zhangchiqing zhangchiqing marked this pull request as ready for review December 12, 2025 17:03
@zhangchiqing zhangchiqing requested a review from a team as a code owner December 12, 2025 17:03
Copy link
Member

@fxamacker fxamacker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments and suggestions (e.g., close pebbledb in the error paths if it was opened earlier).

// and executed block, starting from the last indexed height up to the latest finalized and
// executed height.
func (b *BackgroundIndexer) IndexUpToLatestFinalizedAndExecutedHeight(ctx context.Context) error {
startHeight := b.registerStore.LastFinalizedAndExecutedHeight()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
startHeight := b.registerStore.LastFinalizedAndExecutedHeight()
lastIndexedHeight := b.registerStore.LastFinalizedAndExecutedHeight()

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@engine/execution/storehouse/background_indexer_factory.go`:
- Around line 151-154: The log message in background_indexer_factory.go uses the
camelCase flag name "--enableBackgroundStorehouseIndexing" which is inconsistent
with the CLI's kebab-case convention; update the message emitted where
enableBackgroundStorehouseIndexing is checked (the conditional that logs
"background indexer engine disabled...") to reference the kebab-case flag
"--enable-background-storehouse-indexing" so logs match the actual CLI flag
naming.
- Around line 144-147: The comment claims blockExecutedNotifier is optional but
the code passes it directly into NewBackgroundIndexerEngine which calls
newFinalizedAndExecutedNotifier(blockExecutedNotifier, followerDistributor) and
can panic on nil; either remove "optional" or add a nil check before calling
NewBackgroundIndexerEngine: validate that blockExecutedNotifier is non-nil (or
replace it with a no-op/default notifier) and return an explicit error if
missing; reference the parameter name blockExecutedNotifier and the constructors
NewBackgroundIndexerEngine and newFinalizedAndExecutedNotifier when locating
where to add the check or defaulting logic.
♻️ Duplicate comments (2)
cmd/execution_config.go (1)

185-188: Avoid silently overriding mutually-exclusive flags.

If a user explicitly sets both flags, silently flipping one off hides misconfiguration and contradicts the CLI contract. Consider returning a validation error (or at least surfacing a warning) when both are set.

🔧 Suggested validation change
-	// Explicitly turn off background storehouse indexing when storehouse is enabled
-	if exeConf.enableStorehouse {
-		exeConf.enableBackgroundStorehouseIndexing = false
-	}
+	if exeConf.enableStorehouse && exeConf.enableBackgroundStorehouseIndexing {
+		return fmt.Errorf("--enable-background-storehouse-indexing cannot be used with --enable-storehouse")
+	}
engine/execution/storehouse/background_indexer.go (1)

68-72: Guard against zero-duration throttling at high rates.

For very large heightsPerSecond (e.g., > 1e9), integer division yields a zero duration, effectively disabling throttling. Consider clamping or handling the zero-duration case explicitly.

🔧 Possible clamp to avoid zero-duration sleep
 	if b.heightsPerSecond > 0 {
 		sleepDuration = time.Second / time.Duration(b.heightsPerSecond)
+		if sleepDuration == 0 {
+			sleepDuration = time.Nanosecond
+		}
 	}
🧹 Nitpick comments (1)
engine/execution/storehouse/background_indexer_factory.go (1)

169-204: Unused context parameter in bootstrapper closure.

The ctx context.Context parameter is captured but never used within the bootstrapper. Since LoadRegisterStore can perform long-running checkpoint imports, consider propagating the context for cancellation support, or document why cancellation isn't supported during bootstrap.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0f41ed9 and b78b4ae.

📒 Files selected for processing (6)
  • cmd/execution_builder.go
  • cmd/execution_config.go
  • engine/execution/storehouse/background_indexer.go
  • engine/execution/storehouse/background_indexer_factory.go
  • engine/execution/storehouse/background_indexer_factory_test.go
  • engine/execution/storehouse/background_indexer_provider.go
🧰 Additional context used
📓 Path-based instructions (5)
**/*.go

📄 CodeRabbit inference engine (.cursor/rules/coding_conventions.mdc)

Follow Go coding conventions as documented in @docs/agents/CodingConventions.md

Follow Go coding standards and conventions as documented in @docs/agents/GoDocs.md

**/*.go: Follow the existing module structure in /module/, /engine/, /model/ and use dependency injection patterns for component composition
Implement proper interfaces before concrete types
Follow Go naming conventions and the project's coding style defined in /docs/CodingConventions.md
Use mock generators: run make generate-mocks after interface changes
All inputs must be considered potentially byzantine; error classification is context-dependent and no code path is safe unless explicitly proven and documented
Use comprehensive error wrapping for debugging; avoid fmt.Errorf, use irrecoverable package for exceptions
NEVER log and continue on best effort basis; ALWAYS explicitly handle errors
Uses golangci-lint with custom configurations (.golangci.yml) and custom linters for Flow-specific conventions (struct write checking)

Files:

  • cmd/execution_config.go
  • engine/execution/storehouse/background_indexer_provider.go
  • engine/execution/storehouse/background_indexer.go
  • engine/execution/storehouse/background_indexer_factory.go
  • engine/execution/storehouse/background_indexer_factory_test.go
  • cmd/execution_builder.go
{module,engine,cmd}/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

All major processing components must implement the Component interface from /module/component/component.go to ensure consistent lifecycle management and graceful shutdown patterns

Files:

  • cmd/execution_config.go
  • engine/execution/storehouse/background_indexer_provider.go
  • engine/execution/storehouse/background_indexer.go
  • engine/execution/storehouse/background_indexer_factory.go
  • engine/execution/storehouse/background_indexer_factory_test.go
  • cmd/execution_builder.go
{crypto,fvm,ledger,access,engine}/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

Cryptographic operations require careful handling; refer to crypto library documentation for proper implementation

Files:

  • engine/execution/storehouse/background_indexer_provider.go
  • engine/execution/storehouse/background_indexer.go
  • engine/execution/storehouse/background_indexer_factory.go
  • engine/execution/storehouse/background_indexer_factory_test.go
{network,engine,consensus}/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

Network messages must be authenticated and validated

Files:

  • engine/execution/storehouse/background_indexer_provider.go
  • engine/execution/storehouse/background_indexer.go
  • engine/execution/storehouse/background_indexer_factory.go
  • engine/execution/storehouse/background_indexer_factory_test.go
**/*_test.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*_test.go: Unit tests should be co-located with the code they test
Follow the existing pattern of *_test.go files for test naming
Use fixtures for realistic test data as defined in /utils/unittest/

Files:

  • engine/execution/storehouse/background_indexer_factory_test.go
🧠 Learnings (3)
📚 Learning: 2025-12-23T00:28:41.005Z
Learnt from: CR
Repo: onflow/flow-go PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-23T00:28:41.005Z
Learning: Applies to **/*_test.go : Use fixtures for realistic test data as defined in `/utils/unittest/`

Applied to files:

  • engine/execution/storehouse/background_indexer_factory_test.go
📚 Learning: 2025-12-23T00:28:41.005Z
Learnt from: CR
Repo: onflow/flow-go PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-23T00:28:41.005Z
Learning: Applies to integration/tests/**/*.go : Integration tests should go in `/integration/tests/`

Applied to files:

  • engine/execution/storehouse/background_indexer_factory_test.go
📚 Learning: 2025-12-23T00:28:41.005Z
Learnt from: CR
Repo: onflow/flow-go PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-23T00:28:41.005Z
Learning: Applies to {storage,ledger,execution,fvm}/**/*.go : State consistency is paramount; use proper synchronization primitives

Applied to files:

  • cmd/execution_builder.go
🧬 Code graph analysis (5)
cmd/execution_config.go (1)
engine/execution/storehouse/background_indexer.go (1)
  • DefaultHeightsPerSecond (21-21)
engine/execution/storehouse/background_indexer_provider.go (4)
engine/execution/storehouse/background_indexer.go (1)
  • RegisterUpdatesProvider (17-19)
model/flow/ledger.go (1)
  • RegisterEntries (220-220)
storage/errors.go (1)
  • ErrNotFound (17-17)
storage/store/payloads.go (1)
  • Payloads (13-20)
engine/execution/storehouse/background_indexer.go (1)
model/flow/ledger.go (1)
  • RegisterEntries (220-220)
engine/execution/storehouse/background_indexer_factory.go (5)
consensus/hotstuff/notifications/pubsub/distributor.go (2)
  • Distributor (10-14)
  • FollowerDistributor (36-39)
model/bootstrap/filenames.go (1)
  • FilenameWALRootCheckpoint (31-31)
engine/execution/storehouse/background_indexer_engine.go (2)
  • BackgroundIndexerEngine (21-30)
  • NewBackgroundIndexerEngine (52-80)
engine/execution/storehouse/background_indexer.go (2)
  • BackgroundIndexer (25-32)
  • NewBackgroundIndexer (34-50)
engine/execution/storehouse/background_indexer_provider.go (1)
  • NewExecutionDataRegisterUpdatesProvider (22-30)
engine/execution/storehouse/background_indexer_factory_test.go (1)
engine/execution/storehouse/background_indexer_factory.go (3)
  • ImportRegistersFromCheckpoint (37-37)
  • LoadBackgroundIndexerEngine (130-215)
  • LoadRegisterStore (41-127)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (37)
  • GitHub Check: Lint (./)
  • GitHub Check: Lint (./integration/)
  • GitHub Check: Lint (./insecure/)
  • GitHub Check: Integration Tests Others (integration)
  • GitHub Check: Unit Tests (admin)
  • GitHub Check: Unit Tests (network/alsp)
  • GitHub Check: Unit Tests (network/test/cohort2)
  • GitHub Check: Unit Tests (module)
  • GitHub Check: Unit Tests (network/p2p/node)
  • GitHub Check: Unit Tests (others)
  • GitHub Check: Unit Tests (network/p2p/scoring)
  • GitHub Check: Unit Tests (engine/execution/computation)
  • GitHub Check: Unit Tests (engine/execution)
  • GitHub Check: Unit Tests (network)
  • GitHub Check: Unit Tests (engine/consensus)
  • GitHub Check: Unit Tests (network/test/cohort1)
  • GitHub Check: Unit Tests (engine/verification)
  • GitHub Check: Unit Tests (ledger)
  • GitHub Check: Unit Tests (consensus)
  • GitHub Check: Unit Tests (network/p2p/connection)
  • GitHub Check: Unit Tests (engine/collection)
  • GitHub Check: Unit Tests (network/p2p)
  • GitHub Check: Unit Tests (module/dkg)
  • GitHub Check: Unit Tests (storage)
  • GitHub Check: Unit Tests (engine/execution/ingestion)
  • GitHub Check: Unit Tests (engine)
  • GitHub Check: Unit Tests (engine/access)
  • GitHub Check: Unit Tests (state)
  • GitHub Check: Unit Tests (cmd)
  • GitHub Check: Unit Tests (engine/common)
  • GitHub Check: Unit Tests (fvm)
  • GitHub Check: Unit Tests (utils)
  • GitHub Check: Unit Tests Insecure (insecure)
  • GitHub Check: Unit Tests Insecure (insecure/integration/functional/test/gossipsub/scoring)
  • GitHub Check: Unit Tests Insecure (insecure/integration/functional/test/gossipsub/rpc_inspector)
  • GitHub Check: Docker Build
  • GitHub Check: analyze-code (go)
🔇 Additional comments (21)
cmd/execution_config.go (3)

20-20: No review note for this import change.


73-78: LGTM — config fields are clear and scoped.


150-151: Flag wiring looks good.

cmd/execution_builder.go (7)

175-182: Nice addition of notifier/indexer handles for lifecycle wiring.


222-223: Module registration order looks good.


271-274: Conditional component load matches flag gating.


373-383: Notifier creation is appropriately gated by the background indexing flag.


883-913: Register store factory integration and shutdown wiring look solid.


1102-1123: Block-executed callback wiring into ingestion is clean.


1373-1407: Background indexer engine factory wiring looks correct.

engine/execution/storehouse/background_indexer_provider.go (2)

15-30: Constructor wiring is straightforward.


32-84: Update aggregation and error handling look good.

engine/execution/storehouse/background_indexer.go (1)

16-49: API surface and constructor wiring are clean.

engine/execution/storehouse/background_indexer_factory_test.go (4)

38-80: Good coverage for the disabled-engine path.


82-124: Background-indexing-disabled test looks good.


126-303: Bootstrap path test is thorough and well-signaled.


305-593: Indexing flow test provides solid end-to-end coverage.

engine/execution/storehouse/background_indexer_factory.go (4)

28-32: LGTM!

Clean, minimal interface that follows the Interface Segregation Principle. Good documentation.


34-37: LGTM!

Good use of function type for dependency injection to avoid circular dependencies.


39-127: LGTM! Error handling is thorough.

All error paths after the Pebble DB is opened properly close the database using multierror.Append. The previous concern about resource leaks has been addressed.


217-229: LGTM!

Clean implementation of io.Closer with proper error wrapping and interface compliance assertion.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +144 to +147
blockExecutedNotifier BlockExecutedNotifier, // optional: notifier for block executed events
followerDistributor *pubsub.FollowerDistributor,
heightsPerSecond uint64, // rate limit for indexing heights per second
) (*BackgroundIndexerEngine, bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Potential nil pointer dereference for "optional" parameter.

The comment marks blockExecutedNotifier as optional, but there's no nil check before passing it to NewBackgroundIndexerEngine. Looking at the engine constructor, it calls newFinalizedAndExecutedNotifier(blockExecutedNotifier, followerDistributor) which could panic if blockExecutedNotifier is nil.

Either remove the "optional" comment if it's actually required, or add a nil check similar to lines 159-164.

Suggested fix if parameter is required
-	blockExecutedNotifier BlockExecutedNotifier, // optional: notifier for block executed events
+	blockExecutedNotifier BlockExecutedNotifier,

Or if it's truly optional, add validation:

 	if resultsReader == nil {
 		return nil, false, fmt.Errorf("execution results reader is not initialized")
 	}
+	if blockExecutedNotifier == nil {
+		return nil, false, fmt.Errorf("block executed notifier is not initialized")
+	}
+	if followerDistributor == nil {
+		return nil, false, fmt.Errorf("follower distributor is not initialized")
+	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
blockExecutedNotifier BlockExecutedNotifier, // optional: notifier for block executed events
followerDistributor *pubsub.FollowerDistributor,
heightsPerSecond uint64, // rate limit for indexing heights per second
) (*BackgroundIndexerEngine, bool, error) {
blockExecutedNotifier BlockExecutedNotifier,
followerDistributor *pubsub.FollowerDistributor,
heightsPerSecond uint64, // rate limit for indexing heights per second
) (*BackgroundIndexerEngine, bool, error) {
🤖 Prompt for AI Agents
In `@engine/execution/storehouse/background_indexer_factory.go` around lines 144 -
147, The comment claims blockExecutedNotifier is optional but the code passes it
directly into NewBackgroundIndexerEngine which calls
newFinalizedAndExecutedNotifier(blockExecutedNotifier, followerDistributor) and
can panic on nil; either remove "optional" or add a nil check before calling
NewBackgroundIndexerEngine: validate that blockExecutedNotifier is non-nil (or
replace it with a no-op/default notifier) and return an explicit error if
missing; reference the parameter name blockExecutedNotifier and the constructors
NewBackgroundIndexerEngine and newFinalizedAndExecutedNotifier when locating
where to add the check or defaulting logic.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@engine/execution/storehouse/background_indexer_factory_test.go`:
- Around line 509-515: The test mutates mock expectations at runtime by clearing
finalSnapshot.ExpectedCalls and re-registering finalSnapshot.On("Head") after
engine.Start, which can race with concurrent calls to finalSnapshot.Head();
instead, set the finalSnapshot expectation (call
finalSnapshot.On("Head").Return(finalizedBlock.ToHeader(), nil)) before invoking
engine.Start (or stop the engine before mutating expectations) so no
expectations are changed while the engine/background indexer may call Head;
update the test to remove the runtime mutation of finalSnapshot.ExpectedCalls
and ensure finalSnapshot's Head expectation is configured prior to starting the
engine/background indexer.
🧹 Nitpick comments (1)
engine/execution/storehouse/background_indexer_factory_test.go (1)

499-508: Consider replacing fixed sleeps with eventual assertions/signals.
time.Sleep‑based waits can be flaky under load; an eventual condition tied to the metrics channel or engine progress is more reliable.

Also applies to: 527-535

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b78b4ae and ada2fd4.

📒 Files selected for processing (1)
  • engine/execution/storehouse/background_indexer_factory_test.go
🧰 Additional context used
📓 Path-based instructions (5)
**/*.go

📄 CodeRabbit inference engine (.cursor/rules/coding_conventions.mdc)

Follow Go coding conventions as documented in @docs/agents/CodingConventions.md

Follow Go coding standards and conventions as documented in @docs/agents/GoDocs.md

**/*.go: Follow the existing module structure in /module/, /engine/, /model/ and use dependency injection patterns for component composition
Implement proper interfaces before concrete types
Follow Go naming conventions and the project's coding style defined in /docs/CodingConventions.md
Use mock generators: run make generate-mocks after interface changes
All inputs must be considered potentially byzantine; error classification is context-dependent and no code path is safe unless explicitly proven and documented
Use comprehensive error wrapping for debugging; avoid fmt.Errorf, use irrecoverable package for exceptions
NEVER log and continue on best effort basis; ALWAYS explicitly handle errors
Uses golangci-lint with custom configurations (.golangci.yml) and custom linters for Flow-specific conventions (struct write checking)

Files:

  • engine/execution/storehouse/background_indexer_factory_test.go
**/*_test.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*_test.go: Unit tests should be co-located with the code they test
Follow the existing pattern of *_test.go files for test naming
Use fixtures for realistic test data as defined in /utils/unittest/

Files:

  • engine/execution/storehouse/background_indexer_factory_test.go
{module,engine,cmd}/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

All major processing components must implement the Component interface from /module/component/component.go to ensure consistent lifecycle management and graceful shutdown patterns

Files:

  • engine/execution/storehouse/background_indexer_factory_test.go
{crypto,fvm,ledger,access,engine}/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

Cryptographic operations require careful handling; refer to crypto library documentation for proper implementation

Files:

  • engine/execution/storehouse/background_indexer_factory_test.go
{network,engine,consensus}/**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

Network messages must be authenticated and validated

Files:

  • engine/execution/storehouse/background_indexer_factory_test.go
🧬 Code graph analysis (1)
engine/execution/storehouse/background_indexer_factory_test.go (1)
engine/execution/storehouse/background_indexer_factory.go (3)
  • ImportRegistersFromCheckpoint (37-37)
  • LoadBackgroundIndexerEngine (130-215)
  • LoadRegisterStore (41-127)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (37)
  • GitHub Check: Lint (./integration/)
  • GitHub Check: Lint (./)
  • GitHub Check: Lint (./insecure/)
  • GitHub Check: Unit Tests Insecure (insecure/integration/functional/test/gossipsub/rpc_inspector)
  • GitHub Check: Unit Tests Insecure (insecure/integration/functional/test/gossipsub/scoring)
  • GitHub Check: Unit Tests Insecure (insecure)
  • GitHub Check: Integration Tests Others (integration)
  • GitHub Check: Unit Tests (network/test/cohort1)
  • GitHub Check: Unit Tests (network/p2p/scoring)
  • GitHub Check: Unit Tests (network/test/cohort2)
  • GitHub Check: Unit Tests (engine/execution/ingestion)
  • GitHub Check: Unit Tests (others)
  • GitHub Check: Unit Tests (network/p2p/connection)
  • GitHub Check: Unit Tests (state)
  • GitHub Check: Unit Tests (network/p2p)
  • GitHub Check: Unit Tests (module)
  • GitHub Check: Unit Tests (network)
  • GitHub Check: Unit Tests (engine/common)
  • GitHub Check: Unit Tests (network/p2p/node)
  • GitHub Check: Unit Tests (engine)
  • GitHub Check: Unit Tests (module/dkg)
  • GitHub Check: Unit Tests (consensus)
  • GitHub Check: Unit Tests (admin)
  • GitHub Check: Unit Tests (network/alsp)
  • GitHub Check: Unit Tests (engine/verification)
  • GitHub Check: Unit Tests (utils)
  • GitHub Check: Unit Tests (engine/execution/computation)
  • GitHub Check: Unit Tests (engine/access)
  • GitHub Check: Unit Tests (fvm)
  • GitHub Check: Unit Tests (engine/execution)
  • GitHub Check: Unit Tests (engine/consensus)
  • GitHub Check: Unit Tests (engine/collection)
  • GitHub Check: Unit Tests (storage)
  • GitHub Check: Unit Tests (ledger)
  • GitHub Check: Unit Tests (cmd)
  • GitHub Check: Docker Build
  • GitHub Check: analyze-code (go)
🔇 Additional comments (3)
engine/execution/storehouse/background_indexer_factory_test.go (3)

38-82: Looks good for the enabled-path setup.
Covers the “enabled + required deps present” branch cleanly.


84-126: Good coverage of the disabled flag behavior.
The nil execution data store is correctly tolerated when the flag is off.


128-305: Solid bootstrap-path coverage.
The checkpoint import + metrics signal path is well exercised.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +509 to +515
// Update finalized snapshot to return the last finalized block AFTER finalizing
// This allows the background indexer to know which blocks are finalized
// Remove the previous mock and set a new one that returns height 105
finalizedBlock := blocks[numBlocks-1]
finalSnapshot.ExpectedCalls = nil // Clear previous expectations
finalSnapshot.On("Head").Return(finalizedBlock.ToHeader(), nil)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid mutating mock expectations while the engine is running.
finalSnapshot.ExpectedCalls = nil and re‑On after engine.Start can race with concurrent calls to Head() and trigger data races/flaky tests. Set the expectations before starting the engine (or stop the engine before mutation).

🔧 Safer expectation setup (before engine.Start)
-// Initially set finalized snapshot to return startHeight (no blocks finalized yet)
-// This prevents the initial notification from trying to process blocks
-initialFinalHeader := unittest.BlockHeaderFixture()
-initialFinalHeader.Height = startHeight
-finalSnapshot.On("Head").Return(initialFinalHeader, nil)
+// Initially return startHeight, then allow subsequent calls to return the final head.
+initialFinalHeader := unittest.BlockHeaderFixture()
+initialFinalHeader.Height = startHeight
+finalSnapshot.On("Head").Return(initialFinalHeader, nil).Once()
+finalizedBlock := blocks[numBlocks-1]
+finalSnapshot.On("Head").Return(finalizedBlock.ToHeader(), nil).Maybe()
-// Update finalized snapshot to return the last finalized block AFTER finalizing
-// This allows the background indexer to know which blocks are finalized
-// Remove the previous mock and set a new one that returns height 105
-finalizedBlock := blocks[numBlocks-1]
-finalSnapshot.ExpectedCalls = nil // Clear previous expectations
-finalSnapshot.On("Head").Return(finalizedBlock.ToHeader(), nil)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Update finalized snapshot to return the last finalized block AFTER finalizing
// This allows the background indexer to know which blocks are finalized
// Remove the previous mock and set a new one that returns height 105
finalizedBlock := blocks[numBlocks-1]
finalSnapshot.ExpectedCalls = nil // Clear previous expectations
finalSnapshot.On("Head").Return(finalizedBlock.ToHeader(), nil)
🤖 Prompt for AI Agents
In `@engine/execution/storehouse/background_indexer_factory_test.go` around lines
509 - 515, The test mutates mock expectations at runtime by clearing
finalSnapshot.ExpectedCalls and re-registering finalSnapshot.On("Head") after
engine.Start, which can race with concurrent calls to finalSnapshot.Head();
instead, set the finalSnapshot expectation (call
finalSnapshot.On("Head").Return(finalizedBlock.ToHeader(), nil)) before invoking
engine.Start (or stop the engine before mutating expectations) so no
expectations are changed while the engine/background indexer may call Head;
update the test to remove the runtime mutation of finalSnapshot.ExpectedCalls
and ensure finalSnapshot's Head expectation is configured prior to starting the
engine/background indexer.

Copy link
Member

@fxamacker fxamacker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 👍 I left a minor comment.

Component("grpc server", exeNode.LoadGrpcServer)

// Only load background indexer engine when both flags indicate it should be enabled
if !exeNode.exeConf.enableStorehouse && exeNode.exeConf.enableBackgroundStorehouseIndexing {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, we can just check enableBackgroundStorehouseIndexing flag since ValidateFlags() turns off enableBackgroundStorehouseIndexing if enableStorehouse is true.

Suggested change
if !exeNode.exeConf.enableStorehouse && exeNode.exeConf.enableBackgroundStorehouseIndexing {
if exeNode.exeConf.enableBackgroundStorehouseIndexing {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I see your point.

This is to ensure that as long as storehouse is enabled, the background index should not be turned on.

I kept both to be a bit defensive, because It's ok if we forgot to turn on the background indexing (it can be turned on later), but it's not ok if we accidentally turned on the background indexing when it shouldn't (when storehouse is also running, it creates complex race condition).

Comment on lines +69 to +117
var sleepDuration time.Duration
if b.heightsPerSecond > 0 {
sleepDuration = time.Second / time.Duration(b.heightsPerSecond)
}

// Loop through each unindexed finalized height, fetch register updates and store them
for h := lastIndexedHeight + 1; h <= latestFinalized.Height; h++ {
// Check context cancellation before processing each height
select {
case <-ctx.Done():
return ctx.Err()
default:
}

header, err := b.headers.ByHeight(h)
if err != nil {
return fmt.Errorf("failed to get header for height %d: %w", h, err)
}

// Get register entries for this height
registerEntries, executed, err := b.provider.RegisterUpdatesByBlockID(ctx, header.ID())
if err != nil {
return fmt.Errorf("failed to get register entries for height %d: %w", h, err)
}

if !executed {
// if the finalized block has not been executed, then we finish indexing,
// as we have finished indexing all executed blocks up to this point.
// in happy case, all finalized blocks should have been executed.
// this might happen when the execution node is catching up or during HCU.
return nil
}

// Store registers directly to disk store
err = b.registerStore.SaveRegisters(header, registerEntries)
if err != nil {
return fmt.Errorf("failed to store registers for height %d: %w", h, err)
}

// Throttle indexing rate if configured
if b.heightsPerSecond > 0 && h < latestFinalized.Height {
select {
case <-ctx.Done():
return ctx.Err()
case <-time.After(sleepDuration):
// Continue to next iteration
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: I feel like with timed loops it makes more sense to try to start the task in its allotted time slot instead of having a predefined pause between them

Suggested change
var sleepDuration time.Duration
if b.heightsPerSecond > 0 {
sleepDuration = time.Second / time.Duration(b.heightsPerSecond)
}
// Loop through each unindexed finalized height, fetch register updates and store them
for h := lastIndexedHeight + 1; h <= latestFinalized.Height; h++ {
// Check context cancellation before processing each height
select {
case <-ctx.Done():
return ctx.Err()
default:
}
header, err := b.headers.ByHeight(h)
if err != nil {
return fmt.Errorf("failed to get header for height %d: %w", h, err)
}
// Get register entries for this height
registerEntries, executed, err := b.provider.RegisterUpdatesByBlockID(ctx, header.ID())
if err != nil {
return fmt.Errorf("failed to get register entries for height %d: %w", h, err)
}
if !executed {
// if the finalized block has not been executed, then we finish indexing,
// as we have finished indexing all executed blocks up to this point.
// in happy case, all finalized blocks should have been executed.
// this might happen when the execution node is catching up or during HCU.
return nil
}
// Store registers directly to disk store
err = b.registerStore.SaveRegisters(header, registerEntries)
if err != nil {
return fmt.Errorf("failed to store registers for height %d: %w", h, err)
}
// Throttle indexing rate if configured
if b.heightsPerSecond > 0 && h < latestFinalized.Height {
select {
case <-ctx.Done():
return ctx.Err()
case <-time.After(sleepDuration):
// Continue to next iteration
}
}
}
var sleepDuration time.Duration
if b.heightsPerSecond > 0 {
sleepDuration = time.Second / time.Duration(b.heightsPerSecond)
}
ticker := time.NewTicker(sleepDuration)
defer ticker.Stop()
// Loop through each unindexed finalized height, fetch register updates and store them
for h := lastIndexedHeight + 1; h <= latestFinalized.Height; h++ {
// Check context cancellation before processing each height
select {
case <-ctx.Done():
return ctx.Err()
default:
}
header, err := b.headers.ByHeight(h)
if err != nil {
return fmt.Errorf("failed to get header for height %d: %w", h, err)
}
// Get register entries for this height
registerEntries, executed, err := b.provider.RegisterUpdatesByBlockID(ctx, header.ID())
if err != nil {
return fmt.Errorf("failed to get register entries for height %d: %w", h, err)
}
if !executed {
// if the finalized block has not been executed, then we finish indexing,
// as we have finished indexing all executed blocks up to this point.
// in happy case, all finalized blocks should have been executed.
// this might happen when the execution node is catching up or during HCU.
return nil
}
// Store registers directly to disk store
err = b.registerStore.SaveRegisters(header, registerEntries)
if err != nil {
return fmt.Errorf("failed to store registers for height %d: %w", h, err)
}
// Throttle indexing rate if configured
if b.heightsPerSecond > 0 && h < latestFinalized.Height {
select {
case <-ctx.Done():
return ctx.Err()
case <-ticker.C:
// Continue to next iteration
}
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I looks at the usage, it kind of feels like that the ticker should be part of the BackgroundIndexer and not recreated every time because IndexUpToLatestFinalizedAndExecutedHeight is called every time a block is executed or finalized.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference between ticker and sleep is that, sleep would force to create a pause for the duration. But ticker might not take a pause at all if the time took to finish one round is longer than the tick duration.

So the difference is when the node is trying to catch up: with ticker, the background indexer will probably index faster. However, that might cause more machine resource being used by background indexer, which I'm trying to avoid.

return
}
// ensure the register store database is closed when the worker loop exits
defer closer.Close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlogged error

return fmt.Errorf("failed to get latest finalized height: %w", err)
}

b.log.Info().
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this log will be frequent. Maybe it could be a debug log.


registerStore, err := NewRegisterStore(
diskStore,
nil, // TODO: replace with real WAL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a future TODO? do we not need the real WAL in this iteration?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, this is a future TODO for like storehouse phase 4. But thanks for checking.


for _, chunk := range data.ChunkExecutionDatas {
// Collect register updates from this chunk
if chunk.TrieUpdate != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion invert condition to improve readability (if !condition {continue})

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@engine/execution/storehouse/background_indexer_engine.go`:
- Around line 107-109: Finish the incomplete comment on line 108 to clearly
describe the behavior: replace "the background indexer is" with a concise phrase
such as "the background indexer is triggered to index new blocks" so it
documents that receiving from b.newBlockExecutedOrFinalized.Channel() causes
backgroundIndexer.IndexUpToLatestFinalizedAndExecutedHeight(ctx) to run; ensure
the comment mentions the trigger (b.newBlockExecutedOrFinalized.Channel()) and
the action (backgroundIndexer.IndexUpToLatestFinalizedAndExecutedHeight(ctx)).
- Around line 94-99: The deferred close handler must not call ctx.Throw because
the goroutine is exiting; replace the ctx.Throw call inside the defer with a
non-panicking log of the error — e.g., call an existing logger (ctx.Logger.Error
or similar) or fallback to log.Printf to record fmt.Errorf("failed to close
register store database: %w", err) — in the defer that wraps closer.Close() (the
anonymous defer function referencing closer.Close and ctx.Throw).
♻️ Duplicate comments (1)
engine/execution/storehouse/background_indexer_factory.go (1)

144-147: Add nil check for blockExecutedNotifier or remove "optional" comment.

The comment marks blockExecutedNotifier as optional, but looking at NewBackgroundIndexerEngine (lines 207-212), it's passed directly to newFinalizedAndExecutedNotifier which will call AddConsumer on it. This will panic if nil. The validation at lines 159-164 checks other parameters but not this one.

Suggested fix: Add validation
 	if resultsReader == nil {
 		return nil, false, fmt.Errorf("execution results reader is not initialized")
 	}
+	if blockExecutedNotifier == nil {
+		return nil, false, fmt.Errorf("block executed notifier is not initialized")
+	}
+	if followerDistributor == nil {
+		return nil, false, fmt.Errorf("follower distributor is not initialized")
+	}
🧹 Nitpick comments (1)
engine/execution/storehouse/background_indexer_provider.go (1)

71-86: Consider pre-allocating with exact capacity after map population.

The slice is pre-allocated with len(registerUpdates) capacity, which is correct. However, iteration order over a map is non-deterministic. If deterministic ordering matters for downstream consumers (e.g., for testing or debugging), consider sorting the entries.

Comment on lines +107 to +109
case <-b.newBlockExecutedOrFinalized.Channel():
// the background indexer is
err := backgroundIndexer.IndexUpToLatestFinalizedAndExecutedHeight(ctx)
Copy link
Contributor

@coderabbitai coderabbitai bot Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Incomplete comment on line 108.

The comment "the background indexer is" appears to be unfinished.

Suggested fix
 		case <-b.newBlockExecutedOrFinalized.Channel():
-			// the background indexer is
+			// Trigger indexing up to the latest finalized and executed height
 			err := backgroundIndexer.IndexUpToLatestFinalizedAndExecutedHeight(ctx)
🤖 Prompt for AI Agents
In `@engine/execution/storehouse/background_indexer_engine.go` around lines 107 -
109, Finish the incomplete comment on line 108 to clearly describe the behavior:
replace "the background indexer is" with a concise phrase such as "the
background indexer is triggered to index new blocks" so it documents that
receiving from b.newBlockExecutedOrFinalized.Channel() causes
backgroundIndexer.IndexUpToLatestFinalizedAndExecutedHeight(ctx) to run; ensure
the comment mentions the trigger (b.newBlockExecutedOrFinalized.Channel()) and
the action (backgroundIndexer.IndexUpToLatestFinalizedAndExecutedHeight(ctx)).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In @engine/execution/storehouse/background_indexer_engine.go around lines 107 -
109, Finish the incomplete comment on line 108 to clearly describe the behavior:
replace "the background indexer is" with a concise phrase such as "the
background indexer is triggered to index new blocks" so it documents that
receiving from b.newBlockExecutedOrFinalized.Channel() causes
backgroundIndexer.IndexUpToLatestFinalizedAndExecutedHeight(ctx) to run; ensure
the comment mentions the trigger (b.newBlockExecutedOrFinalized.Channel()) and
the action (backgroundIndexer.IndexUpToLatestFinalizedAndExecutedHeight(ctx)).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zhangchiqing I'll help you complete that comment.

Task: Complete incomplete comment on line 108

  • Update the comment to clearly describe the trigger and action when receiving from the channel

💡 For code changes across multiple files, use @coderabbitai plan in an issue comment

@j1010001 j1010001 added this pull request to the merge queue Jan 26, 2026
Merged via the queue into master with commit f756892 Jan 26, 2026
61 checks passed
@j1010001 j1010001 deleted the leo/background-storehouse-indexing branch January 26, 2026 16:22
@coderabbitai coderabbitai bot mentioned this pull request Jan 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants