Gate EnableVqueues on partitions with no in-flight data #4786

Merged
tillrohrmann merged 3 commits into restatedev:main from tillrohrmann:fail-if-vqueue-migration-is-needed on May 21, 2026
Conversation

@tillrohrmann
Contributor

Block the EnableVqueues state-machine feature change from being applied
to a partition that holds pre-existing in-flight data. This binary does
not ship the migration that would rewrite that data into vqueue form,
so applying the change would otherwise leave the data stranded on the
legacy code path.

The gate runs deterministically inside OnVersionBarrierCommand::apply
for any feature change that flips a feature from off to on. It probes the
inbox table (which catches inbox invocations and state mutations) and the
invocation status table (which catches non-Completed entries; these
transitively cover held virtual-object locks and scheduled-invocation
timers via the InvocationStatus::Scheduled source of truth). When the
gate trips, the whole barrier fails atomically with the new
Error::MigrationRequired { features } variant. The transaction rolls
back, so no partial state (including min_restate_version) is persisted, and
the partition halts until it is rolled to a server version that supports the
migration.
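
The gate described above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `PartitionState`, its fields, `apply_version_barrier`, the `InFlight` variant, and the in-memory `Vec` tables are hypothetical stand-ins, and the rollback is modelled by staging changes on a clone. Only `Error::MigrationRequired { features }`, `PartitionFeatureChange::EnableVqueues`, `InvocationStatus::Scheduled`, the Completed status, and `min_restate_version` are names taken from the description.

```rust
/// Simplified stand-in for the invocation status table entries.
/// `InFlight` is a hypothetical catch-all for the other non-Completed states.
#[derive(Debug, Clone, PartialEq)]
pub enum InvocationStatus {
    Scheduled,
    InFlight,
    Completed,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PartitionFeatureChange {
    EnableVqueues,
}

#[derive(Debug, PartialEq)]
pub enum Error {
    MigrationRequired { features: Vec<PartitionFeatureChange> },
}

/// Hypothetical in-memory view of the partition state the gate looks at.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct PartitionState {
    pub inbox: Vec<String>,
    pub invocation_status: Vec<InvocationStatus>,
    pub enabled_features: Vec<PartitionFeatureChange>,
    pub min_restate_version: Option<String>,
}

impl PartitionState {
    /// True if the inbox is non-empty or any invocation is not Completed.
    fn has_in_flight_data(&self) -> bool {
        !self.inbox.is_empty()
            || self
                .invocation_status
                .iter()
                .any(|s| *s != InvocationStatus::Completed)
    }
}

/// Applies a barrier: either everything is committed or nothing is.
pub fn apply_version_barrier(
    state: &mut PartitionState,
    min_version: &str,
    changes: &[PartitionFeatureChange],
) -> Result<(), Error> {
    // Stage all writes on a copy; this models the transaction rollback.
    let mut staged = state.clone();
    staged.min_restate_version = Some(min_version.to_string());

    // Only changes that flip a feature from off to on are gated.
    let newly_enabled: Vec<PartitionFeatureChange> = changes
        .iter()
        .copied()
        .filter(|c| !state.enabled_features.contains(c))
        .collect();

    if !newly_enabled.is_empty() && state.has_in_flight_data() {
        // Gate trips: `state` is left untouched, including min_restate_version.
        return Err(Error::MigrationRequired { features: newly_enabled });
    }

    staged.enabled_features.extend(newly_enabled);
    *state = staged;
    Ok(())
}
```

In this sketch, a partition with a single inbox entry rejects `EnableVqueues` and keeps its previous `min_restate_version`, while a partition whose invocations are all Completed accepts the change.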

@tillrohrmann tillrohrmann requested a review from AhmedSoliman May 21, 2026 16:03

@claude claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@tillrohrmann tillrohrmann force-pushed the fail-if-vqueue-migration-is-needed branch from 9bcb561 to 5ceb3fb Compare May 21, 2026 16:07
Contributor

@AhmedSoliman AhmedSoliman left a comment

Pretty cool!

tillrohrmann and others added 3 commits May 21, 2026 23:55
Extend the StateMachineFeatures trait with is_vqueues_enabled (gated on the
persisted feature flag) and move the impl from SemanticRestateVersion to
StateMachine + StateMachineApplyContext so each method can consult both the
min Restate version and the persisted feature set. Replace the in-state-machine
Configuration::pinned().common.experimental.is_vqueues_enabled() call sites
with self.is_vqueues_enabled() / ctx.is_vqueues_enabled().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
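
A rough sketch of the shape this commit describes: one trait, implemented for both the state machine and its apply context, so that each feature check can see the minimum version and the persisted feature set. The struct layouts, the `SemanticVersion` tuple type, the `PersistedFeatures` struct, the accessor methods, and `is_example_version_gated` are assumptions made for illustration; only the names `StateMachineFeatures`, `StateMachine`, `StateMachineApplyContext`, and `is_vqueues_enabled` come from the commit message.

```rust
/// Hypothetical stand-in for a semantic version (major, minor, patch).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct SemanticVersion(pub u64, pub u64, pub u64);

/// Hypothetical persisted feature set of a partition.
#[derive(Debug, Clone, Copy, Default)]
pub struct PersistedFeatures {
    pub vqueues: bool,
}

pub trait StateMachineFeatures {
    // Accessor names are assumptions; implementors expose both inputs.
    fn min_restate_version(&self) -> SemanticVersion;
    fn persisted_features(&self) -> PersistedFeatures;

    /// Gated on the persisted feature flag rather than on live configuration.
    fn is_vqueues_enabled(&self) -> bool {
        self.persisted_features().vqueues
    }

    /// Hypothetical example of a purely version-gated check.
    fn is_example_version_gated(&self) -> bool {
        self.min_restate_version() >= SemanticVersion(1, 2, 0)
    }
}

pub struct StateMachine {
    pub min_restate_version: SemanticVersion,
    pub features: PersistedFeatures,
}

/// Borrowed view handed to command handlers while applying a command.
pub struct StateMachineApplyContext<'a> {
    pub min_restate_version: &'a SemanticVersion,
    pub features: &'a PersistedFeatures,
}

impl StateMachineFeatures for StateMachine {
    fn min_restate_version(&self) -> SemanticVersion {
        self.min_restate_version
    }
    fn persisted_features(&self) -> PersistedFeatures {
        self.features
    }
}

impl StateMachineFeatures for StateMachineApplyContext<'_> {
    fn min_restate_version(&self) -> SemanticVersion {
        *self.min_restate_version
    }
    fn persisted_features(&self) -> PersistedFeatures {
        *self.features
    }
}
```

With this layout, call sites inside the state machine can use `self.is_vqueues_enabled()` or `ctx.is_vqueues_enabled()`, and both answer from replicated state instead of node-local configuration.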

When a partition becomes leader and Configuration::common::experimental
has is_vqueues_enabled() set, propose a VersionBarrierCommand carrying
PartitionFeatureChange::EnableVqueues — but only if the FSM hasn't already
recorded the opt-in. The persisted state update flows through the existing
OnVersionBarrierCommand apply path; become_leader does not touch the FSM
mirror locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
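
The leader-side decision in this commit reduces to a small pure function, sketched below under stated assumptions. `barrier_to_propose_on_leadership` and the single-field layout of `VersionBarrierCommand` are hypothetical; the real command likely carries more data. The point is only that the leader proposes the change and never mutates local state directly.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PartitionFeatureChange {
    EnableVqueues,
}

/// Simplified stand-in; the field layout is an assumption.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct VersionBarrierCommand {
    pub feature_changes: Vec<PartitionFeatureChange>,
}

/// Returns the command a new leader should propose, if any.
///
/// `config_vqueues_enabled` mirrors the experimental configuration flag and
/// `fsm_vqueues_enabled` mirrors what the FSM has already persisted.
pub fn barrier_to_propose_on_leadership(
    config_vqueues_enabled: bool,
    fsm_vqueues_enabled: bool,
) -> Option<VersionBarrierCommand> {
    if config_vqueues_enabled && !fsm_vqueues_enabled {
        // Opt-in requested but not yet recorded: propose it through the log.
        Some(VersionBarrierCommand {
            feature_changes: vec![PartitionFeatureChange::EnableVqueues],
        })
    } else {
        // Either not requested, or already recorded by an earlier barrier.
        None
    }
}
```

Returning `None` when the FSM already records the opt-in keeps repeated leadership changes from appending redundant barriers.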

Block the EnableVqueues state-machine feature change from being applied
to a partition that holds pre-existing in-flight data. This binary does
not ship the migration that would rewrite that data into vqueue form,
so applying the change would otherwise leave the data stranded on the
legacy code path.

The gate runs deterministically inside OnVersionBarrierCommand::apply
for any feature change that flips a feature off->on. It probes the
inbox table (catches inbox invocations and state mutations) and the
invocation status table (catches non-Completed entries, which
transitively cover held virtual-object locks and scheduled-invocation
timers via the InvocationStatus::Scheduled source-of-truth). When the
gate trips, the whole barrier fails atomically with the new
Error::MigrationRequired { features } variant; the transaction rolls
back so no partial state (incl. min_restate_version) is persisted, and
the partition halts until rolled to a server version that supports the
migration.

@tillrohrmann tillrohrmann force-pushed the fail-if-vqueue-migration-is-needed branch from 5ceb3fb to ec29d04 Compare May 21, 2026 22:03
@tillrohrmann tillrohrmann merged commit ec29d04 into restatedev:main May 21, 2026
28 checks passed
@github-actions github-actions Bot locked and limited conversation to collaborators May 21, 2026