Fix deferred task resume failure when worker is older than server#64598
Merged
vatsrahul1001 merged 2 commits intoapache:mainfrom Apr 2, 2026
Merged
Conversation
vatsrahul1001
approved these changes
Apr 1, 2026
eladkal
approved these changes
Apr 1, 2026
d5379c9 to
57b9114
Compare
Workers on task-sdk < 1.2 crash with `KeyError: __var` when resuming deferred tasks whose trigger has fired on a 3.2 server. The server's handle_event_submit re-serializes next_kwargs with SDK serde (plain dicts), but old workers expect BaseSerialization format (__type/__var wrapping). submit_failure and scheduler timeout paths also write plain dicts that old workers cannot parse. Add a Cadwyn response converter on TIRunContext that deserializes SDK serde then re-serializes with BaseSerialization for old API versions. This catches all next_kwargs writers at the single API read point. Short-circuits when data is already in BaseSerialization format.
57b9114 to
858818b
Compare
Lee-W
approved these changes
Apr 2, 2026
2 tasks
Contributor
|
e2e test failing |
1 task
ephraimbuddy
approved these changes
Apr 2, 2026
2 tasks
ashb
approved these changes
Apr 2, 2026
Contributor
|
E2E failure looks unrelated. Failing for others' PR as well. |
Backport successfully created: v3-2-testNote: As of Merging PRs targeted for Airflow 3.X In matter of doubt please ask in #release-management Slack channel.
|
github-actions bot
pushed a commit
to aws-mwaa/upstream-to-airflow
that referenced
this pull request
Apr 2, 2026
…n server (apache#64598) Fix deferred task resume failure when worker is older than server (apache#64598) (cherry picked from commit 891c7fb) Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
vatsrahul1001
pushed a commit
that referenced
this pull request
Apr 2, 2026
vatsrahul1001
pushed a commit
that referenced
this pull request
Apr 8, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
KeyError: __varwhen resuming deferred tasks whose trigger has fired on a 3.2 serverTIRunContextthat wrapsnext_kwargsin BaseSerialization format for old API versionsContext
After #59711 switched trigger kwargs to SDK serde,
handle_event_submitre-serializesnext_kwargsas plain dicts. Old workers only knowBaseSerialization.deserialize(), which requires{"__type": "dict", "__var": {...}}wrapping on all dicts -- so they crash withKeyError: __var.This was flagged during review of #59711 -- the compat shim was added for the deserialize path but not the serialize path.
Approach
Instead of changing
trigger.pyto useBaseSerialization.serialize()(which regresses the DB to old format), the fix adds a response converter in the Execution API versioning layer (ModifyDeferredTaskKwargsToJsonValue). This:__type/__varkeys (rolling upgrade with old data in DB)