
Move Serialization/Deserialization (serde) to task SDK#58992

Merged
amoghrajesh merged 27 commits into apache:main from astronomer:move-serde-to-task-sdk
Dec 17, 2025

Conversation

@amoghrajesh
Contributor

closes: #58887
closes: #58885

Motivation

This change is part of the broader effort to achieve client-server separation. The serialization and deserialization utilities (serde) are execution time utilities used primarily during task execution for:

  • Serializing/deserializing XComs when communicating with the API server
  • Processing deadline alerts

Since these utilities are only needed at execution time (in workers, DAG processors, and triggerers), they belong in the task-sdk rather than airflow-core. This allows the core server components (Scheduler, API Server) to operate without requiring the task SDK, and it means the task SDK no longer has to import serde from airflow-core, enabling independent deployments and upgrades.

Blockers and Solutions

Several blockers were identified and resolved:

  1. Migration File: The migration 0092_3_2_0_replace_deadline_inline_callback_with_fkey.py uses serde.deserialize()
    Effort tracked by @ramitkataria in Do not use serde library for database migrations

  2. XCom API Stringification: The core API's xcom endpoint (/api/v2/dags/{dag_id}/dagRuns/{run_id}/taskInstances/{task_id}/xcomEntries) used deserialize(full=False) to stringify XCom values for UI display.
    The solution was to create a dedicated stringify.py module in airflow-core that provides stringification without any SDK dependencies. This module matches the behavior of deserialize(full=False) exactly but is self-contained and independent of any external libraries.

  3. XComEncoder/XComDecoder: These JSON encoder/decoder classes in airflow-core directly imported serde for full serialization/deserialization of XCom values.
    The fix here was to implement lazy imports that attempt to load serde from the SDK, making the dependency optional. If the SDK is not available, an appropriate error message is shown. Most consumers of these classes were already handled by: Decouple xcom public API from using XcomEncoder #58900, which found that the same effect can be achieved without these encoders. In a follow-up, I might move them to task SDK as well.
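
The lazy import described in item 3 can be sketched roughly as follows. This is an illustration, not the code in this PR: the module path airflow.sdk.serialization.serde is assumed from the new directory layout, and load_serde and LazyXComEncoder are hypothetical names.

```python
import importlib
import json
from types import ModuleType
from typing import Any

# Assumed SDK location of serde after this change; adjust if it lives elsewhere.
SDK_SERDE_MODULE = "airflow.sdk.serialization.serde"


def load_serde(module_name: str = SDK_SERDE_MODULE) -> ModuleType:
    """Import serde on first use so that importing this file never requires the SDK."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise RuntimeError(
            f"{module_name} is not available. "
            "Install the task SDK to encode or decode XCom values."
        ) from err


class LazyXComEncoder(json.JSONEncoder):
    """Hypothetical encoder that only touches serde when a non-JSON value shows up."""

    serde_module = SDK_SERDE_MODULE

    def default(self, o: Any) -> Any:
        # The import is deferred to this point, so plain JSON values never need the SDK.
        serde = load_serde(self.serde_module)
        return serde.serialize(o)
```

With this shape, json.dumps({"a": 1}, cls=LazyXComEncoder) works without the SDK installed, and only values that JSON cannot encode natively trigger the import (or the error message).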

High Level Changes

  1. Serde module is moved to task sdk now

    • All serializers moved to task-sdk/src/airflow/sdk/serialization/serializers/
  2. Stringification Separation

    • Created airflow-core/src/airflow/serialization/stringify.py for UI stringification
    • This module is self-contained and does not import from SDK
    • Matches deserialize(full=False) behavior exactly
  3. Deprecation Path

    • Old imports (from airflow.serialization.serde import ) continue to work
    • Redirection to SDK variant with deprecation warnings
    • Ensures backward compatibility till we remove it
  4. Tests

    • Serialize/deserialize tests moved to task-sdk/tests/task_sdk/serialization/test_serde.py
    • Stringify tests created in airflow-core/tests/unit/serialization/test_stringify.py
  5. Docs

    • Updated serializer documentation to reference airflow.sdk.serialization.serializers
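
To give a feel for what self-contained stringification means, here is a minimal sketch. It is not the contents of stringify.py: the envelope keys (`__classname__`, `__version__`, `__data__`) and the output format are assumptions made for illustration, and the real module covers more cases.

```python
from typing import Any

# Assumed envelope keys of a serde-encoded object; illustrative only.
CLASSNAME = "__classname__"
VERSION = "__version__"
DATA = "__data__"


def stringify(value: Any) -> Any:
    """Render a serialized value in readable form without importing any user classes."""
    if value is None or isinstance(value, (bool, int, float, str)):
        return value
    if isinstance(value, (list, tuple)):
        return [stringify(item) for item in value]
    if isinstance(value, dict):
        if CLASSNAME in value and VERSION in value:
            return _stringify_object(value[CLASSNAME], value[VERSION], value.get(DATA))
        return {key: stringify(item) for key, item in value.items()}
    return str(value)


def _stringify_object(classname: str, version: int, data: Any) -> str:
    """Format one encoded object as 'classname@version=N(...)' from its raw payload."""
    if isinstance(data, dict):
        body = ",".join(f"{k}={stringify(v)}" for k, v in data.items())
    elif isinstance(data, (list, tuple)):
        body = ",".join(str(stringify(item)) for item in data)
    elif data is None:
        body = ""
    else:
        body = str(data)
    return f"{classname}@version={version}({body})"
```

For example, stringify({"__classname__": "decimal.Decimal", "__version__": 1, "__data__": "1.5"}) returns "decimal.Decimal@version=1(1.5)". The point of the design is that nothing is imported or instantiated: the class name is only ever treated as a string, so the API server needs neither the SDK nor the serialized class.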

What does this mean for new serializers? (Debatable, check open questions below)

Important: New serializers should now be added to the SDK namespace, under task-sdk/src/airflow/sdk/serialization/serializers/.

The old airflow-core/src/airflow/serialization/serializers/ directory still exists for backward compatibility but is deprecated. New serializers should not be added there.
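
For reference, a backward-compatible redirect of this kind is commonly built on a module-level `__getattr__` (PEP 562). The sketch below shows the general pattern only and is not the shim shipped in this PR; make_deprecated_getattr is a hypothetical helper, and the new module path in the trailing comment is an assumption.

```python
import importlib
import warnings
from typing import Any, Callable


def make_deprecated_getattr(old_module: str, new_module: str) -> Callable[[str], Any]:
    """Build a module-level __getattr__ that forwards lookups to new_module with a warning."""

    def __getattr__(name: str) -> Any:
        target = importlib.import_module(new_module)
        try:
            value = getattr(target, name)
        except AttributeError:
            raise AttributeError(
                f"module {old_module!r} has no attribute {name!r}"
            ) from None
        warnings.warn(
            f"{old_module}.{name} is deprecated; import it from {new_module} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return value

    return __getattr__


# In the old module this would be a single assignment, for example:
# __getattr__ = make_deprecated_getattr(
#     "airflow.serialization.serde", "airflow.sdk.serialization.serde"
# )
```

Because the lookup only happens on attribute access, the old module stays importable even when the new location is not installed, and callers see the warning at their own import site.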

Open Questions

  1. When should we delete airflow-core/src/airflow/serialization/serializers/? It is currently kept for backward compatibility but should be removed in a future release. No code reads from it, so shall we just delete it? What about people who have added their own serializers there?

  2. Should XcomEncoder + Decoder stay in airflow core or task sdk?

Testing

  • All serialize/deserialize tests moved to SDK and passing
  • New stringify tests created to test all possible situations


@potiuk
Member

potiuk commented Dec 8, 2025

This is actually splitting serialization properly as it should rightfully be and it separates things rightly towards the future direction. So, serialization is in two parts (somewhat messy, but let me still explain):

Makes sense. I never liked the idea of having "one serialization to rule them all". Thanks for explanation.

@potiuk
Member

potiuk commented Dec 8, 2025

Makes sense. I never liked the idea of having "one serialization to rule them all". Thanks for explanation.

Also it makes it far better from the security standpoint. And opens the door of having serde implementation connected with providers installed (and likely discovered by Providers Manager)

@amoghrajesh
Contributor Author

This is actually splitting serialization properly as it should rightfully be and it separates things rightly towards the future direction. So, serialization is in two parts (somewhat messy, but let me still explain):

Makes sense. I never liked the idea of having "one serialization to rule them all". Thanks for explanation.

Absolutely me too!

Member

@pierrejeambrun pierrejeambrun left a comment


API side LGTM.

@amoghrajesh
Contributor Author

amoghrajesh commented Dec 9, 2025

Thanks, I will merge it once I am back from holidays (16 Dec), just because of the scope of this PR

@amoghrajesh
Contributor Author

Alright, let's merge this one today

@amoghrajesh
Contributor Author

Broken tests are unrelated. Same failure on main too: https://github.com/apache/airflow/actions/runs/20298266352/job/58300350219

@amoghrajesh
Contributor Author

Merging this PR

@amoghrajesh amoghrajesh merged commit 3af4d28 into apache:main Dec 17, 2025
235 of 237 checks passed
@amoghrajesh amoghrajesh deleted the move-serde-to-task-sdk branch December 17, 2025 10:28
@github-actions

Backport failed to create: v3-1-test. View the failure log Run details

Status Branch Result
v3-1-test Commit Link

You can attempt to backport this manually by running:

cherry_picker 3af4d28 v3-1-test

This should apply the commit to the v3-1-test branch and leave the commit in conflict state marking
the files that need manual conflict resolution.

After you have resolved the conflicts, you can continue the backport process by running:

cherry_picker --continue

@amoghrajesh
Contributor Author

No backport needed, this is targeted for 3.2.0

@potiuk
Member

potiuk commented Dec 17, 2025

#protm

FoxHelms pushed a commit to FoxHelms/airflow that referenced this pull request Dec 17, 2025
Lohith625 pushed a commit to Lohith625/airflow that referenced this pull request Dec 19, 2025
jhgoebbert pushed a commit to jhgoebbert/airflow_Owen-CH-Leung that referenced this pull request Feb 8, 2026
@amoghrajesh amoghrajesh added this to the Airflow 3.2.0 milestone Feb 11, 2026
@amoghrajesh
Contributor Author

cc: @atul-astronomer please try various xcoms in our test suite to see if anything breaks; if something does, it can be handled

Subham-KRLX pushed a commit to Subham-KRLX/airflow that referenced this pull request Mar 4, 2026

Development

Successfully merging this pull request may close these issues.

Isolate xcom API from using serde deserialize for UI display
Move over serde library to task sdk

6 participants