Skip to content

Enforce supervisor schema class name matches its type literal#66899

Merged
jason810496 merged 2 commits into
apache:mainfrom
jason810496:ci/guard-supervisor-schema-name-and-type-in-sync
May 15, 2026
Merged

Enforce supervisor schema class name matches its type literal#66899
jason810496 merged 2 commits into
apache:mainfrom
jason810496:ci/guard-supervisor-schema-name-and-type-in-sync

Conversation

@jason810496
Copy link
Copy Markdown
Member

@jason810496 jason810496 commented May 14, 2026

Why

The six supervisor discriminated unions (ToTask, ToSupervisor, ToManager, ToDagProcessor, ToTriggerRunner, ToTriggerSupervisor) need every member's class __name__ to equal its type: Literal[...] value so CommsDecoder routes wire frames to the right class — but nothing enforced that invariant, and two members had silently drifted on main.

What

  • Add task-sdk/tests/task_sdk/execution_time/test_supervisor_schemas_name_type_sync.py — a parametrized unit test (one case per union) that walks every member and asserts class __name__ equals its single-value type Literal. Catches drift, a missing type field, and multi-value Literals. Runs as part of the task-sdk test suite.
  • Fix the two pre-existing mismatches the new test surfaces in task-sdk/src/airflow/sdk/execution_time/comms.py:
    • XComCountResponse: Literal["XComLengthResponse"]Literal["XComCountResponse"]
    • GetXComCount: Literal["GetNumberXComs"]Literal["GetXComCount"]
  • Update the one assertion in task-sdk/tests/task_sdk/execution_time/test_supervisor.py that pinned the old wire string.

Verfication

Screenshot 2026-05-14 at 2 33 09 PM
Was generative AI tooling used to co-author this PR?

@jason810496 jason810496 force-pushed the ci/guard-supervisor-schema-name-and-type-in-sync branch from c37f6da to 45c5d86 Compare May 14, 2026 05:25
@jason810496 jason810496 changed the title Add prek hook to keep supervisor schema class name in sync with its `… Add prek hook to keep supervisor schema class name in sync with its type literal May 14, 2026
@jason810496 jason810496 self-assigned this May 14, 2026
Comment thread task-sdk/src/airflow/sdk/execution_time/comms.py
Comment thread dev/breeze/doc/images/output_pr_auto-triage.svg Outdated
Comment thread task-sdk/src/airflow/sdk/execution_time/comms.py
Comment thread task-sdk/src/airflow/sdk/execution_time/comms.py
Comment thread task-sdk/src/airflow/sdk/execution_time/comms.py
@jason810496 jason810496 marked this pull request as draft May 14, 2026 06:24
@jason810496 jason810496 changed the title Add prek hook to keep supervisor schema class name in sync with its type literal Enforce supervisor schema class name matches its type literal May 14, 2026
@jason810496 jason810496 force-pushed the ci/guard-supervisor-schema-name-and-type-in-sync branch from af83d31 to e513938 Compare May 14, 2026 06:40
@jason810496 jason810496 marked this pull request as ready for review May 14, 2026 06:41
@jscheffl
Copy link
Copy Markdown
Contributor

Not sure if considered but besides making it static check at coding time, have we considered making this a runtime property handled by Pydantic?

class BaseCommClass(BaseModel):
    """Base type class."""

    type: str = "__undefined__"

    def __init__(self):
        super().__init__()
        self.type = self.__class__.__name__

class MyType1(BaseCommClass):
    """Another type."""

    pass

class MyType2(BaseCommClass):
    """Another type."""

    pass


print(MyType1().type)
print(MyType2().type)

@jason810496 jason810496 removed backport-to-v3-2-test Mark PR with this label to backport to v3-2-test branch area:dev-tools labels May 15, 2026
@jason810496 jason810496 merged commit 28e82d2 into apache:main May 15, 2026
138 of 145 checks passed
@jason810496
Copy link
Copy Markdown
Member Author

Not sure if considered but besides making it static check at coding time, have we considered making this a runtime property handled by Pydantic?

I will follow-up on this one. BTW, not sure if there's any reason to explictly specify the typ literal field (e.g. binding the field at runtime might change the generated schema?) I will check it out.

@jason810496
Copy link
Copy Markdown
Member Author

Tried it. Runtime-binding changes the schema and also breaks the tagged union itself, because Pydantic v2 needs the discriminator field to be a Literal.

from typing import Annotated, Literal, Union
from pydantic import BaseModel, Field, TypeAdapter

# --- Current pattern: Literal type field ---
class A1(BaseModel):
    type: Literal["A1"] = "A1"

class A2(BaseModel):
    type: Literal["A2"] = "A2"

print(A1.model_json_schema())
#. {'properties': {'type': {'const': 'A1', 'default': 'A1', 'title': 'Type', 'type': 'string'}}, 'title': 'A1', 'type': 'object'}
TypeAdapter(Annotated[Union[A1, A2], Field(discriminator="type")])  # OK

# --- Proposed pattern: runtime-bind in __init__ ---
class B(BaseModel):
    type: str = "__undefined__"
    def __init__(self, **kw):
        super().__init__(**kw)
        self.type = type(self).__name__

class B1(B): pass
class B2(B): pass

print(B1.model_json_schema())
# {'properties': {'type': {'default': '__undefined__', 'title': 'Type', 'type': 'string'}}, 'title': 'B1', 'type': 'object'}
print(B1().type)  # 'B1' (instance state only; not reflected in schema)

TypeAdapter(Annotated[Union[B1, B2], Field(discriminator="type")])
# PydanticUserError: Model 'B1' needs field 'type' to be of type `Literal`

So CommsDecoder's TypeAdapter-based routing relies on the Literal annotation, and any downstream schema consumer relies on the const it produces.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants