Make Task Instance primary key be a UUID #43161

@kaxil

Description

As part of AIP-72, we want to pass the Task Instance (TI) to the worker. Currently, the primary key of TI is the composite of dag_id, task_id, run_id, and map_index:

__tablename__ = "task_instance"
task_id = Column(StringID(), primary_key=True, nullable=False)
dag_id = Column(StringID(), primary_key=True, nullable=False)
run_id = Column(StringID(), primary_key=True, nullable=False)
map_index = Column(Integer, primary_key=True, nullable=False, server_default=text("-1"))

Instead of sending the entire composite key from the executor to the worker via the API server, ideally the API server can send just a TI UUID, which the worker then uses to fetch the correct TI to execute.

We want to add a single-column primary key holding a UUID, and it should be UUID v7, since its time-ordered layout sorts better than the random v4. For the migration that backfills existing rows, we can use v4, which most databases can generate natively.
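To illustrate why v7 sorts by creation time, here is a minimal sketch that builds a UUID v7 by hand from the RFC 9562 bit layout (48-bit millisecond timestamp, version, random bits, variant). The function name uuid7_sketch and its timestamp_ms parameter are hypothetical and exist only for this illustration; this is not the generator Airflow would necessarily use, and a maintained library is a better choice in real code.

```python
import os
import time
import uuid


def uuid7_sketch(timestamp_ms=None):
    """Build a UUID v7 value by hand (illustrative only, hypothetical helper)."""
    if timestamp_ms is None:
        # Milliseconds since the Unix epoch.
        timestamp_ms = time.time_ns() // 1_000_000

    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = (rand >> 68) & 0xFFF                 # top 12 random bits
    rand_b = rand & ((1 << 62) - 1)               # low 62 random bits

    value = (timestamp_ms & ((1 << 48) - 1)) << 80  # bits 127..80: timestamp
    value |= 0x7 << 76                              # bits 79..76: version 7
    value |= rand_a << 64                           # bits 75..64: random
    value |= 0b10 << 62                             # bits 63..62: RFC variant
    value |= rand_b                                 # bits 61..0: random
    return uuid.UUID(int=value)


# Values built from a later timestamp compare greater, because the
# timestamp occupies the most significant bits.
earlier = uuid7_sketch(1_000)
later = uuid7_sketch(2_000)
```

Because the timestamp leads the value, rows inserted over time land near each other in a B-tree index, whereas v4 values scatter across it. Two values created within the same millisecond are ordered only by their random bits in this sketch.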

The scope of this GitHub issue is to add the UUID column, but not to use it anywhere in the codebase until we need it on the Task Execution API server. We will keep the "denormalized" dag_id and run_id columns for easier searching and querying.
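A rough sketch of the backfill idea, using an in-memory SQLite database from the Python standard library so it runs anywhere. The column name id, the index name, the helper add_and_backfill_id, and the sample row are all assumptions made for illustration. The real change would be an Alembic migration against Airflow's supported databases, and it would replace the primary key rather than add a unique index as done here.

```python
import sqlite3
import uuid


def add_and_backfill_id(conn):
    """Add a hypothetical `id` column and fill existing rows with UUID v4."""
    # Step 1: add the column as nullable so existing rows remain valid.
    conn.execute("ALTER TABLE task_instance ADD COLUMN id TEXT")

    # Step 2: backfill. SQLite has no native UUID function, so the values
    # are generated in Python; other databases could do this in SQL.
    rows = conn.execute(
        "SELECT rowid FROM task_instance WHERE id IS NULL"
    ).fetchall()
    for (rowid,) in rows:
        conn.execute(
            "UPDATE task_instance SET id = ? WHERE rowid = ?",
            (str(uuid.uuid4()), rowid),
        )

    # Step 3: enforce uniqueness so the column can act as a lookup key.
    conn.execute(
        "CREATE UNIQUE INDEX idx_task_instance_id ON task_instance (id)"
    )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE task_instance (
        task_id TEXT NOT NULL,
        dag_id TEXT NOT NULL,
        run_id TEXT NOT NULL,
        map_index INTEGER NOT NULL DEFAULT -1,
        PRIMARY KEY (dag_id, task_id, run_id, map_index)
    )
    """
)
conn.execute(
    "INSERT INTO task_instance (task_id, dag_id, run_id) VALUES (?, ?, ?)",
    ("extract", "example_dag", "manual_run_1"),  # made-up sample row
)

backfilled = add_and_backfill_id(conn)

# A single value now identifies the row; the denormalized columns remain
# available for searching and querying.
ti_id = conn.execute("SELECT id FROM task_instance").fetchone()[0]
row = conn.execute(
    "SELECT dag_id, task_id, run_id, map_index FROM task_instance WHERE id = ?",
    (ti_id,),
).fetchone()
```

Adding the column as nullable, backfilling, and only then adding the constraint keeps each step valid against a table that already holds rows.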
