Skip to content

fix: eliminate duplicate JOINs in get_task_instances endpoint#62910

Merged
pierrejeambrun merged 6 commits intoapache:mainfrom
ShubhamGondane:fix/eliminate-duplicate-joins-get-task-instances
Mar 18, 2026
Merged

fix: eliminate duplicate JOINs in get_task_instances endpoint#62910
pierrejeambrun merged 6 commits intoapache:mainfrom
ShubhamGondane:fix/eliminate-duplicate-joins-get-task-instances

Conversation

@ShubhamGondane
Copy link
Copy Markdown
Contributor

@ShubhamGondane ShubhamGondane commented Mar 5, 2026

Root cause

get_task_instances combined explicit join(TI.dag_run) + outerjoin(TI.dag_version) with joinedload options for the same tables inside eager_load_TI_and_TIH_for_validation(). SQLAlchemy emits a separate JOIN for each joinedload even when the table is already in the FROM clause, producing redundant duplicate JOINs in the SQL.

Before — 7 JOINs (dag_run and dag_version each joined twice):

FROM task_instance
  JOIN dag_run ON ...               -- explicit
  LEFT OUTER JOIN dag_version ON ... -- explicit
  LEFT OUTER JOIN dag_version AS dag_version_1 ON ...  -- joinedload duplicate
  LEFT OUTER JOIN dag_bundle ON ...
  JOIN dag_run AS dag_run_1 ON ...  -- joinedload duplicate
  LEFT OUTER JOIN dag AS dag_1 ON ...
  LEFT OUTER JOIN task_instance_note ON ...

After — 5 JOINs (each table once):

FROM task_instance
  JOIN dag_run ON ...               -- reused by contains_eager
  LEFT OUTER JOIN dag_version ON ... -- reused by contains_eager
  LEFT OUTER JOIN dag_bundle ON ...
  LEFT OUTER JOIN dag AS dag_1 ON ...
  LEFT OUTER JOIN task_instance_note ON ...

Note on performance: Because the query selects from task_instance and all the duplicated JOINs traverse many-to-one relationships (each TI has exactly one DagRun, one DagVersion), there is no row multiplication — the duplicate JOINs are redundant but not harmful at the row level. Benchmarks on PostgreSQL at scale (20,000 rows) confirmed no measurable timing difference. The value of this change is SQL correctness: contains_eager is the right SQLAlchemy idiom when the JOIN is already explicit, and it avoids emitting unnecessary SQL.

Fix

Replace joinedload options for already-joined tables with contains_eager, which tells SQLAlchemy to reuse the existing JOIN result for eager loading instead of adding a new one.

eager_load_TI_and_TIH_for_validation() is intentionally left unchanged — its other callers (get_mapped_task_instances, get_task_instances_history) have different join patterns and are not affected.

Testing

  • All 47 existing TestGetTaskInstances tests pass unchanged.
  • Replaced the query-count regression test with test_no_duplicate_joins_in_get_task_instances_query, which calls the real endpoint and captures the SQL it emits via a SQLAlchemy event listener, then asserts dag_run and dag_version each appear exactly once in the JOINs.

related: #62027


Was generative AI tooling used to co-author this PR?
  • Yes — Claude

Generated-by: Claude Sonnet 4.6 following the guidelines

Copy link
Copy Markdown
Member

@pierrejeambrun pierrejeambrun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to adjust the test case which seems irrelevant.

From a performance standpoint it doesn't seem to make a difference. Can you provide a benchmark or more details as to the expected performance gains?

Before

Image

After

Image

@ShubhamGondane
Copy link
Copy Markdown
Contributor Author

Thanks for the review @pierrejeambrun
You are right on both counts.

Regarding the performance benchmark - I ran a PostgreSQL benchmark at 20,000 rows and didn't see any measurable difference. Root cause: all duplicated JOINs are on many-to-one relationships (each TI has exactly one DagRun/DagVersion), so there's no row multiplication regardless. Updated the PR description to remove performance claims.

Copy link
Copy Markdown
Member

@pierrejeambrun pierrejeambrun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI need fixing.

@potiuk
Copy link
Copy Markdown
Member

potiuk commented Mar 11, 2026

CI need fixing.

looks good :)

@potiuk
Copy link
Copy Markdown
Member

potiuk commented Mar 11, 2026

@ShubhamGondane This PR has a few issues that need to be addressed before it can be reviewed — please see our Pull Request quality criteria.

Issues found:

  • ⚠️ Unresolved review comments: This PR has 1 unresolved review thread from maintainers. Please review and resolve all inline review comments before requesting another review. You can resolve a conversation by clicking 'Resolve conversation' on each thread after addressing the feedback. See pull request guidelines.

Note: Your branch is 253 commits behind main. Some check failures may be caused by changes in the base branch rather than by your PR. Please rebase your branch and push again to get up-to-date CI results.

What to do next:

  • The comment informs you what you need to do.
  • Fix each issue, then mark the PR as "Ready for review" in the GitHub UI - but only after making sure that all the issues are fixed.
  • Maintainers will then proceed with a normal review.

Please address the issues above and push again. If you have questions, feel free to ask on the Airflow Slack.

@potiuk potiuk added the ready for maintainer review Set after triaging when all criteria pass. label Mar 11, 2026
@potiuk
Copy link
Copy Markdown
Member

potiuk commented Mar 12, 2026

@ShubhamGondane This PR has a few issues that need to be addressed before it can be reviewed — please see our Pull Request quality criteria.

Issues found:

  • ⚠️ Unresolved review comments: This PR has 1 unresolved review thread from maintainers: @pierrejeambrun (MEMBER): 1 unresolved thread. Please review and resolve all inline review comments before requesting another review. You can resolve a conversation by clicking 'Resolve conversation' on each thread after addressing the feedback. See pull request guidelines.

Note: Your branch is 76 commits behind main. Some check failures may be caused by changes in the base branch rather than by your PR. Please rebase your branch and push again to get up-to-date CI results.

What to do next:

  • The comment informs you what you need to do.
  • Fix each issue, then mark the PR as "Ready for review" in the GitHub UI - but only after making sure that all the issues are fixed.
  • Maintainers will then proceed with a normal review.

Please address the issues above and push again. If you have questions, feel free to ask on the Airflow Slack.

@potiuk potiuk removed the ready for maintainer review Set after triaging when all criteria pass. label Mar 12, 2026
@ShubhamGondane ShubhamGondane force-pushed the fix/eliminate-duplicate-joins-get-task-instances branch from 90f3a49 to fc621a5 Compare March 12, 2026 21:44
@ShubhamGondane
Copy link
Copy Markdown
Contributor Author

Thanks for the feedback @pierrejeambrun! I've reworked the approach:

  • Moved the explicit .join(dag_run) / .outerjoin(dag_version) into eager_load_TI_and_TIH_for_validation so all join + eager-loading logic is centralised in one function
  • The function now takes the query as input, adds the joins, applies contains_eager, and returns the modified query
  • Updated all 4: get_task_instances, get_mapped_task_instances, get_task_instances_batch, and get_task_instance_tries
  • Added SQL-structure regression tests for the mapped and batch endpoints as well

@potiuk potiuk added the ready for maintainer review Set after triaging when all criteria pass. label Mar 13, 2026
Copy link
Copy Markdown
Member

@pierrejeambrun pierrejeambrun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks

@pierrejeambrun pierrejeambrun added this to the Airflow 3.2.0 milestone Mar 18, 2026
@pierrejeambrun pierrejeambrun merged commit efce3b1 into apache:main Mar 18, 2026
132 checks passed
imrichardwu pushed a commit to imrichardwu/airflow that referenced this pull request Mar 18, 2026
…#62910)

* fix: eliminate duplicate JOINs in get_task_instances endpoint (apache#62027)

Replaces joinedload with contains_eager for already-joined tables, reducing the main query from 7 JOINs to 5.

* tests: replace query-count test with SQL-structure regression test for apache#62027

* fix: mypy type annotation for dialect in benchmark script

* fix: centralise join + eager-loading in eager_load_TI_and_TIH_for_validation for all task instance endpoints

* ci: retrigger checks
imrichardwu pushed a commit to imrichardwu/airflow that referenced this pull request Mar 18, 2026
…#62910)

* fix: eliminate duplicate JOINs in get_task_instances endpoint (apache#62027)

Replaces joinedload with contains_eager for already-joined tables, reducing the main query from 7 JOINs to 5.

* tests: replace query-count test with SQL-structure regression test for apache#62027

* fix: mypy type annotation for dialect in benchmark script

* fix: centralise join + eager-loading in eager_load_TI_and_TIH_for_validation for all task instance endpoints

* ci: retrigger checks
fat-catTW pushed a commit to fat-catTW/airflow that referenced this pull request Mar 22, 2026
…#62910)

* fix: eliminate duplicate JOINs in get_task_instances endpoint (apache#62027)

Replaces joinedload with contains_eager for already-joined tables, reducing the main query from 7 JOINs to 5.

* tests: replace query-count test with SQL-structure regression test for apache#62027

* fix: mypy type annotation for dialect in benchmark script

* fix: centralise join + eager-loading in eager_load_TI_and_TIH_for_validation for all task instance endpoints

* ci: retrigger checks
techcodie pushed a commit to techcodie/airflow that referenced this pull request Mar 23, 2026
…#62910)

* fix: eliminate duplicate JOINs in get_task_instances endpoint (apache#62027)

Replaces joinedload with contains_eager for already-joined tables, reducing the main query from 7 JOINs to 5.

* tests: replace query-count test with SQL-structure regression test for apache#62027

* fix: mypy type annotation for dialect in benchmark script

* fix: centralise join + eager-loading in eager_load_TI_and_TIH_for_validation for all task instance endpoints

* ci: retrigger checks
abhijeets25012-tech pushed a commit to abhijeets25012-tech/airflow that referenced this pull request Apr 9, 2026
…#62910)

* fix: eliminate duplicate JOINs in get_task_instances endpoint (apache#62027)

Replaces joinedload with contains_eager for already-joined tables, reducing the main query from 7 JOINs to 5.

* tests: replace query-count test with SQL-structure regression test for apache#62027

* fix: mypy type annotation for dialect in benchmark script

* fix: centralise join + eager-loading in eager_load_TI_and_TIH_for_validation for all task instance endpoints

* ci: retrigger checks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:API Airflow's REST/HTTP API ready for maintainer review Set after triaging when all criteria pass.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants