Conversation

I'm struggling to understand how the problem has been solved from reading the code. Can you explain your solution for preventing the starvation?

Sure, assume dags a, b A
```python
if dag_model.exceeds_max_non_backfill:
    self.log.warning(
        "Dag run cannot be created; max active runs exceeded.",
        dag_id=dag_model.dag_id,
        max_active_runs=dag_model.max_active_runs,
        active_runs=active_runs_of_dags.get(dag_model.dag_id),
    )
    continue
```
is it right to remove this check?
Backfill has a special case as explained in the docs
https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/backfill.html#concurrency-control
You can set max_active_runs on a backfill and it will control how many Dag runs in the backfill can run concurrently. Backfill max_active_runs is applied independently of the Dag max_active_runs setting.
The idea is to give the backfill run the power to limit concurrency even if the scheduler can schedule it all at once.
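The two independent caps described above can be sketched with a toy helper (hypothetical function, not Airflow's implementation): backfill runs are gated only by the backfill's own max_active_runs, while non-backfill runs are gated by the DAG's.

```python
# Hypothetical helper (NOT Airflow code): a queued run may be promoted to
# RUNNING only if it fits under the cap that applies to it. Backfill runs
# count against the backfill's max_active_runs; non-backfill runs count
# against the DAG's max_active_runs. The two caps are independent.
def can_promote(
    is_backfill_run: bool,
    active_backfill_runs: int,
    backfill_max_active_runs: int,
    active_non_backfill_runs: int,
    dag_max_active_runs: int,
) -> bool:
    if is_backfill_run:
        # the backfill cap is applied independently of the DAG cap
        return active_backfill_runs < backfill_max_active_runs
    return active_non_backfill_runs < dag_max_active_runs

# the DAG-level cap is saturated, but the backfill can still start runs
print(can_promote(True, 1, 3, 16, 16))   # True
print(can_promote(False, 1, 3, 16, 16))  # False
```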
It is checked in the window query that was added, so the data that gets here has already passed the query's validation of the non-backfill runs.
This will have to wait until 3.2.0 -- this touches the core and I don't want to rush it in before 3.2.0 is out. We have 1200+ commits in 3.2.0.
Pull request overview
This PR addresses scheduler DAG run starvation by changing how queued DagRuns are selected for transition to RUNNING when max_active_runs is low and there are large backlogs of queued runs.
Changes:
- Update `DagRun.get_queued_dag_runs_to_set_running()` to limit queued candidates per DAG/backfill using a `row_number()` window so the scheduler doesn't spend its per-loop budget examining non-runnable runs from a single DAG.
- Adjust scheduler unit tests to validate the improved fairness/scheduling behavior under backlog conditions.
- Remove a max-active guard/logging in `_create_dag_runs()` and add a TODO note around `_set_exceeds_max_active_runs`.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `airflow-core/src/airflow/models/dagrun.py` | Changes queued dagrun selection query to avoid starvation by limiting per DAG/backfill candidates. |
| `airflow-core/src/airflow/jobs/scheduler_job_runner.py` | Removes a creation-time max-active skip and adds a TODO comment near `_set_exceeds_max_active_runs`. |
| `airflow-core/tests/unit/jobs/test_scheduler_job.py` | Updates/reshapes scheduling tests to assert non-starved behavior and updated run counts. |
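The per-DAG/backfill candidate cap summarized above can be sketched with a plain SQL window function. This is a minimal illustration using stdlib `sqlite3`, not the actual Airflow SQLAlchemy query; the table and column names mirror the Airflow model only loosely.

```python
import sqlite3
from collections import Counter

# Minimal sketch of the technique (NOT the actual Airflow query): a
# row_number() window over (dag_id, backfill_id) caps how many QUEUED runs
# per partition survive the candidate query, so one DAG's huge backlog
# cannot monopolize the scheduler's per-loop examination budget.
# Requires SQLite >= 3.25 for window-function support.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dag_run "
    "(id INTEGER PRIMARY KEY, dag_id TEXT, backfill_id INTEGER, state TEXT, logical_date INTEGER)"
)
# dag "a" has a huge queued backlog; dag "b" has only a few queued runs
rows = [("a", None, "queued", i) for i in range(100)]
rows += [("b", None, "queued", i) for i in range(3)]
conn.executemany(
    "INSERT INTO dag_run (dag_id, backfill_id, state, logical_date) VALUES (?, ?, ?, ?)",
    rows,
)

max_active_runs = 4  # illustrative cap
candidates = conn.execute(
    """
    SELECT dag_id, id FROM (
        SELECT dag_id, id,
               row_number() OVER (
                   PARTITION BY dag_id, backfill_id
                   ORDER BY logical_date, id  -- id as a deterministic tiebreaker
               ) AS rn
        FROM dag_run
        WHERE state = 'queued'
    )
    WHERE rn <= ?
    """,
    (max_active_runs,),
).fetchall()

# at most 4 candidates per DAG survive, so dag "b" is no longer crowded out
counts = Counter(dag_id for dag_id, _ in candidates)
print(counts)  # Counter({'a': 4, 'b': 3})
```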
```python
# now we finish all lower priority backfill tasks, and observe new higher priority tasks are started
session.execute(
    update(DagRun)
    .where(DagRun.dag_id == "test_dag2", DagRun.state == DagRunState.RUNNING)
    .values(state=DagRunState.SUCCESS)
)
```
This comment says you’re finishing “lower priority backfill tasks”, but the code updates DAG runs for test_dag2 (non-backfill) from RUNNING→SUCCESS. Please adjust the comment to match what the test is actually doing (finishing test_dag2 runs) to avoid confusion when maintaining the test expectations.
```python
.over(partition_by=[DagRun.dag_id, DagRun.backfill_id], order_by=DagRun.logical_date)
.label("rn"),
)
.where(DagRun.state == DagRunState.QUEUED)
```
available_dagruns_rn uses row_number(... order_by=DagRun.logical_date), but logical_date is nullable and does not reflect the scheduler’s actual priority ordering (run_after, and for backfills BackfillDagRun.sort_ordinal, which can differ e.g. reverse backfills). This can cause the rn-capacity filter to pick a non-runnable/future run_after row (or wrong backfill ordinal) as rn=1 and then filter out runnable candidates, reintroducing starvation or breaking backfill ordering. Consider basing the window order_by on run_after with a deterministic tiebreaker (e.g. id), and for backfills incorporate BackfillDagRun.sort_ordinal (likely via computing the row_number after joining that table), and/or apply run_after <= now() in the window subquery so rn is computed only among runnable queued runs.
Suggested change:

```diff
-            .over(partition_by=[DagRun.dag_id, DagRun.backfill_id], order_by=DagRun.logical_date)
-            .label("rn"),
-        )
-        .where(DagRun.state == DagRunState.QUEUED)
+            .over(
+                partition_by=[DagRun.dag_id, DagRun.backfill_id],
+                order_by=[BackfillDagRun.sort_ordinal, DagRun.run_after, DagRun.id],
+            )
+            .label("rn"),
+        )
+        .join(
+            BackfillDagRun,
+            DagRun.backfill_id == BackfillDagRun.id,
+            isouter=True,
+        )
+        .where(
+            and_(
+                DagRun.state == DagRunState.QUEUED,
+                DagRun.run_after <= func.now(),
+            )
+        )
```
kaxil
left a comment
The core idea — using row_number() to cap how many queued runs per DAG/backfill pass through the query — is a reasonable approach to solving the starvation problem in get_queued_dag_runs_to_set_running. However, this PR bundles in unrelated and incorrect changes to the DagRun creation path (_create_dag_runs), which is a separate concern from the QUEUED→RUNNING promotion path.
See inline comments for details.
Sure, no problem
```python
.where(DagRun.dag_id == "test_dag2", DagRun.state == DagRunState.RUNNING)
.values(state=DagRunState.SUCCESS)
)
session.commit()
```
Calling session.commit() inside a unit test can break transactional test isolation (fixtures that rely on nested transactions/rollbacks) and is usually unnecessary here since the updated rows are read again within the same session. Prefer removing the commit and relying on flush() (or keep it as a flush() only) so the state transition is visible without finalizing the transaction.
Suggested change:

```diff
- session.commit()
```
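The point of the review comment can be illustrated with stdlib `sqlite3` (the review concerns a SQLAlchemy session, but the underlying principle is the same): within one connection/transaction, an uncommitted write is already visible to later reads on that same connection, so a test that reads back through the same session does not need `commit()` to observe the state change.

```python
import sqlite3

# Illustration (NOT the test code under review): the same connection sees
# its own uncommitted UPDATE, so finalizing the transaction with commit()
# is unnecessary just to make the new state readable. In SQLAlchemy terms,
# a flush() suffices and keeps the test's transaction rollback intact.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag_run (id INTEGER PRIMARY KEY, state TEXT)")
conn.execute("INSERT INTO dag_run (state) VALUES ('running')")
conn.execute("UPDATE dag_run SET state = 'success' WHERE state = 'running'")
# no commit() has been issued, yet the same connection reads the new state
state = conn.execute("SELECT state FROM dag_run").fetchone()[0]
print(state)  # success
```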
```python
dag_run = dag_maker.create_dagrun(state=State.RUNNING, session=session, run_type=DagRunType.SCHEDULED)

for _ in range(5):
    # create a bunch of dagruns in queued state, to make sure they are filtered by max_active_runs
```
The comment says these are created in queued state, but the code sets state=State.RUNNING. If the intent is to pre-fill/exceed max_active_runs with RUNNING dagruns (which makes sense for this test), update the comment to match the actual setup to avoid confusion for future maintainers.
Suggested change:

```diff
- # create a bunch of dagruns in queued state, to make sure they are filtered by max_active_runs
+ # create a bunch of dagruns in running state, to exceed max_active_runs
```
```python
DagRun.dag_id,
DagRun.id,
func.row_number()
.over(partition_by=[DagRun.dag_id, DagRun.backfill_id], order_by=DagRun.logical_date)
```
The window order_by uses only DagRun.logical_date, which may not be a total ordering (ties can occur), making the chosen dagruns nondeterministic across DB engines/plans and potentially causing flaky behavior. Add a stable tie-breaker (e.g., also order by DagRun.id or DagRun.run_after) so row-number assignment is deterministic within each (dag_id, backfill_id) partition.
Suggested change:

```diff
- .over(partition_by=[DagRun.dag_id, DagRun.backfill_id], order_by=DagRun.logical_date)
+ .over(
+     partition_by=[DagRun.dag_id, DagRun.backfill_id],
+     order_by=[DagRun.logical_date, DagRun.id],
+ )
```
```python
func.row_number()
.over(partition_by=[DagRun.dag_id, DagRun.backfill_id], order_by=DagRun.logical_date)
.label("rn"),
)
```
This computes row_number() across all queued dagruns before applying later eligibility filters (DagModel/Backfill joins, paused checks, etc.). On large installations with many queued dagruns, that full-table window can become a costly bottleneck. Consider pushing more predicates/joins into the same subquery/CTE used for the window (so the window runs only on eligible candidates), or otherwise narrowing the queued set prior to the window calculation.
Suggested change:

```diff
      )
  )
+ .join(
+     DagModel,
+     and_(
+         DagModel.dag_id == DagRun.dag_id,
+         DagModel.is_paused == false(),
+         DagModel.is_stale == false(),
+     ),
+ )
```
```python
# this is because there are 30 dags, most of which get filtered due to max_active_runs
# and so due to the default dagruns to examine, we look at the first 20 dags which CAN be run
# according to the max_active_runs parameter, meaning 3 backfill runs will start, 1 non backfill and
# all dagruns of dag2
# any runs for dag2 get started
```
These explanatory comments repeatedly say 'dags' where they appear to mean 'dagruns' (e.g., '30 dags', 'first 20 dags'), which makes the rationale hard to follow. Clarifying the terminology here (dag vs dagrun) would prevent misunderstanding when debugging scheduler selection behavior.
Suggested change:

```diff
- # this is because there are 30 dags, most of which get filtered due to max_active_runs
- # and so due to the default dagruns to examine, we look at the first 20 dags which CAN be run
- # according to the max_active_runs parameter, meaning 3 backfill runs will start, 1 non backfill and
- # all dagruns of dag2
- # any runs for dag2 get started
+ # this is because there are 30 queued dagruns, many of which get filtered because their DAGs
+ # have already reached max_active_runs
+ # and so due to the default dagruns-to-examine limit, we look at the first 20 dagruns that CAN be run
+ # according to the max_active_runs parameter, meaning 3 backfill runs will start, 1 non-backfill,
+ # and all runnable dagruns for dag2
```
We have been experiencing severe dagrun starvation on our cluster: when there were a lot of dagruns and a low max_active_runs limit (hundreds to thousands of runs with a limit in the 10s), many dags got stuck in the queued state without ever moving to running, causing those dagruns to time out.

After investigation, we found the cause in the _start_queued_dagruns method: the query was returning dagruns which cannot be set to running due to the max_active_runs limit, meaning that other dagruns were starved.

A similar issue occurs when new dagruns are created in large batches (due to the nulls-first ordering), but that is out of scope for this PR; I will submit an additional PR soon.
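The starvation mechanism described above can be demonstrated with a toy simulation (illustrative names, not Airflow code): the scheduler examines at most a fixed budget of queued runs per loop, and without a per-DAG cap on candidates, one DAG's non-runnable backlog exhausts that budget before other DAGs' runnable runs are ever looked at.

```python
from collections import Counter

# Toy simulation (NOT Airflow code) of the starvation this PR fixes. The
# scheduler examines at most `budget` queued-run candidates per loop. With
# per_dag_cap set (mimicking the row_number() window filter in the query),
# a DAG's excess candidates are filtered out before consuming the budget.
def promote(queued, active, max_active_runs, budget, per_dag_cap=None):
    """Return dag_ids of runs promoted to RUNNING during one scheduler loop."""
    examined, promoted = 0, []
    seen_per_dag = Counter()
    for dag_id in queued:
        if examined >= budget:
            break
        if per_dag_cap is not None:
            if seen_per_dag[dag_id] >= per_dag_cap:
                continue  # filtered by the window query; doesn't consume budget
            seen_per_dag[dag_id] += 1
        examined += 1
        if active[dag_id] < max_active_runs[dag_id]:
            active[dag_id] += 1
            promoted.append(dag_id)
    return promoted

queued = ["dag_a"] * 1000 + ["dag_b"] * 5   # huge dag_a backlog, then dag_b
max_active = {"dag_a": 10, "dag_b": 10}

# without a cap, dag_a's non-runnable runs (it is already at its limit)
# exhaust the budget: dag_b is starved even though its runs could start
got = promote(queued, {"dag_a": 10, "dag_b": 0}, max_active, budget=20)
print(Counter(got))  # Counter() -- nothing promoted

# with a per-DAG candidate cap, dag_b's runs make it into the budget
got = promote(queued, {"dag_a": 10, "dag_b": 0}, max_active, budget=20, per_dag_cap=10)
print(Counter(got))  # Counter({'dag_b': 5})
```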
closes #49508