Apache Airflow version
3.1.0
If "Other Airflow 2/3 version" selected, which one?
No response
What happened?
Asset scheduling behaviors
Asset Event triggered DAGs behave in one of 3 different ways:
1. A single Asset Event triggers a single DAG Run
2. Multiple Asset Events trigger a single DAG Run
3. Asset Events that haven't triggered a DAG Run but are older than the last run are silently ignored
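For concreteness, here is a minimal sketch of the kind of producer/consumer pair involved (illustrative names, Airflow 3 airflow.sdk authoring API; this is not the exact code from the reproduction DAGs below). Behavior 1 would be one asset_consumer run per asset_producer run; behavior 2 would be a single consumer run covering several producer runs.

```python
# Minimal sketch, illustrative names only (Airflow 3 airflow.sdk authoring API).
from airflow.sdk import Asset, dag, task

demo_asset = Asset("demo_asset")  # hypothetical Asset used for this sketch

@dag(schedule=None)  # triggered manually; each run emits one Asset Event
def asset_producer():
    @task(outlets=[demo_asset])
    def emit():
        print("emitting one Asset Event for demo_asset")

    emit()

@dag(schedule=[demo_asset])  # triggered by Asset Events on demo_asset
def asset_consumer():
    @task
    def consume():
        print("run triggered by one or more Asset Events")

    consume()

asset_producer()
asset_consumer()
```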
How to make Assets behave differently
To force behaviors 2 & 3, one can set max_active_runs=1; every time the DAG runs, it will "consume" (via behavior 2 or 3) all available Asset Events.
To force behavior 1, one must set max_active_runs to a high value and hope that Asset Events are not generated faster than the scheduler runs (otherwise we fall back into behavior 2).
It is important to note that the catchup argument does not seem to affect this mechanism in any way.
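As a sketch of that workaround (illustrative names, Airflow 3 API), the only difference between the "coalescing" consumer and the "one run per event" consumer would be max_active_runs:

```python
# Sketch: two consumers of the same Asset, differing only in max_active_runs.
from airflow.sdk import Asset, dag, task

demo_asset = Asset("demo_asset")

@dag(schedule=[demo_asset], max_active_runs=1)
def consumer_serial():
    # Tends to coalesce all pending Asset Events into one run (behaviors 2 & 3).
    @task
    def work():
        print("serial consumer run")

    work()

@dag(schedule=[demo_asset], max_active_runs=10)
def consumer_parallel():
    # Tends toward one run per Asset Event (behavior 1),
    # as long as the scheduler keeps up with event production.
    @task
    def work():
        print("parallel consumer run")

    work()

consumer_serial()
consumer_parallel()
```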
The main Issue
The main issue here is:
Asset Event scheduling behaves in very different ways depending on DAG parallelism & Airflow Scheduler performance.
These things should be unrelated, and as far as I can tell, this behavior is undocumented.
Linked Issues
Other issues that would likely be solved by addressing this issue:
#56749 (UI changes)
#53896 (distinct DAG Run per Asset Event)
#50890 (want catchup on Assets)
#56691 (distinct DAG Run per Asset Event)
#56050 (Max active runs = 1 changes behavior)
#55956 (Force separate Events)
#47398
Unclear issues that may be related:
#56541 ? (unclear)
#42015 ? (unclear)
What you think should happen instead?
In my professional setting we use both behavior 1 (for event-based scheduling) and behaviors 2 & 3 (for table refreshes). Check out my talk from Airflow Summit 2025 for more details.
So I suggest we make the Asset Event DAG triggering behavior configurable at the DAG level.
For example by adding an asset_grouping argument (sketched below):
- if asset_grouping=True then we have behavior 2
- if asset_grouping=False then we have behavior 1
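To make the proposal concrete, a hypothetical sketch of what that could look like (asset_grouping is the proposed argument; it does not exist in Airflow today):

```python
# Hypothetical API sketch: asset_grouping is a proposal, not an existing Airflow feature.
from airflow.sdk import Asset, dag

demo_asset = Asset("demo_asset")

@dag(schedule=[demo_asset], asset_grouping=True)   # proposed: one run consumes all pending Asset Events (behavior 2)
def grouped_consumer(): ...

@dag(schedule=[demo_asset], asset_grouping=False)  # proposed: one DAG Run per Asset Event (behavior 1)
def per_event_consumer(): ...
```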
Behavior 3 is a bug in my opinion and should never happen.
I've put more info on the Asset Event attribution in this issue.
I also suggest we rename catchup to time_interval_catchup or some similar value, so that it is clear it does not apply to Asset Event based scheduling.
And we should document all this stuff.
How to reproduce
To reproduce, simply upload the following DAGs in a brand-new Airflow instance:
check_dataset_sync.py
Make sure to use a DB other than SQLite so you can compare the difference between max_active_runs=1 and max_active_runs=10.
Then use the airflow standalone command.
Turn all the DAGs on.
You should obtain the following DAGs:

And manually trigger the asset generator DAG once.

You will then see that the non-parallel DAGs only trigger twice, and the parallel DAG triggers 4-5 times, depending on scheduler frequency.
You can check the logs to see how many Asset Events each DAG is consuming:
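As a sketch (illustrative names; this assumes the Airflow 3 triggering_asset_events task-context key), a consumer task can also report that count itself:

```python
# Sketch: log how many Asset Events this run consumed, per asset.
# Assumes the Airflow 3 "triggering_asset_events" task context key.
from airflow.sdk import Asset, dag, task

demo_asset = Asset("demo_asset")

@dag(schedule=[demo_asset])
def count_consumed_events():
    @task
    def report(**context):
        events = context["triggering_asset_events"]
        for asset, asset_events in events.items():
            print(f"{asset}: consumed {len(asset_events)} Asset Event(s)")

    report()

count_consumed_events()
```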

You can also run similar tests for event-driven Asset Events:
event_scheduling_test.py
But be sure to add your DAGs repo to the PYTHONPATH: export PYTHONPATH=$AIRFLOW_HOME/dags
Operating System
Ubuntu 24
Versions of Apache Airflow Providers
apache-airflow-providers-common-compat 1.7.3
apache-airflow-providers-common-io 1.6.2
apache-airflow-providers-common-sql 1.27.5
apache-airflow-providers-postgres 6.2.3
apache-airflow-providers-smtp 2.2.0
apache-airflow-providers-standard 1.6.0
Deployment
Virtualenv installation
Deployment details
Using postgres for the Airflow DB
Anything else?
@cmarteepants I've finally gotten around to filing this issue, as previously discussed.
Let me know if everything is clear and understandable.
@uranusjr enjoy ;)
Are you willing to submit PR?
Code of Conduct