
Asset Event trigger attribution is misleading #56749

@SCrocky

Apache Airflow version

3.1.0

If "Other Airflow 2/3 version" selected, which one?

No response

What happened?

When a DAG is triggered by Asset Events, one of three things can happen:

  1. A single Asset Event triggers a DAG Run
  2. Multiple Asset Events trigger a single DAG Run
  3. Asset Events that have not triggered a DAG Run but are older than the last run are silently ignored

In this example:
Image

The consumer DAG runs twice.
It consumes one Asset Event on the first run and two on the second, then stops running.

However, if we look at the Asset Events:
Image

We see there are 5 Asset Events, and 3 of them have "triggered" DAG Runs.

This is misleading, as it makes the user believe there should be 3 DAG Runs, not 2.
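To make the mismatch concrete, here is a minimal sketch in plain Python (not Airflow code). The event IDs, run IDs, and which events were consumed are hypothetical values chosen to match the counts above. It shows that counting events that list a triggered DAG Run gives 3, while the number of distinct DAG Runs is 2.

```python
# Hypothetical data mirroring the counts in this report:
# 5 Asset Events, 3 of which list a "triggered" DAG Run, and 2 actual DAG Runs.
# Maps event id -> run ids the UI would list for that event (assumed shape).
triggered_runs_by_event = {
    1: ["run_1"],
    2: [],          # ignored event
    3: [],          # ignored event
    4: ["run_2"],
    5: ["run_2"],   # same run as event 4
}


def attribution_summary(runs_by_event):
    """Return (events showing a triggered run, distinct DAG Runs)."""
    events_with_runs = sum(1 for runs in runs_by_event.values() if runs)
    distinct_runs = {run for runs in runs_by_event.values() for run in runs}
    return events_with_runs, len(distinct_runs)


events_with_runs, distinct_runs = attribution_summary(triggered_runs_by_event)
print(f"{events_with_runs} events show a triggered run, "
      f"but only {distinct_runs} DAG Runs exist")
```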

What you think should happen instead?

I think we should make the following changes to the Asset UI page (along with the corresponding internal changes):

  • The "Triggered DAG Runs" label should only be used on the most recent Asset Event that a DAG Run consumes.
  • An "Included in DAG Run" label should be used for all other consumed Asset Events.

As for the ignored Asset Events, I see two possible solutions:

  1. Either this is a bug, and they should trigger a DAG Run
  2. Or this is a feature, and they should have some sort of "silently consumed" state and label attached to them.
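The proposed labelling could look roughly like the sketch below. It is plain Python rather than Airflow internals, and everything in it is an assumption for illustration: the `Event` class, the integer timestamps, the `last_run_start` parameter, and the "Pending" label for events newer than the last run. It implements the second option above (a "silently consumed" label), not the bug-fix option.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for an Asset Event (not an Airflow class)."""
    event_id: int
    timestamp: int              # assumed logical ordering of events
    consumed_by: Optional[str]  # id of the DAG Run that consumed it, if any


def label_events(events, last_run_start):
    """Assign one proposed UI label per event id."""
    # Find the most recent consumed event for each DAG Run.
    latest_per_run = {}
    for event in events:
        if event.consumed_by is None:
            continue
        current = latest_per_run.get(event.consumed_by)
        if current is None or event.timestamp > current.timestamp:
            latest_per_run[event.consumed_by] = event

    labels = {}
    for event in events:
        if event.consumed_by is None:
            # Unconsumed events older than the last run are the "ignored" ones.
            older = event.timestamp < last_run_start
            labels[event.event_id] = "Silently consumed" if older else "Pending"
        elif latest_per_run[event.consumed_by].event_id == event.event_id:
            labels[event.event_id] = "Triggered DAG Run"
        else:
            labels[event.event_id] = "Included in DAG Run"
    return labels


# Hypothetical sample: run_1 consumed one event, run_2 consumed two,
# and two events older than the last run were never consumed.
SAMPLE_EVENTS = [
    Event(1, 1, "run_1"),
    Event(2, 2, None),
    Event(3, 3, None),
    Event(4, 4, "run_2"),
    Event(5, 5, "run_2"),
]
SAMPLE_LABELS = label_events(SAMPLE_EVENTS, last_run_start=6)
```

With this sample, exactly two events carry the "Triggered DAG Run" label, one per DAG Run, which matches the number of runs the user actually sees.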

How to reproduce

To reproduce, upload the following DAGs to a brand new Airflow instance:

check_dataset_sync.py

Make sure to use a DB other than SQLite so you can compare the difference between max_active_runs=1 and max_active_runs=10.

Then run the airflow standalone command.
Turn all the DAGs on and manually trigger the asset generator DAG once.

Operating System

Ubuntu 24

Versions of Apache Airflow Providers

apache-airflow-providers-common-compat   1.7.3
apache-airflow-providers-common-io       1.6.2
apache-airflow-providers-common-sql      1.27.5
apache-airflow-providers-postgres        6.2.3
apache-airflow-providers-smtp            2.2.0
apache-airflow-providers-standard        1.6.0

Deployment

Virtualenv installation

Deployment details

No response

Anything else?

Do not use the default SQLite database, as it cannot show the parallelism issues.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Metadata

Assignees

No one assigned

    Labels

    affected_version:3.1 (Issues Reported for 3.1)
    area:UI (Related to UI/UX. For Frontend Developers.)
    area:core
    area:data-aware-scheduling (assets, datasets, AIP-48, AIP-76, AIP-74, Asset Partitions)
    kind:bug (This is a clearly a bug)
