
Send task logs from KubeExecutor tasks to pod stdout too #47731

Merged
jedcunningham merged 1 commit into apache:main from astronomer:ke-task-logs-stdout
Mar 13, 2025

Conversation

@ashb
Member

@ashb ashb commented Mar 13, 2025

This is needed so that when the KubeExecutor is asked to get logs for a
running pod, they show up in the pod.

This changes the `airflow.sdk.execution_time.execute_workload` entrypoint to
a. produce all logs as JSON, and
b. ask the SDK to send all output log messages to the top-level logger too, so
they appear on stdout.

To make the output a bit nicer, this also tidies up/removes some of the
logging from `dispose_orm` so that it doesn't "pollute" the logs (this was
required because, due to the current hack we use to upload remote logs, we
ended up calling `dispose_orm` at the end, and that output _wasn't_ JSON
formatted).

Closes #46894
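The idea described above, rendering each task log record as a JSON line and mirroring it to the process's stdout so the container runtime (and therefore `kubectl logs`) can capture it, can be sketched with stdlib logging. This is only an illustration of the output shape: the real change goes through the Task SDK's structlog-based pipeline, and `record_to_json` / `JsonStdoutHandler` here are hypothetical names, not Airflow APIs.

```python
# Hedged sketch (not the actual Airflow implementation): render task log
# records as JSON lines and mirror them to stdout, where the container
# runtime picks them up.
import json
import logging
import sys


def record_to_json(record: logging.LogRecord) -> str:
    """Render a log record as a single JSON line, similar to JSON task logs."""
    return json.dumps(
        {
            "timestamp": record.created,
            "level": record.levelname.lower(),
            "logger": record.name,
            "event": record.getMessage(),
        }
    )


class JsonStdoutHandler(logging.Handler):
    """Mirror every record to stdout as JSON, alongside any file handlers."""

    def emit(self, record: logging.LogRecord) -> None:
        sys.stdout.write(record_to_json(record) + "\n")


task_logger = logging.getLogger("airflow.task.sketch")
task_logger.setLevel(logging.INFO)
task_logger.addHandler(JsonStdoutHandler())  # pod stdout now sees task logs
task_logger.info("task started")
```

Because the handler is additive, any existing file handlers keep working; stdout output is a duplicate stream, not a replacement.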



@ashb ashb added the area:task-execution-interface-aip72 AIP-72: Task Execution Interface (TEI) aka Task SDK label Mar 13, 2025
@ashb ashb force-pushed the ke-task-logs-stdout branch from 85749d8 to 5cddc04 Compare March 13, 2025 15:56
@ashb ashb force-pushed the ke-task-logs-stdout branch from 5cddc04 to 9950707 Compare March 13, 2025 16:22
@jedcunningham jedcunningham merged commit 835bbcb into apache:main Mar 13, 2025
61 checks passed
@jedcunningham jedcunningham deleted the ke-task-logs-stdout branch March 13, 2025 17:12
nailo2c pushed a commit to nailo2c/airflow that referenced this pull request Apr 4, 2025
taranlu-houzz added a commit to taranlu-houzz/airflow that referenced this pull request Mar 20, 2026
LocalExecutor's `_execute_work()` calls `supervise()` without passing
`subprocess_logs_to_stdout=True`, so task logs are only written to log
files and never reach the container's stdout. This breaks log collection
in Kubernetes deployments that rely on container stdout for aggregation
(e.g., Fluentd, Coralogix, Datadog).

The containerized executor (`execute_workload.py`) already passes this
flag (added in apache#47731), but LocalExecutor was not updated.

Closes apache#54501
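The gap described in the commit above can be sketched as an opt-in flag on a `supervise()`-like function: without `subprocess_logs_to_stdout=True`, log lines go only to the log file and the container's stdout stays empty. The `supervise_sketch` function below is hypothetical; only the `subprocess_logs_to_stdout` parameter name comes from the source.

```python
# Hedged sketch of the behavioural difference: task logs always go to the
# log file, but only reach stdout (and thus Fluentd/Datadog-style
# aggregators reading container output) when the caller opts in.
import sys
from pathlib import Path
from typing import Iterable


def supervise_sketch(
    log_path: Path,
    lines: Iterable[str],
    *,
    subprocess_logs_to_stdout: bool = False,
) -> None:
    """Append each task log line to the log file, optionally mirroring to stdout."""
    with log_path.open("a") as log_file:
        for line in lines:
            log_file.write(line + "\n")
            if subprocess_logs_to_stdout:
                # The container runtime captures this stream for aggregation.
                sys.stdout.write(line + "\n")
```

The fix amounts to passing the flag at the call site, the way `execute_workload.py` already does.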
eladkal pushed a commit to taranlu-houzz/airflow that referenced this pull request Mar 22, 2026

Labels

area:task-execution-interface-aip72 (AIP-72: Task Execution Interface (TEI) aka Task SDK), area:task-sdk


Development

Successfully merging this pull request may close these issues.

Write task logs to stdout for tasks run with KubernetesExecutor

4 participants