Add option to override Celery worker images#63514
Add option to override Celery worker images#63514sg-c0de wants to merge 3 commits intoapache:mainfrom
Conversation
|
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
|
jscheffl
left a comment
There was a problem hiding this comment.
Thanks, that option makes sense to me. Can you also please add some tests?
|
@sg-c0de This PR has a few issues that need to be addressed before it can be reviewed — please see our Pull Request quality criteria. Issues found:
What to do next:
There is no rush — take your time and work at your own pace. We appreciate your contribution and are happy to wait for updates. If you have questions, feel free to ask on the Airflow Slack. |
366cbfc to
9105d17
Compare
|
@potiuk Tests added:Added
Description vs code mismatch clarified:Updated the All 24 tests pass locally. Branch has been rebased onto main. |
Miretpl
left a comment
There was a problem hiding this comment.
Overall, this is a nice change that aligns with the Workers Sets feature. Let's wait a bit and revisit it in the next round after some discussions on the Helm Chart releases (the images section and design of it are one of the action items) are finished.
| in worker-deployment.yaml has already merged celery/set values into Values.workers, so the | ||
| effective image is at Values.workers.image (post-merge). | ||
| */}} | ||
| {{- define "airflow_worker_image" -}} |
There was a problem hiding this comment.
Most of this function copies logic from airflow_image. Maybe there is some nice way to not duplicate code here?
There was a problem hiding this comment.
Sure. I created common helpers for images and tags/digests, and reused them across the image helpers.
| def _all_airflow_images(doc): | ||
| """Extract all airflow images from a rendered worker deployment/statefulset.""" | ||
| init_images = jmespath.search("spec.template.spec.initContainers[*].image", doc) or [] | ||
| container_images = jmespath.search("spec.template.spec.containers[*].image", doc) or [] | ||
| # Filter out git-sync sidecar images (they don't use airflow_worker_image) | ||
| all_images = init_images + container_images | ||
| return [img for img in all_images if "git-sync" not in img] | ||
|
|
||
|
|
||
| def _all_airflow_pull_policies(doc): | ||
| """Extract all imagePullPolicy values from airflow containers.""" | ||
| init_policies = jmespath.search("spec.template.spec.initContainers[*].imagePullPolicy", doc) or [] | ||
| container_policies = jmespath.search("spec.template.spec.containers[*].imagePullPolicy", doc) or [] | ||
| return init_policies + container_policies |
There was a problem hiding this comment.
This added test complexity, which can lead to not-so-straightforward debugging in case of failure. Maybe we could simplify the test cases a little?
There was a problem hiding this comment.
Removed this helpers and to make more straightforward tests
| return init_policies + container_policies | ||
|
|
||
|
|
||
| class TestWorkerImageDefault: |
There was a problem hiding this comment.
Not sure where the images section tests are stored, but I think we should move them there to have all images test in one place.
There was a problem hiding this comment.
Image tests are divided on a per-component basis. I removed the separate file for the worker image and moved the tests to the test_worker.py and test_worker_sets.py files.
| assert image == "celery-custom/airflow@sha256:abcdef1234567890" | ||
|
|
||
|
|
||
| class TestWorkerImagePerSetOverride: |
There was a problem hiding this comment.
All of the tests for sets overwrite of celery, and the workers section is in the workers sets dedicated test file. Could we move these tests there?
|
@Miretpl, I’ve pushed a new commit with the changes resolving all the issues. |
|
@Miretpl OK, let's circle back once the Helm Chart direction is a bit clearer. |
Summary
Since we have the option to configure multiple Celery worker sets, it is useful to be able to use different images for specific sets (for example, to include CUDA drivers). This adds the option to override the image for all Celery workers globally or on a per-set basis using
Values.workers.celery.imageandValues.workers.celery.sets[].image.Changes
chart/templates/_helpers.yaml: Add
airflow_worker_image,airflow_worker_image_pull_policy, andairflow_worker_image_for_migrationshelpers, and patch thefullOverwritehelper to support image overrides.chart/templates/workers/worker-deployment.yaml: Update
imageandimagePullPolicyreferences to use the new helper templates.chart/values.yaml: Add new values to support the image overrides.