Docs: Add JWT authentication docs and strengthen security model by potiuk · Pull Request #64760 · apache/airflow

potiuk · 2026-04-06T11:31:56Z

Add comprehensive documentation for JWT token authentication and strengthen the security model
documentation to be precise, unambiguous, and aligned with the security team's policies.

Changes

New document: airflow-core/docs/security/jwt_token_authentication.rst

- Detailed technical reference for JWT authentication in both REST API and Execution API
- Token structure, claims, signing modes (symmetric/asymmetric), JWKS support
- Token lifecycle: acquisition, validation, refresh, revocation (REST API only)
- Execution API token flow: generation by executor, delivery to workers, ti:self enforcement
- DFP/Triggerer in-process bypass documentation
- Worker memory protection via PR_SET_DUMPABLE
- Workload isolation limitations and multi-team caveats
- Complete configuration reference with defaults and timings
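
For readers unfamiliar with the token structure and symmetric signing mode listed above, here is a minimal standard-library sketch of an HS256-signed JWT. It is not Airflow's implementation: the function names, the secret, and the `ti-123` subject are hypothetical, and only the registered `sub` and `exp` claims are used.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature using HMAC-SHA256 (HS256)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Check the algorithm, signature, and expiry, then return the claims."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    header = json.loads(_b64url_decode(header_b64))
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if claims.get("exp", 0) <= current:
        raise ValueError("token expired")
    return claims
```

With a symmetric key, every component that can validate a token can also mint one, which is one reason asymmetric keys come up as a hardening measure: validators then only need the public key.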

Updated: airflow-core/docs/security/security_model.rst

- Precise language about database access: workers don't have it, DFP/Triggerer do
- New section "JWT authentication and workload isolation" with current isolation limitations
- New section "Deployment hardening for improved isolation" with actionable guidance
- New section "What is NOT considered a security vulnerability" covering all categories from the security team's canned response policies:
  - DAG author arbitrary code execution
  - Unsanitized input in operators/hooks
  - DFP/Triggerer database access
  - Shared Execution API resources
  - Connection configuration capabilities
  - DoS/self-XSS by authenticated users
  - Simple Auth Manager issues
  - Docker image scan results
  - Automated scanner reports without human verification
- Multi-team feature clearly marked as experimental with no task-level isolation guarantee
- Added ref labels for cross-referencing

Fixed contradictions across documentation:

- configurations-ref.rst / set-config.rst: Removed "use same config across all components" recommendation, replaced with per-component security guidance
- production-deployment.rst: Same fix
- upgrading_to_airflow3.rst: "ensures isolation" → "improves isolation" with DFP/Triggerer caveat
- best-practices.rst: "Complete isolation" → "Strong process-level isolation" for KPO
- public-airflow-interface.rst: "task code" → "worker task code" for DB access restriction
- multi-team.rst: Added "(at the UI and API level)" caveat to resource isolation use case
- config.yml: Updated jwt_secret description with security guidance

Updated: AGENTS.md

- Architecture Boundaries: DFP/Triggerer explicitly documented as having DB access and bypassing JWT
- New Security Model section listing intentional design choices vs actual vulnerabilities
- Full "NOT vulnerabilities" list for AI agents performing security analysis

Updated: .github/instructions/code-review.instructions.md

Clarified DFP/Triggerer "isolation" — they have DB access and bypass JWT by design

Was generative AI tooling used to co-author this PR?

Yes — Claude Code (Claude Opus 4.6)

Generated-by: Claude Code (Claude Opus 4.6) following the guidelines

airflow-core/docs/security/jwt_token_authentication.rst

potiuk · 2026-04-06T15:22:44Z

Added more points from our earlier discussions

ashb · 2026-04-07T09:51:21Z

Taking a look.

ashb

Overall this is great, thanks for picking this up!

Some subtleties, and a few corrections.

There's a lot of duplication between docs. Rather than having a "not a vuln" section in both the public docs and AGENTS.md, can we have AGENTS.md refer/link to the public doc?

AGENTS.md

airflow-core/docs/administration-and-deployment/production-deployment.rst

airflow-core/docs/configurations-ref.rst

airflow-core/docs/security/security_model.rst

Add comprehensive JWT token authentication documentation covering both the REST API and Execution API flows, including token structure, timings, refresh mechanisms, and the DFP/Triggerer in-process bypass.

Update the security model to:
- Document current isolation limitations (DFP/Triggerer DB access, shared Execution API resources, multi-team not guaranteeing task-level isolation)
- Add deployment hardening guidance (per-component config, asymmetric JWT keys, env vars with PR_SET_DUMPABLE protection)
- Add "What is NOT a security vulnerability" section covering all categories from the security team's response policies
- Fix contradicting statements across docs that overstated isolation guarantees or recommended sharing all config across components

Update AGENTS.md with security model awareness so AI agents performing security research distinguish intentional design choices from actual vulnerabilities.
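
As background on the PR_SET_DUMPABLE protection mentioned in the commit message above: on Linux, a process can mark itself non-dumpable via `prctl(2)`. According to the Linux man pages, this restricts ptrace attachment and makes the process's `/proc/PID` files owned by root, so other processes of the same user cannot read them. The sketch below calls `prctl` through `ctypes`; the helper name `disable_dumpable` is hypothetical, and this is not necessarily how Airflow applies the flag.

```python
import ctypes
import sys

# Option numbers from <linux/prctl.h>.
PR_GET_DUMPABLE = 3
PR_SET_DUMPABLE = 4


def disable_dumpable() -> bool:
    """Mark the current process non-dumpable; return True if it took effect."""
    if not sys.platform.startswith("linux"):
        return False  # prctl(2) is Linux-specific, so do nothing elsewhere
    libc = ctypes.CDLL(None, use_errno=True)
    zero = ctypes.c_ulong(0)  # prctl takes unsigned long arguments
    if libc.prctl(PR_SET_DUMPABLE, zero, zero, zero, zero) != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_DUMPABLE) failed")
    # PR_GET_DUMPABLE returns the current flag value as the call's result.
    return libc.prctl(PR_GET_DUMPABLE, zero, zero, zero, zero) == 0
```

The flag is reset to its default when the process executes a new program, so a sketch like this would need to run inside the process that actually holds the secrets.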

- Add dumpable, sandboxing, unsanitized, XSS to spelling wordlist
- Use 'potentially' consistently when describing Dag File Processor and Triggerer database access and JWT authentication bypass, since these are capabilities that Dag author code could exploit rather than guaranteed behaviors of normal operation

New hook `check-security-doc-constants` validates that:
- [section] option references in security RST files match config.yml
- AIRFLOW__X__Y env var references correspond to real config options
- Default values in doc tables match config.yml defaults
- Sensitive config variables are listed (warning, not error, since the list is documented as non-exhaustive)

Loads both airflow-core config.yml and provider.yaml files to cover all config sections (including celery, sentry, workers, etc.). Runs automatically when config.yml or security RST docs are modified.
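
One of the checks described above, matching AIRFLOW__X__Y references against known config options, might look roughly like the sketch below. This is an illustration, not the hook's actual code: `find_unknown_env_vars` is a hypothetical name, and the `known_options` dictionary is a hand-written stand-in for what would be parsed from config.yml and provider.yaml.

```python
import re

# A section or option is a run of upper-case words joined by single
# underscores; a double underscore separates the prefix, section, and option.
_NAME = r"[A-Z0-9]+(?:_[A-Z0-9]+)*"
ENV_VAR_RE = re.compile(rf"\bAIRFLOW__({_NAME})__({_NAME})\b")


def find_unknown_env_vars(doc_text: str, known_options: dict[str, set[str]]) -> list[str]:
    """Return env var references whose section/option pair is not in known_options."""
    unknown = []
    for match in ENV_VAR_RE.finditer(doc_text):
        section = match.group(1).lower()
        option = match.group(2).lower()
        if option not in known_options.get(section, set()):
            unknown.append(match.group(0))
    return unknown


# Illustrative stand-in for the parsed configuration (assumed, not loaded from files).
known = {"api_auth": {"jwt_secret"}, "core": {"fernet_key"}}
text = "Set ``AIRFLOW__API_AUTH__JWT_SECRET`` and ``AIRFLOW__CORE__NOT_A_REAL_OPTION``."
print(find_unknown_env_vars(text, known))
```

A real check would also need to skip placeholder spellings such as the literal `AIRFLOW__X__Y`, which this sketch would report as unknown.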

…date

Update security_model.rst sensitive config variables section:
- List ALL sensitive vars from config.yml and provider.yaml files
- Core vars organized in a table with "Needed by" column mapping each var to the components that require it (API Server, Scheduler, Workers, Dag File Processor, Triggerer)
- Provider vars in a separate table noting they should only be set where the provider functionality is needed
- Tables are auto-generated between AUTOGENERATED markers

Update prek hook to auto-update the sensitive var tables:
- Reads config.yml and all provider.yaml files
- Generates RST list-table content for core and provider sensitive vars
- Replaces content between markers on each run
- Warns when new sensitive vars need component mapping added to the hook
- Validates [section] option and AIRFLOW__X__Y references against config
- Skips autogenerated sections when checking env var references
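
The "replace content between markers" step described above can be sketched as follows. The marker strings, function names, and the sample variable with its "Needed by" value are all hypothetical and chosen only for illustration; the real hook's markers and table layout may differ.

```python
def render_list_table(rows: list[tuple[str, str]]) -> str:
    """Render (variable, needed_by) pairs as an RST list-table."""
    out = [
        ".. list-table::",
        "   :header-rows: 1",
        "",
        "   * - Variable",
        "     - Needed by",
    ]
    for variable, needed_by in rows:
        out += [f"   * - ``{variable}``", f"     - {needed_by}"]
    return "\n".join(out)


def replace_between_markers(text: str, begin: str, end: str, new_body: str) -> str:
    """Replace the lines between the begin and end marker lines, keeping both markers."""
    lines = text.splitlines()
    start = lines.index(begin)  # raises ValueError if the marker is missing
    stop = lines.index(end, start + 1)
    return "\n".join(lines[: start + 1] + new_body.splitlines() + lines[stop:]) + "\n"


# Hypothetical marker strings and table row, for illustration only.
BEGIN = ".. BEGIN AUTOGENERATED"
END = ".. END AUTOGENERATED"
doc = f"Intro\n{BEGIN}\nstale table\n{END}\nOutro\n"
table = render_list_table([("AIRFLOW__EXAMPLE__SECRET", "API Server")])
print(replace_between_markers(doc, BEGIN, END, table))
```

Because the markers are preserved, running the replacement again with the same table leaves the file unchanged, which is what allows a hook to run on every commit without producing spurious diffs.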

Address issues raised in security discussion about the gap between Airflow's isolation promises and reality:
- Clearly distinguish software guards (prevent accidental DB access) from the inability to prevent intentional malicious access by code running as the same Unix user as the parent process
- Document the specific mechanisms: /proc/PID/environ, config files, _CMD commands, secrets manager credential reuse
- Clarify that worker isolation is genuine (no DB credentials at all) while DFP/Triggerer isolation is software-level only
- Add Unix user impersonation as a deployment hardening measure
- Document strategic (API-based DFP/Triggerer) and tactical (user impersonation) planned improvements
- Add warning about sensitive config leakage through task logs
- Add guidance to restrict task log access

…mmend DagBundle

- Reword DFP/Triggerer descriptions to clarify software guards vs intentional bypass
- Extract workload isolation section from jwt_token_authentication into workload.rst
- Recommend Dag Bundle mechanism (GitDagBundle) for DAG synchronization
- Fix typo in public-airflow-interface.rst and broken backtick in jwt_token_authentication.rst
- Update cross-references between security docs

potiuk · 2026-04-07T13:49:34Z

Addressed all comments, @ashb. We might improve and deduplicate it further as a follow-up, but I also re-reviewed it, and it seems to me that we have no "factual" errors and enough guardrails to avoid misunderstanding what we have and what we don't have.

Changes made. I'll re-review when I can, but this is not blocking if someone else reviews the changes in the meantime.

potiuk · 2026-04-07T15:38:14Z

Merging for now; we will refine it before 3.2.1.

… model (#64760)

* Docs: Add JWT authentication docs and strengthen security model
* Fix spelling errors and use 'potentially' for DFP/Triggerer access
* Add prek hook to validate security doc constants against config.yml
* Expand sensitive vars to full list with component mapping and auto-update
* Clarify software guards vs intentional access in DFP/Triggerer
* Docs: Improve security docs wording, extract workload isolation, recommend DagBundle

(cherry picked from commit 0a03b4e)

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>

github-actions · 2026-04-07T15:39:34Z

Backport successfully created: v3-2-test

Note: As of "Merging PRs targeted for Airflow 3.X", the committer who merges the PR is responsible for backporting the PRs that are bug fixes (generally speaking) to the maintenance branches.

When in doubt, please ask in the #release-management Slack channel.

Status	Branch	Result
✅	v3-2-test


… model (#64760) (#64849)

(cherry picked from commit 0a03b4e)

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>

potiuk requested review from amoghrajesh, ashb, choo121600, jason810496, jscheffl, kaxil and shahar1 as code owners April 6, 2026 11:31

boring-cyborg bot added the area:ConfigTemplates, area:dev-tools, backport-to-v3-2-test, and kind:documentation labels Apr 6, 2026

potiuk added this to the Airflow 3.2.0 milestone Apr 6, 2026

jscheffl approved these changes Apr 6, 2026


potiuk requested review from bugraoz93 and gopidesupavan as code owners April 6, 2026 14:49

potiuk mentioned this pull request Apr 6, 2026

verify-action-build: add node_modules verification for vendored deps apache/infrastructure-actions#652

Merged

2 tasks

bugraoz93 approved these changes Apr 6, 2026


ashb previously requested changes Apr 7, 2026


potiuk added 6 commits April 7, 2026 15:30

potiuk force-pushed the docs/jwt-auth-security-model branch from 282a181 to 55f9e84 Compare April 7, 2026 13:47

potiuk requested a review from ashb April 7, 2026 13:48

potiuk merged commit 0a03b4e into apache:main Apr 7, 2026
143 checks passed

potiuk deleted the docs/jwt-auth-security-model branch April 7, 2026 15:38

potiuk merged 6 commits into apache:main from potiuk:docs/jwt-auth-security-model


5 participants
