[CI] Flash Attention RDNA CI #2222

Open
micmelesse wants to merge 1 commit into main from micmelesse/rdna_ci_v2

Conversation

@micmelesse (Contributor) commented Mar 9, 2026

Motivation

The Triton backend of Flash Attention is used on consumer cards (RDNA) as well as datacenter cards (CDNA). This PR adds an RDNA CI runner (aiter-gfx1100) to the existing flash_attention_integration workflow via a matrix strategy, so that both architectures are tested in parallel. This is a follow-up to #1974.

Technical Details

Test Plan

Test Result

Submission Checklist

github-actions bot (Contributor) commented Mar 9, 2026

🏷️ CI Guide

Runs automatically on every PR:

  • ✅ Pre-checks (submodule verification, code formatting)
  • ✅ Aiter op tests (gfx942 + gfx950)
  • ✅ Triton tests (only when aiter/ops/triton/** or related paths are changed)

Extended tests (opt-in via labels):

| Label | Tests |
| --- | --- |
| ci:sglang | SGLang integration tests |
| ci:atom | ATOM benchmark (DeepSeek-R1 + GPT-OSS) |
| ci:multi-gpu | Multi-GPU op tests (8 GPU) |
| ci:vllm | vLLM benchmark |
| ci:all | All of the above |

Add labels via the sidebar or with `gh pr edit 2222 --add-label <label>`.

@micmelesse micmelesse marked this pull request as ready for review March 9, 2026 17:21
@micmelesse micmelesse requested review from a team and Copilot March 9, 2026 17:21
Copilot AI (Contributor) left a comment

Pull request overview

Adds RDNA (consumer GPU) coverage to the existing Flash Attention Triton integration workflow by running the same job on both CDNA (MI355) and RDNA3 runners in parallel, improving CI validation for the Triton backend across supported AMD architectures.

Changes:

  • Convert the Triton integration job to a matrix strategy to run on both MI355 and RDNA3 runners.
  • Make job names and benchmark artifacts/logs architecture-specific to avoid collisions across matrix runs.
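
The two changes listed above might look roughly like the following in a GitHub Actions workflow. This is a minimal sketch under stated assumptions, not the PR's actual diff: the job id, the `arch` and `runner` matrix keys, the MI355 runner label, the benchmark script, and the log file name are all hypothetical. Only the `aiter-gfx1100` runner label comes from the PR description.

```yaml
# Sketch only: every name marked "hypothetical" is not taken from the PR.
jobs:
  flash_attention_triton:                       # hypothetical job id
    # Include the architecture in the job name so the two matrix runs are distinguishable.
    name: Flash Attention Triton (${{ matrix.arch }})
    strategy:
      fail-fast: false                          # let one architecture finish even if the other fails
      matrix:
        include:
          - arch: cdna
            runner: hypothetical-mi355-runner   # hypothetical label for the MI355 runner
          - arch: rdna
            runner: aiter-gfx1100               # RDNA runner named in the PR description
    runs-on: ${{ matrix.runner }}
    steps:
      - uses: actions/checkout@v4
      - name: Run benchmark
        # hypothetical script; output goes to a per-architecture log file
        run: |
          ./run_benchmark.sh > "benchmark_${{ matrix.arch }}.log" 2>&1
      - name: Upload benchmark log
        if: always()                            # keep the log even when the benchmark step fails
        uses: actions/upload-artifact@v4
        with:
          name: benchmark-${{ matrix.arch }}    # per-architecture artifact name avoids collisions
          path: benchmark_${{ matrix.arch }}.log
```

Using `matrix.include` keeps each architecture paired with its runner label in one place, and deriving the job name, log file, and artifact name from `matrix.arch` is one way to keep the parallel runs from overwriting each other's outputs.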
