Support broadcasting just lora weights and drop support for merged weights #1320

Merged
Jackmin801 merged 44 commits into main from feat-update-lora on Nov 21, 2025

Conversation

@Jackmin801 Jackmin801 (Member) commented Nov 20, 2025

Note

Adds adapter-only filesystem weight broadcasting and runtime LoRA loading, wires LoRA names through the orchestrator and scheduler, removes the merged-weight path, and updates tests/CI accordingly.

  • LoRA broadcasting and loading:
    • Add an adapter_only option to the weight broadcast configs (trainer/rl/config.py); the filesystem broadcaster now saves only the adapter weights and writes adapter_config.json, while the NCCL path disallows adapter-only broadcasts.
    • Introduce get_adapter_state_dict and adapter filename handling in trainer/weights.py; drop the merged-LoRA gathering path and simplify gather_weights_on_master (see the adapter-extraction sketch below).
    • Pass lora_config into the broadcaster setup and persist it with adapter saves.
  • Orchestrator & Scheduler:
    • Add OrchestratorConfig.lora_name and thread it through the Scheduler; use the LoRA name as model_name after updates.
    • update_weights now accepts lora_name and calls the new /v1/load_lora_adapter endpoint, avoiding a base-model reload when using LoRA (see the adapter-load sketch below).
  • Inference (vLLM server):
    • Inject a hack that bypasses the adapter already-loaded check so LoRA adapters can be updated in place.
  • Trainer:
    • Broadcast weights each step using the adapter_only flag; the checkpoint manager is unchanged but aligned with the new adapter flow.
  • Tests/CI:
    • Add integration tests for LoRA RL with dynamic adapter loading; run the LoRA tests separately in CI; capture vLLM/stdout logs for debugging.

Written by Cursor Bugbot for commit 7857428.
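The broadcasting bullets above describe saving only the adapter tensors plus an adapter_config.json that the inference server can consume. Below is a minimal sketch of that flow, assuming PEFT-style lora_ parameter names and safetensors output; get_adapter_state_dict matches the name in the summary, but save_adapter_snapshot, the file names, and passing lora_config as a plain dict are illustrative assumptions rather than the actual trainer/weights.py or filesystem broadcaster code.

```python
import json
from pathlib import Path

import torch
from safetensors.torch import save_file


def get_adapter_state_dict(model: torch.nn.Module) -> dict[str, torch.Tensor]:
    """Return only the LoRA adapter tensors (assumes PEFT-style 'lora_' key names)."""
    return {
        name: param.detach().cpu()
        for name, param in model.state_dict().items()
        if "lora_" in name  # keep lora_A / lora_B (and lora_embedding_*) weights only
    }


def save_adapter_snapshot(model: torch.nn.Module, lora_config: dict, step_dir: Path) -> None:
    """Hypothetical adapter-only save: adapter weights plus adapter_config.json."""
    step_dir.mkdir(parents=True, exist_ok=True)
    save_file(get_adapter_state_dict(model), str(step_dir / "adapter_model.safetensors"))
    (step_dir / "adapter_config.json").write_text(json.dumps(lora_config, indent=2))
```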

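On the orchestrator side, the summary notes that update_weights now points vLLM at the freshly broadcast adapter via /v1/load_lora_adapter instead of reloading the base model. A minimal sketch of that call, assuming vLLM's OpenAI-compatible server with runtime LoRA updating enabled (VLLM_ALLOW_RUNTIME_LORA_UPDATING=True); the function name and call site are illustrative, not the actual orchestrator code.

```python
import httpx


async def load_lora_adapter(server_url: str, lora_name: str, adapter_path: str) -> None:
    """Ask a running vLLM server to load a LoRA adapter from a directory on disk."""
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            f"{server_url}/v1/load_lora_adapter",
            json={"lora_name": lora_name, "lora_path": adapter_path},
            timeout=60.0,
        )
        resp.raise_for_status()
```

Once the adapter is loaded, requests can pass lora_name as the model name, matching the orchestrator's use of the LoRA name as model_name after updates.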
@Jackmin801 Jackmin801 marked this pull request as ready for review November 20, 2025 18:34
Review threads: src/prime_rl/trainer/ckpt.py, tests/integration/lora/test_rl.py (outdated), src/prime_rl/orchestrator/orchestrator.py, tests/conftest.py (outdated)
Jackmin801 and others added 4 commits November 21, 2025 11:13
Comment out the unloading of the LoRA adapter in orchestrator.py.

Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com>
Review threads: src/prime_rl/trainer/rl/broadcast/__init__.py, src/prime_rl/trainer/rl/broadcast/filesystem.py
@mikasenghaas mikasenghaas (Member) left a comment

very clean, i love it. nice that we decouple the full vs adapter only weight saving as well, has been bugging me for a while how entangled these were

Review threads: .github/workflows/gpu_tests.yaml (two threads), src/prime_rl/orchestrator/config.py, src/prime_rl/orchestrator/scheduler.py, src/prime_rl/trainer/rl/config.py (two threads, outdated), src/prime_rl/trainer/rl/train.py
@Jackmin801 Jackmin801 merged commit 2aa47ae into main Nov 21, 2025
5 checks passed