
feat: add teacher config/sft loss for hosted training SFT #514

Open
eligotts wants to merge 3 commits into main from eli/hosted-sft

Conversation


@eligotts eligotts commented Apr 14, 2026

Summary

Implements APR-157 SFT distillation support through the existing prime train config path.

  • Adds top-level loss = "rl" | "sft" with [teacher] and [teacher.sampling] config models.
  • Validates that SFT requires a teacher, RL rejects teacher config, and teacher.save = true is not supported.
  • Defaults omitted SFT rollouts_per_example to 1 while preserving explicit overrides.
  • Forwards loss and teacher to the platform in the public payload shape.
  • Updates the confirmation summary so Training, Teacher, and Run Config render as separate sections.
  • Keeps the generated template TOML-safe when SFT teacher config and checkpoint_id are uncommented together.
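
The validation rules listed above can be sketched with stdlib dataclasses. This is a minimal illustration of the described behavior under stated assumptions, not the actual prime-cli models; all class and field names here are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TeacherSampling:
    max_tokens: Optional[int] = None
    reasoning_effort: Optional[str] = None


@dataclass
class TeacherConfig:
    model: str
    save: bool = False
    sampling: Optional[TeacherSampling] = None


@dataclass
class TrainConfig:
    model: str
    loss: str = "rl"  # top-level loss = "rl" | "sft"
    teacher: Optional[TeacherConfig] = None
    rollouts_per_example: Optional[int] = None

    def __post_init__(self) -> None:
        # SFT requires a teacher; RL rejects teacher config.
        if self.loss == "sft" and self.teacher is None:
            raise ValueError('loss = "sft" requires a [teacher] config')
        if self.loss == "rl" and self.teacher is not None:
            raise ValueError('loss = "rl" does not accept a [teacher] config')
        # teacher.save = true is not supported.
        if self.teacher is not None and self.teacher.save:
            raise ValueError("teacher.save = true is not supported")
        # Default omitted SFT rollouts_per_example to 1; keep explicit overrides.
        if self.loss == "sft" and self.rollouts_per_example is None:
            self.rollouts_per_example = 1
```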

Config Shape

model = "openai/gpt-oss-20b"
loss = "sft"

[teacher]
model = "openai/gpt-oss-120b"
save = false

[teacher.sampling]
max_tokens = 2048
reasoning_effort = "medium"

[[env]]
id = "primeintellect/wordle"

API Payload

{
  "loss": "sft",
  "teacher": {
    "model": "openai/gpt-oss-120b",
    "save": false,
    "sampling": {
      "max_tokens": 2048,
      "reasoning_effort": "medium"
    }
  }
}

Tests

  • uv run pytest packages/prime/tests/test_rl_config.py packages/prime/tests/test_train_cli.py -q


eligotts and others added 2 commits May 3, 2026 22:46
Add TeacherRolloutModelConfig to the RL config schema so users can
specify an external teacher model for SFT hard distill via TOML:

  [teacher_rollout_model]
  base_url = ["https://..."]
  api_key_var = "PRIME_API_KEY"
  name = "model-name"

The field flows through the API client to the platform, which merges it
into the orchestrator's run_config as CLI overrides.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
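
The merge described in this commit message — the platform folding the config into the orchestrator's run_config as CLI-style overrides — might look roughly like the following. The function and dictionary names are assumptions, not the platform's actual code:

```python
def merge_overrides(run_config: dict, teacher_rollout_model: dict) -> dict:
    """Merge a [teacher_rollout_model] table into run_config as overrides.

    Existing keys under "teacher_rollout_model" are kept unless the
    override supplies a new value; run_config itself is not mutated.
    """
    merged = dict(run_config)
    merged["teacher_rollout_model"] = {
        **merged.get("teacher_rollout_model", {}),
        **teacher_rollout_model,
    }
    return merged
```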

@cursor (bot) left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.


Reviewed by Cursor Bugbot for commit 5da06aa.

@willccbb willccbb changed the title feat: add teacher_rollout_model config for hosted SFT hard distill feat: add teacher config/sft loss for hosted training SFT May 4, 2026
