feature: prime train routes full_finetune TOMLs to hosted endpoint#592

Open
JannikSt wants to merge 9 commits into main from feature/train-hosted-full-finetune
Conversation


@JannikSt (Member) commented May 2, 2026

Note: Medium Risk
Changes how prime train dispatches and deletes runs by introducing a second backend endpoint and typed 404 handling; mis-detection or error mapping could route jobs to the wrong API or change delete behavior.

Overview
prime train now detects full fine-tune TOMLs (type = "full_finetune" or a [deployment] GPU block) and dispatches them to a new hosted endpoint (POST /v1/training/runs) instead of the existing LoRA/shared RL endpoint.

Adds a new HostedTrainingClient + payload builder for full-FT, wires secrets from env/env-files, and intentionally suppresses the returned per-run token from CLI output.

Run deletion now tries the hosted full-FT delete endpoint first and falls back to the existing RL delete path on a typed NotFoundError; the core HTTP client now raises NotFoundError for 404s, and RLRun gains an optional kind discriminator for forward/backward compatibility.

Reviewed by Cursor Bugbot for commit 28fdcd2.

JannikSt added 3 commits May 1, 2026 11:28
…o hosted endpoint

`prime train <toml>` stays the single entry point. When the TOML carries
`type = "full_finetune"` (or a `[hosted]` block, or a `[deployment]`
block matching prime-rl's qwen30b_math/rl.toml shape), the CLI routes
to the new public API at /api/v1/training/runs instead of the LoRA
shared-cluster path. Backwards compatible: configs without these
markers run through the existing flow unchanged.

* api/training.py: new HostedTrainingClient + build_payload_from_toml
  (whitelist-maps prime-rl example fields onto the API payload).
* api/rl.py: surface `kind` on RLRun so `prime train delete` can route
  to the right endpoint based on run kind.
* commands/rl.py: peek the TOML before strict RLConfig parse; on full-FT
  hand off to _dispatch_full_finetune_run with shared env-file/secrets
  plumbing. Delete looks up kind and dispatches via HostedTrainingClient
  for DEDICATED_FULL_FT runs.

Tested end-to-end against local backend on rft-freyr cluster: dispatch
+ status mirroring + completion + delete all clean.

Drop the friction of looking up a cluster cuid for the common single-
cluster setup. Backend now auto-picks the first uncordoned PrimeCluster
when the field is omitted, so `prime train backend/examples/training/reverse_text.toml`
is zero-config.
…eyRef

The platform materialises the per-run RFT API token into the per-run
k8s Secret on dispatch and the chart binds it to the orchestrator pod's
PRIME_API_KEY env var. The token already lives where prime-monitor
needs it — surfacing it on stdout just makes it easy to leak into
shared shell history or CI logs.

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7b88b5f7d0


Comment thread packages/prime/src/prime_cli/commands/rl.py Outdated
Comment thread packages/prime/src/prime_cli/commands/rl.py Outdated
Comment thread packages/prime/src/prime_cli/commands/rl.py Outdated
JannikSt added 3 commits May 2, 2026 16:45
- Drop prime_cluster_id from CLI plumbing entirely. Backend auto-picks
  the first uncordoned PrimeCluster; the CLI never threads a cluster
  id, removing a footgun (mistargeting a config to the wrong cluster).
  Drops the [hosted] discriminator path too — type = 'full_finetune'
  or a [deployment] block remain the only triggers.

- codex P1 / cursor: --output json on the full-FT path no longer
  short-circuits with a {would_dispatch} preview. Mirrors the LoRA
  path's 'create then format' contract — automation that pipes the
  JSON to grab run_id now actually dispatches the run.

- codex P2: env_file (deprecated, singular) is loaded BEFORE env_files
  (canonical, plural) so env_files wins on key collision. Matches the
  LoRA path's documented precedence.

build_payload_from_toml used to whitelist ~12 individual fields; the
backend rebuilt a minimal TOML from them, dropping anything outside
the whitelist (custom optim schedules, eval configs, custom scheduler
params, …). E2E prime-rl runs that depended on those knobs silently
diverged from `uv run rl @ rl.toml` behaviour.

Now: ship the whole TOML as 'config' (companion to platform PR #1824
faa934d56). The backend's build_values takes the config dict directly
so the same TOML works locally with prime-rl and remote-dispatched
through prime-cli with no fork in behaviour.

Unrelated to the full-FT training payload change; accidentally picked
up by 'git add -A'. Restoring inference.py to its 4b9be8f state. The
streaming improvements will land in a sibling PR off main.
Comment thread packages/prime/src/prime_cli/commands/rl.py Outdated
Cursor caught: when rl_client.get_run fails (e.g. pydantic
ValidationError because a DEDICATED_FULL_FT row doesn't carry the
LoRA-required RLRun fields), the prior except APIError block silently
set kind=None and routed delete to the LoRA endpoint. The hosted
helm release + namespace would stay live with no signal back.

Restructure: try hosted_client.delete_run first; on HTTP 404 the
backend's kind gate told us 'not a DEDICATED_FULL_FT', so we fall
through to the LoRA path. Removes the get_run discriminator
roundtrip entirely — and any pydantic surprise it could have raised.
Comment thread packages/prime/src/prime_cli/commands/rl.py Outdated
Cursor flagged: distinguishing 404 via `'HTTP 404' not in str(e)` is
fragile — it depends on the message format never changing AND on no
unrelated error body coincidentally containing the substring.

- Add NotFoundError(APIError) subclass; APIClient raises it for 404
  responses (sync + async paths).
- HostedTrainingClient.create_run / delete_run no longer catch
  Exception and rewrap into a generic APIError — typed APIError
  subclasses propagate so callers can branch by class.
- prime train delete uses except NotFoundError as the 'not a hosted
  run, try LoRA' fallback signal.
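The typed-404 plumbing and the delete fallback can be sketched together (a minimal illustration; the class names mirror the PR's description, everything else here is assumed):

```python
class APIError(Exception):
    """Base error raised by the CLI's HTTP client (sketch)."""


class NotFoundError(APIError):
    """Typed 404 so callers branch by exception class, not by
    searching the message for the 'HTTP 404' substring."""


def raise_for_status(status_code: int, body: str = "") -> None:
    """Sketch of the client's error mapping (shared by sync and async
    paths): 404 becomes NotFoundError, other 4xx/5xx stay APIError."""
    if status_code == 404:
        raise NotFoundError(body)
    if status_code >= 400:
        raise APIError(body)


def delete_training_run(run_id: str, hosted_client, rl_client) -> str:
    """`prime train delete` routing per this commit: try the hosted
    full-FT endpoint first; NotFoundError means 'not a hosted run',
    so fall through to the existing LoRA/RL delete path."""
    try:
        hosted_client.delete_run(run_id)
        return "hosted"
    except NotFoundError:
        rl_client.delete_run(run_id)
        return "rl"
```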

@cursor (bot) left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


Reviewed by Cursor Bugbot for commit 2e1688d.

Comment thread packages/prime/src/prime_cli/commands/rl.py
Cursor caught: `if output != "json" and not yes` meant a researcher
piping JSON for scripting would have full-FT runs auto-launched
without ack. The LoRA path always prompts unless --yes regardless of
output format. Match that contract — gate confirmation purely on
--yes.
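The fix reduces to gating the prompt purely on the flag (a toy sketch; the surrounding CLI wiring is omitted):

```python
def should_prompt(yes: bool) -> bool:
    """Post-fix contract: confirm unless --yes, regardless of --output."""
    return not yes


def should_prompt_buggy(yes: bool, output: str) -> bool:
    """Pre-fix behaviour flagged by Cursor: --output json silently
    skipped the confirmation, auto-launching the run."""
    return output != "json" and not yes
```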
