[DYNAMO] feat: example k8s manifests + local smoke tests#2394
Draft
… inference backend
Adds a worked example for running prime-rl with NVIDIA Dynamo as the
inference backend instead of prime-rl's bundled vLLM frontend.
Self-contained, additive-only — touches no existing source code, so
zero risk to existing deployments.
k8s/dynamo-deploy/:
- dynamo-dgd.yaml: Example DynamoGraphDeployment (frontend + vLLM
worker). Requires DYN_ENABLE_RL=true on the Dynamo runtime so
/v1/rl/* admin endpoints are served natively.
- prime-rl-values.yaml: Helm values overlay for k8s/prime-rl that
disables prime-rl's own inference component and points
base_url/admin_base_url at the Dynamo frontend.
- prime-rl-configs.yaml: ConfigMap mounted at /configs in the
orchestrator and trainer pods (used together with the
`<component>.configMap` Helm value -- see #2393).
- admin-stub.yaml: Optional admin-stub Deployment + Service for
older Dynamo builds that don't serve /v1/rl/* natively.
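
To make "points base_url/admin_base_url at the Dynamo frontend" concrete, here is a minimal Python sketch of how an in-cluster URL is usually formed from a Service name and namespace with the standard Kubernetes DNS pattern. The Service name `dynamo-frontend`, port `8000`, the helper names, and the default `cluster.local` cluster domain are all assumptions for illustration; the manifests in this PR may use different values, and any path suffix on the URLs is not specified here.

```python
def frontend_url(service: str, namespace: str, port: int) -> str:
    """Build an in-cluster URL from the <svc>.<ns>.svc.cluster.local DNS name.

    Assumes the default cluster domain (cluster.local).
    """
    return f"http://{service}.{namespace}.svc.cluster.local:{port}"


def client_overrides(service: str, namespace: str, port: int) -> dict:
    """Return both client URLs pointing at the same frontend Service."""
    url = frontend_url(service, namespace, port)
    return {"base_url": url, "admin_base_url": url}


if __name__ == "__main__":
    # Hypothetical values; substitute the Service and namespace you deploy.
    print(client_overrides("dynamo-frontend", "my-namespace", 8000))
```

Pointing both keys at one Service matches the case where Dynamo serves `/v1/rl/*` natively; with the optional admin stub, `admin_base_url` would presumably target the stub's Service instead.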
tools/dynamo/:
- admin_stub.py: Local-dev fallback admin stub (aiohttp). Mirrors
the optional k8s admin-stub for laptop/single-node runs.
- configs/smoke_*.toml: Short (5-step) and long (20-step) RL +
trainer configs pointed at a local Dynamo on localhost:8000.
- run_dynamo.sh / run_smoke_test.sh: Convenience launchers for
a 2-GPU smoke flow (GPU 0 = Dynamo, GPU 1 = trainer).
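
The GPU split described for the smoke flow can be sketched in Python as below. This assumes the processes are pinned through the `CUDA_VISIBLE_DEVICES` environment variable, which is the usual mechanism; the actual shell launchers may do it differently, and the helper name `gpu_env` is hypothetical.

```python
import os
from typing import Dict, Optional


def gpu_env(gpu_index: int, base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment that exposes only one GPU to a child process."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


# GPU 0 for Dynamo, GPU 1 for the trainer, as in the smoke flow.
DYNAMO_ENV = gpu_env(0)
TRAINER_ENV = gpu_env(1)

if __name__ == "__main__":
    print(DYNAMO_ENV["CUDA_VISIBLE_DEVICES"], TRAINER_ENV["CUDA_VISIBLE_DEVICES"])
```

Each dictionary can be passed as the `env` argument of `subprocess.Popen` so that the two processes never compete for the same device.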
All manifests use placeholders for namespace, image, and image-pull
secret -- no secrets, paths, or registry coordinates are baked in.
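
Because the manifests ship with placeholders, it can help to check that none are left unreplaced before applying them. The following is a small sketch, assuming placeholders follow the `<your-...>` spelling that appears in the PR description (for example `<your-namespace>`); the function name and regular expression are illustrative and not part of this PR.

```python
import re
from typing import List, Tuple

# Matches angle-bracket placeholders such as <your-namespace> or <your-registry>.
PLACEHOLDER = re.compile(r"<your-[a-z0-9-]+>")


def find_placeholders(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, placeholder) pairs for every unreplaced placeholder."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


if __name__ == "__main__":
    # Inline sample standing in for a manifest file.
    sample = "namespace: <your-namespace>\nimage: <your-registry>/dynamo:tag\n"
    for lineno, name in find_placeholders(sample):
        print(f"line {lineno}: unreplaced {name}")
```

Running the function over each file in `k8s/dynamo-deploy/` and failing when the result is non-empty gives a cheap guard before `kubectl apply`.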
06b03e3 to 8f1aafe
Summary

Adds a worked example for running prime-rl with NVIDIA Dynamo as the inference backend instead of prime-rl's bundled vLLM frontend. The PR is self-contained: it only adds files under `k8s/dynamo-deploy/` and `tools/dynamo/`, plus a formatting fix for the new local admin stub.

`k8s/dynamo-deploy/`: cluster deployment example

| File | Purpose |
| --- | --- |
| `dynamo-dgd.yaml` | `DynamoGraphDeployment` with frontend + vLLM worker. Requires `DYN_ENABLE_RL=true` so `/v1/rl/*` admin endpoints are served natively. |
| `prime-rl-values.yaml` | Helm values overlay that disables prime-rl's own inference component and points `base_url`/`admin_base_url` at Dynamo. |
| `prime-rl-configs.yaml` | ConfigMap mounted at `/configs` in orchestrator and trainer pods. |
| `admin-stub.yaml` | Optional admin-stub Deployment + Service for older Dynamo builds that don't natively serve `/v1/rl/*`. |

`tools/dynamo/`: local smoke flow

| File | Purpose |
| --- | --- |
| `admin_stub.py` | Local-dev fallback admin stub (aiohttp). |
| `configs/smoke_rl.toml` / `smoke_rl_long.toml` | Short (5-step) and long (20-step) RL configs pointed at a local Dynamo on `localhost:8000`. |
| `configs/smoke_trainer.toml` / `smoke_trainer_long.toml` | Short and long trainer configs for the same runs. |
| `run_dynamo.sh` | Convenience launcher for Dynamo in the 2-GPU smoke flow. |
| `run_smoke_test.sh` | Convenience launcher for the smoke test in the same flow, with a `--long` option. |

Notes

- […] `<component>.configMap`, `imagePullSecrets`, and `tolerations`.
- […] `client.base_url` and `client.admin_base_url` point at […].
- Manifests use placeholders such as `<your-namespace>` and `<your-registry>/...`; no secrets, paths, or registry coordinates are baked in.

What's not in this PR

Extracted from the same upstream branch as #2391 and #2393. Skipped:

- `experimental.use_prefix_cache_salt` opt-out, because upstream went the other direction in "chore: remove prefix-cache-salt and reset-prefix-cache config flags" (#2314).
- `tools/rl_monitor.py` / `tools/rl_plot.py`, which can go in a separate PR if reviewers want them.
- `tools/patches/vllm_lora_retry.py`, a temporary monkeypatch.

Latest Validation

After rebasing onto latest `prime-rl/main`, formatted `tools/dynamo/admin_stub.py` on this base branch so #2394 no longer depends on the follow-up PR for Ruff hygiene. Checks run:

- `uvx ruff==0.13.0 check tools/dynamo/admin_stub.py`
- `uvx ruff==0.13.0 format --check tools/dynamo/admin_stub.py`
- `python -m py_compile tools/dynamo/admin_stub.py`
- `bash -n tools/dynamo/run_dynamo.sh tools/dynamo/run_smoke_test.sh tools/dynamo/run_full_smoke.sh`
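
For anyone who wants to repeat the validation in one step, the four commands listed above could be wrapped in a small Python runner. This is only a sketch: the command lines are copied verbatim from the list, while `run_checks` and its injectable `run` parameter are hypothetical names introduced here so the runner can be exercised without the tools installed.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

# The four validation commands from the list above, as argv lists.
CHECKS: List[List[str]] = [
    ["uvx", "ruff==0.13.0", "check", "tools/dynamo/admin_stub.py"],
    ["uvx", "ruff==0.13.0", "format", "--check", "tools/dynamo/admin_stub.py"],
    ["python", "-m", "py_compile", "tools/dynamo/admin_stub.py"],
    [
        "bash", "-n",
        "tools/dynamo/run_dynamo.sh",
        "tools/dynamo/run_smoke_test.sh",
        "tools/dynamo/run_full_smoke.sh",
    ],
]


def run_checks(
    checks: Sequence[Sequence[str]],
    run: Callable = subprocess.run,
) -> List[Tuple[str, int]]:
    """Run each command in order and return (command string, exit code) pairs."""
    results = []
    for argv in checks:
        completed = run(list(argv))
        results.append((" ".join(argv), completed.returncode))
    return results
```

Calling `run_checks(CHECKS)` from the repository root runs the commands in order; any non-zero exit code in the result marks a failed check.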