Skip to content

[DYNAMO] refactor: extract admin endpoints into AdminAPI Protocol#2397

Draft
AmeenP wants to merge 1 commit intomainfrom
refactor/admin-api-protocol
Draft

[DYNAMO] refactor: extract admin endpoints into AdminAPI Protocol#2397
AmeenP wants to merge 1 commit intomainfrom
refactor/admin-api-protocol

Conversation

@AmeenP
Copy link
Copy Markdown
Contributor

@AmeenP AmeenP commented May 2, 2026

Summary

Pure refactor with no behavior change. Extracts admin-endpoint construction into a typed AdminAPI Protocol so a DynamoAdminAPI follow-up can drop in without touching every call site or scattering if backend == "dynamo" branches across client.py.

What changes

  • AdminAPI Protocol — 7 admin operations (health, list_models, pause, resume, update_weights, load_lora_adapter, init_broadcaster).
  • VLLMAdminAPI — single implementation, byte-identical paths/params/timeouts/raise-for-status to the previous inline code.
  • _DEFAULT_ADMIN: AdminAPI = VLLMAdminAPI() — module singleton so existing call sites need zero changes.
  • Each free admin function takes a keyword-only admin: AdminAPI = _DEFAULT_ADMIN and delegates HTTP construction to it.

Inlined two functions while I was here:

  • _pause_engines / _resume_engines — were one-line gather() wrappers around admin.pause / admin.resume, with a single caller. Inlined into update_weights. Net: -12 lines, callsite gets clearer (the pause→try→resume pattern is visible inline).

What does not change

  • orchestrator.py, scheduler.py, elastic.py are untouched — they pick up VLLMAdminAPI behavior via the default arg.
  • tests/unit/utils/test_client.py is untouched — mock_client.post is called with the same args, just routed through VLLMAdminAPI.load_lora_adapter.
  • No config schema changes. No new TOML fields.

Why this shape (and not alternatives)

Option Why not
Per-endpoint paths in TOML Every deployment spells out 6+ routes. Doesn't address request-shape differences (only paths). Contract lives in config — drifts on rename.
if backend == "dynamo" inside each free function 5+ scattered branches. Same divergence the Protocol localises, but uglier and untyped.
Drop the Protocol, just have two classes Loses the typed seam — callers can't be admin: AdminAPI. Implicit inheritance also creates the risk of accidentally inheriting vLLM-specific behavior.

The Protocol earns its keep now, not hypothetically: there are already two backends (vLLM, Dynamo) with materially different surfaces, and the existing use_token_client, skip_model_check, and bis-branch's use_prefix_cache_salt flags are all the same shape ("backend has different conventions, papered over by a config flag").

Follow-ups (separate PRs)

  1. Add client.backend: Literal["vllm", "dynamo"] = "vllm" config field; pick the right AdminAPI in setup_admin_clients.
  2. Add DynamoAdminAPI with the correct paths for Dynamo's admin frontend (some under /v1/rl/, some not).
  3. Migrate scattered shape flags into backend defaults.

Each is independently shippable. This PR is the foundation.

Verification

  • ruff format + ruff check clean.
  • Hand-traced test_load_lora_adapter_succeeds_on_first_attempt: same mock_client.post("/load_lora_adapter", json=..., timeout=...) call via VLLMAdminAPI.load_lora_adapter.
  • Local uv run pytest is unrunnable due to an unrelated uv.lock parse issue (exclude-newer = "7 days" from chore: add 7-day supply chain cooldown via exclude-newer #2382 + an older uv binary). CI please verify.

Stat

+156 / -75 (one file). The ~30 lines of net growth are the Protocol contract — exactly what the DynamoAdminAPI follow-up will need.

Pure refactor, no behavior change. Sets up the foundation for adding a
DynamoAdminAPI in a follow-up without touching every call site.

Adds:
- AdminAPI Protocol describing 7 admin operations (health, list_models,
  pause, resume, update_weights, load_lora_adapter, init_broadcaster).
- VLLMAdminAPI class implementing it with the *exact* paths, params,
  timeouts, and raise-for-status behavior the previous inline code had.
- Module-level singleton _DEFAULT_ADMIN: AdminAPI = VLLMAdminAPI().

Each free admin function (update_weights, load_lora_adapter,
_pause_engines, _resume_engines, init_nccl_broadcast, check_health,
maybe_check_has_model) now takes a keyword-only `admin: AdminAPI =
_DEFAULT_ADMIN` and delegates HTTP construction to it. Existing call
sites in orchestrator.py, scheduler.py, and elastic.py are unchanged --
they pick up VLLMAdminAPI behavior via the default.

The retry / health-poll / NCCL-404-tolerance logic stays in the free
functions where it always lived. AdminAPI is responsible only for
constructing the per-call HTTP request.

Why this shape:

The previous design assumed every admin endpoint shares the same path
prefix (admin_base_url) and bare relative paths. That works for vLLM
because vLLM aliases /load_lora_adapter and /v1/load_lora_adapter,
masking inconsistencies. It does not work when a backend (e.g. Dynamo's
/v1/rl/*) splits some endpoints out of the shared prefix or uses
different request shapes (token_ids via nvext, native cache_salt, model
listing not under the admin path). Encoding paths in TOML would force
every deployment to spell out 6+ routes and wouldn't help with shape
differences. A backend-typed Protocol localises both concerns.

Existing test (tests/unit/utils/test_client.py) is unchanged: the same
mock_client.post call is made via VLLMAdminAPI.load_lora_adapter with
the same args. Local pytest was unrunnable due to an unrelated uv.lock
parse issue (exclude-newer = "7 days") -- verified by tracing the call
graph manually.
@AmeenP AmeenP force-pushed the refactor/admin-api-protocol branch from 5b085a8 to d21a4ad Compare May 2, 2026 23:13
@AmeenP AmeenP changed the title refactor: extract admin endpoints into AdminAPI Protocol [DYNAMO] refactor: extract admin endpoints into AdminAPI Protocol May 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant