Skip to content

Honor client-resolved service tier defaults#23537

Merged
shijie-oai merged 3 commits into
mainfrom
shijie/better-service-tier-2
May 20, 2026
Merged

Honor client-resolved service tier defaults#23537
shijie-oai merged 3 commits into
mainfrom
shijie/better-service-tier-2

Conversation

@shijie-oai
Copy link
Copy Markdown
Collaborator

@shijie-oai shijie-oai commented May 19, 2026

Why

Model catalog responses can now advertise a nullable default_service_tier for each model. Codex needs to preserve three distinct states all the way from config/app-server inputs to inference:

  • no explicit service tier, so the client may apply the current model catalog default when FastMode is enabled
  • explicit default, meaning the user intentionally wants standard routing
  • explicit catalog tier ids such as priority, flex, or future tiers

Keeping those states distinct prevents the UI from showing one tier while core sends another, especially after model switches or app-server thread/start / turn/start updates.

What Changed

  • Plumbed default_service_tier through model catalog protocol types, app-server model responses, generated schemas, model cache fixtures, and provider/model-manager conversions.
  • Added the request-only default service tier sentinel and normalized legacy config spelling so fast in config.toml still materializes as the runtime/request id priority.
  • Moved catalog default resolution to the TUI/client side, including recomputing the effective service tier when model/FastMode-dependent surfaces change.
  • Updated app-server thread lifecycle config construction so serviceTier: null preserves explicit standard-routing intent by mapping to default instead of internal None.
  • Kept core responsible for validating explicit tiers against the current model and stripping default before /v1/responses, without applying catalog defaults itself.

Validation

  • CARGO_INCREMENTAL=0 cargo build -p codex-cli
  • CARGO_INCREMENTAL=0 cargo test -p codex-app-server model_list
  • cargo test -p codex-tui service_tier
  • cargo test -p codex-protocol service_tier_for_request
  • cargo test -p codex-core get_service_tier
  • RUST_MIN_STACK=8388608 CARGO_INCREMENTAL=0 cargo test -p codex-core service_tier

Comment on lines -43 to -44
/// Toggle the opt-out marker for Codex-managed fast defaults.
SetNoticeFastDefaultOptOut(bool),
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are deprecating this notice

default_input_modalities(),
);
model.service_tiers = vec![ModelServiceTier {
id: ServiceTier::Fast.request_value().to_string(),
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capturing the service-tier fixture note here: ServiceTier::Fast is still a legacy compatibility variant, but ServiceTier::Fast.request_value() resolves to the actual catalog/request id (priority). We discussed spelling priority directly in catalog-shaped tests, but decided not to make that cleanup right now; the important invariant is that the generated id/default value remains priority, not literal fast.

}
}

pub(crate) fn service_tier_update_for_core(
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open for better name - the effective_service_tier indicates what the user choice is and service_tier_update_for_core is what we actually provide to core for inferencing.

@shijie-oai shijie-oai marked this pull request as ready for review May 20, 2026 04:27
@shijie-oai shijie-oai requested a review from a team as a code owner May 20, 2026 04:27
@shijie-oai shijie-oai force-pushed the shijie/better-service-tier-2 branch from a7a2e86 to ed7aa15 Compare May 20, 2026 04:29
Comment on lines +794 to +795
if !fast_mode_enabled {
return None;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

codex:

This early return seems too broad. It drops every configured tier when FastMode is disabled, including explicit flex, even though config parsing and existing tests still treat flex as a valid independent service tier. Should this only suppress managed/catalog Fast defaults, while preserving explicit supported tiers?

Copy link
Copy Markdown
Collaborator Author

@shijie-oai shijie-oai May 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if fast_mode is off, we should respect it no matter what service_tier is populated with.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this a feature flag that is always true?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technically no - I think enterprise are using this to turn off the fast mode in general - we will do another pass to clean that up. I can work with you guys to figure that path forward.

return None;
}

let effective = effective_service_tier(config, model, models);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

codex:

This makes catalog-default resolution a TUI-side responsibility. AppServerSession benefits from it, but direct app-server callers that omit serviceTier still reach core with no resolved default. Should this live below the client layer so backend default_service_tier semantics apply consistently across all entrypoints?

but you mentioned offline that we are ok offloading this responsibility to clients like TUI and app and not doing it in core? double checking

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya I think that was a choice we made to intentionally have the responsibility sit with the client.

@shijie-oai shijie-oai requested a review from aibrahim-oai May 20, 2026 15:15
@shijie-oai
Copy link
Copy Markdown
Collaborator Author

@aibrahim-oai Can you give this a read too? want to make sure you also understand the mechnism behind what we talked about.

Comment thread codex-rs/protocol/src/config_types.rs Outdated
#[serde(rename_all = "lowercase")]
#[strum(serialize_all = "lowercase")]
pub enum ServiceTier {
Default,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is deprecated, do we actually need to add default to it?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point - let me get rid of this and will add a comment about deprecation here.

Comment thread codex-rs/core/src/config/edit.rs Outdated
Comment thread codex-rs/core/src/config/mod.rs Outdated
@shijie-oai shijie-oai force-pushed the shijie/better-service-tier-2 branch 2 times, most recently from 4babfe4 to a77ebfa Compare May 20, 2026 20:18
@shijie-oai shijie-oai force-pushed the shijie/better-service-tier-2 branch from a77ebfa to c9d92b4 Compare May 20, 2026 22:21
@shijie-oai shijie-oai merged commit 370b13a into main May 20, 2026
31 checks passed
@shijie-oai shijie-oai deleted the shijie/better-service-tier-2 branch May 20, 2026 22:57
@github-actions github-actions Bot locked and limited conversation to collaborators May 20, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants