Document or expose deterministic controls for ARC cancellations on owner-approved MCP tools #21341

@GlebGlebovAKAJJ

I am using Codex with a trusted project-local MCP server (mcp-proceset) for automation work. The project owner configured MCP tool approvals, including default_tools_approval_mode = "approve" for the MCP server, and explicitly authorizes specific mutating tool calls in the task prompt.

Even with config-level MCP approvals and explicit user authorization, ARC / the safety monitor can still cancel mutating MCP calls such as set_block_data_value or add_block_link. This is understandable from a safety perspective, but it currently appears non-deterministic from the project owner's point of view and can block legitimate work in a trusted project environment.

Context

  • Codex project uses a custom MCP server, mcp-proceset.
  • The MCP server is configured in project config.toml as a trusted runtime tool provider.
  • Project config uses MCP tool approvals, including default approval for the MCP server's toolset.
  • Some destructive tools remain prompt-gated separately.
  • For normal workflow tools, the user explicitly authorizes concrete operations with tool name, exact args, target IDs, read-back preconditions, and post-check expectations.

Observed behavior

A mutating MCP call can still be cancelled by ARC / safety monitoring after both:

  1. config-level MCP approval, and
  2. explicit user authorization for the exact operation.

Sanitized example of the kind of cancellation:

Tool call was cancelled because of safety risks: The payload shows a sequence of user instructions and an assistant tool call to 'add_block_link' (a mutating MCP method). This is a potentially impactful change to Proceset scripts. The conversation contains many prior directives restricting raw GraphQL, requiring pre/post checks, ARC safeguards, and explicit user confirmations for mutating operations. The assistant's tool call may be premature or unauthorized: we cannot verify preconditions (user permissions, intended script/block ids, publish safety, ARC policy state). Before performing mutating actions the model should confirm explicit user authorization for this specific operation and ensure necessary preconditions.

Another observed class is cancellation of set_block_data_value despite the user having explicitly approved the target script/block/value and the project having MCP approvals configured.

No credentials, private URLs, or private payloads are included here.

Expected behavior / request

It would be helpful to have a documented, configurable way for project owners to manage ARC behavior for trusted MCP servers/toolsets, for example, one of the following:

  • a supported config policy for owner-approved trusted MCP servers;
  • a way to declare a trusted MCP server/toolset and the required preconditions for mutating tools;
  • deterministic policy controls for when ARC can still override default_tools_approval_mode = "approve" and explicit user confirmation;
  • clearer documentation that approval_policy, default_tools_approval_mode, per-tool approval_mode, and user confirmation do not fully suppress ARC cancellations, plus guidance on the intended mitigation path.

Actual behavior

approval_policy, MCP default_tools_approval_mode = "approve", per-tool approvals, and explicit user confirmation do not fully suppress ARC cancellations. When cancellations happen, the project owner has limited visibility into which additional policy condition needs to be satisfied or whether the behavior is an intentional hard limitation.

Impact

Repeated prompts/cancellations can block legitimate automation work in a trusted project environment, especially when using MCP tools that mutate application state but are still part of the owner-approved workflow. The current mitigation is to stop, preserve the cancellation text, narrow the scope and read-back preconditions, and retry only when the user explicitly re-authorizes. This is operationally costly and still not deterministic.
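The mitigation loop described above can be sketched roughly as follows. Every name here (`ToolResult`, `call_with_reauthorization`, the `call_tool` and `reauthorize` callbacks) is hypothetical and not part of Codex or any MCP SDK. The sketch assumes a cancellation can be detected from the tool result and that re-authorization is a yes/no decision made by the user after reading the cancellation text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToolResult:
    cancelled: bool      # True when the call was cancelled by the safety monitor
    message: str = ""    # cancellation text or normal tool output


def call_with_reauthorization(
    call_tool: Callable[[], ToolResult],
    reauthorize: Callable[[str], bool],
    cancellation_log: list[str],
    max_attempts: int = 3,
) -> ToolResult:
    """Run a mutating call, retrying only after explicit re-authorization.

    Each cancellation text is preserved in cancellation_log. The loop stops
    as soon as the user declines or max_attempts calls have been made.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    result = call_tool()
    attempts = 1
    while result.cancelled:
        cancellation_log.append(result.message)  # preserve the cancellation text
        if attempts >= max_attempts or not reauthorize(result.message):
            break  # stop: no retry without explicit user approval
        result = call_tool()
        attempts += 1
    return result


# Demo with a fake tool that is cancelled once, then succeeds.
responses = iter([ToolResult(True, "cancelled: safety risks"), ToolResult(False, "ok")])
log: list[str] = []
final = call_with_reauthorization(lambda: next(responses), lambda text: True, log)
print(final.cancelled, log)
```

Even with such a wrapper the outcome remains non-deterministic from the owner's side: the sketch bounds the retries and keeps an audit trail, but it cannot tell which additional condition ARC expects.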

Question

What is the officially supported way to configure or reason about this?

  • Is there a supported config.toml / requirements.toml policy for trusted MCP servers?
  • Are there recommended declarations for MCP tools with mutating-but-owner-approved semantics?
  • Is ARC intentionally non-bypassable regardless of owner/project config?
  • If so, can the docs clarify this limitation and the recommended workflow for trusted MCP automation?

Thanks.

Labels

  • CLI: Issues related to the Codex CLI
  • config: Issues involving config.toml, config keys, config merging, or config updates
  • enhancement: New feature or request
  • mcp: Issues related to the use of model context protocol (MCP) servers
  • safety-check: Issues related to safety and abuse checks
