
Add SEP-2663 Tasks Extension conformance scenarios (+ MRTR↔Tasks composition placeholder) #1

Closed
panyam wants to merge 4 commits into main from feat/tasks-mrtr-extension


Owner

panyam commented May 5, 2026

Proposal

Add server-conformance scenarios for SEP-2663 (Tasks Extension), with incidental coverage of SEP-2575 (per-request capability override) and SEP-2243 (Mcp-Method/Mcp-Name request headers) in the parts of the surface where they bind to tasks.

This is a review PR within the fork. Once it's solid, it will graduate to a Draft PR upstream against modelcontextprotocol/conformance:main.

Relationship to PR 188 (SEP-2322 MRTR by CaitieM20)

This proposal is complementary to modelcontextprotocol#188, not overlapping. SEP-2663 builds on SEP-2322's base types, so a few of the tasks scenarios touch the MRTR shape (inputRequests, requestState) in their tasks-on-the-wire form (status:"input_required" on tasks/get, tasks/update resume path, partial inputResponses fulfillment). The standalone-ephemeral-MRTR coverage stays in 188.

Ordering assumption: SEP-2322 / 188 lands first. The one genuinely new MRTR-adjacent check this proposal contributes is mrtr-tasks-composition (currently SKIPPED — see Open Questions below) which exercises SEP-2663 commit 451f5e1's MRTR→Tasks promotion flow.

Scope (8 ClientScenario classes, ~33 checks)

  • tasks-lifecycle — sync vs task dispatch, DetailedTask shape, tool errors vs protocol errors, cancel ack, cancel-on-terminal -32602
  • tasks-capability-negotiation — extension advertised under capabilities.extensions; tasks/* gated behind negotiation; SEP-2575 per-request opt-in
  • tasks-wire-fields — ttlSeconds / pollIntervalMilliseconds renames, no early TTL expiry, no related-task _meta on inlined result
  • tasks-request-state — optional emission, echo acceptance, stale-but-valid tolerance (the tasks-surface form)
  • tasks-mrtr-input — inputRequests on tasks/get, tasks/update resume, partial-fulfillment with multi-input fixture
  • tasks-request-headers — SEP-2243 server tolerates routing headers; body authoritative when conflicting
  • tasks-dispatch-and-envelope — removed v1 methods (-32601), legacy task param ignored, resultType:"complete" on every non-task response, strong-consistency immediate tasks/get, unknown taskId -32602
  • tasks-status-notifications — optional INFO check (notifications are MAY per spec)
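To make the tasks-capability-negotiation item concrete, here is a minimal sketch of the "extension advertised under capabilities.extensions" check. The extension id io.modelcontextprotocol/tasks is taken from the commit message; the function name and the loose InitializeResultLike type are hypothetical, and the value stored under the key is treated as opaque because this description does not spell out its shape.

```typescript
// Hypothetical helper, not the PR's code: it only checks that the key is present.
const TASKS_EXTENSION_ID = "io.modelcontextprotocol/tasks";

// Loose stand-in for an initialize result; only the fields used here are modelled.
interface InitializeResultLike {
  capabilities?: {
    extensions?: Record<string, unknown>;
  };
}

// True when the server lists the tasks extension under capabilities.extensions.
function advertisesTasksExtension(init: InitializeResultLike): boolean {
  const extensions = init.capabilities?.extensions;
  if (!extensions) return false;
  return Object.prototype.hasOwnProperty.call(extensions, TASKS_EXTENSION_ID);
}
```

A scenario could run this on the initialize result before probing whether tasks/* calls are gated behind negotiation.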

Design highlights

  • Brand-neutral, language-agnostic runner. Fixture wired via TASKS_SERVER_URL / TASKS_SERVER_CMD (and MRTR_SERVER_URL / MRTR_SERVER_CMD); sh -c spawn, TCP-poll readiness, no log-line scanning. Suite is describe.skip'd when env vars are unset, so default everything-server runs stay green until that fixture grows extension support.
  • Raw-fetch escape hatch. SDK's typed schemas strip the SEP-2663 wire fields (resultType, taskId, inputRequests, requestState, inlined result/error). Helpers in src/scenarios/server/tasks/helpers.ts provide initRawSession + rawRequest/rawRequestFull so scenarios read those fields directly. When the SDK gains schemas for SEP-2663 wire shapes, the call sites switch back to client.request(..., AnyResult) and the helper shrinks. (Similar shape to 188's raw-MCP helper — could converge on a shared helper if you'd like.)
  • Registered in pendingClientScenariosList. all-scenarios.test.ts skips them since everything-server doesn't implement the extension yet. CLI lookup (getClientScenario(name)) still finds them.
  • One example reference fixture (any-language is fine): https://github.com/panyam/mcpkit/tree/main/examples/tasks-v2 + https://github.com/panyam/mcpkit/tree/main/examples/mrtr.
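The raw-fetch escape hatch can be pictured roughly as follows. This is a sketch, not the contents of helpers.ts: buildRequest, pickWireFields and rawRequestSketch are illustrative names with assumed signatures, and the fetch call assumes the server answers with a single JSON body rather than an SSE stream.

```typescript
// Sketch only: names and signatures are illustrative, not the PR's helpers.ts.

// Wire fields that, per the description above, the SDK's typed schemas strip.
const WIRE_FIELDS = ["resultType", "taskId", "inputRequests", "requestState", "result", "error"] as const;
type WireField = (typeof WIRE_FIELDS)[number];

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Build a plain JSON-RPC 2.0 request body; params is omitted when not supplied.
function buildRequest(id: number, method: string, params?: Record<string, unknown>): JsonRpcRequest {
  return params === undefined ? { jsonrpc: "2.0", id, method } : { jsonrpc: "2.0", id, method, params };
}

// Copy only the wire fields that are actually present on a raw result object.
function pickWireFields(result: Record<string, unknown>): Partial<Record<WireField, unknown>> {
  const picked: Partial<Record<WireField, unknown>> = {};
  for (const field of WIRE_FIELDS) {
    if (field in result) picked[field] = result[field];
  }
  return picked;
}

// Assumption: the response is a single JSON document, so no SSE parsing is shown.
async function rawRequestSketch(url: string, req: JsonRpcRequest): Promise<Record<string, unknown>> {
  const response = await fetch(url, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Accept: "application/json, text/event-stream",
    },
    body: JSON.stringify(req),
  });
  return (await response.json()) as Record<string, unknown>;
}
```

Bypassing the typed client keeps the untouched JSON available to the checks, which is the point of the escape hatch; once the SDK models these fields the extra layer becomes unnecessary.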

Open spec questions

  1. MRTR resultType discriminator value. SEP-2322's draft uses "input_required"; SEP-2663's draft uses "incomplete". Centralized as MRTR_INCOMPLETE_RESULT_TYPE so it's a one-line flip when SEP authors converge. Tracked at modelcontextprotocol/modelcontextprotocol PR 2663 comment 4381885336 / PR 2322 comment 4381884825 (re: prezaei).

  2. mrtr-tasks-composition. SEP-2663 commit 451f5e1 made the MRTR→Tasks promotion flow normative on the wire: a single tools/call MAY exchange one or more IncompleteResult rounds and then return CreateTaskResult on a subsequent round. Implementing this requires the server middleware to defer task creation until the handler signals async-promotion. The natural alternative (mint the task up-front the moment a tool advertises task support) doesn't fit, because by the time the handler's IsIncomplete signal is observable, the CreateTaskResult is already on the wire. This is a wire-contract requirement, not an SDK-specific implementation choice; existing SDKs across languages that took the up-front pattern will need refactoring before this check can pass anywhere. Combined with open question 1 above, that's why the check is SKIPPED today.
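Both open questions can be sketched together. The constant name MRTR_INCOMPLETE_RESULT_TYPE comes from the PR; the value chosen below is an arbitrary pick between the two draft spellings. Everything else (classifyRound, isWellFormedExchange, isPromotion, and recognising a CreateTaskResult by a string taskId) is an assumption made for illustration, not the suite's actual logic.

```typescript
// Open question 1: the discriminator is unsettled ("incomplete" vs "input_required").
// Picking "incomplete" here is arbitrary; it is a one-line flip either way.
const MRTR_INCOMPLETE_RESULT_TYPE: string = "incomplete";

type RoundKind = "incomplete" | "task" | "complete" | "unknown";

// Assumption: a CreateTaskResult is recognised by carrying a string taskId.
function classifyRound(result: Record<string, unknown>): RoundKind {
  if (result["resultType"] === MRTR_INCOMPLETE_RESULT_TYPE) return "incomplete";
  if (typeof result["taskId"] === "string") return "task";
  if (result["resultType"] === "complete") return "complete";
  return "unknown";
}

// A tools/call exchange is treated as well-formed when zero or more incomplete
// rounds are followed by exactly one terminal round (task or complete).
function isWellFormedExchange(rounds: Record<string, unknown>[]): boolean {
  if (rounds.length === 0) return false;
  const kinds = rounds.map(classifyRound);
  const last = kinds[kinds.length - 1];
  const leading = kinds.slice(0, -1);
  return (last === "task" || last === "complete") && leading.every((k) => k === "incomplete");
}

// Open question 2: promotion means at least one incomplete round, then a task.
function isPromotion(rounds: Record<string, unknown>[]): boolean {
  return (
    isWellFormedExchange(rounds) &&
    rounds.length >= 2 &&
    classifyRound(rounds[rounds.length - 1]) === "task"
  );
}
```

A server that mints the task up-front would answer the first round with a task result, so it could never produce a sequence that isPromotion accepts.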

Testing

TASKS_SERVER_URL=http://localhost:18092/mcp \
TASKS_SERVER_CMD="/path/to/tasks-fixture --serve --addr :18092" \
MRTR_SERVER_URL=http://localhost:18093/mcp \
MRTR_SERVER_CMD="/path/to/mrtr-fixture --serve --addr :18093" \
  npx vitest run src/scenarios/server/
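For orientation, the env-var gating behind that command might look like the sketch below. resolveFixture and FixtureConfig are hypothetical names, and treating the URL as required with the command as optional is an assumption (it allows pointing at an already-running server); the PR's runner may decide differently.

```typescript
// Hypothetical sketch of env-var gating; not the PR's runner code.
interface FixtureConfig {
  url: string;
  cmd?: string; // when absent, assume the server is already running at url
}

// prefix selects the TASKS_* or MRTR_* pair of variables named in the PR.
function resolveFixture(
  env: Record<string, string | undefined>,
  prefix: "TASKS" | "MRTR",
): FixtureConfig | null {
  const url = env[`${prefix}_SERVER_URL`];
  const cmd = env[`${prefix}_SERVER_CMD`];
  if (!url) return null; // nothing configured: the caller skips the suite
  return cmd ? { url, cmd } : { url };
}
```

In a vitest file, a null result would select describe.skip in place of describe, so default runs against everything-server stay green.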

Branch passes:

  • Tasks: 8/8 scenarios (~33 internal checks)
  • MRTR: 1/1 scenario (7 SUCCESS + 1 SKIPPED — see Open Question 2)

against reference Go fixtures. Happy to drop the duplicative MRTR ephemeral checks once 188 lands; the mrtr-tasks-composition skip would rebase onto whatever fixture 188 settles on.

Asks

  1. Confirm the proposed scenario boundaries — split, merge, rename?
  2. Confirm placement (src/scenarios/server/tasks/ + sibling mrtr/ folder; alternatives: extensions/ bucket).
  3. Acknowledge the relationship to 188 — happy to drop the mrtr-ephemeral-flow overlap once 188 lands and rebase the tasks-composition skip onto whatever fixture 188 settles on.

panyam added 4 commits May 5, 2026 14:14
Adds the first scenario for the SEP-2663 io.modelcontextprotocol/tasks
extension — a single TasksLifecycleScenario covering sync vs async
dispatch, DetailedTask shape on tasks/get, tool errors vs protocol
errors, and cancellation semantics. 8 ConformanceCheck records, all
passing against a SEP-2663-conformant Go fixture.

Why "tasks" (not "tasks-v2"): SEP-2663 IS the tasks surface once it
lands; the v2 suffix is only meaningful in implementations that
maintain a v1 surface alongside, which the conformance suite does not.

Layout:
- src/scenarios/server/tasks/lifecycle.ts — scenario class
- src/scenarios/server/tasks/helpers.ts — raw-fetch escape hatch
  (the SDK's typed schemas strip resultType/inputRequests/...)
- src/scenarios/server/tasks/lifecycle.test.ts — fork-local vitest
  runner. Two modes: spawn a fixture binary via MCPKIT_TASKS_BINARY,
  or point at an already-running server via MCPKIT_TASKS_SERVER_URL.
  Skips when neither is set so it doesn't break upstream CI runs that
  go through everything-server (which doesn't yet implement
  io.modelcontextprotocol/tasks).

Scenario is registered in pendingClientScenariosList so
all-scenarios.test.ts skips it; promote to active once the upstream
fixture grows extension support.

Tagged ['extension', DRAFT_PROTOCOL_VERSION] — selectable via
--suite extensions and --spec-version draft.

Builds out the rest of the tasks scenarios (atop the lifecycle canary)
and adds the SEP-2322 ephemeral MRTR scenario in a sibling folder.
Both target their own fixtures; both runners are brand-neutral and
language-agnostic (TASKS_SERVER_URL / TASKS_SERVER_CMD,
MRTR_SERVER_URL / MRTR_SERVER_CMD; readiness via TCP polling).

Tasks ClientScenario classes:
- TasksLifecycleScenario          (8 checks; v2-01..v2-08)
- TasksCapabilityNegotiationScenario (4 checks; v2-11/22/23/25, SEP-2575)
- TasksWireFieldsScenario         (3 checks; v2-12/13/21)
- TasksRequestStateScenario       (3 checks; v2-14/15/28)
- TasksMRTRInputScenario          (3 checks; v2-16/17/29 partial fulfillment)
- TasksRequestHeadersScenario     (3 checks; SEP-2243 request-header tolerance)
- TasksDispatchScenario           (8 checks; v2-09/10/19/20/26/27/30/31)
- TasksStatusNotificationsScenario (1 check; SEP-2663 §notifications, optional)

MRTR ClientScenario class:
- MrtrEphemeralFlowScenario       (7 checks + 1 SKIPPED; mrtr-01..07,
                                   mrtr-08 deferred for spec terminology +
                                   reference-impl reasons)

Both runners spawn the fixture via a shell command and detect readiness
by TCP-polling the URL's host/port — no log-line scanning, no
language-specific assumptions. The same env vars work for any server
implementation.
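A rough sketch of TCP-poll readiness follows, using Node's net module. The function names, the 10-second deadline and the 100 ms interval are assumptions for illustration; the PR's runner may structure this differently.

```typescript
import { connect } from "node:net";

// Derive host and port from the fixture URL, defaulting the port by scheme.
function hostPortFromUrl(rawUrl: string): { host: string; port: number } {
  const url = new URL(rawUrl);
  const port = url.port !== "" ? Number(url.port) : url.protocol === "https:" ? 443 : 80;
  return { host: url.hostname, port };
}

// Resolve true if a TCP connection opens within timeoutMs, false otherwise.
function tryConnect(host: string, port: number, timeoutMs: number): Promise<boolean> {
  return new Promise((resolve) => {
    const socket = connect({ host, port });
    const finish = (ok: boolean) => {
      socket.destroy();
      resolve(ok);
    };
    socket.setTimeout(timeoutMs, () => finish(false));
    socket.once("connect", () => finish(true));
    socket.once("error", () => finish(false));
  });
}

// Poll until the port accepts connections or the deadline passes.
// No log output is read, so this works for a fixture in any language.
async function waitForTcp(rawUrl: string, deadlineMs = 10_000, intervalMs = 100): Promise<void> {
  const { host, port } = hostPortFromUrl(rawUrl);
  const deadline = Date.now() + deadlineMs;
  while (Date.now() < deadline) {
    if (await tryConnect(host, port, intervalMs)) return;
    await new Promise((r) => setTimeout(r, intervalMs));
  }
  throw new Error(`fixture at ${host}:${port} not reachable after ${deadlineMs} ms`);
}
```

An accepted TCP connection only shows that the port is open, not that the MCP endpoint is ready to initialize, so a scenario may still need to tolerate an early failed request.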

Scenarios are tagged ['extension', DRAFT_PROTOCOL_VERSION] and registered
in pendingClientScenariosList so all-scenarios.test.ts (which targets
the upstream everything-server) skips them until the fixture grows
SEP-2322 / SEP-2663 support.

Restructured around ClientScenario classes (one row per class with
check-list under it) rather than per-numbered-test slugs. Documents
fixture requirements, env vars, open spec questions, and the
wire-format diff for each suite.

Per AGENTS.md, severity follows spec keyword (MUST/MUST NOT → FAILURE,
SHOULD/SHOULD NOT → WARNING). The READMEs explain why some checks emit
INFO rather than FAILURE (optional emission paths per SEP-2322).
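The severity rule reads naturally as a small mapping. In this sketch the MUST and SHOULD rows restate the commit message; mapping MAY to INFO is inferred from the optional notification check described in the PR, and the type and function names are hypothetical.

```typescript
// Hypothetical names; the MAY row is an inference from the optional INFO check.
type Severity = "FAILURE" | "WARNING" | "INFO";
type SpecKeyword = "MUST" | "MUST NOT" | "SHOULD" | "SHOULD NOT" | "MAY";

// Severity reported when a check tied to the given spec keyword does not hold.
function severityForKeyword(keyword: SpecKeyword): Severity {
  switch (keyword) {
    case "MUST":
    case "MUST NOT":
      return "FAILURE";
    case "SHOULD":
    case "SHOULD NOT":
      return "WARNING";
    case "MAY":
      return "INFO";
  }
}
```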
Owner Author

panyam commented May 5, 2026

Superseded — opening upstream Draft PR against modelcontextprotocol/conformance instead, which is where the maintainers review. Branch stays alive as the head of the upstream PR.

panyam closed this May 5, 2026