Add SEP-2663 Tasks Extension conformance scenarios (+ MRTR↔Tasks composition placeholder)#1
Closed
Adds the first scenario for the SEP-2663 io.modelcontextprotocol/tasks extension: a single TasksLifecycleScenario covering sync vs async dispatch, DetailedTask shape on tasks/get, tool errors vs protocol errors, and cancellation semantics. 8 ConformanceCheck records, all passing against a SEP-2663-conformant Go fixture.

Why "tasks" (not "tasks-v2"): SEP-2663 IS the tasks surface once it lands; the v2 suffix is only meaningful in implementations that maintain a v1 surface alongside, which the conformance suite does not.

Layout:
- src/scenarios/server/tasks/lifecycle.ts: scenario class
- src/scenarios/server/tasks/helpers.ts: raw-fetch escape hatch (the SDK's typed schemas strip resultType/inputRequests/...)
- src/scenarios/server/tasks/lifecycle.test.ts: fork-local vitest runner. Two modes: spawn a fixture binary via MCPKIT_TASKS_BINARY, or point at an already-running server via MCPKIT_TASKS_SERVER_URL. Skips when neither is set so it doesn't break upstream CI runs that go through everything-server (which doesn't yet implement io.modelcontextprotocol/tasks).

Scenario is registered in pendingClientScenariosList so all-scenarios.test.ts skips it; promote to active once the upstream fixture grows extension support.
Tagged ['extension', DRAFT_PROTOCOL_VERSION], so it is selectable via --suite extensions and --spec-version draft.
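To illustrate the raw-fetch escape hatch mentioned in the layout above, here is a minimal sketch of reading the extension fields straight from a raw JSON-RPC response body instead of going through a typed schema. The names `parseRawResponse` and `readExtensionFields` are hypothetical and are not taken from helpers.ts, and the sketch assumes the fields sit at the top level of `result`.

```typescript
// Hypothetical sketch; these names are not from the PR's helpers.ts.
interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number | string | null;
  result?: Record<string, unknown>;
  error?: { code: number; message: string; data?: unknown };
}

interface ExtensionFields {
  resultType: string | undefined;
  taskId: string | undefined;
  inputRequests: unknown;
  requestState: unknown;
}

// Parse the body ourselves so keys unknown to a typed schema survive.
function parseRawResponse(body: string): JsonRpcResponse {
  const parsed: unknown = JSON.parse(body);
  if (
    typeof parsed !== "object" ||
    parsed === null ||
    (parsed as { jsonrpc?: unknown }).jsonrpc !== "2.0"
  ) {
    throw new Error("not a JSON-RPC 2.0 response");
  }
  return parsed as JsonRpcResponse;
}

// Assumption: the extension fields live at the top level of `result`.
function readExtensionFields(response: JsonRpcResponse): ExtensionFields {
  const result = response.result ?? {};
  const asString = (v: unknown): string | undefined =>
    typeof v === "string" ? v : undefined;
  return {
    resultType: asString(result["resultType"]),
    taskId: asString(result["taskId"]),
    inputRequests: result["inputRequests"],
    requestState: result["requestState"],
  };
}
```

A scenario could then assert on `readExtensionFields(...).resultType` directly. Once the SDK's schemas cover these fields, a helper like this would no longer be needed.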
Builds out the rest of the tasks scenarios (atop the lifecycle canary)
and adds the SEP-2322 ephemeral MRTR scenario in a sibling folder.
Both target their own fixtures; both runners are brand-neutral and
language-agnostic (TASKS_SERVER_URL / TASKS_SERVER_CMD,
MRTR_SERVER_URL / MRTR_SERVER_CMD; readiness via TCP polling).
Tasks ClientScenario classes:
- TasksLifecycleScenario (8 checks; v2-01..v2-08)
- TasksCapabilityNegotiationScenario (4 checks; v2-11/22/23/25, SEP-2575)
- TasksWireFieldsScenario (3 checks; v2-12/13/21)
- TasksRequestStateScenario (3 checks; v2-14/15/28)
- TasksMRTRInputScenario (3 checks; v2-16/17/29 partial fulfillment)
- TasksRequestHeadersScenario (3 checks; SEP-2243 request-header tolerance)
- TasksDispatchScenario (8 checks; v2-09/10/19/20/26/27/30/31)
- TasksStatusNotificationsScenario (1 check; SEP-2663 §notifications, optional)
MRTR ClientScenario class:
- MrtrEphemeralFlowScenario (7 checks + 1 SKIPPED; mrtr-01..07,
mrtr-08 deferred for spec terminology +
reference-impl reasons)
Both runners spawn the fixture via a shell command and detect readiness
by TCP-polling the URL's host/port — no log-line scanning, no
language-specific assumptions. The same env vars work for any server
implementation.
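
The TCP-polling readiness check described above could look roughly like the following sketch. The function names, the default deadline, and the polling interval are assumptions made for illustration; the PR's actual runner may differ.

```typescript
import * as net from "node:net";

// Derive host/port from the server URL, falling back to scheme defaults.
function hostPortFromUrl(serverUrl: string): { host: string; port: number } {
  const u = new URL(serverUrl);
  const port =
    u.port !== "" ? Number(u.port) : u.protocol === "https:" ? 443 : 80;
  // URL.hostname keeps brackets around IPv6 literals; net.connect wants none.
  const host = u.hostname.replace(/^\[|\]$/g, "");
  return { host, port };
}

// Attempt one TCP connect; resolves true only if the socket opens.
function tryConnect(host: string, port: number, timeoutMs: number): Promise<boolean> {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (ok: boolean) => {
      socket.destroy();
      resolve(ok);
    };
    socket.setTimeout(timeoutMs, () => finish(false));
    socket.once("connect", () => finish(true));
    socket.once("error", () => finish(false));
  });
}

// Poll until the port accepts a connection or the deadline passes.
async function waitForTcp(
  serverUrl: string,
  deadlineMs = 10_000, // assumed default
  intervalMs = 100, // assumed default
): Promise<void> {
  const { host, port } = hostPortFromUrl(serverUrl);
  const deadline = Date.now() + deadlineMs;
  while (Date.now() < deadline) {
    if (await tryConnect(host, port, intervalMs * 5)) return;
    await new Promise((r) => setTimeout(r, intervalMs));
  }
  throw new Error(`server at ${host}:${port} not ready after ${deadlineMs} ms`);
}
```

Because readiness depends only on the port accepting a connection, the same check works whatever language the fixture is written in and whatever it prints on startup.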
Scenarios are tagged ['extension', DRAFT_PROTOCOL_VERSION] and registered
in pendingClientScenariosList so all-scenarios.test.ts (which targets
the upstream everything-server) skips them until the fixture grows
SEP-2322 / SEP-2663 support.
Restructured around ClientScenario classes (one row per class with check-list under it) rather than per-numbered-test slugs. Documents fixture requirements, env vars, open spec questions, and the wire-format diff for each suite. Per AGENTS.md, severity follows spec keyword (MUST/MUST NOT → FAILURE, SHOULD/SHOULD NOT → WARNING). The READMEs explain why some checks emit INFO rather than FAILURE (optional emission paths per SEP-2322).
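
The keyword-to-severity rule above can be sketched as a small lookup. The type and function names here are hypothetical; the MAY → INFO row reflects the optional-emission checks described in the READMEs.

```typescript
// Hypothetical names; only the keyword-to-severity mapping comes from the text.
type SpecKeyword = "MUST" | "MUST NOT" | "SHOULD" | "SHOULD NOT" | "MAY";
type Severity = "FAILURE" | "WARNING" | "INFO";

// Severity reported when a check tied to the given keyword does not hold.
function severityFor(keyword: SpecKeyword): Severity {
  switch (keyword) {
    case "MUST":
    case "MUST NOT":
      return "FAILURE";
    case "SHOULD":
    case "SHOULD NOT":
      return "WARNING";
    case "MAY":
      // Optional behaviour: absence is reported, not penalised.
      return "INFO";
  }
}
```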
Superseded: opening upstream Draft PR against modelcontextprotocol/conformance instead, which is where the maintainers review. Branch stays alive as the head of the upstream PR.
Proposal
Add server-conformance scenarios for SEP-2663 (Tasks Extension), with incidental coverage of SEP-2575 (per-request capability override) and SEP-2243 (Mcp-Method/Mcp-Name request headers) in the parts of the surface where they bind to tasks.
This is a review-PR within the fork. Once it's solid, it will graduate to a Draft PR upstream against modelcontextprotocol/conformance:main.

Relationship to PR 188 (SEP-2322 MRTR by CaitieM20)
This proposal is complementary to modelcontextprotocol#188, not overlapping. SEP-2663 builds on SEP-2322's base types, so a few of the tasks scenarios touch the MRTR shape (inputRequests, requestState) in their tasks-on-the-wire form (status: "input_required" on tasks/get, tasks/update resume path, partial inputResponses fulfillment). The standalone-ephemeral-MRTR coverage stays in 188.

Ordering assumption: SEP-2322 / 188 lands first. The one genuinely new MRTR-adjacent check this proposal contributes is mrtr-tasks-composition (currently SKIPPED; see Open Questions below), which exercises SEP-2663 commit 451f5e1's MRTR→Tasks promotion flow.

Scope (8 ClientScenario classes, ~33 checks)
- tasks-lifecycle: sync vs task dispatch, DetailedTask shape, tool errors vs protocol errors, cancel ack, cancel-on-terminal -32602
- tasks-capability-negotiation: extension advertised under capabilities.extensions; tasks/* gated behind negotiation; SEP-2575 per-request opt-in
- tasks-wire-fields: ttlSeconds / pollIntervalMilliseconds renames, no early TTL expiry, no related-task _meta on inlined result
- tasks-request-state: optional emission, echo acceptance, stale-but-valid tolerance (the tasks-surface form)
- tasks-mrtr-input: inputRequests on tasks/get, tasks/update resume, partial-fulfillment with multi-input fixture
- tasks-request-headers: SEP-2243 server tolerates routing headers; body authoritative when conflicting
- tasks-dispatch-and-envelope: removed v1 methods (-32601), legacy task param ignored, resultType: "complete" on every non-task response, strong-consistency immediate tasks/get, unknown taskId -32602
- tasks-status-notifications: optional INFO check (notifications are MAY per spec)

Design highlights
- Runners are configured via TASKS_SERVER_URL / TASKS_SERVER_CMD (and MRTR_SERVER_URL / MRTR_SERVER_CMD); sh -c spawn, TCP-poll readiness, no log-line scanning. Suite is describe.skip'd when env vars are unset, so default everything-server runs stay green until that fixture grows extension support.
- The SDK's typed schemas strip the extension wire fields (resultType, taskId, inputRequests, requestState, inlined result/error). Helpers in src/scenarios/server/tasks/helpers.ts provide initRawSession + rawRequest / rawRequestFull so scenarios read those fields directly. When the SDK gains schemas for SEP-2663 wire shapes, the call sites switch back to client.request(..., AnyResult) and the helper shrinks. (Similar shape to 188's raw-MCP helper; could converge on a shared helper if you'd like.)
- Scenarios are registered in pendingClientScenariosList: all-scenarios.test.ts skips them since everything-server doesn't implement the extension yet. CLI lookup (getClientScenario(name)) still finds them.

Open spec questions
1. MRTR resultType discriminator value. SEP-2322's draft uses "input_required"; SEP-2663's draft uses "incomplete". Centralized as MRTR_INCOMPLETE_RESULT_TYPE so it's a one-line flip when SEP authors converge. Tracked at modelcontextprotocol/modelcontextprotocol PR 2663 comment 4381885336 / PR 2322 comment 4381884825 (re: prezaei).
2. mrtr-tasks-composition. SEP-2663 commit 451f5e1 made the MRTR→Tasks promotion flow normative on the wire: a single tools/call MAY exchange one or more IncompleteResult rounds and then return CreateTaskResult on a subsequent round. Implementing this requires the server middleware to defer task creation until the handler signals async-promotion. The natural alternative (mint the task up-front the moment a tool advertises task support) doesn't fit, because by the time the handler's IsIncomplete signal is observable, the CreateTaskResult is already on the wire. This is a wire-contract requirement, not an SDK-specific implementation choice; existing SDKs across languages that took the up-front pattern will need refactoring before this check can pass anywhere. Combined with question 1 above, that's why the check is SKIPPED today.

Testing
Branch passes against reference Go fixtures. Happy to drop the duplicative MRTR ephemeral checks once 188 lands; the mrtr-tasks-composition skip would rebase onto whatever fixture 188 settles on.

Asks
- Feedback on the folder layout (src/scenarios/server/tasks/ + sibling mrtr/ folder; alternatives: extensions/ bucket).
- Drop the mrtr-ephemeral-flow overlap once 188 lands and rebase the tasks-composition skip onto whatever fixture 188 settles on.