Skip to content

feat(console): MCP server harness with OAuth 2.1#1303

Draft
vklimontovich wants to merge 7 commits into
newjitsufrom
feat/mcp-server-harness
Draft

feat(console): MCP server harness with OAuth 2.1#1303
vklimontovich wants to merge 7 commits into
newjitsufrom
feat/mcp-server-harness

Conversation

@vklimontovich
Copy link
Copy Markdown
Contributor

Summary

Empty MCP server with a single stub `whoami` tool — proves the auth path
end-to-end. The real Jitsu API tools land in follow-up PRs.

What's in this PR

  • MCP HTTP endpoint at `/mcp/mcp` built directly on `@modelcontextprotocol/sdk`
    (StreamableHTTPServerTransport, stateful, with our own `EventStore` impl over a
    Postgres-backed KV store — no mcp-handler, no Redis dependency).
  • OAuth 2.1 authorization server with metadata discovery (RFC 8414 + 9728),
    Dynamic Client Registration (RFC 7591), PKCE-required authorize, both
    `authorization_code` and `refresh_token` grants. Refresh rotation on every
    refresh.
  • Token model: MCP-issued connections live in the existing `UserApiToken`
    table (refresh, 90-day default, shown on /user). Short-lived access tokens
    (1h) live in a new `OAuthAccessToken` table. MCP-ness is inferred from
    `UserApiToken.oauthClientId IS NOT NULL` — no `type="mcp"`.
  • Cascade delete: revoking an MCP key from `/user` removes its
    `OAuthClient` and all linked `OAuthAccessToken` rows in one transaction.
    Enforced in code (not the DB) so we can move from the current 1:1 model
    (one active token per client) to 1:N later without a schema migration.
  • All logic consolidated into a single `McpServer` class
    (`lib/server/mcp-server/`); every page handler in `pages/api/mcp/*` is a
    1-line wrapper. DI throughout: prisma, kv, baseUrl, getCurrentUser are
    constructor fields. The class never reaches for singletons internally.
  • Console-local `KeyValueStore` (interface + Postgres impl, mirrors the
    one in ee-api) — used for OAuth codes (60s TTL) and the MCP event log
    (1h TTL). No cross-app coupling to ee-api.

Public URL surface

URL Purpose
`/mcp/mcp` MCP Streamable HTTP transport
`/oauth/register` DCR
`/oauth/authorize` Consent UI (Next.js page)
`/oauth/token` Token exchange (both grants)
`/.well-known/oauth-authorization-server` RFC 8414 metadata
`/.well-known/oauth-protected-resource` RFC 9728 metadata

All exposed at top-level paths via `next.config.js` rewrites onto internal
`/api/mcp/*` handlers.

Schema changes

Three additive changes — `prisma db push` covers it (no SQL migrations
tracked in this repo).

```prisma
model UserApiToken { ... oauthClientId String?; oauthClient OAuthClient?; accessTokens OAuthAccessToken[] }
model OAuthClient { id, clientSecretHash, name, redirectUris, ... }
model OAuthAccessToken { id, hash, expiresAt, refreshTokenId, ... }
```

KV table (`newjitsu.kvstore`) is auto-created on first use, no Prisma model.

Test plan

  • `pnpm typecheck` clean
  • `pnpm lint` clean (remaining errors are pre-existing in untracked files)
  • `pnpm db:update-schema` to push the new models
  • `curl https://console.jitsu.localhost/.well-known/oauth-authorization-server\` returns metadata
  • DCR via `curl -X POST .../oauth/register -d '{"client_name":"test","redirect_uris":["http://localhost:6274/callback"]}'` returns `client_id` + `client_secret`
  • Add the server to `mcp-inspector` pointing at `/mcp/mcp`, complete OAuth flow in browser
  • Call `whoami` → returns logged-in user's email
  • `/user` shows the new key with `mcp` type tag and the client name
  • Force-expire access token, let inspector refresh → new `OAuthAccessToken` row, `UserApiToken.hash` rotated
  • Delete the key from `/user` → `OAuthClient` and `OAuthAccessToken` rows gone; retry MCP call → 401

Notes for review

  • The 1:1 token-per-client semantics are enforced in code (not via DB FK).
    Comment in `schema.prisma` on `OAuthClient` explains the rule and how to
    switch to 1:N later without a schema change.
  • DCR is open (no auth) per RFC 7591 — the real gate is consent. Rate
    limiting per IP can be added in a follow-up via the existing rate limiter.
  • Refresh-token replay detection is deferred; we reject invalid refreshes
    but don't auto-revoke the whole chain.

Empty MCP server with a single stub `whoami` tool, mounted at /mcp/mcp, with
the full OAuth 2.1 + DCR + PKCE dance to issue refresh + access tokens that
live in the existing UserApiToken table (refresh) and a new OAuthAccessToken
table (access, 1h). MCP-issued tokens show up on /user — deleting one there
cascades through the access tokens and the OAuthClient row.

All MCP/OAuth logic lives in a single `McpServer` class
(lib/server/mcp-server/) with thin 1-line page handlers in pages/api/mcp/*.
DI throughout: prisma, kv, baseUrl, getCurrentUser are constructor fields;
the singleton wiring at the bottom of index.ts is the only place that
reaches for globals.

Built directly on @modelcontextprotocol/sdk (no mcp-handler) so we can plug
our own EventStore over a console-local Postgres KV store. No new infra
deps: OAuth codes and MCP event log both go through the KV.
Copy link
Copy Markdown

@jitsu-code-review jitsu-code-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a few blocking issues before this can ship:

  1. Deny on /oauth/authorize redirects to an untrusted redirect_uri from query params without any server-side validation, which creates an open redirect (and allows dangerous URI schemes).
  2. OAuth auth codes are not consumed atomically, so concurrent token exchanges can both succeed for the same one-time code.
  3. MCP event replay can drop events created in the same millisecond as lastEventId, violating replay semantics.

Please address these before merge.


const deny = () => {
if (!redirectUri) return;
const url = new URL(redirectUri);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deny() builds a redirect directly from untrusted query params (redirect_uri) without validating it against the registered client. This is an open redirect, and it also permits non-HTTP(S) schemes via new URL(...). Please route deny through server-side validation (same whitelist check as approve) and reject unsafe schemes.

// (PKCE verifier, client secret), but in practice no real client races
// itself on a single auth code.
async consumeCode(code: string): Promise<CodePayload | undefined> {
const payload = (await this.table.get(code)) as CodePayload | undefined;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consumeCode() does get then del, which is not atomic. Two concurrent /oauth/token requests can both read the same code before either delete executes, so one authorization code can be exchanged more than once. Please make consume a single atomic operation (e.g., delete-with-return semantics in the KV backend).

const all = await this.table.list();
const sameStream = all
.map(({ id, obj }) => ({ id, obj: obj as StoredEvent }))
.filter(e => e.obj.streamId === last.streamId && e.obj.createdAt > last.createdAt)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replay filtering uses createdAt > last.createdAt. If multiple events are stored in the same millisecond, events after lastEventId but with equal createdAt are silently skipped. This breaks resume correctness. Please order/filter using a strictly monotonic cursor (or at least (createdAt, eventId) and exclude only the exact lastEventId).

// (PKCE verifier, client secret), but in practice no real client races
// itself on a single auth code.
async consumeCode(code: string): Promise<CodePayload | undefined> {
const payload = (await this.table.get(code)) as CodePayload | undefined;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consumeCode() does get then del, which is not atomic. Two concurrent /oauth/token requests can both read the same code before either delete executes, so one authorization code can be exchanged more than once. Please make consume a single atomic operation (for example, delete-with-return semantics in the KV backend).


const deny = () => {
if (!redirectUri) return;
const url = new URL(redirectUri);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deny() builds a redirect directly from untrusted query params (redirect_uri) without validating it against the registered client. This is an open redirect, and it also permits dangerous URI schemes via new URL(...). Please route deny through server-side validation (same whitelist check as approve) and reject unsafe schemes.

const all = await this.table.list();
const sameStream = all
.map(({ id, obj }) => ({ id, obj: obj as StoredEvent }))
.filter(e => e.obj.streamId === last.streamId && e.obj.createdAt > last.createdAt)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replay filtering uses createdAt > last.createdAt. If multiple events are stored in the same millisecond, events after lastEventId but with equal createdAt are skipped. This breaks resume correctness. Please use a strictly monotonic cursor (or at least (createdAt, eventId) and exclude only the exact lastEventId).



KV store redesigned without the ee-api "getTable" abstraction. Flat Redis-
shaped namespace, callers use prefix conventions (`oauth:code:abc`,
`mcp:event:...`). Atomic ops are first-class — `getDel` (delete-with-return)
and `set({ ifNotExists })` are SQL primitives rather than synthesized from
non-atomic get/del pairs.

Review fixes:
- OAuth codes now consumed via atomic `getDel` — two concurrent /oauth/token
  calls for the same code can't both succeed.
- MCP event IDs are now `<streamId>:<ms-padded>-<uuid>`, lex-sortable per
  stream. Replay uses `scanByPrefix` + strict-`>` on the full ID, so events
  in the same millisecond are distinguished by the UUID tail and never
  silently dropped.
- /oauth/deny goes through a new server-side endpoint that whitelist-checks
  redirect_uri against the OAuthClient (same path approve uses). Scheme also
  validated at DCR time so http(s)-only ever enters the whitelist. Closes
  the open-redirect path that built the URL client-side from raw query
  params.
The 3-fallback chain in mcpServer's wiring belongs in a named function, not
an IIFE inside the singleton call. Adds `getPublicOrigin()` to
lib/server/origin.ts (next to the request-derived host helpers already
there). Falls back through JITSU_PUBLIC_URL / JITSU_PUBLIC / VERCEL_URL /
NEXTAUTH_URL / localhost so callers don't have to handle undefined.

Account-alerts kept its local helper — it intentionally returns undefined
when no public URL is set (to skip email links pointing at localhost).
Different semantics; not collapsing them.
Drop baseUrl and getCurrentUser from the class's Deps. They were pure
utilities wearing a DI suit — getPublicOrigin() and getUser() are stateless
imports; threading them through the constructor was overhead without
payoff. The remaining deps (prisma, kv, ttls) are the things tests would
actually want to swap.

OAuthHandlers.approve calls getUser() directly; OAuthHandlers.*Metadata and
AuthChecker.send401 call getPublicOrigin() inline.
…tstrap

The KV table is now a real Prisma model (`Kv` → `newjitsu.kv`), created
through the same `pnpm db:update-schema` flow as everything else. PgKvStore
takes a `PrismaClient` instead of a `Pool`, uses the typed client for
findUnique/upsert/deleteMany/findMany, and drops to `$queryRaw` only for
the two atomic primitives Prisma can't express:
  - getDel: DELETE ... RETURNING
  - set({ ifNotExists }): INSERT ... ON CONFLICT ... WHERE expires_at <= NOW()

Removes the CREATE TABLE IF NOT EXISTS + lazy initOnce wiring entirely.
At runtime we assume the table exists; migration is the deployer's job,
same as for every other Prisma model.
@vklimontovich vklimontovich marked this pull request as draft May 18, 2026 01:04
Copy link
Copy Markdown

@jitsu-code-review jitsu-code-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found two correctness issues that can lead to user-visible breakage under valid usage patterns. Please address them before merge.

// accumulate.
const issued = await this.deps.prisma.$transaction(async tx => {
const prior = await tx.userApiToken.findMany({
where: { oauthClientId: client.id },
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cleanup is scoped only by oauthClientId, so any new authorization for a shared client_id deletes refresh/access tokens for all users of that client. Since DCR clients are not user-bound and schema allows multiple UserApiToken rows per client, this causes cross-user revocation. Please scope revocation to the current user (userId: codePayload.userId) or enforce a DB-level uniqueness model that matches intended semantics.

): Promise<StreamId> {
const streamId = streamIdFromEventId(lastEventId);
if (!streamId) return "";
const rows = await this.kv.scanByPrefix<{ streamId: StreamId; message: JSONRPCMessage }>(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scanByPrefix is capped at 1000 rows by default, and replay does not paginate. If a stream has >1000 events buffered since lastEventId, older events are silently dropped on resume. Please paginate (or otherwise iterate) until exhausted so replay remains lossless.

- Kv → KvStoreEntry. The two-letter name didn't survive the read of
  `prisma.kv.findUnique(...)` in adjacent code.
- @@index on key with text_pattern_ops so `WHERE key LIKE 'seg1:seg2:%'`
  becomes an index range scan instead of a seq scan. Without this the
  default text ops only help equality, so prefix-scan keyspaces would
  degrade linearly with table size.
- Drop $queryRaw entirely. getDel now uses prisma.delete() (which compiles
  to DELETE ... RETURNING — atomic); set({ifNotExists}) uses create() and
  catches P2002 (unique constraint), which is exactly the atomic semantics
  we want (one winner per key).
Copy link
Copy Markdown

@jitsu-code-review jitsu-code-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two blocking issues found:

  1. MCP refresh tokens can be used as full Console API bearer keys

    • createTokenPair() stores refresh tokens in UserApiToken (webapps/console/lib/server/mcp-server/oauth.ts).
    • Existing auth (webapps/console/lib/api.ts) accepts any UserApiToken row as a bearer credential for /api/*.
    • Result: an OAuth client that receives an MCP refresh token can call normal Console APIs directly with that refresh token, bypassing the MCP access-token boundary and effectively extending broad API access to the refresh token lifetime.
    • Please make MCP refresh tokens non-usable by generic bearer auth (e.g., dedicated table/type guard in getUser, or an explicit deny path for oauthClientId != null).
  2. Authorizing one user revokes other users for the same OAuth client_id

    • In tokenFromCode(), prior tokens are deleted by oauthClientId only (where: { oauthClientId: client.id } in webapps/console/lib/server/mcp-server/oauth.ts).
    • This enforces 1:1 globally per client, not per user, so a second user authorizing the same client invalidates the first user's connection.
    • That is cross-user interference and user-visible breakage for shared client credentials.
    • Please scope replacement to the same user (or explicitly enforce/communicate a different model with constraints that prevent cross-user revocation surprises).

I did not flag style/nit issues.

Codes already used `oauth:code:<id>`; event IDs were
`mcp:event:<streamId>:<ms>-<uuid>` with a stray hyphen. Switch to all
colons for consistency with Redis-style key conventions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant