Skip to content

Commit 68e05d3

Browse files
committed
feat(appkit): tool primitives and ToolProvider surfaces on core plugins
Second layer of the agents feature. Adds the primitives for defining agent tools and implements them on every core ToolProvider plugin. ### User-facing factories - `tool(config)` — inline function tools backed by a Zod schema. Auto- generates JSON Schema for the LLM via `z.toJSONSchema()` (stripping the top-level `$schema` annotation that Gemini rejects), runtime- validates tool-call arguments, returns an LLM-friendly error string on validation failure so the model can self-correct. - `mcpServer(name, url)` — tiny factory for hosted custom MCP server configs. Replaces the verbose `{ type: "custom_mcp_server", custom_mcp_server: { app_name, app_url } }` wrapper. - `FunctionTool` / `HostedTool` types + `isFunctionTool` / `isHostedTool` type guards. `HostedTool` is a union of Genie, VectorSearch, custom MCP, and external-connection configs. - `ToolkitEntry` + `ToolkitOptions` types + `isToolkitEntry` guard. `AgentTool = FunctionTool | HostedTool | ToolkitEntry` is the canonical union later PRs spread into agent definitions. ### Internal registry + JSON Schema helper - `defineTool(config)` + `ToolRegistry` — plugin authors' internal shape for declaring a keyed set of tools with Zod-typed handlers. - `toolsFromRegistry()` — produces the `AgentToolDefinition[]` exposed via `ToolProvider.getAgentTools()`. - `executeFromRegistry()` — validates args then dispatches to the handler. Returns LLM-friendly errors on bad args. - `toToolJSONSchema()` — shared helper at `packages/appkit/src/plugins/agents/tools/json-schema.ts` that wraps `toJSONSchema()` and strips `$schema`. Used by `tool()`, `toolsFromRegistry()`, and `buildToolkitEntries()`. - `buildToolkitEntries(pluginName, registry, opts?)` — converts a plugin's internal `ToolRegistry` into a keyed record of `ToolkitEntry` markers, honoring `prefix` / `only` / `except` / `rename`. ### MCP client - `AppKitMcpClient` — minimal JSON-RPC 2.0 client over SSE, zero deps. Handles auth refresh, per-server connection pooling, and tool definition aggregation. - `resolveHostedTools()` — maps `HostedTool` configs to Databricks MCP endpoint URLs. ### ToolProvider surfaces on core plugins - **analytics** — `query` tool (Zod-typed, asUser dispatch) - **files** — per-volume tool family: `${volumeKey}.{list,read,exists,metadata,upload,delete}` (dynamically named from the plugin's volume config) - **genie** — per-space tool family: `${alias}.{sendMessage,getConversation}` (dynamically named from the plugin's spaces config) - **lakebase** — `query` tool Each plugin gains `getAgentTools()` + `executeAgentTool()` satisfying the `ToolProvider` interface, plus a `.toolkit(opts?)` method that returns a record of `ToolkitEntry` markers for later spread into agent definitions. ### Test plan - 58 new tests across tool primitives + plugin ToolProvider surfaces - Full appkit vitest suite: 1212 tests passing - Typecheck clean - Build clean, publint clean Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com> ### Zero-trust MCP host policy (S1 security) New `mcp-host-policy.ts` module enforces an allowlist on every MCP URL before the first byte is sent. Same-origin Databricks workspace URLs are admitted by default; any other host must be explicitly trusted via the new `AgentsPluginConfig.mcp.trustedHosts` field (added in a subsequent stack layer). - Rejects non-`http(s)` schemes and plaintext `http://` outside of localhost-in-dev. - Blocks link-local (`169.254/16` — cloud metadata), RFC1918, CGNAT, loopback (unless `allowLocalhost`), ULA, multicast, and IPv4-mapped IPv6 equivalents at DNS-resolve time. IP-literal URLs in these ranges are rejected without a DNS lookup. Malformed IPs fail-closed. - `AppKitMcpClient` constructor now takes the policy as a third arg. Workspace credentials (SP on `initialize`/`tools/list`; caller- supplied OBO on `tools/call`) are never attached to non-workspace hosts — `callTool` drops caller OBO overrides for external destinations, and `sendRpc`/`sendNotification` never invoke `authenticate()` when `forwardWorkspaceAuth` is false. - Constructor accepts optional `{ dnsLookup, fetchImpl }` for test DI. New tests: - `mcp-host-policy.test.ts` (42 tests): config builder, URL check, IP blocklist, DNS-backed resolution with split-DNS defense. - `mcp-client.test.ts` (8 tests): integrated client with recording fetch — verifies no fetch + no `authenticate()` call when URL is rejected, and that auth headers are scoped correctly per-destination. ### Drive-by - `json-schema.ts`: biome formatting fix (pre-existing drift). - `packages/appkit/src/index.ts`: biome organizeImports fix (pre-existing sort order drift). Full appkit vitest suite: 1262 tests passing (+50 from security). Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com> ### SQL read-only enforcement (S2 security) New `sql-policy.ts` module provides `classifyReadOnly(sql)` and `assertReadOnlySql(sql)` — a dependency-free tokenizer-based classifier that rejects any statement outside `SELECT | WITH | SHOW | EXPLAIN | DESCRIBE` at execution time. Also exports `wrapInReadOnlyTransaction(stmt)` which produces a `BEGIN READ ONLY … ROLLBACK` envelope for belt-and-suspenders enforcement on PostgreSQL. Why a hand-rolled tokenizer rather than `node-sql-parser` or `pgsql-parser`: - `node-sql-parser`'s Hive/Spark dialect coverage rejects common Databricks SQL patterns (three-part names, `SHOW TABLES IN`, `DESCRIBE EXTENDED`, `EXPLAIN`); its PostgreSQL grammar rejects the same meta-commands. - `pgsql-parser` (libpg_query) is a native binding and fails to install cleanly on Databricks App runtimes. - We only need statement-type classification, not full parsing. The tokenizer handles line/block comments (nested), single- and double-quoted literals, ANSI/backtick identifiers, PostgreSQL dollar-quoted strings, `E'..'` escape strings, and reports unterminated literals as fail-closed. 62 tests exercise evasion vectors (stacked writes, quoted keywords, comment-hidden writes, mismatched dollar-quote tags, unterminated strings). ### analytics.query — enforced readOnly `analytics.query` was annotated `{ readOnly: true, requiresUserContext: true }` but the annotation was a claim only. A prompt-injected LLM could send `UPDATE`, `DELETE`, or `DROP` and the warehouse would run it subject to the end user's SQL grants. The tool now calls `assertReadOnlySql` before reaching `this.query()`. A rejection surfaces an LLM-friendly error the model can self-correct on; tests verify writes never reach the warehouse. Public `AppKit.analytics.query(...)` continues to accept arbitrary SQL — app authors use it intentionally; LLMs do not. ### lakebase.query — opt-in, truthful annotations, RO transaction wrap `lakebase.query` previously shipped as an always-on agent tool with `{ readOnly: false, destructive: false, idempotent: false }` (`destructive: false` was an outright lie) and executed arbitrary LLM SQL against the SP-scoped pool, auto-inherited by every markdown agent. The plugin now registers **no** agent tool by default. Opt-in via: ```ts lakebase({ exposeAsAgentTool: { iUnderstandRunsAsServicePrincipal: true, readOnly: true, // default }, }); ``` The acknowledgement flag is required because the pool is bound to the service principal regardless of which end user invokes the tool — enabling the tool is a deliberate privilege grant. When opted in with `readOnly: true` (default): - Statement classified by `classifyReadOnly` (rejects non-SELECT with an LLM-friendly error). - Remaining statement executed inside `BEGIN READ ONLY; …; ROLLBACK` so PostgreSQL enforces server-side even if a side-effecting function slips past the classifier. - Annotations: `{ readOnly: true, destructive: false, idempotent: false }`. When opted in with `readOnly: false`: - Statement passed through unchanged. - Annotations: `{ readOnly: false, destructive: true, idempotent: false }`. The `destructive: true` signal will be honored by the agents plugin's HITL approval gate in PR #304. `LakebasePlugin` is now `export class` so tests can construct it directly. New test file `lakebase-agent-tool.test.ts` (9 tests) verifies defaults, opt-in, acknowledgement enforcement, readOnly rejection + wrap, and destructive pass-through. Full appkit vitest suite: 1340 tests passing (+78 from S-2 Layer 1+2). Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
1 parent 5d060a6 commit 68e05d3

31 files changed

Lines changed: 3695 additions & 11 deletions

packages/appkit/src/index.ts

Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,11 +7,20 @@
77

88
// Types from shared
99
export type {
10+
AgentAdapter,
11+
AgentEvent,
12+
AgentInput,
13+
AgentRunContext,
14+
AgentToolDefinition,
1015
BasePluginConfig,
1116
CacheConfig,
1217
IAppRouter,
18+
Message,
1319
PluginData,
1420
StreamExecutionSettings,
21+
Thread,
22+
ThreadStore,
23+
ToolProvider,
1524
} from "shared";
1625
export { isSQLTypeMarker, sql } from "shared";
1726
export { CacheManager } from "./cache";
@@ -54,6 +63,21 @@ export {
5463
toPlugin,
5564
} from "./plugin";
5665
export { analytics, files, genie, lakebase, server, serving } from "./plugins";
66+
export {
67+
type FunctionTool,
68+
type HostedTool,
69+
isFunctionTool,
70+
isHostedTool,
71+
mcpServer,
72+
type ToolConfig,
73+
tool,
74+
} from "./plugins/agents/tools";
75+
export {
76+
type AgentTool,
77+
isToolkitEntry,
78+
type ToolkitEntry,
79+
type ToolkitOptions,
80+
} from "./plugins/agents/types";
5781
export type {
5882
EndpointConfig,
5983
ServingEndpointEntry,
Lines changed: 62 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,62 @@
1+
import type { AgentToolDefinition } from "shared";
2+
import type { ToolRegistry } from "./tools/define-tool";
3+
import { toToolJSONSchema } from "./tools/json-schema";
4+
import type { ToolkitEntry, ToolkitOptions } from "./types";
5+
6+
/**
7+
* Converts a plugin's internal `ToolRegistry` into a keyed record of
8+
* `ToolkitEntry` markers suitable for spreading into an `AgentDefinition.tools`
9+
* record.
10+
*
11+
* The `opts` record controls shape and filtering:
12+
* - `prefix` — overrides the default `${pluginName}.` prefix; `""` drops it.
13+
* - `only` — allowlist of local tool names to include (post-prefix).
14+
* - `except` — denylist of local names.
15+
* - `rename` — per-tool key remapping (applied after prefix/filter).
16+
*
17+
* Each entry carries `pluginName` + `localName` so the agents plugin can
18+
* dispatch back through `PluginContext.executeTool` for OBO + telemetry.
19+
*/
20+
export function buildToolkitEntries(
21+
pluginName: string,
22+
registry: ToolRegistry,
23+
opts: ToolkitOptions = {},
24+
): Record<string, ToolkitEntry> {
25+
const prefix = opts.prefix ?? `${pluginName}.`;
26+
const only = opts.only ? new Set(opts.only) : null;
27+
const except = opts.except ? new Set(opts.except) : null;
28+
const rename = opts.rename ?? {};
29+
30+
const out: Record<string, ToolkitEntry> = {};
31+
32+
for (const [localName, entry] of Object.entries(registry)) {
33+
if (only && !only.has(localName)) continue;
34+
if (except?.has(localName)) continue;
35+
36+
const keyAfterPrefix = `${prefix}${localName}`;
37+
const key = rename[localName] ?? keyAfterPrefix;
38+
39+
const parameters = toToolJSONSchema(
40+
entry.schema,
41+
) as unknown as AgentToolDefinition["parameters"];
42+
43+
const def: AgentToolDefinition = {
44+
name: key,
45+
description: entry.description,
46+
parameters,
47+
};
48+
if (entry.annotations) {
49+
def.annotations = entry.annotations;
50+
}
51+
52+
out[key] = {
53+
__toolkitRef: true,
54+
pluginName,
55+
localName,
56+
def,
57+
annotations: entry.annotations,
58+
};
59+
}
60+
61+
return out;
62+
}
Lines changed: 75 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,75 @@
1+
import { describe, expect, test } from "vitest";
2+
import { z } from "zod";
3+
import { buildToolkitEntries } from "../build-toolkit";
4+
import { defineTool, type ToolRegistry } from "../tools/define-tool";
5+
import { isToolkitEntry } from "../types";
6+
7+
const registry: ToolRegistry = {
8+
query: defineTool({
9+
description: "Run a query",
10+
schema: z.object({ sql: z.string() }),
11+
handler: () => "ok",
12+
}),
13+
history: defineTool({
14+
description: "Get query history",
15+
schema: z.object({}),
16+
handler: () => [],
17+
}),
18+
};
19+
20+
describe("buildToolkitEntries", () => {
21+
test("produces ToolkitEntry per registry item with default dotted prefix", () => {
22+
const entries = buildToolkitEntries("analytics", registry);
23+
expect(Object.keys(entries).sort()).toEqual([
24+
"analytics.history",
25+
"analytics.query",
26+
]);
27+
for (const entry of Object.values(entries)) {
28+
expect(isToolkitEntry(entry)).toBe(true);
29+
expect(entry.pluginName).toBe("analytics");
30+
}
31+
});
32+
33+
test("respects prefix option (empty drops the namespace)", () => {
34+
const entries = buildToolkitEntries("analytics", registry, { prefix: "" });
35+
expect(Object.keys(entries).sort()).toEqual(["history", "query"]);
36+
});
37+
38+
test("respects custom prefix", () => {
39+
const entries = buildToolkitEntries("analytics", registry, {
40+
prefix: "db.",
41+
});
42+
expect(Object.keys(entries).sort()).toEqual(["db.history", "db.query"]);
43+
});
44+
45+
test("only filter keeps the listed local names", () => {
46+
const entries = buildToolkitEntries("analytics", registry, {
47+
only: ["query"],
48+
});
49+
expect(Object.keys(entries)).toEqual(["analytics.query"]);
50+
});
51+
52+
test("except filter drops the listed local names", () => {
53+
const entries = buildToolkitEntries("analytics", registry, {
54+
except: ["history"],
55+
});
56+
expect(Object.keys(entries)).toEqual(["analytics.query"]);
57+
});
58+
59+
test("rename remaps specific local names (overrides the prefix key)", () => {
60+
const entries = buildToolkitEntries("analytics", registry, {
61+
rename: { query: "sql" },
62+
});
63+
expect(Object.keys(entries).sort()).toEqual(["analytics.history", "sql"]);
64+
});
65+
66+
test("exposes the original plugin+local name so dispatch can route", () => {
67+
const entries = buildToolkitEntries("analytics", registry, {
68+
prefix: "db.",
69+
});
70+
const qEntry = entries["db.query"];
71+
expect(qEntry.pluginName).toBe("analytics");
72+
expect(qEntry.localName).toBe("query");
73+
expect(qEntry.def.name).toBe("db.query");
74+
});
75+
});
Lines changed: 133 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,133 @@
1+
import { describe, expect, test, vi } from "vitest";
2+
import { z } from "zod";
3+
import {
4+
defineTool,
5+
executeFromRegistry,
6+
type ToolRegistry,
7+
toolsFromRegistry,
8+
} from "../tools/define-tool";
9+
10+
describe("defineTool()", () => {
11+
test("returns an entry matching the input config", () => {
12+
const entry = defineTool({
13+
description: "echo",
14+
schema: z.object({ msg: z.string() }),
15+
annotations: { readOnly: true },
16+
handler: ({ msg }) => msg,
17+
});
18+
19+
expect(entry.description).toBe("echo");
20+
expect(entry.annotations).toEqual({ readOnly: true });
21+
expect(typeof entry.handler).toBe("function");
22+
});
23+
});
24+
25+
describe("executeFromRegistry", () => {
26+
const registry: ToolRegistry = {
27+
echo: defineTool({
28+
description: "echo",
29+
schema: z.object({ msg: z.string() }),
30+
handler: ({ msg }) => `got ${msg}`,
31+
}),
32+
};
33+
34+
test("validates args and calls handler on success", async () => {
35+
const result = await executeFromRegistry(registry, "echo", { msg: "hi" });
36+
expect(result).toBe("got hi");
37+
});
38+
39+
test("returns formatted error string on validation failure", async () => {
40+
const result = await executeFromRegistry(registry, "echo", {});
41+
expect(typeof result).toBe("string");
42+
expect(result).toContain("Invalid arguments for echo");
43+
expect(result).toContain("msg");
44+
});
45+
46+
test("throws for unknown tool names", async () => {
47+
await expect(executeFromRegistry(registry, "missing", {})).rejects.toThrow(
48+
/Unknown tool: missing/,
49+
);
50+
});
51+
52+
test("forwards AbortSignal to the handler", async () => {
53+
const handler = vi.fn(async (_args: { x: string }, signal?: AbortSignal) =>
54+
signal?.aborted ? "aborted" : "ok",
55+
);
56+
const reg: ToolRegistry = {
57+
t: defineTool({
58+
description: "t",
59+
schema: z.object({ x: z.string() }),
60+
handler,
61+
}),
62+
};
63+
64+
const controller = new AbortController();
65+
controller.abort();
66+
await executeFromRegistry(reg, "t", { x: "hi" }, controller.signal);
67+
68+
expect(handler).toHaveBeenCalledTimes(1);
69+
expect(handler.mock.calls[0][1]).toBe(controller.signal);
70+
});
71+
});
72+
73+
describe("toolsFromRegistry", () => {
74+
test("produces AgentToolDefinition[] with JSON Schema parameters", () => {
75+
const registry: ToolRegistry = {
76+
query: defineTool({
77+
description: "Execute a SQL query",
78+
schema: z.object({
79+
query: z.string().describe("SQL query"),
80+
}),
81+
annotations: { readOnly: true, requiresUserContext: true },
82+
handler: () => "ok",
83+
}),
84+
};
85+
86+
const defs = toolsFromRegistry(registry);
87+
expect(defs).toHaveLength(1);
88+
expect(defs[0].name).toBe("query");
89+
expect(defs[0].description).toBe("Execute a SQL query");
90+
expect(defs[0].parameters).toMatchObject({
91+
type: "object",
92+
properties: {
93+
query: { type: "string", description: "SQL query" },
94+
},
95+
required: ["query"],
96+
});
97+
expect(defs[0].annotations).toEqual({
98+
readOnly: true,
99+
requiresUserContext: true,
100+
});
101+
});
102+
103+
test("preserves dotted names like uploads.list from the registry keys", () => {
104+
const registry: ToolRegistry = {
105+
"uploads.list": defineTool({
106+
description: "list uploads",
107+
schema: z.object({}),
108+
handler: () => [],
109+
}),
110+
"documents.list": defineTool({
111+
description: "list documents",
112+
schema: z.object({}),
113+
handler: () => [],
114+
}),
115+
};
116+
117+
const names = toolsFromRegistry(registry).map((d) => d.name);
118+
expect(names).toContain("uploads.list");
119+
expect(names).toContain("documents.list");
120+
});
121+
122+
test("omits annotations when none are provided", () => {
123+
const registry: ToolRegistry = {
124+
plain: defineTool({
125+
description: "plain",
126+
schema: z.object({}),
127+
handler: () => "ok",
128+
}),
129+
};
130+
const [def] = toolsFromRegistry(registry);
131+
expect(def.annotations).toBeUndefined();
132+
});
133+
});

0 commit comments

Comments
 (0)