
Commit 2593039

Authored by samchon and Copilot

feat(website): the harness wording on llm module. (#1800)

* feat(website): the harness wording on llm module.
* fix remove
* do not abuse harness wording much a lot
* fix(website): correct grammar in chat.mdx harness section (#1802)
* Initial plan
* fix(website): correct grammar in chat.mdx harness section
* fix: add missing `tags` import in function-calling-harness.md example (#1801)
* Initial plan
* fix: add tags to import in function-calling-harness.md example

Co-authored-by: samchon <13158709+samchon@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
1 parent c99421d commit 2593039

File tree

16 files changed: +1017 −56 lines changed

README.md

Lines changed: 8 additions & 4 deletions

@@ -23,12 +23,12 @@ export namespace json {
   export function assertStringify<T>(input: T): string; // safe and faster
 }
 
-// AI FUNCTION CALLING SCHEMA
+// AI FUNCTION CALLING HARNESS
 export namespace llm {
   // collection of function calling schemas + validators/parsers
   export function application<Class>(): ILlmApplication<Class>;
   export function structuredOutput<P>(): ILlmStructuredOutput;
-  // lenient json parser + type corecion
+  // lenient json parser + type coercion
   export function parse<T>(str: string): T;
 }

@@ -47,7 +47,7 @@ export function random<T>(g?: Partial<IRandomGenerator>): T;
 
 - Super-fast Runtime Validators
 - Enhanced JSON schema and serde functions
-- LLM function calling schema and structured output
+- LLM function calling harness
 - Protocol Buffer encoder and decoder
 - Random data generator

@@ -56,6 +56,7 @@ export function random<T>(g?: Partial<IRandomGenerator>): T;
 > - **Only one line** required, with pure TypeScript type
 > - Runtime validator is **20,000x faster** than `class-validator`
 > - JSON serialization is **200x faster** than `class-transformer`
+> - LLM function calling harness turns **6.75% → 100%** accuracy
 
 ## Transformation
 If you call `typia` function, it would be compiled like below.

@@ -121,7 +122,7 @@ Check out the document in the [website](https://typia.io/docs/):
 - [JSON Schema](https://typia.io/docs/json/schema/)
 - [`stringify()` functions](https://typia.io/docs/json/stringify/)
 - [`parse()` functions](https://typia.io/docs/json/parse/)
-- LLM Function Calling
+- LLM Function Calling Harness
   - [`application()` function](https://typia.io/docs/llm/application/)
   - [`structuredOutput()` function](https://typia.io/docs/llm/structuredOutput/)
   - [`HttpLlm` module](https://typia.io/docs/llm/http/)

@@ -136,6 +137,9 @@ Check out the document in the [website](https://typia.io/docs/):
 ### 🔗 Appendix
 - [API Documents](https://typia.io/api)
 - Utilization Cases
+  - [MCP](https://typia.io/docs/utilization/mcp/)
+  - [Vercel AI SDK](https://typia.io/docs/utilization/vercel/)
+  - [LangChain](https://typia.io/docs/utilization/langchain/)
   - [NestJS](https://typia.io/docs/utilization/nestjs/)
   - [tRPC](https://typia.io/docs/utilization/trpc/)
 - [⇲ Benchmark Result](https://github.com/samchon/typia/tree/master/benchmark/results/11th%20Gen%20Intel(R)%20Core(TM)%20i5-1135G7%20%40%202.40GHz)

website/articles/function-calling-harness.md

Lines changed: 882 additions & 0 deletions
Large diffs are not rendered by default.

website/src/content/docs/_meta.ts

Lines changed: 0 additions & 1 deletion

@@ -34,4 +34,3 @@ export default {
     href: "https://dev.to/samchon/series/22474",
   },
 } satisfies MetaRecord;
-

website/src/content/docs/index.mdx

Lines changed: 6 additions & 3 deletions

@@ -66,12 +66,12 @@ export namespace json {
   export function assertStringify<T>(input: T): string; // safe and faster
 }
 
-// AI FUNCTION CALLING SCHEMA
+// AI FUNCTION CALLING HARNESS
 export namespace llm {
   // collection of function calling schemas + validators/parsers
   export function application<Class>(): ILlmApplication<Class>;
   export function structuredOutput<P>(): ILlmStructuredOutput;
-  // lenient json parser + type corecion
+  // lenient json parser + type coercion
   export function parse<T>(str: string): T;
 }

@@ -90,7 +90,7 @@ export function random<T>(g?: Partial<IRandomGenerator>): T;
 
 - Super-fast Runtime Validators
 - Enhanced JSON functions
-- LLM function calling schema and structured output
+- LLM function calling harness
 - Protocol Buffer encoder and decoder
 - Random data generator

@@ -105,6 +105,9 @@ export function random<T>(g?: Partial<IRandomGenerator>): T;
 <Alert severity="info">
 JSON serialization is **200x faster** than `class-transformer`
 </Alert>
+<Alert severity="success">
+LLM function calling harness turns **6.75% → 100%** accuracy
+</Alert>
 </Stack>
 
 ## Transformation

website/src/content/docs/llm/_meta.ts

Lines changed: 1 addition & 1 deletion

@@ -2,7 +2,7 @@ import { MetaRecord } from "nextra";
 
 export default {
   application: "application() function",
-  structuredOutput: "structuredOutput() function",
+  structuredOutput: "structuredOutput()",
   parameters: "parameters() function",
   schema: "schema() function",
   http: "HttpLlm module",

website/src/content/docs/llm/application.mdx

Lines changed: 21 additions & 5 deletions

@@ -62,7 +62,7 @@ export namespace llm {
 
 LLM function calling application schema from a native TypeScript class or interface type.
 
-`typia.llm.application<App>()` is a function composing LLM (Large Language Model) calling application schema from a native TypeScript class or interface type. The function returns an `ILlmApplication` instance, which is a data structure representing a collection of LLM function calling schemas.
+`typia.llm.application<App>()` is a function composing LLM (Large Language Model) calling application schema from a native TypeScript class or interface type. The function returns an `ILlmApplication` instance, which is a data structure representing a collection of LLM function calling schemas — each with built-in `parse()`, `coerce()`, and `validate()` methods.
 
 If you put LLM function schema instances registered in the `ILlmApplication.functions` to the LLM provider like `OpenAI ChatGPT`, the LLM will select a proper function to call with parameter values of the target function in the conversations with the user. This is the "LLM Function Calling".

@@ -207,7 +207,17 @@ registerMcpControllers({
 </Tabs.Tab>
 </Tabs>
 
-## Lenient JSON Parsing
+## The Function Calling Harness
+
+The **function calling harness** is typia's three-layer pipeline that turns unreliable LLM output into 100% correct structured data:
+
+1. **Lenient JSON Parsing** — recovers broken JSON (unclosed brackets, trailing commas, markdown wrapping, etc.)
+2. **Type Coercion** — fixes wrong types (`"42"` → `42`, double-stringified objects → objects, etc.)
+3. **Validation Feedback** — pinpoints remaining value errors with inline `// ❌` annotations so the LLM can self-correct and retry
+
+Each layer catches what the previous one didn't. Together they form a deterministic correction loop around the probabilistic LLM.
+
+### Lenient JSON Parsing & Type Coercion
 
 <Tabs items={[
 "Parsing Example",

@@ -266,7 +276,7 @@ Some LLM SDKs (Anthropic, Vercel AI, LangChain, MCP) parse JSON internally and r
 For more details, see [JSON Utilities](./json).
 </Callout>
 
-## Validation Feedback
+### Validation Feedback
 
 <LocalSource
 path="examples/src/llm/application-validate.ts"

@@ -296,15 +306,21 @@ For more details, see [JSON Utilities](./json).
 }
 ```
 
-The LLM reads this feedback and self-corrects on the next turn.
+The LLM reads this feedback and self-corrects on the next turn. Together with lenient parsing and type coercion above, this parse → coerce → validate → feedback → retry cycle completes the harness.
 
-In the [AutoBe](https://github.com/wrtnlabs/autobe) project (AI-powered backend code generator), `qwen3-coder-next` showed only 6.75% raw function calling success rate on compiler AST types. However, with validation feedback, it reached 100%.
+<Callout type="info">
+**In Production**
+
+In the [AutoBe](https://github.com/wrtnlabs/autobe) project (AI-powered backend code generator by [Wrtn Technologies](https://wrtn.io)), `qwen3-coder-next` showed only **6.75%** raw function calling success rate on compiler AST types. With the complete harness, it reached **100%** — across all four tested Qwen models.
+
+AutoBe once shipped a build with the system prompt completely missing. Nobody noticed — output quality was identical. The types were the best prompt; the harness was the best orchestration.
 
 Working on compiler AST means working on any type and any use case.
 
 - [AutoBeDatabase](https://github.com/wrtnlabs/autobe/blob/main/packages/interface/src/database/AutoBeDatabase.ts)
 - [AutoBeOpenApi](https://github.com/wrtnlabs/autobe/blob/main/packages/interface/src/openapi/AutoBeOpenApi.ts)
 - [AutoBeTest](https://github.com/wrtnlabs/autobe/blob/main/packages/interface/src/test/AutoBeTest.ts)
+</Callout>
 
 ```typescript filename="AutoBeTest.IExpression" showLineNumbers
 // Compiler AST may be the hardest type structure possible

website/src/content/docs/llm/chat.mdx

Lines changed: 3 additions & 3 deletions

@@ -225,15 +225,15 @@ export const correctFunctionCall = (p: {
 }
 ```
 
-Is LLM function calling perfect?
+Is LLM function calling perfect?
 
-The answer is not, and LLM (Large Language Model) vendors like OpenAI take a lot of type level mistakes when composing the arguments of the target function to call. Even though an LLM function calling schema has defined an `Array<string>` type, LLM often fills it just by a `string` typed value.
+The answer is no, and LLM (Large Language Model) vendors like OpenAI take a lot of type level mistakes when composing the arguments of the target function to call. Even though an LLM function calling schema has defined an `Array<string>` type, LLM often fills it just by a `string` typed value. This is where the **function calling harness** comes in — a deterministic correction loop of schema generation, lenient parsing, type coercion, and validation feedback that turns unreliable LLM output into 100% correct structured data.
 
 Therefore, when developing an LLM function calling agent, the validation feedback process is essentially required. If LLM takes a type level mistake on arguments composition, the agent must feedback the most detailed validation errors, and let the LLM to retry the function calling referencing the validation errors.
 
 About the validation feedback, `@agentica/core` is utilizing [`typia.validate<T>()`](https://typia.io/docs/validators/validate) and [`typia.llm.application<Class>()`](https://typia.io/docs/llm/application/#application) functions. They construct validation logic by analyzing TypeScript source codes and types in the compilation level, so that detailed and accurate than any other validators like below.
 
-Such validation feedback strategy and combination with `typia` runtime validator, `@agentica/core` has achieved the most ideal LLM function calling. In my experience, when using OpenAI's `gpt-4o-mini` model, it tends to construct invalid function calling arguments at the first trial about 50% of the time. By the way, if correct it through validation feedback with `typia`, success rate soars to 99%. And I've never had a failure when trying validation feedback twice.
+Such validation feedback strategy and combination with `typia` runtime validator, `@agentica/core` has achieved the most ideal LLM function calling through the **function calling harness** pattern. In my experience, when using OpenAI's `gpt-4o-mini` model, it tends to construct invalid function calling arguments at the first trial about 50% of the time. By the way, if you correct it through validation feedback with `typia`, success rate soars to 99%. And I've never had a failure when trying validation feedback twice.
 
 For reference, the embedded [`typia.validate<T>()`](/docs/validators/validate) function creates validation logic by analyzing TypeScript source codes and types in the compilation level. Therefore, it is accurate and detailed than any other validator libraries. This is exactly what is needed for function calling, and I can confidentelly say that `typia` is the best library for LLM function calling.

website/src/content/docs/llm/http.mdx

Lines changed: 14 additions & 4 deletions

@@ -60,7 +60,7 @@ export namespace HttpLlm {
 
 LLM function calling from OpenAPI documents.
 
-`HttpLlm` is a utility module from `@typia/utils` that converts OpenAPI (Swagger) documents into LLM function calling schemas. While [`typia.llm.application<Class>()`](./application) generates schemas from TypeScript class types at compile time, `HttpLlm` generates them from OpenAPI documents at runtime — making any REST API instantly callable by LLMs.
+`HttpLlm` is a utility module from `@typia/utils` that converts OpenAPI (Swagger) documents into LLM function calling schemas. While [`typia.llm.application<Class>()`](./application) generates schemas from TypeScript class types at compile time, `HttpLlm` generates them from OpenAPI documents at runtime — making any REST API instantly callable by LLMs. Every generated tool includes lenient parsing, type coercion, and validation feedback.
 
 It supports all OpenAPI versions: Swagger v2.0, OpenAPI v3.0, v3.1, and v3.2.

@@ -215,7 +215,17 @@ registerMcpControllers({
 </Tabs.Tab>
 </Tabs>
 
-## Validation Feedback
+## The Function Calling Harness
+
+The **function calling harness** is typia's three-layer pipeline that turns unreliable LLM output into 100% correct structured data:
+
+1. **Lenient JSON Parsing** — recovers broken JSON (unclosed brackets, trailing commas, markdown wrapping, etc.)
+2. **Type Coercion** — fixes wrong types (`"42"` → `42`, double-stringified objects → objects, etc.)
+3. **Validation Feedback** — pinpoints remaining value errors with inline `// ❌` annotations so the LLM can self-correct and retry
+
+Each layer catches what the previous one didn't. Together they form a deterministic correction loop around the probabilistic LLM.
+
+### Validation Feedback
 
 When used through [MCP](/docs/llm/mcp), [Vercel AI SDK](/docs/llm/vercel), or [Agentica](/docs/llm/chat), `HttpLlm.controller()` embeds [`typia.validate<T>()`](/docs/validators/validate) in every tool for automatic argument validation. When validation fails, the error is returned as text content with inline `// ❌` comments at each invalid property:

@@ -230,7 +240,7 @@ When used through [MCP](/docs/llm/mcp), [Vercel AI SDK](/docs/llm/vercel), or [A
 
 The LLM reads this feedback and self-corrects on the next turn.
 
-In the [AutoBe](https://github.com/wrtnlabs/autobe) project (AI-powered backend code generator), `qwen3-coder-next` showed only 6.75% raw function calling success rate on compiler AST types. However, with validation feedback, it reached 100%.
+In the [AutoBe](https://github.com/wrtnlabs/autobe) project (AI-powered backend code generator by [Wrtn Technologies](https://wrtn.io)), `qwen3-coder-next` showed only **6.75%** raw function calling success rate on compiler AST types. However, with the complete harness, it reached **100%** — across all four tested Qwen models.
 
 Working on compiler AST means working on any type and any use case.

@@ -264,7 +274,7 @@ export type IExpression =
 | ... // 30+ expression types total
 ```
 
-## Lenient JSON Parsing
+### Lenient JSON Parsing & Type Coercion
 
 <Tabs items={[
 "Parsing Example",
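The type-coercion layer named in this diff can be sketched as a schema-guided rewrite. This is a toy sketch, not the actual `LlmJson.coerce()` implementation: `MiniSchema` and `coerceBySchema` are hypothetical names covering only the two failure modes the docs call out (stringified scalars and double-stringified objects).

```typescript
// Minimal sketch of schema-guided type coercion — NOT `LlmJson.coerce()`.
// Handles: "42" → 42, "true" → true, '{"a":1}' (string) → { a: 1 } (object).
type MiniSchema =
  | { type: "number" }
  | { type: "boolean" }
  | { type: "string" }
  | { type: "object"; properties: Record<string, MiniSchema> };

function coerceBySchema(value: unknown, schema: MiniSchema): unknown {
  if (schema.type === "number" && typeof value === "string") {
    const n = Number(value);
    return Number.isNaN(n) ? value : n; // "42" → 42
  }
  if (schema.type === "boolean" && (value === "true" || value === "false")) {
    return value === "true"; // "true" → true
  }
  if (schema.type === "object") {
    // Double-stringified object: parse the string first, then recurse.
    if (typeof value === "string") {
      try {
        value = JSON.parse(value);
      } catch {
        return value; // leave it for validation feedback to report
      }
    }
    if (typeof value === "object" && value !== null) {
      const out: Record<string, unknown> = {};
      for (const [key, v] of Object.entries(value as Record<string, unknown>)) {
        const sub = schema.properties[key];
        out[key] = sub ? coerceBySchema(v, sub) : v; // recurse per property
      }
      return out;
    }
  }
  return value; // already the right shape, or unfixable here
}
```

Note the design choice shared with the real harness: unfixable values are passed through unchanged rather than rejected, so the validation-feedback layer can report them precisely.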

website/src/content/docs/llm/json.mdx

Lines changed: 5 additions & 5 deletions

@@ -51,7 +51,7 @@ export namespace LlmJson {
 
 JSON utilities for LLM function calling.
 
-`LlmJson` is a utility module from `@typia/utils` package, specifically designed for LLM (Large Language Model) function calling scenarios. It handles the common issues that arise when working with LLM responses:
+`LlmJson` is a utility module from `@typia/utils` package, specifically designed for LLM (Large Language Model) function calling scenarios. Together, these utilities form the **function calling harness** — handling every common failure mode of LLM responses:
 
 1. **Validation Feedback**: Format validation errors for LLM auto-correction
 2. **Lenient JSON Parsing**: LLMs often produce incomplete, malformed, or non-standard JSON

@@ -121,7 +121,7 @@ This format is designed for LLM auto-correction. The LLM reads this feedback and
 filename="examples/src/llm/application-parse.ts"
 showLineNumbers />
 
-`LlmJson.parse()` is a lenient JSON parser specifically designed for LLM outputs. It combines two capabilities:
+`LlmJson.parse()` is a lenient JSON parser specifically designed for LLM outputs. It combines two capabilities in a single call:
 
 1. **Lenient JSON parsing**: Handles malformed/incomplete JSON that would fail with `JSON.parse()`
 2. **Type coercion**: Fixes double-stringified values based on the expected schema

@@ -190,7 +190,7 @@ If you omit the `parameters` argument, `LlmJson.parse()` still performs lenient
 filename="examples/src/llm/application-coerce.ts"
 showLineNumbers />
 
-`LlmJson.coerce()` performs type coercion on already-parsed objects. This is the coercion logic from `parse()` extracted for use when you already have a JavaScript object (not a JSON string).
+`LlmJson.coerce()` performs type coercion on already-parsed objects. Use it when an SDK has already parsed the JSON — this is the coercion logic from `parse()` extracted for use when you already have a JavaScript object (not a JSON string).
 
 ### When to Use `coerce()` vs `parse()`

@@ -262,7 +262,7 @@ strict({ name: "John", age: 25, extra: "ignored" }); // success: false
 If you have TypeScript types available at compile time, prefer using `typia.validate<T>()` directly. It's faster (AOT-compiled) and provides better error messages. Use `LlmJson.validate()` only when you need runtime schema-based validation.
 </Callout>
 
-## Validation Feedback Loop
+## Validation Feedback Loop (The Complete Harness)
 
 The real power of these utilities is enabling automatic error correction by LLMs:

@@ -355,7 +355,7 @@ const main = async (): Promise<void> => {
 };
 ```
 
-This pattern enables LLMs to automatically correct their mistakes by:
+This is the **function calling harness** in action — the same pattern that powers [AutoBe](https://github.com/wrtnlabs/autobe)'s 100% compilation success across all tested LLM models. It enables LLMs to automatically correct their mistakes:
 
 1. Parse LLM response with `func.parse()` (handles malformed JSON + type coercion)
 2. Validate with `func.validate()`
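The lenient-parsing failure modes this diff enumerates (markdown fencing, trailing commas, unclosed brackets) can be illustrated with a toy repair pass. This is a drastic simplification, hypothetical and far weaker than the real `LlmJson.parse()`; `lenientParse` is not part of `@typia/utils`, and its trailing-comma regex would even mangle commas inside strings.

```typescript
// Toy lenient JSON parser — a simplified sketch of what the docs describe
// `LlmJson.parse()` handling. `lenientParse` is a hypothetical name.
function lenientParse(raw: string): unknown {
  let text = raw.trim();
  // 1. Strip a markdown code-fence wrapper, if present.
  text = text.replace(/^`{3}(?:json)?\s*/i, "").replace(/`{3}\s*$/, "").trim();
  // 2. Close a dangling string and any unclosed brackets/braces,
  //    tracking string context so brackets inside strings are ignored.
  const stack: string[] = [];
  let inString = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (ch === "\\") i++; // skip the escaped character
      else if (ch === '"') inString = false;
    } else if (ch === '"') inString = true;
    else if (ch === "{") stack.push("}");
    else if (ch === "[") stack.push("]");
    else if (ch === "}" || ch === "]") stack.pop();
  }
  if (inString) text += '"';
  while (stack.length) text += stack.pop();
  // 3. Remove trailing commas before closers (naive: ignores strings).
  text = text.replace(/,\s*([}\]])/g, "$1");
  return JSON.parse(text);
}
```

Like the real harness, the point is graceful recovery: repair whatever is mechanically repairable, then let strict parsing and validation judge the result.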

website/src/content/docs/llm/parameters.mdx

Lines changed: 13 additions & 3 deletions

@@ -157,7 +157,17 @@ You can utilize the `typia.llm.parameters<Parameters>()` function to generate st
 
 Just configure output mode as JSON schema, and deliver the `typia.llm.parameters<Parameters>()` function returned value to the LLM provider like OpenAI (ChatGPT). Then, the LLM provider will automatically transform the output conversation into a structured data format of the `Parameters` type.
 
-## Lenient JSON Parsing
+## The Function Calling Harness
+
+The **function calling harness** is typia's three-layer pipeline that turns unreliable LLM output into 100% correct structured data:
+
+1. **Lenient JSON Parsing** — recovers broken JSON (unclosed brackets, trailing commas, markdown wrapping, etc.)
+2. **Type Coercion** — fixes wrong types (`"42"` → `42`, double-stringified objects → objects, etc.)
+3. **Validation Feedback** — pinpoints remaining value errors with inline `// ❌` annotations so the LLM can self-correct and retry
+
+Each layer catches what the previous one didn't. Together they form a deterministic correction loop around the probabilistic LLM.
+
+### Lenient JSON Parsing & Type Coercion
 
 <Tabs items={[
 "Parsing Example",

@@ -216,7 +226,7 @@ Some LLM SDKs (Anthropic, Vercel AI, LangChain, MCP) parse JSON internally and r
 For more details, see [JSON Utilities](./json).
 </Callout>
 
-## Validation Feedback
+### Validation Feedback
 
 <LocalSource
 path="examples/src/llm/parameters-validate.ts"

@@ -241,7 +251,7 @@ Use [`typia.validate<T>()`](/docs/validators/validate) for validation feedback o
 
 The LLM reads this feedback and self-corrects on the next turn.
 
-In the [AutoBe](https://github.com/wrtnlabs/autobe) project (AI-powered backend code generator), `qwen3-coder-next` showed only 6.75% raw function calling success rate on compiler AST types. However, with validation feedback, it reached 100%.
+In the [AutoBe](https://github.com/wrtnlabs/autobe) project (AI-powered backend code generator by [Wrtn Technologies](https://wrtn.io)), `qwen3-coder-next` showed only **6.75%** raw function calling success rate on compiler AST types. However, with the complete harness, it reached **100%** — across all four tested Qwen models.
 
 Working on compiler AST means working on any type and any use case.
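The inline `// ❌` annotation style that the validation-feedback layer returns to the LLM can be sketched as a small formatter. This is only an illustration of the idea: `IValidationError` and `formatFeedback` are hypothetical names, and typia's actual message format may differ.

```typescript
// Sketch of rendering validation errors as inline "// ❌" annotations,
// the feedback style this diff describes. Hypothetical, not typia's format.
interface IValidationError {
  path: string; // e.g. "$input.age"
  expected: string; // expected type expression
  value: unknown; // the actual (wrong) value
}

function formatFeedback(input: unknown, errors: IValidationError[]): string {
  const lines = JSON.stringify(input, null, 2).split("\n");
  return lines
    .map((line) => {
      // Annotate each line whose property key appears in an error path.
      const hit = errors.find((e) => {
        const key = e.path.split(".").pop();
        return key !== undefined && line.includes(`"${key}":`);
      });
      return hit === undefined
        ? line
        : `${line} // ❌ expected ${hit.expected}, got ${JSON.stringify(hit.value)}`;
    })
    .join("\n");
}
```

Keeping the valid lines untouched and annotating only the broken ones is what lets the LLM patch just the failing properties on the next turn instead of regenerating the whole payload blindly.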
