Add spec-ingestion agent skill for TypeSpec ingestion workflow #957
Changes from all commits
dd2ad1b
811570d
45a60a4
de3076d
3fa48fc
88d186a
197b574
8f68129
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,53 @@ | ||
| --- | ||
| name: spec-ingestion | ||
| description: Guide for ingesting the latest OpenAI TypeSpec specification into the openai-dotnet SDK. Use this when asked to update or ingest OpenAI API specs, copy base TypeSpec files from upstream, fix client TSP compile errors, or run code generation for new API areas. | ||
| --- | ||
|
|
||
| # Spec Ingestion | ||
|
|
||
| ## Overview | ||
|
|
||
| This skill describes how to ingest the latest OpenAI TypeSpec specification (from the upstream [`microsoft/openai-openapi-pr`](https://github.com/microsoft/openai-openapi-pr) repository) into the `openai-dotnet` SDK, area by area. | ||
|
|
||
| The process involves: | ||
| 1. Copying updated base specs from upstream (exact copy, no modifications) | ||
| 2. Reporting any compile errors in the base TSP (do NOT fix — base spec must stay unmodified) | ||
| 3. Fixing compile errors in the client TSP layer | ||
| 4. Preserving custom C# code (renames, stubs) | ||
| 5. Running code generation | ||
| 6. Verifying the output | ||
|
|
||
| ## Skill Documents | ||
|
|
||
| This skill is split across multiple files for easier navigation: | ||
|
|
||
| | Document | Description | | ||
| |----------|-------------| | ||
| | [steps.md](steps.md) | **Step-by-step process** — the full 9-step workflow from copying spec to post-generation verification | | ||
| | [file-locations.md](file-locations.md) | **Key file locations** — quick reference for all upstream and local paths, area mappings | | ||
| | [patterns-and-gotchas.md](patterns-and-gotchas.md) | **Common patterns & gotchas** — lessons learned, pitfalls, and conventions to follow | | ||
| | [checklist.md](checklist.md) | **Checklist** — a task-by-task checklist for tracking progress during an ingestion | | ||
| | [references.md](references.md) | **Reference PRs** — detailed notes on past ingestion PRs with lessons learned | | ||
|
|
||
| ## Quick Start | ||
|
|
||
| 1. Review [references.md](references.md) for examples of past ingestions in your area | ||
| 2. Read [file-locations.md](file-locations.md) to understand the repo layout | ||
| 3. Follow [steps.md](steps.md) for the full ingestion workflow | ||
| 4. Use [checklist.md](checklist.md) to track your progress | ||
| 5. Consult [patterns-and-gotchas.md](patterns-and-gotchas.md) when you hit issues | ||
|
|
||
| ## Available Areas | ||
|
|
||
| Areas that can be ingested independently: | ||
|
|
||
| `administration` · `assistants` · `audio` · `batch` · `chat` · `containers` · `conversations` · `embeddings` · `evals` · `files` · `fine-tuning` · `graders` · `images` · `models` · `moderations` · `realtime` · `responses` · `runs` · `threads` · `vector-stores` · `videos` | ||
|
|
||
| ## Key Rules | ||
|
|
||
| 1. **Always add `@@clientLocation`** for every operation in the client TSP (the latest spec no longer uses `interface` blocks) | ||
| 2. **NEVER modify the base spec** — it must be an exact copy of upstream. Handle all issues (type unions, suppressions, etc.) in `specification/client/` instead | ||
| 3. **Update `[CodeGenType]` stubs** in `src/Custom/{Area}/Internal/GeneratorStubs.cs` for any renamed types | ||
| 4. **Defer complex features** — suggest them as follow-up items rather than implementing in the same ingestion | ||
| 5. **Run `./scripts/Invoke-CodeGen.ps1`** to generate code, then `dotnet build` to verify | ||
| 6. **Work locally only** — do NOT create PRs or file issues. Instead, suggest a list of issues that may need to be filed upstream |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,71 @@ | ||
| # Spec Ingestion Checklist | ||
|
|
||
| Use this checklist when performing a spec ingestion for any area. | ||
|
|
||
| ## Pre-Ingestion | ||
|
|
||
| - [ ] Identify the target area(s) to ingest | ||
| - [ ] **Review reference PRs** in [references.md](references.md) for the area or similar areas — note visitor changes, deferred features, and custom C# patterns | ||
| - [ ] Pull latest from upstream `microsoft/openai-openapi-pr` (branch: `main`) | ||
| - [ ] Pull latest from `openai/openai-dotnet` (branch: `main`) | ||
|
|
||
| ## Base Spec Update | ||
|
|
||
| - [ ] Sparse checkout upstream repo including **both** `{area}` and `common` folders | ||
| - [ ] Copy latest base spec from upstream `packages/openai-typespec/src/{area}/` to `specification/base/typespec/{area}/` — **exact copy, no modifications** | ||
| - [ ] **Keep the temporary sparse checkout** — don't delete it yet. If `./scripts/Invoke-CodeGen.ps1` fails with missing types from `common/`, you'll need to look them up in the clone's `src/common/` folder and copy the specific type definition into the local `specification/base/typespec/common/` file (do NOT copy the entire file or folder) | ||
| - [ ] Delete the temporary sparse checkout after `./scripts/Invoke-CodeGen.ps1` succeeds and no more upstream files are needed | ||
|
|
||
| ## Client TSP Update | ||
|
|
||
| - [ ] Extract operation names from the new `operations.tsp` (`Select-String -Pattern "^op "`) | ||
| - [ ] Fix errors in `specification/client/{area}.client.tsp` | ||
| - [ ] Add `@@clientLocation` for **all** operations (no more `interface` blocks) | ||
| - [ ] Update `@@clientName` for any renamed operations | ||
| - [ ] Update `@@visibility`, `@@alternateType`, `@@usage` — fix these if `./scripts/Invoke-CodeGen.ps1` reports errors referencing these decorators (e.g., a type or property was renamed or removed upstream) | ||
| - [ ] Update client models TSP (`specification/client/models/{area}.models.tsp`) if applicable | ||
|
Collaborator
Should we define the criteria for "if applicable"?
Collaborator
Author
A client models file exists only for areas that need discriminated union wrappers or .NET-specific model overrides. The criterion is whether `specification/client/models/{area}.models.tsp` exists: if it does, update it as needed; if it doesn't, skip this step.
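The first and third items of the Client TSP Update list (extracting operation names, then adding `@@clientLocation` for each one) can be sketched as a small script. This is a minimal Python sketch rather than part of the skill, which uses PowerShell's `Select-String`. The function name `client_location_lines` and the sample text are hypothetical, and the regex assumes each operation is declared with `op` at the start of a line, as the checklist's `"^op "` pattern implies.

```python
import re

# Matches standalone operation declarations such as "op createSpeech(" at the
# start of a line, mirroring the checklist's "^op " Select-String pattern.
OP_PATTERN = re.compile(r"^op\s+(\w+)", re.MULTILINE)


def client_location_lines(operations_tsp: str, client_name: str) -> list[str]:
    """Return one @@clientLocation line per operation found in the TSP text."""
    names = OP_PATTERN.findall(operations_tsp)
    return [f'@@clientLocation({name}, "{client_name}");' for name in names]


# Hypothetical excerpt standing in for an area's operations.tsp (not real spec text).
sample = """\
@post
op createSpeech(...): ...;

@post
op createTranscription(...): ...;
"""

for line in client_location_lines(sample, "Audio"):
    print(line)
```

The printed lines are only a starting point for `specification/client/{area}.client.tsp`; operations that belong on a different client still need their target adjusted by hand.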
||
|
|
||
| ## Compile and Generate Code | ||
|
|
||
| - [ ] Run `./scripts/Invoke-CodeGen.ps1` (no params) — this handles `npm ci`, build, compile, and code generation in one step | ||
| - [ ] Ignore warnings; only **errors** matter | ||
| - [ ] If `prohibited-namespace` errors appear, add `[CodeGenType]` stubs — internal types go in `Internal/GeneratorStubs.cs`, public types go in `GeneratorStubs.cs` (see patterns-and-gotchas.md §5) | ||
| - [ ] If client TSP fixes are needed, fix and re-run `./scripts/Invoke-CodeGen.ps1` | ||
| - [ ] Report any remaining base spec compile errors — **do NOT modify base spec directly** | ||
ShivangiReja marked this conversation as resolved.
|
||
|
|
||
| ## Custom C# Code Update | ||
|
|
||
| - [ ] Compare old vs. new spec for **renames** → update `[CodeGenType]` stubs in `src/Custom/{Area}/Internal/GeneratorStubs.cs` | ||
| - [ ] Update any custom code in `src/Custom/{Area}/` referencing renamed types | ||
| - [ ] Add `[CodeGenType]` stubs for new internal types | ||
| - [ ] Remove stubs for deleted types | ||
|
|
||
| ## Documentation | ||
|
|
||
| - [ ] List all **new** types, properties, and operations | ||
| - [ ] List all **renamed** types/properties (old → new mapping) | ||
| - [ ] List all **removed** types, properties, and operations | ||
| - [ ] Note any **type unions** that need discriminator treatment (don't modify base spec) | ||
|
Comment on lines +45 to +48
Collaborator
Where should these be listed or noted?
Collaborator
Author
These should be listed in the final summary that the agent presents to the user at the end (the "Final Summary" section at the bottom of the checklist). The agent compiles these lists as it works through the steps and includes them in its output.
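One way to seed the "new" and "removed" lists for that summary is to compare the declarations in the old and new copies of an area's spec. The following is a minimal Python sketch under stated assumptions: the helper names are hypothetical, the regex only recognizes top-level `model`, `union`, `enum`, and `op` declarations that start a line, and the sample inputs are invented (reusing the `FooBar` to `BazQux` rename from patterns-and-gotchas.md).

```python
import re

# Captures the kind and name of top-level TypeSpec declarations.
DECL_PATTERN = re.compile(r"^(model|union|enum|op)\s+(\w+)", re.MULTILINE)


def declared_names(tsp_text: str) -> set[str]:
    """Collect 'kind name' strings for every declaration found in the text."""
    return {f"{kind} {name}" for kind, name in DECL_PATTERN.findall(tsp_text)}


def summarize_changes(old_tsp: str, new_tsp: str) -> dict[str, list[str]]:
    """Return sorted lists of declarations that were added or removed."""
    old, new = declared_names(old_tsp), declared_names(new_tsp)
    return {"new": sorted(new - old), "removed": sorted(old - new)}


# Hypothetical old and new spec excerpts.
old_spec = "model FooBar {\n}\nop listFiles(): void;\n"
new_spec = "model BazQux {\n}\nop listFiles(): void;\nop deleteFile(): void;\n"

print(summarize_changes(old_spec, new_spec))
```

A rename shows up as one entry under `removed` and one under `new` (here `model FooBar` and `model BazQux`), so the old-to-new mapping still has to be confirmed by reading the spec.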
||
|
|
||
| ## Post-Generation Verification | ||
|
|
||
| - [ ] Verify generated files: `Get-ChildItem src/Generated/Models/{Area}/ -Name` | ||
| - [ ] **Review numeric types** — check if any `long` or `double` properties, parameters, or fields were incorrectly converted to `int`/`float` by the `NumericTypesVisitor`; add exclusions if needed (see patterns-and-gotchas.md §3) | ||
| - [ ] Verify build: `dotnet build` | ||
| - [ ] Export API surface: `./scripts/Export-Api.ps1` | ||
|
|
||
| ## Post-Generation Review | ||
|
|
||
| - [ ] Diff generated code (`src/Generated/`) — list new, removed, and changed files | ||
| - [ ] Diff API surface (`api/`) — identify breaking changes | ||
| - [ ] List compile issues in generated code | ||
| - [ ] List items needing discriminator patterns | ||
| - [ ] Identify features needing follow-up work | ||
|
|
||
| ## Final Summary (Local Work Only) | ||
|
|
||
| > Do NOT create PRs or file issues. Present a summary for the user to act on. | ||
|
|
||
| - [ ] Summarize all changes made locally | ||
| - [ ] List suggested upstream issues (spec bugs, missing types, etc.) | ||
| - [ ] List suggested follow-up items (deferred features, complex implementations) | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,85 @@ | ||
| # Key File Locations | ||
|
|
||
| Quick reference for all paths involved in spec ingestion. | ||
|
|
||
| ## Upstream (Source) | ||
|
|
||
| | What | Path | | ||
| |------|------| | ||
| | Base spec (all areas) | `https://github.com/microsoft/openai-openapi-pr/tree/main/packages/openai-typespec/src` | | ||
| | Common shared types | `https://github.com/microsoft/openai-openapi-pr/tree/main/packages/openai-typespec/src/common` | | ||
| | Area spec (e.g., audio) | `https://github.com/microsoft/openai-openapi-pr/tree/main/packages/openai-typespec/src/{area}` | | ||
|
|
||
| ## Local Repository — Specification | ||
|
|
||
| | What | Path | | ||
| |------|------| | ||
| | Base spec entry point | `specification/base/typespec/main.tsp` | | ||
| | Area base spec | `specification/base/typespec/{area}/` | | ||
| | Common shared types | `specification/base/typespec/common/` | | ||
| | Common models | `specification/base/typespec/common/models.tsp` | | ||
| | Common custom types | `specification/base/typespec/common/custom.tsp` | | ||
| | SDK entrypoint | `specification/base/entrypoints/sdk.dotnet/main.tsp` | | ||
| | Client customizations | `specification/client/{area}.client.tsp` | | ||
| | Client model overrides | `specification/client/models/{area}.models.tsp` | | ||
| | Main TSP entry (all imports) | `specification/main.tsp` | | ||
| | TSP config | `specification/tspconfig.yaml` | | ||
|
|
||
| ## Local Repository — C# Source | ||
|
|
||
| | What | Path | | ||
| |------|------| | ||
| | Custom C# code (per area) | `src/Custom/{Area}/` | | ||
| | Internal generator stubs | `src/Custom/{Area}/Internal/GeneratorStubs.cs` | | ||
| | Generated C# code | `src/Generated/` | | ||
| | Generated models (per area) | `src/Generated/Models/{Area}/` | | ||
| | Generated client | `src/Generated/{Area}Client.cs` | | ||
| | Generated REST client | `src/Generated/{Area}Client.RestClient.cs` | | ||
|
|
||
| ## Local Repository — Scripts | ||
|
|
||
| | What | Path | | ||
| |------|------| | ||
| | Code generation script | `scripts/Invoke-CodeGen.ps1` | | ||
| | API export script | `scripts/Export-Api.ps1` | | ||
| | API compatibility test | `scripts/Test-ApiCompatibility.ps1` | | ||
| | AOT compatibility test | `scripts/Test-AotCompatibility.ps1` | | ||
|
|
||
| ## Local Repository — API Surface | ||
|
|
||
| | What | Path | | ||
| |------|------| | ||
| | .NET 8.0 API surface | `api/OpenAI.net8.0.cs` | | ||
| | .NET 10.0 API surface | `api/OpenAI.net10.0.cs` | | ||
| | .NET Standard 2.0 API surface | `api/OpenAI.netstandard2.0.cs` | | ||
|
|
||
| ## Local Repository — Codegen Plugin | ||
|
|
||
| | What | Path | | ||
| |------|------| | ||
| | Codegen plugin source | `codegen/generator/src/` | | ||
| | Numeric types visitor | `codegen/generator/src/Visitors/NumericTypesVisitor.cs` | | ||
|
|
||
| ## Available Areas | ||
|
|
||
| These areas map between the base spec directories, client TSP files, and C# custom code: | ||
|
|
||
| | Area Folder | Client TSP | C# Custom Folder | | ||
| |-------------|-----------|-------------------| | ||
| | `audio` | `audio.client.tsp` | `Audio` | | ||
| | `assistants` | `assistants.client.tsp` | `Assistants` | | ||
| | `batch` | `batch.client.tsp` | `Batch` | | ||
| | `chat` | `chat.client.tsp` | `Chat` | | ||
| | `containers` | `containers.client.tsp` | `Containers` | | ||
| | `conversations` | `conversations.client.tsp` | `Conversations` | | ||
| | `embeddings` | `embeddings.client.tsp` | `Embeddings` | | ||
| | `files` | `files.client.tsp` | `Files` | | ||
| | `fine-tuning` | `fine-tuning.client.tsp` | `FineTuning` | | ||
| | `graders` | `graders.client.tsp` | `Graders` | | ||
| | `images` | `images.client.tsp` | `Images` | | ||
| | `models` | `models.client.tsp` | `Models` | | ||
| | `moderations` | `moderations.client.tsp` | `Moderations` | | ||
| | `realtime` | — | `Realtime` | | ||
| | `responses` | `responses.client.tsp` | `Responses` | | ||
| | `vector-stores` | `vector-stores.client.tsp` | `VectorStores` | | ||
| | `videos` | `videos.client.tsp` | `Videos` | |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,156 @@ | ||
| # Common Patterns and Gotchas | ||
|
|
||
| Lessons learned from previous spec ingestion PRs. Review these before starting a new ingestion. | ||
|
|
||
| --- | ||
|
|
||
| ## 1. Operations Without Interfaces Need `@@clientLocation` | ||
|
|
||
| The latest upstream spec removes `interface` blocks. Every operation **MUST** have a `@@clientLocation` decorator in the client TSP or it won't be assigned to the correct client class. | ||
|
|
||
| ```typespec | ||
| // OLD pattern (upstream used to have this): | ||
| interface Audio { | ||
| createSpeech(...): ...; | ||
| createTranscription(...): ...; | ||
| } | ||
|
|
||
| // NEW pattern (operations are standalone, so you need): | ||
| @@clientLocation(createSpeech, "Audio"); | ||
| @@clientLocation(createTranscription, "Audio"); | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## 2. `GeneratorStubs.cs` is the Rename Registry | ||
|
|
||
| When types are renamed in the spec, the `[CodeGenType("NewGeneratedName")]` attribute in `src/Custom/{Area}/Internal/GeneratorStubs.cs` maps the new generated name to the existing custom partial class. This is how backward compatibility is maintained without renaming public types. | ||
|
|
||
| ```csharp | ||
| // Maps generated type "CreateSpeechRequestModel" to internal class | ||
| [CodeGenType("CreateSpeechRequestModel")] internal readonly partial struct InternalCreateSpeechRequestModel { } | ||
| ``` | ||
|
|
||
| If a TypeSpec model is renamed from `FooBar` to `BazQux`, update: | ||
| ```csharp | ||
| // Before: | ||
| [CodeGenType("FooBar")] internal partial class InternalFooBar { } | ||
| // After: | ||
| [CodeGenType("BazQux")] internal partial class InternalFooBar { } | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## 3. Numeric Type Conversions | ||
|
|
||
| TypeSpec's `integer` type maps to `long` in C# by default. The `NumericTypesVisitor` (at `codegen/generator/src/Visitors/NumericTypesVisitor.cs`) converts: | ||
| - `long` → `int` | ||
| - `double` → `float` | ||
|
|
||
| for all generated properties unless explicitly excluded. | ||
|
|
||
| **After code generation, you MUST review the generated numeric properties.** If a property genuinely requires `long` (e.g., byte counts, large IDs) or `double` (high-precision values), add it to the exclusion list in the visitor: | ||
|
|
||
| ```csharp | ||
| private static readonly HashSet<string> _excludedLongProperties = new(StringComparer.OrdinalIgnoreCase) | ||
| { | ||
| "OpenAI.{Area}.{TypeName}.{PropertyName}", | ||
| }; | ||
| ``` | ||
|
|
||
| See [PR #935 (VectorStore)](https://github.com/openai/openai-dotnet/pull/935) for an example where this visitor was enhanced to handle fields and methods in addition to properties. | ||
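To make the review step above less manual, the generated code can be scanned for properties that ended up as `int` or `float`, producing a list of candidates to check against the spec. This is a minimal Python sketch, not a tool from the repository: the function name, the regex, and the sample class are hypothetical, and the pattern only covers simple `public` or `internal` auto-properties.

```python
import re

# Finds auto-properties declared as int/float (optionally nullable), e.g.
# "public int? UsageBytes { get; }". Other declaration shapes are not covered.
PROPERTY_PATTERN = re.compile(
    r"^[ \t]*(?:public|internal)\s+(int|float)(\?)?\s+(\w+)\s*\{", re.MULTILINE
)


def numeric_properties(csharp_source: str) -> list[tuple[str, str]]:
    """Return (property name, C# type) pairs for int/float properties."""
    return [
        (match.group(3), match.group(1) + (match.group(2) or ""))
        for match in PROPERTY_PATTERN.finditer(csharp_source)
    ]


# Hypothetical generated class used only to exercise the scan.
sample_source = """\
public partial class SampleFile
{
    public int? UsageBytes { get; }
    public string Id { get; }
    public float Score { get; set; }
}
"""

for name, type_name in numeric_properties(sample_source):
    print(f"{name}: {type_name}")
```

Each reported property still needs a judgment call: only those that genuinely need `long` or `double` belong in the visitor's exclusion list.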
|
|
||
| --- | ||
|
|
||
| ## 4. Streaming Responses and Discriminated Unions | ||
|
|
||
| Some areas (audio, chat, responses) have streaming variants. The client models TSP typically needs **discriminated union wrappers** for streaming event types: | ||
|
|
||
| ```typespec | ||
| @usage(Usage.output | Usage.json) | ||
| @discriminator("type") | ||
| model DotNetCreateTranscriptionStreamingResponse { | ||
| type: DotNetCreateTranscriptionStreamingResponseType; | ||
| } | ||
|
|
||
| union DotNetCreateTranscriptionStreamingResponseType { | ||
| `transcript.text.segment`: "transcript.text.segment", | ||
| `transcript.text.delta`: "transcript.text.delta", | ||
| `transcript.text.done`: "transcript.text.done", | ||
| string | ||
| } | ||
|
|
||
| model DotNetTranscriptTextSegmentEvent extends DotNetCreateTranscriptionStreamingResponse { | ||
| ...TranscriptTextSegmentEvent; | ||
| } | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## 5. `prohibited-namespace` Errors Require `[CodeGenType]` Stubs | ||
|
|
||
| A `prohibited-namespace` compile error means the generator found a type that doesn't have a corresponding `[CodeGenType]` stub in the custom C# code. This can be triggered by any type — inline unions, new models, new enums, etc. — but **not every new type causes it**. Only fix the specific types named in the error. | ||
|
|
||
| **Fix:** Add a `[CodeGenType]` stub for each type named in the error, placing it in the correct location: | ||
|
|
||
| - **Internal types** → `src/Custom/{Area}/Internal/GeneratorStubs.cs` | ||
| - **Public types** → `src/Custom/{Area}/GeneratorStubs.cs` | ||
|
|
||
| Look at existing stubs in the area to determine the right pattern (class vs. struct, readonly, etc.). | ||
|
|
||
| **Example — internal stubs** (`src/Custom/{Area}/Internal/GeneratorStubs.cs`): | ||
| ```csharp | ||
| [CodeGenType("ContainerResourceMemoryLimit")] internal readonly partial struct InternalContainerResourceMemoryLimit { } | ||
| [CodeGenType("ContainerListResource")] internal partial class InternalContainerListResource { } | ||
| ``` | ||
|
|
||
| **Example — public stubs** (`src/Custom/{Area}/GeneratorStubs.cs`): | ||
| ```csharp | ||
| [CodeGenType("ContainerResource")] public partial class ContainerResource { } | ||
| [CodeGenType("ContainerCollectionOrder")] public readonly partial struct ContainerCollectionOrder { } | ||
| ``` | ||
|
|
||
| **How to identify these:** The compiler error message will name the type exactly. Only add stubs for the types that appear in `prohibited-namespace` errors — do not preemptively stub every new type. | ||
|
|
||
| --- | ||
|
|
||
| ## 6. NEVER Modify the Base Spec | ||
|
|
||
| > **CRITICAL:** The base spec at `specification/base/typespec/` must be an **exact copy** of the upstream spec from `microsoft/openai-openapi-pr`. Do NOT modify it for any reason — not for type unions, not for import paths, not for namespaces, not for suppression directives. | ||
|
|
||
| If there are issues with the base spec: | ||
| - **Type unions** that would generate binary data types → handle in `specification/client/models/{area}.models.tsp` using discriminator patterns | ||
| - **Any other issues** → resolve in the client TSP layer if possible, or suggest them as upstream issues to be filed (do NOT file issues yourself) | ||
|
|
||
| --- | ||
|
|
||
| ## 7. Follow-up PRs for Complex Features | ||
|
|
||
| Not everything needs to be done during the spec ingestion. New features that require significant custom C# implementation should be listed as suggested follow-up items for the user to review. | ||
|
|
||
| **Examples from past ingestions:** | ||
| - Speech streaming events → [#914](https://github.com/openai/openai-dotnet/issues/914) (from Audio #913) | ||
| - Diarized transcription → [#916](https://github.com/openai/openai-dotnet/issues/916) (from Audio #913) | ||
| - Pagination for `GetFiles` → [#895](https://github.com/openai/openai-dotnet/issues/895) (from Files #894) | ||
| - `ExpiresAfter` parameter → [#896](https://github.com/openai/openai-dotnet/issues/896) (from Files #894) | ||
|
|
||
| --- | ||
|
|
||
| ## 8. `[Experimental]` Attribute for New Features | ||
|
|
||
| New public types and properties that are not yet stable should be marked with `[Experimental]` in the custom C# code. This was done during the Moderations ingestion (#888). | ||
|
|
||
| --- | ||
|
|
||
| ## 9. Test Fixes After Ingestion | ||
|
|
||
| Expect test updates after spec ingestion: | ||
| - **Session records** may need regeneration if API shapes changed | ||
| - **Assertion changes** for renamed/retyped properties | ||
| - **New test coverage** for new features (can be deferred) | ||
|
|
||
| --- | ||
|
|
||
| ## 10. API Export After Ingestion | ||
|
|
||
| Always run `./scripts/Export-Api.ps1` after successful code generation to update the API surface files (`api/OpenAI.net8.0.cs`, etc.). These files are used for API compatibility checks and should be committed as part of the PR. |


Should we define what "area(s)" means in this context? For example, a specific API capability or client-specific capability such as "Responses" or "Realtime".

The process starts from SKILL.md, which is the entry point that Copilot loads first. It already contains an Available Areas section listing all the areas (e.g., `audio`, `containers`, etc.), and file-locations.md has a full mapping table showing how each area maps to its base spec folder, client TSP file, and C# custom code folder. By the time the agent reaches the checklist, it already has all that context from SKILL.md and the linked documents. The checklist is intentionally lightweight; it's a tracking tool, not meant to be read standalone. I kept the definitions in SKILL.md and file-locations.md to avoid duplicating info across files.