53 changes: 53 additions & 0 deletions .github/skills/Spec-ingestion/SKILL.md
@@ -0,0 +1,53 @@
---
name: spec-ingestion
description: Guide for ingesting the latest OpenAI TypeSpec specification into the openai-dotnet SDK. Use this when asked to update or ingest OpenAI API specs, copy base TypeSpec files from upstream, fix client TSP compile errors, or run code generation for new API areas.
---

# Spec Ingestion

## Overview

This skill describes how to ingest the latest OpenAI TypeSpec specification (from the upstream [`microsoft/openai-openapi-pr`](https://github.com/microsoft/openai-openapi-pr) repository) into the `openai-dotnet` SDK, area by area.

The process involves:
1. Copying updated base specs from upstream (exact copy, no modifications)
2. Reporting any compile errors in the base TSP (do NOT fix — base spec must stay unmodified)
3. Fixing compile errors in the client TSP layer
4. Preserving custom C# code (renames, stubs)
5. Running code generation
6. Verifying the output

## Skill Documents

This skill is split across multiple files for easier navigation:

| Document | Description |
|----------|-------------|
| [steps.md](steps.md) | **Step-by-step process** — the full 9-step workflow from copying spec to post-generation verification |
| [file-locations.md](file-locations.md) | **Key file locations** — quick reference for all upstream and local paths, area mappings |
| [patterns-and-gotchas.md](patterns-and-gotchas.md) | **Common patterns & gotchas** — lessons learned, pitfalls, and conventions to follow |
| [checklist.md](checklist.md) | **Checklist** — a task-by-task checklist for tracking progress during an ingestion |
| [references.md](references.md) | **Reference PRs** — detailed notes on past ingestion PRs with lessons learned |

## Quick Start

1. Review [references.md](references.md) for examples of past ingestions in your area
2. Read [file-locations.md](file-locations.md) to understand the repo layout
3. Follow [steps.md](steps.md) for the full ingestion workflow
4. Use [checklist.md](checklist.md) to track your progress
5. Consult [patterns-and-gotchas.md](patterns-and-gotchas.md) when you hit issues

## Available Areas

Areas that can be ingested independently:

`administration` · `assistants` · `audio` · `batch` · `chat` · `containers` · `conversations` · `embeddings` · `evals` · `files` · `fine-tuning` · `graders` · `images` · `models` · `moderations` · `realtime` · `responses` · `runs` · `threads` · `vector-stores` · `videos`

## Key Rules

1. **Always add `@@clientLocation`** for every operation in the client TSP (the latest spec no longer uses `interface` blocks)
2. **NEVER modify the base spec** — it must be an exact copy of upstream. Handle all issues (type unions, suppressions, etc.) in `specification/client/` instead
3. **Update `[CodeGenType]` stubs** in `src/Custom/{Area}/Internal/GeneratorStubs.cs` for any renamed types
4. **Defer complex features** — suggest them as follow-up items rather than implementing in the same ingestion
5. **Run `./scripts/Invoke-CodeGen.ps1`** to generate code, then `dotnet build` to verify
6. **Work locally only** — do NOT create PRs or file issues. Instead, suggest a list of issues that may need to be filed upstream
71 changes: 71 additions & 0 deletions .github/skills/Spec-ingestion/checklist.md
@@ -0,0 +1,71 @@
# Spec Ingestion Checklist

Use this checklist when performing a spec ingestion for any area.

## Pre-Ingestion

- [ ] Identify the target area(s) to ingest
Collaborator

Should we define what "area(s)" means in this context? For example, a specific API capability or a client-specific capability such as "Responses" or "Realtime".

Collaborator Author

The process starts from SKILL.md, which is the entry point that Copilot loads first. It already contains an Available Areas section listing all the areas (e.g., audio, containers, etc.), and file-locations.md has a full mapping table showing how each area maps to its base spec folder, client TSP file, and C# custom code folder.

By the time the agent reaches the checklist, it already has all that context from SKILL.md and the linked documents. The checklist is intentionally lightweight: it's a tracking tool, not meant to be read standalone. I kept the definitions in SKILL.md and file-locations.md to avoid duplicating info across files.

- [ ] **Review reference PRs** in [references.md](references.md) for the area or similar areas — note visitor changes, deferred features, and custom C# patterns
- [ ] Pull latest from upstream `microsoft/openai-openapi-pr` (branch: `main`)
- [ ] Pull latest from `openai/openai-dotnet` (branch: `main`)

## Base Spec Update

- [ ] Sparse checkout upstream repo including **both** `{area}` and `common` folders
- [ ] Copy latest base spec from upstream `packages/openai-typespec/src/{area}/` to `specification/base/typespec/{area}/` — **exact copy, no modifications**
- [ ] **Keep the temporary sparse checkout** — don't delete it yet. If `./scripts/Invoke-CodeGen.ps1` fails with missing types from `common/`, you'll need to look them up in the clone's `src/common/` folder and copy the specific type definition into the local `specification/base/typespec/common/` file (do NOT copy the entire file or folder)
- [ ] Delete the temporary sparse checkout after `./scripts/Invoke-CodeGen.ps1` succeeds and no more upstream files are needed

## Client TSP Update

- [ ] Extract operation names from new `operations.tsp` (`Select-String -Pattern "^op "`)
- [ ] Fix errors in `specification/client/{area}.client.tsp`
- [ ] Add `@@clientLocation` for **all** operations (no more `interface` blocks)
- [ ] Update `@@clientName` for any renamed operations
- [ ] Update `@@visibility`, `@@alternateType`, `@@usage` — fix these if `./scripts/Invoke-CodeGen.ps1` reports errors referencing these decorators (e.g., a type or property was renamed or removed upstream)
- [ ] Update client models TSP (`specification/client/models/{area}.models.tsp`) if applicable
Collaborator

Should we define the criteria for "if applicable"?

Collaborator Author

A client models file exists only for areas that need discriminated union wrappers or .NET-specific model overrides. The criterion is whether `specification/client/models/{area}.models.tsp` exists. If it does, update it as needed. If it doesn't, skip this step.


## Compile and Generate Code

- [ ] Run `./scripts/Invoke-CodeGen.ps1` (no params) — this handles `npm ci`, build, compile, and code generation in one step
- [ ] Ignore warnings; only **errors** matter
- [ ] If `prohibited-namespace` errors appear, add `[CodeGenType]` stubs — internal types go in `Internal/GeneratorStubs.cs`, public types go in `GeneratorStubs.cs` (see patterns-and-gotchas.md §5)
- [ ] If client TSP fixes are needed, fix and re-run `./scripts/Invoke-CodeGen.ps1`
- [ ] Report any remaining base spec compile errors — **do NOT modify base spec directly**

## Custom C# Code Update

- [ ] Compare old vs. new spec for **renames** → update `[CodeGenType]` stubs in `src/Custom/{Area}/Internal/GeneratorStubs.cs`
- [ ] Update any custom code in `src/Custom/{Area}/` referencing renamed types
- [ ] Add `[CodeGenType]` stubs for new internal types
- [ ] Remove stubs for deleted types

## Documentation

- [ ] List all **new** types, properties, and operations
- [ ] List all **renamed** types/properties (old → new mapping)
- [ ] List all **removed** types, properties, and operations
- [ ] Note any **type unions** that need discriminator treatment (don't modify base spec)
Comment on lines +45 to +48
Collaborator

Where should these be listed or noted?

Collaborator Author

These should be listed in the final summary that the agent presents to the user at the end (the "Final Summary" section at the bottom of the checklist). The agent compiles these lists as it works through the steps and includes them in its output.

At the end, it gives a summary of "What was done" and "What changed" (each illustrated by a screenshot in the original comment).


## Post-Generation Verification

- [ ] Verify generated files: `Get-ChildItem src/Generated/Models/{Area}/ -Name`
- [ ] **Review numeric types** — check if any `long` or `double` properties, parameters, or fields were incorrectly converted to `int`/`float` by the `NumericTypesVisitor`; add exclusions if needed (see patterns-and-gotchas.md §3)
- [ ] Verify build: `dotnet build`
- [ ] Export API surface: `./scripts/Export-Api.ps1`

## Post-Generation Review

- [ ] Diff generated code (`src/Generated/`) — list new, removed, and changed files
- [ ] Diff API surface (`api/`) — identify breaking changes
- [ ] List compile issues in generated code
- [ ] List items needing discriminator patterns
- [ ] Identify features needing follow-up work
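
The first review item (new, removed, and changed files) can be approximated by comparing two snapshots of `src/Generated/`. The following is a minimal Python sketch, not part of the repo's tooling. It assumes you have captured a mapping from relative path to content hash before and after code generation; the function name and the sample paths are hypothetical. In practice `git status` and `git diff` report the same information.

```python
def diff_snapshots(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Classify files as new, removed, or changed between two snapshots.

    Each snapshot maps a relative file path to a content hash (any stable
    string works; how the hash is computed is up to the caller).
    """
    shared = before.keys() & after.keys()
    return {
        "new": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "changed": sorted(path for path in shared if before[path] != after[path]),
    }


if __name__ == "__main__":
    # Hypothetical file names, used only to show the shape of the result.
    old = {"Models/Audio/A.cs": "h1", "Models/Audio/B.cs": "h2"}
    new = {"Models/Audio/A.cs": "h1-modified", "Models/Audio/C.cs": "h3"}
    for kind, paths in diff_snapshots(old, new).items():
        print(kind, paths)
```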

## Final Summary (Local Work Only)

> Do NOT create PRs or file issues. Present a summary for the user to act on.

- [ ] Summarize all changes made locally
- [ ] List suggested upstream issues (spec bugs, missing types, etc.)
- [ ] List suggested follow-up items (deferred features, complex implementations)
85 changes: 85 additions & 0 deletions .github/skills/Spec-ingestion/file-locations.md
@@ -0,0 +1,85 @@
# Key File Locations

Quick reference for all paths involved in spec ingestion.

## Upstream (Source)

| What | Path |
|------|------|
| Base spec (all areas) | `https://github.com/microsoft/openai-openapi-pr/tree/main/packages/openai-typespec/src` |
| Common shared types | `https://github.com/microsoft/openai-openapi-pr/tree/main/packages/openai-typespec/src/common` |
| Area spec (e.g., audio) | `https://github.com/microsoft/openai-openapi-pr/tree/main/packages/openai-typespec/src/{area}` |

## Local Repository — Specification

| What | Path |
|------|------|
| Base spec entry point | `specification/base/typespec/main.tsp` |
| Area base spec | `specification/base/typespec/{area}/` |
| Common shared types | `specification/base/typespec/common/` |
| Common models | `specification/base/typespec/common/models.tsp` |
| Common custom types | `specification/base/typespec/common/custom.tsp` |
| SDK entrypoint | `specification/base/entrypoints/sdk.dotnet/main.tsp` |
| Client customizations | `specification/client/{area}.client.tsp` |
| Client model overrides | `specification/client/models/{area}.models.tsp` |
| Main TSP entry (all imports) | `specification/main.tsp` |
| TSP config | `specification/tspconfig.yaml` |

## Local Repository — C# Source

| What | Path |
|------|------|
| Custom C# code (per area) | `src/Custom/{Area}/` |
| Internal generator stubs | `src/Custom/{Area}/Internal/GeneratorStubs.cs` |
| Generated C# code | `src/Generated/` |
| Generated models (per area) | `src/Generated/Models/{Area}/` |
| Generated client | `src/Generated/{Area}Client.cs` |
| Generated REST client | `src/Generated/{Area}Client.RestClient.cs` |

## Local Repository — Scripts

| What | Path |
|------|------|
| Code generation script | `scripts/Invoke-CodeGen.ps1` |
| API export script | `scripts/Export-Api.ps1` |
| API compatibility test | `scripts/Test-ApiCompatibility.ps1` |
| AOT compatibility test | `scripts/Test-AotCompatibility.ps1` |

## Local Repository — API Surface

| What | Path |
|------|------|
| .NET 8.0 API surface | `api/OpenAI.net8.0.cs` |
| .NET 10.0 API surface | `api/OpenAI.net10.0.cs` |
| .NET Standard 2.0 API surface | `api/OpenAI.netstandard2.0.cs` |

## Local Repository — Codegen Plugin

| What | Path |
|------|------|
| Codegen plugin source | `codegen/generator/src/` |
| Numeric types visitor | `codegen/generator/src/Visitors/NumericTypesVisitor.cs` |

## Available Areas

These areas map between the base spec directories, client TSP files, and C# custom code:

| Area Folder | Client TSP | C# Custom Folder |
|-------------|-----------|-------------------|
| `audio` | `audio.client.tsp` | `Audio` |
| `assistants` | `assistants.client.tsp` | `Assistants` |
| `batch` | `batch.client.tsp` | `Batch` |
| `chat` | `chat.client.tsp` | `Chat` |
| `containers` | `containers.client.tsp` | `Containers` |
| `conversations` | `conversations.client.tsp` | `Conversations` |
| `embeddings` | `embeddings.client.tsp` | `Embeddings` |
| `files` | `files.client.tsp` | `Files` |
| `fine-tuning` | `fine-tuning.client.tsp` | `FineTuning` |
| `graders` | `graders.client.tsp` | `Graders` |
| `images` | `images.client.tsp` | `Images` |
| `models` | `models.client.tsp` | `Models` |
| `moderations` | `moderations.client.tsp` | `Moderations` |
| `realtime` | — | `Realtime` |
| `responses` | `responses.client.tsp` | `Responses` |
| `vector-stores` | `vector-stores.client.tsp` | `VectorStores` |
| `videos` | `videos.client.tsp` | `Videos` |
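
The mapping in this table follows a regular convention: the area folder is kebab-case and the C# folder is the PascalCase form of the same name. A minimal Python sketch of that convention follows, assuming the paths listed in the tables above. The `AreaPaths` type and the helper names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Per the table above, "realtime" has no client TSP file.
AREAS_WITHOUT_CLIENT_TSP = {"realtime"}


@dataclass(frozen=True)
class AreaPaths:
    base_spec_dir: str
    client_tsp: Optional[str]
    custom_dir: str


def to_pascal_case(area: str) -> str:
    """Convert a kebab-case area name such as "fine-tuning" to "FineTuning"."""
    return "".join(part.capitalize() for part in area.split("-"))


def area_paths(area: str) -> AreaPaths:
    """Derive the local paths for an area from its folder name."""
    client_tsp = (
        None
        if area in AREAS_WITHOUT_CLIENT_TSP
        else f"specification/client/{area}.client.tsp"
    )
    return AreaPaths(
        base_spec_dir=f"specification/base/typespec/{area}/",
        client_tsp=client_tsp,
        custom_dir=f"src/Custom/{to_pascal_case(area)}/",
    )


if __name__ == "__main__":
    print(area_paths("vector-stores"))
```

Treat the table as the source of truth; the sketch only captures the naming pattern it currently follows.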
156 changes: 156 additions & 0 deletions .github/skills/Spec-ingestion/patterns-and-gotchas.md
@@ -0,0 +1,156 @@
# Common Patterns and Gotchas

Lessons learned from previous spec ingestion PRs. Review these before starting a new ingestion.

---

## 1. Operations Without Interfaces Need `@@clientLocation`

The latest upstream spec removes `interface` blocks. Every operation **MUST** have a `@@clientLocation` decorator in the client TSP or it won't be assigned to the correct client class.

```typespec
// OLD pattern (upstream used to have this):
interface Audio {
  createSpeech(...): ...;
  createTranscription(...): ...;
}

// NEW pattern (operations are standalone, so you need):
@@clientLocation(createSpeech, "Audio");
@@clientLocation(createTranscription, "Audio");
```
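
One way to avoid missing an operation is to generate the decorator lines from the operation names. The following is a minimal Python sketch, not part of the repo's tooling. It assumes each standalone operation is declared at the start of a line as `op <name>`, mirroring the `^op ` pattern used in the checklist; the helper names and the sample input are hypothetical.

```python
import re

# Assumption: standalone operations start at column 0 with "op <name>".
OP_PATTERN = re.compile(r"^op\s+([A-Za-z_][A-Za-z0-9_]*)", re.MULTILINE)


def extract_operation_names(tsp_source: str) -> list[str]:
    """Return operation names in declaration order."""
    return OP_PATTERN.findall(tsp_source)


def client_location_lines(tsp_source: str, client_name: str) -> list[str]:
    """Build one @@clientLocation line per operation found in the source."""
    return [
        f'@@clientLocation({name}, "{client_name}");'
        for name in extract_operation_names(tsp_source)
    ]


if __name__ == "__main__":
    # Hypothetical excerpt standing in for an upstream operations.tsp file.
    sample = (
        "@post\n"
        "op createSpeech(...): ...;\n"
        "\n"
        "@post\n"
        "op createTranscription(...): ...;\n"
    )
    for line in client_location_lines(sample, "Audio"):
        print(line)
```

Compare the generated lines against the existing client TSP rather than pasting them blindly, since some operations may already have a `@@clientLocation` entry.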

---

## 2. `GeneratorStubs.cs` is the Rename Registry

When types are renamed in the spec, the `[CodeGenType("NewGeneratedName")]` attribute in `src/Custom/{Area}/Internal/GeneratorStubs.cs` maps the new generated name to the existing custom partial class. This is how backward compatibility is maintained without renaming public types.

```csharp
// Maps the generated type "CreateSpeechRequestModel" to the internal partial struct below
[CodeGenType("CreateSpeechRequestModel")] internal readonly partial struct InternalCreateSpeechRequestModel { }
```

If a TypeSpec model is renamed from `FooBar` to `BazQux`, update:
```csharp
// Before:
[CodeGenType("FooBar")] internal partial class InternalFooBar { }
// After:
[CodeGenType("BazQux")] internal partial class InternalFooBar { }
```
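
The key point is that only the attribute argument changes; the C# type name stays the same. A minimal Python sketch of that rewrite follows, assuming stubs use the single-line `[CodeGenType("...")]` form shown above. The function name and the rename map are hypothetical, and the result should still be reviewed by hand.

```python
import re

# Matches the attribute only, e.g. [CodeGenType("FooBar")].
CODEGEN_ATTR = re.compile(r'\[CodeGenType\("([^"]+)"\)\]')


def apply_renames(stub_source: str, renames: dict[str, str]) -> str:
    """Rewrite [CodeGenType("Old")] to [CodeGenType("New")] for each rename.

    The C# type declared after the attribute is left untouched, so existing
    custom code that refers to it keeps compiling.
    """
    def replace(match: re.Match) -> str:
        old_name = match.group(1)
        new_name = renames.get(old_name, old_name)
        return f'[CodeGenType("{new_name}")]'

    return CODEGEN_ATTR.sub(replace, stub_source)


if __name__ == "__main__":
    before = '[CodeGenType("FooBar")] internal partial class InternalFooBar { }'
    print(apply_renames(before, {"FooBar": "BazQux"}))
```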

---

## 3. Numeric Type Conversions

TypeSpec's `integer` type maps to `long` in C# by default. The `NumericTypesVisitor` (at `codegen/generator/src/Visitors/NumericTypesVisitor.cs`) converts:
- `long` → `int`
- `double` → `float`

for all generated properties unless explicitly excluded.

**After code generation, you MUST review the generated numeric properties.** If a property genuinely requires `long` (e.g., byte counts, large IDs) or `double` (high-precision values), add it to the exclusion list in the visitor:

```csharp
private static readonly HashSet<string> _excludedLongProperties = new(StringComparer.OrdinalIgnoreCase)
{
    "OpenAI.{Area}.{TypeName}.{PropertyName}",
};
```

See [PR #935 (VectorStore)](https://github.com/openai/openai-dotnet/pull/935) for an example where this visitor was enhanced to handle fields and methods in addition to properties.
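
The review matters because narrowing `long` to `int` shrinks the representable range to 32 bits. A minimal Python sketch of the range check behind that decision follows; it is an illustration only, not how the visitor itself works, and the helper names and sample values are hypothetical.

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def fits_in_int32(value: int) -> bool:
    """True if the value can be stored in a 32-bit signed integer."""
    return INT32_MIN <= value <= INT32_MAX


def needs_long(sample_values: list[int]) -> bool:
    """True if any plausible value for a property overflows a 32-bit int."""
    return any(not fits_in_int32(value) for value in sample_values)


if __name__ == "__main__":
    three_gib_in_bytes = 3 * 1024**3  # a byte count larger than INT32_MAX
    print(needs_long([4096]))                # a small count fits in int
    print(needs_long([three_gib_in_bytes]))  # a byte count may not
```

If a property can plausibly hold values like the byte count above, it is a candidate for the exclusion list.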

---

## 4. Streaming Responses and Discriminated Unions

Some areas (audio, chat, responses) have streaming variants. The client models TSP typically needs **discriminated union wrappers** for streaming event types:

```typespec
@usage(Usage.output | Usage.json)
@discriminator("type")
model DotNetCreateTranscriptionStreamingResponse {
  type: DotNetCreateTranscriptionStreamingResponseType;
}

union DotNetCreateTranscriptionStreamingResponseType {
  `transcript.text.segment`: "transcript.text.segment",
  `transcript.text.delta`: "transcript.text.delta",
  `transcript.text.done`: "transcript.text.done",
  string
}

model DotNetTranscriptTextSegmentEvent extends DotNetCreateTranscriptionStreamingResponse {
  ...TranscriptTextSegmentEvent;
}
```
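
Conceptually, the wrapper lets a consumer dispatch on the `type` field while still tolerating values it does not know yet (the trailing `string` variant in the union). A minimal Python sketch of that dispatch idea follows, using the three event types from the union above. It is not the generated C# behavior; the function name and the `delta` payload field are hypothetical.

```python
import json

# Discriminator values taken from the union above.
KNOWN_EVENT_TYPES = {
    "transcript.text.segment",
    "transcript.text.delta",
    "transcript.text.done",
}


def classify_event(raw_json: str) -> tuple[str, dict]:
    """Return (kind, payload), where kind is a known type or "unknown".

    Unknown discriminator values are kept rather than rejected, which
    mirrors the open-ended `string` variant in the union.
    """
    payload = json.loads(raw_json)
    event_type = payload.get("type")
    kind = event_type if event_type in KNOWN_EVENT_TYPES else "unknown"
    return kind, payload


if __name__ == "__main__":
    kind, payload = classify_event('{"type": "transcript.text.delta", "delta": "hi"}')
    print(kind, payload)
```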

---

## 5. `prohibited-namespace` Errors Require `[CodeGenType]` Stubs

A `prohibited-namespace` compile error means the generator found a type that doesn't have a corresponding `[CodeGenType]` stub in the custom C# code. This can be triggered by any type — inline unions, new models, new enums, etc. — but **not every new type causes it**. Only fix the specific types named in the error.

**Fix:** Add a `[CodeGenType]` stub for each type named in the error, placing it in the correct location:

- **Internal types** → `src/Custom/{Area}/Internal/GeneratorStubs.cs`
- **Public types** → `src/Custom/{Area}/GeneratorStubs.cs`

Look at existing stubs in the area to determine the right pattern (class vs. struct, readonly, etc.).

**Example — internal stubs** (`src/Custom/{Area}/Internal/GeneratorStubs.cs`):
```csharp
[CodeGenType("ContainerResourceMemoryLimit")] internal readonly partial struct InternalContainerResourceMemoryLimit { }
[CodeGenType("ContainerListResource")] internal partial class InternalContainerListResource { }
```

**Example — public stubs** (`src/Custom/{Area}/GeneratorStubs.cs`):
```csharp
[CodeGenType("ContainerResource")] public partial class ContainerResource { }
[CodeGenType("ContainerCollectionOrder")] public readonly partial struct ContainerCollectionOrder { }
```

**How to identify these:** The compiler error message will name the type exactly. Only add stubs for the types that appear in `prohibited-namespace` errors — do not preemptively stub every new type.

---

## 6. NEVER Modify the Base Spec

> **CRITICAL:** The base spec at `specification/base/typespec/` must be an **exact copy** of the upstream spec from `microsoft/openai-openapi-pr`. Do NOT modify it for any reason — not for type unions, not for import paths, not for namespaces, not for suppression directives.

If there are issues with the base spec:
- **Type unions** that would generate binary data types → handle in `specification/client/models/{area}.models.tsp` using discriminator patterns
- **Any other issues** → resolve in the client TSP layer if possible, or suggest them as upstream issues to be filed (do NOT file issues yourself)

---

## 7. Follow-up PRs for Complex Features

Not everything needs to be done during the spec ingestion. New features that require significant custom C# implementation should be listed as suggested follow-up items for the user to review.

**Examples from past ingestions:**
- Speech streaming events → [#914](https://github.com/openai/openai-dotnet/issues/914) (from Audio #913)
- Diarized transcription → [#916](https://github.com/openai/openai-dotnet/issues/916) (from Audio #913)
- Pagination for `GetFiles` → [#895](https://github.com/openai/openai-dotnet/issues/895) (from Files #894)
- `ExpiresAfter` parameter → [#896](https://github.com/openai/openai-dotnet/issues/896) (from Files #894)

---

## 8. `[Experimental]` Attribute for New Features

New public types and properties that are not yet stable should be marked with `[Experimental]` in the custom C# code. This was done during the Moderations ingestion (#888).

---

## 9. Test Fixes After Ingestion

Expect test updates after spec ingestion:
- **Session records** may need regeneration if API shapes changed
- **Assertion changes** for renamed/retyped properties
- **New test coverage** for new features (can be deferred)

---

## 10. API Export After Ingestion

Always run `./scripts/Export-Api.ps1` after successful code generation to update the API surface files (`api/OpenAI.net8.0.cs`, etc.). These files are used for API compatibility checks and should be committed as part of the PR.