22 changes: 22 additions & 0 deletions README.md
@@ -35,6 +35,16 @@ export default defineConfig({

The plugin installs the shared CSS, applies the monochrome Expressive Code theme, and overrides Starlight's `ThemeSelect` with an empty component because this theme is intentionally dark-only. Project-level `customCss` is kept after the package CSS so consuming docs sites can make small local adjustments.

## Known consumers

These repositories use this theme and help define future compatibility requirements:

- [getsentry/warden](https://github.com/getsentry/warden)
- [getsentry/vitest-evals](https://github.com/getsentry/vitest-evals)
- [getsentry/dotagents](https://github.com/getsentry/dotagents)
- [getsentry/cli](https://github.com/getsentry/cli)
- [getsentry/junior](https://github.com/getsentry/junior)

## Markdown for AI agents

`sentryAgentMarkdown()` generates static `.md` versions of Starlight docs pages for LLM and coding-agent clients:
@@ -44,6 +54,16 @@ The plugin installs the shared CSS, applies the monochrome Expressive Code theme

Markdown responses include YAML metadata (`title`, `description`, and `url`), use `text/markdown; charset=utf-8`, and rewrite internal docs links to `.md` URLs when possible. The exporter uses rendered HTML when Astro makes it available and otherwise falls back to normalized source Markdown/MDX, stripping common JSX wrappers and import/export statements.

Markdown pages also include lightweight navigation sections derived from the docs collection and explicit Starlight sidebar config when present. Pages with visible child pages get a `Pages in this section` list. Draft, hidden, and versioned pages are omitted from these lists.

Disable generated navigation if a site already owns its Markdown index content:

```js
sentryAgentMarkdown({
navigation: false,
});
```

The plugin also adds a copy-to-clipboard Markdown action below Starlight's right-sidebar table of contents. Disable this if a site has its own table-of-contents override:

```js
@@ -62,6 +82,8 @@ sentryAgentMarkdown({

Static deployments cannot vary an already-built HTML page by request headers without platform-level rewrites, so keep content negotiation disabled unless the site runs Astro middleware at request time.

When content negotiation is enabled, the middleware also serves Markdown for `?format=md`, `Accept: text/plain`, and common AI-agent user agents.

This is intentionally lighter weight than Sentry's main docs pipeline, which converts built HTML after the site build. Highly custom MDX components may need site-specific Markdown authoring or a post-build exporter for perfect output.

## Develop
78 changes: 72 additions & 6 deletions src/agent-markdown.ts
@@ -10,6 +10,7 @@ const tableOfContentsComponent = `${packageName}/agent-markdown/TableOfContents`

interface StarlightUserConfig {
components?: Record<string, string | undefined>;
sidebar?: unknown;
}

interface StarlightPlugin {
@@ -46,12 +47,17 @@ export interface SentryAgentMarkdownOptions {
* Add Markdown actions below Starlight's right-sidebar table of contents.
*/
markdownActions?: boolean;
/**
* Append navigation sections to Markdown pages with visible child pages.
*/
navigation?: boolean;
}

export function sentryAgentMarkdown({
markdownRoutes = true,
contentNegotiation = false,
markdownActions = true,
navigation = true,
}: SentryAgentMarkdownOptions = {}): StarlightPlugin {
return {
name: pluginName,
@@ -83,7 +89,12 @@ export function sentryAgentMarkdown({
updateConfig({ components });

addIntegration(
agentMarkdownIntegration({ markdownRoutes, contentNegotiation }),
agentMarkdownIntegration({
markdownRoutes,
contentNegotiation,
navigation,
sidebar: config.sidebar,
}),
);
},
},
@@ -93,9 +104,14 @@ export function sentryAgentMarkdown({
function agentMarkdownIntegration({
markdownRoutes,
contentNegotiation,
navigation,
sidebar,
}: Required<
Pick<SentryAgentMarkdownOptions, "markdownRoutes" | "contentNegotiation">
>): AstroIntegration {
Pick<
SentryAgentMarkdownOptions,
"contentNegotiation" | "markdownRoutes" | "navigation"
>
> & { sidebar: unknown }): AstroIntegration {
return {
name: pluginName,
hooks: {
@@ -113,7 +129,12 @@ function agentMarkdownIntegration({

updateConfig({
vite: {
plugins: [agentMarkdownConfigPlugin(config.base)],
plugins: [
agentMarkdownConfigPlugin(config.base, {
navigation,
sidebar,
}),
],
},
});

@@ -145,7 +166,10 @@ const virtualConfigModuleId =
"virtual:sentry-starlight-theme/agent-markdown/config";
const resolvedVirtualConfigModuleId = `\0${virtualConfigModuleId}`;

function agentMarkdownConfigPlugin(base: string) {
function agentMarkdownConfigPlugin(
base: string,
{ navigation, sidebar }: { navigation: boolean; sidebar: unknown },
) {
return {
name: `${pluginName}/config`,
resolveId(id: string) {
@@ -157,7 +181,11 @@ function agentMarkdownConfigPlugin(base: string) {
},
load(id: string) {
if (id === resolvedVirtualConfigModuleId) {
return `export const base = ${JSON.stringify(normalizeBase(base))};`;
return [
`export const base = ${JSON.stringify(normalizeBase(base))};`,
`export const appendNavigation = ${JSON.stringify(navigation)};`,
`export const sidebar = ${JSON.stringify(normalizeSidebar(sidebar))};`,
].join("\n");
}

return undefined;
@@ -172,3 +200,41 @@ function normalizeBase(base: string) {

return `/${base.replace(/^\/|\/$/g, "")}`;
}

function normalizeSidebar(sidebar: unknown): unknown[] {
return Array.isArray(sidebar)
? sidebar.map(normalizeSidebarItem).filter(Boolean)
: [];
}

function normalizeSidebarItem(item: unknown): unknown {
if (typeof item === "string") {
return item;
}

if (!item || typeof item !== "object" || Array.isArray(item)) {
return undefined;
}

const record = item as Record<string, unknown>;
const normalized: Record<string, unknown> = {};

for (const key of ["label", "link", "slug"]) {
if (typeof record[key] === "string") {
normalized[key] = record[key];
}
}

if (Array.isArray(record.items)) {
normalized.items = record.items.map(normalizeSidebarItem).filter(Boolean);
}

if (record.autogenerate && typeof record.autogenerate === "object") {
const autogenerate = record.autogenerate as Record<string, unknown>;
if (typeof autogenerate.directory === "string") {
normalized.autogenerate = { directory: autogenerate.directory };
}
}

return Object.keys(normalized).length > 0 ? normalized : undefined;
}
58 changes: 48 additions & 10 deletions src/agent-markdown/middleware.ts
@@ -2,37 +2,75 @@ import { defineMiddleware } from "astro:middleware";
import { isIgnoredPath, toMarkdownPath } from "./path-utils";
import { base as siteBase } from "virtual:sentry-starlight-theme/agent-markdown/config";

export const onRequest = defineMiddleware((context, next) => {
export const onRequest = defineMiddleware(async (context, next) => {
const { pathname, search } = context.url;
const forceMarkdown = context.url.searchParams.get("format") === "md";

if (isMarkdownPath(pathname) || isIgnoredPath(pathname, siteBase)) {
return next();
}

if (!wantsMarkdown(context.request.headers)) {
const uaTriggered =
!forceMarkdown &&
isAIOrDevTool(context.request.headers.get("user-agent") ?? "");

if (
!forceMarkdown &&
!uaTriggered &&
!acceptsMarkdown(context.request.headers)
) {
return next();
}

const destination = new URL(context.url);
destination.pathname = toMarkdownPath(pathname, siteBase);
destination.search = search;
destination.searchParams.delete("format");

const response = await context.rewrite(destination);

return context.rewrite(destination);
// When the rewrite was triggered by User-Agent rather than an explicit Accept
// header or format param, add Vary: User-Agent so caches key on the UA and
// don't serve Markdown to regular browsers at the same URL.
if (uaTriggered) {
response.headers.append("Vary", "User-Agent");
}
Comment on lines +35 to +37

Bug: When a rewrite is triggered by a User-Agent, the middleware incorrectly appends `Vary: User-Agent` to the response, resulting in a semantically incorrect `Vary: Accept, User-Agent` header.
Severity: MEDIUM

Suggested Fix

The middleware should ensure the final `Vary` header is correct for UA-triggered rewrites. Instead of unconditionally appending `Vary: User-Agent`, it should replace the existing `Vary` header or reconstruct the response headers to only include `Vary: User-Agent`.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.

Location: src/agent-markdown/middleware.ts#L35-L37

Potential issue: For requests triggered by an AI agent's User-Agent, the middleware
rewrites the request to a markdown endpoint. This endpoint's response includes the
`Vary: Accept` header. The middleware then appends `Vary: User-Agent` to this response.
The resulting `Vary: Accept, User-Agent` header is semantically incorrect because the
content only varies by `User-Agent`, not `Accept` in this scenario. This can lead to
inefficient CDN caching or, in some cases, serving incorrect content from the cache. The
unit tests do not detect this because they use a mock rewrite function that doesn't set
the initial `Vary: Accept` header.
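
One way to make the final `Vary` value explicit is to compute it in one place rather than appending to whatever the rewritten response already carries. The sketch below is a hypothetical helper (`mergeVary` is not part of this PR) that merges field names case-insensitively and never lists a field twice. Whether `Accept` should remain in the list is the design question raised above, and this sketch does not settle it.

```typescript
// Hypothetical helper, not part of the PR: merge Vary field names
// case-insensitively so that no field is listed twice.
function mergeVary(existing: string | null, ...fields: string[]): string {
  const seen = new Set<string>();
  const merged: string[] = [];

  for (const raw of [...(existing ?? "").split(","), ...fields]) {
    const name = raw.trim();
    if (!name) continue; // skip empty segments such as "" or ", ,"

    const key = name.toLowerCase();
    if (seen.has(key)) continue; // field names are case-insensitive

    seen.add(key);
    merged.push(name);
  }

  // "*" means the response varies on everything, so other names add nothing.
  return seen.has("*") ? "*" : merged.join(", ");
}
```

In the middleware this might be applied as `response.headers.set("Vary", mergeVary(response.headers.get("Vary"), "User-Agent"))`. Following the suggested fix literally would instead be `response.headers.set("Vary", "User-Agent")`, which drops `Accept`.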


return response;
});

function wantsMarkdown(headers: Headers) {
function acceptsMarkdown(headers: Headers) {
const accept = headers.get("accept") ?? "";
if (!accept) {
return false;
}

return accept.split(",").some((entry) => {
const entries = accept.split(",").map((entry) => {
const [type = "", ...parameters] = entry.trim().toLowerCase().split(";");
return { type: type.trim(), q: getAcceptQuality(parameters) };
});

const mediaType = type.trim();
if (mediaType !== "text/markdown" && mediaType !== "text/x-markdown") {
return false;
const htmlQuality = entries.find(({ type }) => type === "text/html")?.q ?? 0;
const markdownQuality = entries.reduce((max, { type, q }) => {
if (
type === "text/markdown" ||
type === "text/plain" ||
type === "text/x-markdown"
) {
return Math.max(max, q);
}
return max;
}, 0);

return getAcceptQuality(parameters) > 0;
});
// Only rewrite when a markdown type is explicitly wanted AND outranks text/html.
// Equal quality defers to HTML since that is the native format for these URLs.
return markdownQuality > 0 && markdownQuality > htmlQuality;
}
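
The ranking rule above can be tried in isolation. The following is a minimal standalone sketch, not the PR's code: `prefersMarkdown` and `parseQuality` are hypothetical names, and `parseQuality` stands in for `getAcceptQuality`, whose body this diff does not show. It assumes a missing `q` parameter means 1 and an unparsable one means 0.

```typescript
// Media types treated as "Markdown" by the new code in this diff.
const markdownTypes = new Set(["text/markdown", "text/plain", "text/x-markdown"]);

// Hypothetical stand-in for getAcceptQuality (not shown in the diff).
// Assumption: no q parameter means 1; an unparsable q counts as 0.
function parseQuality(parameters: string[]): number {
  for (const parameter of parameters) {
    const [name = "", value = ""] = parameter.split("=");
    if (name.trim() === "q") {
      const q = Number.parseFloat(value.trim());
      return Number.isFinite(q) ? Math.min(Math.max(q, 0), 1) : 0;
    }
  }
  return 1;
}

// Returns true only when a Markdown type is wanted and strictly outranks text/html.
function prefersMarkdown(accept: string): boolean {
  const entries = accept.split(",").map((entry) => {
    const [type = "", ...parameters] = entry.trim().toLowerCase().split(";");
    return { type: type.trim(), q: parseQuality(parameters) };
  });

  const htmlQuality = entries.find(({ type }) => type === "text/html")?.q ?? 0;
  const markdownQuality = entries.reduce(
    (max, { type, q }) => (markdownTypes.has(type) ? Math.max(max, q) : max),
    0,
  );

  return markdownQuality > 0 && markdownQuality > htmlQuality;
}
```

Under these assumptions, `Accept: text/html, text/markdown` keeps the HTML response because the two types tie, while `Accept: text/html;q=0.8, text/markdown` selects Markdown.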

function isAIOrDevTool(userAgent: string) {
return /claude|anthropic|gptbot|chatgpt|openai|cursor|codex|copilot|perplexity|cohere|gemini/i.test(
userAgent,
);
Comment on lines +70 to +73

Bug: The User-Agent detection regex is too broad, using simple substring matches for terms like `cursor` and `gemini`, which can cause false positives by matching legitimate developer tools.
Severity: MEDIUM

Suggested Fix

Make the User-Agent matching regex stricter to avoid false positives. Consider using word boundaries (e.g., `\bcursor\b`), more specific product names, or patterns that include version numbers to ensure only intended AI agents are matched.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.

Location: src/agent-markdown/middleware.ts#L70-L73

Potential issue: The regular expression used to detect AI agent User-Agents includes
broad, case-insensitive substring matches for terms like `cursor`, `codex`, and
`gemini`. These terms are also used in the names of legitimate development tools that
are not AI crawlers. If such a tool makes a request with a User-Agent containing one of
these keywords, it will be incorrectly served a markdown response instead of the
expected HTML, which could break the tool's functionality. The regex lacks safeguards
like word boundaries or more specific patterns to prevent these false positives.
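
A modest tightening along these lines could require each token to start a word, so it no longer matches inside a longer one. The sketch below is hypothetical (`isAIOrDevToolStrict` and `agentTokens` are illustrative names, not the PR's code) and reuses the token list from the regex in this diff. It only rules out embedded substrings such as `precursor`; it does not help when a non-AI tool is genuinely named after one of these tokens.

```typescript
// Tokens copied from the regex in this diff.
const agentTokens = [
  "claude", "anthropic", "gptbot", "chatgpt", "openai", "cursor",
  "codex", "copilot", "perplexity", "cohere", "gemini",
];

// Hypothetical stricter pattern: a token must sit at the start of the string
// or follow a non-alphanumeric character. Built with new RegExp so the token
// list stays in one place.
const agentPattern = new RegExp(
  `(?:^|[^a-z0-9])(?:${agentTokens.join("|")})`,
  "i",
);

function isAIOrDevToolStrict(userAgent: string): boolean {
  return agentPattern.test(userAgent);
}
```

Only a leading boundary is used here on purpose: a trailing `\b` would reject a value such as `ClaudeBot/1.0`, because `B` is a word character directly after `claude`.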

}

function isMarkdownPath(pathname: string) {