feat(agent-markdown): Add navigation to Markdown exports #9
Append generic child-page navigation to generated Markdown pages and expose a switch for sites that need to own their Markdown index content. Also improve on-demand Markdown negotiation for explicit format requests, plain text clients, and common AI-agent user agents. Co-Authored-By: Codex <noreply@openai.com>
Semver impact of this PR: 🟡 Minor (new features)
Fallback order of 99 would incorrectly rank explicit sidebar.order values >= 100 after unordered pages. Use Number.MAX_SAFE_INTEGER to match Starlight's unordered-last semantics. Also sort ties by stable page id instead of display title to match path-based ordering used by Starlight. Co-authored-by: David Cramer <dcramer@gmail.com>
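The ordering rule in this commit can be sketched as a comparator. This is only an illustration: the `NavPage` shape and the `comparePages` name are assumptions, not the plugin's actual types.

```typescript
// Hypothetical page shape; the real plugin's types may differ.
interface NavPage {
  id: string; // stable, path-based page id
  title: string; // display title (deliberately not used for sorting)
  order?: number; // explicit sidebar.order, if the page sets one
}

// Pages without an explicit order fall back to the largest safe integer,
// so any explicit order (including values >= 100) sorts before them.
const UNORDERED = Number.MAX_SAFE_INTEGER;

function comparePages(a: NavPage, b: NavPage): number {
  const orderA = a.order ?? UNORDERED;
  const orderB = b.order ?? UNORDERED;
  if (orderA !== orderB) return orderA < orderB ? -1 : 1;
  // Ties are broken by the stable id rather than the display title.
  if (a.id === b.id) return 0;
  return a.id < b.id ? -1 : 1;
}

const examplePages: NavPage[] = [
  { id: "guides/zeta", title: "Alpha" },
  { id: "guides/late", title: "Late", order: 150 },
  { id: "guides/alpha", title: "Zeta" },
  { id: "guides/first", title: "First", order: 1 },
];

const sortedIds = examplePages.slice().sort(comparePages).map((p) => p.id);
```

With a fallback of 99, the page with `order: 150` would have landed after the two unordered pages; with the fallback above it stays ahead of them.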
When the middleware rewrites to .md based on a matched AI user-agent, the response must include Vary: User-Agent so caches don't serve Markdown to regular browsers at the same URL. ?format=md and Accept-header triggered rewrites are unaffected — the former is URL-keyed, the latter already covered by Vary: Accept on the .md response. Also renamed wantsMarkdown -> acceptsMarkdown to reflect that it now only checks the Accept header, not the user-agent. Co-authored-by: David Cramer <dcramer@gmail.com>
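As a rough sketch of the header handling described in this commit, the helper below adds a field to a `Vary` value without duplicating it. It works on plain strings rather than a `Headers` object, and the `appendVary` name is an assumption for illustration only.

```typescript
// Hypothetical helper: returns the Vary value with `field` included once.
function appendVary(existing: string | null, field: string): string {
  const fields = (existing ?? "")
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part !== "");

  // "Vary: *" already means the response varies on everything.
  if (fields.indexOf("*") !== -1) return "*";

  // Header field names are case-insensitive, so compare in lower case.
  const alreadyListed = fields.some(
    (part) => part.toLowerCase() === field.toLowerCase(),
  );
  return alreadyListed ? fields.join(", ") : fields.concat(field).join(", ");
}
```

A middleware would call something like this only when the rewrite was triggered by the user agent, leaving `?format=md` and `Accept`-triggered responses untouched.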
    context,
    page,
    await getMarkdownPages(),
    siteBase,
    sidebar,
Bug: The navigation tree is rebuilt for every page during static builds, resulting in O(n²) time complexity that will significantly slow down build times on large sites.
Severity: MEDIUM
Suggested Fix
Cache the result of buildMarkdownNavigationTree. The navigation tree should be built only once per build process and then reused for rendering each page's navigation, rather than being reconstructed every time.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: src/agent-markdown/utils.ts#L142-L146
Potential issue: During a static site build, the navigation tree is rebuilt from scratch
for every page being rendered. The `buildMarkdownNavigation` function is called for each
of the `n` pages, and inside it, `buildMarkdownNavigationTree` iterates through all `n`
pages to construct the navigation tree. This results in an O(n²) time complexity, where
`n` is the total number of pages. While the list of pages from `getMarkdownPages()` is
cached, the expensive tree-building operation is not. This will cause build times to
grow quadratically, leading to significant performance degradation on sites with a large
number of pages.
buildMarkdownNavigationTree was called once per page, making static build O(n²) in the number of docs pages. The tree inputs (pages, sidebar config, base) are constant per build, so cache the result in production mode using the same pattern as markdownPagesPromise. Dev and test modes rebuild the tree on every call so live-reload and isolated test cases stay correct. Co-authored-by: David Cramer <dcramer@gmail.com>
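The caching approach in this commit might look roughly like the sketch below. The `Page` and `NavNode` shapes, the simplified `buildTree`, and the `createNavigationTreeGetter` factory are all assumptions; the real code reuses the `markdownPagesPromise` pattern, which is not shown in this conversation.

```typescript
interface Page { id: string }
interface NavNode { id: string; children: NavNode[] }

// Simplified stand-in for the real tree builder: nests pages by path prefix.
function buildTree(pages: Page[]): NavNode[] {
  const byId = new Map<string, NavNode>();
  const roots: NavNode[] = [];
  // Sorting ids puts a parent ("guides") before its children ("guides/a").
  const ordered = pages.map((p) => p.id).sort();
  for (const id of ordered) {
    const node: NavNode = { id, children: [] };
    byId.set(id, node);
    const slash = id.lastIndexOf("/");
    const parent = slash === -1 ? undefined : byId.get(id.slice(0, slash));
    if (parent) parent.children.push(node);
    else roots.push(node);
  }
  return roots;
}

// Hypothetical factory: in production the tree is built once and the same
// promise is reused; in dev/test it is rebuilt on every call.
function createNavigationTreeGetter(
  loadPages: () => Promise<Page[]>,
  isProduction: boolean,
): () => Promise<NavNode[]> {
  let cached: Promise<NavNode[]> | undefined;
  return function getTree(): Promise<NavNode[]> {
    if (!isProduction) return loadPages().then(buildTree);
    if (cached === undefined) cached = loadPages().then(buildTree);
    return cached;
  };
}
```

Caching the promise rather than the resolved value means concurrent page renders share one in-flight build instead of each starting their own.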
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit e4aa7d2.
acceptsMarkdown triggered a rewrite whenever any markdown-ish type had q > 0, without comparing it against text/html's quality. A request like Accept: text/html, text/plain;q=0.8 would receive Markdown even though the client preferred HTML. Fix: parse all Accept entries, compare the highest markdown quality against text/html's quality, and only rewrite when markdown wins. Equal quality defers to HTML since that is the native format for these URLs. Co-authored-by: David Cramer <dcramer@gmail.com>
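A minimal sketch of the comparison described in this fix follows. The list of Markdown-ish media types is an assumption, and wildcard ranges such as `*/*` or `text/*` are deliberately ignored here; the real `acceptsMarkdown` may treat both differently.

```typescript
// Assumed set of "markdown-ish" types; the plugin's actual list may differ.
const MARKDOWN_TYPES = ["text/markdown", "text/x-markdown", "text/plain"];

// Parse an Accept header into a map of media type -> quality (0..1).
function parseAccept(header: string): Map<string, number> {
  const qualities = new Map<string, number>();
  for (const entry of header.split(",")) {
    const parts = entry.split(";");
    const type = (parts[0] ?? "").trim().toLowerCase();
    if (type === "") continue;
    let q = 1; // a missing q parameter means quality 1
    for (const param of parts.slice(1)) {
      const eq = param.indexOf("=");
      if (eq === -1) continue;
      if (param.slice(0, eq).trim().toLowerCase() !== "q") continue;
      const parsed = Number.parseFloat(param.slice(eq + 1));
      q = Number.isNaN(parsed) ? 0 : Math.min(Math.max(parsed, 0), 1);
    }
    qualities.set(type, Math.max(q, qualities.get(type) ?? 0));
  }
  return qualities;
}

// Rewrite only when the best Markdown quality strictly beats text/html.
function acceptsMarkdown(header: string | null): boolean {
  if (!header) return false;
  const qualities = parseAccept(header);
  let markdownQ = 0;
  for (const type of MARKDOWN_TYPES) {
    markdownQ = Math.max(markdownQ, qualities.get(type) ?? 0);
  }
  const htmlQ = qualities.get("text/html") ?? 0;
  return markdownQ > 0 && markdownQ > htmlQ;
}
```

The strict `>` is what makes equal quality defer to HTML, as the commit message describes.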
…ring
getSidebarNavData returned undefined for { autogenerate: { directory } }
items, so pages in those directories had no sidebar-derived order. When
an autogenerate block appeared between explicit sidebar entries, pages
listed after it also got incorrect relative order.
Fix:
- Pre-scan explicit IDs so autogenerate blocks don't claim pages that
appear elsewhere in the sidebar with an explicit slug or link.
- Expand each autogenerate item by finding all pages whose IDs start
with the directory prefix, sorting them alphabetically (matching
Starlight's default autogenerate sort), and assigning sequential
order values at the position the block occupies in the sidebar.
- Add SidebarNavData.auto flag so autogenerate-derived order ranks
below per-page frontmatter overrides (sidebar.order / sidebar_order)
but above unlisted pages.
Co-authored-by: David Cramer <dcramer@gmail.com>
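The three steps in this fix can be sketched as follows. The `SidebarItem` union is a simplified assumption (real Starlight sidebar items have more variants), frontmatter overrides are left out, and the sketch assumes a non-empty `directory`.

```typescript
// Simplified, hypothetical sidebar item shape.
type SidebarItem =
  | { slug: string }
  | { autogenerate: { directory: string } };

interface SidebarNavData {
  order: number;
  auto: boolean; // true when the order came from an autogenerate block
}

function getSidebarNavData(
  sidebar: SidebarItem[],
  pageIds: string[],
): Map<string, SidebarNavData> {
  // Step 1: pre-scan explicit ids so autogenerate blocks cannot claim them.
  const explicit = new Set<string>();
  for (const item of sidebar) {
    if ("slug" in item) explicit.add(item.slug);
  }

  const result = new Map<string, SidebarNavData>();
  let order = 0;
  for (const item of sidebar) {
    if ("slug" in item) {
      result.set(item.slug, { order: order++, auto: false });
      continue;
    }
    // Step 2: expand the block at its position, sorted alphabetically.
    const prefix = item.autogenerate.directory.replace(/^\/+|\/+$/g, "") + "/";
    const matches = pageIds
      .filter((id) => id.startsWith(prefix) && !explicit.has(id) && !result.has(id))
      .sort();
    for (const id of matches) {
      // Step 3: flag the entry so frontmatter overrides can outrank it.
      result.set(id, { order: order++, auto: true });
    }
  }
  return result;
}
```

Pages that never appear in the returned map are the "unlisted" pages that rank last.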
When autogenerate directory normalizes to an empty string (e.g. "/"), the filter condition fell through to false and silently skipped all pages. When the prefix is empty, match root-level pages — those whose IDs contain no slash — instead of matching nothing. Co-authored-by: David Cramer <dcramer@gmail.com>
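The empty-prefix case might be handled along these lines. Both function names are hypothetical, and whether a directory's own index page should match is left out of this sketch.

```typescript
// Strip leading and trailing slashes: "/" becomes "", "guides/" becomes "guides".
function normalizeDirectory(directory: string): string {
  return directory.replace(/^\/+|\/+$/g, "");
}

// Hypothetical predicate for "does this page belong to the autogenerate block?"
function isInAutogenerateDirectory(pageId: string, directory: string): boolean {
  const prefix = normalizeDirectory(directory);
  if (prefix === "") {
    // Empty prefix: match root-level pages, i.e. ids with no slash.
    return pageId.indexOf("/") === -1;
  }
  // Require the slash so "guides" does not match "guides-old/...".
  return pageId.startsWith(prefix + "/");
}
```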
    if (uaTriggered) {
      response.headers.append("Vary", "User-Agent");
    }
Bug: When a rewrite is triggered by a User-Agent, the middleware incorrectly appends Vary: User-Agent to the response, resulting in a semantically incorrect Vary: Accept, User-Agent header.
Severity: MEDIUM
Suggested Fix
The middleware should ensure the final Vary header is correct for UA-triggered rewrites. Instead of unconditionally appending Vary: User-Agent, it should replace the existing Vary header or reconstruct the response headers to only include Vary: User-Agent.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: src/agent-markdown/middleware.ts#L35-L37
Potential issue: For requests triggered by an AI agent's User-Agent, the middleware
rewrites the request to a markdown endpoint. This endpoint's response includes the
`Vary: Accept` header. The middleware then appends `Vary: User-Agent` to this response.
The resulting `Vary: Accept, User-Agent` header is semantically incorrect because the
content only varies by `User-Agent`, not `Accept` in this scenario. This can lead to
inefficient CDN caching or, in some cases, serving incorrect content from the cache. The
unit tests do not detect this because they use a mock rewrite function that doesn't set
the initial `Vary: Accept` header.
    function isAIOrDevTool(userAgent: string) {
      return /claude|anthropic|gptbot|chatgpt|openai|cursor|codex|copilot|perplexity|cohere|gemini/i.test(
        userAgent,
      );
Bug: The User-Agent detection regex is too broad, using simple substring matches for terms like cursor and gemini, which can cause false positives by matching legitimate developer tools.
Severity: MEDIUM
Suggested Fix
Make the User-Agent matching regex stricter to avoid false positives. Consider using word boundaries (e.g., \bcursor\b), more specific product names, or patterns that include version numbers to ensure only intended AI agents are matched.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: src/agent-markdown/middleware.ts#L70-L73
Potential issue: The regular expression used to detect AI agent User-Agents includes
broad, case-insensitive substring matches for terms like `cursor`, `codex`, and
`gemini`. These terms are also used in the names of legitimate development tools that
are not AI crawlers. If such a tool makes a request with a User-Agent containing one of
these keywords, it will be incorrectly served a markdown response instead of the
expected HTML, which could break the tool's functionality. The regex lacks safeguards
like word boundaries or more specific patterns to prevent these false positives.
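One possible shape for the stricter matching suggested above is sketched here. The token list is copied from the regex under review; the optional `bot` suffix is an assumption added because a plain `\b...\b` pair would stop matching user agents where the keyword is fused with `Bot` (for example a product token like `ClaudeBot`). The sample strings in the usage lines are illustrative, not captured traffic.

```typescript
// Tokens taken from the regex under review.
const AGENT_TOKENS = [
  "claude", "anthropic", "gptbot", "chatgpt", "openai", "cursor",
  "codex", "copilot", "perplexity", "cohere", "gemini",
];

// Require a word boundary on both sides of the token, allowing an
// optional "bot" suffix so "<name>Bot" product tokens still match.
const AGENT_PATTERN = new RegExp(
  `\\b(?:${AGENT_TOKENS.join("|")})(?:bot)?\\b`,
  "i",
);

function isAIOrDevTool(userAgent: string): boolean {
  return AGENT_PATTERN.test(userAgent);
}

const matchesFusedBot = isAIOrDevTool("ClaudeBot/1.0");
const matchesSubstring = isAIOrDevTool("precursor-tool/2.0");
```

Here `matchesFusedBot` is `true` and `matchesSubstring` is `false`, whereas the original substring regex would match both. Word boundaries still treat `-` and `/` as separators, so a tool named `cursor-sync/1.0` would continue to match; ruling that out would need more specific product patterns.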

Add generic navigation to the Markdown routes generated by sentryAgentMarkdown().

The plugin now appends a `Pages in this section` list from the docs collection and Starlight sidebar/frontmatter metadata. This keeps the package reusable for normal Starlight docs and leaves Sentry-docs-specific platform/framework navigation out of the shared theme.

This also broadens optional content negotiation so SSR deployments can serve Markdown for ?format=md, Accept: text/plain, and common AI-agent user agents. The README now records current known consumers so future changes can be checked against them.

Validation completed locally:
- pnpm run test:unit
- pnpm run typecheck
- pnpm lint
- pnpm playground:build

Local validation completed with the existing Node engine warning because this shell is on Node v20.11.1 and the package expects >=22.12.0.
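For readers unfamiliar with the feature, the appended section could be rendered along the lines of the sketch below. The `ChildPage` shape, the `renderSectionNav` name, and the heading level are assumptions; only the `Pages in this section` label comes from the description above.

```typescript
// Hypothetical child-page shape.
interface ChildPage {
  title: string;
  href: string; // link to the child's Markdown route
}

// Render a Markdown list of child pages, or nothing when there are none.
function renderSectionNav(children: ChildPage[]): string {
  if (children.length === 0) return "";
  const items = children.map((child) => `- [${child.title}](${child.href})`);
  return ["## Pages in this section", ""].concat(items).join("\n") + "\n";
}

const exampleNav = renderSectionNav([
  { title: "Getting Started", href: "/guides/getting-started.md" },
  { title: "Configuration", href: "/guides/configuration.md" },
]);
```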