Skip to content

hotfix(ai-diagnosis): correct misclassifications and surface provider error JSON#15241

Open
SiinXu wants to merge 3 commits into
mainfrom
hotfix/ai-diagnosis-clean
Open

hotfix(ai-diagnosis): correct misclassifications and surface provider error JSON#15241
SiinXu wants to merge 3 commits into
mainfrom
hotfix/ai-diagnosis-clean

Conversation

@SiinXu
Copy link
Copy Markdown
Collaborator

@SiinXu SiinXu commented May 21, 2026

What this PR does

Before this PR:
The AI diagnosis feature gave misleading results in several common scenarios:

  • A 429 from high-frequency requests was diagnosed as "account quota exhausted, please recharge" (it's actually a rate limit, not a billing issue).
  • A 403 from a geo-restricted region (OpenAI's unsupported_country_region_territory, Anthropic geo-blocks) was diagnosed as "API key invalid, please reconfigure" — sending users to re-check a key that was fine.
  • MCP / OCR timeouts were stolen by the generic network rule and routed to /settings/general instead of the relevant feature settings.
  • The AI second layer ignored the provider's actual error JSON (responseBody, data) and Gemini/Anthropic's structured finishReason (SAFETY / RECITATION), so the AI had to guess from the wrapper message.

After this PR:

  • Rate limit is a separate category from quota; the order of keyword checks ensures 429 + insufficient_balance still routes to quota.
  • HTTP 402 Payment Required routes to quota and is mirrored in the AI prompt context, so providers that signal billing failure via 402 get billing-language guidance instead of falling through to unknown.
  • New region category for geo-blocks, with its own i18n text and navigation to general/proxy settings.
  • Specialized rules (MCP, OCR) run before the network rule so feature-specific timeouts are categorized correctly.
  • Narrowed stream, parse, and deprecated keyword matches to remove false positives (e.g. max_tokens must be a valid JSON number no longer flags as a parse error; "parameter is deprecated" no longer triggers "model deprecated").
  • Expanded content-filter detection across OpenAI / Anthropic / Gemini (incl. structured finishReason).
  • The AI prompt now includes responseBody, data, finishReason, and auto-fills provider / modelId from the error itself when the caller did not provide context.

Fixes #

Why we need it and why it was done in this way

The misclassifications were already shipping wrong advice to users (e.g. telling them to top up balance when they just need to slow down requests, or to check their API key when their region is blocked). This is a correctness issue, not an enhancement.

The following tradeoffs were made:

  • The rule classifier file grew (~150 → ~200 lines), but each new branch is justified by a real-world provider error pattern; alternatives like a single regex table would obscure the ordering logic that is core to the fix.
  • The AI prompt now includes 2 additional examples (rate_limit, quota) and slightly more rules. This raises per-call token usage modestly. Diagnosis runs on the CherryAI free model, so the cost impact is effectively zero, and the examples meaningfully steer the model away from the most common wrong answers.
  • Narrowed keywords trade some edge-case coverage for accuracy. The cases that fall through now go to unknown, which still triggers the AI second layer, so user experience does not degrade.

The following alternatives were considered:

  • Adding only the rate_limit split and stopping there — rejected because the same root pattern (overly broad rules and lost context) caused several other misdiagnoses surfaced during the audit.
  • Letting the AI handle all classification with no rule layer — rejected because the rule layer is the offline fallback when the diagnosis AI itself fails, and it is also what we display when AI is still loading.

Links to places where the discussion took place: N/A

Breaking changes

None. The change is purely a correction to error-diagnosis logic. No public APIs, data schemas, or persistence formats change. Persisted diagnosis results in block.metadata continue to render unchanged.

Special notes for your reviewer

  • The new region category navigates to /settings/general (proxy config) rather than /settings/provider, because geo-block fixes typically require a proxy, not a new API key.
  • The cause field on SerializedAiSdkError is intentionally read as a string (it is serialized by serializeError in src/renderer/src/utils/error.ts). The new responseBody / data / finishReason extraction relies on the top-level fields already flattened by the same serializer; no nested status-code extraction is needed.
  • 15 new unit tests cover the new categories (incl. HTTP 402 → quota), narrowed rules, and the new AI-input fields; the existing test suite (9320 tests) continues to pass.

Checklist

Release note

Fix AI error diagnosis to correctly distinguish rate limits from quota exhaustion, geo-region blocks from auth failures, and feature-specific (MCP / OCR) timeouts from generic network errors. The diagnosis pipeline now forwards the provider's structured error fields (responseBody, data, finishReason) to the AI model, and falls back to a rule-based classification message when AI diagnosis itself fails.

Related issues

…ror JSON

The rule layer conflated rate-limit with quota, treated 403 geo-blocks as auth,
and let generic network rules steal MCP/OCR timeouts; the AI layer ignored the
actual provider responseBody/data/finishReason that carry the diagnostic signal,
and the UI showed raw exception text when diagnosis itself failed. This rewrites
the classification rules, pipes the missing fields into the AI prompt, and adds
a rule-based fallback in the UI on diagnosis failure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Siin Xu <31815270+SiinXu@users.noreply.github.com>
… fallback

Tighten this hotfix to only the misclassifications confirmed in real provider
responses. Defensive keywords without a known triggering case ('exceeds the
maximum', 'tls', 'ssl handshake', 'geographic', 'geo block') are removed; the
diagnosis-failed UI fallback (rule-based text + ai_failed i18n key) is reverted
because it is a UX improvement orthogonal to this hotfix and belongs in a
separate PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Siin Xu <31815270+SiinXu@users.noreply.github.com>
Copy link
Copy Markdown
Collaborator

@ousugo ousugo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note

This issue/comment/review was translated by Claude.

This code review primarily found one inconsistency between the PR description and the final code implementation (see inline comments for details). Other than that, the refactoring of rule categorization for newly added geo-detection and rate limiting is very clear, and the unit tests are also very comprehensive.

| 'rate_limit'
| 'context_length'
| 'payload'
| 'network'
Copy link
Copy Markdown
Collaborator

@ousugo ousugo May 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note

This comment was translated by Claude.

[B7] The PR description mentions: "On diagnosis failure, the UI shows the rule-based classification text first, then the raw error below as a small detail, with the retry button kept." However, in the final commit of this branch, this logic and the i18n translation key "ai_failed" were removed in commit "2e1f28c8a". The current code does not include this fallback UI logic, which conflicts with the PR description. I suggest updating the PR description or implementing this feature.


Original Content

[B7] PR 描述中提及:"On diagnosis failure, the UI shows the rule-based classification text first, then the raw error below as a small detail, with the retry button kept."。但在分支的最终提交中,该部分逻辑和 i18n 翻译键 "ai_failed" 在 commit "2e1f28c8a" 被移除了。目前代码中并未包含此兜底 UI 逻辑,与 PR 描述存在冲突,建议更新 PR 描述或补充该功能实现。

@PhoenixCPH
Copy link
Copy Markdown
Contributor

PhoenixCPH commented May 21, 2026

Note

This comment was translated by Claude.

Why are all PRs submitted by AI now?


Original Content

怎么现在都是全AI提的PR了

Copy link
Copy Markdown
Collaborator

@zhibisora zhibisora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note

This review was translated by Claude.

Two review points: complete the quota categorization/diagnostic context/tests for 402 Payment Required, and complete non-core language translations for the newly added diagnostic text.


Original Content

留下两点 review:补齐 402 Payment Required 的 quota 分类/诊断上下文/测试,以及为新增诊断文案补全非核心语言翻译。

}

// Quota / rate limit (429)
// Quota / balance exhausted — check first so "429 + insufficient_balance" routes here, not to rate_limit
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[A1] This split still misses plain HTTP 402 Payment Required, which several providers use for billing/quota failures. If a provider returns status 402 with a generic message, it can fall through to unknown instead of quota. Please route numStatus === 402 to quota, mirror that in ErrorDiagnosisService so the AI prompt uses quota/billing context for 402, and add unit coverage for the classifier and diagnosis hint.

"payload": "Nội dung yêu cầu quá lớn, vui lòng giảm kích thước tệp hoặc văn bản",
"proxy": "Lỗi proxy hoặc chứng chỉ SSL, hãy kiểm tra cài đặt proxy và mạng",
"quota": "Hạn mức tài khoản đã cạn kiệt, vui lòng nạp tiền hoặc chuyển đổi nhà cung cấp",
"rate_limit": "[to be translated]:Too many requests in a short time. Wait a moment and retry, or switch to a model with a higher rate limit",
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[C1] The new rate_limit / region diagnosis strings are user-visible, but the translate files still contain [to be translated]:... English placeholders across the non-core locales. Please provide actual translations for all touched locale files, and include the new error.http.402 translation as well if 402 handling is added.

…ations

Address review feedback from #15241:

- Add `numStatus === 402` to the quota branch in `errorClassifier.ts`. HTTP
  402 Payment Required is the canonical billing-failure status used by
  several providers and gateways. Previously these fell through to `unknown`
  and missed the quota navTarget / context.
- Mirror the 402 check in `ErrorDiagnosisService` so the AI prompt receives
  billing/quota context for 402, not the rate-limit context (which would
  advise the user to "wait and retry" — wrong for a billing failure).
- Add `error.http.402` text to all 12 locale files.
- Translate the previously-placeholder `rate_limit` and `region` strings
  across the 9 non-core locales (de, el, es, fr, ja, pt, ro, ru, vi).
- Add 3 unit tests covering the 402 routing path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Siin Xu <31815270+SiinXu@users.noreply.github.com>
@SiinXu
Copy link
Copy Markdown
Collaborator Author

SiinXu commented May 28, 2026

Thanks for the review — addressed all three points in 1858b4be:

A1 (402 → quota) — done. numStatus === 402 now routes to quota in errorClassifier.ts, and ErrorDiagnosisService.buildContext() mirrors the same check so the AI prompt gets billing/quota language (not the rate-limit "wait and retry" prompt). Added 3 unit tests covering 402 with explicit Payment Required, 402 with a generic message, and verification that the AI prompt picks the quota context branch.

C1 (translations) — done. error.http.402 added to all 12 locales (3 core + 9 translate). The rate_limit and region placeholders across de-de / el-gr / es-es / fr-fr / ja-jp / pt-pt / ro-ro / ru-ru / vi-vn have been replaced with proper translations. pnpm i18n:check passes.

B7 (PR description / ai_failed fallback) — resolved by updating the PR description, not by restoring the code. The deletion in 2e1f28c8 was intentional: the diagnosis-failure rule-based fallback is a UX improvement orthogonal to this hotfix and belongs in a separate PR (the commit message explains this). I've removed the three lines from the PR description that promised behavior the current code doesn't implement (the "On diagnosis failure, the UI shows the rule-based classification text…" bullet, the upstream "no actionable fallback" line, and the IIFE refactor note). The diff now matches what reviewers see in the code.

Also added 402 to the "After this PR" section in the description so it's discoverable from the PR summary.

Ready for re-review when you have a moment. @ousugo @zhibisora

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants