docs(apps): three codegen-quality guidelines for AppKit / smoke tests#67

Merged
keugenek merged 2 commits into main from evgenii.kniazev/skill-codegen-quality on May 7, 2026

@keugenek keugenek commented May 7, 2026

Summary

Consolidates PR #65 and adds four codegen-quality guidelines to the AppKit skill:

  • AppKit API surface: run `npx @databricks/appkit docs` before writing call sites; this avoids invented signatures that fail `tsc --noEmit` during validate
  • TypeScript casts: forbid `as unknown as <T>` double assertions; appkit lint enforces `no-double-type-assertion` (from PR #65, "docs(apps): forbid as unknown as double type assertions")
  • Smoke test selectors: use `getByLabel` (Playwright), not `getByLabelText` (React Testing Library); the wrong API throws a `TypeError` at runtime
  • Smoke test data: keep result sets under the 1 MB analytics-event payload cap; unbounded queries cause `net::ERR_ABORTED` and missing UI elements

Addresses nitpicker review feedback: grouped the smoke-test bullets together, trimmed the AppKit API bullet so it does not duplicate the Frameworks section, replaced the magic LIMIT 500 with a generic LIMIT, added a concrete aggregation example, and listed Zod narrowing first.
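
To make the TypeScript-casts guideline concrete, here is a minimal sketch of narrowing an `unknown` value instead of writing `as unknown as <T>`. The guideline lists Zod narrowing first; this sketch uses a hand-written type guard only to stay dependency-free. The `ChatReply` shape and both function names are hypothetical and do not come from AppKit.

```typescript
// Hypothetical response shape, used only for illustration.
interface ChatReply {
  id: string;
  text: string;
}

// Type guard: verifies the runtime shape so callers get a narrowed type.
function isChatReply(value: unknown): value is ChatReply {
  if (typeof value !== "object" || value === null) return false;
  // One single assertion, used only to read properties off a checked object.
  const record = value as Record<string, unknown>;
  return typeof record.id === "string" && typeof record.text === "string";
}

function parseChatReply(raw: string): ChatReply {
  const parsed: unknown = JSON.parse(raw);
  // Avoided: `JSON.parse(raw) as unknown as ChatReply`, which performs no runtime check.
  if (!isChatReply(parsed)) {
    throw new Error("Unexpected ChatReply shape");
  }
  return parsed; // narrowed to ChatReply by the guard
}
```

The guard fails loudly on malformed input, whereas a double assertion would let the bad value flow into the UI unnoticed.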

Supersedes #65.

Test plan

  • skills.py validate passes
  • Three sub-agent tests confirm guidelines are clear and actionable
  • Review guidelines against actual AppKit docs and Playwright API

This pull request and its description were written by Isaac.

Three additions to the databricks-apps Generic Guidelines, each pinned to
a real failure pattern observed in the apps-mcp-evals nightly:

1. **AppKit API — consult `appkit docs` first.** Mode-D fingerprint:
   tonight's prod nightly (run 456555456546311) had 14 catastrophic
   build flips; 12/14 were cb_* (cookbook) prompts and the failures were
   typecheck errors against AppKit 0.20.3 (`createApp({setup:...})` when
   the real config doesn't accept `setup`, etc.). Tracked in LKB-12465.

2. **Playwright APIs — use only documented ones.** Mode-C smoke fixture:
   `serving_chat` smoke spec called `page.getByLabelText` (React Testing
   Library, not Playwright) → `TypeError: ... is not a function`. The validate
   step fails before any UI assertion runs, so a short guideline is an easy fix.

3. **Smoke query payload size.** `price_prediction_tool` smoke failed with
   `Event exceeds max size of 1048576 bytes` + `ERR_ABORTED` because the
   underlying analytics query returned a multi-MB row dump. The smoke spec
   then cannot find elements that were never rendered. Fix: add LIMIT 500
   or aggregate.
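
The payload-size point in item 3 can be sketched as follows: measure the serialized result against the 1048576-byte cap quoted in the failure signature, and aggregate when a raw row dump would exceed it. The `PriceRow` shape and the helper names are hypothetical, and how AppKit actually serializes analytics events is an assumption here (plain JSON, UTF-8).

```typescript
// Cap quoted in the failure signature: "Event exceeds max size of 1048576 bytes".
const MAX_EVENT_BYTES = 1_048_576;

// Hypothetical row shape, used only for illustration.
interface PriceRow {
  region: string;
  price: number;
}

// UTF-8 size of the JSON-serialized payload (assumed serialization format).
function payloadBytes(rows: unknown): number {
  return new TextEncoder().encode(JSON.stringify(rows)).length;
}

// Collapse raw rows into one average per region, which keeps the payload small.
function averageByRegion(rows: PriceRow[]): { region: string; avgPrice: number }[] {
  const totals = new Map<string, { sum: number; count: number }>();
  for (const row of rows) {
    const entry = totals.get(row.region) ?? { sum: 0, count: 0 };
    entry.sum += row.price;
    entry.count += 1;
    totals.set(row.region, entry);
  }
  return Array.from(totals, ([region, t]) => ({ region, avgPrice: t.sum / t.count }));
}
```

In practice the aggregation or the `LIMIT` belongs in the SQL query itself; the client-side version above only illustrates why the aggregated result stays under the cap.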

Each guideline is short and quotes the exact failure signature so the
agent can pattern-match in future generations.
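
As an illustration of that kind of pattern matching, here is a minimal sketch of a check that scans a smoke spec's source for Testing Library query names that are not Playwright locator methods. The helper and its types are hypothetical and not part of the skill or of AppKit; the mapping covers only the two names listed.

```typescript
// Testing Library query names that are not Playwright locator methods,
// mapped to the closest Playwright locator.
const RTL_ONLY_LOCATORS: Record<string, string> = {
  getByLabelText: "getByLabel",
  getByPlaceholderText: "getByPlaceholder",
};

interface SelectorFinding {
  line: number; // 1-based line number in the spec source
  found: string;
  suggestion: string;
}

// Hypothetical helper: reports each call to a non-Playwright locator name.
function findNonPlaywrightLocators(specSource: string): SelectorFinding[] {
  const findings: SelectorFinding[] = [];
  specSource.split("\n").forEach((text, index) => {
    for (const found of Object.keys(RTL_ONLY_LOCATORS)) {
      // Match a method call such as `.getByLabelText(`.
      if (new RegExp("\\." + found + "\\s*\\(").test(text)) {
        findings.push({ line: index + 1, found, suggestion: RTL_ONLY_LOCATORS[found] });
      }
    }
  });
  return findings;
}
```

Running such a check over a generated spec before the validate step would surface the `getByLabelText` mistake as a readable finding instead of a runtime `TypeError`.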

Co-authored-by: Isaac
@keugenek keugenek requested a review from a team as a code owner May 7, 2026 14:09
Comment thread skills/databricks-apps/SKILL.md Outdated
…tpicks

Subsumes PR #65 (TypeScript casts). Reorders smoke-test bullets together,
trims AppKit API bullet to avoid duplicating the Frameworks section,
and tightens wording per nitpicker review.

Co-authored-by: Isaac
@keugenek keugenek merged commit 8d44124 into main May 7, 2026
1 check passed