Replace Playwright spec tests with QUnit-based card testing (CS-10599) by habdelra · Pull Request #4337 · cardstack/boxel

habdelra · 2026-04-06T22:36:55Z

Summary

Replace factory-generated Playwright .spec.ts tests with QUnit .test.gts files that render cards in a real browser DOM
Test files are co-located with card definitions (hello.test.gts next to hello.gts), not in a separate Tests/ folder
Eliminate test artifacts realm — QUnit tests use in-memory browser realms, test instances never leave the browser
Purge "spec" in Playwright context: SpecResultData → TestModuleResultData, specResults → moduleResults, specRef → moduleRef (Catalog Spec unchanged)
Fix realm module resolution bug for dotted filenames (hello.test.gts)

How the QUnit test page works

What the software factory needs

When factory:go runs the implement→test→iterate loop, it needs to execute QUnit tests that live as .test.gts files in the target realm. The test executor (executeTestRunFromRealm) must:

Serve a browser page that has the full Ember/card runtime (for rendering cards)
Load QUnit and the host's test helpers (setupCardTest, renderCard, @ember/test-helpers)
Use the live-test infrastructure (PR #4191) to discover .test.gts files via _mtimes and import them through the realm loader
Collect structured QUnit results and write them to a TestRun card

In mise run dev-all, the Ember dev server at localhost:4200 does serve /tests/ with the QUnit page — but the factory can't rely on that because:

In production, the Ember build has no QUnit (CS-10650). QUnit ships only in test-support.js, which is included in test and development builds but not in the production vendor.js. The host app at app.boxel.ai serves a production build, so no test assets are available at all: no tests/index.html, no test-support.js, no test helper chunks.
The realm-server-hosted Boxel app blocks the route. The realm server serves host assets from dist/ via serve --single, but that SPA fallback catches /tests and serves the app root instead of tests/index.html.

What our Playwright test harness needs

The software-factory's Playwright tests run in a hermetic environment — an isolated realm server on random ports with its own postgres and synapse. There's no Ember dev server running at all. The test harness needs the exact same QUnit page capability, but fully self-contained — no external host app dependency, no network access to anything outside the test process. Additionally, the Ember config meta tag has hardcoded resolvedBaseRealmURL, realmServerURL, etc. from build time that won't match the harness's realm server on its random ports.

The solution: a self-hosted test page server

Rather than depending on a running Ember dev server or the realm-server-hosted Boxel app's routing, executeTestRunFromRealm starts its own minimal HTTP server. This serves both the software factory's factory:go flow and the hermetic Playwright test harness with identical code:

Reads host/dist/tests/index.html at runtime to extract its <script>, <link>, and <meta> tags — including the chunk hashes that change with every Ember build. This is how we get the correct asset references without hardcoding them.
Rewrites asset URLs from root-relative (/assets/vendor.js) to absolute (http://127.0.0.1:<port>/assets/vendor.js) pointing at our server.
Rewrites the Ember config meta tag — replaces resolvedBaseRealmURL, realmServerURL, etc. with the browser-accessible realm proxy URL. This is needed for the hermetic test harness where the realm server is on random ports that don't match the build-time URLs. In production factory:go the URLs already match, but the rewrite is harmless.
Serves all files from host/dist/ — JS chunks, CSS, WASM (SQLite), fonts, images — with correct MIME types. This includes test-support.js (which contains QUnit, @ember/test-helpers, qunit-dom) and all the webpack chunks that contain the test helper code.
Injects QUnit result collection hooks — QUnit.on('testEnd') and QUnit.on('runEnd') callbacks that store structured results on window.__qunitResults, which Playwright reads after QUnit completes.
Passes ?liveTest=true&realmURL=<targetRealm> as URL query params so the host's test-helper.js activates live-test mode, which discovers .test.gts files via _mtimes and imports them through the realm loader.

This approach is fully hermetic — no external host app needed. The only requirement is a built host/dist/ directory. The same code path serves production factory:go, the smoke:test-realm CLI, and the Playwright test harness.
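
The two HTML rewriting steps described above can be sketched roughly as follows. This is an illustration only, not the code in test-run-execution.ts: the function names `absolutizeAssetUrls` and `rewriteEmberConfig` are hypothetical, and the sketch assumes the Ember config meta tag stores URL-encoded JSON with `name` before `content` and with the realm URLs as top-level keys.

```typescript
/** Rewrite root-relative src/href attributes to absolute URLs on `origin`. */
function absolutizeAssetUrls(html: string, origin: string): string {
  const base = origin.replace(/\/$/, "");
  // Match src="/..." or href="/..." but leave protocol-relative "//host/..." alone.
  return html.replace(
    /\b(src|href)="\/(?!\/)([^"]*)"/g,
    (_match, attr: string, rest: string) => `${attr}="${base}/${rest}"`,
  );
}

/**
 * Override selected keys in the Ember config meta tag.
 * Assumption: the content attribute is URL-encoded JSON and a shallow merge
 * is enough; a nested config would need a deeper merge.
 */
function rewriteEmberConfig(
  html: string,
  overrides: Record<string, unknown>,
): string {
  return html.replace(
    /(<meta\s+name="[^"]*\/config\/environment"\s+content=")([^"]*)(")/,
    (_match, open: string, encoded: string, close: string) => {
      const config = JSON.parse(decodeURIComponent(encoded));
      const updated = { ...config, ...overrides };
      return `${open}${encodeURIComponent(JSON.stringify(updated))}${close}`;
    },
  );
}
```

A caller would apply both functions to the contents of host/dist/tests/index.html before serving it, passing the test page server's own origin and the browser-accessible realm URLs.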

Known limitation: production host builds (CS-10650)

The test page server requires a development or test host build. The Ember production build (ember build -prod) strips all test assets (tests/index.html, test-support.js, test helper chunks). This means the software factory cannot run QUnit card tests in a production deployment where the host was built in production mode. This works today because mise run dev-all uses a development build.

This is not just a software factory limitation — it's a deeper live-test limitation. Running card tests in Code Mode within the Boxel app (the end goal of the live-test infrastructure from PR #4191) will face the same problem: the production Boxel app has no QUnit or test helpers available. Solving this for one solves it for both.

Options are tracked in CS-10650.

Dotted filename resolution bug fix

Also fixes a bug in runtime-common/stream.ts where getFileWithFallbacks() checked for any dot in the filename to skip extension fallbacks. This meant hello.test (from hello.test.gts with .gts stripped by the live-test module discovery) was treated as already having an extension (.test), so the function never tried appending .gts to find the actual file. The fix: only skip fallbacks when the path has a known executable extension (.gts, .ts, .js, .gjs). The same fix was applied to realm.ts's fallbackHandle. A separate bug was filed for the same pattern in dependency-tracker.ts and dependency-normalization.ts (CS-10649).
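
To make the difference concrete, here is a minimal sketch of the old and new checks. The extension list comes from the description above; the function bodies, `hasAnyDot`, and `candidatePaths` are illustrative stand-ins, and the real `hasExecutableExtension()` and fallback order in runtime-common may differ.

```typescript
// Extensions named in the PR description; the list in runtime-common may differ.
const EXECUTABLE_EXTENSIONS = [".gts", ".ts", ".js", ".gjs"];

/** Illustrative stand-in for the real hasExecutableExtension(). */
function hasExecutableExtension(path: string): boolean {
  return EXECUTABLE_EXTENSIONS.some((ext) => path.endsWith(ext));
}

/** Old behavior: any dot in the final path segment counted as an extension. */
function hasAnyDot(path: string): boolean {
  const name = path.split("/").pop() ?? "";
  return name.includes(".");
}

/** Candidate paths to try, in order, for a requested module path (assumed order). */
function candidatePaths(requested: string): string[] {
  if (hasExecutableExtension(requested)) {
    return [requested];
  }
  return [requested, ...EXECUTABLE_EXTENSIONS.map((ext) => requested + ext)];
}
```

With the old check, `hasAnyDot("hello.test")` is true, so no fallbacks were tried. With the new check, `hello.test` has no executable extension, so `hello.test.gts` is among the candidates.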

Try it out — Smoke Test

Prerequisites

mise run dev-all running

Run the smoke test

The smoke test simulates the full factory workflow — the LLM implementation phase followed by QUnit-based testing.

Phase 1 — Simulate LLM implementation output. The smoke test creates a realm and writes what the LLM would have produced during the implementation phase:

A HelloCard card definition (hello.gts)
A Catalog Spec card (Spec/hello-card.json) pointing to the HelloCard definition
A passing QUnit test (hello.test.gts) — co-located with the card definition
A deliberately failing QUnit test (hello-fail.test.gts)

Phase 2 — Run QUnit tests via Playwright. The smoke test calls executeTestRunFromRealm, which:

Creates a TestRun card with status: running in the target realm's Test Runs/ folder
Serves a custom QUnit test page that loads the host app's test assets locally
Launches a Playwright browser and navigates to the QUnit page with ?liveTest=true&realmURL=<targetRealmUrl>
QUnit discovers all .test.gts files in the target realm via _mtimes, imports them through the realm loader, and runs any that export runTests()
Test instances are created in browser memory only — no test artifacts realm needed
Collects structured results via QUnit testEnd/runEnd callbacks
Completes the TestRun card with pass/fail results grouped by QUnit module
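
The result collection and grouping steps could look roughly like the sketch below. The event names and data fields (`name`, `suiteName`, `status`, `runtime`, `errors`, `testCounts`) follow QUnit's reporter events; the type names, `installResultHooks`, and `groupByModule` are hypothetical and are not the PR's `parseQunitResults()`.

```typescript
// Subset of the data QUnit passes to testEnd / runEnd listeners.
interface TestEndData {
  name: string;
  suiteName?: string;
  status: "passed" | "failed" | "skipped" | "todo";
  runtime: number;
  errors: { message?: string }[];
}

interface RunEndData {
  status: string;
  runtime: number;
  testCounts: { passed: number; failed: number; skipped: number; todo: number; total: number };
}

interface CollectedResults {
  tests: TestEndData[];
  summary?: RunEndData;
}

// Minimal view of the QUnit global, so the sketch can run against a fake.
interface QUnitLike {
  on(event: string, callback: (data: any) => void): void;
}

/** Store structured results on `target` (window in the browser). */
function installResultHooks(
  qunit: QUnitLike,
  target: { __qunitResults?: CollectedResults },
): void {
  const results: CollectedResults = { tests: [] };
  target.__qunitResults = results;
  qunit.on("testEnd", (data: TestEndData) => {
    results.tests.push(data);
  });
  qunit.on("runEnd", (data: RunEndData) => {
    results.summary = data;
  });
}

/** Group collected tests by QUnit module name, preserving run order. */
function groupByModule(tests: TestEndData[]): Map<string, TestEndData[]> {
  const groups = new Map<string, TestEndData[]>();
  for (const test of tests) {
    const key = test.suiteName || "(no module)";
    const group = groups.get(key) ?? [];
    group.push(test);
    groups.set(key, group);
  }
  return groups;
}
```

In the browser, `installResultHooks(QUnit, window)` would run before QUnit starts; Playwright can then read `window.__qunitResults` once `summary` is set.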

cd packages/software-factory

MATRIX_URL=http://localhost:8008 \
MATRIX_USERNAME=your-username \
MATRIX_PASSWORD=your-password \
pnpm smoke:test-realm -- \
  --target-realm-url http://localhost:4201/your-username/smoke-test-realm/

Note: The realm smoke-test-realm does not need to exist beforehand — the smoke test creates it and populates it. If the realm already exists from a previous run, the new content will be written into the existing realm.

What to expect on the command line:

=== Factory Test Realm Smoke Test (QUnit) ===

Target realm: http://localhost:4201/your-username/smoke-test-realm/

--- Phase 1: Writing LLM implementation output to target realm ---

  ✓ hello.gts
  ✓ Spec/hello-card.json
  ✓ hello.test.gts
  ✓ hello-fail.test.gts

--- Phase 2: Running QUnit tests ---

  TestRun ID:  Test Runs/hello-smoke-1
  Status:      failed

--- Results ---

  TestRun status: ✓ failed (as expected — one test passes, one deliberately fails)

✓ Smoke test passed! QUnit test execution works correctly.

What to expect in the Boxel app:

Navigate to your smoke-test-realm workspace
You'll see the HelloCard definition (hello.gts) with its co-located test (hello.test.gts), the Catalog Spec card (Spec/hello-card), and the sample instance
In Test Runs/ you'll find hello-smoke-1 — the TestRun card produced by the testing phase
The fitted view shows: status badge (failed), sequence number (#1), pass/fail counts, duration
The isolated view shows: full test results grouped by QUnit module, with individual test names, status per test, and failure details
No test artifacts realm is created — all test instances lived in browser memory during execution

Try it out — Full Factory E2E

Prerequisites

mise run dev-all running
A brief card published in the software-factory realm (e.g., http://localhost:4201/software-factory/Wiki/sticky-note)
An OpenRouter API key
Matrix credentials (username/password) that can create realms on the server

Run the factory

cd packages/software-factory

MATRIX_URL=http://localhost:8008/ \
MATRIX_USERNAME=your-username \
MATRIX_PASSWORD=your-password \
OPENROUTER_API_KEY=sk-or-v1-your-key-here \
pnpm factory:go -- \
  --brief-url http://localhost:4201/software-factory/Wiki/sticky-note \
  --target-realm-url http://localhost:4201/your-username/my-test-realm/ \
  --debug

What to expect on the command line

[factory:go] mode=implement brief=http://localhost:4201/software-factory/Wiki/sticky-note
[factory:go] Starting bootstrap + implement flow...
[test-run-execution] Serving QUnit page at http://127.0.0.1:<port> for realm ...
[test-run-execution] QUnit completed in <N>ms: <N> test(s)
[factory-implement] Updated ticket status to done
[factory:go] Implement complete: outcome=tests_passed iterations=<N> toolCalls=<N>

What to expect in the Boxel host app (target realm)

Folder / File	What it is
`Projects/`	A Project card with the brief's objective and success criteria
`Tickets/`	Ticket cards — the active ticket should show status `done`
`Knowledge Articles/`	Context articles derived from the brief
`*.gts`	Card definition file(s) for the implemented card
`*.test.gts`	Co-located QUnit test file(s)
`StickyNote/` (or similar)	Sample card instance(s) with realistic data
`Spec/`	Catalog Spec card(s) linking to the card definition and sample instances
`Test Runs/`	TestRun card(s) with structured pass/fail results grouped by QUnit module

E2E screenshots

TestRun card — all 10 QUnit tests passing (Code Mode, showing the TestRun JSON and rendered card):

Co-located .test.gts file (Code Mode, showing the LLM-generated QUnit tests alongside the card definition). Note: the co-located test encounters a fetch error for @cardstack/host/tests/helpers when viewed in Code Mode — this is a known issue related to the test helpers not being available in the realm-server-hosted Boxel app's production build (CS-10650):

StickyNote card definition and preview (Code Mode, showing the .gts source and rendered card preview):

Linear tickets

CS-10599 — Main ticket
CS-10649 — Follow-up: dependency tracker dotted filename bug
CS-10650 — Follow-up: production host build has no test assets
CS-10651 — Follow-up: surface skipped/todo tests instead of hiding as passed

Test plan

386/386 unit tests pass (pnpm test:node)
25/25 Playwright tests pass (pnpm test:playwright)
ESLint clean (pnpm lint:js)
TypeScript types clean (pnpm lint:types)
Prettier clean (pnpm lint:format)
Realm-server test: dotted filename resolution (hello.test → hello.test.gts)
pnpm smoke:test-realm against live app
Full E2E factory:go with QUnit test generation

🤖 Generated with Claude Code

Overhaul the software factory's testing infrastructure to use QUnit .test.gts files that render cards in a real browser DOM, replacing Playwright .spec.ts files that only did API round-trips.

Key changes:
- Test files are co-located with card definitions (hello.test.gts next to hello.gts), not in a separate Tests/ folder
- Test executor serves a custom QUnit page that loads the host app's test assets and uses the live-test infrastructure (PR 4191) for module discovery
- No test artifacts realm needed: QUnit tests use in-memory browser realms
- Rename SpecResultData -> TestModuleResultData, specResults -> moduleResults (purge "spec" in Playwright context; Catalog Spec unchanged)
- Self-hosted test page server serves host dist assets directly, rewriting Ember config meta tag to point resolvedBaseRealmURL at the actual realm server

Infrastructure:
- test-run-execution.ts: custom QUnit HTML page builder, local HTTP server for host assets, Playwright browser navigation with result collection
- test-run-parsing.ts: parseQunitResults() replaces Playwright JSON parsing
- test-run-types.ts: QunitTestResult, QunitRunSummary, QunitResults types
- realm/test-results.gts: TestModuleResult replaces SpecResult
- fixtures.ts: hostAppUrl on StartedFactoryRealm

Updated: skills, prompts, docs, smoke tests, all unit tests (385/385 pass)

Known issue: .test.gts module imports fail silently in the hermetic Playwright harness (live-test discovers modules but can't import them). The QUnit page infrastructure works end-to-end. Debugging the import chain is next.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The realm server's getFileWithFallbacks() in stream.ts checked if a path contained a dot and skipped extension fallbacks if so. This meant a request for "hello.test" (without .gts) would never find "hello.test.gts", because the dot in "hello.test" was treated as a file extension.

Fix: only skip fallbacks when the path already has a known executable extension (.gts, .ts, .js, .gjs), using hasExecutableExtension() instead of a generic dot check.

Applied the same fix to:
- runtime-common/stream.ts (getFileWithFallbacks)
- runtime-common/realm.ts (fallbackHandle)
- runtime-common/dependency-tracker.ts (hasPathExtension)
- runtime-common/index-runner/dependency-normalization.ts (isExtensionlessPath)

Added realm-server test: GET /hello.test resolves to hello.test.gts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…e changes These operate in a different context and need a separate, more considered fix. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…pec.ts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The test page server only served /assets/* paths. SQLite WASM lives at the dist root (e.g., c29fc2dacfd64764a6ad.wasm) and fonts at various paths. Serve all dist files for any non-root URL request. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Keep only essential log lines (server URL, completion stats). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…esting

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR migrates software-factory card verification from factory-generated Playwright .spec.ts files (and a test-artifacts realm) to QUnit-based .test.gts files that run in a real browser DOM via the host’s live-test infrastructure, with results persisted back to TestRun cards.

Changes:

Replace Playwright spec-based test execution with a Playwright-driven QUnit live-test page that discovers and runs co-located .test.gts files.
Rename TestRun result structures from spec-oriented naming (SpecResultData, specResults, specRef) to module-oriented naming (TestModuleResultData, moduleResults, moduleRef).
Fix dotted-filename resolution by only skipping fallbacks when an executable extension is already present.

Reviewed changes

Copilot reviewed 34 out of 34 changed files in this pull request and generated 3 comments.


File	Description
packages/software-factory/tests/fixtures.ts	Add `hostAppUrl` to realm fixture metadata for QUnit test runs.
packages/software-factory/tests/factory-tool-executor.spec.ts	Remove `testRealmUrl` usage from tool-building in tests.
packages/software-factory/tests/factory-tool-builder.test.ts	Update tool-builder tests for single-realm targeting + new run_tests contract.
packages/software-factory/tests/factory-test-realm.test.ts	Replace Playwright report parsing tests with QUnit results parsing tests + moduleResults renames.
packages/software-factory/tests/factory-test-realm.spec.ts	Update e2e to write/run `.test.gts` files and assert persisted moduleResults.
packages/software-factory/tests/factory-prompt-loader.test.ts	Adjust prompt assertions to reflect updated system prompt content checks.
packages/software-factory/tests/factory-implement.test.ts	Update expectations around derived test realm URL exposure in agent context.
packages/software-factory/tests/factory-agent.test.ts	Adjust system message assertions (now checks for `read_file`).
packages/software-factory/test-fixtures/test-realm-runner/hello.test.gts	Add QUnit test fixture co-located with card definition.
packages/software-factory/src/harness/support-services.ts	Stop rejecting Ember test builds in host dist validation.
packages/software-factory/src/factory-entrypoint.ts	Remove `testRealmUrl` from implement summary output.
packages/software-factory/src/cli/smoke-test-realm.ts	Update smoke test to generate `.test.gts` files and invoke new test runner.
packages/software-factory/scripts/smoke-tests/factory-tools-smoke.ts	Remove `testRealmUrl` from smoke tool config.
packages/software-factory/scripts/lib/test-run-types.ts	Introduce QUnit result types and switch TestRun attributes to moduleResults/moduleRef.
packages/software-factory/scripts/lib/test-run-parsing.ts	Implement `parseQunitResults` and remove Playwright/run-realm-tests parsing logic.
packages/software-factory/scripts/lib/test-run-execution.ts	Replace pull-and-run Playwright specs flow with self-hosted QUnit page + Playwright browser collection.
packages/software-factory/scripts/lib/test-run-cards.ts	Persist `moduleResults` instead of `specResults` in TestRun card lifecycle.
packages/software-factory/scripts/lib/factory-tool-builder.ts	Remove test-realm targeting and update `run_tests` tool to QUnit mode.
packages/software-factory/scripts/lib/factory-test-realm.ts	Re-export new QUnit parsing/types and drop test-artifacts realm helpers.
packages/software-factory/scripts/lib/factory-skill-loader.ts	Switch always-loaded testing reference from Playwright to QUnit.
packages/software-factory/scripts/lib/factory-implement.ts	Update test runner discovery/execution logic for `.test.gts` and new runner options.
packages/software-factory/realm/test-results.gts	Rename SpecResult → TestModuleResult and specResults → moduleResults in TestRun schema/UI.
packages/software-factory/prompts/ticket-test.md	Update agent instruction from Playwright specs to QUnit `.test.gts` files.
packages/software-factory/prompts/ticket-implement.md	Update implementation checklist to produce co-located QUnit tests.
packages/software-factory/prompts/system.md	Update global rule to require `.test.gts` tests.
packages/software-factory/docs/testing-strategy.md	Update testing strategy docs to remove test-artifacts realm and describe QUnit live-test flow.
packages/software-factory/docs/phase-1-plan.md	Update phase plan docs to reflect new QUnit-based execution model.
packages/software-factory/.agents/skills/software-factory-operations/SKILL.md	Update skill docs to describe QUnit test file creation/execution patterns.
packages/software-factory/.agents/skills/boxel-development/references/dev-qunit-testing.md	Add QUnit card testing reference doc for agents.
packages/software-factory/.agents/skills/boxel-development/references/dev-playwright-testing.md	Remove Playwright testing reference doc.
packages/runtime-common/stream.ts	Fix fallback behavior for dotted filenames using `hasExecutableExtension`.
packages/runtime-common/realm.ts	Fix server-side fallback handling for dotted filenames in `fallbackHandle`.
packages/realm-server/tests/cards/hello.test.gts	Add fixture card module to validate dotted filename resolution.
packages/realm-server/tests/card-source-endpoints-test.ts	Add test asserting `/hello.test` resolves to `hello.test.gts`.


packages/software-factory/scripts/lib/test-run-execution.ts

packages/software-factory/scripts/lib/test-run-parsing.ts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…atus
- Validate asset server paths to prevent directory traversal (normalize, reject '..', verify resolved path stays within hostDistDir)
- Poll for QUnit availability instead of relying on window 'load' event to avoid race where QUnit starts before hooks are attached
- Map QUnit skipped/todo to 'passed' instead of 'pending' so they're terminal states that don't confuse resume logic or isComplete checks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
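
The directory traversal guard in the first item can be sketched as below. This is a hypothetical helper, not the PR's actual function: `resolveWithinRoot` and its parameters are names chosen for illustration, and it follows the three steps the commit message lists (normalize, reject '..', verify the resolved path stays inside the root).

```typescript
import * as path from "node:path";

/**
 * Map a request URL path to a file inside rootDir, or return null when the
 * path would escape it.
 */
function resolveWithinRoot(rootDir: string, urlPath: string): string | null {
  let decoded: string;
  try {
    decoded = decodeURIComponent(urlPath);
  } catch {
    return null; // malformed percent-encoding
  }
  // Reject any ".." segment outright, before normalization can hide it.
  if (decoded.split(/[\\/]/).includes("..")) {
    return null;
  }
  const root = path.resolve(rootDir);
  // Strip leading slashes so the request path is joined relative to root.
  const relative = path.normalize(decoded).replace(/^[\\/]+/, "");
  const resolved = path.resolve(root, relative);
  // Final check: the resolved path must be root itself or live beneath it.
  if (resolved !== root && !resolved.startsWith(root + path.sep)) {
    return null;
  }
  return resolved;
}
```

A request handler would return 404 (or 403) whenever this function yields null and otherwise read the file at the returned path.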

github-actions · 2026-04-07T01:27:11Z

Host Test Results

2,120 tests ±0	2,105 ✅ ±0	2h 17m 17s ⏱️ -1m 2s
1 suite ±0	15 💤 ±0
1 file ±0	0 ❌ ±0

Results for commit 18b105e. ± Comparison against base commit ab883ef.


github-actions · 2026-04-07T01:28:28Z

Realm Server Test Results

1 file ±0	1 suite ±0	13m 43s ⏱️ -14s
838 tests +1	838 ✅ +1	0 💤 ±0	0 ❌ ±0
909 runs +1	909 ✅ +1	0 💤 ±0	0 ❌ ±0

Results for commit 18b105e. ± Comparison against base commit ab883ef.


The test fixture directory now includes hello.test.gts, so the directory GET response test needs to expect it in the listing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…esting

live-test.js fetches _mtimes without auth headers, which fails on private realms (401 Unauthorized). Use page.route() to intercept requests to the realm origin and inject the Authorization header at the network level. Also includes diagnostic console forwarding for live-test and error messages. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
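
A rough sketch of that interception follows. It is modeled on Playwright's `page.route()` with a URL predicate and `route.continue({ headers })`, but uses small structural interfaces instead of importing Playwright so it stays self-contained. `injectRealmAuth` and the shape of the `authorization` value are assumptions, not the PR's code.

```typescript
// Minimal structural types covering only the Playwright calls used here.
interface RequestLike {
  headers(): Record<string, string>;
}

interface RouteLike {
  request(): RequestLike;
  continue(overrides?: { headers?: Record<string, string> }): Promise<void>;
}

interface PageLike {
  route(
    matcher: (url: URL) => boolean,
    handler: (route: RouteLike) => Promise<void>,
  ): Promise<void>;
}

/**
 * Add an Authorization header to every browser request aimed at the realm's
 * origin, leaving requests to other origins (such as the asset server) alone.
 */
async function injectRealmAuth(
  page: PageLike,
  realmURL: string,
  authorization: string,
): Promise<void> {
  const realmOrigin = new URL(realmURL).origin;
  await page.route(
    (url) => url.origin === realmOrigin,
    async (route) => {
      // Keep the original headers and overlay the auth header.
      const headers = { ...route.request().headers(), authorization };
      await route.continue({ headers });
    },
  );
}
```

Because the header is added at the network level, live-test.js itself needs no changes to fetch _mtimes from a private realm.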

Add debug option to ExecuteTestRunOptions. Browser console is only forwarded to stderr when debug is enabled, reducing noise in normal runs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Each factory loop iteration should produce its own TestRun card, not overwrite the previous one. Without forceNew, resolveTestRun found the existing 'running' TestRun from the prior iteration and resumed it, resulting in a single TestRun that only showed the final iteration's results. Add forceNew: true to both buildTestRunner() and the run_tests tool. Add regression test verifying consecutive forceNew calls create separate TestRuns with incrementing sequence numbers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

These will be deleted after the PR description references are updated to use GitHub-hosted URLs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Screenshots are now referenced by commit hash in the PR description and no longer needed in the working tree. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Document QUnit test page architecture, test artifacts realm removal, private realm auth, dotted filename fix, forceNew per iteration, skipped test handling, and production build limitation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3eebb50484


packages/software-factory/scripts/lib/factory-implement.ts

- Forward hostAppUrl from ImplementConfig into ToolBuilderConfig so the run_tests tool uses the browser-accessible compat proxy URL, not the internal realm server port (which the browser can't reach in the harness)
- Wait for written .test.gts files to be accessible in the realm before launching QUnit to avoid flaky failures from indexing delay

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
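
The wait in the second item could be implemented as a simple poll, sketched here under stated assumptions: `waitForRealmFiles`, the `Fetcher` type, and the timeout defaults are all hypothetical, and the caller is expected to supply a fetcher that already carries whatever auth headers the realm needs.

```typescript
// Anything fetch-like works here; the global fetch fits this shape.
type Fetcher = (url: string) => Promise<{ ok: boolean }>;

/**
 * Poll until every path resolves with an OK response from the realm,
 * or throw once the timeout elapses.
 */
async function waitForRealmFiles(
  realmURL: string,
  paths: string[],
  fetcher: Fetcher,
  { timeoutMs = 10_000, intervalMs = 250 } = {},
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  let pending = paths.map((p) => new URL(p, realmURL).href);
  while (pending.length > 0) {
    const checks = await Promise.all(
      pending.map(async (url) => ({
        url,
        // Treat network errors the same as "not ready yet".
        ok: await fetcher(url).then((res) => res.ok, () => false),
      })),
    );
    pending = checks.filter((c) => !c.ok).map((c) => c.url);
    if (pending.length === 0) {
      return;
    }
    if (Date.now() >= deadline) {
      throw new Error(`Timed out waiting for: ${pending.join(", ")}`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Only the files that are still missing are re-checked on each round, so a slow index on one file does not cause repeated requests for the others.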

…esting

habdelra and others added 8 commits April 6, 2026 18:35

Revert dependency-tracker and dependency-normalization dotted filenam…

ca76c0f

…e changes These operate in a different context and need a separate, more considered fix. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Fix lint formatting in test-run-execution.ts and factory-test-realm.s…

9f2af2b

…pec.ts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Remove verbose debug logging from test-run-execution

9c2ee92

Keep only essential log lines (server URL, completion stats). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Merge remote-tracking branch 'origin/main' into cs-10599-qunit-card-t…

6735dfd

…esting

Remove unused hasExtension import from realm.ts

ccd8ab9

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

habdelra requested a review from Copilot April 7, 2026 00:39

Copilot started reviewing on behalf of habdelra April 7, 2026 00:39

Copilot AI reviewed Apr 7, 2026


packages/software-factory/scripts/lib/test-run-execution.ts (resolved)

packages/software-factory/scripts/lib/test-run-execution.ts (outdated, resolved)

packages/software-factory/scripts/lib/test-run-parsing.ts (outdated, resolved)

habdelra and others added 2 commits April 6, 2026 20:47

Fix prettier formatting in hello.test.gts fixture

342b0e6

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

habdelra and others added 6 commits April 7, 2026 08:44

Add hello.test.gts to expected directory listing in realm-endpoints-test

9a972f4

The test fixture directory now includes hello.test.gts, so the directory GET response test needs to expect it in the listing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Merge remote-tracking branch 'origin/main' into cs-10599-qunit-card-t…

68e9c69

…esting

Gate browser console forwarding on debug flag

7fea4fa

Add debug option to ExecuteTestRunOptions. Browser console is only forwarded to stderr when debug is enabled, reducing noise in normal runs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Temporarily add E2E screenshots for PR description

3eebb50

These will be deleted after the PR description references are updated to use GitHub-hosted URLs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

habdelra marked this pull request as ready for review April 7, 2026 13:39

habdelra and others added 2 commits April 7, 2026 09:40

Remove temporary E2E screenshots from repo

a822cc5

Screenshots are now referenced by commit hash in the PR description and no longer needed in the working tree. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector bot reviewed Apr 7, 2026


packages/software-factory/scripts/lib/factory-implement.ts (resolved)

packages/software-factory/scripts/lib/factory-implement.ts (resolved)

habdelra requested a review from a team April 7, 2026 14:21

tintinthong approved these changes Apr 7, 2026


Merge remote-tracking branch 'origin/main' into cs-10599-qunit-card-t…

18b105e

…esting

habdelra merged commit 4b4a823 into main Apr 7, 2026
79 of 81 checks passed

habdelra merged 20 commits into main from cs-10599-qunit-card-testing


3 participants
