diff --git a/packages/software-factory/docs/phase-2-plan.md b/packages/software-factory/docs/phase-2-plan.md index c6ba807568..f2f52a2163 100644 --- a/packages/software-factory/docs/phase-2-plan.md +++ b/packages/software-factory/docs/phase-2-plan.md @@ -24,18 +24,19 @@ This makes the loop generic. It doesn't need to know whether an issue is "implem Issues need properties that let the orchestrator determine execution order. Possible fields (may use a combination): -- **priority** — enum (`high`, `medium`, `low`), high = execute first +- **priority** — enum (`critical`, `high`, `medium`, `low`), critical = execute first - **predecessors / blockedBy** — explicit dependency edges; an issue cannot start until its blockers are done - **order** — explicit sequence number for tie-breaking -The selection algorithm: +The selection algorithm (implemented in `IssueScheduler.pickNextIssue()`): -1. Filter to issues with status `ready` or `in_progress` +1. Filter to issues with status `backlog` or `in_progress` 2. Exclude issues whose `blockedBy` list contains any non-completed issue -3. Sort by priority (high first, then medium, then low), then by order (ascending) -4. Pick the first one +3. Exclude exhausted issues (hit `maxIterationsPerIssue` in the current run) +4. Sort: `in_progress` first, then by priority (`critical` > `high` > `medium` > `low`), then by order (ascending) +5. Pick the first one -Resume semantics: if an issue is already `in_progress`, it takes priority over `ready` issues (the factory was interrupted and should continue where it left off). +Resume semantics: if an issue is already `in_progress`, it takes priority over `backlog` issues (the factory was interrupted and should continue where it left off). ## Validation Phase After Every Iteration @@ -56,6 +57,34 @@ After each agent turn in the inner loop, the orchestrator runs these checks dete 4. **Card instantiation** — Verify that sample card instances can be instantiated from their definitions 5. 
**Run existing tests** — Execute all QUnit `.test.gts` files in the target realm via the QUnit test page

+### Validation Architecture (CS-10675)
+
+The validation pipeline is implemented as a modular system in `src/validators/`:
+
+**`ValidationStepRunner` interface** — the contract every step must implement:
+
+```typescript
+interface ValidationStepRunner {
+  readonly step: ValidationStep;
+  run(targetRealmUrl: string): Promise<ValidationStepResult>;
+  formatForContext(result: ValidationStepResult): string;
+}
+```
+
+**`ValidationPipeline` class** — implements the `Validator` interface and composes step runners:
+
+- Steps run **concurrently** via `Promise.allSettled()` — a failure or exception in one step does not prevent others from running
+- Exceptions thrown by a step are captured as failed `ValidationStepResult` entries with the error message
+- `formatForContext()` delegates to each step runner to produce LLM-friendly markdown
+- `createDefaultPipeline(config)` factory function composes all 5 steps with config injection
+
+**Step-specific failure shapes** — each validation type carries its own structured data in `ValidationStepResult.details` (flattened POJOs, not cards):
+
+- **Test step**: `{ testRunId, passedCount, failedCount, failures: [{ testName, module, message, stackTrace }] }` — reads back the completed TestRun card from the realm for detailed failure data (will become cheap local filesystem reads after boxel-cli integration)
+- **Future parse/lint/evaluate/instantiate steps**: each defines its own `details` shape
+
+**Adding a new validation step** = creating a new module file in `src/validators/` + replacing the `NoOpStepRunner` in `createDefaultPipeline()`.
+
### Handling Failures

Validation failures are fed back to the agent as context in the **next inner-loop iteration**. The orchestrator does not create fix issues for validation failures — it iterates with the failure details so the agent can self-correct.
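To make the `ValidationStepRunner` contract concrete, here is a hedged sketch of a step runner. The `KebabCaseLintStep` name, the injected `listFiles` callback, and the local type shapes are illustrative assumptions, not the actual `src/validators/` code:

```typescript
// Minimal local stand-ins for the factory's validation types (assumed shapes).
type ValidationStep = 'parse' | 'lint' | 'evaluate' | 'instantiate' | 'test';

interface ValidationError {
  message: string;
  file?: string;
}

interface ValidationStepResult {
  step: ValidationStep;
  passed: boolean;
  files?: string[];
  errors: ValidationError[];
  details?: Record<string, unknown>;
}

interface ValidationStepRunner {
  readonly step: ValidationStep;
  run(targetRealmUrl: string): Promise<ValidationStepResult>;
  formatForContext(result: ValidationStepResult): string;
}

// A toy lint rule: .gts file names must be kebab-case.
class KebabCaseLintStep implements ValidationStepRunner {
  readonly step: ValidationStep = 'lint';

  // listFiles is injected so the step stays testable without a realm server
  constructor(private listFiles: (realmUrl: string) => Promise<string[]>) {}

  async run(targetRealmUrl: string): Promise<ValidationStepResult> {
    let files = await this.listFiles(targetRealmUrl);
    let errors: ValidationError[] = files
      .filter((f) => f.endsWith('.gts') && !/^[a-z0-9./-]+$/.test(f))
      .map((f) => ({ message: 'file name is not kebab-case', file: f }));
    return { step: this.step, passed: errors.length === 0, files, errors };
  }

  formatForContext(result: ValidationStepResult): string {
    if (result.passed) return '**lint**: passed';
    let lines = result.errors.map((e) => `- ${e.file}: ${e.message}`);
    return ['**lint**: FAILED', ...lines].join('\n');
  }
}
```

Because each step only reports a `ValidationStepResult`, the pipeline can run it under `Promise.allSettled()` alongside the others and turn any thrown exception into a failed result.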
This mirrors Phase 1's approach (feed test results back, iterate) but with a broader validation pipeline. @@ -64,7 +93,8 @@ The inner loop continues until: - The agent marks the issue as done (all validation passes) - The agent marks the issue as blocked (needs human input) -- Max iterations are reached +- Max iterations are reached with **failing validation** — the orchestrator blocks the issue with the reason ("max iteration limit reached") and the formatted validation failure context in the issue description, then moves to the next issue +- Max iterations are reached with **passing validation** — the issue is exhausted but not blocked (agent did not mark done despite passing validation) The agent always has the option to create new issues via tool calls if it determines that a failure requires separate work (e.g., "this card definition depends on another card that doesn't exist yet — creating a new issue for it"). But the orchestrator does not force this — the agent decides. @@ -127,29 +157,94 @@ This is the "quirk" where an issue's job is to create the project itself. 
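The selection rules described earlier (status filter, blocker exclusion, exhaustion, priority/order sort) can be sketched roughly as follows — the `SchedulableIssue` field names and the comparator are assumptions about `IssueScheduler.pickNextIssue()`, not its exact code:

```typescript
// Hedged sketch of the scheduler's selection rules (assumed shapes).
type IssuePriority = 'critical' | 'high' | 'medium' | 'low';
type IssueStatus = 'backlog' | 'in_progress' | 'done' | 'blocked' | 'review';

interface SchedulableIssue {
  id: string;
  status: IssueStatus;
  priority: IssuePriority;
  order: number;
  blockedBy: string[];
}

const PRIORITY_RANK: Record<IssuePriority, number> = {
  critical: 0,
  high: 1,
  medium: 2,
  low: 3,
};

function pickNextIssue(
  issues: SchedulableIssue[],
  exhausted: Set<string>,
): SchedulableIssue | undefined {
  let done = new Set(issues.filter((i) => i.status === 'done').map((i) => i.id));
  let candidates = issues
    // 1. Only backlog or in_progress issues are eligible
    .filter((i) => i.status === 'backlog' || i.status === 'in_progress')
    // 2. Every blocker must already be done
    .filter((i) => i.blockedBy.every((id) => done.has(id)))
    // 3. Skip issues that hit maxIterationsPerIssue in this run
    .filter((i) => !exhausted.has(i.id));

  // 4. Resume semantics first, then priority, then explicit order
  candidates.sort((a, b) => {
    if (a.status !== b.status) return a.status === 'in_progress' ? -1 : 1;
    if (a.priority !== b.priority) {
      return PRIORITY_RANK[a.priority] - PRIORITY_RANK[b.priority];
    }
    return a.order - b.order;
  });

  // 5. First candidate, or undefined when nothing is schedulable
  return candidates[0];
}
```

Keeping the comparator pure over a plain `SchedulableIssue` array is what lets the orchestrator stay a thin scheduler: all the knowledge about *why* an issue matters lives in the issue fields, not in the loop.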
But it The phase 2 orchestrator is a thin scheduler with a built-in validation phase that runs after every agent turn:

-```
-while (hasUnblockedIssues()) {
-  let issue = pickNextIssue();
+```typescript
+// As implemented in runIssueLoop() — src/issue-loop.ts
+let exhaustedIssues = new Set<string>();

-  // Inner loop: multiple iterations per issue
-  let validationResults = null;
-  while (issue.status !== 'done' && issue.status !== 'blocked' && iterations < maxIterations) {
-    await agent.run(contextForIssue(issue, validationResults), tools);
-    refreshIssueState(issue);
+while (
+  scheduler.hasUnblockedIssues(exhaustedIssues) &&
+  outerCycles < maxOuterCycles
+) {
+  let issue = scheduler.pickNextIssue(exhaustedIssues);

-    // Validation phase — runs after EVERY iteration
-    validationResults = await validate(targetRealm); // parse, lint, evaluate, instantiate, run tests
-    // Failures are fed back as context in the next iteration — agent self-corrects
-    // Agent can also create new issues via tool calls if it decides to
+  // Inner loop: multiple iterations per issue
+  let validationResults = undefined;
+  let exitReason = 'max_iterations';
+  for (let iteration = 1; iteration <= maxIterationsPerIssue; iteration++) {
+    let context = await contextBuilder.buildForIssue({
+      issue,
+      targetRealmUrl,
+      validationResults,
+      briefUrl,
+    });
+    let result = await agent.run(context, tools);
+
+    // Validation phase — runs after EVERY agent turn
+    validationResults = await validator.validate(targetRealmUrl);
+
+    // Read issue state from realm (not from AgentRunResult.status)
+    issue = await scheduler.refreshIssueState(issue);
+
+    if (issue.status === 'done' || issue.status === 'blocked') {
+      exitReason = issue.status;
+      break;
+    }
+  }

-    iterations++;
+  if (exitReason === 'max_iterations') {
+    // If validation still failing at max iterations, block the issue
+    // with the reason and failure context written to the realm
+    if (validationResults && !validationResults.passed) {
+      exitReason = 'blocked';
+      await issueStore.updateIssue(issue.id, {
+        status: 'blocked',
+        description: buildMaxIterationBlockedDescription(validationResults),
+      });
+    }
+    exhaustedIssues.add(issue.id);
  }
+
+  // Reload to pick up new issues the agent may have created
+  await scheduler.loadIssues();
}
```

The agent signals progress by updating the issue — tagging it as blocked, marking it done, or leaving it in progress for another iteration. The orchestrator reads issue state from the realm after each agent turn, then runs validation. Validation failures are fed back as context in the next inner-loop iteration so the agent can self-correct. The agent can also create new issues via tool calls if it determines a failure requires separate work.

-All domain logic (what to implement, when to create sub-issues, when to tag as blocked) lives in the agent's prompt and skills. The orchestrator owns only: issue selection, agent invocation, and validation.
+The orchestrator also **writes** to the realm in one case: when max iterations are reached with failing validation, it updates the issue's status to `blocked` and writes the formatted validation failure context into the issue description. This uses `IssueStore.updateIssue()`, which performs a read-modify-write against the realm card.
+
+All domain logic (what to implement, when to create sub-issues, when to tag as blocked) lives in the agent's prompt and skills. The orchestrator owns only: issue selection, agent invocation, validation, and max-iteration blocking.
+
+### Issue Loading via searchRealm()
+
+`RealmIssueStore` loads issues from the target realm using `searchRealm()` from `realm-operations.ts`. The search filter uses the absolute darkfactory module URL (from `inferDarkfactoryModuleUrl(targetRealmUrl)`), which varies by environment (production, staging, localhost). The store maps JSON:API card responses to `SchedulableIssue` objects.
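A hedged sketch of that mapping — the attribute names, defaults, and the `extractBlockerIds` helper here are assumptions for illustration, not the exact `RealmIssueStore` code:

```typescript
// Assumed JSON:API resource shape for an Issue card.
interface CardResource {
  id: string;
  attributes?: {
    summary?: string;
    status?: string;
    priority?: string;
    order?: number;
  };
  relationships?: Record<string, { links?: { self?: string } }>;
}

interface SchedulableIssue {
  id: string;
  summary: string;
  status: string;
  priority: string;
  order: number;
  blockedBy: string[];
}

function mapCardToSchedulableIssue(card: CardResource): SchedulableIssue {
  let attrs = card.attributes ?? {};
  return {
    id: card.id,
    summary: attrs.summary ?? '',
    status: attrs.status ?? 'backlog',
    priority: attrs.priority ?? 'low',
    // Unset order sorts last (assumed default)
    order: attrs.order ?? Number.MAX_SAFE_INTEGER,
    blockedBy: extractBlockerIds(card.relationships ?? {}),
  };
}

// Collects dotted linksToMany keys ("blockedBy.0", "blockedBy.1", …) in index
// order; resolving the relative links to absolute card IDs is omitted here.
function extractBlockerIds(
  relationships: Record<string, { links?: { self?: string } }>,
): string[] {
  return Object.keys(relationships)
    .filter((key) => /^blockedBy\.\d+$/.test(key))
    .sort((a, b) => Number(a.split('.')[1]) - Number(b.split('.')[1]))
    .map((key) => relationships[key].links?.self ?? '')
    .filter((self) => self.length > 0);
}
```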
+
+Boxel encodes `linksToMany` relationships with dotted keys rather than JSON:API `data` arrays:
+
+```json
+{
+  "relationships": {
+    "blockedBy.0": { "links": { "self": "../Issues/issue-a" } },
+    "blockedBy.1": { "links": { "self": "../Issues/issue-b" } }
+  }
+}
+```
+
+The `extractLinksToManyIds()` helper parses this format to extract blocker IDs for dependency resolution.
+
+When `searchRealm()` fails (auth, network, query errors), the store logs at `warn` level and returns an empty list, so a failed search shows up in the logs instead of silently passing for "no issues exist."
+
+### Loop Outcome Determination
+
+The loop distinguishes several terminal states:
+
+| Condition                                     | Outcome               |
+| --------------------------------------------- | --------------------- |
+| No issues loaded                              | `all_issues_done`     |
+| Issues exist but all blocked at startup       | `no_unblocked_issues` |
+| All issues completed successfully             | `all_issues_done`     |
+| Some issues done, others blocked or exhausted | `no_unblocked_issues` |
+| Safety guard hit                              | `max_outer_cycles`    |

## Schema Refinement: darkfactory.gts

@@ -237,24 +332,25 @@ CS-10671 trims and renames the current schema as a first step. The adoption from

## Issue Lifecycle

```
-created → ready → in_progress → done
-                              → blocked (needs human input)
-                              → failed (max retries exceeded)
+backlog → in_progress → done
+                      → blocked (needs human input or max iterations with failing validation)
+                      → review (optional)
```

The agent manages its own transitions by updating the issue directly (e.g., tagging as blocked, marking done). The orchestrator reads the issue state after the agent exits to decide what to do next — it does not inspect the agent's return value for status.

+The orchestrator also transitions issues to `blocked` in one case: when max iterations are reached with validation still failing.
It writes the reason ("max iteration limit reached") and the formatted validation failure context into the issue description via `IssueStore.updateIssue()`, making the blocking reason visible in the realm. Issues blocked this way are also added to an `exhaustedIssues` set to prevent re-selection within the same run. + ## Migration Path from Phase 1 -Phase 1 and phase 2 can coexist: +Phase 1 and phase 2 coexist during the transition. The implementation lives in separate files to avoid touching Phase 1 code: + +- `src/issue-scheduler.ts` — `IssueScheduler`, `IssueStore`, `RealmIssueStore` +- `src/issue-loop.ts` — `runIssueLoop()`, `Validator`, `NoOpValidator`, config/result types -1. Phase 1 ships with the hard-coded pipeline (`factory-loop.ts`) -2. Phase 2 introduces an `IssueScheduler` that replaces the fixed loop with issue-driven scheduling -3. The `LoopAgent` interface (`run(context, tools)`) stays the same — only the orchestrator changes -4. `ContextBuilder` gains an issue-aware mode that builds context from the current issue rather than a fixed ticket -5. The `TestRunner` callback becomes a tool the agent can call, rather than a loop phase +Phase 1's `factory-loop.ts` (`runFactoryLoop()`) remains untouched. The `LoopAgent` interface (`run(context, tools)`) is unchanged and reused by both loops. `FactoryTool[]` carries forward unchanged. -The `FactoryTool[]` type from phase 1 carries forward unchanged. `AgentRunResult` may be simplified — in phase 2 the agent signals completion by updating the issue (tagging as blocked, marking done), so the orchestrator reads issue state rather than inspecting a return status. The agent just needs to exit; the orchestrator figures out what happened from the issue. 
+CS-10708 tracks the integration work: wire `runIssueLoop()`, the validation phase (CS-10675), and bootstrap-as-seed (CS-10673) into the `factory:go` entrypoint, then retire all Phase 1 mechanisms (`factory-loop.ts`, `factory-loop.test.ts`, old bootstrap orchestration, `TestRunner`/`buildTestRunner()`). ## Refactor: request_clarification → Blocking Issue @@ -406,7 +502,7 @@ The principle that boxel-cli owns the entire Boxel API surface extends to auth. 1. **Two-tier token model** — boxel-cli understands both realm server tokens (obtained via Matrix OpenID → `POST /_server-session`, grants server-level access) and per-realm tokens (obtained via `POST /_realm-auth`, grants access to specific realms). Both are cached and refreshed automatically. -2. **Automatic token acquisition on realm creation** — When `boxel create-realm` creates a new realm, boxel-cli automatically waits for readiness, obtains the per-realm JWT, and stores it in its auth state. Subsequent `boxel pull`/`boxel sync` on that realm Just Work — no `--jwt` flag, no token passing. +2. **Automatic token acquisition on realm creation** — When `boxel create-realm` creates a new realm, boxel-cli automatically waits for readiness, obtains the per-realm JWT, and stores it in its auth state. Subsequent `boxel pull`/`boxel sync` on that realm Just Work — tokens are managed internally by boxel-cli. 3. 
**Programmatic auth API** — Export a `BoxelAuth` class (or similar) so the factory imports it and never constructs HTTP requests or manages tokens: diff --git a/packages/software-factory/realm/tsconfig.json b/packages/software-factory/realm/tsconfig.json new file mode 100644 index 0000000000..9557565c2d --- /dev/null +++ b/packages/software-factory/realm/tsconfig.json @@ -0,0 +1,30 @@ +{ + "compilerOptions": { + "target": "es2020", + "allowJs": true, + "moduleResolution": "node16", + "allowSyntheticDefaultImports": true, + "noImplicitAny": true, + "noImplicitThis": true, + "alwaysStrict": true, + "strictNullChecks": true, + "strictPropertyInitialization": true, + "noFallthroughCasesInSwitch": true, + "noUnusedLocals": true, + "noUnusedParameters": true, + "noImplicitReturns": true, + "noEmitOnError": false, + "noEmit": true, + "inlineSourceMap": true, + "inlineSources": true, + "baseUrl": ".", + "module": "node16", + "strict": true, + "experimentalDecorators": true, + "paths": { + "https://cardstack.com/base/*": ["../../base/*"] + }, + "types": ["@cardstack/local-types"] + }, + "include": ["**/*.ts", "**/*.gts"] +} diff --git a/packages/software-factory/scripts/smoke-tests/issue-loop-smoke.ts b/packages/software-factory/scripts/smoke-tests/issue-loop-smoke.ts index c170a46bad..79dcb63bf4 100644 --- a/packages/software-factory/scripts/smoke-tests/issue-loop-smoke.ts +++ b/packages/software-factory/scripts/smoke-tests/issue-loop-smoke.ts @@ -31,11 +31,14 @@ import type { IssueStore } from '../../src/issue-scheduler'; import { runIssueLoop, NoOpValidator, + ValidationPipeline, type IssueContextBuilderLike, type IssueLoopResult, type Validator, } from '../../src/issue-loop'; +import { NoOpStepRunner } from '../../src/validators/noop-step'; + // --------------------------------------------------------------------------- // Helpers // --------------------------------------------------------------------------- @@ -75,6 +78,10 @@ class MockIssueStore implements IssueStore 
{ if (!issue) throw new Error(`Issue "${issueId}" not found`); return { ...issue }; } + + async updateIssue(): Promise { + // no-op for smoke tests + } } interface MockAgentTurn { @@ -461,8 +468,8 @@ async function scenarioMaxIterations(): Promise { printResult(result); check( - 'issue exits after max iterations', - result.issueResults[0]?.exitReason === 'max_iterations', + 'issue blocked after max iterations with failing validation', + result.issueResults[0]?.exitReason === 'blocked', ); check( 'last validation was failed', @@ -539,6 +546,74 @@ async function scenarioEmptyProject(): Promise { log.info(''); } +async function scenarioValidationPipeline(): Promise { + log.info('--- Scenario 7: ValidationPipeline integration ---'); + log.info(''); + + let store = new MockIssueStore([ + makeIssue({ + id: 'ISS-P1', + status: 'backlog', + priority: 'high', + order: 1, + summary: 'Test pipeline integration', + }), + ]); + + let agent = new MockLoopAgent( + [ + { + toolCalls: [ + { tool: 'write_file', args: { path: 'card.gts', content: 'v1' } }, + ], + updateIssue: { id: 'ISS-P1', status: 'done' }, + }, + ], + store, + ); + + // Use a real ValidationPipeline with all NoOp steps (no server needed) + let pipeline = new ValidationPipeline([ + new NoOpStepRunner('parse'), + new NoOpStepRunner('lint'), + new NoOpStepRunner('evaluate'), + new NoOpStepRunner('instantiate'), + new NoOpStepRunner('test'), + ]); + + let result = await runIssueLoop({ + agent, + contextBuilder: new StubContextBuilder(), + tools: TOOLS, + issueStore: store, + validator: pipeline, + targetRealmUrl: 'https://example.test/target/', + }); + + printResult(result); + check('outcome is all_issues_done', result.outcome === 'all_issues_done'); + check('exit reason is done', result.issueResults[0]?.exitReason === 'done'); + check( + 'validation passed (all NoOp steps)', + result.issueResults[0]?.lastValidation?.passed === true, + ); + check( + '5 validation steps reported', + 
result.issueResults[0]?.lastValidation?.steps.length === 5, + ); + + // Verify formatForContext works + let lastValidation = result.issueResults[0]?.lastValidation; + let formatted = lastValidation + ? pipeline.formatForContext(lastValidation) + : ''; + check( + 'formatForContext reports all passed', + formatted === 'All validation steps passed.', + ); + log.info(''); +} + // --------------------------------------------------------------------------- // Main // --------------------------------------------------------------------------- @@ -554,6 +629,7 @@ async function main(): Promise { await scenarioMaxIterations(); await scenarioBlockedIssue(); await scenarioEmptyProject(); + await scenarioValidationPipeline(); log.info('==========================='); log.info(` ${passed} passed, ${failed} failed`); diff --git a/packages/software-factory/scripts/smoke-tests/smoke-test-realm.ts b/packages/software-factory/scripts/smoke-tests/smoke-test-realm.ts index 6b92160b2a..cabcfd9eec 100644 --- a/packages/software-factory/scripts/smoke-tests/smoke-test-realm.ts +++ b/packages/software-factory/scripts/smoke-tests/smoke-test-realm.ts @@ -1,26 +1,19 @@ /** - * Smoke test for the factory test realm management (QUnit). + * Smoke test for the validation pipeline with real QUnit test execution. * - * Simulates the full factory workflow: implementation phase output followed - * by the testing phase via executeTestRunFromRealm with QUnit .test.gts files. + * 1. Creates a target realm and writes simulated LLM output: + * - A HelloCard definition (.gts) + * - A Spec card instance pointing to the HelloCard definition + * - A passing QUnit test (hello.test.gts) + * - A deliberately failing QUnit test (hello-fail.test.gts) * - * Phase 1 -- Simulate LLM implementation output: - * Writes to the target realm (what the LLM would have produced): - * 1. A sample HelloCard definition (.gts) - * 2. A Spec card instance pointing to the HelloCard definition - * 3. 
A sample HelloCard instance (HelloCard/sample.json) - * 4. A QUnit test file (hello.test.gts) -- passing - * 5. A QUnit test file (hello-fail.test.gts) -- deliberately failing + * 2. Runs the full ValidationPipeline via createDefaultPipeline(), which + * executes all validation steps (parse, lint, evaluate, instantiate + * are NoOp placeholders; test step runs real QUnit tests via Playwright). * - * Phase 2 -- Run the testing phase via QUnit: - * Calls executeTestRunFromRealm, which: - * - Creates a TestRun card (status: running) in the target realm - * - Launches a headless browser pointing at the host app QUnit page - * - Collects QUnit results (testEnd / runEnd events) - * - Completes the TestRun card with module results - * - The passing test produces a result with passedCount=1 - * - The failing test produces a result with failedCount=1 - * - The overall TestRun status is 'failed' (mixed results) + * 3. Verifies pipeline results: test step fails (deliberately), NoOp steps + * pass, detailed failure data is read back from the TestRun card, and + * formatForContext() produces LLM-friendly markdown. 
* * Prerequisites: * @@ -35,16 +28,17 @@ */ // This should be first -import '../setup-logger'; +import '../../src/setup-logger'; import { getRealmServerToken, matrixLogin, parseArgs } from '../../src/boxel'; import { logger } from '../../src/logger'; -import { executeTestRunFromRealm } from '../../src/test-run-execution'; import { createRealm, getRealmScopedAuth, writeFile, } from '../../src/realm-operations'; +import { createDefaultPipeline } from '../../src/validators/validation-pipeline'; +import type { TestValidationDetails } from '../../src/validators/test-step'; // --------------------------------------------------------------------------- // Sample LLM output -- what the agent would produce in the implementation phase @@ -319,61 +313,98 @@ async function main() { ); // ------------------------------------------------------------------------- - // Phase 2: Run QUnit tests via executeTestRunFromRealm + // Run validation pipeline against the realm // ------------------------------------------------------------------------- - log.info( - '\n--- Phase 2: Running QUnit tests via executeTestRunFromRealm ---\n', - ); + log.info('\n--- Running ValidationPipeline.validate() ---\n'); - let handle = await executeTestRunFromRealm({ - targetRealmUrl, - testResultsModuleUrl, - slug: 'hello-smoke', - testNames: [], + let pipeline = createDefaultPipeline({ authorization, fetch: fetchImpl, - forceNew: true, realmServerUrl, hostAppUrl: realmServerUrl, + testResultsModuleUrl, }); - log.info(` TestRun ID: ${handle.testRunId}`); - log.info(` Status: ${handle.status}`); - if (handle.errorMessage) { - log.info(` Error: ${handle.errorMessage}`); - } - if ((handle as unknown as Record).error) { + let validationResults = await pipeline.validate(targetRealmUrl); + + log.info( + ` Pipeline result: ${validationResults.passed ? 'PASSED' : 'FAILED'} (${validationResults.steps.length} steps)`, + ); + for (let step of validationResults.steps) { + let statusIcon = step.passed ? 
'✓' : '✗'; + let detail = ''; + if (step.details) { + let d = step.details as unknown as TestValidationDetails; + if (d.passedCount != null) { + detail = ` (${d.passedCount} passed, ${d.failedCount} failed)`; + } + } log.info( - ` Complete error: ${(handle as unknown as Record).error}`, + ` ${step.step}: ${statusIcon} ${step.passed ? 'passed' : 'failed'}${detail}`, ); } - // ------------------------------------------------------------------------- - // Results - // ------------------------------------------------------------------------- + // Verify pipeline results + let pipelinePassed = true; - log.info('\n--- Results ---\n'); + if (validationResults.passed) { + log.info('\n ✗ Expected pipeline to fail (deliberately failing test)'); + pipelinePassed = false; + } else { + log.info('\n ✓ Pipeline correctly reports failure'); + } - // The TestRun should have status 'failed' because it contains both a - // passing and a deliberately failing QUnit test. The module results inside - // should show one test passed and one test failed. - let expectedStatus = handle.status === 'failed'; + let testStep = validationResults.steps.find((s) => s.step === 'test'); + if (!testStep) { + log.info(' ✗ No test step in results'); + pipelinePassed = false; + } else if (testStep.passed) { + log.info(' ✗ Test step should have failed'); + pipelinePassed = false; + } else { + log.info(' ✓ Test step correctly failed'); + } - log.info( - ` TestRun status: ${expectedStatus ? 
'✓ failed (as expected -- one test passes, one fails)' : `✗ expected failed, got ${handle.status}`}`, - ); - log.info(`\n View in Boxel: ${targetRealmUrl}${handle.testRunId}`); + let noOpSteps = validationResults.steps.filter((s) => s.step !== 'test'); + let allNoOpsPassed = noOpSteps.every((s) => s.passed); + if (allNoOpsPassed) { + log.info(' ✓ All NoOp steps (parse, lint, evaluate, instantiate) passed'); + } else { + log.info(' ✗ Some NoOp steps failed unexpectedly'); + pipelinePassed = false; + } - if (expectedStatus) { - log.info( - '\n✓ Smoke test passed! TestRun contains both pass and fail QUnit results.', - ); + if (testStep?.details) { + let details = testStep.details as unknown as TestValidationDetails; + if (details.passedCount > 0 && details.failedCount > 0) { + log.info( + ` ✓ Test details: ${details.passedCount} passed, ${details.failedCount} failed`, + ); + } else { + log.info( + ` ✗ Expected both passing and failing tests, got passed=${details.passedCount} failed=${details.failedCount}`, + ); + pipelinePassed = false; + } } else { - log.info('\n✗ Smoke test had unexpected results.'); - log.info( - ` Expected "failed" (mixed pass/fail QUnit tests) but got "${handle.status}"`, - ); + log.info(' ✗ No test details available'); + pipelinePassed = false; + } + + // Show formatted context for LLM + let formatted = pipeline.formatForContext(validationResults); + log.info('\n Formatted context for LLM:'); + log.info(' ─────────────────────────'); + for (let line of formatted.split('\n')) { + log.info(` ${line}`); + } + log.info(' ─────────────────────────'); + + if (pipelinePassed) { + log.info('\n✓ Validation pipeline smoke test passed!'); + } else { + log.info('\n✗ Validation pipeline smoke test failed.'); process.exit(1); } } diff --git a/packages/software-factory/src/factory-agent-types.ts b/packages/software-factory/src/factory-agent-types.ts index def13887fd..a0bc09bf6d 100644 --- a/packages/software-factory/src/factory-agent-types.ts +++ 
b/packages/software-factory/src/factory-agent-types.ts @@ -135,6 +135,8 @@ export interface ValidationStepResult { passed: boolean; files?: string[]; errors: ValidationError[]; + /** Step-specific structured data for context formatting (POJOs, not cards). */ + details?: Record; } /** Aggregated results from a full validation run (all steps). */ @@ -241,3 +243,12 @@ export function resolveFactoryModel(cliModel?: string): string { return FACTORY_DEFAULT_MODEL; } + +/** + * Derive a slug from an issue ID by taking the last path segment. + * e.g., "Issues/sticky-note-define-core" → "sticky-note-define-core" + */ +export function deriveIssueSlug(issueId: string): string { + let parts = issueId.split('/'); + return parts[parts.length - 1]; +} diff --git a/packages/software-factory/src/factory-implement.ts b/packages/software-factory/src/factory-implement.ts index 5303f9ecf6..40ba347bd6 100644 --- a/packages/software-factory/src/factory-implement.ts +++ b/packages/software-factory/src/factory-implement.ts @@ -24,6 +24,7 @@ import type { TestResult, } from './factory-agent'; import { + deriveIssueSlug, resolveFactoryModel, ToolUseFactoryAgent, type FactoryAgentConfig, @@ -665,19 +666,6 @@ async function readTestRunFailures( } } -// --------------------------------------------------------------------------- -// Issue slug derivation -// --------------------------------------------------------------------------- - -/** - * Derive an issue slug from an issue ID. 
- * e.g., "Issues/sticky-note-define-core" → "sticky-note-define-core" - */ -function deriveIssueSlug(issueId: string): string { - let parts = issueId.split('/'); - return parts[parts.length - 1]; -} - // --------------------------------------------------------------------------- // Post-loop updates // --------------------------------------------------------------------------- diff --git a/packages/software-factory/src/issue-loop.ts b/packages/software-factory/src/issue-loop.ts index 971efbf2d2..7052bfc0fb 100644 --- a/packages/software-factory/src/issue-loop.ts +++ b/packages/software-factory/src/issue-loop.ts @@ -30,21 +30,23 @@ import { logger } from './logger'; let log = logger('issue-loop'); // --------------------------------------------------------------------------- -// Validator interface (placeholder for CS-10675) +// Validator interface // --------------------------------------------------------------------------- /** * Runs the post-iteration validation pipeline. * Steps: parse, lint, evaluate, instantiate, run tests. - * CS-10675 provides the real implementation. + * See ValidationPipeline for the real implementation. */ export interface Validator { validate(targetRealmUrl: string): Promise; + /** Format validation results for LLM context or issue descriptions. */ + formatForContext?(results: ValidationResults): string; } /** * No-op validator that always passes. Used for bootstrap issues - * or when CS-10675 validation is not yet available. + * or when validation is not needed. 
 */
 export class NoOpValidator implements Validator {
   async validate(): Promise<ValidationResults> {
@@ -52,6 +54,17 @@ export class NoOpValidator implements Validator {
   }
 }
 
+// ---------------------------------------------------------------------------
+// Re-exports from validators/
+// ---------------------------------------------------------------------------
+
+export {
+  ValidationPipeline,
+  createDefaultPipeline,
+  type ValidationStepRunner,
+  type ValidationPipelineConfig,
+} from './validators/validation-pipeline';
+
 // ---------------------------------------------------------------------------
 // Context builder interface for issue-driven loop
 // ---------------------------------------------------------------------------
@@ -138,6 +151,42 @@ function formatValidation(results: ValidationResults): string {
   return `FAILED — ${failures.join(', ')}`;
 }
 
+/**
+ * Build a description for an issue blocked due to max iterations with
+ * failing validation, including the formatted failure context.
+ */
+function buildMaxIterationBlockedDescription(
+  maxIterations: number,
+  validationResults: ValidationResults,
+  validator: Validator,
+): string {
+  let lines = [
+    `**Blocked: max iteration limit reached (${maxIterations} turns) with failing validation.**`,
+    '',
+    `The agent was unable to resolve validation failures within the allowed number of iterations.`,
+    '',
+    `### Last Validation Results`,
+    '',
+  ];
+
+  if (validator.formatForContext) {
+    lines.push(validator.formatForContext(validationResults));
+  } else {
+    // Fallback: format from the raw results
+    for (let step of validationResults.steps) {
+      if (!step.passed) {
+        lines.push(`**${step.step}**: FAILED`);
+        for (let error of step.errors) {
+          lines.push(`- ${error.message}`);
+        }
+        lines.push('');
+      }
+    }
+  }
+
+  return lines.join('\n');
+}
+
 // ---------------------------------------------------------------------------
 // Main loop
 // ---------------------------------------------------------------------------
@@ -263,8 +312,34 @@ export async function runIssueLoop(
 
     if (exitReason === 'max_iterations') {
       log.info(
-        `  Max iterations (${maxIterationsPerIssue}) reached for issue ${issueSummaryLabel(issue)} — exiting inner loop`,
+        `  Max iterations (${maxIterationsPerIssue}) reached for issue ${issueSummaryLabel(issue)}`,
      );
+
+      // If validation still failing at max iterations, block the issue with
+      // the reason and failure context so it's visible in the realm.
+      if (validationResults && !validationResults.passed) {
+        log.info(
+          `  Validation still failing — blocking issue with failure context`,
+        );
+
+        try {
+          let description = buildMaxIterationBlockedDescription(
+            maxIterationsPerIssue,
+            validationResults,
+            validator,
+          );
+          await issueStore.updateIssue(issue.id, {
+            status: 'blocked',
+            description,
+          });
+          exitReason = 'blocked';
+        } catch (err) {
+          log.warn(
+            `  Failed to update issue status to blocked: ${err instanceof Error ? err.message : String(err)}`,
+          );
+        }
+      }
+
       exhaustedIssues.add(issue.id);
     }
 
diff --git a/packages/software-factory/src/issue-scheduler.ts b/packages/software-factory/src/issue-scheduler.ts
index bc3a7aa616..41b434fa79 100644
--- a/packages/software-factory/src/issue-scheduler.ts
+++ b/packages/software-factory/src/issue-scheduler.ts
@@ -12,7 +12,12 @@ import type {
   SchedulableIssue,
 } from './factory-agent-types';
 
-import { searchRealm, type RealmFetchOptions } from './realm-operations';
+import {
+  searchRealm,
+  readFile,
+  writeFile,
+  type RealmFetchOptions,
+} from './realm-operations';
 
 import { logger } from './logger';
 
 let log = logger('issue-scheduler');
@@ -30,6 +35,11 @@ export interface IssueStore {
   listIssues(): Promise<SchedulableIssue[]>;
   /** Re-read a single issue's current state from the realm. */
   refreshIssue(issueId: string): Promise<SchedulableIssue>;
+  /** Update issue fields in the realm (e.g., status, description). */
+  updateIssue(
+    issueId: string,
+    updates: { status?: string; description?: string },
+  ): Promise<void>;
 }
 
 // ---------------------------------------------------------------------------
@@ -224,6 +234,52 @@ export class RealmIssueStore implements IssueStore {
 
     return mapCardToSchedulableIssue(result.data[0]);
   }
+
+  async updateIssue(
+    issueId: string,
+    updates: { status?: string; description?: string },
+  ): Promise<void> {
+    // Read the source JSON file (not the indexed card, which can have
+    // stripped relationships during indexing).
+    let readResult = await readFile(
+      this.realmUrl,
+      `${issueId}.json`,
+      this.options,
+    );
+    if (!readResult.ok || !readResult.document) {
+      throw new Error(
+        `Failed to read issue "${issueId}" for update: ${readResult.error ?? 'no document returned'}`,
+      );
+    }
+
+    let doc = readResult.document;
+    let attrs = (doc.data.attributes ?? {}) as Record<string, unknown>;
+
+    if (updates.status != null) {
+      attrs.status = updates.status;
+    }
+    if (updates.description != null) {
+      attrs.description = updates.description;
+    }
+    attrs.updatedAt = new Date().toISOString();
+
+    doc.data.attributes = attrs;
+
+    let writeResult = await writeFile(
+      this.realmUrl,
+      `${issueId}.json`,
+      JSON.stringify(doc, null, 2),
+      this.options,
+    );
+
+    if (!writeResult.ok) {
+      throw new Error(
+        `Failed to write issue "${issueId}": ${writeResult.error}`,
+      );
+    }
+
+    log.info(`Updated issue "${issueId}": ${JSON.stringify(updates)}`);
+  }
 }
 
 // ---------------------------------------------------------------------------
diff --git a/packages/software-factory/src/realm-operations.ts b/packages/software-factory/src/realm-operations.ts
index baf1e55c86..6ace513827 100644
--- a/packages/software-factory/src/realm-operations.ts
+++ b/packages/software-factory/src/realm-operations.ts
@@ -2,7 +2,7 @@
  * Shared realm operations for the software-factory scripts.
  *
  * Centralizes HTTP-based realm API calls so they're easy to find and
- * refactor to boxel-cli tool calls when --jwt support is added (CS-10529).
+ * refactor to boxel-cli tool calls (CS-10529).
  */
 
 import { mkdirSync, writeFileSync } from 'node:fs';
@@ -792,6 +792,74 @@ async function addRealmToMatrixAccountData(
   }
 }
 
+// ---------------------------------------------------------------------------
+// Fetch Realm Filenames
+// ---------------------------------------------------------------------------
+
+/**
+ * Fetch the list of file paths from a realm via the `_mtimes` endpoint.
+ * Returns relative file paths (e.g., `hello.gts`, `Cards/my-card.json`).
+ */
+export async function fetchRealmFilenames(
+  realmUrl: string,
+  options?: RealmFetchOptions,
+): Promise<{ filenames: string[]; error?: string }> {
+  let fetchImpl = options?.fetch ?? globalThis.fetch;
+  let normalizedRealmUrl = ensureTrailingSlash(realmUrl);
+
+  let headers = buildAuthHeaders(
+    options?.authorization,
+    SupportedMimeType.JSONAPI,
+  );
+
+  let mtimesUrl = `${normalizedRealmUrl}_mtimes`;
+  let mtimesResponse: Response;
+  try {
+    mtimesResponse = await fetchImpl(mtimesUrl, { method: 'GET', headers });
+  } catch (err) {
+    return {
+      filenames: [],
+      error: `Failed to fetch _mtimes: ${err instanceof Error ? err.message : String(err)}`,
+    };
+  }
+
+  if (!mtimesResponse.ok) {
+    let body = await mtimesResponse.text();
+    return {
+      filenames: [],
+      error: `_mtimes returned HTTP ${mtimesResponse.status}: ${body.slice(0, 300)}`,
+    };
+  }
+
+  let mtimes: Record<string, number>;
+  try {
+    let json = await mtimesResponse.json();
+    // _mtimes returns JSON:API format: { data: { attributes: { mtimes: {...} } } }
+    mtimes =
+      (json as { data?: { attributes?: { mtimes?: Record<string, number> } } })
+        ?.data?.attributes?.mtimes ?? json;
+  } catch {
+    return {
+      filenames: [],
+      error: 'Failed to parse _mtimes response as JSON',
+    };
+  }
+
+  let filenames: string[] = [];
+  for (let fullUrl of Object.keys(mtimes)) {
+    if (!fullUrl.startsWith(normalizedRealmUrl)) {
+      continue;
+    }
+    let relativePath = fullUrl.slice(normalizedRealmUrl.length);
+    if (!relativePath || relativePath.endsWith('/')) {
+      continue;
+    }
+    filenames.push(relativePath);
+  }
+
+  return { filenames: filenames.sort() };
+}
+
 // ---------------------------------------------------------------------------
 // Pull Realm Files
 // ---------------------------------------------------------------------------
@@ -800,7 +868,7 @@ async function addRealmToMatrixAccountData(
  * Download all files from a remote realm to a local directory using the
  * `_mtimes` endpoint to discover file paths.
  *
- * TODO: Replace with `boxel pull --jwt ` once CS-10529 is implemented.
+ * TODO: Replace with `boxel pull` once CS-10529 is implemented.
  *
  * Returns the list of relative file paths that were downloaded.
  */
diff --git a/packages/software-factory/src/validators/noop-step.ts b/packages/software-factory/src/validators/noop-step.ts
new file mode 100644
index 0000000000..3076cdc7fe
--- /dev/null
+++ b/packages/software-factory/src/validators/noop-step.ts
@@ -0,0 +1,24 @@
+import type { ValidationStep, ValidationStepResult } from '../factory-agent';
+
+import type { ValidationStepRunner } from './validation-pipeline';
+
+/**
+ * No-op validation step that always passes.
+ * Used as a placeholder for unimplemented steps (parse, lint, evaluate, instantiate).
+ * Each placeholder will be replaced by a real implementation via child tickets.
+ */
+export class NoOpStepRunner implements ValidationStepRunner {
+  readonly step: ValidationStep;
+
+  constructor(step: ValidationStep) {
+    this.step = step;
+  }
+
+  async run(_targetRealmUrl: string): Promise<ValidationStepResult> {
+    return { step: this.step, passed: true, errors: [] };
+  }
+
+  formatForContext(_result: ValidationStepResult): string {
+    return '';
+  }
+}
diff --git a/packages/software-factory/src/validators/test-step.ts b/packages/software-factory/src/validators/test-step.ts
new file mode 100644
index 0000000000..a9c8e9b87f
--- /dev/null
+++ b/packages/software-factory/src/validators/test-step.ts
@@ -0,0 +1,359 @@
+/**
+ * Test validation step — runs QUnit tests in the target realm
+ * and reads back detailed results from the completed TestRun card.
+ *
+ * Wraps `executeTestRunFromRealm()` for the actual test execution,
+ * then reads the TestRun card from the realm to get detailed failure
+ * data (individual test names, assertions, stack traces).
+ *
+ * Per phase-2-plan.md, realm reads will eventually become local filesystem
+ * reads after boxel-cli integration — cheap rather than HTTP round-trips.
+ */
+
+import type { ValidationStepResult } from '../factory-agent';
+import { deriveIssueSlug } from '../factory-agent-types';
+import type { LooseSingleCardDocument } from '@cardstack/runtime-common';
+
+import { executeTestRunFromRealm } from '../test-run-execution';
+import {
+  fetchRealmFilenames,
+  readFile,
+  type RealmFetchOptions,
+} from '../realm-operations';
+import type { ExecuteTestRunOptions, TestRunHandle } from '../test-run-types';
+import { logger } from '../logger';
+
+import type { ValidationStepRunner } from './validation-pipeline';
+
+let log = logger('test-validation-step');
+
+// ---------------------------------------------------------------------------
+// Types
+// ---------------------------------------------------------------------------
+
+export interface TestValidationStepConfig {
+  authorization?: string;
+  fetch?: typeof globalThis.fetch;
+  realmServerUrl: string;
+  hostAppUrl: string;
+  testResultsModuleUrl: string;
+  issueId?: string;
+  /** Injected for testing — defaults to executeTestRunFromRealm. */
+  executeTestRun?: (options: ExecuteTestRunOptions) => Promise<TestRunHandle>;
+  /** Injected for testing — defaults to fetchRealmFilenames. */
+  fetchFilenames?: (
+    realmUrl: string,
+    options?: RealmFetchOptions,
+  ) => Promise<{ filenames: string[]; error?: string }>;
+  /** Injected for testing — defaults to readFile from realm-operations. */
+  readCard?: (
+    realmUrl: string,
+    path: string,
+    options?: RealmFetchOptions,
+  ) => Promise<{
+    ok: boolean;
+    document?: LooseSingleCardDocument;
+    error?: string;
+  }>;
+}
+
+/** Flattened POJO for test validation details — not a card, just data. */
+export interface TestValidationDetails {
+  testRunId: string;
+  passedCount: number;
+  failedCount: number;
+  durationMs: number;
+  failures: TestValidationFailure[];
+}
+
+export interface TestValidationFailure {
+  testName: string;
+  module: string;
+  message: string;
+  stackTrace?: string;
+}
+
+// ---------------------------------------------------------------------------
+// TestValidationStep
+// ---------------------------------------------------------------------------
+
+export class TestValidationStep implements ValidationStepRunner {
+  readonly step = 'test' as const;
+
+  private config: TestValidationStepConfig;
+  private lastSequenceNumber = 0;
+
+  private executeTestRunFn: (
+    options: ExecuteTestRunOptions,
+  ) => Promise<TestRunHandle>;
+  private fetchFilenamesFn: (
+    realmUrl: string,
+    options?: RealmFetchOptions,
+  ) => Promise<{ filenames: string[]; error?: string }>;
+  private readCardFn: (
+    realmUrl: string,
+    path: string,
+    options?: RealmFetchOptions,
+  ) => Promise<{
+    ok: boolean;
+    document?: LooseSingleCardDocument;
+    error?: string;
+  }>;
+
+  constructor(config: TestValidationStepConfig) {
+    this.config = config;
+    this.executeTestRunFn = config.executeTestRun ?? executeTestRunFromRealm;
+    this.fetchFilenamesFn = config.fetchFilenames ?? fetchRealmFilenames;
+    this.readCardFn = config.readCard ?? readFile;
+  }
+
+  async run(targetRealmUrl: string): Promise<ValidationStepResult> {
+    // Step 1: Discover .test.gts files in the realm
+    let testFiles: string[];
+    try {
+      testFiles = await this.discoverTestFiles(targetRealmUrl);
+    } catch (err) {
+      return {
+        step: 'test',
+        passed: false,
+        errors: [
+          {
+            message: `Failed to discover test files: ${err instanceof Error ? err.message : String(err)}`,
+          },
+        ],
+      };
+    }
+
+    if (testFiles.length === 0) {
+      log.info('No .test.gts files found — nothing to validate');
+      return { step: 'test', passed: true, files: [], errors: [] };
+    }
+
+    log.info(`Found ${testFiles.length} test file(s): ${testFiles.join(', ')}`);
+
+    // Step 2: Execute tests
+    let handle: TestRunHandle;
+    try {
+      let slug = this.config.issueId
+        ? deriveIssueSlug(this.config.issueId)
+        : 'validation';
+
+      handle = await this.executeTestRunFn({
+        targetRealmUrl,
+        testResultsModuleUrl: this.config.testResultsModuleUrl,
+        slug,
+        testNames: [],
+        authorization: this.config.authorization,
+        fetch: this.config.fetch,
+        realmServerUrl: this.config.realmServerUrl,
+        hostAppUrl: this.config.hostAppUrl,
+        forceNew: true,
+        lastSequenceNumber: this.lastSequenceNumber,
+      });
+
+      if (handle.sequenceNumber != null) {
+        this.lastSequenceNumber = handle.sequenceNumber;
+      }
+    } catch (err) {
+      return {
+        step: 'test',
+        passed: false,
+        files: testFiles,
+        errors: [
+          {
+            message: `Test execution failed: ${err instanceof Error ? err.message : String(err)}`,
+          },
+        ],
+      };
+    }
+
+    // Step 3: Read back the completed TestRun card for detailed results
+    let details = await this.readTestRunDetails(
+      targetRealmUrl,
+      handle.testRunId,
+    );
+
+    // Step 4: Map to ValidationStepResult
+    if (handle.status === 'passed') {
+      return {
+        step: 'test',
+        passed: true,
+        files: testFiles,
+        errors: [],
+        details: details as unknown as Record<string, unknown>,
+      };
+    }
+
+    let errors =
+      details && details.failures.length > 0
+        ? details.failures.map((f) => ({
+            file: f.module,
+            message: `${f.testName}: ${f.message}`,
+            stackTrace: f.stackTrace,
+          }))
+        : [
+            {
+              message: handle.errorMessage ?? `Tests ${handle.status}`,
+            },
+          ];
+
+    return {
+      step: 'test',
+      passed: false,
+      files: testFiles,
+      errors,
+      details: details as unknown as Record<string, unknown>,
+    };
+  }
+
+  formatForContext(result: ValidationStepResult): string {
+    if (result.passed) {
+      let details = result.details as unknown as
+        | TestValidationDetails
+        | undefined;
+      if (details && details.passedCount > 0) {
+        return `## Test Validation: PASSED\n${details.passedCount} test(s) passed (TestRun: ${details.testRunId})`;
+      }
+      return '';
+    }
+
+    let details = result.details as unknown as
+      | TestValidationDetails
+      | undefined;
+    if (!details) {
+      // No detailed data — format from errors array
+      let errorLines = result.errors.map((e) => `- ${e.message}`).join('\n');
+      return `## Test Validation: FAILED\n${errorLines}`;
+    }
+
+    let lines: string[] = [
+      `## Test Validation: FAILED`,
+      `${details.passedCount} passed, ${details.failedCount} failed (TestRun: ${details.testRunId})`,
+    ];
+
+    for (let failure of details.failures) {
+      lines.push('');
+      lines.push(`FAILED: "${failure.testName}" (${failure.module})`);
+      lines.push(`  ${failure.message}`);
+      if (failure.stackTrace) {
+        lines.push(`  ${failure.stackTrace.slice(0, 300)}`);
+      }
+    }
+
+    return lines.join('\n');
+  }
+
+  // -------------------------------------------------------------------------
+  // Private helpers
+  // -------------------------------------------------------------------------
+
+  private async discoverTestFiles(targetRealmUrl: string): Promise<string[]> {
+    let result = await this.fetchFilenamesFn(targetRealmUrl, {
+      authorization: this.config.authorization,
+      fetch: this.config.fetch,
+    });
+
+    if (result.error) {
+      log.warn(`Failed to fetch realm filenames: ${result.error}`);
+      throw new Error(result.error);
+    }
+
+    return result.filenames.filter((f) => f.endsWith('.test.gts'));
+  }
+
+  private async readTestRunDetails(
+    targetRealmUrl: string,
+    testRunId: string,
+  ): Promise<TestValidationDetails | undefined> {
+    try {
+      let result = await this.readCardFn(targetRealmUrl, testRunId, {
+        authorization: this.config.authorization,
+        fetch: this.config.fetch,
+      });
+
+      if (!result.ok || !result.document) {
+        log.warn(
+          `Could not read TestRun card ${testRunId}: ${result.error ?? 'unknown error'}`,
+        );
+        return undefined;
+      }
+
+      return extractTestDetails(testRunId, result.document);
+    } catch (err) {
+      log.warn(
+        `Error reading TestRun card ${testRunId}: ${err instanceof Error ? err.message : String(err)}`,
+      );
+      return undefined;
+    }
+  }
+}
+
+// ---------------------------------------------------------------------------
+// Helpers
+// ---------------------------------------------------------------------------
+
+interface TestRunCardAttributes {
+  status?: string;
+  passedCount?: number;
+  failedCount?: number;
+  durationMs?: number;
+  errorMessage?: string;
+  moduleResults?: {
+    moduleRef?: { module: string; name: string };
+    results?: {
+      testName?: string;
+      status?: string;
+      message?: string;
+      stackTrace?: string;
+      durationMs?: number;
+    }[];
+  }[];
+}
+
+/**
+ * Extract test validation details from a TestRun card document.
+ * Handles the JSON:API document shape returned by `readFile()`.
+ */
+function extractTestDetails(
+  testRunId: string,
+  document: LooseSingleCardDocument,
+): TestValidationDetails {
+  let attrs = (document.data?.attributes ?? {}) as TestRunCardAttributes;
+
+  let failures: TestValidationFailure[] = [];
+  let passedCount = 0;
+  let failedCount = 0;
+
+  for (let moduleResult of attrs.moduleResults ?? []) {
+    let moduleName = moduleResult.moduleRef?.module ?? 'unknown';
+    for (let result of moduleResult.results ?? []) {
+      if (result.status === 'passed') {
+        passedCount++;
+      } else if (result.status === 'failed' || result.status === 'error') {
+        failedCount++;
+        failures.push({
+          testName: result.testName ?? 'unknown test',
+          module: moduleName,
+          message: result.message ?? `Test ${result.status}`,
+          stackTrace: result.stackTrace,
+        });
+      }
+    }
+  }
+
+  // Prefer the card's computed counts if available
+  if (attrs.passedCount != null) {
+    passedCount = attrs.passedCount;
+  }
+  if (attrs.failedCount != null) {
+    failedCount = attrs.failedCount;
+  }
+
+  return {
+    testRunId,
+    passedCount,
+    failedCount,
+    durationMs: attrs.durationMs ?? 0,
+    failures,
+  };
+}
diff --git a/packages/software-factory/src/validators/validation-pipeline.ts b/packages/software-factory/src/validators/validation-pipeline.ts
new file mode 100644
index 0000000000..c4686c6480
--- /dev/null
+++ b/packages/software-factory/src/validators/validation-pipeline.ts
@@ -0,0 +1,176 @@
+/**
+ * Modular validation pipeline for the issue-driven loop.
+ *
+ * Each validation step is a separate module implementing `ValidationStepRunner`.
+ * The pipeline runs all steps concurrently via `Promise.allSettled()` and
+ * aggregates results. Adding a new step = creating a new module + one line
+ * in `createDefaultPipeline()`.
+ */
+
+import type {
+  ValidationStep,
+  ValidationStepResult,
+  ValidationResults,
+} from '../factory-agent';
+
+import type { Validator } from '../issue-loop';
+
+import { NoOpStepRunner } from './noop-step';
+import { TestValidationStep } from './test-step';
+
+import type { TestValidationStepConfig } from './test-step';
+
+import { logger } from '../logger';
+
+let log = logger('validation-pipeline');
+
+// ---------------------------------------------------------------------------
+// ValidationStepRunner interface
+// ---------------------------------------------------------------------------
+
+/**
+ * Contract that every validation step module must implement.
+ *
+ * Each step:
+ * - Returns a result even when there's nothing to validate (passed: true)
+ * - Provides step-specific `details` on the result for context formatting
+ * - Implements `formatForContext()` to produce LLM-friendly output
+ */
+export interface ValidationStepRunner {
+  readonly step: ValidationStep;
+  run(targetRealmUrl: string): Promise<ValidationStepResult>;
+  /** Format step results for LLM context. Returns human-readable string, empty if nothing to report. */
+  formatForContext(result: ValidationStepResult): string;
+}
+
+// ---------------------------------------------------------------------------
+// ValidationPipeline
+// ---------------------------------------------------------------------------
+
+/**
+ * Implements the `Validator` interface from issue-loop.ts.
+ * Runs all step runners concurrently via `Promise.allSettled()`.
+ * A failure or exception in one step does not prevent others from completing.
+ */
+export class ValidationPipeline implements Validator {
+  private runners: ValidationStepRunner[];
+
+  constructor(runners: ValidationStepRunner[]) {
+    this.runners = runners;
+  }
+
+  async validate(targetRealmUrl: string): Promise<ValidationResults> {
+    if (this.runners.length === 0) {
+      return { passed: true, steps: [] };
+    }
+
+    let settled = await Promise.allSettled(
+      this.runners.map((runner) => runner.run(targetRealmUrl)),
+    );
+
+    let stepResults: ValidationStepResult[] = [];
+    let allPassed = true;
+
+    for (let i = 0; i < settled.length; i++) {
+      let outcome = settled[i];
+      if (outcome.status === 'fulfilled') {
+        stepResults.push(outcome.value);
+        if (!outcome.value.passed) {
+          allPassed = false;
+        }
+      } else {
+        // Step threw an exception — capture as a failed result
+        let reason = outcome.reason;
+        let message = reason instanceof Error ? reason.message : String(reason);
+        log.error(
+          `Validation step "${this.runners[i].step}" threw: ${message}`,
+        );
+        stepResults.push({
+          step: this.runners[i].step,
+          passed: false,
+          errors: [{ message }],
+        });
+        allPassed = false;
+      }
+    }
+
+    return { passed: allPassed, steps: stepResults };
+  }
+
+  /**
+   * Format all validation results for LLM context.
+   * Delegates to each step runner's `formatForContext()`.
+   * Returns a combined markdown string suitable for inclusion in the agent prompt.
+   */
+  formatForContext(results: ValidationResults): string {
+    if (results.passed && results.steps.every((s) => s.passed)) {
+      return 'All validation steps passed.';
+    }
+
+    let sections: string[] = [];
+
+    // Summary line
+    let failedSteps = results.steps.filter((s) => !s.passed);
+    let passedSteps = results.steps.filter((s) => s.passed);
+    sections.push(
+      `Validation: ${failedSteps.length} step(s) failed, ${passedSteps.length} passed.`,
+    );
+
+    // Per-step details from runners
+    for (let i = 0; i < this.runners.length; i++) {
+      let runner = this.runners[i];
+      let stepResult = results.steps.find((s) => s.step === runner.step);
+      if (!stepResult) {
+        continue;
+      }
+
+      let formatted = runner.formatForContext(stepResult);
+      if (formatted) {
+        sections.push(formatted);
+      }
+    }
+
+    return sections.join('\n\n');
+  }
+}
+
+// ---------------------------------------------------------------------------
+// Factory function
+// ---------------------------------------------------------------------------
+
+export interface ValidationPipelineConfig {
+  authorization?: string;
+  fetch?: typeof globalThis.fetch;
+  realmServerUrl: string;
+  hostAppUrl: string;
+  testResultsModuleUrl: string;
+  issueId?: string;
+  /** Injected for testing — passed through to TestValidationStep. */
+  fetchFilenames?: TestValidationStepConfig['fetchFilenames'];
+}
+
+/**
+ * Create the default validation pipeline with all 5 steps.
+ * Currently only the test step is implemented; others are NoOp placeholders.
+ */
+export function createDefaultPipeline(
+  config: ValidationPipelineConfig,
+): ValidationPipeline {
+  let testConfig: TestValidationStepConfig = {
+    authorization: config.authorization,
+    fetch: config.fetch,
+    realmServerUrl: config.realmServerUrl,
+    hostAppUrl: config.hostAppUrl,
+    testResultsModuleUrl: config.testResultsModuleUrl,
+    issueId: config.issueId,
+    fetchFilenames: config.fetchFilenames,
+  };
+
+  return new ValidationPipeline([
+    new NoOpStepRunner('parse'),
+    new NoOpStepRunner('lint'),
+    new NoOpStepRunner('evaluate'),
+    new NoOpStepRunner('instantiate'),
+    new TestValidationStep(testConfig),
+  ]);
+}
diff --git a/packages/software-factory/tests/index.ts b/packages/software-factory/tests/index.ts
index 1ac1eb4b27..4295a33dc1 100644
--- a/packages/software-factory/tests/index.ts
+++ b/packages/software-factory/tests/index.ts
@@ -16,3 +16,7 @@ import './factory-tool-builder.test';
 import './factory-loop.test';
 import './factory-implement.test';
 import './realm-auth.test';
+import './issue-loop.test';
+import './issue-scheduler.test';
+import './validation-pipeline.test';
+import './test-step.test';
diff --git a/packages/software-factory/tests/issue-loop.test.ts b/packages/software-factory/tests/issue-loop.test.ts
index 295d233d1d..b5f1f72336 100644
--- a/packages/software-factory/tests/issue-loop.test.ts
+++ b/packages/software-factory/tests/issue-loop.test.ts
@@ -25,6 +25,7 @@ import {
 
 class MockIssueStore implements IssueStore {
   issues: SchedulableIssue[];
+  updateCalls: { issueId: string; updates: Record<string, unknown> }[] = [];
 
   constructor(issues: SchedulableIssue[]) {
     this.issues = issues.map((i) => ({ ...i }));
@@ -41,6 +42,13 @@
     }
     return { ...issue };
   }
+
+  async updateIssue(
+    issueId: string,
+    updates: { status?: string; description?: string },
+  ): Promise<void> {
+    this.updateCalls.push({ issueId, updates });
+  }
 }
 
 // ---------------------------------------------------------------------------
@@ -449,7 +457,7 @@ module('issue-loop > blocked issue', function () {
 // ---------------------------------------------------------------------------
 
 module('issue-loop > max inner iterations', function () {
-  test('exits inner loop after maxIterationsPerIssue', async function (assert) {
+  test('blocks issue when max iterations reached with failing validation', async function (assert) {
     let store = new MockIssueStore([
       makeIssue({ id: 'iss-1', status: 'backlog', priority: 'high', order: 1 }),
     ]);
@@ -481,13 +489,89 @@ module('issue-loop > max inner iterations', function () {
       }),
     );
 
-    assert.strictEqual(result.issueResults[0].exitReason, 'max_iterations');
+    // When max iterations is hit with failing validation, issue is blocked
+    assert.strictEqual(result.issueResults[0].exitReason, 'blocked');
     assert.strictEqual(result.issueResults[0].innerIterations, 3);
     assert.false(
       result.issueResults[0].lastValidation?.passed,
       'last validation was a failure',
    );
  });
+
+  test('max iterations with passing validation keeps max_iterations exit reason', async function (assert) {
+    let store = new MockIssueStore([
+      makeIssue({ id: 'iss-1', status: 'backlog', priority: 'high', order: 1 }),
+    ]);
+
+    let turns: MockAgentTurn[] = [];
+    let validations: ValidationResults[] = [];
+
+    for (let i = 0; i < 3; i++) {
+      turns.push({
+        toolCalls: [
+          {
+            tool: 'write_file',
+            args: { path: 'card.gts', content: `attempt ${i}` },
+          },
+        ],
+      });
+      validations.push(makePassingValidation());
+    }
+
+    let agent = new MockLoopAgent(turns, store);
+
+    let result = await runIssueLoop(
+      makeLoopConfig({
+        agent,
+        issueStore: store,
+        validator: new MockValidator(validations),
+        maxIterationsPerIssue: 3,
+      }),
+    );
+
+    // When validation passes but issue not done, exit reason stays max_iterations
+    assert.strictEqual(result.issueResults[0].exitReason, 'max_iterations');
+    assert.strictEqual(result.issueResults[0].innerIterations, 3);
+  });
+
+  test('updateIssue called when blocking due to max iterations + failing validation', async function (assert) {
+    let store = new MockIssueStore([
+      makeIssue({ id: 'iss-1', status: 'backlog', priority: 'high', order: 1 }),
+    ]);
+
+    let turns: MockAgentTurn[] = [];
+    let validations: ValidationResults[] = [];
+
+    for (let i = 0; i < 2; i++) {
+      turns.push({
+        toolCalls: [
+          { tool: 'write_file', args: { path: 'card.gts', content: `v${i}` } },
+        ],
+      });
+      validations.push(makeFailingValidation());
+    }
+
+    let agent = new MockLoopAgent(turns, store);
+
+    await runIssueLoop(
+      makeLoopConfig({
+        agent,
+        issueStore: store,
+        validator: new MockValidator(validations),
+        maxIterationsPerIssue: 2,
+      }),
+    );
+
+    assert.strictEqual(store.updateCalls.length, 1, 'updateIssue called once');
+    assert.strictEqual(store.updateCalls[0].issueId, 'iss-1');
+    assert.strictEqual(store.updateCalls[0].updates.status, 'blocked');
+    assert.ok(
+      (store.updateCalls[0].updates.description as string).includes(
+        'max iteration limit',
+      ),
+      'description includes reason',
+    );
+  });
 });
 
 // ---------------------------------------------------------------------------
diff --git a/packages/software-factory/tests/issue-scheduler.test.ts b/packages/software-factory/tests/issue-scheduler.test.ts
index 10cea3dafa..2ed9619722 100644
--- a/packages/software-factory/tests/issue-scheduler.test.ts
+++ b/packages/software-factory/tests/issue-scheduler.test.ts
@@ -26,6 +26,10 @@ class MockIssueStore implements IssueStore {
     }
     return { ...issue };
   }
+
+  async updateIssue(): Promise<void> {
+    // no-op for scheduler tests
+  }
 }
 
 // ---------------------------------------------------------------------------
diff --git a/packages/software-factory/tests/test-step.test.ts b/packages/software-factory/tests/test-step.test.ts
new file mode 100644
index 0000000000..79d557044c
--- /dev/null
+++ b/packages/software-factory/tests/test-step.test.ts
@@ -0,0 +1,399 @@
+import { module, test } from 'qunit';
+
+import type { LooseSingleCardDocument } from '@cardstack/runtime-common';
+
+import type { ValidationStepResult } from '../src/factory-agent';
+
+import {
+  TestValidationStep,
+  type TestValidationStepConfig,
+  type TestValidationDetails,
+} from '../src/validators/test-step';
+
+import type {
+  TestRunHandle,
+  ExecuteTestRunOptions,
+} from '../src/test-run-types';
+import type { RealmFetchOptions } from '../src/realm-operations';
+
+// ---------------------------------------------------------------------------
+// Mock helpers
+// ---------------------------------------------------------------------------
+
+function makeConfig(
+  overrides: Partial<TestValidationStepConfig> = {},
+): TestValidationStepConfig {
+  return {
+    realmServerUrl: 'https://example.test/',
+    hostAppUrl: 'https://example.test/',
+    testResultsModuleUrl: 'https://example.test/test-results',
+    ...overrides,
+  };
+}
+
+function makeFetchFilenames(
+  filenames: string[],
+): (
+  realmUrl: string,
+  options?: RealmFetchOptions,
+) => Promise<{ filenames: string[]; error?: string }> {
+  return async () => ({ filenames });
+}
+
+function makeFetchFilenamesError(
+  error: string,
+): (
+  realmUrl: string,
+  options?: RealmFetchOptions,
+) => Promise<{ filenames: string[]; error?: string }> {
+  return async () => ({ filenames: [], error });
+}
+
+function makeExecuteTestRun(
+  handle: TestRunHandle,
+): (options: ExecuteTestRunOptions) => Promise<TestRunHandle> {
+  return async () => handle;
+}
+
+function makeExecuteTestRunThrows(
+  errorMessage: string,
+): (options: ExecuteTestRunOptions) => Promise<TestRunHandle> {
+  return async () => {
+    throw new Error(errorMessage);
+  };
+}
+
+function makeTestRunCardDocument(
+  attrs: Record<string, unknown>,
+): LooseSingleCardDocument {
+  return {
+    data: {
+      type: 'card',
+      attributes: attrs,
+      meta: {
+        adoptsFrom: { module: 'test-results', name: 'TestRun' },
+      },
+    },
+  } as LooseSingleCardDocument;
+}
+
+function makeReadCard(document: LooseSingleCardDocument): (
+  realmUrl: string,
+  path: string,
+  options?: RealmFetchOptions,
+) => Promise<{
+  ok: boolean;
+  document?: LooseSingleCardDocument;
+  error?: string;
+}> {
+  return async () => ({ ok: true, document });
+}
+
+function makeReadCardError(error: string): (
+  realmUrl: string,
+  path: string,
+  options?: RealmFetchOptions,
+) => Promise<{
+  ok: boolean;
+  document?: LooseSingleCardDocument;
+  error?: string;
+}> {
+  return async () => ({ ok: false, error });
+}
+
+// ---------------------------------------------------------------------------
+// Tests
+// ---------------------------------------------------------------------------
+
+module('TestValidationStep', function () {
+  test('no .test.gts files returns passed', async function (assert) {
+    let step = new TestValidationStep(
+      makeConfig({
+        fetchFilenames: makeFetchFilenames([
+          'hello.gts',
+          'Cards/my-card.json',
+          'index.json',
+        ]),
+      }),
+    );
+
+    let result = await step.run('https://example.test/realm/');
+
+    assert.true(result.passed);
+    assert.strictEqual(result.step, 'test');
+    assert.strictEqual(result.errors.length, 0);
+    assert.deepEqual(result.files, []);
+  });
+
+  test('tests exist and pass — returns passed with details', async function (assert) {
+    let testRunDoc = makeTestRunCardDocument({
+      status: 'passed',
+      passedCount: 3,
+      failedCount: 0,
+      durationMs: 1500,
+      moduleResults: [
+        {
+          moduleRef: { module: 'hello.test.gts', name: 'default' },
+          results: [
+            { testName: 'renders greeting', status: 'passed', durationMs: 500 },
+            { testName: 'shows title', status: 'passed', durationMs: 500 },
+            { testName: 'has style', status: 'passed', durationMs: 500 },
+          ],
+        },
+      ],
+    });
+
+    let step = new TestValidationStep(
+      makeConfig({
+        fetchFilenames: makeFetchFilenames(['hello.gts', 'hello.test.gts']),
+        executeTestRun: makeExecuteTestRun({
+          testRunId: 'Test Runs/validation-1',
+          status: 'passed',
+          sequenceNumber: 1,
+        }),
+        readCard: makeReadCard(testRunDoc),
+      }),
+    );
+
+    let result = await step.run('https://example.test/realm/');
+
+    assert.true(result.passed);
+    assert.strictEqual(result.step, 'test');
+    assert.deepEqual(result.files, ['hello.test.gts']);
+    assert.ok(result.details, 'has details');
+
+    let details = result.details as unknown as TestValidationDetails;
+    assert.strictEqual(details.testRunId, 'Test Runs/validation-1');
+    assert.strictEqual(details.passedCount, 3);
+    assert.strictEqual(details.failedCount, 0);
+    assert.strictEqual(details.failures.length, 0);
+  });
+
+  test('tests exist and fail — returns failed with detailed failures', async function (assert) {
+    let testRunDoc = makeTestRunCardDocument({
+      status: 'failed',
+      passedCount: 1,
+      failedCount: 1,
+      durationMs: 2000,
+      moduleResults: [
+        {
+          moduleRef: { module: 'hello.test.gts', name: 'default' },
+          results: [
+            { testName: 'renders greeting', status: 'passed', durationMs: 500 },
+            {
+              testName: 'shows author',
+              status: 'failed',
+              message: "Expected 'Alice' but got ''",
+              stackTrace: 'at hello.test.gts:15:5',
+              durationMs: 500,
+            },
+          ],
+        },
+      ],
+    });
+
+    let step = new TestValidationStep(
+      makeConfig({
+        fetchFilenames: makeFetchFilenames(['hello.test.gts']),
+        executeTestRun: makeExecuteTestRun({
+          testRunId: 'Test Runs/validation-1',
+          status: 'failed',
+          sequenceNumber: 1,
+        }),
+        readCard: makeReadCard(testRunDoc),
+      }),
+    );
+
+    let result = await step.run('https://example.test/realm/');
+
+    assert.false(result.passed);
+    assert.strictEqual(result.step, 'test');
+    assert.ok(result.errors.length > 0, 'has errors');
+    assert.ok(result.details, 'has details');
+
+    let details = result.details as unknown as TestValidationDetails;
+    assert.strictEqual(details.passedCount, 1);
+    assert.strictEqual(details.failedCount, 1);
+    assert.strictEqual(details.failures.length, 1);
+    assert.strictEqual(details.failures[0].testName, 'shows author');
+    assert.ok(details.failures[0].message.includes("Expected 'Alice'"));
+  });
+
+  test('executeTestRun throws — returns failed with error message', async function (assert) {
+    let step = new TestValidationStep(
+      makeConfig({
+        fetchFilenames: makeFetchFilenames(['hello.test.gts']),
+        executeTestRun: makeExecuteTestRunThrows('Browser launch failed'),
+      }),
+    );
+
+    let result = await step.run('https://example.test/realm/');
+
+    assert.false(result.passed);
+    assert.strictEqual(result.errors.length, 1);
+    assert.ok(result.errors[0].message.includes('Browser launch failed'));
+  });
+
+  test('fetchFilenames fails — returns failed with error', async function (assert) {
+    let step = new TestValidationStep(
+      makeConfig({
+        fetchFilenames: makeFetchFilenamesError('Network timeout'),
+      }),
+    );
+
+    let result = await step.run('https://example.test/realm/');
+
+    assert.false(result.passed);
+    assert.ok(result.errors[0].message.includes('Network timeout'));
+  });
+
+  test('sequence number tracked across calls', async function (assert) {
+    let capturedOptions: ExecuteTestRunOptions[] = [];
+
+    let step = new TestValidationStep(
+      makeConfig({
+        fetchFilenames: makeFetchFilenames(['hello.test.gts']),
+        executeTestRun: async (options) => {
+          capturedOptions.push(options);
+          return {
+            testRunId: `Test Runs/validation-${capturedOptions.length}`,
+            status: 'passed' as const,
+            sequenceNumber: capturedOptions.length,
+          };
+        },
+        readCard: makeReadCard(
+          makeTestRunCardDocument({
+            status: 'passed',
+            passedCount: 1,
+            failedCount: 0,
+            moduleResults: [],
+          }),
+        ),
+      }),
+    );
+
+    await step.run('https://example.test/realm/');
+    await step.run('https://example.test/realm/');
+
+    assert.strictEqual(capturedOptions[0].lastSequenceNumber, 0);
+    assert.strictEqual(capturedOptions[1].lastSequenceNumber, 1);
+  });
+
+  test('readCard failure falls back to handle-only result', async function (assert) {
+    let step = new TestValidationStep(
+      makeConfig({
+        fetchFilenames: makeFetchFilenames(['hello.test.gts']),
+        executeTestRun: makeExecuteTestRun({
+          testRunId: 'Test Runs/validation-1',
+          status: 'failed',
+          errorMessage: '2 tests failed',
+          sequenceNumber: 1,
+        }),
+        readCard: makeReadCardError('fetch failed'),
+      }),
+    );
+
+    let result = await step.run('https://example.test/realm/');
+
+    assert.false(result.passed);
+    assert.ok(result.errors.length > 0);
+    assert.ok(result.errors[0].message.includes('2 tests failed'));
+    assert.notOk(result.details, 'no details when card read fails');
+  });
+
+  test('formatForContext with passing result and details', function (assert) {
+    let step = new TestValidationStep(makeConfig());
+
+    let result: ValidationStepResult = {
+      step: 'test',
+      passed: true,
+      errors: [],
+      details: {
+        testRunId: 'Test Runs/validation-1',
+        passedCount: 5,
+        failedCount: 0,
+        durationMs: 1000,
+        failures: [],
+      },
+    };
+
+    let formatted = step.formatForContext(result);
+    assert.ok(formatted.includes('PASSED'));
+    assert.ok(formatted.includes('5'));
+  });
+
+  test('formatForContext with failing result and detailed failures', function (assert) {
+    let step = new TestValidationStep(makeConfig());
+
+    let result: ValidationStepResult = {
+      step: 'test',
+      passed: false,
+      errors: [{ message: 'shows author: Expected Alice but got empty' }],
+      details: {
+        testRunId: 'Test Runs/validation-1',
+        passedCount: 2,
+        failedCount: 1,
+        durationMs: 1500,
+        failures: [
+          {
+            testName: 'shows author',
+            module: 'hello.test.gts',
+            message: "Expected 'Alice' but got ''",
+          },
+        ],
+      },
+    };
+
+    let formatted = step.formatForContext(result);
+    assert.ok(formatted.includes('FAILED'));
+    assert.ok(formatted.includes('2 passed'));
+    assert.ok(formatted.includes('1 failed'));
+    assert.ok(formatted.includes('shows author'));
+    assert.ok(formatted.includes("Expected 'Alice'"));
+  });
+
+  test('formatForContext without details falls back to errors', function (assert) {
+    let step = new TestValidationStep(makeConfig());
+
+    let result: ValidationStepResult = {
+      step: 'test',
+      passed: false,
+      errors: [{ message: 'Browser launch failed' }],
+    };
+
+    let formatted =
step.formatForContext(result); + assert.ok(formatted.includes('FAILED')); + assert.ok(formatted.includes('Browser launch failed')); + }); + + test('issueId is used for slug derivation', async function (assert) { + let capturedOptions: ExecuteTestRunOptions | undefined; + + let step = new TestValidationStep( + makeConfig({ + issueId: 'Issues/sticky-note-define-core', + fetchFilenames: makeFetchFilenames(['hello.test.gts']), + executeTestRun: async (options) => { + capturedOptions = options; + return { + testRunId: 'Test Runs/sticky-note-define-core-1', + status: 'passed' as const, + sequenceNumber: 1, + }; + }, + readCard: makeReadCard( + makeTestRunCardDocument({ + status: 'passed', + passedCount: 1, + failedCount: 0, + moduleResults: [], + }), + ), + }), + ); + + await step.run('https://example.test/realm/'); + + assert.strictEqual(capturedOptions?.slug, 'sticky-note-define-core'); + }); +}); diff --git a/packages/software-factory/tests/validation-pipeline.test.ts b/packages/software-factory/tests/validation-pipeline.test.ts new file mode 100644 index 0000000000..923bb8540f --- /dev/null +++ b/packages/software-factory/tests/validation-pipeline.test.ts @@ -0,0 +1,293 @@ +import { module, test } from 'qunit'; + +import type { + ValidationStep, + ValidationStepResult, + ValidationResults, +} from '../src/factory-agent'; + +import { + ValidationPipeline, + createDefaultPipeline, + type ValidationStepRunner, +} from '../src/issue-loop'; + +import { NoOpStepRunner } from '../src/validators/noop-step'; + +// --------------------------------------------------------------------------- +// Test helpers +// --------------------------------------------------------------------------- + +class MockStepRunner implements ValidationStepRunner { + readonly step: ValidationStep; + private result: ValidationStepResult; + runCount = 0; + + constructor(step: ValidationStep, result: Partial) { + this.step = step; + this.result = { + step, + passed: true, + errors: [], + ...result, + }; 
+ } + + async run(_targetRealmUrl: string): Promise { + this.runCount++; + return this.result; + } + + formatForContext(result: ValidationStepResult): string { + if (result.passed) { + return ''; + } + let errors = result.errors.map((e) => `- ${e.message}`).join('\n'); + return `## ${result.step}: FAILED\n${errors}`; + } +} + +class ThrowingStepRunner implements ValidationStepRunner { + readonly step: ValidationStep; + private errorMessage: string; + runCount = 0; + + constructor(step: ValidationStep, errorMessage: string) { + this.step = step; + this.errorMessage = errorMessage; + } + + async run(_targetRealmUrl: string): Promise { + this.runCount++; + throw new Error(this.errorMessage); + } + + formatForContext(_result: ValidationStepResult): string { + return ''; + } +} + +// --------------------------------------------------------------------------- +// Tests +// --------------------------------------------------------------------------- + +module('ValidationPipeline', function () { + test('empty pipeline returns passed with no steps', async function (assert) { + let pipeline = new ValidationPipeline([]); + let results = await pipeline.validate('https://example.test/realm/'); + + assert.true(results.passed); + assert.strictEqual(results.steps.length, 0); + }); + + test('all passing steps returns passed', async function (assert) { + let pipeline = new ValidationPipeline([ + new MockStepRunner('parse', { passed: true }), + new MockStepRunner('lint', { passed: true }), + new MockStepRunner('test', { passed: true }), + ]); + + let results = await pipeline.validate('https://example.test/realm/'); + + assert.true(results.passed); + assert.strictEqual(results.steps.length, 3); + assert.true(results.steps.every((s) => s.passed)); + }); + + test('one failing step makes overall result failed', async function (assert) { + let pipeline = new ValidationPipeline([ + new MockStepRunner('parse', { passed: true }), + new MockStepRunner('lint', { + passed: false, + errors: [{ 
message: 'lint violation' }], + }), + new MockStepRunner('test', { passed: true }), + ]); + + let results = await pipeline.validate('https://example.test/realm/'); + + assert.false(results.passed); + assert.strictEqual(results.steps.length, 3); + + let lintStep = results.steps.find((s) => s.step === 'lint'); + assert.false(lintStep?.passed); + assert.strictEqual(lintStep?.errors.length, 1); + assert.strictEqual(lintStep?.errors[0].message, 'lint violation'); + }); + + test('multiple failing steps all reported', async function (assert) { + let pipeline = new ValidationPipeline([ + new MockStepRunner('parse', { + passed: false, + errors: [{ message: 'syntax error' }], + }), + new MockStepRunner('lint', { + passed: false, + errors: [{ message: 'lint error' }], + }), + new MockStepRunner('test', { + passed: false, + errors: [{ message: 'test failure' }], + }), + ]); + + let results = await pipeline.validate('https://example.test/realm/'); + + assert.false(results.passed); + assert.strictEqual(results.steps.filter((s) => !s.passed).length, 3); + }); + + test('steps run concurrently (all runners invoked)', async function (assert) { + let runners = [ + new MockStepRunner('parse', { passed: true }), + new MockStepRunner('lint', { passed: true }), + new MockStepRunner('evaluate', { passed: true }), + new MockStepRunner('instantiate', { passed: true }), + new MockStepRunner('test', { passed: true }), + ]; + + let pipeline = new ValidationPipeline(runners); + await pipeline.validate('https://example.test/realm/'); + + for (let runner of runners) { + assert.strictEqual(runner.runCount, 1, `${runner.step} should run once`); + } + }); + + test('exception in one step does not prevent others', async function (assert) { + let goodStep = new MockStepRunner('parse', { passed: true }); + let throwingStep = new ThrowingStepRunner('lint', 'kaboom'); + let anotherGoodStep = new MockStepRunner('test', { passed: true }); + + let pipeline = new ValidationPipeline([ + goodStep, + 
throwingStep, + anotherGoodStep, + ]); + + let results = await pipeline.validate('https://example.test/realm/'); + + assert.false(results.passed); + assert.strictEqual(results.steps.length, 3); + + // Good steps still ran + assert.strictEqual(goodStep.runCount, 1); + assert.strictEqual(anotherGoodStep.runCount, 1); + assert.strictEqual(throwingStep.runCount, 1); + + // Throwing step captured as failed + let lintStep = results.steps.find((s) => s.step === 'lint'); + assert.false(lintStep?.passed); + assert.strictEqual(lintStep?.errors.length, 1); + assert.strictEqual(lintStep?.errors[0].message, 'kaboom'); + + // Good steps passed + assert.true(results.steps.find((s) => s.step === 'parse')?.passed); + assert.true(results.steps.find((s) => s.step === 'test')?.passed); + }); + + test('exception captured as failed step result with error message', async function (assert) { + let pipeline = new ValidationPipeline([ + new ThrowingStepRunner('evaluate', 'module load failed'), + ]); + + let results = await pipeline.validate('https://example.test/realm/'); + + assert.false(results.passed); + assert.strictEqual(results.steps[0].step, 'evaluate'); + assert.false(results.steps[0].passed); + assert.strictEqual( + results.steps[0].errors[0].message, + 'module load failed', + ); + }); + + test('createDefaultPipeline creates 5 steps in correct order', async function (assert) { + let pipeline = createDefaultPipeline({ + realmServerUrl: 'https://example.test/', + hostAppUrl: 'https://example.test/', + testResultsModuleUrl: 'https://example.test/test-results', + // Inject a fetchFilenames that returns no test files so the test + // step returns "nothing to validate" without hitting a real realm + fetchFilenames: async () => ({ filenames: [] }), + }); + + // Verify step count and order by running validate and inspecting results + let results = await pipeline.validate('https://example.test/realm/'); + + assert.strictEqual(results.steps.length, 5, 'has 5 steps'); + 
assert.strictEqual(results.steps[0].step, 'parse', 'step 1 is parse'); + assert.strictEqual(results.steps[1].step, 'lint', 'step 2 is lint'); + assert.strictEqual(results.steps[2].step, 'evaluate', 'step 3 is evaluate'); + assert.strictEqual( + results.steps[3].step, + 'instantiate', + 'step 4 is instantiate', + ); + assert.strictEqual(results.steps[4].step, 'test', 'step 5 is test'); + assert.true(results.passed, 'all steps pass (NoOp + no test files)'); + }); + + test('formatForContext returns simple message when all pass', function (assert) { + let pipeline = new ValidationPipeline([ + new MockStepRunner('parse', { passed: true }), + ]); + + let results: ValidationResults = { + passed: true, + steps: [{ step: 'parse', passed: true, errors: [] }], + }; + + let formatted = pipeline.formatForContext(results); + assert.strictEqual(formatted, 'All validation steps passed.'); + }); + + test('formatForContext includes failure details from runners', function (assert) { + let runner = new MockStepRunner('lint', { + passed: false, + errors: [{ message: 'unexpected semicolon' }], + }); + + let pipeline = new ValidationPipeline([runner]); + + let results: ValidationResults = { + passed: false, + steps: [ + { + step: 'lint', + passed: false, + errors: [{ message: 'unexpected semicolon' }], + }, + ], + }; + + let formatted = pipeline.formatForContext(results); + assert.ok(formatted.includes('FAILED'), 'includes FAILED'); + assert.ok( + formatted.includes('unexpected semicolon'), + 'includes error message', + ); + }); +}); + +module('NoOpStepRunner', function () { + test('always returns passed with empty errors', async function (assert) { + let runner = new NoOpStepRunner('parse'); + let result = await runner.run('https://example.test/realm/'); + + assert.strictEqual(result.step, 'parse'); + assert.true(result.passed); + assert.strictEqual(result.errors.length, 0); + }); + + test('formatForContext returns empty string', function (assert) { + let runner = new 
NoOpStepRunner('lint'); + let result: ValidationStepResult = { + step: 'lint', + passed: true, + errors: [], + }; + + assert.strictEqual(runner.formatForContext(result), ''); + }); +});