Skip to content

fix: fall back to Running state for Aspire resources without health checks#5265

Closed
Copilot wants to merge 2 commits intomainfrom
copilot/fix-timeout-resources-not-ready
Closed

fix: fall back to Running state for Aspire resources without health checks#5265
Copilot wants to merge 2 commits intomainfrom
copilot/fix-timeout-resources-not-ready

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Mar 27, 2026

Aspire 13.2.0 changed WaitForResourceHealthyAsync to never complete for resources without HealthCheckAnnotation. The default AllHealthy wait behavior called this for all resources indiscriminately, causing timeouts on resources like executable projects that have no health checks configured.

  • Check each resource's HealthCheckAnnotation via DistributedApplicationModel before deciding the wait strategy
  • Resources with health checks → WaitForResourceHealthyAsync (unchanged)
  • Resources without health checks → WaitForResourceAsync(KnownResourceStates.Running)

The fix is in WaitForResourcesWithFailFastAsync, so it applies to all ResourceWaitBehavior modes that use waitForHealthy: true (AllHealthy, Named).

// Before: blindly waits for healthy on all resources
if (waitForHealthy)
    await notificationService.WaitForResourceHealthyAsync(name, cancellationToken);

// After: only resources with health check annotations wait for healthy
var useHealthy = waitForHealthy && (healthCheckNames?.Contains(name) ?? false);
if (useHealthy)
    await notificationService.WaitForResourceHealthyAsync(name, cancellationToken);
else
    await notificationService.WaitForResourceAsync(name, KnownResourceStates.Running, cancellationToken);

💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

Copilot AI linked an issue Mar 27, 2026 that may be closed by this pull request
1 task
…y wait behavior

In WaitForResourcesWithFailFastAsync, when waitForHealthy is true, check each
resource's HealthCheckAnnotation annotations. Resources with health checks wait
for healthy state; resources without health checks fall back to waiting for
Running state. This fixes timeouts caused by Aspire 13.2.0 no longer reporting
non-health-checked resources as healthy.

Agent-Logs-Url: https://github.com/thomhurst/TUnit/sessions/ea97898f-4689-44b8-a544-1cb56c0b5925

Co-authored-by: thomhurst <30480171+thomhurst@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review

The fix correctly addresses the Aspire 13.2.0 regression where WaitForResourceHealthyAsync never completes for resources without health check annotations. The approach is sound and the implementation is clean. A few concerns worth considering:

1. Duplicate DistributedApplicationModel resolution

In WaitForResourcesWithFailFastAsync, the fix resolves DistributedApplicationModel from app.Services to build healthCheckNames. However, the AllHealthy and AllRunning cases in WaitForResourcesAsync (lines 171 and 180) already resolve the same model to enumerate resource names before passing them to this method. This means the model is resolved twice for the AllHealthy path.

Consider passing the pre-resolved model (or the healthCheckNames set directly) into WaitForResourcesWithFailFastAsync to avoid the redundant service lookup. Alternatively, the health-check filtering could live in WaitForResourcesAsync itself, before calling the helper, which is where the AllHealthy vs AllRunning distinction already lives conceptually:

case ResourceWaitBehavior.AllHealthy:
{
    var model = app.Services.GetRequiredService<DistributedApplicationModel>();
    var names = model.Resources.Select(r => r.Name).ToList();
    var healthyNames = model.Resources
        .Where(r => r.Annotations.OfType<HealthCheckAnnotation>().Any())
        .Select(r => r.Name)
        .ToHashSet();
    await WaitForResourcesWithFailFastAsync(app, notificationService, names, healthyNames, cancellationToken);
    break;
}

This makes WaitForResourcesWithFailFastAsync a pure "wait for these names, using healthy for this subset" function, keeping the policy decision in WaitForResourcesAsync where all the other policy decisions live.

2. The Named behavior silently gained "as healthy as possible" semantics

The Named case (line 188) still calls WaitForResourcesWithFailFastAsync with waitForHealthy: true, and with this PR it will now silently fall back to Running for named resources that lack health check annotations. This is probably the right behaviour, but it's worth noting in the XML doc for ResourceWaitBehavior.Named (currently just "Wait only for resources returned by ResourcesToWaitFor()") that the same health-check fallback applies. Without documentation, a user who specifies named resources expecting a strict "healthy" wait may be surprised.

3. targetState log message is now slightly misleading

The LogProgress call on line 401 (unchanged) still reads:

Waiting for N resource(s) to become healthy: [...]

...even when some of those resources will actually wait for running. With the per-resource state variable introduced in the per-resource log on completion, the summary line at the start should also reflect the mixed intent — something like "healthy (or running where no health check exists)" — or the log could be deferred until after healthCheckNames is built and list which resources will use which strategy.

4. Null-conditional in the lambda is always false when waitForHealthy is true

var useHealthy = waitForHealthy && (healthCheckNames?.Contains(name) ?? false);

When waitForHealthy is false, healthCheckNames is null and the ?? correctly short-circuits to false. When waitForHealthy is true, healthCheckNames is always non-null (it was just assigned from new HashSet<string>(...)), so the ?. / ?? false guard is never needed. This is a minor readability point, but it could be simplified to:

var useHealthy = waitForHealthy && healthCheckNames!.Contains(name);

...inside the block where waitForHealthy is known to be true, or by restructuring the condition so the null-safety is not misleading about the invariant. The current code is not wrong, just slightly obscures the invariant that healthCheckNames != null whenever waitForHealthy == true.

Summary

The fix is correct and the approach is reasonable. The main actionable concern is the duplicate service resolution (#1) and the undocumented change in Named semantics (#2). The rest are low-severity clarity issues.

Copilot AI changed the title [WIP] Fix timeout issue with integration tests after Aspire.Hosting.AppHost upgrade fix: fall back to Running state for Aspire resources without health checks Mar 27, 2026
Copilot AI requested a review from thomhurst March 27, 2026 13:10
@thomhurst
Copy link
Copy Markdown
Owner

Closing in favor of #5268. This PR's premise is incorrect — \ already handles resources without health checks by falling back to waiting for \ state (docs). The actual issue is that Aspire 13.2.0 introduces internal rebuilder resources () that never reach any state. #5268 filters these out entirely.

@thomhurst thomhurst closed this Mar 27, 2026
@thomhurst
Copy link
Copy Markdown
Owner

Closing in favor of #5268. This PR's premise is incorrect — WaitForResourceHealthyAsync already handles resources without health checks by falling back to waiting for Running state (docs). The actual issue is that Aspire 13.2.0 introduces internal rebuilder resources (IResourceWithoutLifetime) that never reach any state. #5268 filters these out entirely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Timeout — Resources not ready

2 participants