fix: fall back to Running state for Aspire resources without health checks#5265
fix: fall back to Running state for Aspire resources without health checks#5265
Conversation
…y wait behavior In WaitForResourcesWithFailFastAsync, when waitForHealthy is true, check each resource's HealthCheckAnnotation annotations. Resources with health checks wait for healthy state; resources without health checks fall back to waiting for Running state. This fixes timeouts caused by Aspire 13.2.0 no longer reporting non-health-checked resources as healthy. Agent-Logs-Url: https://github.com/thomhurst/TUnit/sessions/ea97898f-4689-44b8-a544-1cb56c0b5925 Co-authored-by: thomhurst <30480171+thomhurst@users.noreply.github.com>
There was a problem hiding this comment.
Review
The fix correctly addresses the Aspire 13.2.0 regression where WaitForResourceHealthyAsync never completes for resources without health check annotations. The approach is sound and the implementation is clean. A few concerns worth considering:
1. Duplicate DistributedApplicationModel resolution
In WaitForResourcesWithFailFastAsync, the fix resolves DistributedApplicationModel from app.Services to build healthCheckNames. However, the AllHealthy and AllRunning cases in WaitForResourcesAsync (lines 171 and 180) already resolve the same model to enumerate resource names before passing them to this method. This means the model is resolved twice for the AllHealthy path.
Consider passing the pre-resolved model (or the healthCheckNames set directly) into WaitForResourcesWithFailFastAsync to avoid the redundant service lookup. Alternatively, the health-check filtering could live in WaitForResourcesAsync itself, before calling the helper, which is where the AllHealthy vs AllRunning distinction already lives conceptually:
case ResourceWaitBehavior.AllHealthy:
{
var model = app.Services.GetRequiredService<DistributedApplicationModel>();
var names = model.Resources.Select(r => r.Name).ToList();
var healthyNames = model.Resources
.Where(r => r.Annotations.OfType<HealthCheckAnnotation>().Any())
.Select(r => r.Name)
.ToHashSet();
await WaitForResourcesWithFailFastAsync(app, notificationService, names, healthyNames, cancellationToken);
break;
}This makes WaitForResourcesWithFailFastAsync a pure "wait for these names, using healthy for this subset" function, keeping the policy decision in WaitForResourcesAsync where all the other policy decisions live.
2. The Named behavior silently gained "as healthy as possible" semantics
The Named case (line 188) still calls WaitForResourcesWithFailFastAsync with waitForHealthy: true, and with this PR it will now silently fall back to Running for named resources that lack health check annotations. This is probably the right behaviour, but it's worth noting in the XML doc for ResourceWaitBehavior.Named (currently just "Wait only for resources returned by ResourcesToWaitFor()") that the same health-check fallback applies. Without documentation, a user who specifies named resources expecting a strict "healthy" wait may be surprised.
3. targetState log message is now slightly misleading
The LogProgress call on line 401 (unchanged) still reads:
Waiting for N resource(s) to become healthy: [...]
...even when some of those resources will actually wait for running. With the per-resource state variable introduced in the per-resource log on completion, the summary line at the start should also reflect the mixed intent — something like "healthy (or running where no health check exists)" — or the log could be deferred until after healthCheckNames is built and list which resources will use which strategy.
4. Null-conditional in the lambda is always false when waitForHealthy is true
var useHealthy = waitForHealthy && (healthCheckNames?.Contains(name) ?? false);When waitForHealthy is false, healthCheckNames is null and the ?? correctly short-circuits to false. When waitForHealthy is true, healthCheckNames is always non-null (it was just assigned from new HashSet<string>(...)), so the ?. / ?? false guard is never needed. This is a minor readability point, but it could be simplified to:
var useHealthy = waitForHealthy && healthCheckNames!.Contains(name);...inside the block where waitForHealthy is known to be true, or by restructuring the condition so the null-safety is not misleading about the invariant. The current code is not wrong, just slightly obscures the invariant that healthCheckNames != null whenever waitForHealthy == true.
Summary
The fix is correct and the approach is reasonable. The main actionable concern is the duplicate service resolution (#1) and the undocumented change in Named semantics (#2). The rest are low-severity clarity issues.
|
Closing in favor of #5268. This PR's premise is incorrect — \ already handles resources without health checks by falling back to waiting for \ state (docs). The actual issue is that Aspire 13.2.0 introduces internal rebuilder resources () that never reach any state. #5268 filters these out entirely. |
|
Closing in favor of #5268. This PR's premise is incorrect — |
Aspire 13.2.0 changed
WaitForResourceHealthyAsyncto never complete for resources withoutHealthCheckAnnotation. The defaultAllHealthywait behavior called this for all resources indiscriminately, causing timeouts on resources like executable projects that have no health checks configured.HealthCheckAnnotationviaDistributedApplicationModelbefore deciding the wait strategyWaitForResourceHealthyAsync(unchanged)WaitForResourceAsync(KnownResourceStates.Running)The fix is in
WaitForResourcesWithFailFastAsync, so it applies to allResourceWaitBehaviormodes that usewaitForHealthy: true(AllHealthy,Named).💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.