feat(logstorage): validate replicas < node count to prevent ILM stall (tigera#4529)
* fix(logstorage): validate replicas < node count to prevent ILM stall
On single-node ES clusters with replicas: 1, replica shards can never
be allocated. This causes the ILM warm phase migrate action to wait
indefinitely for shard copies to become active, blocking progression
to the delete phase and causing indices to accumulate beyond retention.
Add validation in the LogStorage initializer that rejects configurations
where indices.replicas >= nodes.count, with a clear error message
guiding users to set replicas to 0 for single-node deployments.
* fix(logstorage): warn when node count only exceeds replicas by 1
* fmt
* fix(logstorage): return error as last argument per Go convention
Fixes an ST1008 staticcheck violation by swapping the return order from
(error, string) to (string, error) in validateLogStorage and
validateReplicasForNodeCount.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
return"", fmt.Errorf("LogStorage spec.indices.replicas (%d) must be less than spec.nodes.count (%d); replica shards cannot be allocated when there are not enough nodes. For a single-node Elasticsearch cluster, set spec.indices.replicas to 0", replicas, nodeCount)
191
+
}
192
+
193
+
ifreplicas>0&&nodeCount==replicas+1 {
194
+
returnfmt.Sprintf("LogStorage spec.nodes.count (%d) is only 1 more than spec.indices.replicas (%d); this may prevent voluntary pod evictions (e.g., node repaving) due to PodDisruptionBudget constraints. If this is expected for your environment, no action is needed. Otherwise, consider setting spec.nodes.count to at least %d", nodeCount, replicas, replicas+2), nil
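A minimal usage sketch, assuming the function above sits in the same package; the main wrapper and the example inputs are illustrative and not part of the PR:

package main

import (
	"fmt"
	"log"
)

func main() {
	// A single-node cluster with replicas: 1 is the case that stalls ILM;
	// the validator now rejects it with an error.
	if _, err := validateReplicasForNodeCount(1, 1); err != nil {
		fmt.Println(err)
	}

	// Three nodes with two replicas is valid, but leaves no headroom for
	// voluntary evictions, so the second branch returns a warning.
	warning, err := validateReplicasForNodeCount(2, 3)
	if err != nil {
		log.Fatal(err)
	}
	if warning != "" {
		fmt.Println(warning)
	}
}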