
Distinguish between a failed machine creation and a failed node creation #1064

@Kostov6

How to categorize this issue?

/area ops-productivity
/kind bug

What happened:

❯ k -n ... get machine <machine> -o yaml | yq .status
currentStatus:
  lastUpdateTime: "2026-01-16T10:04:42Z"
  phase: Pending
  timeoutActive: true
lastOperation:
  description: Creating machine on cloud provider
  lastUpdateTime: "2026-01-16T10:04:42Z"
  state: Processing
  type: Create

❯ aws ec2 describe-instances ... --query 'Reservations[].Instances[].[Tags[?Key==`Name`] | [0].Value, State.Name]'
[
    [
        "<machine>",
        "running"
    ]
]

❯ k get nodes
(empty)

What you expected to happen:

The example above shows a machine that was created successfully on the cloud provider (the EC2 instance is running) but whose kubelet never registered the node. In this case, a more appropriate description would be "Waiting for kubelet to create node object".

Because this message reflects the actual state more accurately, it would make failing shoot clusters easier to debug.
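
The distinction requested here can be sketched as a small decision function. This is only an illustration of the proposed behavior, not the controller's actual code: the function name `describe_pending_machine` and its parameters are hypothetical, and the only strings taken from this issue are the two description messages.

```python
from typing import Optional

# Description currently shown while the machine is Pending (from the status output above).
CREATING_ON_PROVIDER = "Creating machine on cloud provider"
# Description proposed in this issue for the "VM exists, node missing" case.
WAITING_FOR_NODE = "Waiting for kubelet to create node object"


def describe_pending_machine(instance_state: Optional[str], node_registered: bool) -> Optional[str]:
    """Pick a lastOperation description for a machine in phase Pending.

    instance_state: hypothetical input holding the provider-side state
        (e.g. "running"), or None if no instance was found.
    node_registered: hypothetical input, True once a Node object exists.
    """
    if node_registered:
        # The node has joined, so neither "still creating" message applies.
        return None
    if instance_state == "running":
        # The VM is up on the provider, but the kubelet has not registered a node yet.
        return WAITING_FOR_NODE
    # No instance yet, or the instance is not running: creation is still in progress.
    return CREATING_ON_PROVIDER


if __name__ == "__main__":
    # The situation from this issue: instance running, no node object.
    print(describe_pending_machine("running", False))
```

With the inputs from this report (instance `running`, no node), the sketch selects the proposed message instead of the generic creation message.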

Environment:

  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • Others:

Metadata

Assignees

No one assigned

    Labels

    area/ops-productivity: Operator productivity related (how to improve operations)
    kind/bug: Bug

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
