Description
The project and serviceaccount management controllers create RBAC resources (Roles, RoleBindings, ClusterRoles, ClusterRoleBindings, ServiceAccounts) using a Create → AlreadyExists → Update fallback pattern. In clusters with many projects, this generates a high volume of unnecessary 409 errors on every reconcile cycle.
Affected files:
- pkg/controller/management/serviceaccounts/serviceaccounts.go: ensureControllerPermissions
- pkg/controller/management/projects/projects.go: ensureSystemPermissions, ensureControllerPermissions, ensureDefaultUserRoles, ensureExtendedPermissions
Current pattern
if err := r.client.Create(ctx, roleBinding); err != nil {
	if !apierrors.IsAlreadyExists(err) {
		return err
	}
	r.client.Update(ctx, roleBinding) // always fires if resource exists
}
The problem
- ensureControllerPermissions in the serviceaccount reconciler iterates over all project namespaces and attempts to Create a RoleBinding in each. Because the RoleBindings already exist after the first reconcile, every subsequent reconcile generates N 409 responses (where N = number of projects).
- The project reconciler does the same across 5+ resource types on every reconciliation, which is triggered frequently by Warehouse/Stage health condition changes.
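The effect can be illustrated with a small stdlib-only simulation. This is a sketch, not Kargo code: `fakeAPI`, `ensureRoleBinding`, `reconcile`, and the RoleBinding name `kargo-controller` are hypothetical stand-ins, and the fake only counts 409 responses and stored writes.

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyExists stands in for the API server's 409 AlreadyExists response.
var errAlreadyExists = errors.New("already exists")

// fakeAPI is a hypothetical in-memory stand-in for the Kubernetes API server.
type fakeAPI struct {
	objects   map[string]string // key: namespace/name, value: spec
	conflicts int               // number of 409 responses returned
	writes    int               // number of stored writes
}

func newFakeAPI() *fakeAPI { return &fakeAPI{objects: map[string]string{}} }

// Create stores a new object or returns errAlreadyExists if the key is taken.
func (a *fakeAPI) Create(key, spec string) error {
	if _, ok := a.objects[key]; ok {
		a.conflicts++
		return errAlreadyExists
	}
	a.objects[key] = spec
	a.writes++
	return nil
}

// Update overwrites the object unconditionally, even if nothing changed.
func (a *fakeAPI) Update(key, spec string) error {
	a.objects[key] = spec
	a.writes++
	return nil
}

// ensureRoleBinding mirrors the Create -> AlreadyExists -> Update fallback.
func ensureRoleBinding(a *fakeAPI, key, spec string) error {
	if err := a.Create(key, spec); err != nil {
		if !errors.Is(err, errAlreadyExists) {
			return err
		}
		return a.Update(key, spec)
	}
	return nil
}

// reconcile ensures one RoleBinding in each project namespace.
func reconcile(a *fakeAPI, namespaces []string) error {
	for _, ns := range namespaces {
		// "kargo-controller" is an illustrative name, not taken from the source.
		if err := ensureRoleBinding(a, ns+"/kargo-controller", "v1"); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	api := newFakeAPI()
	namespaces := []string{"project-a", "project-b", "project-c"}
	for i := 0; i < 4; i++ {
		if err := reconcile(api, namespaces); err != nil {
			panic(err)
		}
	}
	fmt.Printf("409s=%d writes=%d\n", api.conflicts, api.writes)
}
```

With three namespaces and four reconciles, only the first pass creates anything. Each of the remaining three passes produces one 409 and one unconditional write per namespace.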
Observed impact
On a cluster with active Kargo usage, the k8s API server shows ~800 RBAC-related 409s per minute accumulating continuously:
26,589 rolebindings POST → 409
15,951 roles POST → 409
15,951 serviceaccounts POST → 409
5,318 clusterroles POST → 409
5,318 clusterrolebindings POST → 409
Every 409 is a wasted API server round trip (etcd read with no write). The subsequent Update then performs an unconditional etcd write even when the resource hasn't changed.
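As a quick sanity check on the figures above, the sketch below sums the counts and compares each resource type against clusterroles. It assumes the ~800/min rate held for the whole observation window. The function names `total` and `ratioTo` are illustrative only.

```go
package main

import (
	"fmt"
	"math"
)

// observed409s holds the per-resource counts quoted above.
var observed409s = map[string]int{
	"rolebindings":        26589,
	"roles":               15951,
	"serviceaccounts":     15951,
	"clusterroles":        5318,
	"clusterrolebindings": 5318,
}

// total sums the 409 counts across all resource types.
func total(counts map[string]int) int {
	sum := 0
	for _, n := range counts {
		sum += n
	}
	return sum
}

// ratioTo reports each count relative to the named baseline resource,
// rounded to the nearest whole number.
func ratioTo(counts map[string]int, baseline string) map[string]int {
	out := make(map[string]int, len(counts))
	base := float64(counts[baseline])
	for name, n := range counts {
		out[name] = int(math.Round(float64(n) / base))
	}
	return out
}

func main() {
	t := total(observed409s)
	fmt.Println("total 409s:", t)
	// Assumes the ~800/min rate from the text held for the whole window.
	fmt.Printf("minutes of accumulation at 800/min: %.0f\n", float64(t)/800)
	fmt.Println("ratio to clusterroles:", ratioTo(observed409s, "clusterroles"))
}
```

The counts total 69,127, which is roughly 86 minutes of accumulation at 800 per minute. The per-type ratios come out close to 5:3:3:1:1. That would be consistent with each reconcile attempting a fixed set of creates, though this is an inference from the numbers, not something the logs state.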
Suggested fix
Replace the Create/Update pattern with server-side apply:
if err := r.client.Patch(ctx, roleBinding, client.Apply,
	client.ForceOwnership,
	client.FieldOwner("kargo"),
); err != nil {
	return fmt.Errorf("error applying RoleBinding %q in namespace %q: %w",
		roleBinding.Name, roleBinding.Namespace, err)
}
Benefits:
- Single API call instead of two
- No 409s — server handles create-or-update atomically
- No etcd write when the object hasn't changed (unlike unconditional Update)
- Server-side apply is already an established pattern for Kubernetes controllers
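The call and write counts behind these benefits can be sketched with a stdlib-only model. This is an illustration under stated assumptions, not controller-runtime: `fakeAPI` and its `Apply` method are hypothetical and only model "one request per object, write only on change".

```go
package main

import "fmt"

// fakeAPI is a hypothetical in-memory stand-in for the API server. It only
// models request and write counts, not real server-side apply semantics
// such as field ownership.
type fakeAPI struct {
	objects map[string]string // key: namespace/name, value: spec
	calls   int               // number of requests received
	writes  int               // number of stored writes
}

// Apply models create-or-update in a single request and skips the write
// when the stored object already matches the desired spec.
func (a *fakeAPI) Apply(key, spec string) {
	a.calls++
	if cur, ok := a.objects[key]; ok && cur == spec {
		return // no-op: nothing to persist
	}
	a.objects[key] = spec
	a.writes++
}

func main() {
	api := &fakeAPI{objects: map[string]string{}}
	namespaces := []string{"project-a", "project-b", "project-c"}
	for i := 0; i < 4; i++ {
		for _, ns := range namespaces {
			// "kargo-controller" is an illustrative name, not from the source.
			api.Apply(ns+"/kargo-controller", "v1")
		}
	}
	fmt.Printf("calls=%d writes=%d\n", api.calls, api.writes)
}
```

In this model, four reconciles over three namespaces issue 12 requests and 3 writes, with no conflict responses. One caveat when trying this against a real cluster: an object sent as an apply patch generally needs its apiVersion and kind populated, so typed objects may need their TypeMeta set explicitly before the Patch call.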