[BUG] OpenKruise webhook controller should NOT overwrite namespaceSelectors added by other actors #2220

@peng2x

Description

What happened:
We are deploying OpenKruise to Azure Kubernetes Service (AKS) clusters with the Helm chart (v1.7.2). After the kruise-controller-manager pod starts, we observe frequent conflict errors when it updates the Kruise ValidatingWebhookConfiguration and MutatingWebhookConfiguration. This results in a large volume of requests from kruise-controller-manager to the API server, even after the built-in client-go throttling.

OpenKruise’s webhook controller stores a copy of the webhook configuration as a template in an annotation on its first sync. After that, it reconciles the webhook configuration using the cached template.

    if templateStr := mutatingConfig.Annotations["template"]; len(templateStr) > 0 {
        var mutatingWHs []admissionregistrationv1.MutatingWebhook
        if err := json.Unmarshal([]byte(templateStr), &mutatingWHs); err != nil {
            return nil, err
        }
        return mutatingWHs, nil
    }

This template (based on webhookconfiguration.yaml) sets an empty namespaceSelector on some webhooks and a minimal namespaceSelector (excluding only kube-system) on others:

    namespaceSelector:
      matchExpressions:
        - key: kubernetes.io/metadata.name
          operator: NotIn
          values:
            - kube-system

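AKS has an admissions enforcer that automatically patches the namespaceSelector of every webhook to exclude kube-system and AKS-internal namespaces, so that webhooks do not run in critical namespaces. The Kruise webhook controller then restores the selectors from its template, the enforcer patches them again, and the two reconcilers keep overwriting each other. The enforcer's reconciliation loop adds two matchExpressions to each webhook's namespaceSelector: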

        - key: control-plane
          operator: NotIn
          values:
            - "true"
        - key: kubernetes.azure.com/managedby
          operator: NotIn
          values:
            - aks
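The interaction can be illustrated with a small simulation. This is only a sketch under stated assumptions: `Requirement` is a simplified stand-in for the real Kubernetes selector requirement type, and `kruiseReconcile`, `aksReconcile`, and `countWrites` are hypothetical names that model "overwrite from template" and "add missing expressions". They are not the actual Kruise or AKS code.

```go
package main

import (
	"fmt"
	"reflect"
)

// Requirement is a simplified stand-in for a label selector requirement.
type Requirement struct {
	Key      string
	Operator string
	Values   []string
}

// kruiseTemplate is the selector the cached template sets (values from this issue).
var kruiseTemplate = []Requirement{
	{Key: "kubernetes.io/metadata.name", Operator: "NotIn", Values: []string{"kube-system"}},
}

// aksRequirements are the expressions the AKS admissions enforcer adds (values from this issue).
var aksRequirements = []Requirement{
	{Key: "control-plane", Operator: "NotIn", Values: []string{"true"}},
	{Key: "kubernetes.azure.com/managedby", Operator: "NotIn", Values: []string{"aks"}},
}

// kruiseReconcile models "overwrite from template": the live selector is discarded.
func kruiseReconcile(live []Requirement) []Requirement {
	return append([]Requirement(nil), kruiseTemplate...)
}

// aksReconcile models the enforcer (assumed behavior): add its expressions if missing.
func aksReconcile(live []Requirement) []Requirement {
	out := append([]Requirement(nil), live...)
	for _, want := range aksRequirements {
		found := false
		for _, have := range out {
			if reflect.DeepEqual(have, want) {
				found = true
				break
			}
		}
		if !found {
			out = append(out, want)
		}
	}
	return out
}

// countWrites runs both reconcilers in turn and counts how many updates they issue.
func countWrites(rounds int) int {
	live := append([]Requirement(nil), kruiseTemplate...)
	writes := 0
	for i := 0; i < rounds; i++ {
		for _, reconcile := range []func([]Requirement) []Requirement{aksReconcile, kruiseReconcile} {
			next := reconcile(live)
			if !reflect.DeepEqual(next, live) {
				writes++
				live = next
			}
		}
	}
	return writes
}

func main() {
	// Every round costs two writes, so the loop never settles.
	fmt.Println("writes after 5 rounds:", countWrites(5))
}
```

In this model each round produces two updates regardless of how many rounds have already run, which matches the steady stream of update requests and conflicts described above.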

What you expected to happen:
OpenKruise webhook controller should NOT overwrite namespaceSelectors added by other actors (such as AKS in this case).
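One possible shape for the expected behavior is to merge the template's expressions with whatever is already on the live object instead of replacing them. The following is a minimal sketch, not a proposed patch: `Requirement` is a simplified stand-in for the real Kubernetes type, and `mergeRequirements` and `id` are hypothetical helpers that do not exist in Kruise.

```go
package main

import "fmt"

// Requirement is a simplified stand-in for a label selector requirement.
type Requirement struct {
	Key      string
	Operator string
	Values   []string
}

// id builds a comparison key for a requirement; %q quoting keeps fields unambiguous.
func id(r Requirement) string {
	return fmt.Sprintf("%q|%q|%q", r.Key, r.Operator, r.Values)
}

// mergeRequirements (hypothetical) returns the template's requirements followed by
// any live requirements that the template does not already contain.
func mergeRequirements(template, live []Requirement) []Requirement {
	merged := make([]Requirement, 0, len(template)+len(live))
	seen := make(map[string]bool, len(template)+len(live))
	for _, r := range append(append([]Requirement(nil), template...), live...) {
		if key := id(r); !seen[key] {
			seen[key] = true
			merged = append(merged, r)
		}
	}
	return merged
}

func main() {
	template := []Requirement{
		{Key: "kubernetes.io/metadata.name", Operator: "NotIn", Values: []string{"kube-system"}},
	}
	// Live object after the AKS admissions enforcer has patched it.
	live := []Requirement{
		template[0],
		{Key: "control-plane", Operator: "NotIn", Values: []string{"true"}},
		{Key: "kubernetes.azure.com/managedby", Operator: "NotIn", Values: []string{"aks"}},
	}
	merged := mergeRequirements(template, live)
	fmt.Println("requirements after merge:", len(merged))
	// Merging again changes nothing, so no further update would be needed.
	fmt.Println("requirements after second merge:", len(mergeRequirements(template, merged)))
}
```

A merge like this is idempotent, so the controller would stop issuing updates once both sets of expressions are present. The trade-off is that it cannot tell an expression added by another actor from a stale one that an older template used to set, so a real fix would need some way to track which expressions the controller owns.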

How to reproduce it (as minimally and precisely as possible):

  1. Deploy kruise-manager to an AKS cluster.
  2. Run kubectl get mutatingwebhookconfiguration kruise-mutating-webhook-configuration -n kruise-system -o yaml several times (5-10) to observe the namespaceSelector values flipping in the mutating webhook configuration.
  3. Run kubectl get validatingwebhookconfiguration kruise-validating-webhook-configuration -n kruise-system -o yaml several times (5-10) to observe the namespaceSelector values flipping in the validating webhook configuration.

Anything else we need to know?:

I filed a separate issue in the openkruise/charts repo for the same problem. However, a more durable fix would be to change the webhook controller so that it does not overwrite namespaceSelector changes made by others.

Environment:

  • Kruise version: v1.7.2
  • Kubernetes version (use kubectl version): v1.34.1
  • Install details (e.g. helm install args):
    helm install kruise-manager openkruise/kruise \
     --wait \
     --timeout 5m \
     --set image.repository=mcr.microsoft.com/oss/v2/openkruise/kruise-manager,image.tag=v1.7.2,image.pullPolicy=IfNotPresent,manager.replicas=1
    
  • Others: None

Labels

kind/bug: Something isn't working
