Describe the bug
ASO version(s):
- Broken: v2.15.0
- Works: v2.14.0
Kubernetes: AKS (private), Kubernetes server v1.33.x
Install method: Helm (controller in azureserviceoperator-system)
Controller SA: azureserviceoperator-default
Namespaces involved: azureserviceoperator-system, tenant-blue
What happened
After upgrading to ASO v2.15.0, reconciles for ARM resources in tenant-blue fail early with:
generic_reconciler.go:206] "Error claiming resource" err="/v1, Kind=Secret is not cached" logger="resources_resourcegroup" name="rg-tenant-blue" namespace="tenant-blue"
Downgrading back to v2.14.0 (same cluster and manifests) makes the error disappear and resources reconcile normally.
Expected behavior
ASO should be able to claim/reconcile resources without failing on “Secret is not cached”. If Secrets are intentionally not cached, reads should use the uncached API reader, or the controller should register the Secrets informer for the watched namespaces.
To Reproduce
Deploy ASO v2.15.0 in azureserviceoperator-system (SA azureserviceoperator-default).
Apply a simple ARM resource (e.g., resources.azure.com/v1api20200601, ResourceGroup) in namespace tenant-blue.
Observe controller logs:
Error claiming resource ... err="/v1, Kind=Secret is not cached"
Additional context
What I’ve already checked
RBAC:
Granted get,list,watch (and also tried with create,update,patch) on core secrets to the controller SA in both namespaces:
- azureserviceoperator-system
- tenant-blue
Verified with:
kubectl auth can-i --as=system:serviceaccount:azureserviceoperator-system:azureserviceoperator-default \
-n azureserviceoperator-system watch secrets # yes
kubectl auth can-i --as=system:serviceaccount:azureserviceoperator-system:azureserviceoperator-default \
-n tenant-blue watch secrets # yes
Watch scope:
Tried the default (no scope variable set) and also explicitly set AZURE_TARGET_NAMESPACES=tenant-blue in the aso-controller-settings Secret.
Rolled the deployment after each change.
Controller restart:
kubectl -n azureserviceoperator-system rollout restart deploy/azureserviceoperator-controller-manager
ClusterRole test (to rule out namespace scoping issues):
Temporarily bound a ClusterRole with get,list,watch on Secrets across the cluster → still hit the same error on v2.15.0.
AKS features/policies:
Azure Policy / Defender / Workload Identity do not affect list/watch on K8s Secrets; nothing else in AKS configuration changes the controller-runtime cache behavior.
None of the above resolved the issue on v2.15.0. Switching only the ASO version back to v2.14.0 resolved it immediately.
Hypothesis
This looks like a behavior change in v2.15.0 where the controller now fails reads against types without a registered informer (e.g., stricter cache reader behavior such as ReaderFailOnMissingInformer). If any code path in the reconcile/claim phase reads a Secret via the cached client without registering a Secret informer (or without using the API reader), it will surface exactly as ErrResourceNotCached (“Kind=Secret is not cached”).
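To make the hypothesis concrete, here is a toy model in Go that uses only the standard library. It is not ASO or controller-runtime code: `GVK`, `NotCachedError`, `strictCache`, `apiServer`, and `getWithFallback` are hypothetical names invented for this sketch. It assumes a cache that refuses reads for kinds without a registered informer, and shows why RBAC changes would not help (the read fails before any API call) and how either remedy (registering the informer, or falling back to an uncached reader) would avoid the error.

```go
package main

import (
	"errors"
	"fmt"
)

// GVK is a simplified stand-in for a Kubernetes group/version/kind (hypothetical type).
type GVK struct{ Group, Version, Kind string }

func (g GVK) String() string { return g.Group + "/" + g.Version + ", Kind=" + g.Kind }

// NotCachedError is a toy analogue of a "resource not cached" error (hypothetical type).
type NotCachedError struct{ GVK GVK }

func (e *NotCachedError) Error() string { return fmt.Sprintf("%s is not cached", e.GVK) }

// Reader is the minimal read interface shared by both toy clients.
type Reader interface {
	Get(gvk GVK, key string) (string, error)
}

// apiServer plays the role of an uncached reader: every Get goes to the live data.
type apiServer map[string]string

func (a apiServer) Get(gvk GVK, key string) (string, error) {
	v, ok := a[gvk.String()+" "+key]
	if !ok {
		return "", fmt.Errorf("%s %q not found", gvk, key)
	}
	return v, nil
}

// strictCache only answers for kinds that have a registered informer.
// For any other kind it fails immediately, without contacting the backing reader.
type strictCache struct {
	informers map[GVK]bool
	backing   Reader
}

func (c *strictCache) Get(gvk GVK, key string) (string, error) {
	if !c.informers[gvk] {
		return "", &NotCachedError{GVK: gvk}
	}
	return c.backing.Get(gvk, key)
}

// getWithFallback tries the cache first and uses the uncached reader
// only when the cache reports that the kind is not cached.
func getWithFallback(cached, uncached Reader, gvk GVK, key string) (string, error) {
	v, err := cached.Get(gvk, key)
	var notCached *NotCachedError
	if errors.As(err, &notCached) {
		return uncached.Get(gvk, key)
	}
	return v, err
}

func main() {
	secret := GVK{Version: "v1", Kind: "Secret"}
	api := apiServer{secret.String() + " tenant-blue/creds": "s3cr3t"}
	cache := &strictCache{informers: map[GVK]bool{}, backing: api}

	// No Secret informer registered: the cached read fails regardless of permissions.
	_, err := cache.Get(secret, "tenant-blue/creds")
	fmt.Println("cached read:", err)

	// Remedy 1: fall back to the uncached reader.
	v, err := getWithFallback(cache, api, secret, "tenant-blue/creds")
	fmt.Println("fallback read:", v, err)

	// Remedy 2: register the informer so the cache serves the kind.
	cache.informers[secret] = true
	v, err = cache.Get(secret, "tenant-blue/creds")
	fmt.Println("after registering informer:", v, err)
}
```

In this model the first read produces the same message text as the log line above (`/v1, Kind=Secret is not cached`), which is what I would expect if the real code path behaves like `strictCache`. Whether ASO actually reads the Secret this way is the open question.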
Ask
- Is this a known regression/change in v2.15.0?
- Should ASO be registering an informer for corev1.Secret, or should the relevant read be using the API reader instead?
- Is there guidance or a workaround for v2.15.0 users (an env flag/setting to restore the previous behavior, or a patch to register the Secret informer)?
Happy to provide more logs. Here’s the representative log line again:
generic_reconciler.go:206] "Error claiming resource" err="/v1, Kind=Secret is not cached" logger="resources_resourcegroup" name="rg-tenant-blue" namespace="tenant-blue"