fix: reconcile workload configurations when a referenced secret/configmap key is updated #7615
kubeletService: 'kube-system/kubelet',
kubeletEndpointsEnabled: true,
kubeletEndpointSliceEnabled: false,
watchReferencedObjectsInAllNamespaces: true,
IIUC from the description of the field, this should be false by default?
Correct. If enabled by default, it might create excessive load in existing setups where the operator watches for configuration and workload resources in different namespaces (e.g. OCP). Still, I wanted the E2E tests to run with the flag turned on, hence setting the flag to true in the jsonnet and, as a consequence, in examples/rbac/prometheus-operator/prometheus-operator-deployment.yaml. But after more thinking, I might as well tweak the operator's deployment in the e2e framework...
My longer-term plan would be to turn it on by default after a few releases.
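To illustrate the opt-in default discussed above, here is a minimal sketch using Go's standard `flag` package. The `Config` struct is trimmed to the one field that appears in the diff, and the flag name `-watch-referenced-objects-in-all-namespaces` is an assumption derived from the jsonnet key, not something confirmed in this thread.

```go
package main

import (
	"flag"
	"fmt"
)

// Config holds only the field discussed in the review; the operator's real
// configuration has many more fields.
type Config struct {
	WatchObjectRefsInAllNamespaces bool
}

// parseConfig parses args into a Config. The flag name is hypothetical.
// The default is false so that existing setups see no extra watches.
func parseConfig(args []string) (Config, error) {
	var cfg Config
	fs := flag.NewFlagSet("operator", flag.ContinueOnError)
	fs.BoolVar(&cfg.WatchObjectRefsInAllNamespaces,
		"watch-referenced-objects-in-all-namespaces", false,
		"Watch referenced secrets/configmaps in all namespaces (opt-in).")
	if err := fs.Parse(args); err != nil {
		return Config{}, err
	}
	return cfg, nil
}

func main() {
	def, _ := parseConfig(nil)
	e2e, _ := parseConfig([]string{"-watch-referenced-objects-in-all-namespaces=true"})
	fmt.Println(def.WatchObjectRefsInAllNamespaces, e2e.WatchObjectRefsInAllNamespaces)
}
```

With this shape, the e2e framework could pass the flag explicitly while the shipped example manifest keeps the default.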
Significant fix 🎉
I just have a few questions.
logger.Info("Operator's configuration",
	"watch_referenced_objects_in_all_namespaces", cfg.WatchObjectRefsInAllNamespaces,
	"controller_id", cfg.ControllerID,
	"enable_config_reloader_probes", cfg.ReloaderConfig.EnableProbes)
It may be good to add a metric to track reconciliations triggered by referenced objects too?
You mean a different metric than prometheus_operator_triggered_total?
I haven't thought much about it, but I was wondering whether we could identify the specific secret that triggers a reconciliation. That could help to find the culprit in case of excessive API calls. So the metric could have a label such as triggered by:<secret>. If I am not wrong, prometheus_operator_triggered_total currently only tells us which controller was triggered.
Having the secret's name and namespace as labels would be too much in terms of cardinality.
ya you are right :)
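The cardinality concern can be made concrete with a toy sketch. It does not use the Prometheus client library; `counterVec` and `event` are hypothetical stand-ins that only count how many distinct series each labelling scheme would create.

```go
package main

import "fmt"

// counterVec is a toy stand-in for a labelled counter: one entry per
// distinct label value, i.e. one time series each.
type counterVec map[string]int

func (c counterVec) inc(label string) { c[label]++ }

// event is a hypothetical reconciliation trigger caused by a referenced object.
type event struct {
	kind, namespace, name string
}

// seriesCounts returns how many series are created when labelling by object
// kind only versus labelling by kind, namespace and name.
func seriesCounts(events []event) (byKind, byObject int) {
	kinds, objects := counterVec{}, counterVec{}
	for _, e := range events {
		kinds.inc(e.kind)
		objects.inc(e.kind + "/" + e.namespace + "/" + e.name)
	}
	return len(kinds), len(objects)
}

func main() {
	var events []event
	for i := 0; i < 1000; i++ {
		events = append(events, event{"Secret", "ns", fmt.Sprintf("secret-%d", i)})
	}
	byKind, byObject := seriesCounts(events)
	fmt.Println(byKind, byObject) // prints "1 1000"
}
```

A label with a small, fixed set of values (such as the object kind) stays bounded, whereas a per-object label grows with the number of secrets and configmaps in the cluster.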
}

for _, informer := range informerGetter.GetInformers() {
	workloads, err := informer.Lister().List(labels.Everything())
Do we need a debug log for the error here, or would it be noisy?
I fear that it would be noisy indeed.
Logging just the error would be OK, but we'd need to pass a logger.
Signed-off-by: Simon Pasquier <[email protected]>
lgtm
Description
Closes #6018
Type of change
What type of changes does your code introduce to the Prometheus operator? Put an x in the box that apply.
CHANGE (fix or feature that would cause existing functionality to not work as expected)
FEATURE (non-breaking change which adds functionality)
BUGFIX (non-breaking change which fixes an issue)
ENHANCEMENT (non-breaking change which improves existing functionality)
NONE (if none of the other choices apply. Example, tooling, build system, CI, docs, etc.)
Verification
Please check the Prometheus-Operator testing guidelines for recommendations about automated tests.
Changelog entry
Please put a one-line changelog entry below. This will be copied to the changelog file during the release process.