Releases: k0rdent/kof
v1.4.0
❗ Upgrade Instructions ❗
- `PromxyServerGroup` CRD was moved from `crds/` to `templates/` directory for auto-upgrade.
- Please use `--take-ownership` on upgrade of `kof-mothership` to 1.4.0:
      helm upgrade --take-ownership \
        --reset-values --wait -n kof kof-mothership -f mothership-values.yaml \
        oci://ghcr.io/k0rdent/kof/charts/kof-mothership --version 1.4.0
- This will not be required in future upgrades.
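Before running the upgrade above, it may be worth confirming that the local Helm client recognizes `--take-ownership`. The following is a minimal sketch, assuming the flag is listed in the output of `helm upgrade --help` when it is supported; the function name is hypothetical and not part of KOF or Helm.

```shell
# Hypothetical helper: reads help text on stdin and succeeds
# only if that text mentions the --take-ownership flag.
helm_help_mentions_take_ownership() {
  grep -q -e '--take-ownership'
}

# Example usage (not executed here):
#   helm upgrade --help | helm_help_mentions_take_ownership \
#     || echo "this helm client does not list --take-ownership" >&2
```

Keeping the check as a filter on stdin means it can be exercised without a cluster or even a Helm binary.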
🚀 New Features 🚀
- b31f729: feat: add cluster deployment monitoring page to KOF UI (#502) by @AndrejsPon00
- 032fd30: feat: add cluster summaries monitoring page to KOF UI (#505) by @AndrejsPon00
- ffc72f4: feat: Add multi cluster services monitoring page to KOF UI (#508) by @AndrejsPon00
- 48eb9d3: feat: add state management provider monitoring to KOF UI (#509) by @AndrejsPon00
- 4a3c142: feat: add service set monitoring page to KOF UI (#519) by @AndrejsPon00
- faa2c31: feat: migrate to receiver_creator for filelog/containers to support annotation-based discovery (#529) by @gmlexx
- 3803b5a: feat: add sveltos clusters monitoring page to KOF UI (#531) by @AndrejsPon00
- 783fe3a: feat: add k8s audit logs collector config (#539) by @AndrejsPon00
- fbf250b: feat: add parser for key-value logs (#528) by @AndrejsPon00
- 665c3a8: feat: add filestore for filelogreceivers to store offsets (#544) by @gmlexx
- cae1488: feat: add alerts for CAPI Objects states (#526) by @AndrejsPon00
- ad2ff78: feat: add adopted clusters support for Istio (#551) by @gmlexx
🐛 Notable Fixes 🐛
- 82683aa: fix: remove timestamp metrics from kube-state custom resources (#498) by @gmlexx
- fe99e29: fix: Typo `grafana-operator.enables/enabled`, dedup of this subchart, updated descriptions (#506) by @denis-ryzhkov
- 2e6c66e: fix: Fix of warnings on helm install/upgrade of kof-collectors (#504) by @denis-ryzhkov
- 6d2e339: fix: flatten event fields for better filtering (#510) by @gmlexx
- cc20148: fix: Auto-upgrade KOF CRD PromxyServerGroup (#546) by @denis-ryzhkov
- 90cd7ab: fix: Security fix of vite (#548) by @denis-ryzhkov
- 9be5d40: fix: show log line field in dashboard (#559) by @gmlexx
- 79b2f80: fix: move collectors service extensions list to upper charts values (#558) by @gmlexx
- 3e7b53a: fix: hardcoded DS UID victoria-logs.yaml (#560) by @aglarendil
- cf2d044: fix: Crash of OTelCol without extensions required for storing KOF data of Management cluster (#561) by @denis-ryzhkov
✨ More Improvements ✨
- a45a203: chore: add prettier and reformat all dashboards yamls (#512) by @gmlexx
- 354171d: chore: add copilot instructions file (#514) by @gmlexx
- 3fe3595: refactor: clean up k8s object monitoring pages logic in KOF UI (#515) by @AndrejsPon00
- fc37df0: refactor: unify backend k8s objects handler for UI (#517) by @AndrejsPon00
- 8de2d64: ci: checkout latest KCM main for upgrade test (#520) by @gmlexx
- 0f3a4e9: docs: fix links to dev.md (#521) by @gmlexx
- 567f3ad: chore: add `ui-tests` job to `pr_test_helm_chart.yml` to run UI tests (#522) by @AndrejsPon00
- 248fd5a: test: fix Victoria dashboard tests in KOF UI (#523) by @AndrejsPon00
- 07582a6: chore: update victorialogs plugin for adhoc filters fixes (#525) by @gmlexx
- 7d67731: test: add tests for dashboard components in KOF UI (#524) by @AndrejsPon00
- 74850ee: chore: update victoriametrics plugin with latest fixes (#527) by @gmlexx
- 8dd3e30: chore: remove sveltos-dashboard from mothership chart (#532) by @AndrejsPon00
- 66391a3: chore: KOF 1.4.0-rc1 (#552) by @denis-ryzhkov
- 44d1960: ci: split management and adopted clusters testing (#557) by @gmlexx
Full Changelog: v1.3.0...v1.4.0
v1.3.0
❗ Upgrade Instructions ❗
- Please apply the "Reconciling MultiClusterService" workaround and update VMCluster/VMAlert spec values as documented here.
📚 New Docs 📚
- KOF UI docs about misconfiguration detection and VictoriaMetrics/Logs.
🚀 New Features 🚀
- acb9120: feat: add http config for adopted regional cluster by @gmlexx
- 93d1064: feat: add backend for internal observability of VictoriaMetrics/Logs (#463) by @AndrejsPon00
- 6958133: feat: add VictoriaMetrics and VictoriaLogs observability page to KOF UI (#480) by @AndrejsPon00
- 1fa557d: feat: allow full vm custom objects specs definition in values (#478) by @gmlexx
- 6978d1c: feat: add tooltip for metrics description in KOF UI (#483) by @AndrejsPon00
- 370da8f: feat: update helm charts on storage secret change (#484) by @gmlexx
- 838f53a: feat: add raw metrics tab in KOF UI (#487) by @AndrejsPon00
- 1131315: feat: add custom resources to kube-state-metrics (#489) by @gmlexx
- 7832d51: feat: mothership components monitoring (#342) by @aglarendil
- 6df9757: feat: add misconfiguration check for collector scrape in KOF UI (#490) by @AndrejsPon00
- a8c97a6: feat: kube-state-metrics dashboards for k0rdent objects (#497) by @gmlexx
🐛 Notable Fixes 🐛
- cfba650: fix: change opencost prometheus URL to HTTP for local cluster (#451) by @AndrejsPon00
- 7ba9801: fix: correct instrumentation exporter endpoint to resolve trace export error (#452) by @AndrejsPon00
- 1dc8a60: fix: Replacing release notes with auto-generated ones, updated docs/release (#453) by @denis-ryzhkov
- 64b6f5b: fix: slow KOF UI responses due to long proxy timeout (#448) by @AndrejsPon00
- f49b35a: fix: Customized `cert-manager-startupapicheck` image registry (#457) by @denis-ryzhkov
- d15e1cb: fix: promxy server group doesn't update after http client config changes (#456) by @AndrejsPon00
- ad3bec4: fix: increase promxy memory requests/limits to prevent OOM (#458) by @AndrejsPon00
- 2c3d50a: fix: move grafana operator to kof-operators helm chart (#461) by @gmlexx
- ada76b5: fix: Jaeger authenticated endpoint of regional cluster became available for other clusters (#462) by @denis-ryzhkov
- 3437957: fix: `istio/gateway` chart repo compatibility with custom registry (#464) by @denis-ryzhkov
- 588682e: fix: add promxy suffix to promxy labels by @gmlexx
- f3dbad0: fix: add missing env variable for goreleaser (#466) by @gmlexx
- 6ea8e64: fix: Added `ServiceTemplateChain` `cert-manager-v1-16-4-from-1-16-4` required for upgrade to KOF 1.2.0 (#467) by @denis-ryzhkov
- 52b9658: fix: override only defined properties with annotation on config update (#468) by @gmlexx
- 4353a1e: fix: Custom `kcm.serviceMonitor.selector` (#472) by @denis-ryzhkov
- 2d6104b: fix: "Cluster Deployments Events" dashboard vs "From Management to Regional" case (#469) by @denis-ryzhkov
- 5f6f3dd: fix: Custom `registryCredentialsConfig` in `helmCharts` of `kof-istio` (#473) by @denis-ryzhkov
- f9ad1e9: fix: use node name in node exporter dashboards (#470) by @gmlexx
- 75f174b: fix: Two cases of `chartName` for `cert-manager` in `kof-istio-network` by @denis-ryzhkov
- eb8d43f: fix: Moved `kof-operators` to be installed before `kof-storage` in `kof-istio-regional` to avoid "CRDs not found" by @denis-ryzhkov
- 06a137d: fix: Updated Jaeger secret name after moving it from `kof-storage` to `kof-mothership` in #462 to avoid `invalid ownership metadata` by @denis-ryzhkov
- 48504e1: fix: ContainerHighMemUsage alert has container label missing (#477) by @aglarendil
- a1ce5b9: fix: Typo in `intervalFactor` lead to 500 in "Istio Service Dashboard" (#479) by @denis-ryzhkov
- 3154164: fix: incorrect log level parsing for uppercase codes (#481) by @AndrejsPon00
- c0098ec: fix: correctly parse and render total metric values and labels (not just last label) in kof UI (#486) by @AndrejsPon00
- b63188f: fix: prevent OOM crash in promxy on large queries (#491) by @AndrejsPon00
- 93599c1: fix: correct memory queries in Grafana dashboard panels (#494) by @AndrejsPon00
- 9107ce6: fix: prevent duplicate metric collection (#488) by @AndrejsPon00
✨ More Improvements ✨
- d3cc733: chore: setup go based on go.mod file by @gmlexx
- fbd2d4a: chore: apply coredns patch for mothership and restart once by @gmlexx
- 7867e05: chore: add promxy port-forward target by @gmlexx
- c62848b: test: check promxy metrics by @gmlexx
- 36154db: chore: add charts and docker images build (#465) by @gmlexx
- 29f0a2b: test: wait until vmauth creates ingress in kind-adopted-regional cluster (#471) by @gmlexx
- 1885064: chore: KOF 1.2.1 patch release by @denis-ryzhkov
- c106d22: test: add unit tests for Victoria pages (KOF UI) (#482) by @AndrejsPon00
- 1ac30b6: chore: KOF 1.3.0-rc1 (#496) by @denis-ryzhkov
- 13e3720: chore: KOF 1.3.0 release (#499) by @denis-ryzhkov
Full Changelog: v1.2.0...v1.3.0
v1.3.0-rc2
Changelog
🚀 New Features 🚀
🐛 Notable Fixes 🐛
- 9107ce6: fix: prevent duplicate metric collection (#488) by @AndrejsPon00
✨ More Improvements ✨
- 13e3720: chore: KOF 1.3.0 release (#499) by @denis-ryzhkov
Full Changelog: v1.3.0-rc1...v1.3.0-rc2
v1.3.0-rc1
Changelog
🚀 New Features 🚀
- acb9120: feat: add http config for adopted regional cluster by @gmlexx
- 93d1064: feat: add backend for internal observability of VictoriaMetrics/Logs (#463) by @AndrejsPon00
- 6958133: feat: add VictoriaMetrics and VictoriaLogs observability page to KOF UI (#480) by @AndrejsPon00
- 1fa557d: feat: allow full vm custom objects specs definition in values (#478) by @gmlexx
- 6978d1c: feat: add tooltip for metrics description in KOF UI (#483) by @AndrejsPon00
- 370da8f: feat: update helm charts on storage secret change (#484) by @gmlexx
- 838f53a: feat: add raw metrics tab in KOF UI (#487) by @AndrejsPon00
- 1131315: feat: add custom resources to kube-state-metrics (#489) by @gmlexx
- 7832d51: feat: mothership components monitoring (#342) by @aglarendil
- 6df9757: feat: add misconfiguration check for collector scrape in KOF UI (#490) by @AndrejsPon00
🐛 Notable Fixes 🐛
- cfba650: fix: change opencost prometheus URL to HTTP for local cluster (#451) by @AndrejsPon00
- 7ba9801: fix: correct instrumentation exporter endpoint to resolve trace export error (#452) by @AndrejsPon00
- 1dc8a60: fix: Replacing release notes with auto-generated ones, updated docs/release (#453) by @denis-ryzhkov
- 64b6f5b: fix: slow KOF UI responses due to long proxy timeout (#448) by @AndrejsPon00
- f49b35a: fix: Customized `cert-manager-startupapicheck` image registry (#457) by @denis-ryzhkov
- d15e1cb: fix: promxy server group doesn't update after http client config changes (#456) by @AndrejsPon00
- ad3bec4: fix: increase promxy memory requests/limits to prevent OOM (#458) by @AndrejsPon00
- 2c3d50a: fix: move grafana operator to kof-operators helm chart (#461) by @gmlexx
- ada76b5: fix: Jaeger authenticated endpoint of regional cluster became available for other clusters (#462) by @denis-ryzhkov
- 3437957: fix: `istio/gateway` chart repo compatibility with custom registry (#464) by @denis-ryzhkov
- 588682e: fix: add promxy suffix to promxy labels by @gmlexx
- f3dbad0: fix: add missing env variable for goreleaser (#466) by @gmlexx
- 6ea8e64: fix: Added `ServiceTemplateChain` `cert-manager-v1-16-4-from-1-16-4` required for upgrade to KOF 1.2.0 (#467) by @denis-ryzhkov
- 52b9658: fix: override only defined properties with annotation on config update (#468) by @gmlexx
- 4353a1e: fix: Custom `kcm.serviceMonitor.selector` (#472) by @denis-ryzhkov
- 2d6104b: fix: "Cluster Deployments Events" dashboard vs "From Management to Regional" case (#469) by @denis-ryzhkov
- 5f6f3dd: fix: Custom `registryCredentialsConfig` in `helmCharts` of `kof-istio` (#473) by @denis-ryzhkov
- f9ad1e9: fix: use node name in node exporter dashboards (#470) by @gmlexx
- 75f174b: fix: Two cases of `chartName` for `cert-manager` in `kof-istio-network` by @denis-ryzhkov
- eb8d43f: fix: Moved `kof-operators` to be installed before `kof-storage` in `kof-istio-regional` to avoid "CRDs not found" by @denis-ryzhkov
- 06a137d: fix: Updated Jaeger secret name after moving it from `kof-storage` to `kof-mothership` in #462 to avoid `invalid ownership metadata` by @denis-ryzhkov
- 48504e1: fix: ContainerHighMemUsage alert has container label missing (#477) by @aglarendil
- a1ce5b9: fix: Typo in `intervalFactor` lead to 500 in "Istio Service Dashboard" (#479) by @denis-ryzhkov
- 3154164: fix: incorrect log level parsing for uppercase codes (#481) by @AndrejsPon00
- c0098ec: fix: correctly parse and render total metric values and labels (not just last label) in kof UI (#486) by @AndrejsPon00
- b63188f: fix: prevent OOM crash in promxy on large queries (#491) by @AndrejsPon00
- 93599c1: fix: correct memory queries in Grafana dashboard panels (#494) by @AndrejsPon00
✨ More Improvements ✨
- d3cc733: chore: setup go based on go.mod file by @gmlexx
- fbd2d4a: chore: apply coredns patch for mothership and restart once by @gmlexx
- 7867e05: chore: add promxy port-forward target by @gmlexx
- c62848b: test: check promxy metrics by @gmlexx
- 36154db: chore: add charts and docker images build (#465) by @gmlexx
- 29f0a2b: test: wait until vmauth creates ingress in kind-adopted-regional cluster (#471) by @gmlexx
- 1885064: chore: KOF 1.2.1 patch release by @denis-ryzhkov
- c106d22: test: add unit tests for Victoria pages (KOF UI) (#482) by @AndrejsPon00
- 1ac30b6: chore: KOF 1.3.0-rc1 (#496) by @denis-ryzhkov
Full Changelog: v1.2.0...v1.3.0-rc1
v1.2.1
This is a patch release with a bunch of small but important fixes which require no special upgrade instructions.
🐛 Fixes 🐛
- 9ddd84a: fix: Added `ServiceTemplateChain` `cert-manager-v1-16-4-from-1-16-4` required for upgrade to KOF 1.2.0 (#467) by @denis-ryzhkov
- 4cbef06: fix: change opencost prometheus URL to HTTP for local cluster (#451) by @AndrejsPon00
- 3fe1f5a: fix: correct instrumentation exporter endpoint to resolve trace export error (#452) by @AndrejsPon00
- d26f626: fix: Replacing release notes with auto-generated ones, updated docs/release (#453) by @denis-ryzhkov
- 73fc2ab: fix: promxy server group doesn't update after http client config changes (#456) by @AndrejsPon00
- b2e9ea7: fix: Customized `cert-manager-startupapicheck` image registry (#457) by @denis-ryzhkov
- e2a59a8: fix: increase promxy memory requests/limits to prevent OOM (#458) by @AndrejsPon00
- f28498a: fix: Jaeger authenticated endpoint of regional cluster became available for other clusters (#462) by @denis-ryzhkov
- e2780eb: fix: `istio/gateway` chart repo compatibility with custom registry (#464) by @denis-ryzhkov
- 10e6a60: fix: add missing env variable for goreleaser (#466) by @gmlexx
- 0fe6647: fix: override only defined properties with annotation on config update (#468) by @gmlexx
- a2d23c9: fix: "Cluster Deployments Events" dashboard vs "From Management to Regional" case (#469) by @denis-ryzhkov
- 25138b2: fix: Custom `kcm.serviceMonitor.selector` (#472) by @denis-ryzhkov
- daca158: fix: Custom `registryCredentialsConfig` in `helmCharts` of `kof-istio` (#473) by @denis-ryzhkov
- 656b2e3: fix: Two cases of `chartName` for `cert-manager` in `kof-istio-network` by @denis-ryzhkov
- 69b1918: fix: Moved `kof-operators` to be installed before `kof-storage` in `kof-istio-regional` to avoid "CRDs not found" by @denis-ryzhkov
- 9a60cbe: fix: Updated Jaeger secret name after moving it from `kof-storage` to `kof-mothership` in #462 to avoid `invalid ownership metadata` by @denis-ryzhkov
- 0d87dec: fix: ContainerHighMemUsage alert has container label missing (#477) by @aglarendil
- 20aa648: fix: Typo in `intervalFactor` lead to 500 in "Istio Service Dashboard" (#479) by @denis-ryzhkov
Full Changelog: v1.2.0...v1.2.1
v1.2.0
❗ Upgrade Instructions ❗
- As part of the KOF 1.2.0 overhaul of metrics collection and representation, we switched from the victoria-metrics-k8s-stack metrics and dashboards to opentelemetry-kube-stack metrics and kube-prometheus-stack dashboards.
- Some of the previously collected metrics have slightly different labels.
- If consistency of timeseries labeling is important, users are advised to conduct relabeling of the corresponding timeseries in the metric storage by running a retroactive relabeling procedure of their preference.
- A possible reference solution here would be to use Rules backfilling via vmalert.
- The labels that would require renaming are these:
  - Replace `job="integrations/kubernetes/kubelet"` with `job="kubelet", metrics_path="/metrics"`.
  - Replace `job="integrations/kubernetes/cadvisor"` with `job="kubelet", metrics_path="/metrics/cadvisor"`.
  - Replace `job="prometheus-node-exporter"` with `job="node-exporter"`.
Also:
- To upgrade from `cert-manager-1-16-4` to `cert-manager-v1-16-4` please apply this patch to management cluster:
      kubectl apply -f - <<EOF
      apiVersion: k0rdent.mirantis.com/v1beta1
      kind: ServiceTemplateChain
      metadata:
        name: patch-cert-manager-v1-16-4-from-1-16-4
        namespace: kcm-system
        annotations:
          helm.sh/resource-policy: keep
      spec:
        supportedTemplates:
          - name: cert-manager-v1-16-4
          - name: cert-manager-1-16-4
            availableUpgrades:
              - name: cert-manager-v1-16-4
      EOF
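The label renames listed in these upgrade instructions also affect any saved queries, alert expressions, or exported dashboards that still select on the old `job` values. The sketch below is one possible way to rewrite such selectors in text files. It does not relabel stored timeseries (that is what the vmalert backfilling suggestion is for), and the function name is hypothetical.

```shell
# Hypothetical helper: rewrite pre-1.2.0 job selectors to the KOF 1.2.0
# label sets from the upgrade instructions. Reads text on stdin
# (a query, a rules file, a dashboard export) and writes it to stdout.
rewrite_kof_job_labels() {
  sed \
    -e 's|job="integrations/kubernetes/kubelet"|job="kubelet", metrics_path="/metrics"|g' \
    -e 's|job="integrations/kubernetes/cadvisor"|job="kubelet", metrics_path="/metrics/cadvisor"|g' \
    -e 's|job="prometheus-node-exporter"|job="node-exporter"|g'
}
```

For example, `rewrite_kof_job_labels < old-rules.yaml > new-rules.yaml` would convert a rules file, where both file names are placeholders. JSON dashboard exports escape the quotes inside expressions, so they would need an adjusted pattern.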
📚 New Docs 📚
- KOF 1.2.0 docs: https://docs.k0rdent.io/v1.2.0/admin/kof/
- k0rdent/docs#493 updates:
  - Switch to `opentelemetry-kube-stack` collectors, metrics, dashboards.
  - Dedicated "Upgrading KOF" page.
  - Option to skip regional `ClusterDeployment`, create regional `ConfigMap` instead.
  - Optional `crossNamespace` discovery of regional cluster.
  - OpenStack-specific values.
  - Istio does not need `REGIONAL_DOMAIN` - never needed, but docs were unclear.
  - Auto-deletion of `promxyservergroup` and `grafanadatasource` on uninstall.
  - Deleted the outdated "Low-level" diagram.
  - Better verification of certs.
🚀 New Features 🚀
- feat: Add cluster filter to Victoria Logs dashboard by @AndrejsPon00 in #382
- feat: Option to allow regional cluster to be in another namespace than the child cluster by @denis-ryzhkov in #390
- feat: add dashboard to monitor OpenTelemetry Collectors metrics across all clusters by @AndrejsPon00 in #391
- feat: Show trend insights for collectors metrics in KOF dashboard by @AndrejsPon00 in #395
- feat: Add collectors list page to KOF dashboard by @AndrejsPon00 in #398
- feat: Switch Metric Collectors to Opentelemetry-kube-stack by @aglarendil in #273
- feat: optional regional cluster by @gmlexx in #396
- feat: Added the `clusterNamespace` metrics label as `cluster` name may be not unique by @denis-ryzhkov in #401
- feat: Speedup kof release workflow by @AndrejsPon00 in #365
- feat: Collect internal metrics from victoria metrics/logs services by @AndrejsPon00 in #403
- feat: add handler to fetch internal metrics from collectors by @AndrejsPon00 in #387
- feat: Add ability to extract metrics port from annotation by @AndrejsPon00 in #423
- feat: Allow exposing kof-operator webui through Ingress by @chramb in #442
🐛 Notable Fixes 🐛
- fix: Do not replace `expr` with int zero `0` if `expr` is not overridden by @denis-ryzhkov in #380
- fix: batch processor order by @gmlexx in #381
- fix: update dev-child-coredns setup to wait ingress ip provisioning by @gmlexx in #386
- fix: Moved `PromxyServerGroup` and `GrafanaDatasource` to namespace of `ClusterDeployment` by @denis-ryzhkov in #394
- fix: Added letter `v` to `cert-manager:v1.16.4` for compatibility with all registries by @denis-ryzhkov in #404
- fix: patch up incorrect alertmanager rules job label variable by @aglarendil in #409
- fix: Add debug info to kof-operator build to prevent auto-instrumentation crash by @AndrejsPon00 in #414
- fix: Enabled "self metrics" of `kube-state-metrics` by @denis-ryzhkov in #415
- fix: change cluster filter label in dashboards by @gmlexx in #417
- fix: Increase collectors memory to prevent OOM by @AndrejsPon00 in #408
- fix: Prevent regional clusters selection from cluster-deployments-events dashboard by @AndrejsPon00 in #418
- fix: Incorrect routing fallback for KOF UI on server side by @AndrejsPon00 in #419
- fix: k8s events processor by @aglarendil in #421
- fix: fix collectors observability a bit by @aglarendil in #424
- fix: Made `opentelemetry-go-instrumentation` path and version compatible with all image registries by @denis-ryzhkov in #426
- fix: Aligned versions of `autoinstrumentation-go` to v0.21.0 in all image registries by @denis-ryzhkov in #428
- fix: add cluster filter for Victoria logs/metrics Grafana dashboards by @AndrejsPon00 in #431
- fix: fix typo on helpers.tpl by @aglarendil in #434
- fix: Increase `maxLabelsPerTimeseries` to 50 to prevent label dropping in vmInsert by @AndrejsPon00 in #433
- fix: Customizable image registry for `vmauth` by @denis-ryzhkov in #436
- fix: Customizable image registries for `kof-istio-network`, `cert-manager`, `ingress-nginx` by @denis-ryzhkov in #439
- fix: fix collection of metrics from otel collectors by @aglarendil in #425
- fix: make promxy ingress target correct service by @chramb in #441
- fix: Grafana dashboards fixes by @AndrejsPon00 in #440
- fix: `make dev-collectors-deploy` was breaking exporters by @denis-ryzhkov in #444
- fix: read vmauth credentials from secret by @gmlexx in #437
- fix: listen on host ip metrics for daemon collector by @aglarendil in #446
- fix: Correct panel alignment in `Kube Prometheus Stack` dashboards by @AndrejsPon00 in #445
- fix: kgst 1.2.0 to support `certSecretRef` by @denis-ryzhkov in #449
✨ More Improvements ✨
- test: add support bundle to troubleshoot CI issues by @gmlexx in #363
- docs: add data collection recipes by @gmlexx in #379
- docs: add data sending customization recipe by @gmlexx in #383
- chore: Use kgst 1.0.0 by @denis-ryzhkov in #385
- chore: update standalone cluster templates version to match the latest kcm by @gmlexx in #420
- chore: Update dependencies to resolve vulnerabilities in KOF UI by @AndrejsPon00 in #422
- chore: update charts version for release 1.2.0 by @gmlexx in #427
- chore: format changelog with goreleaser by @gmlexx in #430
- refactor: Group grafana dashboards by folders by @AndrejsPon00 in #432
- docs: Deleted the outdated `docs/collect-from-management.md` by @denis-ryzhkov in #447
- chore: KOF 1.2.0 release by @denis-ryzhkov in #450
Full Changelog: v1.1.0...v1.2.0
v1.2.0-rc1
What's Changed
Timeseries Labeling Change
- As part of the KOF 1.2.0 overhaul of metrics collection and representation, we switched from the victoria-metrics-k8s-stack metrics and dashboards to opentelemetry-kube-stack metrics and kube-prometheus-stack dashboards.
- Some of the previously collected metrics have slightly different labels.
- If consistency of timeseries labeling is important, users are advised to conduct relabeling of the corresponding timeseries in the metric storage by running a retroactive relabeling procedure of their preference.
- A possible reference solution here would be to use Rules backfilling via vmalert.
- The labels that would require renaming are these:
  - Replace `job="integrations/kubernetes/kubelet"` with `job="kubelet", metrics_path="/metrics"`.
  - Replace `job="integrations/kubernetes/cadvisor"` with `job="kubelet", metrics_path="/metrics/cadvisor"`.
  - Replace `job="prometheus-node-exporter"` with `job="node-exporter"`.
- test: add support bundle to troubleshoot CI issues by @gmlexx in #363
- feat: Speedup kof release workflow by @AndrejsPon00 in #365
- fix: Do not replace `expr` with int zero `0` if `expr` is not overridden by @denis-ryzhkov in #380
- feat: Add cluster filter to Victoria Logs dashboard by @AndrejsPon00 in #382
- docs: add data sending customization recipe by @gmlexx in #383
- fix: update dev-child-coredns setup to wait ingress ip provisioning by @gmlexx in #386
- chore: Use kgst 1.0.0 by @denis-ryzhkov in #385
- feat: Option to allow regional cluster to be in another namespace than the child cluster by @denis-ryzhkov in #390
- feat: add dashboard to monitor OpenTelemetry Collectors metrics across all clusters by @AndrejsPon00 in #391
- fix: Moved `PromxyServerGroup` and `GrafanaDatasource` to namespace of `ClusterDeployment` by @denis-ryzhkov in #394
- feat: Show trend insights for collectors metrics in KOF dashboard by @AndrejsPon00 in #395
- feat: Add collectors list page to KOF dashboard by @AndrejsPon00 in #398
- feat: Switch Metric Collectors to Opentelemetry-kube-stack by @aglarendil in #273
- feat: Added the `clusterNamespace` metrics label as `cluster` name may be not unique by @denis-ryzhkov in #401
- fix: Added letter `v` to `cert-manager:v1.16.4` for compatibility with all registries by @denis-ryzhkov in #404
- fix: patch up incorrect alertmanager rules job label variable by @aglarendil in #409
- fix: Add debug info to kof-operator build to prevent auto-instrumentation crash by @AndrejsPon00 in #414
- feat: Collect internal metrics from victoria metrics/logs services by @AndrejsPon00 in #403
- fix: Enabled "self metrics" of `kube-state-metrics` by @denis-ryzhkov in #415
- fix: change cluster filter label in dashboards by @gmlexx in #417
- fix: Increase collectors memory to prevent OOM by @AndrejsPon00 in #408
- fix: Prevent regional clusters selection from cluster-deployments-events dashboard by @AndrejsPon00 in #418
- feat: add handler to fetch internal metrics from collectors by @AndrejsPon00 in #387
- fix: Incorrect routing fallback for KOF UI on server side by @AndrejsPon00 in #419
- chore: update standalone cluster templates version to match the latest kcm by @gmlexx in #420
- fix: k8s events processor by @aglarendil in #421
- fix: fix collectors observability a bit by @aglarendil in #424
- feat: Add ability to extract metrics port from annotation by @AndrejsPon00 in #423
- fix: Made `opentelemetry-go-instrumentation` path and version compatible with all image registries by @denis-ryzhkov in #426
- chore: Update dependencies to resolve vulnerabilities in KOF UI by @AndrejsPon00 in #422
- chore: update charts version for release 1.2.0 by @gmlexx in #427
Full Changelog: v1.1.0...v1.2.0-rc1
v1.1.0
❗ Upgrade Instructions ❗
- Please run after upgrade of KOF:
      kubectl apply --server-side --force-conflicts \
        -f https://github.com/grafana/grafana-operator/releases/download/v5.18.0/crds.yaml
  And run the same for each regional cluster:
      kubectl get secret -n kcm-system $REGIONAL_CLUSTER_NAME-kubeconfig \
        -o=jsonpath={.data.value} | base64 -d > regional-kubeconfig
      KUBECONFIG=regional-kubeconfig kubectl apply --server-side --force-conflicts \
        -f https://github.com/grafana/grafana-operator/releases/download/v5.18.0/crds.yaml
  This is required by grafana-operator release notes.
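With several regional clusters, the per-cluster step above can be wrapped in a loop. This is a minimal sketch that reuses the same `kubectl` invocations; the function name and the convention of passing regional cluster names as arguments are assumptions for illustration.

```shell
# Hypothetical helper: apply the grafana-operator v5.18.0 CRDs to every
# regional cluster named in the arguments, using the kubeconfig secrets
# stored in the kcm-system namespace of the management cluster.
apply_grafana_crds_to_regionals() {
  crds_url="https://github.com/grafana/grafana-operator/releases/download/v5.18.0/crds.yaml"
  for name in "$@"; do
    kubeconfig="$(mktemp)" || return 1
    # Extract the regional kubeconfig from the management cluster secret.
    kubectl get secret -n kcm-system "${name}-kubeconfig" \
      -o=jsonpath='{.data.value}' | base64 -d > "$kubeconfig" \
      || { rm -f "$kubeconfig"; return 1; }
    # Apply the CRDs against the regional cluster, then clean up.
    KUBECONFIG="$kubeconfig" kubectl apply --server-side --force-conflicts -f "$crds_url"
    status=$?
    rm -f "$kubeconfig"
    [ "$status" -eq 0 ] || return "$status"
  done
}
```

A call such as `apply_grafana_crds_to_regionals regional-a regional-b` would process two clusters in turn, where both names are placeholders. The temporary kubeconfig is removed after each cluster so that credentials do not linger on disk.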
📚 New Docs 📚
- Alerts: https://docs.k0rdent.io/v1.1.1/admin/kof/kof-alerts/
- KOF UI: https://docs.k0rdent.io/v1.1.1/admin/kof/kof-using/#access-to-the-kof-ui
- Grafana SSO: https://docs.k0rdent.io/v1.1.1/admin/kof/kof-using/#single-sign-on
- Full diff: k0rdent/docs#397
🚀 New Features 🚀
- feat: Switching to upstream PrometheusRules at promxy and regional with patches for all/specific clusters by @denis-ryzhkov in #248
- feat: Add server to `kof-operator` for prometheus observability by @AndrejsPon00 in #275
- feat: add configurable UI port setting by @AndrejsPon00 in #314
- feat: ContainerHighMemoryUsage alert for CAPI Operator and others by @denis-ryzhkov in #317
- feat: Configure Grafana SSO using Dex by @AndrejsPon00 in #319
- feat: add autoinstrumentation to kof operator to collect metrics and traces by @AndrejsPon00 in #344
- feat: Custom image registries PRs and resolved conflicts by @denis-ryzhkov in #348
- feat: sync kof operator resources when cluster annotation changes by @AndrejsPon00 in #340
🐛 Notable Fixes 🐛
- fix: addon-controller ServiceMonitor watches wrong namespace by @aglarendil in #290
- fix: filter out projectsveltos_* metrics from kcm cm by @aglarendil in #291
- fix: Errors on upgrade of kof ServiceTemplates by @denis-ryzhkov in #295
- fix: logs sorting order by @aglarendil in #301
- fix: remove shadcn add command causing unwanted file updates by @AndrejsPon00 in #307
- fix: Temporary adaptation of new alerts to current metrics by @denis-ryzhkov in #310
- fix: Workaround for `generatorURL` in alerts and "See source" in Grafana by @denis-ryzhkov in #312
- fix: "too many open files" in `sveltos-dashboard` by @denis-ryzhkov in #320
- fix: use correct namespace name for MCS by @aglarendil in #332
- fix: update grafana to fix CVE-2025-4123 by @gmlexx in #339
- fix: allow to parametrise operators and ingress-nginx values by @aglarendil in #337
- fix: make Sveltos follow cluster updates by @aglarendil in #333
- fix: Pinned versions of Grafana plugins and Promxy as part of air-gap solution by @denis-ryzhkov in #349
- fix: Duplicate `version` field in Grafana by @denis-ryzhkov in #350
- fix: `pattern dist` not found on `kof-operator-release` by @denis-ryzhkov in #353 and #354
✨ More Improvements ✨
- test: add tests for kof operator UI by @AndrejsPon00 in #299
- chore: bump `vite` from 6.2.0 to 6.2.7 by @AndrejsPon00 in #306
- chore: use latest KCM release for CI by @gmlexx in #308
- feat: add adopted regional cluster deployment by @gmlexx in #309
- docs: add kof-ui docs by @gmlexx in #313
- chore: align chart versions; sveltos dashboard bump to 0.54.0 (#316) by @denis-ryzhkov in #318
- test: add adopted child cluster deployment by @gmlexx in #321
- refactor: Replace collector event receiver by @AndrejsPon00 in #327
- chore: update codeowners by @gmlexx in #329
- chore: kof 1.1.0-rc1 by @denis-ryzhkov in #351
- fix: Switching to kgst 0.2.0 to use k0rdent v1beta1 instead of deleted v1alpha1 by @denis-ryzhkov in #355
- docs: Moved few dev/docs to official docs by @denis-ryzhkov in #357
- docs: Improve Dex SSO documentation with a clearer example by @AndrejsPon00 in #356
- chore: kof 1.1.0 release by @denis-ryzhkov in #359
Full Changelog: v1.0.0...v1.1.0
v1.1.0-rc1
❗ Upgrade instructions ❗
- Please run after upgrade of KOF:
      kubectl apply --server-side --force-conflicts \
        -f https://github.com/grafana/grafana-operator/releases/download/v5.18.0/crds.yaml
  And run the same for each regional cluster:
      kubectl get secret -n kcm-system $REGIONAL_CLUSTER_NAME-kubeconfig \
        -o=jsonpath={.data.value} | base64 -d > regional-kubeconfig
      KUBECONFIG=regional-kubeconfig kubectl apply --server-side --force-conflicts \
        -f https://github.com/grafana/grafana-operator/releases/download/v5.18.0/crds.yaml
  This is required by grafana-operator release notes.
🚀 New Features 🚀
- feat: Switching to upstream PrometheusRules at promxy and regional with patches for all/specific clusters by @denis-ryzhkov in #248
- feat: Add server to `kof-operator` for prometheus observability by @AndrejsPon00 in #275
- feat: add configurable UI port setting by @AndrejsPon00 in #314
- feat: ContainerHighMemoryUsage alert for CAPI Operator and others by @denis-ryzhkov in #317
- feat: Configure Grafana SSO using Dex by @AndrejsPon00 in #319
- feat: add autoinstrumentation to kof operator to collect metrics and traces by @AndrejsPon00 in #344
- feat: Custom image registries PRs and resolved conflicts by @denis-ryzhkov in #348
- feat: sync kof operator resources when cluster annotation changes by @AndrejsPon00 in #340
🐛 Notable Fixes 🐛
- fix: addon-controller ServiceMonitor watches wrong namespace by @aglarendil in #290
- fix: filter out projectsveltos_* metrics from kcm cm by @aglarendil in #291
- fix: Errors on upgrade of kof ServiceTemplates by @denis-ryzhkov in #295
- fix: logs sorting order by @aglarendil in #301
- fix: remove shadcn add command causing unwanted file updates by @AndrejsPon00 in #307
- fix: Temporary adaptation of new alerts to current metrics by @denis-ryzhkov in #310
- fix: Workaround for `generatorURL` in alerts and "See source" in Grafana by @denis-ryzhkov in #312
- fix: "too many open files" in `sveltos-dashboard` by @denis-ryzhkov in #320
- fix: use correct namespace name for MCS by @aglarendil in #332
- fix: update grafana to fix CVE-2025-4123 by @gmlexx in #339
- fix: allow to parametrise operators and ingress-nginx values by @aglarendil in #337
- fix: make Sveltos follow cluster updates by @aglarendil in #333
- fix: Pinned versions of Grafana plugins and Promxy as part of air-gap solution by @denis-ryzhkov in #349
- fix: Duplicate `version` field in Grafana by @denis-ryzhkov in #350
- fix: `pattern dist` not found on `kof-operator-release` by @denis-ryzhkov in #353 and #354
✨ More Improvements ✨
- test: add tests for kof operator UI by @AndrejsPon00 in #299
- chore: bump `vite` from 6.2.0 to 6.2.7 by @AndrejsPon00 in #306
- chore: use latest KCM release for CI by @gmlexx in #308
- feat: add adopted regional cluster deployment by @gmlexx in #309
- docs: add kof-ui docs by @gmlexx in #313
- chore: align chart versions; sveltos dashboard bump to 0.54.0 (#316) by @denis-ryzhkov in #318
- test: add adopted child cluster deployment by @gmlexx in #321
- refactor: Replace collector event receiver by @AndrejsPon00 in #327
- chore: update codeowners by @gmlexx in #329
- chore: kof 1.1.0-rc1 by @denis-ryzhkov in #351
- fix: Switching to kgst 0.2.0 to use k0rdent v1beta1 instead of deleted v1alpha1 by @denis-ryzhkov in #355
Full Changelog: v1.0.0...v1.1.0-rc1
v1.0.0
🚀 New Features 🚀
- feat: migrate to v1beta1 by @gmlexx in #256
- feat: add nvidia gpu monitoring dashboard to grafana by @ramessesii2 in #257
- feat: adopting kube-api-server service monitor from opentelemetry-kube-stack by @gmlexx in #259
- feat: extend resources customization for grafana and vmcluster by @gmlexx in #263
- feat: add cluster annotation to customize promxy and datasource http config by @gmlexx in #276
- feat: move to victoria-log-cluster by @gmlexx in #274
🐛 Notable Fixes 🐛
- fix: istio remote secret creation by @gmlexx in #270
- fix: `"helm repo add" requires 2 arguments` related to fix in yq v4.45.4 by @denis-ryzhkov in #282
- fix: support modification of resources for all VM services by @aglarendil in #279
✨ More Improvements ✨
- chore: remove KCM upgrade from KOF upgrade test by @gmlexx in #260
- chore: bump go version to upcoming upgrade by @gmlexx in #269
- chore: bump go version for upcoming upgrade by @gmlexx in #271
- chore: bump helm charts versions to v1.0.0 by @gmlexx in #277
- chore: kof 1.0.0-rc2 using kcm 1.0.0-rc1 and kcm api v1beta1 by @denis-ryzhkov in #284
- chore: kof 1.0.0 using kcm 1.0.0 by @denis-ryzhkov in #285
Full Changelog: v0.3.0...v1.0.0