Telemetry stack for my personal GKE cluster: Prometheus, Grafana and other bits to get useful data out of it.
I've deliberately used CoreOS' Prometheus Operator because I recognise how useful it is, but stopped short of deploying the full CoreOS kube-prometheus stack because, great concept though it is, I want to learn how this stuff hangs together.
This uses the CoreOS Prometheus Operator, tweaked slightly (namespace, resources, labels). I had to add --config-reloader-cpu=20m to fit it on my tiny cluster! Sadly there is no equivalent flag for the prometheus-config-reloader in the current Operator, but setting it for the config-reloader alone was enough to get me up and running.
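For reference, this is roughly where that flag ends up on the operator container; everything except the --config-reloader-cpu=20m argument is illustrative.

```yaml
# Fragment of the operator Deployment's container spec (kustomize patch style);
# only the extra --config-reloader-cpu arg is the point, the rest is illustrative.
- name: prometheus-operator
  args:
    - --kubelet-service=kube-system/kubelet   # illustrative existing arg
    - --config-reloader-cpu=20m               # shrink the config-reloader sidecar's CPU request
```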
The operator CRDs are installed from a different repo, as the ServiceMonitor resource needs to exist for several other pipelines to work successfully.
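For illustration, this is the sort of ServiceMonitor those pipelines create once the CRD exists; the app name, namespace, labels and port here are made up rather than taken from this repo.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app            # illustrative
  namespace: monitoring        # illustrative
  labels:
    release: prometheus        # must match the Prometheus CR's serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: example-app         # matches the app's Service labels
  endpoints:
    - port: metrics            # named port on the app's Service
      interval: 30s
```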
Prometheus itself is defined in ./prometheus/. I skimped on a dedicated StorageClass to save myself a few quid.
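A rough sketch of a Prometheus resource along these lines, not the exact one in ./prometheus/; the sizes are illustrative, and the volume claim just falls back to the cluster's default StorageClass since there isn't a dedicated one.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: monitoring          # illustrative
spec:
  replicas: 1
  serviceMonitorSelector: {}     # pick up all ServiceMonitors in scope
  resources:
    requests:
      memory: 400Mi              # illustrative, sized for a tiny cluster
  storage:
    volumeClaimTemplate:
      spec:
        resources:
          requests:
            storage: 10Gi        # no storageClassName: uses the cluster default
```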
When this is up and running, you should be able to `kubectl port-forward svc/prometheus-operated 9090:9090`, hit http://localhost:9090/ and see one of the Promethei.
Similarly, you should be able to `kubectl port-forward svc/alertmanager-operated 9093:9093`, hit http://localhost:9093/ and see the AlertManager.
Basic Auth has been replaced with Google's Identity-Aware Proxy across my shared Gateway. If you're looking for that config, look at commits from before October 2023.
Setup has been done manually for now; I may revisit making it declarative later (a rough sketch follows the list below). The general gist is:
- Enable IAP via the Console
- Find the Backend Service for this workload in the IAP panel and toggle IAP to On
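If I do make this declarative, the GKE Gateway route would look something like a GCPBackendPolicy. This is a sketch only, assuming the networking.gke.io/v1 API; the names, namespace and Secret are illustrative.

```yaml
apiVersion: networking.gke.io/v1
kind: GCPBackendPolicy
metadata:
  name: prometheus-iap               # illustrative name
  namespace: monitoring              # illustrative namespace
spec:
  default:
    iap:
      enabled: true
      clientID: <oauth-client-id>    # from the IAP OAuth client, illustrative placeholder
      oauth2ClientSecret:
        name: iap-oauth-client       # illustrative Secret holding the OAuth client secret
  targetRef:
    group: ""
    kind: Service
    name: prometheus                 # the Service sitting behind the shared Gateway
```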
The always-firing alert is helpful for testing, e.g. by taking a copy of it and changing its receiver. I've found that editing the AlertManager config to add a new route with a much shorter repeat interval, pointing at a testing receiver, works best for me:
```yaml
- receiver: testing
  group_by: [group]
  repeat_interval: 1m
  group_interval: 1m
  matchers:
    - receiver="testing"
```

...then configuring an appropriate receiver matching that name to test with.
There is also a ./fire-test-alert.sh script which is occasionally useful; it needs a port-forward to AlertManager to work.
Grafana is installed via an operator; see ./grafana-operator/generate-manifest.sh. The raw manifest is generated locally via helm template, then CI takes care of the deployment via kustomize. The operator is deployed in namespaced mode, meaning Grafana resources (including dashboards) are only looked for in the grafana namespace.
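For illustration, this is the shape of a dashboard resource the operator would pick up; it assumes the grafana.integreatly.org/v1beta1 CRDs, and the names, labels and JSON are made up.

```yaml
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDashboard
metadata:
  name: example-dashboard
  namespace: grafana            # namespaced mode: only this namespace is watched
spec:
  instanceSelector:
    matchLabels:
      dashboards: grafana       # must match labels on the Grafana CR
  json: |
    {
      "title": "Example",
      "panels": []
    }
```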
- Grafana Operator
- Controller to generate ServiceMonitors for apps
- remove servicemonitors from other repos
- ensure secret part of project migration
- AlertManager access via ingress with auth
- A default alert handler for no routes
- Tests for this!
- Dashboard for it being called
- Alerts for:
  - Pods not scheduled
  - Crash Loop Backoff
  - 404
  - 5xx
  - Velero backup failed
  - [-] Certs about to expire (no longer using cert-manager)
  - High latency alert
  - OOMKill
  - PV Filling
- Slack integration
- Slack integration improvements to formatting - own app?
- Dashboard links for all alerts
- Grafana login through Google Account
- Remove the old terraformed stuff
- Grafana plugins experimentation
- Stackdriver