OpenTelemetry MVP with Full Observability Stack

General Overview

This project is an MVP (Minimum Viable Product) demonstrating OpenTelemetry integration with a full observability stack. It showcases how to instrument a FastAPI application to generate telemetry data (traces, metrics, and logs) and route it through an OpenTelemetry Collector to various backends for monitoring and visualization.

The stack includes:

  • Loki for centralized logging
  • Grafana for dashboards and visualization
  • Tempo for distributed tracing
  • Prometheus for metrics collection

The core application is a FastAPI service that generates sample telemetry data, simulating real-world observability scenarios.
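
As a rough illustration of what such instrumentation looks like, the sketch below auto-instruments a FastAPI app with the OpenTelemetry Python SDK. The endpoint and handler are illustrative; this is not the repository's actual application code.

    # Minimal sketch (illustrative, not the repository's app code): auto-instrument
    # a FastAPI service so every incoming request produces a trace span.
    from fastapi import FastAPI
    from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

    app = FastAPI()

    @app.get("/")          # illustrative endpoint
    def index():
        return {"status": "ok"}

    # After this call, HTTP requests are traced automatically; the exporters that
    # ship the data to the Collector are configured separately (see the OTLP sketch below).
    FastAPIInstrumentor.instrument_app(app)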

High-Level Design Document

Architecture Overview

The system follows a microservices architecture with the following components:

  • Application (FastAPI): The main service instrumented with OpenTelemetry for automatic generation of traces, metrics, and logs
  • OpenTelemetry Collector: Receives OTLP (OpenTelemetry Protocol) data from the application and routes it to appropriate backends
  • Backends:
    • Prometheus: Stores and serves metrics data
    • Tempo: Stores and queries distributed traces
    • Loki: Aggregates and indexes log data
  • Visualization: Grafana provides unified dashboards for all telemetry data
  • Load Generator: A separate service that simulates traffic to the FastAPI application
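
As a rough sketch of the last component, a load generator can be as simple as the loop below; the target URL, paths, and request rate are assumptions for illustration, not the repository's actual loadgen code.

    # Illustrative load generator (assumed URL and paths, not the real loadgen code):
    # repeatedly hit the FastAPI service so traces, metrics, and logs keep flowing.
    import random
    import time

    import requests

    APP_URL = "http://space-app:8000"   # assumed in-network hostname and port
    PATHS = ["/", "/healthz"]           # illustrative endpoints

    while True:
        try:
            requests.get(f"{APP_URL}{random.choice(PATHS)}", timeout=5)
        except requests.RequestException:
            pass                        # keep going even if a request fails
        time.sleep(1)                   # roughly one request per second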

Data Flow

flowchart LR
    A[Application<br/>FastAPI] --> B[OpenTelemetry<br/>Collector]
    B --> C[Prometheus<br/>Metrics]
    B --> D[Tempo<br/>Traces]
    B --> E[Loki<br/>Logs]
    C --> F[Grafana<br/>Visualization]
    D --> F
    E --> F
  1. The FastAPI application generates telemetry data through OpenTelemetry instrumentation
  2. Data is sent via OTLP to the OpenTelemetry Collector
  3. The Collector processes and routes data to respective backends
  4. Grafana queries all backends to create comprehensive dashboards
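
To make step 2 concrete, the sketch below wires a tracer provider to an OTLP exporter that targets the Collector. The endpoint is an assumption (a collector host named otelcol on the default OTLP gRPC port 4317); in practice it typically comes from the OTEL_EXPORTER_OTLP_ENDPOINT environment variable.

    # Sketch of exporting spans over OTLP to the Collector (endpoint is an assumption).
    from opentelemetry import trace
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor
    from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

    provider = TracerProvider(resource=Resource.create({"service.name": "space-app"}))
    provider.add_span_processor(
        BatchSpanProcessor(OTLPSpanExporter(endpoint="http://otelcol:4317", insecure=True))
    )
    trace.set_tracer_provider(provider)

    tracer = trace.get_tracer(__name__)
    with tracer.start_as_current_span("demo-operation"):
        pass  # work done inside the span is batched and sent to the Collector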

Steps to Run Application

Docker/Podman Compose

Prerequisites

  • Docker and Docker Compose (or Podman and Podman Compose)
  • Git

Steps to Run

This setup works with both Docker Compose and Podman Compose. Use docker-compose or podman-compose commands as appropriate for your environment.

  1. Clone the repository:

    git clone <repository-url>
    cd otel-lgtm-mvp
  2. Configure environment variables (optional but recommended):

    Copy the sample configuration and edit it to suit your environment:

    cp .env.example .env

    Update the values in .env before starting the stack:

    • GF_SECURITY_ADMIN_USER: Admin username for Grafana. The example file uses admin for quick local testing. In production, choose a unique, non-default username to reduce the risk of brute-force attacks.
    • GF_SECURITY_ADMIN_PASSWORD: Admin password for Grafana. Keep admin only for development demos. Always set a strong, unique password in production.
    • GF_USERS_ALLOW_SIGN_UP: Controls whether self-service user sign-up is allowed. Leave it set to false for both development and production unless you intentionally want anyone with access to create an account.
    • GF_AUTH_ANONYMOUS_ENABLED: Enables anonymous (no login) dashboard access. Keeping it true simplifies demos and local exploration. Switch it to false in production unless the dashboards are meant for public viewing.
    • GF_AUTH_ANONYMOUS_ORG_ROLE: Role assigned to anonymous users. Viewer grants read-only access and is appropriate for development. In production, keep Viewer or disable anonymous access entirely, depending on your security requirements.
  3. Start the services:

    docker-compose up -d
    # or
    podman-compose up -d
  4. Access the services (typical local defaults shown below; confirm the published ports in docker-compose.yml):

    • Grafana: http://localhost:3000
    • FastAPI application: http://localhost:8000
    • Prometheus: http://localhost:9090

  5. View telemetry data:

    • Open Grafana and explore the pre-configured dashboard
    • Check application logs and metrics in their respective interfaces
  6. Stop the services:

    docker-compose down
    # or
    podman-compose down

Kubernetes via Kustomize

The repository also provides a Kubernetes deployment that mirrors the compose stack. All manifests live under deploy/k8s/ and are structured as a reusable base plus environment-specific overlays, while shared configuration lives in deploy/k8s/base/config/ so both Kubernetes and docker-compose consume the same files.

Directory layout

  • deploy/k8s/base – Deployments, Services, PersistentVolumeClaims, ConfigMaps, and the grafana-admin Secret that together stand up Grafana, Loki, Tempo, Prometheus, the OpenTelemetry Collector, the FastAPI app, and the load generator.
  • deploy/k8s/base/config – Canonical configuration (Grafana provisioning, dashboards, Loki/Tempo/Prometheus configs, OTEL collector pipeline) mounted by docker-compose (shared prefix exported as CONFIG_ROOT) and injected into Kubernetes ConfigMaps.
  • deploy/k8s/overlays/local – Targets local development clusters. It swaps the app/load generator images to the locally built tags and disables image pulls, making it ideal for kind, k3d, or Minikube.
  • deploy/k8s/overlays/production – Provides templates for cloud clusters. It adds resource requests/limits, sets a sample storage class, promotes Grafana to a LoadBalancer Service, and defines placeholder Ingress objects for TLS termination.
  • docs/k8s-manifests.md – Deep dive into every manifest with links back to the official Kubernetes documentation for further reading.

Managing Grafana credentials

The base manifest generates a grafana-admin Secret with the same admin/admin defaults as compose. Before deploying to a shared environment, replace it:

kubectl create secret generic grafana-admin \
  --namespace observability \
  --from-literal=GF_SECURITY_ADMIN_USER=your-admin \
  --from-literal=GF_SECURITY_ADMIN_PASSWORD='strong-password' \
  --dry-run=client -o yaml | kubectl apply -f -

You can also use kustomize edit set secret --disable-name-suffix-hash grafana-admin ... inside an overlay if you prefer the Secret to be managed declaratively.

macOS quickstart (kind + Docker Desktop or Podman)

These steps were tested end-to-end on a macOS host using kind v0.26.0 and Podman 5.5.2. Substitute Docker Desktop if that is your preferred container runtime.

  1. Install prerequisites

    • kubectl for cluster interaction.
    • kind to provision a local Kubernetes cluster in containers.
    • Either Docker Desktop or Podman Desktop as the container engine. When using Podman, make sure podman machine is running (podman machine start).
  2. Clone the repository and move into it

    git clone https://github.com/hyzhak/otel-lgtm-mvp.git
    cd otel-lgtm-mvp
  3. Build the demo images

    # Docker Desktop
    docker build -t space-app:latest app
    docker build -t loadgen:latest loadgen
    
    # Podman (tested)
    podman build -t space-app:latest app
    podman build -t loadgen:latest loadgen
  4. Create the kind cluster

    kind create cluster --name otel-lgtm --wait 2m

    kind automatically detects Docker, Podman, or nerdctl. If you want to force a specific runtime, set KIND_EXPERIMENTAL_PROVIDER=docker|podman|nerdctl before running the command (see the kind quick-start guide).

  5. Load the local images into the cluster

    • When Docker is the active runtime, kind load docker-image works directly:

      kind load docker-image space-app:latest --name otel-lgtm
      kind load docker-image loadgen:latest --name otel-lgtm
    • With Podman rootless, push-style loading is not yet implemented, so tag the images for the Docker registry namespace and import an archive (workaround documented in the kind Podman guide):

      podman tag space-app:latest docker.io/library/space-app:latest
      podman tag loadgen:latest docker.io/library/loadgen:latest
      podman save --format docker-archive -o space-app.tar docker.io/library/space-app:latest
      podman save --format docker-archive -o loadgen.tar docker.io/library/loadgen:latest
      KIND_EXPERIMENTAL_PROVIDER=podman kind load image-archive space-app.tar --name otel-lgtm
      KIND_EXPERIMENTAL_PROVIDER=podman kind load image-archive loadgen.tar --name otel-lgtm
  6. Deploy the stack

    make k8s-apply-local
    kubectl wait --namespace observability --for=condition=Available deployment --all --timeout=5m
  7. Access the services

    • Forward ports from the cluster and open the dashboards locally:

      kubectl port-forward -n observability svc/grafana 3000:3000
      kubectl port-forward -n observability svc/space-app 8000:8000
    • Visit http://localhost:3000 (Grafana) and http://localhost:8000 (FastAPI). You can also run open http://localhost:3000 on macOS.

  8. Clean up

    make k8s-delete-local
    # Docker Desktop
    kind delete cluster --name otel-lgtm
    # Podman provider
    KIND_EXPERIMENTAL_PROVIDER=podman kind delete cluster --name otel-lgtm
    rm -f space-app.tar loadgen.tar  # remove the temporary archives if you created them

Development convenience script

Spin up a kind-backed cluster, build/load the demo images, and apply the local overlay in one step:

./scripts/start_k8s_dev_stack.sh

Key environment variables:

  • CLUSTER_NAME (default otel-lgtm-dev) chooses or creates the target kind cluster.
  • SKIP_BUILD=1 / SKIP_LOAD=1 reuse existing images instead of rebuilding or loading.
  • RESET_STACK=1 deletes the overlay before reapplying it.
  • DOCKER_CONFIG_DIR points to an alternative Docker config (useful on macOS to bypass Keychain helpers).
  • KUBECONFIG_PATH stores the generated kubeconfig somewhere other than ~/.kube/config (handy when write access is restricted).

After the script reports success, follow the printed commands to port-forward Grafana or the application. When finished, remove the stack with kubectl delete -k deploy/k8s/overlays/local and delete the cluster with kind delete cluster --name <name> if desired.

Local clusters (kind, k3d, Minikube)

  1. Install kubectl and a local Kubernetes distribution (kind, k3d, or minikube).

  2. Build the application images and tag them as expected by the overlay:

    docker build -t space-app:latest app
    docker build -t loadgen:latest loadgen
  3. Load the images into your cluster (examples shown for kind and Minikube):

    kind load docker-image space-app:latest
    kind load docker-image loadgen:latest
    # or for Minikube
    minikube image load space-app:latest
    minikube image load loadgen:latest
  4. Apply the manifests:

    make k8s-apply-local
    # equivalent to: kubectl apply -k deploy/k8s/overlays/local
  5. Check that the workloads are ready (re-run until all pods report Running):

    kubectl get pods -n observability
  6. Port-forward to reach the services from your workstation:

    kubectl port-forward -n observability svc/grafana 3000:3000
    kubectl port-forward -n observability svc/space-app 8000:8000
    kubectl port-forward -n observability svc/prometheus 9090:9090
  7. Tear the stack down when finished:

    make k8s-delete-local

Production and cloud clusters (GKE, EKS, AKS, bare metal)

  1. Copy deploy/k8s/overlays/production and adjust it to match your infrastructure:

    • Update patches/storage-class.yaml with the correct storageClassName for your cluster.
    • Swap the annotations in patches/grafana-service.yaml for the load balancer you use (AWS, GCP, MetalLB, etc.).
    • Edit ingress.yaml with the hostnames/TLS secrets that your ingress controller expects.
    • Override the container images to point at the registry where you publish the FastAPI app and load generator (for example via kustomize edit set image).
  2. Rotate the Grafana admin credentials as shown above or manage them through your preferred secret store.

  3. Deploy with:

    make k8s-apply-production
    # or: kubectl apply -k deploy/k8s/overlays/production
  4. Integrate the overlay with GitOps or CI pipelines as needed. The manifests are compatible with both kubectl and Argo CD/Flux.

To clean up the production overlay from a cluster, run make k8s-delete-production.

Helpful commands

  • Preview the rendered manifests before applying:

    kubectl kustomize deploy/k8s/overlays/local | less
    kubectl kustomize deploy/k8s/overlays/production | less
  • Check the health of the running stack:

    kubectl get pods,svc,pvc -n observability
    kubectl logs -n observability deploy/otelcol

If you change any of the configuration files under deploy/k8s/base/config/, both docker-compose and Kubernetes will pick up the updates. Keep the shared directory as the single source of truth for component configuration.

Running Kubernetes integration tests

The docker-compose integration suite can now run inside the cluster to validate the Kubernetes deployment end-to-end.

  1. Build and load the images that the local overlay expects:

    docker build -t space-app:latest app
    docker build -t loadgen:latest loadgen
    docker build -t integration-tests:latest -f tests/integration/Dockerfile .
    kind load docker-image space-app:latest --name otel-lgtm
    kind load docker-image loadgen:latest --name otel-lgtm
    kind load docker-image integration-tests:latest --name otel-lgtm
  2. Apply the local overlay (or ensure it is already running):

    make k8s-apply-local
    kubectl wait --namespace observability --for=condition=Available deployment --all --timeout=5m
  3. Launch the Job that executes the tests inside the observability namespace:

    make k8s-integration-test

    The helper script behind this target streams the Job logs and cleans up the resources automatically. Override WAIT_TIMEOUT, NAMESPACE, or JOB_NAME when calling the script directly (e.g. WAIT_TIMEOUT=15m ./scripts/run_k8s_integration_tests.sh) to customise behaviour on slower clusters.

  4. Tear the stack down when finished:

    make k8s-delete-local
    kind delete cluster --name otel-lgtm

When the tests fail, the script captures Kubernetes diagnostics (pod status, Job description, and latest pod logs) before exiting. Because the integration assertions reuse the same image and code as the compose workflow, failures indicate either cluster readiness problems or behavioural differences between the deployments.

Fully automated option

./scripts/run_k8s_integration_tests_full.sh

The script builds the demo and test images, spins up a temporary kind cluster (if needed), loads the images, applies the local overlay, runs the Job, and cleans everything up. Useful environment variables:

  • DOCKER, KIND, KUBECTL: override the binaries that are executed.
  • DOCKER_CONFIG_DIR: point at a specific Docker config directory (set this on macOS to bypass Keychain helpers).
  • KEEP_CLUSTER=1 / KEEP_STACK=1: keep the kind cluster and/or deployed resources after the run.
  • WAIT_DEPLOY_TIMEOUT, WAIT_JOB_TIMEOUT: adjust the deployment and Job wait ceilings (defaults 5m and 15m).

Additional Notes

  • The load generator service will automatically start generating traffic to the FastAPI application
  • All services are configured to work together out of the box
  • Configuration files for each service are located in their respective directories

Integration Tests

These end-to-end tests bring up the full docker-compose stack, exercise the FastAPI demo service, and assert that traces, metrics, and logs flow into Tempo, Prometheus, and Loki respectively.

One-step helper

./scripts/run_compose_integration_tests.sh

The script wraps the compose workflow, builds fresh images, and tears the stack down automatically. Set COMPOSE if you prefer a different binary (for example COMPOSE="docker-compose") and DOCKER_CONFIG_DIR when you need to point at a custom Docker configuration (handy on macOS to bypass credential helpers).

Quick start

docker compose -f docker-compose.yml -f docker-compose.integration.yml up --build --exit-code-from integration-tests integration-tests
docker compose -f docker-compose.yml -f docker-compose.integration.yml down -v

Running make integration-test wraps the same commands (set COMPOSE=podman-compose to run them with Podman Compose instead of Docker Compose).

For Podman Compose you can run the equivalent flow:

podman-compose down -v && podman-compose up -d --build
# give services a brief moment to settle
sleep 4
podman-compose -f docker-compose.yml -f docker-compose.integration.yml run --rm integration-tests

The test suite will:

  • run inside a disposable Python 3.12 container built from tests/integration/Dockerfile (dependencies baked into the image),
  • wait for Grafana, Loki, Tempo, Prometheus, and the demo app to report healthy,
  • generate sample traffic against the FastAPI app, and
  • query each backend to ensure telemetry is ingested and queryable.
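
As a rough illustration of the last two bullets, a check of this shape generates some traffic and then queries one API per backend. The hostnames, ports, label names, and queries below are assumptions for a default compose network, not the suite's actual assertions (those live in tests/integration/).

    # Illustrative only: assumed hostnames, ports, and labels, not the real test code.
    import os
    import time

    import requests

    service = os.environ.get("SERVICE_NAME", "space-app")

    # Generate a little traffic, then give the Collector time to flush its batches.
    for _ in range(10):
        requests.get("http://space-app:8000/", timeout=5)
    time.sleep(5)

    # Metrics: Prometheus instant query over its HTTP API.
    prom = requests.get("http://prometheus:9090/api/v1/query",
                        params={"query": f'up{{job="{service}"}}'}, timeout=10).json()
    assert prom["data"]["result"], "no metrics found for the service"

    # Logs: Loki range query filtered by a service label.
    loki = requests.get("http://loki:3100/loki/api/v1/query_range",
                        params={"query": f'{{service_name="{service}"}}'}, timeout=10).json()
    assert loki["data"]["result"], "no log streams found for the service"

    # Traces: Tempo search by service-name tag.
    tempo = requests.get("http://tempo:3200/api/search",
                         params={"tags": f"service.name={service}"}, timeout=10).json()
    assert tempo.get("traces"), "no traces found for the service"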

Set STACK_READY_TIMEOUT or OBS_WAIT_TIMEOUT (environment variables passed to the test container) to tweak readiness and observation timeouts when running on slower hardware.

If you need to run the assertions against an already running stack outside of Docker Compose, override the base URLs (e.g. APP_BASE_URL, GRAFANA_HEALTH_URL) when invoking the workflow so the test container targets the correct hosts.

Additional environment variables let you align the assertions with custom deployments (for example, when the service name or Loki label differs from space-app). Override SERVICE_NAME, APP_HOST, PROM_EXPECTED_JOB, or LOKI_SERVICE_LABEL as needed.

When changing tests/requirements-dev.txt, rebuild the integration image with docker compose -f docker-compose.yml -f docker-compose.integration.yml build integration-tests (or the equivalent podman-compose build).

The GitHub Actions workflow .github/workflows/integration-tests.yml runs the same compose stack on every push and pull request.
