Project Overview: See PROJECT.md for a high-level overview, achievements, and technical highlights.
Echo AI includes a complete observability stack to monitor transcription services, measure latencies, and understand system performance.
- Prometheus: For metrics collection and storage
- OpenTelemetry Collector: For collecting and processing telemetry data
- Tempo: For distributed tracing
- Grafana: For visualization of metrics and traces
The observability stack is configured in docker-compose.yml. To start the entire stack:

```bash
docker-compose up -d
```

This will start:
- Prometheus on port 9090
- OpenTelemetry Collector on ports:
  - 8889: Collector's own metrics
  - 9464: Prometheus exporter for OpenTelemetry metrics
  - 4317: OTLP gRPC receiver (primary ingestion point)
  - 4318: OTLP HTTP receiver (primary ingestion point)
- Tempo on ports:
  - 3200: Query API (used by Grafana)
  - 14317: OTLP gRPC receiver (alternative direct connection)
  - 14318: OTLP HTTP receiver (alternative direct connection)
- Grafana on port 3000
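As an illustration, the port layout above would correspond to docker-compose service definitions along these lines. This is a hypothetical sketch, not an excerpt from the project's actual docker-compose.yml; service names and image tags are assumptions:

```yaml
# Hypothetical excerpt; see docker-compose.yml for the real definitions.
services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib
    ports:
      - "8889:8889"   # collector's own metrics
      - "9464:9464"   # Prometheus exporter
      - "4317:4317"   # OTLP gRPC receiver
      - "4318:4318"   # OTLP HTTP receiver
  tempo:
    image: grafana/tempo
    ports:
      - "3200:3200"   # query API used by Grafana
      - "14317:4317"  # OTLP gRPC, remapped so it doesn't clash with the collector
      - "14318:4318"  # OTLP HTTP, remapped for the same reason
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
```

The remapped Tempo ports are why direct connections use 14317/14318 on the host even though Tempo listens on the standard OTLP ports inside the container.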
We use different port mappings to avoid conflicts:
- Applications should send telemetry data to the OpenTelemetry Collector (ports 4317/4318)
- The collector processes and forwards this data to both Prometheus and Tempo
- For direct connection to Tempo (bypassing the collector), use ports 14317/14318
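The routing described above (applications → collector → Prometheus and Tempo) is typically expressed in the collector's configuration file. A minimal sketch follows; the file name, exporter settings, and TLS choice are assumptions, not taken from this project's actual config:

```yaml
# Hypothetical otel-collector-config.yaml sketch.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

exporters:
  prometheus:
    endpoint: 0.0.0.0:9464   # scraped by Prometheus
  otlp/tempo:
    endpoint: tempo:4317     # forwarded to Tempo's OTLP gRPC receiver
    tls:
      insecure: true         # plain-text inside the compose network

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/tempo]
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
```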
The transcription service sends telemetry data when run with the --otlp flag:

```bash
cargo run --bin transcription -- --otlp [other options]
```

By default, the service will try to connect to http://localhost:4317. When running with Docker, you'll need to set the proper endpoint:

```bash
# When running the service in Docker:
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317 cargo run --bin transcription -- --otlp

# When running the service on the host machine:
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 cargo run --bin transcription -- --otlp
```

This enables:
- Distributed tracing with spans for key operations
- Latency metrics in histogram format
- Service information and logs
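The endpoint fallback described above (environment variable first, then the default collector address) can be sketched in Rust along these lines. The function name is illustrative, not the service's actual code:

```rust
use std::env;

/// Resolve the OTLP endpoint: prefer OTEL_EXPORTER_OTLP_ENDPOINT if set,
/// otherwise fall back to the collector's default gRPC address.
fn otlp_endpoint() -> String {
    env::var("OTEL_EXPORTER_OTLP_ENDPOINT")
        .unwrap_or_else(|_| "http://localhost:4317".to_string())
}

fn main() {
    // With the variable unset this prints the default collector address.
    println!("exporting OTLP data to {}", otlp_endpoint());
}
```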
- Access Grafana at http://localhost:3000
- The datasources for Prometheus and Tempo are pre-configured
- Import dashboards or create your own to visualize:
  - Latency histograms for each transcription type
  - Service performance metrics
  - Distributed traces for troubleshooting
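The pre-configured datasources mentioned above are usually supplied through Grafana's provisioning mechanism. A hypothetical sketch, assuming the compose service names used elsewhere in this document (the actual provisioning files may differ):

```yaml
# Hypothetical provisioning/datasources/datasources.yml sketch.
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    url: http://prometheus:9090
    isDefault: true
  - name: Tempo
    type: tempo
    url: http://tempo:3200
```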
For latency histograms (p95 per service):

```promql
histogram_quantile(0.95, sum by(le, service) (rate(transcription_latency_bucket{latency_type="transcription"}[5m])))
```

For comparing services (median latency per service):

```promql
histogram_quantile(0.50, sum by(le, service) (rate(transcription_latency_bucket{latency_type="transcription"}[5m])))
```