This repository was archived by the owner on Jan 2, 2026. It is now read-only.

feat: Observability instrumentation and distributed tracing#24

Merged
zircote merged 8 commits into main from issue-10-observability
Dec 26, 2025

zircote (Owner) commented Dec 26, 2025

Summary

Implements comprehensive observability instrumentation for git-notes-memory (Issue #10, SPEC-2025-12-25-001).

Phase 1: Core Infrastructure ✅

  • observability/ module with lazy imports for hook performance
  • ObservabilityConfig with environment-based configuration
  • Thread-safe MetricsCollector (counters, histograms, gauges)
  • SpanContext with contextvars-based trace propagation
  • SessionIdentifier with privacy-preserving SHA256 hashes
  • measure_duration decorator (sync/async support)
  • Prometheus text format exporter (no external deps)
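A thread-safe collector of the kind listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the module's code: the class name `SketchMetricsCollector` and the methods `increment`, `set_gauge`, `observe`, and `snapshot` are hypothetical.

```python
import threading
from collections import defaultdict


class SketchMetricsCollector:
    """Hypothetical stand-in for MetricsCollector; all names are illustrative."""

    def __init__(self):
        # One lock guards all three stores so every update is atomic.
        self._lock = threading.Lock()
        self._counters = defaultdict(float)
        self._gauges = {}
        self._histograms = defaultdict(list)

    def increment(self, name, amount=1.0):
        with self._lock:
            self._counters[name] += amount

    def set_gauge(self, name, value):
        with self._lock:
            self._gauges[name] = value

    def observe(self, name, value):
        with self._lock:
            self._histograms[name].append(value)

    def snapshot(self):
        # Return copies so callers never see a store mid-update.
        with self._lock:
            return {
                "counters": dict(self._counters),
                "gauges": dict(self._gauges),
                "histograms": {k: list(v) for k, v in self._histograms.items()},
            }


def hammer(collector, n_threads=8, per_thread=1000):
    """Increment one counter from several threads and return its final value."""

    def work():
        for _ in range(per_thread):
            collector.increment("captures_total")

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return collector.snapshot()["counters"]["captures_total"]
```

Because `+=` on a dict entry is a read-modify-write, the lock is what keeps concurrent increments from being lost; `hammer` exists only to exercise that property.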

Phase 2: Service Instrumentation ✅

  • Instrumented all core services (Capture, Recall, Embedding, Index, GitOps)
  • Added timed_hook_execution context manager for hook handlers
  • Fixed silent failure points with explicit logging and metrics
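A context manager like `timed_hook_execution` could look roughly like the sketch below. The `RECORDED` list is a hypothetical in-memory sink standing in for the real metrics collector, and the recorded field names are assumptions.

```python
import time
from contextlib import contextmanager

# Hypothetical sink; the plugin presumably records into its MetricsCollector.
RECORDED = []


@contextmanager
def timed_hook_execution_sketch(hook_name):
    """Time the wrapped block and record its outcome, even when it raises."""
    start = time.perf_counter()
    status = "success"
    try:
        yield
    except Exception:
        # Mark the failure, then re-raise so the error is not swallowed.
        status = "error"
        raise
    finally:
        duration_ms = (time.perf_counter() - start) * 1000.0
        RECORDED.append(
            {"hook": hook_name, "status": status, "duration_ms": duration_ms}
        )
```

Recording in `finally` while re-raising in `except` is one way to make a failure visible in metrics without hiding it from the caller, which matches the "explicit logging and metrics" goal stated above.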

Phase 3: Structured Logging ✅

  • StructuredLogger with JSON/text formatters
  • Migrated all 5 hook handlers + 7 support modules
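For illustration, a JSON formatter on top of the standard `logging` module might be sketched as below. The class and helper names, and the convention of passing structured fields through `extra={"fields": {...}}`, are assumptions for this example rather than the plugin's actual `StructuredLogger` API.

```python
import io
import json
import logging


class JsonFormatterSketch(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            # getMessage() applies %-style args, so legacy calls keep working.
            "message": record.getMessage(),
        }
        # Assumed convention: structured fields arrive via extra={"fields": {...}}.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def build_logger(name, stream):
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatterSketch())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger("sketch.hooks", buffer)
log.info("captured %s", "memory", extra={"fields": {"hook": "Stop"}})
```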

Phase 4: CLI & Export ✅

  • /memory:metrics command (--format=text|json|prometheus)
  • /memory:traces command (--operation, --status, --limit)
  • /memory:health --timing for latency percentiles

Phase 5: OpenTelemetry (Skipped)

  • Optional Tier 3 - deferred for future enhancement

Phase 6: Local Stack & Dashboards ✅

  • Docker Compose observability stack:
    • OpenTelemetry Collector (OTLP receivers, processors, exporters)
    • Prometheus (metrics storage, scrape config)
    • Tempo (distributed tracing)
    • Loki (log aggregation)
    • Grafana (visualization, pre-provisioned)
  • Pre-built Grafana dashboards:
    • memory-operations.json - Captures, searches, latency percentiles
    • hook-performance.json - Executions, timeouts, p95 latency by hook
  • Auto-provisioned datasources with trace-log-metric correlation

Code Review Fixes ✅

  • CRIT-001: Added exponential backoff with jitter for lock acquisition
  • HIGH-004: Added composite index on (namespace, timestamp)
  • HIGH-014: Created docs/observability.md documentation
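To illustrate the HIGH-004 fix, the sketch below creates a composite index and asks the query planner whether it is used. It assumes the index is stored in SQLite; the table, column, and index names are hypothetical.

```python
import sqlite3


def build_demo_index():
    """Create an in-memory table with a (namespace, timestamp DESC) index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE memories ("
        " id TEXT PRIMARY KEY,"
        " namespace TEXT NOT NULL,"
        " timestamp TEXT NOT NULL,"
        " summary TEXT)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_memories_ns_ts"
        " ON memories (namespace, timestamp DESC)"
    )
    return conn


def query_plan(conn):
    """Return the planner's description of a namespace-scoped, newest-first query."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN"
        " SELECT id FROM memories"
        " WHERE namespace = ? ORDER BY timestamp DESC LIMIT 10",
        ("decisions",),
    ).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " ".join(str(row[-1]) for row in rows)
```

With the equality column first and the sort column second, the same index can serve both the filter and the ordering, so no separate sort step is needed.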

Test Plan

  • All 1949 tests passing
  • 87.70% coverage (above 80% threshold)
  • Quality gates green (format, lint, typecheck, security)
  • Verified Docker Compose stack files are valid YAML/JSON

Files Changed

  • src/git_notes_memory/observability/ - New observability module (8 files)
  • src/git_notes_memory/*.py - Instrumented core services
  • src/git_notes_memory/hooks/*.py - Migrated to structured logging
  • commands/metrics.md, traces.md, health.md - New CLI commands
  • docker/ - Docker Compose observability stack (9 files)
  • docs/observability.md - Observability documentation
  • docs/code-review/ - Code review artifacts

🤖 Generated with Claude Code

zircote and others added 4 commits December 25, 2025 19:34

feat: initialize observability instrumentation implementation

- Create PROGRESS.md with 29 tasks across 6 phases
- Update README.md with started timestamp
- Spec approved and ready for implementation

Implements: SPEC-2025-12-25-001

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

feat(observability): implement Phase 1 core infrastructure

Complete observability foundation with:

Core Modules:
- config.py: ObservabilityConfig with env-based configuration
- metrics.py: Thread-safe MetricsCollector (counters, histograms, gauges)
- tracing.py: Span context with contextvars-based propagation
- session.py: SessionIdentifier with privacy-preserving hashes
- decorators.py: measure_duration decorator (sync/async support)
- logging.py: StructuredLogger with JSON/text formatters
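A decorator that supports both sync and async callables, as `measure_duration` is described above, can be sketched like this. The name `measure_duration_sketch` and the `DURATIONS` dict (a stand-in sink) are hypothetical.

```python
import asyncio
import functools
import inspect
import time

# Hypothetical sink mapping operation name to a list of durations in seconds.
DURATIONS = {}


def measure_duration_sketch(operation):
    """Record wall-clock duration for a sync or async callable."""

    def decorator(func):
        if inspect.iscoroutinefunction(func):

            @functools.wraps(func)
            async def async_wrapper(*args, **kwargs):
                start = time.perf_counter()
                try:
                    return await func(*args, **kwargs)
                finally:
                    elapsed = time.perf_counter() - start
                    DURATIONS.setdefault(operation, []).append(elapsed)

            return async_wrapper

        @functools.wraps(func)
        def sync_wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed = time.perf_counter() - start
                DURATIONS.setdefault(operation, []).append(elapsed)

        return sync_wrapper

    return decorator


@measure_duration_sketch("sync_op")
def add(a, b):
    return a + b


@measure_duration_sketch("async_op")
async def async_add(a, b):
    await asyncio.sleep(0)
    return a + b
```

The coroutine check happens once at decoration time, so the async wrapper awaits the real coroutine instead of timing only the creation of the coroutine object.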

Exporters:
- prometheus.py: Prometheus text format (no external deps)
- json_exporter.py: Full JSON export for metrics and traces

Key Features:
- Lazy imports via __getattr__ for hook performance (<5ms import)
- Thread-safe operations with threading.Lock
- Privacy-preserving SHA256 hashes (8 char truncated)
- Histogram buckets aligned with hook timeouts (2s, 5s, 15s, 30s)
- Rolling window for bounded memory in histograms
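The last two features can be pictured with a small sketch: fixed bucket bounds plus a bounded `deque` of raw samples for percentile estimates. The class name, the window size, and the use of only the four bounds named above are assumptions; the real module may define more buckets.

```python
import bisect
from collections import deque

# Bounds in seconds, taken from the hook timeouts listed above.
BUCKETS_S = (2.0, 5.0, 15.0, 30.0)


class RollingHistogramSketch:
    """Bucket counts plus a bounded window of recent samples."""

    def __init__(self, buckets=BUCKETS_S, max_samples=1000):
        self.buckets = tuple(sorted(buckets))
        # One slot per bound plus a final overflow (+Inf) slot.
        # Counts here are per bucket, not cumulative.
        self.bucket_counts = [0] * (len(self.buckets) + 1)
        # deque(maxlen=...) drops the oldest sample once full.
        self.samples = deque(maxlen=max_samples)

    def observe(self, seconds):
        # bisect_left places a value equal to a bound in that bound's bucket.
        self.bucket_counts[bisect.bisect_left(self.buckets, seconds)] += 1
        self.samples.append(seconds)

    def percentile(self, pct):
        """Nearest-rank style percentile over the retained window."""
        if not self.samples:
            return None
        ordered = sorted(self.samples)
        index = min(len(ordered) - 1, int(len(ordered) * pct / 100.0))
        return ordered[index]
```

The bucket counts grow without bound in value but not in size, while the sample window caps memory at `max_samples` entries; percentiles therefore describe only recent observations.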

Tests:
- 115 new tests across 8 test modules
- Coverage: 87.76% (above 80% threshold)
- All quality gates passing

Closes Phase 1 of SPEC-2025-12-25-001 (Observability Instrumentation)
Ref: #10

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

feat: Complete observability instrumentation (Phases 1-4)

Implement comprehensive observability for git-notes-memory plugin:

Phase 1 - Core Infrastructure:
- MetricsCollector with counters, histograms, gauges (thread-safe)
- SpanContext and trace_operation for distributed tracing
- StructuredLogger with JSON/text formatters
- SessionIdentifier for multi-tenant distinguishability
- measure_duration decorator and timed_context

Phase 2 - Service Instrumentation:
- Instrument CaptureService, RecallService, EmbeddingService
- Instrument IndexService, GitOps, Hook Handlers
- Fix silent failure points with explicit logging and metrics
- Add timed_hook_execution context manager

Phase 3 - Structured Logging:
- Add *args support to StructuredLogger for backwards compatibility
- Migrate all 5 hook handlers to get_logger()
- Migrate 7 hook support modules to structured logging

Phase 4 - CLI & Export:
- /memory:metrics command with text/json/prometheus formats
- /memory:traces command with filtering options
- /memory:health command with --timing for latency percentiles
- PrometheusExporter class for Prometheus text format

Phases 5-6 (OpenTelemetry, Docker stack) skipped as optional Tier 3.

Closes #10

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

fix: add exponential backoff and observability documentation

MAXALL deep-clean remediation:

- CRIT-001: Add exponential backoff with jitter to lock acquisition
  - Replaces fixed 100ms retry with 50ms→2s exponential backoff
  - Adds 0-10% jitter to prevent thundering herd
  - Reduces resource exhaustion risk under high concurrency

- HIGH-004: Add composite index on (namespace, timestamp DESC)
  - Improves range query performance for get_by_namespace()

- HIGH-014: Create docs/observability.md documentation
  - Metrics collection and Prometheus export
  - Tracing with span hierarchies
  - Structured logging configuration
  - Health checks and performance characteristics
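The CRIT-001 retry schedule (50ms doubling up to a 2s cap, plus 0-10% jitter) can be sketched as below. The function names, the `timeout` default, and the injectable `try_acquire`, `sleep`, and `clock` parameters are assumptions made so the example is self-contained; the actual lock mechanism in capture.py is not shown in this PR text.

```python
import random
import time


def backoff_delays(attempts, base=0.05, cap=2.0, jitter=0.10, rng=random.random):
    """Return the delay for each attempt: exponential growth, capped, jittered."""
    delays = []
    for attempt in range(attempts):
        delay = min(cap, base * (2 ** attempt))
        # rng() is in [0, 1), so this adds between 0% and 10% to the delay.
        delays.append(delay * (1.0 + jitter * rng()))
    return delays


def acquire_with_backoff(try_acquire, timeout=10.0, sleep=time.sleep,
                         clock=time.monotonic):
    """Call try_acquire() until it succeeds or the timeout would be exceeded."""
    deadline = clock() + timeout
    attempt = 0
    while True:
        if try_acquire():
            return True
        delay = backoff_delays(attempt + 1)[-1]
        if clock() + delay > deadline:
            return False
        sleep(delay)
        attempt += 1
```

The jitter spreads out processes that failed at the same instant, so they do not all retry together, which is the thundering-herd problem named above.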

Code review artifacts in docs/code-review/2025/12/25/:
- CODE_REVIEW.md: 90 findings from 9 specialist agents
- REVIEW_SUMMARY.md: Executive summary
- REMEDIATION_TASKS.md: Actionable checklist
- REMEDIATION_REPORT.md: Final status with false positives

All 1949 tests passing, 87.70% coverage.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@zircote zircote marked this pull request as ready for review December 26, 2025 02:40
Copilot AI review requested due to automatic review settings December 26, 2025 02:40

Copilot AI left a comment


Pull request overview

This PR implements comprehensive observability instrumentation for the git-notes-memory plugin, addressing Issue #10. The implementation includes metrics collection, distributed tracing, and structured logging with CLI commands for accessing telemetry data.

Key Changes:

  • New observability/ module with metrics, tracing, logging, and session identification
  • Instrumentation of 5 services (CaptureService, RecallService, EmbeddingService, IndexService, GitOps) and 5 hook handlers
  • Three new CLI commands: /memory:metrics, /memory:traces, /memory:health
  • Silent failure tracking with explicit logging instead of suppressed exceptions
  • Privacy-preserving session identification for multi-tenant distinguishability
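As a rough picture of the privacy-preserving identifier, the sketch below hashes a raw value with SHA256 and keeps the first 8 hex characters, matching the truncation mentioned in the Phase 1 commit message. Which raw values the plugin actually hashes is not stated here, so the inputs in the usage lines are hypothetical.

```python
import hashlib


def short_hash(raw, length=8):
    """Return a truncated SHA256 hex digest so the raw value is never exported."""
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:length]


# Hypothetical inputs: two different repository paths.
id_a = short_hash("/home/alice/project")
id_b = short_hash("/home/bob/project")
```

Eight hex characters carry 32 bits, which is enough to tell a handful of sessions or tenants apart in dashboards but not enough to rule out collisions at large scale.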

Reviewed changes

Copilot reviewed 54 out of 54 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| `tests/test_observability/*.py` | Comprehensive test suite (273-299 lines each) covering metrics, tracing, logging, session, exporters, and decorators |
| `src/git_notes_memory/observability/*.py` | Core observability modules: config, metrics, tracing, session, logging, decorators, exporters |
| `src/git_notes_memory/capture.py` | Instrumented with metrics/tracing, exponential backoff with jitter for lock acquisition, silent failure tracking |
| `src/git_notes_memory/recall.py` | Added measure_duration decorators and trace operations for search paths |
| `src/git_notes_memory/index.py` | Composite index added (namespace, timestamp), silent failure tracking for index operations |
| `src/git_notes_memory/sync.py` | Silent failure logging for hash verification |
| `src/git_notes_memory/git_ops.py` | Git command metrics and tracing, silent failure tracking for fetch operations |
| `src/git_notes_memory/embedding.py` | Model load timing and embedding generation metrics |
| `src/git_notes_memory/hooks/*.py` | All hook handlers instrumented with timed_hook_execution, migrated to structured logging |
| `docs/spec/active/2025-12-25-observability-instrumentation/*.md` | Complete spec documentation: requirements, architecture, implementation plan, decisions, research, progress |
| `docs/observability.md` | User-facing documentation for metrics, tracing, and logging features |

zircote and others added 2 commits December 25, 2025 21:51

feat(observability): add Docker Compose local observability stack

Phase 6 implementation (tasks 6.1-6.2):
- docker-compose.yml with OTEL Collector, Prometheus, Tempo, Loki, Grafana
- OpenTelemetry Collector config with OTLP receivers and exporters
- Prometheus scrape configuration for OTEL metrics
- Tempo distributed tracing backend configuration
- Loki log aggregation configuration
- Grafana datasource provisioning (Prometheus, Tempo, Loki)
- Grafana dashboard provisioning with auto-discovery
- memory-operations.json dashboard (captures, searches, latency)
- hook-performance.json dashboard (executions, timeouts, p95 latency)

Exposed ports:
- 3000: Grafana UI
- 9090: Prometheus UI
- 4317/4318: OTEL gRPC/HTTP
- 3100: Loki
- 3200: Tempo

Usage: cd docker && docker-compose up -d

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

fix(security): divest inline Python from command files into scripts

Addresses PR #24 review comments:
- SECURITY: Moved inline Python code from commands/*.md to scripts/*.py
  to prevent command injection via $ARGUMENTS string interpolation
- scripts/metrics.py: Safe argument parsing via argparse/sys.argv
- scripts/traces.py: Safe argument parsing via argparse/sys.argv
- scripts/health.py: Safe argument parsing via argparse/sys.argv
- Updated commands/metrics.md, traces.md, health.md to call scripts
- Fixed exporters/__init__.py: explicit imports instead of lazy loading
  to resolve Copilot undefined export warnings

The previous implementation embedded $ARGUMENTS directly into Python
string literals, allowing code injection via crafted arguments.
The new scripts pass arguments safely via command line.
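To make the difference concrete, here is a minimal sketch of the safer pattern: arguments arrive as an argv list and are validated by `argparse`, so they are only ever treated as data. The function name is hypothetical; the `--format` choices are the ones listed for `/memory:metrics` in the PR description.

```python
import argparse


def parse_metrics_args(argv):
    """Parse arguments for a metrics script without evaluating any of them."""
    parser = argparse.ArgumentParser(prog="metrics.py")
    parser.add_argument(
        "--format",
        choices=("text", "json", "prometheus"),
        default="text",
        help="output format",
    )
    return parser.parse_args(argv)


# A script would normally call parse_metrics_args(sys.argv[1:]).
args = parse_metrics_args(["--format=json"])

# A crafted value is rejected by the choices check instead of being executed.
crafted = ['--format=json"); import os; os.system("id"); ("']
try:
    parse_metrics_args(crafted)
    rejected = False
except SystemExit:
    rejected = True
```

By contrast, pasting the same crafted text into a Python string literal inside inline source would close the literal early and run the remainder as code, which is the injection path the commit describes.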

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Copilot AI review requested due to automatic review settings December 26, 2025 02:59

Copilot AI left a comment


Pull request overview

Copilot reviewed 66 out of 66 changed files in this pull request and generated no new comments.

zircote and others added 2 commits December 25, 2025 22:17

feat(observability): implement OTLP HTTP exporter for telemetry push

Completes Phase 5 of the observability spec by adding OTLP HTTP export
capability so telemetry can be pushed to OpenTelemetry Collector and the
Docker Compose observability stack.

Changes:
- New `exporters/otlp.py` with OTLPExporter class using stdlib only
  - Converts internal Span/metrics to OTLP JSON format
  - Pushes to {endpoint}/v1/traces and {endpoint}/v1/metrics
  - Configurable via MEMORY_PLUGIN_OTLP_ENDPOINT env var
- Updated Stop handler to flush telemetry at session end
- Updated observability.md with OTLP configuration section
- Added comprehensive tests for OTLP exporter
- pyproject.toml: added S310 per-file-ignore for OTLP file

The exporter uses urllib.request (stdlib) to avoid adding dependencies.
When MEMORY_PLUGIN_OTLP_ENDPOINT is set, traces and metrics are
automatically exported when the Claude Code session ends.
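A stdlib-only push of this kind might be sketched as below. The helper names, the internal span dict layout, and the scope and service names are assumptions; the JSON envelope follows the general OTLP/HTTP shape (`resourceSpans` containing `scopeSpans` containing `spans`) and is not a copy of `exporters/otlp.py`.

```python
import json
import os
import urllib.request


def endpoint_from_env():
    """Return the configured collector endpoint, or None when export is off."""
    return os.environ.get("MEMORY_PLUGIN_OTLP_ENDPOINT")


def build_trace_request(endpoint, service_name, spans):
    """Build (but do not send) a POST request carrying spans as OTLP JSON."""
    otlp_spans = [
        {
            "traceId": span["trace_id"],   # 32 hex characters
            "spanId": span["span_id"],     # 16 hex characters
            "name": span["name"],
            "kind": 1,                     # SPAN_KIND_INTERNAL
            # 64-bit integers are carried as strings in the JSON encoding.
            "startTimeUnixNano": str(span["start_ns"]),
            "endTimeUnixNano": str(span["end_ns"]),
        }
        for span in spans
    ]
    payload = {
        "resourceSpans": [
            {
                "resource": {
                    "attributes": [
                        {"key": "service.name",
                         "value": {"stringValue": service_name}}
                    ]
                },
                "scopeSpans": [{"scope": {"name": "sketch"}, "spans": otlp_spans}],
            }
        ]
    }
    return urllib.request.Request(
        endpoint.rstrip("/") + "/v1/traces",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=5.0):
    """Send the request; returns the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


demo_request = build_trace_request(
    "http://localhost:4318/",
    "git-notes-memory",
    [{"trace_id": "0" * 31 + "1", "span_id": "0" * 15 + "1",
      "name": "capture", "start_ns": 1, "end_ns": 2}],
)
```

Port 4318 is the OTEL HTTP port exposed by the Docker Compose stack described earlier; building the request separately from sending it keeps the conversion logic testable without a running collector.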

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Merged origin/main which includes:
- Secrets filtering with automatic detection and redaction
- PII detection and redaction patterns
- Security allowlist management
- Audit logging for security events
- New `/memory:scan-secrets`, `/memory:audit-log` commands

Conflict resolution in capture.py:
- Preserved observability trace_operation context managers
- Integrated filter_warnings from secrets filtering into warnings list
- Combined warnings using "; ".join() for multiple warning sources

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Copilot AI review requested due to automatic review settings December 26, 2025 03:28

Copilot AI left a comment


Pull request overview

Copilot reviewed 69 out of 69 changed files in this pull request and generated no new comments.

@zircote zircote merged commit 1d2e988 into main Dec 26, 2025
13 checks passed
zircote added a commit that referenced this pull request Dec 26, 2025
* feat: initialize observability instrumentation implementation

- Create PROGRESS.md with 29 tasks across 6 phases
- Update README.md with started timestamp
- Spec approved and ready for implementation

Implements: SPEC-2025-12-25-001

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* feat(observability): implement Phase 1 core infrastructure

Complete observability foundation with:

Core Modules:
- config.py: ObservabilityConfig with env-based configuration
- metrics.py: Thread-safe MetricsCollector (counters, histograms, gauges)
- tracing.py: Span context with contextvars-based propagation
- session.py: SessionIdentifier with privacy-preserving hashes
- decorators.py: measure_duration decorator (sync/async support)
- logging.py: StructuredLogger with JSON/text formatters

Exporters:
- prometheus.py: Prometheus text format (no external deps)
- json_exporter.py: Full JSON export for metrics and traces

Key Features:
- Lazy imports via __getattr__ for hook performance (<5ms import)
- Thread-safe operations with threading.Lock
- Privacy-preserving SHA256 hashes (8 char truncated)
- Histogram buckets aligned with hook timeouts (2s, 5s, 15s, 30s)
- Rolling window for bounded memory in histograms

Tests:
- 115 new tests across 8 test modules
- Coverage: 87.76% (above 80% threshold)
- All quality gates passing

Closes Phase 1 of SPEC-2025-12-25-001 (Observability Instrumentation)
Ref: #10

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* feat: Complete observability instrumentation (Phases 1-4)

Implement comprehensive observability for git-notes-memory plugin:

Phase 1 - Core Infrastructure:
- MetricsCollector with counters, histograms, gauges (thread-safe)
- SpanContext and trace_operation for distributed tracing
- StructuredLogger with JSON/text formatters
- SessionIdentifier for multi-tenant distinguishability
- measure_duration decorator and timed_context

Phase 2 - Service Instrumentation:
- Instrument CaptureService, RecallService, EmbeddingService
- Instrument IndexService, GitOps, Hook Handlers
- Fix silent failure points with explicit logging and metrics
- Add timed_hook_execution context manager

Phase 3 - Structured Logging:
- Add *args support to StructuredLogger for backwards compatibility
- Migrate all 5 hook handlers to get_logger()
- Migrate 7 hook support modules to structured logging

Phase 4 - CLI & Export:
- /memory:metrics command with text/json/prometheus formats
- /memory:traces command with filtering options
- /memory:health command with --timing for latency percentiles
- PrometheusExporter class for Prometheus text format

Phases 5-6 (OpenTelemetry, Docker stack) skipped as optional Tier 3.

Closes #10

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* fix: add exponential backoff and observability documentation

MAXALL deep-clean remediation:

- CRIT-001: Add exponential backoff with jitter to lock acquisition
  - Replaces fixed 100ms retry with 50ms→2s exponential backoff
  - Adds 0-10% jitter to prevent thundering herd
  - Reduces resource exhaustion risk under high concurrency

- HIGH-004: Add composite index on (namespace, timestamp DESC)
  - Improves range query performance for get_by_namespace()

- HIGH-014: Create docs/observability.md documentation
  - Metrics collection and Prometheus export
  - Tracing with span hierarchies
  - Structured logging configuration
  - Health checks and performance characteristics

Code review artifacts in docs/code-review/2025/12/25/:
- CODE_REVIEW.md: 90 findings from 9 specialist agents
- REVIEW_SUMMARY.md: Executive summary
- REMEDIATION_TASKS.md: Actionable checklist
- REMEDIATION_REPORT.md: Final status with false positives

All 1949 tests passing, 87.70% coverage.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* feat(observability): add Docker Compose local observability stack

Phase 6 implementation (tasks 6.1-6.2):
- docker-compose.yml with OTEL Collector, Prometheus, Tempo, Loki, Grafana
- OpenTelemetry Collector config with OTLP receivers and exporters
- Prometheus scrape configuration for OTEL metrics
- Tempo distributed tracing backend configuration
- Loki log aggregation configuration
- Grafana datasource provisioning (Prometheus, Tempo, Loki)
- Grafana dashboard provisioning with auto-discovery
- memory-operations.json dashboard (captures, searches, latency)
- hook-performance.json dashboard (executions, timeouts, p95 latency)

Exposed ports:
- 3000: Grafana UI
- 9090: Prometheus UI
- 4317/4318: OTEL gRPC/HTTP
- 3100: Loki
- 3200: Tempo

Usage: cd docker && docker-compose up -d

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* fix(security): divest inline Python from command files into scripts

Addresses PR #24 review comments:
- SECURITY: Moved inline Python code from commands/*.md to scripts/*.py
  to prevent command injection via $ARGUMENTS string interpolation
- scripts/metrics.py: Safe argument parsing via argparse/sys.argv
- scripts/traces.py: Safe argument parsing via argparse/sys.argv
- scripts/health.py: Safe argument parsing via argparse/sys.argv
- Updated commands/metrics.md, traces.md, health.md to call scripts
- Fixed exporters/__init__.py: explicit imports instead of lazy loading
  to resolve Copilot undefined export warnings

The previous implementation embedded $ARGUMENTS directly into Python
string literals, allowing code injection via crafted arguments.
The new scripts pass arguments safely via command line.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* feat(observability): implement OTLP HTTP exporter for telemetry push

Completes Phase 5 of the observability spec by adding OTLP HTTP export
capability so telemetry can be pushed to OpenTelemetry Collector and the
Docker Compose observability stack.

Changes:
- New `exporters/otlp.py` with OTLPExporter class using stdlib only
  - Converts internal Span/metrics to OTLP JSON format
  - Pushes to {endpoint}/v1/traces and {endpoint}/v1/metrics
  - Configurable via MEMORY_PLUGIN_OTLP_ENDPOINT env var
- Updated Stop handler to flush telemetry at session end
- Updated observability.md with OTLP configuration section
- Added comprehensive tests for OTLP exporter
- pyproject.toml: added S310 per-file-ignore for OTLP file

The exporter uses urllib.request (stdlib) to avoid adding dependencies.
When MEMORY_PLUGIN_OTLP_ENDPOINT is set, traces and metrics are
automatically exported when the Claude Code session ends.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

---------

Co-authored-by: Claude Opus 4.5 <[email protected]>
zircote added a commit that referenced this pull request Dec 26, 2025