Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Orchestrate mission-cluster to command-cluster data flows with OpenTelemetry-powered visibility, multi-pipeline flexibility, and resilient modes for retries, backfills, and dry runs—ready to drop into your Kubernetes CronJobs.

Notifications You must be signed in to change notification settings

sohooo/sling-sync-wrapper

Repository files navigation

Sling Sync Wrapper (with OpenTelemetry)

Orchestrate mission-cluster to command-cluster data flows with OpenTelemetry-powered visibility, multi-pipeline flexibility, and resilient modes for retries, backfills, and dry runs—ready to drop into your Kubernetes CronJobs.

Overview

This project wraps the Sling CLI to:

  • Run data sync jobs from mission clusters to a central command cluster.
  • Emit OpenTelemetry traces and logs for rich observability.
  • Output structured JSON logs for context-aware debugging.
  • Support multiple pipelines, retry logic, backfill mode, and noop (dry-run) mode.
  • Integrate easily into Kubernetes as a CronJob.

The collected telemetry is stored in GreptimeDB and visualized via Grafana.

Features

OpenTelemetry Tracing & Logging; Each sync run is traced:

- sync_job_id, rows_synced, duration_seconds, and status.
- Logs captured as span events.
- Retry Logic; Retries failed syncs using exponential backoff (SYNC_MAX_RETRIES, SYNC_BACKOFF_BASE). Configurable Sling CLI timeout (SLING_TIMEOUT).

Multi-Pipeline Support

  • Mount multiple pipeline YAML files and run them sequentially.

Modes

  • noop: validate pipelines and environment, but don’t execute sync.
  • backfill: clear state and perform a full historical sync.

Drill-Down Links in Grafana

  • Jump from traces → logs and logs → traces for rapid troubleshooting.

Architecture

┌────────────────────┐        ┌────────────────────────────┐
│   Sling Wrapper    │        │  OpenTelemetry Collector   │
│  (CronJob, Go)     │        │ (Deployment, OTLP->GreptimeDB)│
└─────────┬──────────┘        └─────────────┬──────────────┘
          │ OpenTelemetry (Traces & Logs)   │
          v                                 v
    ┌──────────────┐                 ┌───────────────┐
    │  GreptimeDB  │ <── (Optional) ─│ Sling State DB│
    └──────────────┘                 └───────────────┘
          │
          v
     ┌─────────┐
     │ Grafana │
     └─────────┘

QUICKSTART

The make quickstart target runs the standalone program in cmd/quickstart. It creates two sample SQLite mission databases, a DuckDB command database and performs a sync between them using direct SQL statements. This example is self-contained and does not invoke the Sling CLI or the wrapper binary. It is meant to generate sample data and pipeline files for exploration.

Run everything with one command:

make quickstart

After the command finishes you will find the databases and pipelines under the quickstart/ directory and a command.db file populated with telemetry from both missions. Inspect the YAML files in quickstart/pipelines/ to see how a minimal SLING_CONFIG looks. From here you can adjust the pipelines or point the wrapper at your own databases.

Quickstart (Local)

This flow exercises the wrapper itself. The wrapper invokes the Sling CLI to run pipeline files, so the CLI must be installed locally.

Prerequisites

  • Go 1.22+
  • Sling CLI installed and on your PATH (run make install-sling-cli or download from Sling releases)
  • GreptimeDB and Grafana (optional for observability)

Build

go build -o sling-sync-wrapper ./cmd/wrapper

Run Example

./sling-sync-wrapper run \
  --mission-cluster-id mission-01 \
  --config ./pipeline.yaml \
  --state file://./sling_state.json \
  --sling-timeout 30m \
  --otel-endpoint localhost:4317

You can also rely solely on environment variables, which is helpful in Kubernetes deployments:

MISSION_CLUSTER_ID=mission-01 \
SLING_CONFIG=./pipeline.yaml \
SLING_STATE=file://./sling_state.json \
OTEL_EXPORTER_OTLP_ENDPOINT=localhost:4317 \
./sling-sync-wrapper run

Flags override environment variables, which remain available for compatibility. The wrapper automatically generates a unique SYNC_JOB_ID for each run.

Subcommands

The wrapper exposes the following subcommands:

  • run: execute configured pipelines (default mode)
  • noop: dry-run without invoking Sling
  • backfill: reset state and exit
# noop
./sling-sync-wrapper noop --config ./pipeline.yaml

# backfill
./sling-sync-wrapper backfill --config ./pipeline.yaml

Environment Variables

The wrapper is configured using the following environment variables:

Variable Default Required? Description
MISSION_CLUSTER_ID unknown-cluster Yes Source cluster identifier used in telemetry.
SLING_CONFIG Yes* Path to a single pipeline file. Required if PIPELINE_DIR is not set.
PIPELINE_DIR /etc/sling/pipelines Yes* Directory containing one or more pipeline files. Required if SLING_CONFIG is not set.
SLING_STATE file://./sling_state.json No Path or URL where sync state is stored.
OTEL_EXPORTER_OTLP_ENDPOINT otel-collector:4317 No OpenTelemetry Collector endpoint for traces and logs.
SYNC_MODE normal No Sync mode: normal (incremental), noop, or backfill.
SYNC_MAX_RETRIES 3 No Number of times to retry a failed pipeline run.
SYNC_BACKOFF_BASE 5s No Base duration for exponential backoff between retries.
SLING_BIN sling No Path to the Sling CLI binary.
SLING_TIMEOUT 30m No Maximum duration for a single Sling CLI invocation.

* Either SLING_CONFIG or PIPELINE_DIR must be set.

Kubernetes Deployment

Install with Helm

helm install sling-sync-wrapper ./helm/sling-sync-wrapper

This will deploy:

  • OpenTelemetry Collector (Deployment + Service) and its ConfigMap.
  • Sling Sync Job (CronJob) running every 5 minutes.
  • Pipeline ConfigMap for one or more pipeline YAMLs.

Configure the CronJob using the environment variables described in the Environment Variables section.

Grafana Dashboard

A pre-built Grafana dashboard is provided:

  • Panels:
    • Job counts and status breakdown.
    • Sync duration and row trends.
    • Recent jobs and logs with drill-down links.
  • Links:
    • Trace → Logs.
    • Logs → Trace Explorer.

Import Dashboard

  • Open Grafana → Dashboards → Import.
  • Paste the provided JSON.
  • Set:
    • YOUR_TRACE_DATASOURCE_UID → GreptimeDB traces.
    • YOUR_LOGS_DATASOURCE_UID → GreptimeDB logs.

Developer Quickstart

Run the core checks and build the Docker image with:

make check  # run fmt, vet, tidy, and test
make docker # build the Docker image

Development

Maintenance

Common development tasks are available via the Makefile:

make check # run fmt, vet, tidy, and unit tests
make fmt   # format Go code
make vet   # run go vet
make tidy  # tidy module dependencies
make test  # run unit tests

Run locally with multiple pipelines

PIPELINE_DIR=./pipelines \
MISSION_CLUSTER_ID=local \
./sling-sync-wrapper

Retry Testing

SYNC_MAX_RETRIES=5 SYNC_BACKOFF_BASE=2s ./sling-sync-wrapper

Observability

  • Traces:
    • One trace per sync job.
    • Span attributes include job metadata.
  • Logs:
    • Each Sling log message attached as a span event.
    • Metrics (optional future work):
    • Could export Prometheus metrics for job success/failure counters.

Next Steps / Enhancements

  • Add Prometheus metrics for sync jobs.
  • Add schema/state validation pre-run.
  • Add alerting rules for repeated failures.

About

Orchestrate mission-cluster to command-cluster data flows with OpenTelemetry-powered visibility, multi-pipeline flexibility, and resilient modes for retries, backfills, and dry runs—ready to drop into your Kubernetes CronJobs.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Languages