
High-performance time-series database for Aerospace, Defense, and Industrial IoT. 18.6M records/sec. Satellite tracking, launch telemetry, ground stations, manufacturing, energy. DuckDB SQL + Parquet + Arrow. AGPL-3.0


Basekick-Labs/arc


Arc





The Problem

Aerospace, defense, and industrial systems generate massive telemetry at scale:

  • Aerospace & Defense: Satellite constellations, launch vehicles, ground stations, orbital tracking
  • Space Operations: 14K+ objects in orbit, TLE data, SGP4 propagation, conjunction analysis
  • Industrial IoT: Manufacturing telemetry, mining sensors, equipment monitoring
  • Energy & Utilities: Grid monitoring, smart meters, renewable output, pipeline sensors
  • Transportation: Racing telemetry, fleet tracking, logistics optimization
  • Healthcare: Patient monitoring, medical devices, clinical studies
  • Observability: Metrics, logs, traces from distributed systems

Traditional time-series databases weren't built for aerospace workloads:

  • ITAR compliance requires self-hosted infrastructure
  • Mission-critical systems can't risk vendor lock-in
  • Burst ingestion during satellite passes (10M+ metrics/sec → silence → burst)
  • Multi-decade retention for space missions
  • Sub-second queries for real-time decision making

Arc solves this: 18.6M records/sec ingestion, sub-second queries on billions of rows, portable Parquet files you own, ITAR-ready self-hosted deployment.

-- Track satellite orbital elements over time
SELECT
  satellite_id,
  norad_id,
  epoch,
  inclination,
  eccentricity,
  mean_motion,
  LAG(mean_motion) OVER (PARTITION BY satellite_id ORDER BY epoch) as prev_mean_motion,
  mean_motion - LAG(mean_motion) OVER (PARTITION BY satellite_id ORDER BY epoch) as orbital_decay
FROM tle.satellites
WHERE satellite_id LIKE 'Starlink%'
  AND epoch > NOW() - INTERVAL '30 days'
ORDER BY satellite_id, epoch DESC;
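
For readers less familiar with window functions, the LAG step in the query above can be sketched in plain Python. This is only an illustration of the semantics, not how Arc or DuckDB executes it; the sample rows and the helper name `mean_motion_deltas` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows: (satellite_id, epoch, mean_motion in rev/day).
rows = [
    ("Starlink-1", "2026-03-01", 15.0600),
    ("Starlink-1", "2026-03-02", 15.0625),
    ("Starlink-2", "2026-03-01", 15.0500),
    ("Starlink-1", "2026-03-03", 15.0675),
]

def mean_motion_deltas(rows):
    """Emulate LAG(mean_motion) OVER (PARTITION BY satellite_id ORDER BY epoch)."""
    partitions = defaultdict(list)
    for satellite_id, epoch, mean_motion in rows:
        partitions[satellite_id].append((epoch, mean_motion))

    out = []
    for satellite_id, series in partitions.items():
        series.sort()  # ORDER BY epoch (ISO dates sort lexicographically)
        prev = None    # LAG yields NULL for the first row of each partition
        for epoch, mean_motion in series:
            # Rounded only to keep the printed floats tidy; the SQL does not round.
            delta = None if prev is None else round(mean_motion - prev, 4)
            out.append((satellite_id, epoch, prev, delta))
            prev = mean_motion
    return out

for row in mean_motion_deltas(rows):
    print(row)
```

The first row of each satellite has no predecessor, so both `prev_mean_motion` and `orbital_decay` come back as NULL (`None` here), which is worth handling in any downstream alerting.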

-- Analyze ground station contact windows
SELECT
  ground_station_id,
  satellite_id,
  MAX(signal_strength) as peak_signal,
  AVG(data_rate) as avg_throughput,
  SUM(bytes_received) as total_data
FROM telemetry.contacts
WHERE contact_start > NOW() - INTERVAL '24 hours'
GROUP BY ground_station_id, satellite_id
HAVING AVG(data_rate) > 1000000;  -- 1 Mbps minimum

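-- Industrial equipment monitoring
-- A window function cannot be combined with a plain aggregate and HAVING
-- unless there is a GROUP BY, so peak pressure is also computed as a
-- per-device window and filtered with QUALIFY.
SELECT
  device_id,
  facility_name,
  timestamp,
  AVG(temperature) OVER (
    PARTITION BY device_id
    ORDER BY timestamp
    ROWS BETWEEN 10 PRECEDING AND CURRENT ROW
  ) AS temp_moving_avg,
  MAX(pressure) OVER (PARTITION BY device_id) AS peak_pressure
FROM iot.sensors
WHERE timestamp > NOW() - INTERVAL '24 hours'
  AND facility_id IN ('plant_7', 'mining_site_42')
QUALIFY MAX(pressure) OVER (PARTITION BY device_id) > 850;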

Standard DuckDB SQL. Window functions, CTEs, joins. No proprietary query language.
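
As a rough illustration of what a frame such as `ROWS BETWEEN 10 PRECEDING AND CURRENT ROW` computes, here is a minimal Python sketch of a trailing moving average over one device's readings. The function name and sample values are hypothetical; the point is that the frame covers up to 11 rows (10 preceding plus the current one) and is shorter at the start of the partition.

```python
def moving_average(values, preceding=10):
    """AVG(x) OVER (ORDER BY t ROWS BETWEEN `preceding` PRECEDING AND CURRENT ROW)."""
    out = []
    for i in range(len(values)):
        # The frame holds at most preceding + 1 rows and shrinks near the start.
        window = values[max(0, i - preceding): i + 1]
        out.append(sum(window) / len(window))
    return out

# Hypothetical temperature readings for one device, already ordered by time.
temps = [20.0, 22.0, 24.0, 26.0]
print(moving_average(temps, preceding=2))  # [20.0, 21.0, 22.0, 24.0]
```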


Live Demo

See Arc tracking 14,273 satellites in real-time: 🛰️ https://basekick.net/demos/satellite-tracking


Performance

Benchmarked on Apple MacBook Pro M3 Max (14 cores, 36GB RAM, 1TB NVMe). Test config: 12 concurrent workers, 1000-record batches, IoT sensor data.

Ingestion

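| Protocol             | Throughput  | p50 Latency | p99 Latency |
|----------------------|-------------|-------------|-------------|
| MessagePack Columnar | 18.6M rec/s | 0.46ms      | 3.68ms      |
| MessagePack + Zstd   | 16.8M rec/s | 0.55ms      | 3.23ms      |
| MessagePack + GZIP   | 15.4M rec/s | 0.63ms      | 3.17ms      |
| Line Protocol        | 3.7M rec/s  | 2.63ms      | 10.63ms     |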
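
To give a feel for why a columnar layout suits batched ingestion, the sketch below pivots row-style records into one array per column and splits them into 1000-record batches, matching the benchmark's batch size. The payload key names (`m`, `columns`), the field names, and both helper functions are assumptions for illustration; check the Arc docs for the actual wire format. JSON is used only to compare the two layouts, since MessagePack encoding in Python needs a third-party library.

```python
import json

def rows_to_columnar(measurement, rows):
    """Pivot a list of row dicts into one array per column."""
    columns = {}
    for i, row in enumerate(rows):
        for key, value in row.items():
            # A column first seen at row i is backfilled with None for earlier rows.
            columns.setdefault(key, [None] * i).append(value)
        for key in columns:
            if key not in row:
                columns[key].append(None)  # keep all arrays the same length
    return {"m": measurement, "columns": columns}  # key names are assumptions

def batches(rows, size=1000):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]

# Hypothetical IoT readings; field names are illustrative only.
readings = [
    {"time": 1_700_000_000_000 + i, "device_id": f"dev-{i % 12}", "temperature": 20.0 + i % 5}
    for i in range(2500)
]
payloads = [rows_to_columnar("sensors", batch) for batch in batches(readings)]
print(len(payloads), [len(p["columns"]["time"]) for p in payloads])  # 3 [1000, 1000, 500]

# Column names appear once per batch instead of once per record.
row_bytes = len(json.dumps(readings[:1000]))
col_bytes = len(json.dumps(payloads[0]))
print(col_bytes < row_bytes)  # True
```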

Compaction

Automatic background compaction merges small Parquet files into optimized larger files:

| Metric | Before | After | Reduction |
|--------|--------|-------|-----------|
| Files  | 43     | 1     | 97.7%     |
| Size   | 372 MB | 36 MB | 90.4%     |

Benefits:

  • Roughly 10x storage reduction (372 MB to 36 MB in the run above) via better compression and encoding
  • Faster queries - scan 1 file vs 43 files
  • Lower cloud costs - less storage, fewer API calls
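
The reduction figures follow from `1 - after / before`. A small sketch to reproduce them from the table values (helper names are illustrative):

```python
def reduction_pct(before, after):
    """Percentage saved when going from `before` to `after`."""
    return round((1 - after / before) * 100, 1)

def ratio(before, after):
    """How many times smaller `after` is than `before`."""
    return round(before / after, 1)

print(reduction_pct(43, 1))    # file count: 97.7
print(reduction_pct(372, 36))  # size in MB: 90.3
print(ratio(372, 36))          # 10.3
```

With the rounded sizes shown in the table the size reduction comes out at about 90.3%; the published 90.4% presumably reflects unrounded byte counts.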

Query (March 2026)

Arrow IPC format provides up to 2.75x the throughput of JSON for large result sets, with the gap widening as the result size grows:

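| Query                     | Arrow (ms) | JSON (ms) | Speedup |
|---------------------------|------------|-----------|---------|
| COUNT(*) - 1.88B rows     | 1.9        | 1.8       | 0.95x   |
| SELECT LIMIT 10K          | 70         | 75        | 1.07x   |
| SELECT LIMIT 100K         | 88         | 106       | 1.20x   |
| SELECT LIMIT 500K         | 127        | 253       | 1.99x   |
| SELECT LIMIT 1M           | 159        | 438       | 2.75x   |
| Time Range (7d) LIMIT 10K | 45         | 51        | 1.13x   |
| Time Bucket (1h, 7d)      | 986        | 1089      | 1.10x   |
| Date Trunc (day, 30d)     | 2013       | 2190      | 1.09x   |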

Best throughput:

  • Arrow: 6.29M rows/sec (1M row SELECT)
  • JSON: 2.28M rows/sec (1M row SELECT)
  • COUNT(*): ~1.0T rows/sec (1.88B rows, 1.8ms)
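
The speedup and throughput numbers can be derived from the latencies in the table: speedup is JSON time divided by Arrow time, and throughput is rows divided by elapsed seconds. A minimal sketch, with illustrative function names:

```python
def rows_per_sec(rows, latency_ms):
    """Throughput implied by returning `rows` rows in `latency_ms` milliseconds."""
    return rows / (latency_ms / 1000)

def speedup(json_ms, arrow_ms):
    """Values above 1 mean Arrow was faster than JSON."""
    return json_ms / arrow_ms

print(round(rows_per_sec(1_000_000, 159) / 1e6, 2))  # Arrow, 1M rows: 6.29
print(round(rows_per_sec(1_000_000, 438) / 1e6, 2))  # JSON, 1M rows: 2.28
print(round(speedup(438, 159), 2))                   # 2.75
print(round(speedup(1.8, 1.9), 2))                   # COUNT(*): 0.95
```

For tiny results such as COUNT(*), serialization is a negligible share of the request, which is consistent with the two formats being effectively tied there.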

Why Go

  • Stable memory: Go's GC returns memory to OS. No leaks.
  • Single binary: Deploy one executable. No dependencies.
  • Native concurrency: Goroutines handle thousands of connections efficiently.
  • Production GC: Sub-millisecond pause times at scale.

Quick Start

# Build
make build

# Run
./arc

# Verify
curl http://localhost:8000/health
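
The same check can be scripted, for example to wait for Arc to come up in a deployment or CI job. This is a minimal sketch using only the Python standard library; it assumes nothing about the response body and only treats HTTP 200 from `/health` as healthy. The function names, retry count, and delay are illustrative.

```python
import time
import urllib.request

def is_healthy(base_url="http://localhost:8000", timeout=2.0):
    """Return True if GET {base_url}/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, refused connections, and timeouts all derive from OSError.
        return False

def wait_for_arc(base_url="http://localhost:8000", attempts=10, delay=0.5):
    """Poll the health endpoint until it succeeds or attempts run out."""
    for attempt in range(attempts):
        if is_healthy(base_url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_for_arc()` right after starting `./arc` returns `True` once the server accepts requests, or `False` after the attempts are exhausted.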

Installation

Docker

docker run -d \
  -p 8000:8000 \
  -v arc-data:/app/data \
  ghcr.io/basekick-labs/arc:latest

Debian/Ubuntu

wget https://github.com/basekick-labs/arc/releases/download/v26.03.1/arc_26.03.1_amd64.deb
sudo dpkg -i arc_26.03.1_amd64.deb
sudo systemctl enable arc && sudo systemctl start arc

RHEL/Fedora

wget https://github.com/basekick-labs/arc/releases/download/v26.03.1/arc-26.03.1-1.x86_64.rpm
sudo rpm -i arc-26.03.1-1.x86_64.rpm
sudo systemctl enable arc && sudo systemctl start arc

Kubernetes (Helm)

helm install arc https://github.com/basekick-labs/arc/releases/download/v26.03.1/arc-26.03.1.tgz

Build from Source

# Prerequisites: Go 1.26+

# Clone and build
git clone https://github.com/basekick-labs/arc.git
cd arc
make build

# Or build directly with Go (the duckdb_arrow tag is required)
go build -tags=duckdb_arrow ./cmd/arc

# Run
./arc

Ecosystem & Integrations

| Tool                     | Description                                            | Link        |
|--------------------------|--------------------------------------------------------|-------------|
| VS Code Extension        | Browse databases, run queries, visualize results       | Marketplace |
| Grafana Data Source      | Native Grafana plugin for dashboards and alerting      | GitHub      |
| Telegraf Output Plugin   | Ship metrics from 300+ Telegraf inputs directly to Arc | Docs        |
| Python SDK               | Query and ingest from Python applications              | PyPI        |
| Superset Dialect (JSON)  | Apache Superset connector using JSON transport         | GitHub      |
| Superset Dialect (Arrow) | Apache Superset connector using Arrow transport        | GitHub      |

Features

  • Ingestion: MessagePack columnar (fastest), InfluxDB Line Protocol
  • Query: DuckDB SQL engine, JSON and Apache Arrow IPC responses
  • Storage: Local filesystem, S3, MinIO
  • Auth: Token-based authentication with in-memory caching
  • Durability: Optional write-ahead log (WAL)
  • Compaction: Tiered (hourly/daily) automatic file merging
  • Data Management: Retention policies, continuous queries, GDPR-compliant delete
  • Observability: Prometheus metrics, structured logging, graceful shutdown
  • Reliability: Circuit breakers, retry with exponential backoff

Configuration

Arc uses TOML configuration with environment variable overrides.

[server]
host = "0.0.0.0"
port = 8000

[storage]
backend = "local"        # local, s3, minio
local_path = "./data/arc"

[ingest]
flush_interval = "5s"
max_buffer_size = 50000

[auth]
enabled = true

Environment variables use ARC_ prefix:

export ARC_SERVER_PORT=8000
export ARC_STORAGE_BACKEND=s3
export ARC_AUTH_ENABLED=true

See arc.toml for complete configuration reference.
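
To illustrate how the overrides relate to the TOML file, the sketch below assumes the pattern suggested by the examples above: `ARC_<SECTION>_<KEY>` maps to `key` under `[section]`, and the value is cast to the type of the setting it replaces. This is an inference from those three examples, not Arc's actual loader, and the function names are hypothetical. The defaults are written as a Python dict mirroring part of the TOML shown earlier.

```python
def _coerce(raw, current):
    """Cast an environment string to the type of the value it overrides."""
    if isinstance(current, bool):  # check bool before int: bool is an int subclass
        return raw.strip().lower() in ("1", "true", "yes")
    if isinstance(current, int):
        return int(raw)
    return raw

def apply_env_overrides(config, environ, prefix="ARC_"):
    """Apply ARC_<SECTION>_<KEY> variables onto a nested config dict (assumed mapping)."""
    for name, raw in environ.items():
        if not name.startswith(prefix):
            continue
        # Split on the first underscore only, so keys like local_path stay intact.
        section, _, key = name[len(prefix):].lower().partition("_")
        if section in config and key in config[section]:
            config[section][key] = _coerce(raw, config[section][key])
    return config

# Defaults mirroring part of the TOML example above.
defaults = {
    "server": {"host": "0.0.0.0", "port": 8000},
    "storage": {"backend": "local", "local_path": "./data/arc"},
    "auth": {"enabled": True},
}
config = apply_env_overrides(
    defaults,
    {"ARC_SERVER_PORT": "9000", "ARC_STORAGE_BACKEND": "s3", "ARC_AUTH_ENABLED": "false"},
)
print(config["server"]["port"], config["storage"]["backend"], config["auth"]["enabled"])
# 9000 s3 False
```

In this sketch, variables that do not match an existing section and key are ignored rather than creating new settings; a real loader may behave differently.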


Project Structure

arc/
├── cmd/arc/              # Application entry point
├── internal/
│   ├── api/              # HTTP handlers (Fiber) — query, write, import, TLE, admin
│   ├── audit/            # Audit logging for API operations
│   ├── auth/             # Token authentication and RBAC
│   ├── backup/           # Backup and restore (data, metadata, config)
│   ├── circuitbreaker/   # Resilience patterns (retry, backoff)
│   ├── cluster/          # Raft consensus, node roles, WAL replication
│   ├── compaction/       # Tiered hourly/daily Parquet file merging
│   ├── config/           # TOML configuration with env var overrides
│   ├── database/         # DuckDB connection pool
│   ├── governance/       # Per-token query quotas and rate limiting
│   ├── ingest/           # MessagePack, Line Protocol, TLE, Arrow writer
│   ├── license/          # License validation and feature gating
│   ├── logger/           # Structured logging (zerolog)
│   ├── metrics/          # Prometheus metrics
│   ├── mqtt/             # MQTT subscriber — topic-to-measurement ingestion
│   ├── pruning/          # Query-time partition pruning
│   ├── query/            # Parallel partition executor
│   ├── queryregistry/    # Active/completed query tracking
│   ├── scheduler/        # Continuous queries and retention policies
│   ├── shutdown/         # Graceful shutdown coordinator
│   ├── sql/              # SQL parsing utilities
│   ├── storage/          # Local, S3, Azure backends
│   ├── telemetry/        # Usage telemetry
│   ├── tiering/          # Hot/cold storage lifecycle management
│   └── wal/              # Write-ahead log
├── pkg/models/           # Shared data structures (Record, ColumnarRecord)
├── benchmarks/           # Performance benchmarking suites
├── deploy/               # Docker Compose and Kubernetes configs
├── helm/                 # Helm charts
├── scripts/              # Utility scripts (analysis, backfill, debugging)
├── arc.toml              # Configuration file
├── Makefile              # Build commands
└── go.mod

Development

make deps           # Install dependencies
make build          # Build binary
make run            # Run without building
make test           # Run tests
make test-coverage  # Run tests with coverage
make bench          # Run benchmarks
make lint           # Run linter
make fmt            # Format code
make clean          # Clean build artifacts

License

Arc is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).

  • Free to use, modify, and distribute
  • If you modify Arc and run it as a service, you must share your changes under AGPL-3.0

For commercial licensing, contact: [email protected]


Contributors

Thanks to everyone who has contributed to Arc:

  • @schotime (Adam Schroder) - Data-time partitioning, compaction API triggers, UTC fixes
  • @khalid244 - S3 partition pruning improvements, multi-line SQL query support
