LogLens is a high-performance, scalable log analytics engine built in Go. It enables real-time ingestion, search, and analysis of structured logs at scale.
With support for high-throughput ingestion, efficient indexing using Bleve, batched writes, write-ahead logging (WAL), compression, and time-based retention, LogLens is designed to be the backbone of modern observability pipelines.
- ✅ **High Throughput Ingestion**: Handle thousands of logs per second with minimal overhead
- ✅ **Real-Time Search**: Instant querying using the Bleve full-text search index
- ✅ **Batching & Compression**: Efficient disk writes with Zstandard compression
- ✅ **Write-Ahead Logging (WAL)**: Ensures durability and crash recovery
- ✅ **Time-Based Retention**: Auto-delete logs beyond a configured threshold
- ✅ **REST API**: Simple HTTP endpoints for ingestion and querying
- ✅ **Performance Monitoring**: Built-in stats and metrics endpoint
- ✅ **Load Testing Ready**: Comes with Vegeta scripts for benchmarking
*Diagram generated by GoTypeGraph.*
The system processes logs through several stages:
- Ingestion
- Buffering
- Indexing
- Storage
- Search
- Retention
- Accepts JSON-formatted logs via HTTP POST requests.
- Tags logs using custom headers (`KV-environment`, `KV-level`, etc.).
- Sends logs asynchronously to an internal channel for processing (see the sketch below).
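For illustration, here is a minimal sketch of what such a handler could look like. The `LogEntry` type, the `entryChan` buffer size, and the handler itself are hypothetical stand-ins for LogLens internals, not the actual code:

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"
	"strings"
	"time"
)

// LogEntry is a hypothetical stand-in for the internal log record type.
type LogEntry struct {
	Fields   map[string]any
	Tags     map[string]string
	Ingested time.Time
}

var entryChan = make(chan LogEntry, 1024) // illustrative buffer size

func handleIngest(w http.ResponseWriter, r *http.Request) {
	var fields map[string]any
	if err := json.NewDecoder(r.Body).Decode(&fields); err != nil {
		http.Error(w, "invalid JSON", http.StatusBadRequest)
		return
	}
	// Collect KV-* headers as tags. Go canonicalizes header names,
	// so "KV-environment" arrives as "Kv-Environment".
	tags := map[string]string{}
	for name, values := range r.Header {
		if strings.HasPrefix(name, "Kv-") && len(values) > 0 {
			tags[strings.TrimPrefix(name, "Kv-")] = values[0]
		}
	}
	// Hand off to the pipeline without blocking the HTTP request.
	select {
	case entryChan <- LogEntry{Fields: fields, Tags: tags, Ingested: time.Now()}:
		w.WriteHeader(http.StatusAccepted)
	default: // apply back-pressure instead of blocking
		http.Error(w, "ingest buffer full", http.StatusServiceUnavailable)
	}
}

func main() {
	http.HandleFunc("/", handleIngest)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

Returning 202 before the entry is durable trades strict durability at the HTTP boundary for throughput; the WAL stage described below is what makes the entry crash-safe shortly after.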
- Stores recent logs in memory for fast access and partial indexing.
- Flushes logs to disk when the buffer reaches a threshold or the day changes.
- Periodically indexes batches in memory for fast queries.
- Ensures durability before logs are flushed to disk.
- Prevents data loss during crashes by replaying the WAL on startup.
- Logs are written synchronously with `Sync()` (this can be optimized; see the sketch below).
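A minimal sketch of the append-then-`Sync()` pattern this stage relies on (the `WAL` type and record framing here are illustrative, not the LogLens implementation):

```go
package main

import "os"

// WAL is an illustrative write-ahead log handle.
type WAL struct {
	f *os.File
}

func OpenWAL(path string) (*WAL, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		return nil, err
	}
	return &WAL{f: f}, nil
}

// Append writes a record and fsyncs, so the bytes survive a crash even
// before the in-memory buffer is flushed to a batch file.
func (w *WAL) Append(record []byte) error {
	if _, err := w.f.Write(record); err != nil {
		return err
	}
	// This per-write Sync is the cost noted above as "can be optimized",
	// e.g. by grouping several appends per fsync.
	return w.f.Sync()
}

func main() {
	w, err := OpenWAL("loglens.wal")
	if err != nil {
		panic(err)
	}
	defer w.f.Close()
	if err := w.Append([]byte("example record\n")); err != nil {
		panic(err)
	}
}
```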
- Uses Bleve to build a full-text search index.
- Indexes logs in the background via a worker pool.
- Handles complex queries with filters, time ranges, and text matching.
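As a self-contained illustration of the Bleve API this stage builds on (the document fields and query string are invented for the demo; LogLens performs the `Index` calls from a background worker pool rather than inline):

```go
package main

import (
	"fmt"

	"github.com/blevesearch/bleve/v2"
)

func main() {
	// Create an index with the default mapping.
	mapping := bleve.NewIndexMapping()
	index, err := bleve.New("example.bleve", mapping)
	if err != nil {
		panic(err)
	}
	defer index.Close()

	// Index a log-like document under an ID.
	doc := map[string]any{"level": "error", "message": "disk quota exceeded"}
	if err := index.Index("log-1", doc); err != nil {
		panic(err)
	}

	// Query-string syntax combines field filters and free text.
	query := bleve.NewQueryStringQuery("level:error quota")
	req := bleve.NewSearchRequest(query)
	res, err := index.Search(req)
	if err != nil {
		panic(err)
	}
	fmt.Println(res.Total, "hit(s)")
}
```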
- Serializes logs into a binary format and compresses them using Zstandard (a sketch of this pipeline follows).
- Writes compressed logs to disk in files named like `123456789-987654321.lens`; the numbers represent the start and end timestamps (in microseconds since epoch) of the batch's ingestion window.
- Retrieves logs from batches based on position offsets.
- Deletes old batches as part of retention cleanup.
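The sketch below isolates the serialize-then-compress step using `encoding/gob` and `github.com/klauspost/compress/zstd` (a common Go Zstandard binding; whether LogLens uses this exact library is an assumption). The `Log` type is a stand-in, and the size-prefixed `.lens` framing described later is omitted here:

```go
package main

import (
	"bytes"
	"encoding/gob"
	"os"
	"time"

	"github.com/klauspost/compress/zstd"
)

// Log is an illustrative record type; the real schema may differ.
type Log struct {
	Timestamp int64
	Level     string
	Message   string
}

// writeBatch gob-encodes the batch straight into a Zstandard encoder,
// then writes the compressed bytes to disk in one shot.
func writeBatch(path string, batch []Log) error {
	var buf bytes.Buffer
	enc, err := zstd.NewWriter(&buf)
	if err != nil {
		return err
	}
	if err := gob.NewEncoder(enc).Encode(batch); err != nil {
		return err
	}
	if err := enc.Close(); err != nil { // flush the final compressed frame
		return err
	}
	return os.WriteFile(path, buf.Bytes(), 0o644)
}

func main() {
	batch := []Log{{Timestamp: time.Now().UnixMicro(), Level: "info", Message: "hello"}}
	if err := writeBatch("example.lens", batch); err != nil {
		panic(err)
	}
}
```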
- Manages file I/O operations (read/write/delete).
- Generates file paths based on ingestion date (see the path sketch below).
- Caches open file handles for faster access.
- Compressed `.lens` files are stored in a directory structure organized by year/month/day.
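A sketch of date-based path generation matching the layout described here; the `/data` root and the exact name format are taken from the examples in this README:

```go
package main

import (
	"fmt"
	"path/filepath"
	"time"
)

// batchPath builds /data/YYYY/MM/DD/<start>-<end>.lens, with timestamps
// in microseconds since epoch as the README describes.
func batchPath(root string, start, end time.Time) string {
	dir := filepath.Join(root, start.Format("2006"), start.Format("01"), start.Format("02"))
	name := fmt.Sprintf("%d-%d.lens", start.UnixMicro(), end.UnixMicro())
	return filepath.Join(dir, name)
}

func main() {
	now := time.Now()
	fmt.Println(batchPath("/data", now.Add(-time.Minute), now))
}
```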
- Periodically scans for and deletes logs older than a configured number of days (see the sketch below).
- Queries the index to find expired logs.
- Deletes matching batches from disk.
- Reports statistics such as freed space and deleted log counts.
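A hedged sketch of the disk half of retention: walk the data directory and delete any batch whose end timestamp (the second number in the filename, in microseconds since epoch) falls outside the retention window. The real worker also queries the index and removes the matching index entries, which this sketch skips:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
	"time"
)

// purgeExpired deletes .lens batches whose ingestion window ended before
// the retention cutoff.
func purgeExpired(root string, retention time.Duration) (deleted int, err error) {
	cutoff := time.Now().Add(-retention).UnixMicro()
	err = filepath.WalkDir(root, func(path string, d os.DirEntry, err error) error {
		if err != nil || d.IsDir() || !strings.HasSuffix(path, ".lens") {
			return err
		}
		name := strings.TrimSuffix(filepath.Base(path), ".lens")
		parts := strings.SplitN(name, "-", 2)
		if len(parts) != 2 {
			return nil // not a batch file we recognize
		}
		end, perr := strconv.ParseInt(parts[1], 10, 64)
		if perr != nil {
			return nil
		}
		if end < cutoff {
			if rerr := os.Remove(path); rerr != nil {
				return rerr
			}
			deleted++
		}
		return nil
	})
	return deleted, err
}

func main() {
	n, err := purgeExpired("/data", 30*24*time.Hour)
	fmt.Println("deleted", n, "batches; err:", err)
}
```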
- Supports hybrid query execution:
  - First searches the in-memory buffer.
  - Then searches indexed logs from disk.
  - Merges results and returns unified output (see the merge sketch below).
- Includes frequency maps, pagination, and retrieval-time tracking.
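A small sketch of the merge step, assuming hits from both sources carry a timestamp; the `Result` type and the newest-first ordering are assumptions for the demo:

```go
package main

import (
	"fmt"
	"sort"
)

// Result is an illustrative hit type.
type Result struct {
	Timestamp int64
	Message   string
}

// hybridSearch merges in-memory and on-disk hits by timestamp and applies
// simple offset/limit pagination.
func hybridSearch(memHits, diskHits []Result, offset, limit int) []Result {
	merged := append(append([]Result{}, memHits...), diskHits...)
	sort.Slice(merged, func(i, j int) bool {
		return merged[i].Timestamp > merged[j].Timestamp // newest first
	})
	if offset >= len(merged) {
		return nil
	}
	end := offset + limit
	if end > len(merged) {
		end = len(merged)
	}
	return merged[offset:end]
}

func main() {
	mem := []Result{{3, "from buffer"}}
	disk := []Result{{1, "old"}, {2, "older-ish"}}
	fmt.Println(hybridSearch(mem, disk, 0, 2))
}
```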
Here's how a log flows through the system:
1. **Ingest**
   - HTTP request received → parsed → sent to `entryChan`
2. **Buffer**
   - Log is added to the in-memory buffer
   - Assigned a position (offset in buffer)
3. **WAL**
   - Log is written to the Write-Ahead Log (with sync)
   - Ensures crash recovery
4. **Indexing**
   - Logs are periodically indexed in memory
   - When the buffer reaches its threshold, logs are flushed
5. **Flush**
   - Buffered logs are grouped into a batch
   - The batch is compressed and written to disk
   - Logs are also indexed in Bleve asynchronously
6. **Search**
   - The query hits both the memory buffer and indexed logs
   - Matching positions are used to retrieve the actual logs
   - Results are merged and returned
7. **Retention**
   - A periodic scan finds old logs
   - Batches containing old logs are deleted
   - Index entries for those logs are removed
Each `.lens` file contains:

```
[Header: Magic "LENS"] [Version] [Length]
[Repeated: [Log Size] [Serialized Log]]
```
- Logs are serialized using `gob`
- The entire batch is compressed using Zstandard
- Files are organized by date directories:

```
/data/
└── 2025/
    └── 04/
        └── 05/
            └── 123456789-987654321.lens
```
Each filename like `123456789-987654321.lens` indicates the start and end timestamps (in microseconds since epoch) of the batch's ingestion window.
- **Buffer Flush Worker**
  - Listens on `entryChan`
  - Flushes the buffer when it reaches the threshold or the day changes (a sketch of this loop follows the list)
- **Indexing Worker**
  - Listens on `consumeBatchChan`
  - Indexes logs in the background
- **Retention Worker**
  - Runs daily
  - Deletes expired logs based on the time threshold
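A sketch of what the buffer-flush loop could look like; the channel payload type, the threshold, and the minute-level day-change check are all illustrative, not the actual worker code:

```go
package main

import (
	"fmt"
	"time"
)

// flushWorker drains entryChan into a buffer and flushes when the buffer
// hits its threshold or the day rolls over.
func flushWorker(entryChan <-chan string, threshold int) {
	var buffer []string
	day := time.Now().Day()
	ticker := time.NewTicker(time.Minute) // periodic day-change check
	defer ticker.Stop()

	flush := func(reason string) {
		if len(buffer) == 0 {
			return
		}
		fmt.Printf("flushing %d logs (%s)\n", len(buffer), reason)
		buffer = buffer[:0] // real code would compress and write a batch here
	}

	for {
		select {
		case entry, ok := <-entryChan:
			if !ok {
				flush("shutdown")
				return
			}
			buffer = append(buffer, entry)
			if len(buffer) >= threshold {
				flush("threshold reached")
			}
		case <-ticker.C:
			if d := time.Now().Day(); d != day {
				day = d
				flush("day changed")
			}
		}
	}
}

func main() {
	ch := make(chan string, 8)
	go flushWorker(ch, 3)
	for i := 0; i < 5; i++ {
		ch <- fmt.Sprintf("log %d", i)
	}
	close(ch)
	time.Sleep(100 * time.Millisecond) // let the worker drain
}
```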
- Batching: Logs are buffered and flushed in batches to reduce disk I/O.
- Compression: Uses Zstandard to reduce storage footprint.
- WAL: Ensures durability even if the system crashes before a flush.
- Efficient Search: Hybrid search between memory buffer and persisted logs.
- File Format Design: Binary format with header metadata + compressed payload.
Use the provided Vegeta scripts to simulate high-load scenarios:
```bash
# Run load test with 3000 RPS for 10 seconds
./vegeta.bash 3000 10s
```

Monitor system behavior using:

```bash
go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30
```

| Method | Endpoint | Description |
|---|---|---|
| POST | `/` | Ingest a new log |
| GET | `/` | Search logs |
| GET | `/range-count?start=YYYY-MM-DD&end=YYYY-MM-DD` | Get count over time range |
Headers like `KV-environment` and `KV-level` allow tagging logs with metadata.
Each `.lens` file contains:

```
[Magic "LENS"] [Version (1 byte)] [Log Count (4 bytes)]
[Log Size (4 bytes)][Log Payload...]
[Log Size (4 bytes)][Log Payload...]
...
```
All payloads are compressed using Zstandard.
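A sketch of that layout in Go, following the magic, version, count, and size-prefix framing above; the byte order is an assumption, since the README does not specify it:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// encodeBatch lays out a batch as: "LENS" magic, 1-byte version,
// 4-byte log count, then [4-byte size][payload] per log.
func encodeBatch(version byte, logs [][]byte) []byte {
	var buf bytes.Buffer
	buf.WriteString("LENS")
	buf.WriteByte(version)
	binary.Write(&buf, binary.LittleEndian, uint32(len(logs)))
	for _, payload := range logs {
		binary.Write(&buf, binary.LittleEndian, uint32(len(payload)))
		buf.Write(payload)
	}
	return buf.Bytes()
}

func main() {
	raw := encodeBatch(1, [][]byte{[]byte("log one"), []byte("log two")})
	fmt.Printf("% x\n", raw[:9]) // magic + version + count
}
```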
The Write-Ahead Log ensures that logs are not lost during unexpected shutdowns. On startup, the system replays the WAL to rebuild the in-memory buffer.
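A minimal sketch of startup replay, under the assumption of one record per line; the real WAL record framing is a LogLens internal detail:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
)

// replayWAL reads the WAL back record by record to rebuild the
// in-memory buffer after a restart.
func replayWAL(path string) ([]string, error) {
	f, err := os.Open(path)
	if err != nil {
		if os.IsNotExist(err) {
			return nil, nil // nothing to replay on first start
		}
		return nil, err
	}
	defer f.Close()

	var buffer []string
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		buffer = append(buffer, scanner.Text())
	}
	return buffer, scanner.Err()
}

func main() {
	buf, err := replayWAL("loglens.wal")
	fmt.Printf("recovered %d entries, err=%v\n", len(buf), err)
}
```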
Old logs can be automatically deleted after a configurable number of days. A cron job scans and removes expired data from both index and disk.
- UI dashboard for visualization (WIP)
- Query caching layer (WIP)
- Support for TLS and authentication
- Alerting system with custom triggers
Contributions are welcome! Whether it's performance improvements, bug fixes, or feature additions, feel free to open an issue or PR.