A Lightweight Log Management System.

hasssanezzz/LogLens

🔍 LogLens: Real-Time Log Analytics Engine

LogLens is a high-performance, scalable log analytics engine built in Go. It enables real-time ingestion, search, and analysis of structured logs at scale.

With support for high-throughput ingestion, efficient indexing using Bleve, batched writes, write-ahead logging (WAL), compression, and time-based retention, LogLens is designed to be the backbone of modern observability pipelines.


🚀 Features

  • ✅ High Throughput Ingestion: Handle thousands of logs per second with minimal overhead
  • ✅ Real-Time Search: Instant querying using Bleve full-text search index
  • ✅ Batching & Compression: Efficient disk writes with Zstandard compression
  • ✅ Write-Ahead Logging (WAL): Ensures durability and crash recovery
  • ✅ Time-Based Retention: Auto-delete logs beyond configured threshold
  • ✅ REST API: Simple HTTP endpoints for ingestion and querying
  • ✅ Performance Monitoring: Built-in stats and metrics endpoint
  • ✅ Load Testing Ready: Comes with Vegeta scripts for benchmarking

🧠 Architecture Overview

[Figure: architecture overview diagram, generated by GoTypeGraph]

🔁 High-Level Workflow

The system processes logs through several stages:

  1. Ingestion
  2. Buffering
  3. Indexing
  4. Storage
  5. Search
  6. Retention

πŸ—οΈ Core Components & Responsibilities

1. Ingestion Layer

  • Accepts JSON-formatted logs via HTTP POST requests.
  • Tags logs using custom headers (KV-environment, KV-level, etc.).
  • Sends logs asynchronously to an internal channel for processing.
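
The ingestion steps above could look roughly like the following sketch. The `Entry` type, the handler name, the 1 MiB body limit, and the 503 response on a full channel are assumptions made for illustration, not LogLens's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Entry is a hypothetical in-memory form of one ingested log.
type Entry struct {
	Tags map[string]string // taken from KV-* request headers
	Body json.RawMessage   // the JSON document as received
}

// tagsFromHeaders turns headers such as "KV-level: error" into {"level": "error"}.
// net/http canonicalizes header names, so the prefix is matched case-insensitively.
func tagsFromHeaders(h map[string][]string) map[string]string {
	tags := make(map[string]string)
	for name, values := range h {
		if len(values) == 0 || len(name) <= 3 || !strings.EqualFold(name[:3], "KV-") {
			continue
		}
		tags[strings.ToLower(name[3:])] = values[0]
	}
	return tags
}

// ingestHandler validates the JSON body and hands the entry to entryChan
// without blocking the request. A full channel is reported as 503 (assumption).
func ingestHandler(entryChan chan<- Entry) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		body, err := io.ReadAll(io.LimitReader(r.Body, 1<<20))
		if err != nil || !json.Valid(body) {
			http.Error(w, "body must be valid JSON", http.StatusBadRequest)
			return
		}
		select {
		case entryChan <- Entry{Tags: tagsFromHeaders(r.Header), Body: body}:
			w.WriteHeader(http.StatusAccepted)
		default:
			http.Error(w, "ingestion queue full", http.StatusServiceUnavailable)
		}
	}
}

func main() {
	entryChan := make(chan Entry, 1024)

	// Exercise the handler in-process, without opening a network port.
	req := httptest.NewRequest(http.MethodPost, "/", strings.NewReader(`{"msg":"disk full"}`))
	req.Header.Set("KV-level", "error")
	rec := httptest.NewRecorder()
	ingestHandler(entryChan)(rec, req)

	e := <-entryChan
	fmt.Println(rec.Code, e.Tags["level"], string(e.Body))
}
```

The non-blocking `select` keeps slow downstream processing from stalling HTTP clients; a real deployment might prefer to block briefly or apply backpressure instead.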

2. Memory Buffer

  • Stores recent logs in memory for fast access and partial indexing.
  • Flushes logs to disk when buffer reaches threshold or day changes.
  • Periodically indexes batches in memory for fast queries.
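
A minimal sketch of the flush rule described above (threshold reached or day changed), assuming timestamps are microseconds since the Unix epoch and that "day" means the UTC calendar day. The `Buffer` type and its method names are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

const microsPerDay = int64(24 * 60 * 60 * 1_000_000)

// Buffer is a hypothetical in-memory log buffer.
type Buffer struct {
	threshold int
	entries   []string
	day       int64 // UTC day number (days since epoch) of the first buffered entry
}

func NewBuffer(threshold int) *Buffer { return &Buffer{threshold: threshold} }

// Add stores the entry and returns its position (offset) in the buffer.
func (b *Buffer) Add(entry string, unixMicro int64) int {
	if len(b.entries) == 0 {
		b.day = unixMicro / microsPerDay
	}
	b.entries = append(b.entries, entry)
	return len(b.entries) - 1
}

// ShouldFlush reports whether the buffer is full or holds entries from a
// different UTC day than nowMicro. An empty buffer never needs a flush.
func (b *Buffer) ShouldFlush(nowMicro int64) bool {
	if len(b.entries) == 0 {
		return false
	}
	return len(b.entries) >= b.threshold || nowMicro/microsPerDay != b.day
}

// Drain returns the buffered entries as one batch and resets the buffer.
func (b *Buffer) Drain() []string {
	batch := b.entries
	b.entries = nil
	return batch
}

func main() {
	buf := NewBuffer(3)
	now := time.Now().UnixMicro()
	for _, msg := range []string{"a", "b", "c"} {
		pos := buf.Add(msg, now)
		fmt.Println("stored at position", pos, "flush needed:", buf.ShouldFlush(now))
	}
	fmt.Println("flushed batch:", buf.Drain())
}
```

A caller would check `ShouldFlush` with the incoming entry's timestamp before calling `Add`, so that a batch never spans two days.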

3. Write-Ahead Log (WAL)

  • Ensures durability before logs are flushed to disk.
  • Prevents data loss during crashes by replaying WAL on startup.
  • Logs are written synchronously with Sync() (can be optimized).
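
To make the append-then-`Sync()` and replay-on-startup behaviour concrete, here is a small sketch. The record layout (4-byte big-endian length followed by the payload), the file name, and all function names are assumptions; the actual WAL format used by LogLens is not documented here.

```go
package main

import (
	"bufio"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// WAL is a hypothetical append-only write-ahead log.
type WAL struct{ f *os.File }

func OpenWAL(path string) (*WAL, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		return nil, err
	}
	return &WAL{f: f}, nil
}

// Append writes one length-prefixed record and calls Sync so the record
// reaches stable storage before the call returns.
func (w *WAL) Append(rec []byte) error {
	buf := make([]byte, 4+len(rec))
	binary.BigEndian.PutUint32(buf, uint32(len(rec)))
	copy(buf[4:], rec)
	if _, err := w.f.Write(buf); err != nil {
		return err
	}
	return w.f.Sync()
}

func (w *WAL) Close() error { return w.f.Close() }

// atEnd reports whether err marks the end of the file or a torn final record.
func atEnd(err error) bool {
	return errors.Is(err, io.EOF) || errors.Is(err, io.ErrUnexpectedEOF)
}

// Replay reads every complete record back. A missing file yields no records,
// and a torn record at the tail (a crash mid-write) ends the replay quietly.
func Replay(path string) ([][]byte, error) {
	f, err := os.Open(path)
	if errors.Is(err, os.ErrNotExist) {
		return nil, nil
	}
	if err != nil {
		return nil, err
	}
	defer f.Close()

	r := bufio.NewReader(f)
	var recs [][]byte
	for {
		var hdr [4]byte
		if _, err := io.ReadFull(r, hdr[:]); err != nil {
			if atEnd(err) {
				return recs, nil
			}
			return recs, err
		}
		rec := make([]byte, binary.BigEndian.Uint32(hdr[:]))
		if _, err := io.ReadFull(r, rec); err != nil {
			if atEnd(err) {
				return recs, nil
			}
			return recs, err
		}
		recs = append(recs, rec)
	}
}

// tempWALPath returns a WAL path inside a fresh temporary directory.
func tempWALPath() string {
	dir, err := os.MkdirTemp("", "wal-demo")
	if err != nil {
		panic(err)
	}
	return filepath.Join(dir, "loglens.wal")
}

func main() {
	path := tempWALPath()
	defer os.RemoveAll(filepath.Dir(path))

	w, err := OpenWAL(path)
	if err != nil {
		panic(err)
	}
	for _, msg := range []string{`{"msg":"one"}`, `{"msg":"two"}`} {
		if err := w.Append([]byte(msg)); err != nil {
			panic(err)
		}
	}
	w.Close()

	recs, err := Replay(path)
	fmt.Println("replayed", len(recs), "records, err:", err)
}
```

Calling `Sync` on every record is the simplest durable option and also the slowest; grouping several records per `Sync` is one common way to trade a little durability for throughput.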

4. Index Manager

  • Uses Bleve to build a full-text search index.
  • Indexes logs in background via a worker pool.
  • Handles complex queries with filters, time ranges, and text match.

5. Batch Manager

  • Serializes logs into binary format and compresses them using Zstandard.
  • Writes compressed logs to disk in files named like 123456789-987654321.lens.
    • The numbers represent start and end timestamps (in microseconds since epoch) of the batch's ingestion window.
  • Retrieves logs from batches based on position offsets.
  • Deletes old batches as part of retention cleanup.

6. Storage Manager

  • Manages file I/O operations (read/write/delete).
  • Generates file paths based on ingestion date.
  • Caches open file handles for faster access.
  • Compressed .lens files are stored in a directory structure organized by year/month/day.
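
As an illustration of the year/month/day layout and the start-end file name, the sketch below builds a batch path. Using the UTC date of the window start for the directory is an assumption, as is the `batchPath` helper itself.

```go
package main

import (
	"fmt"
	"path/filepath"
	"time"
)

// batchPath builds <root>/<year>/<month>/<day>/<start>-<end>.lens, where
// start and end are microseconds since the Unix epoch.
// Assumption: the directory is derived from the UTC date of the window start.
func batchPath(root string, startMicro, endMicro int64) string {
	day := time.UnixMicro(startMicro).UTC()
	name := fmt.Sprintf("%d-%d.lens", startMicro, endMicro)
	return filepath.Join(root, day.Format("2006"), day.Format("01"), day.Format("02"), name)
}

func main() {
	start := time.Date(2025, time.April, 5, 10, 0, 0, 0, time.UTC)
	end := start.Add(30 * time.Second)
	fmt.Println(batchPath("/data", start.UnixMicro(), end.UnixMicro()))
}
```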

7. Retention Manager

  • Periodically scans and deletes logs older than a set number of days.
  • Queries index to find expired logs.
  • Deletes matching batches from disk.
  • Reports statistics like freed space and deleted logs.
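
One way to decide which batches are past retention is to read the end timestamp from the file name. This is a simplification for illustration: as noted above, LogLens queries its index to find expired logs, and the helper names and the 30-day value here are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// parseBatchName extracts the start and end timestamps (microseconds since
// epoch) from a name such as "123456789-987654321.lens".
func parseBatchName(name string) (startMicro, endMicro int64, err error) {
	if !strings.HasSuffix(name, ".lens") {
		return 0, 0, fmt.Errorf("%q is not a .lens file", name)
	}
	first, second, ok := strings.Cut(strings.TrimSuffix(name, ".lens"), "-")
	if !ok {
		return 0, 0, fmt.Errorf("%q has no start-end separator", name)
	}
	if startMicro, err = strconv.ParseInt(first, 10, 64); err != nil {
		return 0, 0, err
	}
	if endMicro, err = strconv.ParseInt(second, 10, 64); err != nil {
		return 0, 0, err
	}
	return startMicro, endMicro, nil
}

// expiredBatches returns the names whose whole ingestion window ended before
// the cutoff. Names that do not parse are skipped rather than deleted.
func expiredBatches(names []string, cutoffMicro int64) []string {
	var expired []string
	for _, n := range names {
		_, end, err := parseBatchName(n)
		if err == nil && end < cutoffMicro {
			expired = append(expired, n)
		}
	}
	return expired
}

func main() {
	retentionDays := 30 // hypothetical setting
	cutoff := time.Now().AddDate(0, 0, -retentionDays).UnixMicro()
	names := []string{
		"123456789-987654321.lens",                    // ended in 1970, long expired
		fmt.Sprintf("%d-%d.lens", cutoff+1, cutoff+2), // still inside the window
	}
	fmt.Println("would delete:", expiredBatches(names, cutoff))
}
```

Comparing against the end of the window (not the start) avoids deleting a batch that still contains logs newer than the cutoff.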

8. Search Engine

  • Supports hybrid query execution:
    • First searches the in-memory buffer.
    • Then searches indexed logs from disk.
  • Merges results and returns unified output.
  • Includes frequency maps, pagination, and retrieval time tracking.
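
The merge step of the hybrid search can be sketched as below. The `Hit` and `Result` types, the newest-first ordering, the zero-based page number, and the choice of log level as the frequency key are all assumptions for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Hit is a hypothetical search hit from either the memory buffer or disk.
type Hit struct {
	TimestampMicro int64
	Level          string
	Message        string
}

// Result is a hypothetical unified search response.
type Result struct {
	Hits      []Hit          // the requested page, newest first
	Total     int            // number of matches before pagination
	LevelFreq map[string]int // match count per level, across all pages
	Took      time.Duration  // time spent merging
}

// mergeResults combines memory and disk hits, sorts them newest first,
// counts hits per level, and returns one page (page is zero-based).
func mergeResults(memory, disk []Hit, page, pageSize int) Result {
	started := time.Now()

	all := make([]Hit, 0, len(memory)+len(disk))
	all = append(all, memory...)
	all = append(all, disk...)
	sort.SliceStable(all, func(i, j int) bool {
		return all[i].TimestampMicro > all[j].TimestampMicro
	})

	freq := make(map[string]int)
	for _, h := range all {
		freq[h.Level]++
	}

	if page < 0 {
		page = 0
	}
	if pageSize < 1 {
		pageSize = len(all)
	}
	lo := page * pageSize
	if lo > len(all) {
		lo = len(all)
	}
	hi := lo + pageSize
	if hi > len(all) {
		hi = len(all)
	}
	return Result{Hits: all[lo:hi], Total: len(all), LevelFreq: freq, Took: time.Since(started)}
}

func main() {
	memory := []Hit{{TimestampMicro: 30, Level: "error", Message: "newest, still in buffer"}}
	disk := []Hit{
		{TimestampMicro: 10, Level: "info", Message: "oldest, on disk"},
		{TimestampMicro: 20, Level: "error", Message: "on disk"},
	}
	res := mergeResults(memory, disk, 0, 2)
	fmt.Println(res.Total, "matches, page:", res.Hits, "levels:", res.LevelFreq)
}
```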

🔄 Data Flow

Here's how a log flows through the system:

  1. Ingest

    • HTTP request received → parsed → sent to entryChan
  2. Buffer

    • Log is added to in-memory buffer
    • Assigned a position (offset in buffer)
  3. WAL

    • Log is written to Write-Ahead Log (with sync)
    • Ensures crash recovery
  4. Indexing

    • Logs are periodically indexed in memory
    • When buffer reaches threshold, logs are flushed
  5. Flush

    • Buffered logs are grouped into a batch
    • Batch is compressed and written to disk
    • Logs are also indexed in Bleve asynchronously
  6. Search

    • Query hits both memory buffer and indexed logs
    • Matching positions are used to retrieve actual logs
    • Results are merged and returned
  7. Retention

    • Periodic scan finds old logs
    • Batches containing old logs are deleted
    • Index entries for those logs are removed

📦 File Format

Each .lens file contains:

[Header: Magic "LENS"] [Version] [Length]
[Repeated: [Log Size] [Serialized Log]]

  • Logs are serialized using gob
  • Entire batch is compressed using Zstandard
  • Files are organized by date directories:
    /data/
      └── 2025/
          └── 04/
              └── 05/
                  └── 123456789-987654321.lens


Each filename like 123456789-987654321.lens indicates the start and end timestamp (in microseconds since epoch) of the batch's ingestion window.


⚙️ Background Workers

  • Buffer Flush Worker:

    • Listens on entryChan
    • Flushes buffer when it reaches threshold or day changes
  • Indexing Worker:

    • Listens on consumeBatchChan
    • Indexes logs in background
  • Retention Worker:

    • Runs daily
    • Deletes expired logs based on time threshold
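
The indexing worker described above could be structured as a small pool of goroutines draining `consumeBatchChan`. In this sketch the `Indexer` interface and `countingIndexer` are hypothetical stand-ins for the real Bleve-backed index, and the pool size is arbitrary.

```go
package main

import (
	"fmt"
	"sync"
)

// Indexer is a hypothetical stand-in for the real index.
type Indexer interface {
	IndexBatch(batch []string) error
}

// countingIndexer only counts what it receives, so the sketch runs without
// any third-party dependency.
type countingIndexer struct {
	mu      sync.Mutex
	indexed int
}

func (c *countingIndexer) IndexBatch(batch []string) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.indexed += len(batch)
	return nil
}

// runIndexWorkers starts n goroutines that index batches from
// consumeBatchChan and returns once the channel is closed and drained.
func runIndexWorkers(n int, consumeBatchChan <-chan []string, idx Indexer) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for batch := range consumeBatchChan {
				if err := idx.IndexBatch(batch); err != nil {
					fmt.Println("index error:", err)
				}
			}
		}()
	}
	wg.Wait()
}

func main() {
	consumeBatchChan := make(chan []string, 4)
	consumeBatchChan <- []string{"log 1", "log 2"}
	consumeBatchChan <- []string{"log 3"}
	close(consumeBatchChan)

	idx := &countingIndexer{}
	runIndexWorkers(2, consumeBatchChan, idx)
	fmt.Println("indexed logs:", idx.indexed)
}
```

Closing the channel is what lets the workers exit cleanly, which also gives a natural shutdown hook: close `consumeBatchChan`, then wait for `runIndexWorkers` to return.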

📈 Performance Characteristics

✅ What's Optimized Already

  • Batching: Logs are buffered and flushed in batches to reduce disk I/O.
  • Compression: Uses Zstandard to reduce storage footprint.
  • WAL: Ensures durability even if system crashes before flush.
  • Efficient Search: Hybrid search between memory buffer and persisted logs.
  • File Format Design: Binary format with header metadata + compressed payload.

🧪 Load Testing

Use the provided Vegeta scripts to simulate high-load scenarios:

# Run load test with 3000 RPS for 10 seconds
./vegeta.bash 3000 10s

Monitor system behavior using:

go tool pprof "http://localhost:6060/debug/pprof/profile?seconds=30"

🧰 REST API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | / | Ingest a new log |
| GET | / | Search logs |
| GET | /range-count?start=YYYY-MM-DD&end=YYYY-MM-DD | Get count over time range |

Headers like KV-environment, KV-level allow tagging logs with metadata.
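
A client-side sketch of the endpoints above, assuming the server listens on `http://localhost:8080` (the port is not stated in this README) and that a `Content-Type: application/json` header is acceptable. The helper names are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"net/url"
)

// newIngestRequest builds a POST / request with a JSON body and one
// KV-<name> header per tag.
func newIngestRequest(baseURL string, jsonBody []byte, tags map[string]string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, baseURL+"/", bytes.NewReader(jsonBody))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	for name, value := range tags {
		req.Header.Set("KV-"+name, value)
	}
	return req, nil
}

// rangeCountURL builds the /range-count URL; start and end use YYYY-MM-DD.
func rangeCountURL(baseURL, start, end string) string {
	q := url.Values{}
	q.Set("start", start)
	q.Set("end", end)
	return baseURL + "/range-count?" + q.Encode()
}

func main() {
	base := "http://localhost:8080" // assumed address, adjust to your deployment

	req, err := newIngestRequest(base,
		[]byte(`{"message":"payment failed"}`),
		map[string]string{"environment": "production", "level": "error"})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL, req.Header)
	// To actually send it: resp, err := http.DefaultClient.Do(req)

	fmt.Println(rangeCountURL(base, "2025-04-01", "2025-04-05"))
}
```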


📁 File Format Specification

Each .lens file contains:

[Magic "LENS"] [Version (1 byte)] [Log Count (4 bytes)]
[Log Size (4 bytes)][Log Payload...]
[Log Size (4 bytes)][Log Payload...]
...

All payloads are compressed using Zstandard.
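
The layout above can be exercised with a small encoder and decoder. This sketch makes several assumptions: integers are big-endian, each log is gob-encoded on its own, the `Log` struct fields are invented, and Zstandard compression is left out because it is not part of the Go standard library (a third-party package would be needed for that step).

```go
package main

import (
	"bytes"
	"encoding/binary"
	"encoding/gob"
	"errors"
	"fmt"
	"io"
)

// Log is a hypothetical log record.
type Log struct {
	TimestampMicro int64
	Level          string
	Message        string
}

const (
	magic        = "LENS"
	version byte = 1
)

// encodeBatch writes: magic, version (1 byte), log count (4 bytes), then a
// 4-byte size and a gob payload per log. Compression is omitted here.
func encodeBatch(logs []Log) ([]byte, error) {
	var out bytes.Buffer
	out.WriteString(magic)
	out.WriteByte(version)
	binary.Write(&out, binary.BigEndian, uint32(len(logs))) // writes to a bytes.Buffer do not fail
	for _, l := range logs {
		var payload bytes.Buffer
		if err := gob.NewEncoder(&payload).Encode(l); err != nil {
			return nil, err
		}
		binary.Write(&out, binary.BigEndian, uint32(payload.Len()))
		out.Write(payload.Bytes())
	}
	return out.Bytes(), nil
}

// decodeBatch is the inverse of encodeBatch and validates the header.
func decodeBatch(data []byte) ([]Log, error) {
	r := bytes.NewReader(data)
	header := make([]byte, len(magic)+1)
	if _, err := io.ReadFull(r, header); err != nil {
		return nil, err
	}
	if string(header[:len(magic)]) != magic {
		return nil, errors.New("bad magic")
	}
	if header[len(magic)] != version {
		return nil, fmt.Errorf("unsupported version %d", header[len(magic)])
	}
	var count uint32
	if err := binary.Read(r, binary.BigEndian, &count); err != nil {
		return nil, err
	}
	var logs []Log
	for i := uint32(0); i < count; i++ {
		var size uint32
		if err := binary.Read(r, binary.BigEndian, &size); err != nil {
			return nil, err
		}
		payload := make([]byte, size)
		if _, err := io.ReadFull(r, payload); err != nil {
			return nil, err
		}
		var l Log
		if err := gob.NewDecoder(bytes.NewReader(payload)).Decode(&l); err != nil {
			return nil, err
		}
		logs = append(logs, l)
	}
	return logs, nil
}

func main() {
	data, err := encodeBatch([]Log{{TimestampMicro: 1, Level: "info", Message: "hello"}})
	if err != nil {
		panic(err)
	}
	logs, err := decodeBatch(data)
	fmt.Println(len(data), "bytes on the wire, decoded:", logs, "err:", err)
}
```

The per-log size prefix is what makes position-based retrieval possible: a reader can skip from record to record without decoding the payloads in between.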


🛡️ Crash Recovery

The Write-Ahead Log ensures that logs are not lost during unexpected shutdowns. On startup, the system replays the WAL to rebuild the in-memory buffer.


🗑️ Time-Based Retention

Old logs can be automatically deleted after a configurable number of days. A daily background worker scans for expired data and removes it from both the index and disk.


🧩 Future Enhancements

  • ✅ UI dashboard for visualization (WIP)
  • ✅ Query caching layer (WIP)
  • ✅ Support for TLS and authentication
  • ✅ Alerting system with custom triggers

🤝 Contributing

Contributions welcome! Whether it's performance improvements, bug fixes, or feature additions, feel free to open an issue or PR.
