
Scintirete is a lightweight, embedded-device-friendly, production-ready vector database built on the HNSW algorithm.


Scintirete


Chinese documentation

Scintirete is a lightweight, production-ready vector database built on the HNSW (Hierarchical Navigable Small World) algorithm. The name derives from the Latin words "Scintilla" (spark) and "Rete" (network), symbolizing a sparkling network that illuminates the crucial connections within complex data landscapes through deep similarity matching.

Core Philosophy: Discover infinite neighbors, illuminate the data network.

Open Source Ecosystem

Website and Documentation: scintirete.top (Source Code)

Database Management System (DBMS): dms.scintirete.top (Source Code)

Features

  • Lightweight & Simple: Self-contained implementation focused on core vector search functionality with minimal dependencies
  • High Performance: In-memory HNSW graph indexing provides millisecond-level nearest neighbor search
  • Data Safety: A Redis-like AOF + RDB persistence mechanism, built on FlatBuffers, ensures data durability
  • Modern APIs: Native support for both gRPC and HTTP/JSON interfaces for seamless integration
  • Production Ready: Structured logging, audit logs, Prometheus metrics, and comprehensive CLI tools designed for production environments
  • Cross-platform: Supports Linux, macOS, and Windows on arm64 and amd64 architectures out of the box, with optimized builds for Raspberry Pi devices
  • Text Embedding Support: Integrates with OpenAI-compatible APIs for automatic text vectorization

Scintirete targets small to medium-scale projects, edge computing scenarios, and developers who need a reliable, high-performance, and maintainable vector search solution for rapid prototyping.

Roadmap

  • Provide integrations with upstream frameworks such as LangChain and LangGraph
  • Build showcase features in the web app as reference demos, such as movie recommendation, face recognition, and knowledge base question answering
  • Run the full project smoothly on Raspberry Pi
  • Provide multi-language SDKs based on protobuf

Quick Start

Prerequisites

  • Go 1.24+ (for building from source)
  • Docker (optional, for containerized deployment)

Installation

Option 1: Download Pre-built Binaries

Download the latest release from the releases page.

Platform Support:

| Platform | Architecture | Binary Package | Supported Devices |
|---|---|---|---|
| Linux (x86_64) | amd64 | scintirete-linux-amd64.tar.gz | Standard servers, PCs |
| Linux (ARM64) | arm64 | scintirete-linux-arm64-pi45.tar.gz | Raspberry Pi 3/4/5, Zero 2W (64-bit OS) |
| Linux (ARM v7) | arm | scintirete-linux-arm-pi23.tar.gz | Raspberry Pi 2/3/4/5, Zero 2W (32-bit OS) |
| Linux (ARM v6) | arm | scintirete-linux-arm-pi1.tar.gz | Raspberry Pi 1, Zero, Zero W |
| Windows | amd64/arm64 | scintirete-windows-*.zip | Windows PCs |
| macOS | amd64/arm64 | scintirete-darwin-*.tar.gz | Intel Macs, Apple Silicon |

Raspberry Pi Quick Reference:

| Raspberry Pi Model | CPU Architecture | Common OS Bit Width | Go Build Parameters | Docker Architecture |
|---|---|---|---|---|
| Pi 1, Pi Zero, Pi Zero W | ARMv6 (32-bit) | 32-bit | GOARCH=arm, GOARM=6 | linux/arm/v6 |
| Pi 2 (Rev 1.1) | ARMv7 (32-bit) | 32-bit | GOARCH=arm, GOARM=7 | linux/arm/v7 |
| Pi 3, Pi 4, Pi 5, Zero 2 W | ARMv8 (64-bit capable) | 32-bit OS (legacy Raspberry Pi OS) | GOARCH=arm, GOARM=7 (compatibility mode) | linux/arm/v7 |
| Pi 3, Pi 4, Pi 5, Zero 2 W | ARMv8 (AArch64) | 64-bit OS (modern Raspberry Pi OS) | GOARCH=arm64 | linux/arm64 |

Option 2: Build from Source

git clone https://github.com/scintirete/scintirete.git
cd scintirete
make all

Option 3: Docker

The Docker images are multi-architecture, and Docker automatically selects the variant that matches the host:

# Pull latest version (auto-selects architecture)
docker pull ghcr.io/scintirete/scintirete:latest

# Explicitly specify architecture (if needed)
docker pull --platform linux/arm64 ghcr.io/scintirete/scintirete:latest    # Pi 3/4/5 (64-bit)
docker pull --platform linux/arm/v7 ghcr.io/scintirete/scintirete:latest   # Pi 2/3/4/5 (32-bit)
docker pull --platform linux/arm/v6 ghcr.io/scintirete/scintirete:latest   # Pi 1/Zero/Zero W

Supported Docker Architectures:

  • linux/amd64 - x86_64 platforms
  • linux/arm64 - ARM64 platforms (Raspberry Pi 3/4/5, Zero 2W with 64-bit OS)
  • linux/arm/v7 - ARM v7 platforms (Raspberry Pi 2/3/4/5, Zero 2W with 32-bit OS)
  • linux/arm/v6 - ARM v6 platforms (Raspberry Pi 1, Zero, Zero W)

Basic Usage

1. Start the Server

# Using binary
./bin/scintirete-server

# Using Docker
docker run -p 8080:8080 -p 9090:9090 ghcr.io/scintirete/scintirete:latest

# Using docker-compose
docker-compose up -d

The server will start with:

  • gRPC API on port 9090
  • HTTP/JSON API on port 8080

2. Environment Setup with Embedding Support

To use the text embedding features, configure an OpenAI-compatible API. The [embedding] section of the configuration file configs/scintirete.toml defines how Scintirete interacts with the external text embedding service.

First create the configuration file from the template:

cp configs/scintirete.template.toml configs/scintirete.toml

Then edit the [embedding] section of configs/scintirete.toml:

[embedding]
base_url = "https://api.openai.com/v1/embeddings"
api_key = ""
rpm_limit = 3500
tpm_limit = 90000

3. Basic Operations

Use the CLI tool to perform basic vector operations:

# Create a database
./bin/scintirete-cli -p "your-password" db create my_app

# Create a collection for documents
./bin/scintirete-cli -p "your-password" collection create my_app documents --metric Cosine

# Insert text with automatic embedding
./bin/scintirete-cli -p "your-password" text insert my_app documents \
  "doc1" \
  "Scintirete is a lightweight vector database optimized for production use." \
  '{"source":"documentation","type":"intro"}'

# Insert more documents
./bin/scintirete-cli -p "your-password" text insert my_app documents \
  "doc2" \
  "HNSW algorithm provides efficient approximate nearest neighbor search." \
  '{"source":"documentation","type":"technical"}'

# Search for similar content
./bin/scintirete-cli -p "your-password" text search my_app documents \
  "What is Scintirete?" \
  5

# Get collection information
./bin/scintirete-cli -p "your-password" collection info my_app documents

4. Working with Pre-computed Vectors

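If you already have pre-computed vectors, insert and search them directly. The commands below assume a collection named vectors already exists in my_app: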

# Insert vectors directly
./bin/scintirete-cli -p "your-password" vector insert my_app vectors \
  --id "vec1" \
  --vector '[0.1, 0.2, 0.3, 0.4]' \
  --metadata '{"category":"example"}'

# Search with vector
./bin/scintirete-cli -p "your-password" vector search my_app vectors \
  --vector '[0.15, 0.25, 0.35, 0.45]' \
  --top-k 3

More documentation is available at https://scintirete.top/docs.

Architecture

Scintirete implements a modern vector database architecture with the following components:

  • Core Engine: In-memory HNSW graph with configurable parameters
  • Persistence Layer: Dual-mode persistence with AOF (real-time) and RDB (snapshot) strategies
  • API Layer: Dual protocol support with gRPC for performance and HTTP/JSON for accessibility
  • Embedding Integration: OpenAI-compatible API integration for automatic text vectorization
  • Observability: Comprehensive logging, audit logs, and metrics

For detailed technical documentation, see https://scintirete.top/docs.

Configuration

Scintirete uses a single TOML configuration file. The default values are suitable for most use cases:

[server]
grpc_host = "127.0.0.1"
grpc_port = 9090
http_host = "127.0.0.1"
http_port = 8080
passwords = ["your-strong-password-here"]

[log]
level = "info"
format = "json"
enable_audit_log = true

[persistence]
data_dir = "./data"
aof_sync_strategy = "everysec"

[embedding]
base_url = "https://api.openai.com/v1/embeddings"
api_key = "your-openai-api-key"
rpm_limit = 3500
tpm_limit = 90000

API Documentation

Scintirete provides both gRPC and HTTP/JSON APIs:

  • gRPC: High-performance interface defined in protobuf
  • HTTP/JSON: RESTful interface accessible at http://localhost:8080/

For comprehensive API documentation and usage examples, refer to the documentation.

Performance Considerations

  • Memory Usage: Vectors are stored in memory for optimal search performance
  • Index Configuration: Tune HNSW parameters (m, ef_construction, ef_search) based on accuracy/speed requirements
  • Persistence: Configure AOF sync strategy based on durability vs. performance needs
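Because vectors and the graph both live in memory, a back-of-the-envelope estimate helps when sizing a device. The Go sketch below uses a common rule of thumb for HNSW indexes and rests on assumptions that may not match Scintirete's internals: float32 components, 4-byte neighbor IDs, and up to 2*m links per vector on the bottom layer. Upper layers, metadata, and allocator overhead are ignored, so treat the result as a lower bound. The function name estimateBytes is hypothetical.

```go
package main

import "fmt"

// estimateBytes approximates index memory for n vectors of dimension dim with
// HNSW parameter m. Assumptions: 4 bytes per vector component, 4 bytes per
// neighbor ID, and 2*m bottom-layer links per vector.
func estimateBytes(n, dim, m int) int64 {
	perVector := int64(dim)*4 + int64(m)*2*4
	return int64(n) * perVector
}

func main() {
	// One million 768-dimensional vectors at a few values of m.
	for _, m := range []int{8, 16, 32} {
		b := estimateBytes(1_000_000, 768, m)
		fmt.Printf("m=%d: %.2f GiB\n", m, float64(b)/(1<<30))
	}
}
```

Under these assumptions the vector data dominates at high dimensions, so raising m mostly affects build time and recall rather than memory.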

Contributing

We welcome contributions to Scintirete! Here's how you can help:

Development Setup

  1. Fork and Clone

    git clone https://github.com/your-username/scintirete.git
    cd scintirete
  2. Install Dependencies and Build

    # macOS (Homebrew); on other platforms install flatbuffers and protobuf with your package manager
    brew install flatbuffers protobuf
    make all
  3. Run Tests

    make test

Contribution Guidelines

  • Code Quality: Ensure your code passes all tests and follows Go conventions
  • Documentation: Update documentation for any API or configuration changes
  • Testing: Add tests for new features and bug fixes
  • Commit Messages: Use clear, descriptive commit messages
  • Pull Requests: Provide detailed descriptions of changes and their rationale

Areas for Contribution

  • Performance Optimization: HNSW algorithm improvements, memory optimization
  • Features: Metadata filtering, additional distance metrics, clustering algorithms
  • Integrations: Client libraries for different languages, framework integrations
  • Documentation: Tutorials, best practices, deployment guides
  • Testing: Integration tests, benchmarks, stress tests

Code of Conduct

We are committed to providing a welcoming and inclusive environment. Please treat all contributors with respect and professionalism.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Scintirete: Illuminate the data network, discover infinite neighbors.
