The Python SDK for Scintirete. Scintirete is a lightweight, production-oriented, single-node vector database built on the HNSW algorithm.

Scintirete Python SDK

English | 中文

Official Python client library for Scintirete Vector Database.

Features

  • 🚀 High Performance: Built on gRPC with connection pooling and compression
  • 🔄 Sync & Async: Supports both synchronous and asynchronous operations
  • 🔐 Authentication: Simple password-based authentication
  • 📊 Rich Types: Comprehensive type hints and data models
  • 🧪 Well Tested: Extensive unit tests and integration tests
  • 📖 Documentation: Detailed API documentation and examples

Installation

pip install scintirete-sdk

For async support:

pip install scintirete-sdk[async]

For development:

pip install scintirete-sdk[dev]

Quick Start

Synchronous Client

from scintirete_sdk import ScintireteClient, DistanceMetric, Vector

# Create client
client = ScintireteClient("localhost:50051", password="your_password")

# Create database
client.create_database("my_db")

# Create collection
client.create_collection(
    "my_db", 
    "my_collection", 
    metric_type=DistanceMetric.COSINE
)

# Insert vectors
vectors = [
    Vector(elements=[0.1, 0.2, 0.3], metadata={"label": "sample1"}),
    Vector(elements=[0.4, 0.5, 0.6], metadata={"label": "sample2"}),
]
ids, count = client.insert_vectors("my_db", "my_collection", vectors)

# Search vectors
results = client.search(
    "my_db", 
    "my_collection", 
    query_vector=[0.1, 0.2, 0.3], 
    top_k=5
)

for result in results:
    print(f"ID: {result.id}, Distance: {result.distance}")

# Close connection
client.close()

Asynchronous Client

import asyncio
from scintirete_sdk import ScintireteAsyncClient, DistanceMetric, Vector

async def main():
    # Create async client
    async with ScintireteAsyncClient("localhost:50051") as client:
        # Create database
        await client.create_database("my_db")
        
        # Create collection
        await client.create_collection(
            "my_db", 
            "my_collection", 
            metric_type=DistanceMetric.COSINE
        )
        
        # Insert vectors
        vectors = [
            Vector(elements=[0.1, 0.2, 0.3], metadata={"label": "sample1"}),
            Vector(elements=[0.4, 0.5, 0.6], metadata={"label": "sample2"}),
        ]
        ids, count = await client.insert_vectors("my_db", "my_collection", vectors)
        
        # Search vectors
        results = await client.search(
            "my_db", 
            "my_collection", 
            query_vector=[0.1, 0.2, 0.3], 
            top_k=5
        )
        
        for result in results:
            print(f"ID: {result.id}, Distance: {result.distance}")

# Run async function
asyncio.run(main())
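When many queries need to run at once, a common asyncio pattern is to cap the number of in-flight requests with a semaphore. The sketch below shows that pattern with a stand-in coroutine, `fake_search`, instead of the real client. `run_bounded` and `fake_search` are hypothetical names, and this README does not state whether a single `ScintireteAsyncClient` can safely serve concurrent calls, so treat that as an assumption to verify.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, TypeVar

Q = TypeVar("Q")
R = TypeVar("R")


async def run_bounded(
    queries: Sequence[Q],
    search: Callable[[Q], Awaitable[R]],
    limit: int = 4,
) -> List[R]:
    """Run search(query) for every query, with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(query: Q) -> R:
        async with semaphore:
            return await search(query)

    # gather() returns results in the same order as the input queries.
    return list(await asyncio.gather(*(run_one(q) for q in queries)))


async def fake_search(query: List[float]) -> float:
    """Hypothetical stand-in for an awaitable search call."""
    await asyncio.sleep(0)
    return sum(query)


results = asyncio.run(run_bounded([[1.0, 2.0], [3.0, 4.0]], fake_search, limit=1))
print(results)
```

With the SDK, `search` could be a small wrapper that awaits `client.search(...)` with the database and collection names filled in.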

Context Manager

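import asyncio
from scintirete_sdk import ScintireteAsyncClient, ScintireteClient

# Synchronous context manager
with ScintireteClient("localhost:50051") as client:
    databases = client.list_databases()
    print(databases)

# Asynchronous context manager (`async with` must run inside a coroutine)
async def main():
    async with ScintireteAsyncClient("localhost:50051") as client:
        databases = await client.list_databases()
        print(databases)

asyncio.run(main())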

Configuration

Client Options

from scintirete_sdk import ScintireteClient

client = ScintireteClient(
    address="localhost:50051",
    password="your_password",          # Authentication password
    use_tls=False,                     # Enable TLS/SSL
    default_timeout=30.0,              # Default timeout in seconds
    enable_gzip=True,                  # Enable gRPC compression
    options=[                          # Custom gRPC options
        ("grpc.keepalive_time_ms", 30000),
        ("grpc.max_receive_message_length", 64 * 1024 * 1024),
    ]
)
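Hard-coding the address and password is fine for a demo, but deployments often read them from the environment. The following is a minimal sketch of that idea. The helper `client_kwargs_from_env` and the `SCINTIRETE_*` variable names are assumptions made for illustration and are not defined by the SDK. Only the keyword names (`address`, `password`, `use_tls`, `default_timeout`) come from the example above.

```python
import os
from typing import Any, Dict, Mapping, Optional


def client_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Build ScintireteClient keyword arguments from environment variables.

    The SCINTIRETE_* names are hypothetical; pick whatever fits your deployment.
    """
    env = os.environ if env is None else env
    kwargs: Dict[str, Any] = {
        "address": env.get("SCINTIRETE_ADDRESS", "localhost:50051"),
        "use_tls": env.get("SCINTIRETE_USE_TLS", "false").lower() in {"1", "true", "yes"},
        "default_timeout": float(env.get("SCINTIRETE_TIMEOUT", "30.0")),
    }
    # Only pass a password when one is configured, so the client default applies otherwise.
    password = env.get("SCINTIRETE_PASSWORD")
    if password:
        kwargs["password"] = password
    return kwargs


print(client_kwargs_from_env({"SCINTIRETE_ADDRESS": "db.internal:50051"}))
```

The resulting dictionary could then be unpacked into the constructor, for example `ScintireteClient(**client_kwargs_from_env())`.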

HNSW Configuration

from scintirete_sdk import HnswConfig, DistanceMetric

# Custom HNSW parameters
hnsw_config = HnswConfig(
    m=32,                    # Maximum connections per node
    ef_construction=400      # Candidate list size used while building the index
)

# Create collection with custom HNSW config
client.create_collection(
    "my_db",
    "my_collection", 
    metric_type=DistanceMetric.L2,
    hnsw_config=hnsw_config
)
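In HNSW indexes generally, larger `m` and `ef_construction` values tend to improve search quality at the cost of memory and build time. One way to judge whether a setting is good enough is to compare the IDs returned by an approximate search with an exact brute-force result on a small sample and compute recall. The sketch below is independent of the SDK. `exact_top_k` and `recall_at_k` are hypothetical helper names, and L2 distance is assumed.

```python
import math
from typing import List, Sequence


def exact_top_k(query: Sequence[float], vectors: Sequence[Sequence[float]], k: int) -> List[int]:
    """Indices of the k vectors closest to `query` by L2 distance (brute force)."""
    order = sorted(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))
    return order[:k]


def recall_at_k(approx_ids: Sequence[int], exact_ids: Sequence[int]) -> float:
    """Fraction of the exact neighbours that the approximate search also returned."""
    if not exact_ids:
        raise ValueError("exact_ids must not be empty")
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


sample = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
truth = exact_top_k([0.0, 0.0], sample, k=2)
# Pretend an approximate search returned indices 0 and 3.
print(truth, recall_at_k([0, 3], truth))
```

In practice `approx_ids` would come from the `id` field of the search results, mapped to the same numbering used for the brute-force baseline.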

API Reference

Database Operations

# Create database
success = client.create_database("my_database")

# List databases
databases = client.list_databases()

# Drop database
success, dropped_collections = client.drop_database("my_database")

Collection Operations

# Create collection
info = client.create_collection(
    db_name="my_db",
    collection_name="my_collection",
    metric_type=DistanceMetric.COSINE,
    hnsw_config=HnswConfig(m=16, ef_construction=200)
)

# Get collection info
info = client.get_collection_info("my_db", "my_collection")
print(f"Dimension: {info.dimension}, Vectors: {info.vector_count}")

# List collections
collections = client.list_collections("my_db")

# Drop collection
success, dropped_vectors = client.drop_collection("my_db", "my_collection")

Vector Operations

from scintirete_sdk import Vector

# Insert vectors
vectors = [
    Vector(
        elements=[0.1, 0.2, 0.3, 0.4],
        metadata={"source": "document1", "category": "text"}
    ),
    Vector(
        elements=[0.5, 0.6, 0.7, 0.8],
        metadata={"source": "document2", "category": "image"}
    )
]

inserted_ids, count = client.insert_vectors("my_db", "my_collection", vectors)

# Search vectors
results = client.search(
    db_name="my_db",
    collection_name="my_collection",
    query_vector=[0.1, 0.2, 0.3, 0.4],
    top_k=10,
    ef_search=50,          # Override HNSW search parameter
    include_vector=True    # Include vector data in results
)

# Delete vectors
deleted_count = client.delete_vectors("my_db", "my_collection", [1, 2, 3])
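For large datasets it is usually safer to insert in batches than to send everything in one call, since gRPC messages have size limits (the client options above raise `grpc.max_receive_message_length` for the same reason). This README does not document a batch limit, so the helper below is only a sketch. `batched` is a hypothetical name and the batch size is something to tune.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


rows = [[float(i), float(i + 1)] for i in range(5)]
for batch in batched(rows, 2):
    print(len(batch), batch)
```

Each batch of `Vector` objects would then be passed to `client.insert_vectors("my_db", "my_collection", batch)` as shown above.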

Text Embedding Operations

from scintirete_sdk import TextWithMetadata

# List available embedding models
models, default_model = client.list_embedding_models()
print(f"Default model: {default_model}")

# Embed text
texts = ["Hello world", "Python programming"]
results = client.embed_text(texts, embedding_model="text-embedding-ada-002")

for result in results:
    print(f"Text: {result.text}")
    print(f"Embedding: {result.embedding[:5]}...")  # First 5 dimensions

# Embed and insert
texts_with_metadata = [
    TextWithMetadata(
        text="Natural language processing",
        metadata={"topic": "AI", "difficulty": "advanced"}
    ),
    TextWithMetadata(
        text="Machine learning basics",
        metadata={"topic": "AI", "difficulty": "beginner"}
    )
]

ids, count = client.embed_and_insert(
    "my_db", 
    "my_collection", 
    texts_with_metadata,
    embedding_model="text-embedding-ada-002"
)

# Embed and search
results = client.embed_and_search(
    db_name="my_db",
    collection_name="my_collection",
    query_text="What is machine learning?",
    top_k=5,
    embedding_model="text-embedding-ada-002"
)

Persistence Operations

# Synchronous save
success, message, size, duration = client.save()
print(f"Saved {size} bytes in {duration} seconds")

# Background save
success, message, job_id = client.bg_save()
print(f"Background save job: {job_id}")

Distance Metrics

The SDK supports multiple distance metrics:

from scintirete_sdk import DistanceMetric

# Available metrics
DistanceMetric.L2              # Euclidean distance
DistanceMetric.COSINE          # Cosine similarity  
DistanceMetric.INNER_PRODUCT   # Inner product
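For reference, the textbook definitions behind these three metrics can be written in a few lines of plain Python. This is an illustrative sketch only: how the server turns these quantities into the `distance` field of a search result (for example, whether cosine is reported as a similarity or as `1 - similarity`) is not stated in this README.

```python
import math
from typing import Sequence


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of the two vectors after scaling each to unit length."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return inner_product(a, b) / (norm_a * norm_b)


a = [0.1, 0.2, 0.3]
b = [0.4, 0.5, 0.6]
print(l2_distance(a, b), inner_product(a, b), cosine_similarity(a, b))
```

Cosine ignores vector length, so it suits embeddings where only direction matters. L2 and inner product are both sensitive to magnitude.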

Error Handling

from scintirete_sdk.exceptions import (
    ScintireteError,
    ConnectionError,
    AuthenticationError,
    DatabaseError,
    VectorError
)

try:
    client.create_database("my_db")
except AuthenticationError as e:
    print(f"Authentication failed: {e}")
except ConnectionError as e:
    print(f"Connection failed: {e}")
except DatabaseError as e:
    print(f"Database error: {e}")
except ScintireteError as e:
    print(f"General error: {e}")
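Connection failures are often transient, so callers sometimes wrap an operation in a retry loop with exponential backoff. The sketch below shows one way to do that without depending on the SDK. `call_with_retry` and `TransientFailure` are hypothetical names, and which SDK exceptions are actually safe to retry is an assumption you would need to confirm.

```python
import time
from typing import Callable, List, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(), retrying on `retry_on` errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error to the caller.
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")


class TransientFailure(Exception):
    """Hypothetical stand-in for a retryable error such as the SDK's ConnectionError."""


calls = {"count": 0}


def flaky() -> str:
    """Fail twice, then succeed, to exercise the retry loop."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientFailure("not yet")
    return "ok"


delays: List[float] = []
result = call_with_retry(flaky, retry_on=(TransientFailure,), sleep=delays.append)
print(result, delays)
```

With the SDK, `operation` might be `lambda: client.create_database("my_db")` and `retry_on` might be the `ConnectionError` imported from `scintirete_sdk.exceptions`. Note that this import shadows Python's built-in `ConnectionError` within the module.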

Development

Setup Development Environment

# Clone repository
git clone https://github.com/scintirete/scintirete.git
cd scintirete/sdk/python

# Install development dependencies
make install-dev

# Generate proto files
make gen

Running Tests

# Run unit tests
make test

# Run tests with coverage
make test-cov

# Run integration tests (requires running Scintirete server)
pytest tests/integration/ -m integration

Code Quality

# Format code
make format

# Lint code
make lint

# Type checking
mypy src/

Building and Publishing

# Build package
make build

# Publish to PyPI (requires credentials)
make publish

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Examples

See the examples directory for more comprehensive usage examples.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Changelog

See CHANGELOG.md for a list of changes in each version.
