Releases: memvid/memvid

v2.0.157

15 Feb 19:15

Release Date: February 15, 2026

Overview

This release adds a structured XLSX extraction pipeline with table detection, OOXML metadata parsing, and semantic chunking. It also removes a vulnerable xlsx (SheetJS) dependency from the Node SDK, fixes the CLI deploy pipeline for proprietary crate handling, and includes clippy/lint fixes and documentation updates.


🚀 New Features

Structured XLSX Extraction Pipeline (memvid-core)

  • New XlsxReader::extract_structured() API for high-accuracy spreadsheet extraction
  • Automatic table boundary and header detection via heuristics and OOXML table definitions
  • Row-aligned semantic chunking that never splits rows across chunk boundaries
  • Formats rows as Header: Value | Header: Value pairs for optimal search accuracy
  • OOXML metadata parsing: number formats (dates, currency, percentages), merged cell regions, named table definitions
  • Column type inference (text, integer, float, date, currency, percentage, boolean)
  • Backward-compatible flat text output alongside structured chunks
  • New modules: xlsx_chunker, xlsx_ooxml, xlsx_table_detect
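
The row formatting and row-aligned chunking described above can be illustrated with a small Python sketch. It is not the memvid-core implementation (which is Rust); the function names `format_row` and `chunk_rows`, the `max_chars` budget, and the policy of giving an oversized row its own chunk are all assumptions made for illustration.

```python
from typing import List, Sequence


def format_row(headers: Sequence[str], row: Sequence[object]) -> str:
    """Render one row as 'Header: Value | Header: Value', skipping empty cells."""
    pairs = [
        f"{header}: {value}"
        for header, value in zip(headers, row)
        if value is not None and value != ""
    ]
    return " | ".join(pairs)


def chunk_rows(
    headers: Sequence[str],
    rows: Sequence[Sequence[object]],
    max_chars: int = 200,  # hypothetical size budget per chunk
) -> List[str]:
    """Group formatted rows into chunks without ever splitting a row.

    A row longer than max_chars still becomes a chunk of its own
    (an assumed policy, chosen so that rows stay intact).
    """
    chunks: List[str] = []
    current: List[str] = []
    current_len = 0
    for row in rows:
        line = format_row(headers, row)
        if not line:
            continue  # skip rows with no non-empty cells
        # The extra 1 accounts for the newline joining lines inside a chunk.
        added = len(line) + (1 if current else 0)
        if current and current_len + added > max_chars:
            chunks.append("\n".join(current))
            current, current_len = [], 0
            added = len(line)
        current.append(line)
        current_len += added
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Because the size check happens before a row is appended, a chunk boundary can only fall between rows, never inside one.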

Remove Vulnerable xlsx Dependency — Issue #198

  • Removed the SheetJS xlsx package from @memvid/sdk (CVE-2024-22363, CVE-2023-30533)
  • Production code already used ExcelJS — only example files were updated
  • Downstream users no longer receive Dependabot security alerts from @memvid/sdk

CLI Deploy Fix: Proprietary Crate Handling

  • Made memvid-ghostpack optional in memvid-ask-model and removed it from the workspace members
  • CI builds no longer fail when proprietary crates are absent (.gitignore'd)
  • Ghost model kind returns a clean error when the runtime is unavailable

🐛 Bug Fixes

  • Fixed clippy pedantic lints (implicit_clone, cast_possible_truncation)
  • Fixed dead_code warning for propagate_merged_cells
  • Resolved VecIndexManifest model field lint
  • xlsx_structured tests now gracefully skip on CI when fixture file is absent

v2.0.136

06 Feb 23:47

Release Date: February 6, 2026

Overview

This release adds frame-level ACL (Access Control Lists), vector index model consistency enforcement, symspell data corruption fixes, and several CI/build improvements. It also includes README documentation updates and ONNX Runtime noise suppression on macOS.


🚀 New Features

Frame-Level ACL Enforcement

  • Added ACL (Access Control List) plumbing across search, ask, and replay paths
  • Per-frame access control enables fine-grained permission enforcement on chunks
  • Robustness fixes for ACL boundary conditions
  • New tests and benchmark/example updates for ACL workflows
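
As a rough illustration of per-frame access control, the Python sketch below filters search hits by a caller's scope. The `Frame` type, the `allowed_scopes` field, and the rule that a caller without a scope bypasses filtering are hypothetical; the real types live in the Rust crate and may behave differently.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Frame:
    frame_id: int
    text: str
    # Hypothetical field: None means the frame carries no ACL restriction.
    allowed_scopes: Optional[FrozenSet[str]] = None


def filter_by_acl(frames: List[Frame], caller_scope: Optional[str]) -> List[Frame]:
    """Keep only the frames the caller may see.

    Assumption: a caller_scope of None means no ACL policy is in use,
    so the list is returned without any per-frame checks.
    """
    if caller_scope is None:
        return list(frames)
    return [
        frame
        for frame in frames
        if frame.allowed_scopes is None or caller_scope in frame.allowed_scopes
    ]
```

Applying the same filter on the search, ask, and replay paths would keep the three consistent, which is the kind of plumbing the notes above describe.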

Vector Index Model Consistency (PR #188)

  • Enforces strict binding between vector index and embedding model
  • Prevents silent model mismatch corruption when switching embedding providers
  • Ensures vector search results are always consistent with the model used at index time
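
The idea of binding an index to its embedding model can be sketched in a few lines of Python. This is a toy illustration only: `VectorIndex`, `ModelMismatchError`, and the model names used with it are hypothetical and do not reflect the actual memvid API.

```python
from typing import List, Sequence


class ModelMismatchError(ValueError):
    """Raised when vectors from a different embedding model reach the index."""


class VectorIndex:
    """Toy index that records the embedding model it was created with."""

    def __init__(self, model_name: str, dimension: int) -> None:
        self.model_name = model_name
        self.dimension = dimension
        self.vectors: List[List[float]] = []

    def _check(self, model_name: str, vector: Sequence[float]) -> None:
        # Reject anything that was not produced by the bound model.
        if model_name != self.model_name:
            raise ModelMismatchError(
                f"index was built with {self.model_name!r}, got {model_name!r}"
            )
        if len(vector) != self.dimension:
            raise ModelMismatchError(
                f"expected {self.dimension} dimensions, got {len(vector)}"
            )

    def add(self, vector: Sequence[float], model_name: str) -> None:
        self._check(model_name, vector)
        self.vectors.append(list(vector))

    def search(self, query: Sequence[float], model_name: str) -> int:
        """Return the position of the closest stored vector (squared L2)."""
        self._check(model_name, query)
        if not self.vectors:
            raise LookupError("index is empty")

        def distance(stored: Sequence[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(stored, query))

        return min(range(len(self.vectors)), key=lambda i: distance(self.vectors[i]))
```

Failing loudly on a mismatch is the point: distances between vectors from different models are meaningless, so a silent fallback would return plausible-looking but wrong results.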

SymSpell Cleanup Fix & Dictionary Tooling (PR #187)

  • Fixed symspell_cleanup data corruption bug
  • Added dictionary download tooling for easier setup
  • More reliable spell-correction preprocessing for search queries

OpenAI API Embedding Provider (PR #173)

  • Added OpenAI API as an embedding provider option
  • Enables using OpenAI embeddings alongside local ONNX models
  • Flexible embedding backend selection

🐛 Bug Fixes

ONNX Runtime Stderr Suppression (macOS)

  • Suppressed noisy ONNX Runtime warnings on macOS stderr
  • Cleaner console output during normal operation

CI Build Fixes

  • Added missing #[cfg(feature = "lex")] guards for tantivy-dependent code
  • Fixed CI cache key to use Cargo.toml hash instead of missing Cargo.lock
  • Committed Cargo.lock for reproducible CI builds
  • Moved target-specific deps section after main dependencies
  • Ran cargo fmt on clip.rs and text_embed.rs

Lint Fixes

  • Resolved redundant closure lints in tantivy.rs and search/mod.rs
  • General lint formatting cleanup

📊 Performance & Reliability

  • ACL enforcement: Zero-overhead when no ACL policy is set
  • Model consistency: Prevents silent search quality degradation from model mismatch
  • SymSpell fix: Eliminates data corruption in spell-correction preprocessing

📚 Related Pull Requests

  • #188 — feat: enforce vector index model consistency (@0x-pankaj)
  • #187 — feat: fix symspell_cleanup data corruption and add dictionary tooling (@0x-pankaj)
  • #173 — feat: add OpenAI API embedding provider (@0x-pankaj)
  • Direct push — Frame-level ACL enforcement across search/ask/replay (@Olow304)

🎯 Migration Notes

For Users

  • No breaking changes — all existing .mv2 files remain compatible
  • ACL is opt-in; existing memories work without any ACL configuration
  • Vector model consistency is enforced automatically on new indexes

For Developers

  • New aclScope field available on API keys (nullable, no migration needed)
  • ACL types available in types/acl.rs
  • Embedding model is now strictly bound to vector index at creation time

🙏 Contributors

Thank you to all contributors who made this release possible:

  • @Olow304 — ACL enforcement, CI fixes, lint cleanup
  • @0x-pankaj — Vector model consistency, symspell fix, OpenAI embeddings
  • @sharafdin — Documentation (deprecation notice)
  • @mo-omar-0197 — README updates

v2.0.135

25 Jan 19:30

Release Date: January 25, 2026

Overview

This release includes significant performance improvements, bug fixes, and feature enhancements. Highlights include HNSW vector search implementation, SIMD acceleration, Windows test fixes, and improved query precision with implicit AND operators.

🚀 New Features

HNSW Vector Search Implementation (PR #185)

  • Applied HNSW (Hierarchical Navigable Small World) implementation patch
  • Enables fast approximate nearest neighbor search for large vector indexes
  • Significant performance improvement for vector search operations
  • Better scalability for memories with many vector embeddings

SIMD Acceleration (PR #176)

  • Added SIMD acceleration for vector distance calculations
  • Optimized L2 distance computations using SIMD instructions
  • Faster vector similarity searches
  • Improved performance on modern CPUs with SIMD support

Extraction Cache Improvements (PR #175)

  • Added LRU (Least Recently Used) eviction to extraction cache
  • Better memory management for document extraction
  • Prevents cache from growing unbounded
  • Improved performance for repeated document processing
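
For readers unfamiliar with LRU eviction, here is a minimal Python sketch of a bounded cache. The class name `ExtractionCache` and its `get`/`put` methods are illustrative assumptions; the actual cache in memvid is written in Rust and its interface is not shown in these notes.

```python
from collections import OrderedDict
from typing import Optional


class ExtractionCache:
    """Bounded cache that evicts the least recently used entry when full."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries: "OrderedDict[str, str]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._entries:
            return None
        # A hit makes the entry the most recently used one.
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, key: str, value: str) -> None:
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self.capacity:
            # The front of the OrderedDict holds the least recently used entry.
            self._entries.popitem(last=False)

    def __len__(self) -> int:
        return len(self._entries)
```

The capacity bound is what keeps memory use from growing without limit, while the move-to-end on every hit keeps frequently reprocessed documents resident.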

Query Precision Enhancement (PR #178)

  • Changed implicit query operator from OR to AND for precision
  • Multi-word queries now require all terms to match (implicit AND)
  • More precise search results
  • Explicit OR operator still available when needed
  • Better user experience for targeted searches

🐛 Bug Fixes

Windows Test Fixes (PR #186)

  • Fixed Windows test failures by adding delay for Tantivy file handle release
  • Resolved file locking issues on Windows during test cleanup
  • Tests now pass reliably on Windows platforms
  • Improved cross-platform test stability

Clippy Safety Overhaul (PR #180)

  • Comprehensive safety improvements based on Clippy linter recommendations
  • Fixed potential safety issues across the codebase
  • Improved code quality and maintainability
  • Enhanced memory safety guarantees

📝 Documentation

Internationalization

  • Added Bengali (bn) README translation (PR #182)
  • Added Japanese README translation (PR #177)
  • Improved accessibility for non-English speakers
  • Expanded documentation coverage

Documentation Improvements (PR #181)

  • Added HTML markers to all README files to make updates easier
  • Improved documentation maintenance workflow
  • Better structure for automated documentation updates

🔧 Developer Experience

Build & Development Tools (PR #184)

  • Added a script for adding feature flags, easing the development workflow
  • Streamlined feature flag management
  • Improved developer productivity

📊 Performance Improvements

  • HNSW Implementation: Faster vector search for large indexes
  • SIMD Acceleration: Optimized distance calculations
  • LRU Cache: Better memory utilization
  • Query Precision: More accurate search results

📚 Related Pull Requests

  • #186 - fix(tests): add Windows delay for Tantivy file handle release
  • #185 - feat: apply HNSW implementation patch
  • #184 - Created a script to add flags
  • #182 - docs: add Bengali (bn) README translation
  • #181 - Added HTML markers to all README files to make updates easier
  • #180 - Fix/clippy safety overhaul
  • #178 - Fix: Change implicit query operator from OR to AND for precision
  • #177 - docs(i18n): add Japanese README translation
  • #176 - feat: add SIMD acceleration for vector distance calculations
  • #175 - feat(extract): add LRU eviction to extraction cache

🎯 Migration Notes

For Users

  • No breaking changes in this release
  • All existing .mv2 files remain compatible
  • Query behavior change: Multi-word queries now use implicit AND (more precise)
    • Use explicit OR operator if you need the old behavior
    • Example: "machine learning" now requires both words (was: either word)
    • Example: "machine OR learning" still works for either word
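
The difference between implicit AND and explicit OR can be shown with a simplified Python matcher. This is a sketch of the semantics only, not the Tantivy query parser that memvid uses; the `matches` function and its whitespace tokenization are assumptions for illustration.

```python
def matches(query: str, document: str) -> bool:
    """Return True if the document satisfies the query.

    Terms separated by spaces must all be present (implicit AND).
    An uppercase OR between terms splits the query into alternatives.
    """
    doc_terms = set(document.lower().split())
    # Each OR-separated branch is an implicit AND of its own terms.
    branches = [branch.split() for branch in query.split(" OR ")]
    return any(
        bool(branch) and all(term.lower() in doc_terms for term in branch)
        for branch in branches
    )
```

With this matcher, the query `machine learning` rejects a document that only mentions learning, while `machine OR learning` accepts it, mirroring the two examples above.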

For Developers

  • Windows developers: Test stability improved
  • Performance: Vector search is significantly faster with HNSW
  • Memory: Extraction cache now has bounded memory usage

v2.0.134

16 Jan 17:18

Highlights

Encryption Capsule (.mv2e)

  • Introduced secure file encryption with .mv2e format
  • AES-256-GCM encryption with Argon2id key derivation
  • Lock/unlock files with password protection via lock_file() and unlock_file() APIs
  • Header contains KDF parameters, salt, and nonce for secure decryption

Search Improvements

  • Multi-word queries now default to OR logic for better recall (e.g., "machine learning" finds documents with either term)
  • Fixed parallel segment indexing to properly use search_text field when no_raw=true

SDK & CLI Compatibility

  • Full cross-compatibility between CLI and SDK created .mv2 files
  • Removed extractous from Python SDK default features to avoid native library dependencies

Bug Fixes

  • Fixed lexical search indexing for documents ingested via SDK putMany
  • Resolved Python SDK import error related to missing Tika native library

v2.0.133

10 Jan 22:13

Features

  • Doctor Quiet Mode: Added quiet option to DoctorOptions to suppress debug logs during doctor operations. Useful for SDK integrations where verbose output is unwanted.

Improvements

  • Streaming Encryption Tests: Added comprehensive tests for the streaming encryption feature (PR #117):
    • streaming_encryption_large_file - Tests encrypt/decrypt roundtrip for files >1MB
    • wrong_password_fails_streaming - Verifies password validation in streaming format
    • Confirms reserved[0] == 0x01 marker for streaming format detection

Internal

  • Replaced println! with doctor_log! macro for conditional logging
  • Added thread-local DOCTOR_QUIET flag for clean log suppression
  • Updated all test files to use quiet: true in DoctorOptions

Compatibility

  • Fully backward compatible with existing .mv2 and .mv2e files
  • Streaming encryption (PR #117) auto-detects format via header byte

v2.0.132

09 Jan 21:40

What's New

Ed25519 Ticket Signature Verification

Added cryptographic signature verification for dashboard-issued capacity tickets using Ed25519.

Changes:

  • New signature.rs module for Ed25519 verification
  • Ticket validation in lifecycle management
  • Dashboard public key verification for capacity tickets
  • New ticket types in types/ticket.rs

v2.0.131

05 Jan 01:50

🚀 Memvid 2.0 - Complete Rust Rewrite

Give your AI agents memory in one file.

This release marks a complete rewrite of Memvid from Python to Rust, delivering 10-100x performance improvements and a truly portable single-file memory system.

Highlights

  • Single-file architecture - Everything in one .mv2 file, no databases or sidecars
  • Sub-5ms retrieval - Blazing fast local memory access
  • Multi-modal support - Text, PDF, DOCX, images (CLIP), and audio (Whisper)
  • Hybrid search - BM25 full-text + HNSW vector similarity
  • Time-travel - Query any point in memory history
  • Encryption - Optional password-protected capsules (.mv2e)

Installation

Rust:
[dependencies]
memvid-core = "2.0"

CLI:
npm install -g memvid-cli

SDKs:

  • Node.js: npm install @memvid/sdk
  • Python: pip install memvid-sdk

v0.1.3 - Memvid

05 Jun 15:56

🎉 v0.1.3 Release

🐳 Docker Support for Advanced Codecs

  • Cross-platform H.265/HEVC encoding - No more codec dependency nightmares!
  • Automated Docker container management for non-MP4 codecs
  • Works seamlessly on Windows (WSL), macOS, and Linux
  • Handles all FFmpeg operations in isolated environment

🤖 Multi-LLM Provider Support

  • Added Google Gemini support - Use provider='google' in MemvidChat
  • Added Anthropic Claude support - Use provider='anthropic' in MemvidChat
  • New modular LLMClient class for easy provider management
  • Consistent interface across all LLM providers

⚙️ Enhanced Configuration System

  • Centralized configuration management via config.py
  • Per-codec configuration profiles for optimal compression
  • Flexible FFmpeg parameter customization
  • Support for different video container formats (MP4, MKV, AVI)

✨ New Examples

codec_comparison.py

Compare different video codecs side-by-side:

  • Test H.264, H.265, and MP4V compression ratios
  • Benchmark encoding/decoding performance
  • Find the optimal codec for your use case

file_chat.py

Enhanced document processing and chat:

  • Process entire directories or specific files
  • Configurable chunking parameters
  • Support for PDF, EPUB, HTML, and text files
  • Load and chat with existing memories
  • Graceful FAISS index fallback for small datasets

🔧 Improvements

Better Error Handling

  • Improved error messages for missing dependencies
  • Better handling of codec-specific issues

Configuration Flexibility

  • Customizable chunk sizes and overlap
  • Per-codec video parameters (CRF, preset, profile)
  • Configurable frame rates and sizes

Package Structure

  • Moved LLM providers to optional dependencies: pip install memvid[llm]
  • Added EPUB support as optional: pip install memvid[epub]
  • Core dependencies remain minimal

📦 Installation

# Basic installation
pip install memvid==0.1.3

API Keys

Set your API keys as environment variables:

export OPENAI_API_KEY="sk-..."
export GOOGLE_API_KEY="AIza..."
export ANTHROPIC_API_KEY="sk-ant-..."
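
Before starting a chat session it can help to check which of these keys are actually set. The Python sketch below is a standalone helper, not part of memvid: the function name `configured_providers` is hypothetical, the provider names 'google' and 'anthropic' come from the notes above, and 'openai' is assumed from the OPENAI_API_KEY variable name.

```python
import os
from typing import Dict, List, Mapping

# Maps a provider name to the environment variable holding its key.
# 'openai' is an assumed provider name; the other two appear in these notes.
PROVIDER_ENV_VARS: Dict[str, str] = {
    "openai": "OPENAI_API_KEY",
    "google": "GOOGLE_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def configured_providers(env: Mapping[str, str] = os.environ) -> List[str]:
    """List the providers whose API key is present and non-blank."""
    return [
        name
        for name, variable in PROVIDER_ENV_VARS.items()
        if env.get(variable, "").strip()
    ]
```

Passing a plain dictionary as `env` makes the helper easy to test without touching the real environment.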

🙏 Acknowledgments

Special thanks to our contributors who made this release possible with Docker support, codec testing, and multi-LLM integration!

@TyJK


v0.1.2 - Memvid

28 May 16:48

🎉 Memvid v0.1.2 - Cross-Platform Compatibility

🚀 What's New

🔧 Major Improvements

  • 🌐 Universal Installation: Memvid now uses OpenCV's built-in QR decoder, eliminating all platform-specific installation issues. No more libzbar not found errors on Windows, macOS, or Linux!
  • 📚 Native PDF Support: Added add_pdf() method to MemvidEncoder for direct PDF processing
    encoder = MemvidEncoder()
    encoder.add_pdf("book.pdf") # That's it!
  • 🔌 Flexible Dependencies: Switched from pinned to flexible version requirements, resolving numpy compatibility issues across different Python environments

✨ New Features

  • Added comprehensive PDF book chat example
  • Improved error messages for missing optional dependencies
  • Enhanced documentation with troubleshooting guides

🐛 Bug Fixes

  • Fixed numpy dtype compatibility errors
  • Resolved architecture mismatch issues on Apple Silicon
  • Fixed import errors in mixed Python environments

📖 Documentation

  • Added detailed installation instructions with virtual environment setup
  • Created CONTRIBUTING.md with development guidelines
  • Updated README with badges, use cases, and comparison table
  • Added complete working examples

💡 Upgrading

pip install --upgrade memvid==0.1.2

For PDF support:
pip install PyPDF2

🙏 Thanks

Special thanks to early adopters who reported installation issues. Memvid is now truly plug-and-play across all platforms!