Thanks to visit codestin.com
Credit goes to Github.com

Skip to content

shinecrazy/marisa-ffi

Repository files navigation

Marisa FFI Rust Bindings

Build Status Crates.io Documentation License

This project provides Rust FFI bindings for the libmarisa library, a space-efficient trie data structure implementation.

Features

  • Lookup: Check whether a string exists in the dictionary
  • Reverse lookup: Restore a key from its ID
  • Common prefix search: Find keys from prefixes of a given string
  • Predictive search: Find keys starting with a given string
  • Memory efficient: Significantly more compact than other trie implementations
  • Cross-platform: Works on Linux, macOS, and Windows

Installation

Prerequisites

  • Rust 1.70+
  • CMake 3.22+
  • C++ compiler with C++20 support

From Crates.io

Add this to your Cargo.toml:

[dependencies]
marisa-ffi = "0.3.1"

From Source

  1. Clone the repository:
git clone https://github.com/shinecrazy/marisa-ffi.git
cd marisa-ffi
  1. Build the FFI library:
cargo build --release

Using as a Dependency

Add this to your Cargo.toml:

[dependencies]
marisa-ffi = "0.3.1"

Quick Start

use marisa_ffi::{Trie, Keyset, Agent};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a keyset and add some keys
    let mut keyset = Keyset::new()?;
    keyset.push("hello")?;
    keyset.push("world")?;
    keyset.push("rust")?;

    // Build the trie
    let trie = Trie::build(&keyset)?;

    // Lookup a key
    if let Some(id) = trie.lookup("hello") {
        println!("Found 'hello' with ID: {}", id);
    }

    // Predictive search
    let mut agent = Agent::new()?;
    if trie.predictive_search("h", &mut agent)? {
        println!("Found key: {}", agent.key()?);
    }

    Ok(())
}

API Reference

Trie

The main trie structure for efficient string storage and lookup.

impl Trie {
    /// Create a new empty trie
    pub fn new() -> Result<Self, MarisaError>;
    
    /// Build a trie from a keyset
    pub fn build(keyset: &Keyset) -> Result<Self, MarisaError>;
    
    /// Load a trie from a file
    pub fn load(&mut self, filename: &str) -> Result<(), MarisaError>;
    
    /// Save the trie to a file
    pub fn save(&self, filename: &str) -> Result<(), MarisaError>;
    
    /// Lookup a key and return its ID if found
    pub fn lookup(&self, key: &str) -> Option<u32>;
    
    /// Perform predictive search starting with the given prefix
    pub fn predictive_search(&self, prefix: &str, agent: &mut Agent) -> Result<bool, MarisaError>;
    
    /// Perform reverse lookup to get a key by its ID
    pub fn reverse_lookup(&self, id: u32, agent: &mut Agent) -> Result<(), MarisaError>;
    
    /// Perform common prefix search
    pub fn common_prefix_search(&self, prefix: &str, agent: &mut Agent) -> Result<bool, MarisaError>;
}

Keyset

A collection of keys used to build tries.

impl Keyset {
    /// Create a new empty keyset
    pub fn new() -> Result<Self, MarisaError>;
    
    /// Add a key to the keyset
    pub fn push(&mut self, key: &str) -> Result<(), MarisaError>;
    
    /// Add a key with a specific ID to the keyset
    pub fn push_with_id(&mut self, key: &str, id: u32) -> Result<(), MarisaError>;
    
    /// Reset the keyset
    pub fn reset(&mut self);
}

Agent

An agent for performing search operations and retrieving results.

impl Agent {
    /// Create a new agent
    pub fn new() -> Result<Self, MarisaError>;
    
    /// Get the current key as a string
    pub fn key(&self) -> Result<String, MarisaError>;
    
    /// Get the current key ID
    pub fn id(&self) -> u32;
    
    /// Move to the next result
    pub fn next(&mut self) -> Result<bool, MarisaError>;
}

Examples

Basic Usage

See examples/basic_usage.rs for a comprehensive example demonstrating all features.

Benchmark

See examples/benchmark.rs for performance testing and comparison.

Performance

Marisa trie is significantly more space-efficient than other implementations:

Implementation Size (bytes) Remarks
darts-clone 376,613,888 Compacted double-array trie
tx-trie 127,727,058 LOUDS-based trie
marisa-trie 50,753,560 MARISA trie

Source: English Wikipedia page titles (9,805,576 keys)

Error Handling

All operations return Result types with detailed error information:

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum MarisaError {
    StateError,
    NullError,
    BoundError,
    RangeError,
    CodeError,
    ResetError,
    SizeError,
    MemoryError,
    IoError,
    FormatError,
    KeyNotFound,
    NoResults,
    Utf8Error(std::str::Utf8Error),
}

License

This project is licensed under the same terms as libmarisa:

  • BSD-2-Clause OR LGPL-2.1-or-later

Contributing

Contributions are welcome! Please feel free to submit issues and pull requests.

Acknowledgments

  • s-yata for the original libmarisa implementation
  • The Rust community for excellent FFI tooling and documentation

About

libmarisa rust binding

Resources

License

Contributing

Stars

Watchers

Forks

Packages

No packages published