Vidyut aims to provide performant and high-quality solutions for the common problems that Sanskrit programmers face. Some of these problems include:
- Word generation, or converting bases and suffixes into complete words. (भू → भवति)
- Word lookup, or mapping a complete word back to its bases and suffixes. (भवति → भू)
- Transliteration, or conversion of Sanskrit text from one script to another. (भू → bhū)
- Metrical analysis, or understanding the meter used by a piece of Sanskrit text.
- Sandhi changes, or applying and undoing the sound changes that occur between pieces of Sanskrit text. (चैव → च एव)
- Segmentation, or splitting a piece of Sanskrit text into distinct words. (भवत्येव → भवति एव)
Vidyut compiles to fast and efficient native code, and it can be bound to other programming languages with minimal work. We provide first-class support for Python and are eager to support other bindings as well.
Vidyut is under active development as part of the Ambuda project.
License: MIT
Vidyut is implemented in Rust, which provides low-level control with high-level ergonomics. For your convenience, we also provide first-class support for Python bindings through the vidyut Python package. This section describes how to use Vidyut either through Rust or through Python.
To use Vidyut through Rust, first install Rust on your computer by following the instructions here.
Once you've done so, create a new project with cargo new and install Vidyut's packages:
cargo add vidyut-prakriya
cargo add vidyut-kosha
cargo add vidyut-lipi
# ... and so on
You can also install directly from this repository:
cargo add vidyut-prakriya --git https://github.com/ambuda-org/vidyut.git
cargo add vidyut-kosha --git https://github.com/ambuda-org/vidyut.git
cargo add vidyut-lipi --git https://github.com/ambuda-org/vidyut.git
# ... and so on
We recommend using our pre-built linguistic data, which is available as a ZIP file here.
For more information, see our Rust documentation.
To use Vidyut through Python, first install Python on your computer. There are many ways to do so, but we
recommend installing uv and then running uv init my-project to create a
Python project.
Once your setup is ready, you can install the vidyut package:
# With uv
$ uv add vidyut
# With pip
$ pip install vidyut
You can also install directly from this repository. Doing so compiles the repository from scratch and might take several minutes, so we strongly suggest using our latest PyPI release instead.
# Building from scratch is slow, so we pass `--verbose` to monitor its status.
# With uv
$ uv add "git+https://github.com/ambuda-org/vidyut.git#subdirectory=bindings-python" --verbose
# With pip
$ pip install -e "git+https://github.com/ambuda-org/vidyut.git#egg=vidyut&subdirectory=bindings-python" --verbose
We recommend using our pre-built linguistic data, which is available as a ZIP file here.
For more information, see our Python documentation.
Building from source lets you work with Vidyut as a developer and contributor.
(This setup requires cargo. Confirm that you have cargo installed by running
cargo --version.)
Once you download the repo, you can run cargo test --all to run unit tests.
$ git clone https://github.com/ambuda-org/vidyut.git
$ cd vidyut
$ cargo test --all
(If you install cargo-nextest, you can also run make test for a
nicer testing experience.)
Your first build will likely take a few minutes, but future builds will be much faster.
We recommend using our pre-built linguistic data, which is available as a ZIP file here. Or if you prefer, you can build this data for yourself:
$ cd vidyut-data
$ make create_all_data
Output will be written to data/build/vidyut-latest.
NOTE: this command is resource-intensive and might stall on slower machines.
(This setup requires uv. Confirm that you have uv installed by running
uv --version.)
Once you download the repo, you can run make test in the bindings-python
directory to run Python-specific unit tests:
$ git clone https://github.com/ambuda-org/vidyut.git
$ cd vidyut/bindings-python
$ make test
make test uses a development build, which compiles more quickly but has worse
runtime performance. To create a release build instead, run make release.
Vidyut contains several standard components for common Sanskrit processing tasks. These components work together well, but you can also use them independently depending on your use case.
In Rust, components of this kind are called crates.
vidyut-chandas identifies the meter in some piece of Sanskrit text. This
crate is experimental, and while it is useful for common and basic use cases,
it is not a state-of-the-art solution.
For details, see the vidyut-chandas README.
vidyut-cheda segments Sanskrit expressions into words, then annotates those
words with their morphological data. Our segmenter is optimized for real-time
and interactive usage: it is fast, uses little memory, and handles pathological
input gracefully.
For details, see the vidyut-cheda README.
vidyut-kosha defines a key-value store that can compactly map tens of
millions of Sanskrit words to their inflectional data. Depending on the
application, storage costs can be as low as 1 byte per word. This storage
efficiency comes at the cost of increased lookup time, but in practice, we have
found that this increase is negligible and well worth the efficiency gains
elsewhere.
For details, see the vidyut-kosha README.
vidyut-lipi is a transliteration library for Sanskrit and Pali that also
supports many of the scripts used within the Indosphere. Our goal is to provide
a standard transliterator that is easy to bind to other programming languages.
For details, see the vidyut-lipi README.
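To make the idea of transliteration concrete, the sketch below converts SLP1 text to IAST with a plain lookup table. It is a hypothetical, self-contained illustration and does not use or reflect the vidyut-lipi API: the names SLP1_TO_IAST and slp1_to_iast are invented for this example. A per-character lookup is enough here only because SLP1 assigns exactly one ASCII character to each sound; converting to or from Indic scripts needs considerably more logic.

```python
# Hypothetical sketch, NOT the vidyut-lipi API.
# Maps each SLP1 character that differs from IAST to its IAST spelling.
SLP1_TO_IAST = {
    # Vowels, anusvara, and visarga
    "A": "ā", "I": "ī", "U": "ū", "f": "ṛ", "F": "ṝ", "x": "ḷ", "X": "ḹ",
    "E": "ai", "O": "au", "M": "ṃ", "H": "ḥ",
    # Aspirates and nasals
    "K": "kh", "G": "gh", "N": "ṅ", "C": "ch", "J": "jh", "Y": "ñ",
    # Retroflex stops
    "w": "ṭ", "W": "ṭh", "q": "ḍ", "Q": "ḍh", "R": "ṇ",
    # Remaining aspirates
    "T": "th", "D": "dh", "P": "ph", "B": "bh",
    # Sibilants
    "S": "ś", "z": "ṣ",
}


def slp1_to_iast(text: str) -> str:
    """Convert SLP1 text to IAST one character at a time.

    Characters without an entry in the table (a, i, k, t, spaces, ...)
    are spelled the same way in both schemes and pass through unchanged.
    """
    return "".join(SLP1_TO_IAST.get(ch, ch) for ch in text)
```

For example, slp1_to_iast("BU") returns "bhū", matching the भू → bhū example at the top of this document.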
vidyut-prakriya generates Sanskrit words with their prakriyās (derivations)
according to the rules of Paninian grammar. Our long-term goal is to provide a
complete implementation of the Ashtadhyayi.
For details, see the vidyut-prakriya README.
vidyut-sandhi contains various utilities for working with sandhi changes
between words. It is fast, simple, and appropriate for most use cases.
For details, see the vidyut-sandhi README.
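As a rough illustration of what applying and undoing sandhi involves, here is a minimal Python sketch. It does not use vidyut-sandhi or reflect its API: the rule table and the functions join and split are hypothetical names invented for this example, the input is assumed to be SLP1 text, and only a handful of vowel rules are covered.

```python
# Hypothetical sketch, NOT the vidyut-sandhi API.
# Each rule maps (final sound of the first word, initial sound of the
# second word) to the sound(s) that replace them. All text is SLP1.
SANDHI_RULES = {
    ("a", "e"): "E",   # a + e -> ai, as in ca + eva -> cEva
    ("a", "a"): "A",   # a + a -> A
    ("a", "i"): "e",   # a + i -> e
    ("a", "u"): "o",   # a + u -> o
    ("i", "e"): "ye",  # i + e -> ye, as in Bavati + eva -> Bavatyeva
}


def join(first: str, second: str) -> str:
    """Apply a sandhi rule at the word boundary, if one matches."""
    if first and second:
        combined = SANDHI_RULES.get((first[-1], second[0]))
        if combined is not None:
            return first[:-1] + combined + second[1:]
    # No rule applies: keep the words separate.
    return first + " " + second


def split(text: str) -> list[tuple[str, str]]:
    """Return every (first, second) pair that the rules could have joined into text."""
    candidates = []
    for i in range(len(text)):
        for (final, initial), combined in SANDHI_RULES.items():
            if text.startswith(combined, i):
                first = text[:i] + final
                second = initial + text[i + len(combined):]
                candidates.append((first, second))
    return candidates
```

With these toy rules, join("ca", "eva") gives "cEva" (चैव), and split("cEva") recovers the pair ("ca", "eva"). Splitting "Bavatyeva" yields more than one candidate, which hints at why a segmenter typically also needs a lexicon to choose among them.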
Our Rust documentation is available on docs.rs, and our Python documentation is available on readthedocs.org. You can also build our documentation from scratch:
- (Rust) To view documentation for all crates (including private modules and structs), run make docs from the repository root. This command will generate Rust's standard documentation and open it in your default web browser.
- (Python) To view the latest build of our Python documentation, run make docs from the bindings-python directory. This command will write our Python docs to local HTML files, which you should then open manually.
Thank you for considering a contribution to Vidyut! Vidyut is an ambitious and transformative project, and it can grow only with your help.
For all of the details, see our CONTRIBUTING.md file.
If you're excited about our work on Vidyut, we would love to have you join our community.
- Most of our conversation occurs on Ambuda's Discord server in the #vidyut channel, where you can chat directly with our team and get fast answers to your questions. We also schedule time to spend together virtually, usually weekly.
- Occasional discussion related to Vidyut might also appear on ambuda-discuss or on standard mailing lists like sanskrit-programmers.
- You can also follow along with project announcements on ambuda-announce.
- More technical discussions will appear on our issues page.