Stars
A Build Server Protocol implementation for integrating Xcode with sourcekit-lsp
Develop Swift/iOS projects using VSCode
Swift port of the Misaki G2P (grapheme-to-phoneme) library, which can be used, for example, to generate phonemizations for the Kokoro text-to-speech engine
LibTMCG is a free C++ library for creating secure and verifiable online card games
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
Fully open reproduction of DeepSeek-R1
Python version of the Playwright testing and automation library.
The smallest fully-tested TDD-designed all-essentials-included non-magic zero-dependency minimalist Java web application framework
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more than 50 languages. Top ranker in the CoNLL-18 Shared Task.
Python implementation of pricing analytics and Monte Carlo simulations for stochastic volatility models including log-normal SV model, Heston
Open source software that helps you create and deploy high-frequency crypto trading bots
AutoPhrase: Automated Phrase Mining from Massive Text Corpora
A peer-to-peer blockchain ledger, built with Swift, using Vapor
Resources for learning about Text Mining and Natural Language Processing
FastXML / PFastXML / PFastreXML - Implementation of Extreme Multi-label Classification
The Berkeley Entity Resolution System jointly solves the problems of named entity recognition, coreference resolution, and entity linking with a feature-rich discriminative model.
newspaper3k is a news, full-text, and article metadata extraction library for Python 3. Advanced docs:
The Finnish dependency parsing pipeline being developed by the Turku NLP group. Documentation:
CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
TextTeaser is an automatic summarization algorithm.