Starred repositories
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal domains, for both inference and training.
scikit-learn: machine learning in Python
You like pytorch? You like micrograd? You love tinygrad! ❤️
OCR, layout analysis, reading order, table recognition in 90+ languages
Perform data science on data that remains in someone else's server
Code for the book Data Science from Scratch
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
A Keras implementation of YOLOv3 (TensorFlow backend)
One webpage for every book ever published!
🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity resolution.
Rapid fuzzy string matching in Python using various string metrics
DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
📐 Compute distance between sequences. 30+ algorithms, pure Python implementation, common interface, optional use of external libraries.
Access a database of word frequencies, in various natural languages.
A packaged and flexible version of the CRAFT text detector and Keras CRNN recognition model.
Spearmint is a package to perform Bayesian optimization according to the algorithms outlined in the paper: Practical Bayesian Optimization of Machine Learning Algorithms. Jasper Snoek, Hugo Laroche…
Train to 94% on CIFAR-10 in <6.3 seconds on a single A100. Or ~95.79% in ~110 seconds (or less!)
DeText: A Deep Neural Text Understanding Framework for Ranking and Classification Tasks
🧬 gget enables efficient querying of genomic reference databases
BookNLP, a natural language processing pipeline for books
Python module to plot beautiful and highly customizable genome browser tracks
Extract structured data from ingredient phrases using conditional random fields
brozzler - distributed browser-based web crawler