Stars
[WIP] Resources for AI engineers. Also contains supporting materials for the book AI Engineering (Chip Huyen, 2025)
A fast, effective data attribution method for neural networks in PyTorch
Practical Full-Stack Machine Learning
Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object.
Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
GitHub mirror of M. Zinkevich's "Rules of Machine Learning" style guide, with extra goodness.
The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models"
KeywordScape - Visual Document Exploration using Contextualized Keyword Embeddings
A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB team.
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.
A library to generate LaTeX expressions from Python code.
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Practice C++ by solving well-prepared exercises on different topics
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
1st place solution for RuSimpleSentEval
Fine-tuning ByT5 for sentiment analysis of product recommendations.
Multilingual word vectors in 78 languages
Transformers for Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
Google Research
Port of Google's language-detection library to Python.