- language modelling specialisation
- https://tokenbender.com/
- @tokenbender
Starred repositories
A Collection of Pydantic Models to Abstract IRL
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
lightning implementation of avatarl
NVIDIA curated collection of educational resources related to general purpose GPU programming.
The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores >74% on SWE-bench verified!
Unsupervised text tokenizer focused on computational efficiency
A Tree Search Library with Flexible API for LLM Inference-Time Scaling
🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.
A collection of example AI programs built using DSPy and maintained by the Langtrace AI team.
🚀🚀 [Large models] Train a small 26M-parameter GPT entirely from scratch in just 2 hours! 🌏
🤖 MLE-Agent: Your intelligent companion for seamless AI engineering and research. 🔍 Integrate with arxiv and paper with code to provide better code/research plans 🧰 OpenAI, Anthropic, Gemini, Ollam…
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Python tool for converting files and office documents to Markdown.
awesome synthetic (text) datasets
A reading list on LLM based Synthetic Data Generation 🔥
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
Let your Claude be able to think
PyTorch Implementation of Learning to Prompt (L2P) for Continual Learning @ CVPR22
[WACV 2025] Official implementation of "Online-LoRA: Task-free Online Continual Learning via Low Rank Adaptation" by Xiwen Wei, Guihong Li and Radu Marculescu
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
Small, simple agent task environments for training and evaluation
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA