Stars
12 Lessons to Get Started Building AI Agents
Python tool for converting files and office documents to Markdown.
[Survey] A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
Experimental tool for creating "recipes" to drive automations
Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
A framework for building, orchestrating and deploying AI agents and multi-agent workflows with support for Python and .NET.
Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/
An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.
Tutorials with in-depth explanations on how to finetune small language models
My Arxiv Sanity, a tool that recommends papers to read by crawling arXiv. How does it decide on the most relevant papers? It is powered by GPT...
Pioneering Automated GUI Interaction with Native Agents
Code for paper "Attributing Culture-Conditioned Generations to Pretraining Corpora"
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
https://huyenchip.com/ml-interviews-book/
🤗 smolagents: a barebones library for agents that think in code.
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Audio generation using diffusion models, in PyTorch.
A library for advanced large language model reasoning
WorldGPT: Empowering LLM as Multimodal World Model
M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection