Stars
Production-ready platform for agentic workflow development.
A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
[ACL 2024 Findings] Deep Exploration of Cross-Lingual Zero-Shot Generalization in Instruction Tuning
The official implementation of Self-Play Preference Optimization (SPPO)
🧭 COMPASS: Combinatorial Optimization with Policy Adaptation using Latent Space Search
aider is AI pair programming in your terminal
Schedule-Free Optimization in PyTorch
This is the official code for the published paper 'Solve routing problems with a residual edge-graph attention neural network'
A simple and well-styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8.
Awesome machine learning for combinatorial optimization papers.
OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.
This repository contains the implementation of the paper "Online 3D Bin Packing with Constrained Deep Reinforcement Learning".
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
Accessible large language models via k-bit quantization for PyTorch.
QLoRA: Efficient Finetuning of Quantized LLMs
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Making large AI models cheaper, faster and more accessible
🦁 Lion, new optimizer discovered by Google Brain using genetic algorithms that is purportedly better than Adam(w), in Pytorch
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
glistering96 / llm-course
Forked from mlabonne/llm-course. Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Welcome to the Llama Cookbook! This is your go-to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We also show you how to solve end to end problems using Llama mode…
Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper.
Related papers for reinforcement learning, including classic papers and latest papers in top conferences
Codebase for SEFS: Self-Supervision Enhanced Feature Selection with Correlated Gates
ResiDual: Transformer with Dual Residual Connections, https://arxiv.org/abs/2304.14802
Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways