A simple and well-styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8.

Python 1,176 157 Updated Oct 1, 2024

Thinklab-SJTU / awesome-ml4co

Awesome machine learning for combinatorial optimization papers.

Python 2,045 230 Updated Nov 7, 2025

opendilab / DI-engine

OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.

Python 3,561 421 Updated Dec 7, 2025

alexfrom0815 / Online-3D-BPP-DRL

This repository contains the implementation of the paper "Online 3D Bin Packing with Constrained Deep Reinforcement Learning".

Python 627 90 Updated Nov 17, 2023

NVIDIA / TensorRT-LLM

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

Python 12,448 1,971 Updated Dec 23, 2025

bitsandbytes-foundation / bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.

Python 7,845 804 Updated Dec 12, 2025

artidoro / qlora

QLoRA: Efficient Finetuning of Quantized LLMs

Jupyter Notebook 10,791 869 Updated Jun 10, 2024

huggingface / peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 20,320 2,131 Updated Dec 18, 2025

hpcaitech / ColossalAI

Making large AI models cheaper, faster and more accessible

Python 41,299 4,546 Updated Dec 22, 2025

OpenLMLab / LOMO

LOMO: LOw-Memory Optimization

Python 991 68 Updated Jul 2, 2024

lucidrains / lion-pytorch

🦁 Lion, new optimizer discovered by Google Brain using genetic algorithms that is purportedly better than Adam(w), in Pytorch

Python 2,178 56 Updated Nov 27, 2024

mlabonne / llm-course

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

71,008 8,124 Updated Dec 22, 2025

glistering96 / llm-course

Forked from mlabonne/llm-course

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 1 Updated Jan 4, 2024

glistering96

Lists (1)

Reinforcement learning

Stars

dongjinkun / COPZoo

langgenius / dify

pytorch / rl

ggml-org / llama.cpp

CHLee0801 / KORANI-Instruction-Tuning

uclaml / SPPO

instadeepai / compass

Aider-AI / aider

DopeorNope-Lee / Ko-Fine-tuning_DataGen

facebookresearch / schedule_free

Lei-Kun / DRL-and-graph-neural-network-for-routing-problems

ericyangyu / PPO-for-Beginners

Thinklab-SJTU / awesome-ml4co

opendilab / DI-engine

alexfrom0815 / Online-3D-BPP-DRL

NVIDIA / TensorRT-LLM

bitsandbytes-foundation / bitsandbytes

artidoro / qlora

huggingface / peft

hpcaitech / ColossalAI

OpenLMLab / LOMO

lucidrains / lion-pytorch

mlabonne / llm-course

glistering96 / llm-course

meta-llama / llama-cookbook

leela-zero / leela-zero

yingchengyang / Reinforcement-Learning-Papers

chl8856 / SEFS

microsoft / ResiDual

lucidrains / PaLM-pytorch