sanowl

👽

San sanowl

working on rl

84 followers · 100 following

Cyrion Labs
AUIS
https://sanowl.github.io/

Starred repositories

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python · 14,673 stars · 2,339 forks · Updated Oct 24, 2025

thesis09 / Cartpole-

Python · 1 star · Updated Oct 23, 2025

NJUNLP / AdaR

Python · 12 stars · Updated Oct 15, 2025

liquidmetal-dev / flintlock

Lock, Stock, and Two Smoking MicroVMs. Create and manage the lifecycle of MicroVMs backed by containerd.

Go · 1,151 stars · 52 forks · Updated Sep 22, 2025

ML-GSAI / SMDM

Official PyTorch implementation for ICLR2025 paper "Scaling up Masked Diffusion Models on Text"

Python · 326 stars · 23 forks · Updated Dec 22, 2024

NVlabs / Fast-dLLM

Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"

Python · 590 stars · 51 forks · Updated Oct 23, 2025

ElliottYan / LUFFY

Official Repository of "Learning to Reason under Off-Policy Guidance"

Python · 351 stars · 40 forks · Updated Oct 4, 2025

NVlabs / RLP

RLP: Reinforcement as a Pretraining Objective

192 stars · 13 forks · Updated Oct 5, 2025

agentic-commerce-protocol / agentic-commerce-protocol

The Agentic Commerce Protocol (ACP) is an interaction model and open standard for connecting buyers, their AI agents, and businesses to complete purchases seamlessly. The specification is currently…

798 stars · 90 forks · Updated Oct 3, 2025

Kwai-Klear / CE-GPPO

Forked from Kwai-Klear/KlearReasoner

CE-GPPO: Controlling Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning

Python · 10 stars · Updated Oct 10, 2025

open-sciencelab / SciReason

Python · 53 stars · 4 forks · Updated Oct 9, 2025

wzpscott / hybrid-radiance-fields

[NeurIPS'25] HyRF: Hybrid Radiance Fields for Efficient and High-quality Novel View Synthesis

58 stars · 3 forks · Updated Sep 24, 2025

WenkeHuang / MAPO

MAPO: Mixed Advantage Policy Optimization

Python · 38 stars · Updated Sep 24, 2025

YujunZhou / EVOL-RL

Code for Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation (EVOL-RL).

Python · 39 stars · 4 forks · Updated Oct 16, 2025

apple / ml-fastvlm

This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025

Python · 6,800 stars · 475 forks · Updated May 5, 2025

python-trio / trio

Trio – a friendly Python library for async concurrency and I/O

Python · 6,912 stars · 371 forks · Updated Oct 20, 2025

browser-use / browser-use

🌐 Make websites accessible for AI agents. Automate tasks online with ease.

Python · 71,688 stars · 8,491 forks · Updated Oct 24, 2025

weizhepei / WebAgent-R1

[EMNLP 2025] WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning

Python · 48 stars · 2 forks · Updated Oct 23, 2025

pewdiepie-archdaemon / dionysus

laptop

Shell · 2,550 stars · 83 forks · Updated Sep 1, 2025

typename-yyf / Metis-quantization

Python · 11 stars · 1 fork · Updated Sep 24, 2025

NVIDIA / Isaac-GR00T

NVIDIA Isaac GR00T N1.5 - A Foundation Model for Generalist Robots.

Jupyter Notebook · 5,100 stars · 792 forks · Updated Oct 13, 2025

fengvyi / LeetCode

My C++ solutions for LeetCode questions.

C++ · 148 stars · 41 forks · Updated Jul 23, 2023

MasterVito / SvS

Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training

Python · 39 stars · 3 forks · Updated Aug 25, 2025

cyrionlabs / NaviCore

Forked from browser-use/browser-use

The core CUA for Project Navi

Python · 2 stars · 1 fork · Updated Jun 10, 2025

Lists (3)

🔮 Future ideas

✨ Inspiration

🚀 My stack
