Organizations

@devpytech @startclean @studymath


OpenTinker is an RL-as-a-Service infrastructure for foundation models

Python 397 27 Updated Dec 27, 2025

Streamline on-policy/off-policy distillation workflows in a few lines of code

Python 84 4 Updated Dec 27, 2025

My learning notes for ML SYS.

Python 4,825 310 Updated Dec 24, 2025

Official repository for the paper "LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code"

Python 749 155 Updated Jul 16, 2025

A csshX-like SSH tool for iTerm2

Python 564 68 Updated Nov 4, 2025

A simple plug-in framework that corrects bias and computes confidence intervals in reporting LLM-as-a-judge evaluation, and an adaptive algorithm that efficiently allocates calibration samples to r…

Jupyter Notebook 60 3 Updated Nov 27, 2025

A calm, CLI-native way to semantically grep everything, like code, images, pdfs and more.

TypeScript 2,376 103 Updated Dec 26, 2025

A non-saturating, open-ended environment for evaluating LLMs in Factorio

Python 871 59 Updated Dec 24, 2025
Python 18 1 Updated Dec 25, 2025

[Preprint] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments

Python 162 16 Updated Nov 14, 2025
Python 4,248 461 Updated Jul 31, 2025

A challenging aggregation benchmark for long-context models

Python 13 2 Updated Nov 10, 2025

Archer2.0 evolves from its predecessor by introducing ASPO, which overcomes fundamental PPO-Clip limitations to prevent premature convergence and unlock greater RL potential.

Python 26 2 Updated Oct 10, 2025

Harbor is a framework for running agent evaluations and creating and using RL environments.

Python 251 173 Updated Dec 27, 2025

Easy, safe evaluation of arbitrary Python code

Python 269 47 Updated Dec 15, 2025

A lightweight AI sandbox environment

Go 32 1 Updated Dec 14, 2025

Daytona is a Secure and Elastic Infrastructure for Running AI-Generated Code

TypeScript 40,588 3,296 Updated Dec 25, 2025

Ultrafast serverless GPU inference, sandboxes, and background jobs

Go 1,518 133 Updated Nov 26, 2025

A fully customizable and self-hosted sandboxing solution for AI agent code execution and computer use. It features out-of-the-box support for backtracking, a simple REST API and Python SDK, automat…

Go 729 70 Updated Jun 2, 2025

Python & JS/TS SDK for running AI-generated code/code interpreting in your AI app

MDX 2,147 196 Updated Dec 17, 2025

This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning"

Python 276 27 Updated Nov 24, 2025

Content of Online Encyclopedia of Integer Sequences (OEIS)

113 23 Updated Dec 27, 2025

Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention

Python 33 3 Updated Oct 16, 2025

A framework for the evaluation of autoregressive code generation language models.

Python 1,010 253 Updated Jul 22, 2025

The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores >74% on SWE-bench verified!

Python 2,386 306 Updated Dec 23, 2025

PyTorch-native post-training at scale

Python 577 72 Updated Dec 27, 2025

Official code of "StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs".

Python 74 5 Updated Jun 23, 2025

Codes for the paper "BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping" by Zhiheng Xi et al.

Python 89 5 Updated Oct 25, 2025