A Comprehensive Guide to Running Gensyn RL-Swarm Training
Continuous Training, an Experimental (Advanced) Mode, and GGUF LLM Compatibility on Hugging Face
Our pick is an experimental (advanced) mode for this model: a continuously trained Qwen2.5-Coder-0.5B-Instruct, fine-tuned with the Gensyn RL-Swarm framework using GRPO (Group Relative Policy Optimization) and published in GGUF format (llama.cpp) for enhanced code generation. Note: current training focuses on programming challenges with adaptive weighted sampling.
- Agent ID: Huggingface LLModels
- Training Status: 🟢 LIVE - Model updates automatically every 5-10 minutes
- Auto-Sync GGUF Pipeline Status: 🟢 LIVE - GGUF commits are pushed automatically every hour
- Current Progress: Round 14,854+ / 100,000 (14.85%)
- Framework Version: Gensyn RL-Swarm (CodeZero) v0.7.0
- Contract Judge: SwarmCoordinator v0.4.2
- GGUF Quantization: Multiple quantized formats available (F16, Q3_K_M, Q4_K_M, Q5_K_M, Q6_K)
Note
LATEST PULL: The most recent Gensyn RL-Swarm update is CodeZero (Solvers, Proposers & Evaluators). This setup uses a modified experimental (advanced) configuration so the Gensyn training model runs on low-VRAM GPUs such as the RTX 20xx, 30xx, 40xx and A1xx, A2xx, A4xx series, while staying optimized for swarm training as in the original.
- Proposers: Generate coding problems and unit tests, adjusting difficulty dynamically based on solver performance. Proposers create challenges that adapt to the swarm's current capabilities, ensuring continuous learning opportunities.
- Solvers: Attempt coding challenges, learn locally through RL, and share rollouts with peers. Solvers exchange solutions to promote diversity and accelerate collective learning across the network.
- Evaluators: Frozen models that assess correctness and assign rewards. Evaluators use rule-based assessment to score submissions without executing code, ensuring safety and scalability.
| Requirement | Detail |
|---|---|
| Linux | Ubuntu 20.04 / 22.04 / 24.04 LTS |
| Windows | WSL with Ubuntu |
| CPU | 10 vCores with 12 GB RAM or more |
| GPU VRAM | Minimum 6 GB VRAM, more recommended |
| GPU Series | GTX 1080 (CUDA 12.4-12.8) or any RTX series |
| Storage | ~98-99 GB |
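Before renting or installing anything, you can quickly check whether a host meets the requirements above. This is a minimal sketch using standard Linux tools; the thresholds come from the table:

```bash
# OS release (should be Ubuntu 20.04 / 22.04 / 24.04 LTS)
grep PRETTY_NAME /etc/os-release

# CPU cores and RAM (target: 10 vCores, 12 GB RAM or more)
nproc
free -h

# Free disk space (target: ~98-99 GB)
df -h /

# GPU model, total VRAM and driver; plain `nvidia-smi` also prints the CUDA version in its header
nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv
```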
Note: These specs are only a reference; choose whatever setup suits you. If you rent a cloud GPU, I can share some tips for https://octa.space. I'm not a promoter; my principle is simply to rent at low cost, and it is much cheaper than competing GPU rental services. For tips & tricks, read https://github.com/arcxteam/octa-rental-gpu
Basically, if you rent a GPU with at least 6 GB of VRAM (or 8/12/16/24 GB), you can run other nodes alongside it, because this Gensyn setup uses only about 4-5 GB of VRAM with the modified configuration.
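If you want to confirm the ~4-5 GB VRAM figure on your own machine, you can watch GPU memory while the swarm trains; a small sketch using nvidia-smi's loop mode:

```bash
# Print used vs. total VRAM every 5 seconds while RL-Swarm is running
nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 5
```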
- Adaptive Sampling Strategy
- Adaptive Threshold System
- Data Quality Enhanced Implementation
- Adaptive Reward System: Dynamic quality enhancement and dataset weighting for optimal learning
- Multi-domain Coding: Trained on MBPP and CodeContests datasets with adaptive sampling
The model is trained on a composite dataset with an adaptive weighted sampling strategy:
| Dataset | Initial Weight | Adaptive Range | Focus Area | Source (Hugging Face) |
|---|---|---|---|---|
| MBPP | 5 | 5-6 | Basic Python programming problems with test cases | google-research-datasets/mbpp |
| CodeContests | 5 | 4-5 | Competitive programming challenges | deepmind/code_contests |
For details, see the model on Hugging Face → Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-My-Agent-ID

Models and their roles in the swarm:
- Qwen/Qwen2.5-Coder-0.5B-Instruct → recommended Solver
- deepseek-ai/deepseek-coder-1.3b-instruct → Proposers, Evaluators
- Qwen/Qwen2.5-Coder-1.5B-Instruct → Proposers, Evaluators
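Assuming the run script pulls these models from Hugging Face at startup (the usual behaviour), you can optionally pre-fetch the recommended solver into the local cache so the first run doesn't wait on the download. A hedged sketch using the huggingface_hub CLI:

```bash
# Optional: warm the local Hugging Face cache with the recommended solver model
pip install -U "huggingface_hub[cli]"
huggingface-cli download Qwen/Qwen2.5-Coder-0.5B-Instruct
```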
I. Update System Packages
apt update && apt upgrade -y && \
apt install screen curl ufw nload tree iptables git wget lz4 jq make gcc nano automake autoconf \
htop tmux libgbm1 protobuf-compiler python3 python3-pip python3-venv python3-dev python3-setuptools \
tar clang nethogs ncdu unzip build-essential pkg-config libssl-dev libleveldb-dev \
speedtest-cli ca-certificates libffi-dev libsqlite3-dev -y

II. Install v22 Node.js - Npm - Yarn - Pm2
source <(wget -qO- https://raw.githubusercontent.com/arcxteam/w-ai-wombo/main/nodejs.sh)

III. Clone Repository
git clone https://github.com/arcxteam/rl-swarm.git && cd rl-swarm

I. If running with an OctaSpace GPU, use the default Tmux sessions
- A full, comprehensive guide with tips & tricks is available here: Octaspace
- Quick-1: Always rent with the UBUNTU 22.04/24.04 LTS template; don't use another template
- Quick-2: Always fill in the (Expose HTTP Ports) field with port 3000; this automatically creates a localhost endpoint to access the Gensyn login
- Create a new session: CTRL+B then C
- Go to sessions: tmux attach
- List sessions: CTRL+B then W
- Navigate sessions: scroll ↑ / ↓ or click any session, then press Enter
- Detach and keep it running in the background: CTRL+B then D
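If you prefer typing commands over the shortcuts above, an equivalent tmux workflow looks like this (the session name gensyn is just an example):

```bash
# Start a new named session and run the swarm inside it
tmux new -s gensyn

# Detach (keep it running): press CTRL+B, then D

# List sessions and reattach later
tmux ls
tmux attach -t gensyn
```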
II. If running with another cloud GPU, use Screen sessions
- Create a session
screen -S gensyn
- Detach the session (it keeps running in the background)
CTRL+A then D
- Go back to the session
screen -r gensyn
III. Virtual Python & Running
python3 -m venv .venv
source .venv/bin/activate
# if not worked, then:
. .venv/bin/activate
./run_rl_swarm.sh

I. Install Docker & Compose (if the cloud GPU provider supports it and you use an Ubuntu VM)
curl -sSL https://raw.githubusercontent.com/arcxteam/succinct-prover/refs/heads/main/docker.sh | bash

II. Set up the build on Docker
- CPU
docker compose run --rm --build -Pit swarm-cpu
- GPU
docker compose run --rm --build -Pit swarm-gpu

I. Get API key: https://wandb.ai/authorize
II. Create a new Project, then edit the following (or nano .env)
# Create .env file
cat > .env << 'EOF'
WANDB_MODE=online
WANDB_API_KEY=YOUR_API_KEY
WANDB_ENTITY=YOUR_TEAM_OR_USERNAME
WANDB_PROJECT=YOUR_PROJECT_NAME
EOF

III. Execute script
chmod +x wandb_run_rl_swarm.sh

IV. Running
wandb online
./wandb_run_rl_swarm.sh

- Open a new terminal
- Install localtunnel
npm install -g localtunnel
- The password is your VPS IP address
curl ifconfig.me && echo
- Get URL access
lt --port 3000
Visit the prompted URL and enter your password to access the Gensyn login page.
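Optionally, before or after starting the tunnel, you can confirm from the VPS itself that the Gensyn login UI is listening on port 3000; a small sketch:

```bash
# Should return an HTTP status line once the swarm's login UI is up
curl -sI http://localhost:3000 | head -n 1

# Show which process is bound to port 3000
ss -tlnp | grep ':3000'
```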
| Format | Size | Precision | Use Case | Download |
|---|---|---|---|---|
| Safetensors (BF16) | 988 MB | BF16 | Full precision training/fine-tuning | model.safetensors |
| GGUF F16 | 994 MB | FP16 | High quality inference | Qwen2.5-Coder-0.5B-F16.gguf |
| GGUF Q6_K | 506 MB | 6-bit | High quality compression | Qwen2.5-Coder-0.5B-Q6_K.gguf |
| GGUF Q5_K_M | 420 MB | 5-bit | Balanced quality/size | Qwen2.5-Coder-0.5B-Q5_K_M.gguf |
| GGUF Q4_K_M | 398 MB | 4-bit | Recommended for production | Qwen2.5-Coder-0.5B-Q4_K_M.gguf |
| GGUF Q3_K_M | 355 MB | 3-bit | Smallest, fastest | Qwen2.5-Coder-0.5B-Q3_K_M.gguf |
All GGUF formats are llama.cpp-compatible, ready to use for chat inference, and auto-updated hourly.
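As a quick smoke test, here is a hedged sketch of local inference with llama.cpp's llama-cli, assuming llama.cpp is already built and on your PATH; replace the repository placeholder with your own agent's Hugging Face repo ID from the link above:

```bash
# Download one quant (Q4_K_M is the recommended production option in the table above)
REPO_ID="<your-username>/Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-<your-agent-id>"
huggingface-cli download "$REPO_ID" Qwen2.5-Coder-0.5B-Q4_K_M.gguf --local-dir .

# Run a single prompt through the quantized model
llama-cli -m Qwen2.5-Coder-0.5B-Q4_K_M.gguf \
  -p "Write a Python function that returns the n-th Fibonacci number." \
  -n 256
```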
- Gensyn Documentation: https://docs.gensyn.ai/
- Gensyn GitHub: https://github.com/gensyn-ai
- RL-Swarm Contracts: https://github.com/gensyn-ai/rl-swarm-contracts
- Qwen2.5-Coder Model Card: https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B-Instruct
- MBPP Dataset: https://huggingface.co/datasets/google-research-datasets/mbpp
- CodeContests Dataset: https://huggingface.co/datasets/deepmind/code_contests
- arXiv:1910.09700: ML Carbon Emissions methodology
- Community: Gensyn Discord