- Beijing, China
- https://yangwenbo.com
Starred repositories
CUGA is an open-source generalist agent for the enterprise, supporting complex task execution on the web and APIs, OpenAPI/MCP integrations, composable architecture, reasoning modes, and policy-aware f…
A TUI-based utility for real-time monitoring of InfiniBand traffic and performance metrics on the local node
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
A modern replacement for Redis and Memcached
Trainable fast and memory-efficient sparse attention
A Datacenter Scale Distributed Inference Serving Framework
A tool to configure, launch and manage your machine learning experiments.
Scalable toolkit for efficient model reinforcement
PyTorch Distributed native training library for LLMs/VLMs with OOTB Hugging Face support
HuggingFace conversion and training library for Megatron-based models
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Delivers efficient, stable, and secure data distribution and acceleration powered by P2P technology, with an optional content‑addressable filesystem that accelerates OCI container launch.
Qwen-Image-Lightning: Speed up Qwen-Image model with distillation
The official repo of Pai-Megatron-Patch for large-scale LLM & VLM training, developed by Alibaba Cloud.
LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using the RAG paradigm.
Search-R1: An efficient, scalable RL training framework for reasoning and search engine calling interleaved LLMs, based on veRL
[ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization"
[ICLR 2025, ICML 2025, NeurIPS 2025 Spotlight] Quantized attention that achieves a 2-5x speedup over FlashAttention without losing end-to-end metrics across language, image, and video models.
A web-based 3D CAD application for online model design and editing
All-in-one project management tool for efficient teams
slime is an LLM post-training framework for RL Scaling.
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models, including LLMs, VLMs, and video generation models.
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
A Python program with a tkinter UI that organizes photos into folders based on the time they were taken.
aacostadiaz / cutlass-fork
Forked from intel/sycl-tla. CUDA Templates for Linear Algebra Subroutines.
SGLang is a fast serving framework for large language models and vision language models.