DaoCloud
- Shanghai
Starred repositories
Offline optimization of your disaggregated Dynamo graph
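Personal notes from self-studying core computer science courses, mainly the four fundamentals commonly known as "408": data structures and algorithms, operating systems, computer networks, and computer organization. Study materials come from the Wangdao (王道) course; the illustrations in the notes were compiled by the author.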
Glances an Eye on your system. A top/htop alternative for GNU/Linux, BSD, Mac OS and Windows operating systems.
System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge
A collection of awesome readme templates to display on your profile
agent-sandbox enables easy management of isolated, stateful, singleton workloads, ideal for use cases like AI agent runtimes.
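AIInfra (AI infrastructure) refers to the AI system stack, from underlying hardware such as chips up to the upper-level software stack, that supports training and inference of large AI models.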
A command-line interface tool for serving LLM using vLLM.
Open Model Engine (OME) — Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, TensorRT-LLM, and Triton
Next Generation Agentic Proxy for AI Agents and MCP servers
llm-d helm charts and deployment examples
A GitHub Action to lint and test Helm charts
Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes
Cost-efficient and pluggable Infrastructure components for GenAI inference
GenAI inference performance benchmarking tool
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
The Cloud-Native API Gateway and AI Gateway
Gateway API Inference Extension
Repository for the next iteration of composite service (e.g. Ingress) and load balancing APIs.
Achieve state of the art inference performance with modern accelerators on Kubernetes
Generative AI Examples is a collection of GenAI examples such as ChatQnA, Copilot, which illustrate the pipeline capabilities of the Open Platform for Enterprise AI (OPEA) project.
Simplified Data Management and Sharing for Kubernetes