nicole-lihui
  • DaoCloud
  • Shanghai

Organizations

@istio @merbridge @pluma-tools @knoway-dev

Starred repositories

Offline optimization of your disaggregated Dynamo graph

Python 150 50 Updated Jan 17, 2026

A self-study guide to computer science

HTML 70,744 7,804 Updated Jan 8, 2026

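My notes from self-studying the core computer science courses, mainly the four fundamentals commonly known as "408": data structures and algorithms, operating systems, computer networks, and computer organization. The study material comes from the Wangdao course; the illustrations in the notes are my own.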

C 432 38 Updated Jul 13, 2025

Glances an Eye on your system. A top/htop alternative for GNU/Linux, BSD, Mac OS and Windows operating systems.

Python 31,338 1,667 Updated Jan 17, 2026

System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge

Go 2,862 453 Updated Jan 18, 2026

A collection of awesome readme templates to display on your profile

JavaScript 11,134 7,340 Updated Aug 8, 2024

agent-sandbox enables easy management of isolated, stateful, singleton workloads, ideal for use cases like AI agent runtimes.

Go 708 84 Updated Jan 17, 2026

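AIInfra (AI infrastructure) covers the AI system stack, from low-level hardware such as chips up to the upper software layers that support training and inference of large AI models.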

Jupyter Notebook 5,762 796 Updated Dec 22, 2025

A command-line interface tool for serving LLM using vLLM.

Python 462 25 Updated Dec 3, 2025

Unified Collective Communication Library

C 286 127 Updated Jan 16, 2026

Open Model Engine (OME) — Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, TensorRT-LLM, and Triton

Go 356 54 Updated Jan 17, 2026

Next Generation Agentic Proxy for AI Agents and MCP servers

Rust 1,584 254 Updated Jan 17, 2026

Python 3 1 Updated Dec 16, 2025

llm-d helm charts and deployment examples

Shell 48 53 Updated Dec 13, 2025

A GitHub Action to lint and test Helm charts

Shell 286 81 Updated Nov 26, 2025

Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes

Go 5,012 1,343 Updated Jan 17, 2026

📚 A from-scratch tutorial on the principles and practice of large language models

Jupyter Notebook 24,438 2,241 Updated Jan 3, 2026

Cost-efficient and pluggable Infrastructure components for GenAI inference

Go 4,532 516 Updated Jan 18, 2026

GenAI inference performance benchmarking tool

Python 141 61 Updated Jan 15, 2026

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Python 13,169 881 Updated Dec 17, 2024

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Python 1,893 118 Updated Jan 21, 2024

batched loras

Python 348 16 Updated Sep 6, 2023

The Cloud-Native API Gateway and AI Gateway

Go 5,244 640 Updated Jan 16, 2026

Gateway API Inference Extension

Go 566 221 Updated Jan 16, 2026

Repository for the next iteration of composite service (e.g. Ingress) and load balancing APIs.

Go 2,575 654 Updated Jan 16, 2026

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 2,364 295 Updated Jan 17, 2026

Generative AI Examples is a collection of GenAI examples such as ChatQnA, Copilot, which illustrate the pipeline capabilities of the Open Platform for Enterprise AI (OPEA) project.

Shell 714 332 Updated Jan 15, 2026

Simplified Data Management and Sharing for Kubernetes

Go 17 4 Updated Jan 15, 2026