Popular repositories
-
cline-chinese
Public, forked from HybridTalentComputing/cline-chinese
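A Chinese-localized version of Cline. Cline is an autonomous coding assistant that runs in your IDE and, with your permission, can create and edit files, run commands, use the browser, and more.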
TypeScript
-
lktransformers
Public, forked from guqiong96/lktransformers
The complete NUMA-optimized branch of the ktransformers project
Python
-
Lvllm
Public, forked from guqiong96/Lvllm
LvLLM is a special NUMA extension of vLLM that makes full use of CPU and memory resources, reduces GPU memory requirements, and features an efficient GPU parallel and NUMA parallel architecture, su…
Python
-
fastllm
Public, forked from ztxz16/fastllm
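fastllm is a high-performance large-model inference library with no backend dependencies. It supports both tensor-parallel inference for dense models and hybrid-mode inference for MoE models, and any GPU with 10 GB or more of VRAM can run the full-size DeepSeek model. On a dual-socket 9004/9005 server with a single GPU, the original full-size, full-precision DeepSeek model reaches 20 tps at single concurrency; the INT4-quantized model reaches 30 tps at single concurrency and 60+ tps with multiple concurrent requests.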
C++
-
ktransformers
Public, forked from kvcache-ai/ktransformers
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Python