Stars
FlashInfer: Kernel Library for LLM Serving
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
Intelligent automation and multi-agent orchestration for Claude Code
Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
A high-performance, extensible Python AOT compiler.
FlashMLA: Efficient Multi-head Latent Attention Kernels
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
PyTorch Tutorial for Deep Learning Researchers
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Havenask is a large-scale distributed information search system widely used within Alibaba Group
Dapr is a portable runtime for building distributed applications across cloud and edge, combining event-driven architecture with workflow orchestration.
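A curated collection of learning materials on machine learning, NLP, image recognition, deep learning, and other areas of artificial intelligence, plus compiled resources on the architecture and algorithms of search, recommendation, and advertising systems. Includes a collection of notes from algorithm experts.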
12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
Extremely fast Query Engine for DataFrames, written in Rust
A curated list of awesome parallel computing resources
C++ multi-dimensional labeled arrays and dataframe based on xtensor
C++ DataFrame for statistical, financial, and ML analysis in modern C++
oneAPI Threading Building Blocks (oneTBB)