Stars
PyTorch native quantization and sparsity for training and inference
MooreThreads / tilelang_musa
Forked from tile-ai/tilelang
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
torch_musa is an open-source repository based on PyTorch that makes full use of the computing power of MooreThreads graphics cards.
Hands-on Notes Using OpenClaw on the MT AIBOOK
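PyTorch Practical Tutorial (《Pytorch实用教程》, 2nd edition). Whether you are starting from scratch, applying PyTorch to CV, NLP, or LLM projects, or moving on to production engineering and deployment, it is all covered here. With the help of this book, readers should be able to master PyTorch with ease and become excellent deep learning engineers.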
a static analytical model for LLM distributed training
Flash Attention in ~100 lines of CUDA (forward pass only)
PaddlePaddle High Performance Deep Learning Inference Engine for Mobile and Edge
library to read/write .npy and .npz files in C/C++
How to optimize some algorithms in CUDA.
On-device AI across mobile, embedded and edge for PyTorch
Python package built to ease deep learning on graphs, on top of existing DL frameworks.
Universal LLM Deployment Engine with ML Compilation
Kaldi-compatible online fbank extractor without external dependencies
Open source AUTOSAR classic platform forked from the Arctic Core
High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
A PyTorch implementation of Faster R-CNN that can be trained on data in the VOC dataset format.
Regrouping all neural networks for Kalray Neural Networks applications
A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks …
PP-OCRv5 (det, cls, rec) ONNX/axmodel inference pipeline
A tool to modify ONNX models visually, based on Netron and Flask.