yujiongzhang

PyTorch native quantization and sparsity for training and inference

Python 2,685 429 Updated Feb 14, 2026

VeriSilicon Tensor Interface Module

C 247 87 Updated Jan 22, 2026

Personal homepage of the Advanced Compiler Laboratory

C++ 198 21 Updated Oct 15, 2025

Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

C++ 43 3 Updated Jan 30, 2026

torch_musa is an open source repository based on PyTorch, which can make full use of the super computing power of MooreThreads graphics cards.

Python 478 34 Updated Feb 6, 2026

Hands-on Notes Using OpenClaw on the MT AIBOOK

1 1 Updated Feb 11, 2026

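PyTorch Practical Tutorial (2nd edition): whether you are starting from scratch, applying PyTorch to CV, NLP, or LLM projects, or moving on to advanced engineering and deployment, it is all covered here. With this book's help, readers should be able to master PyTorch with ease and become excellent deep learning engineers.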

Jupyter Notebook 4,409 477 Updated Jan 27, 2025

A static analytical model for LLM distributed training

Python 116 15 Updated Jan 8, 2026

Flash Attention in ~100 lines of CUDA (forward pass only)

Cuda 1,070 109 Updated Dec 30, 2024

PaddlePaddle High Performance Deep Learning Inference Engine for Mobile and Edge

C++ 7,225 1,628 Updated May 22, 2025

Library to read/write .npy and .npz files in C/C++

C++ 1,462 329 Updated Jan 18, 2023

How to optimize some algorithms in CUDA.

Cuda 2,819 255 Updated Feb 12, 2026

On-device AI across mobile, embedded and edge for PyTorch

Python 4,265 838 Updated Feb 14, 2026

Python package built to ease deep learning on graphs, on top of existing DL frameworks.

Python 14,229 3,060 Updated Jul 31, 2025

Common in-memory tensor structure

C++ 1,168 158 Updated Jan 26, 2026

Universal LLM Deployment Engine with ML Compilation

Python 22,037 1,934 Updated Feb 13, 2026

Kaldi-compatible online fbank extractor without external dependencies

C++ 141 35 Updated Oct 9, 2025

A collection of compiler learning resources.

Python 2,678 364 Updated Mar 19, 2025

Open source AUTOSAR classic platform forked from the Arctic Core

C 623 376 Updated Aug 6, 2024

My sample code for Linux

C 122 90 Updated Oct 10, 2020

High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.

Python 2,896 314 Updated Feb 14, 2026

Open Machine Learning Compiler Framework

Python 13,117 3,787 Updated Feb 14, 2026

A PyTorch implementation of Faster R-CNN that can be trained on data in the VOC dataset format.

Python 1,819 365 Updated Oct 3, 2023

Regrouping all neural networks for Kalray Neural Networks applications

Jupyter Notebook 12 2 Updated Aug 1, 2025

A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks …

Python 1,981 274 Updated Feb 14, 2026

OpenVX sample implementation

C 148 51 Updated Feb 17, 2024

Khronos OpenVX Tutorial Material

C++ 247 84 Updated Aug 6, 2021

ppocrv5 (det, cls, rec) ONNX/axmodel inference pipeline

Python 8 2 Updated Jul 10, 2025

A tool to modify ONNX models visually, based on Netron and Flask.

JavaScript 1,612 198 Updated Nov 19, 2025

A UDP/TCP assistant (network debugging assistant).

C++ 292 120 Updated Jan 8, 2021