mengniwang95's repositories

Advanced Quantization Algorithm for LLMs and VLMs, with support for CPU, Intel GPU, CUDA and HPU.

Python · 677 stars · 57 forks · Updated Oct 24, 2025
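This entry covers weight-only quantization for LLMs. A rough sketch of how such a flow is typically driven from Python follows; the `AutoRound` class name and its arguments are assumptions based on the project's documented usage pattern, not a verified API:

```python
# Rough sketch of 4-bit weight-only quantization with an AutoRound-style API;
# class and argument names are assumptions and may differ from the real library.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound  # assumed import path

model_name = "facebook/opt-125m"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Tune the rounding of weights on a small calibration set, then export.
autoround = AutoRound(model, tokenizer, bits=4, group_size=128)
autoround.quantize()
autoround.save_quantized("./opt-125m-4bit")  # hypothetical output directory
```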

Model compression for ONNX

Python · 97 stars · 9 forks · Updated Nov 18, 2024

Easy and lightning-fast training of 🤗 Transformers on the Habana Gaudi processor (HPU)

Python · 199 stars · 265 forks · Updated Oct 24, 2025

Intel staging area for llvm.org contributions. Home for Intel LLVM-based projects.

LLVM · 1,384 stars · 791 forks · Updated Oct 25, 2025

Accessible large language models via k-bit quantization for PyTorch.

Python · 7,684 stars · 791 forks · Updated Oct 22, 2025
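In practice the library is most often reached through its 🤗 Transformers integration; a minimal sketch, assuming a CUDA-capable GPU and the small `facebook/opt-350m` checkpoint as a stand-in model:

```python
# Minimal sketch: load a causal LM with 4-bit NF4 weights via the
# bitsandbytes integration in Transformers (requires a CUDA GPU).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",
    quantization_config=bnb_config,
    device_map="auto",
)
```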

🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM, and Sentence Transformers with easy-to-use hardware optimization tools

Python · 3,125 stars · 598 forks · Updated Oct 9, 2025
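A minimal sketch of the ONNX Runtime path in Optimum, assuming a small sentiment checkpoint; the `export=True` argument follows recent releases and may differ in older versions:

```python
# Sketch: export a Transformers checkpoint to ONNX and run it with
# ONNX Runtime through Optimum's ORTModel classes.
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model_id = "distilbert-base-uncased-finetuned-sst-2-english"
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

clf = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(clf("Hardware-aware optimization made this noticeably faster."))
```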

Examples of using ONNX Runtime for machine learning inference.

C++ · 1,516 stars · 394 forks · Updated Oct 17, 2025

⚡ Build your chatbot within minutes on your favorite device; apply SOTA compression techniques to LLMs; run LLMs efficiently on Intel platforms ⚡

Python · 2,164 stars · 215 forks · Updated Oct 8, 2024

Common utilities for ONNX converters

Python · 282 stars · 68 forks · Updated Sep 4, 2025
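One commonly used utility from this package is float16 down-conversion; a minimal sketch, assuming an existing `model.onnx` file on disk:

```python
# Sketch: convert an ONNX model's float32 tensors to float16 with the
# float16 helper shipped in onnxconverter-common.
import onnx
from onnxconverter_common import float16

model_fp32 = onnx.load("model.onnx")  # assumed input file
model_fp16 = float16.convert_float_to_float16(model_fp32)
onnx.save(model_fp16, "model_fp16.onnx")
```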

ONNXMLTools enables conversion of models to ONNX

Python · 1,118 stars · 205 forks · Updated Jun 10, 2025
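A minimal sketch of one such conversion, assuming scikit-learn and the skl2onnx backend are installed:

```python
# Sketch: convert a scikit-learn classifier to ONNX with onnxmltools.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
import onnxmltools
from onnxmltools.convert.common.data_types import FloatTensorType

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(n_estimators=10).fit(X, y)

# Declare the input signature (batches of 4 float features), then convert.
onnx_model = onnxmltools.convert_sklearn(
    clf, initial_types=[("input", FloatTensorType([None, 4]))]
)
onnxmltools.utils.save_model(onnx_model, "rf_iris.onnx")
```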

Sandbox for training deep learning networks

Python · 3,017 stars · 558 forks · Updated Sep 6, 2024

Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs.

Python · 2,161 stars · 253 forks · Updated Oct 24, 2025

Intel® Performance Counter Monitor (Intel® PCM)

C++ · 3,124 stars · 505 forks · Updated Oct 3, 2025

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python · 94,240 stars · 25,667 forks · Updated Oct 25, 2025
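The one-line description names the two core primitives; a minimal sketch of both:

```python
# Sketch: tensors plus define-by-run autograd, the building blocks of PyTorch.
import torch

x = torch.randn(3, 3, requires_grad=True)
w = torch.randn(3, 3, requires_grad=True)

loss = (x @ w).sum()   # the graph is recorded dynamically as ops execute
loss.backward()        # reverse-mode autodiff fills in .grad

print(w.grad.shape)    # torch.Size([3, 3])
```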

oneAPI Deep Neural Network Library (oneDNN)

C++ · 3,901 stars · 1,078 forks · Updated Oct 25, 2025

A JIT assembler for x86/x64 architectures supporting FPU, MMX, SSE (1-4), AVX (1-2, 512), APX, and AVX10.2

C++ · 2,194 stars · 293 forks · Updated Sep 2, 2025

Open standard for machine learning interoperability

Python · 19,770 stars · 3,815 forks · Updated Oct 25, 2025
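At its core the standard is a protobuf graph format plus a checker; a minimal sketch that builds a one-node graph (y = x + x) by hand and validates it:

```python
# Sketch: construct and validate a tiny ONNX graph with the helper API.
import onnx
from onnx import helper, TensorProto

node = helper.make_node("Add", inputs=["x", "x"], outputs=["y"])
graph = helper.make_graph(
    [node],
    "tiny_add",
    inputs=[helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 4])],
    outputs=[helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 4])],
)
model = helper.make_model(graph)

onnx.checker.check_model(model)   # structural validation
onnx.save(model, "tiny_add.onnx")
```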

Intel® AI Reference Models: contains Intel optimizations for running deep learning workloads on Intel® Xeon® Scalable processors and Intel® Data Center GPUs

Python · 719 stars · 225 forks · Updated Sep 30, 2025

ONNX Runtime: a cross-platform, high-performance ML inference and training accelerator

C++ · 18,188 stars · 3,513 forks · Updated Oct 25, 2025
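Although the runtime itself is C++, the Python bindings are the usual entry point; a minimal sketch, reusing the `tiny_add.onnx` file built in the ONNX sketch above:

```python
# Sketch: run an ONNX model with ONNX Runtime's Python API on CPU.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("tiny_add.onnx", providers=["CPUExecutionProvider"])
x = np.random.rand(1, 4).astype(np.float32)
(y,) = sess.run(None, {"x": x})   # None = return all outputs
print(np.allclose(y, x + x))      # True
```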

Inference of quantization aware trained networks using TensorRT

Python · 83 stars · 18 forks · Updated Jan 27, 2023

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

C++ · 5,535 stars · 648 forks · Updated Oct 24, 2025
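A minimal sketch of DALI's Python pipeline API, assuming an image folder under `./data` and a CUDA device; operator names follow the 1.x `fn` API:

```python
# Sketch: a DALI pipeline that reads JPEGs on the CPU and decodes/resizes
# them on the GPU ("mixed" device), ready to feed a training loop.
from nvidia.dali import pipeline_def, fn

@pipeline_def
def image_pipeline():
    jpegs, labels = fn.readers.file(file_root="./data")   # assumed data layout
    images = fn.decoders.image(jpegs, device="mixed")
    images = fn.resize(images, resize_x=224, resize_y=224)
    return images, labels

pipe = image_pipeline(batch_size=32, num_threads=4, device_id=0)
pipe.build()
images, labels = pipe.run()
```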

Reference implementations of MLPerf® inference benchmarks

Python · 1,479 stars · 581 forks · Updated Oct 25, 2025

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Python · 2,514 stars · 282 forks · Updated Oct 24, 2025
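A rough sketch of post-training static INT8 quantization of a small PyTorch model, following the 2.x-style API; class and argument names may differ between releases:

```python
# Rough sketch: post-training static quantization with Neural Compressor;
# API names follow the 2.x releases and are not guaranteed for other versions.
import torch
from torch.utils.data import DataLoader, TensorDataset
from neural_compressor import PostTrainingQuantConfig, quantization

fp32_model = torch.nn.Sequential(
    torch.nn.Linear(8, 16), torch.nn.ReLU(), torch.nn.Linear(16, 2)
)
calib_data = TensorDataset(torch.randn(64, 8), torch.zeros(64, dtype=torch.long))
calib_loader = DataLoader(calib_data, batch_size=8)

conf = PostTrainingQuantConfig(approach="static")
q_model = quantization.fit(fp32_model, conf, calib_dataloader=calib_loader)
q_model.save("./int8_model")  # hypothetical output directory
```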

Pre-trained Deep Learning models and demos (high quality and extremely fast)

Python · 4,308 stars · 1,397 forks · Updated Oct 16, 2025