
Organizations

@PrunaAI

Starred repositories

Courses on building, compressing, evaluating, and deploying efficient AI models.

Jupyter Notebook 65 5 Updated Nov 10, 2025
Python 52 3 Updated Nov 6, 2025

Questions and answers for the German citizenship test (Einbürgerungstest) in Anki format.

22 3 Updated Jul 20, 2025

Implementation of our unlearning method "Partial Model Collapse" introduced in the paper: "Model Collapse Is Not a Bug but a Feature in Machine Unlearning for LLMs" (Preprint).

Python 27 Updated Jan 4, 2026

Download and Compile Any Diffusion Models in your Endpoint

Python 7 Updated Aug 21, 2025

A curated list of the best software pricing pages and useful resources for pricing research

15 2 Updated Nov 28, 2025

TabBench is a benchmark built to evaluate machine learning models on tabular data, focusing on real-world industry use cases.

Jupyter Notebook 108 1 Updated Sep 29, 2025

📚 Collection of token-level model compression resources.

189 8 Updated Sep 3, 2025

Pruna is a model optimization framework built for developers, enabling you to deliver faster, more efficient models with minimal overhead.

Python 1,075 77 Updated Jan 22, 2026
Python 6 Updated Apr 7, 2025

A curated list of materials on AI efficiency

203 19 Updated Dec 14, 2025

This is a ComfyUI node that integrates pruna

Python 65 3 Updated Sep 8, 2025

This repository describes how to use pruna with tritonserver

Python 7 Updated May 28, 2025

Collection of diffusion model papers categorized by subarea

2,124 98 Updated Jan 23, 2026

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda 9,461 931 Updated Jan 18, 2026

📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉

Python 504 25 Updated Jan 18, 2026

A list of awesome compiler projects and papers for tensor computation and deep learning.

2,719 324 Updated Oct 19, 2024

Official Repository for the ICLR 2022 paper "Generalization of Neural Combinatorial Solvers through the Lens of Adversarial Robustness"

Jupyter Notebook 14 1 Updated Nov 20, 2022

Official Implementation of the Paper "MAGNet: Motif-Agnostic Generation of Molecules from Shapes"

Python 15 Updated Nov 25, 2023

Code for Winning the Lottery Ahead of Time: Efficient Early Network Pruning (ICML 2022)

Python 30 3 Updated Nov 15, 2023

A massively parallel, high-level programming language

Rust 19,144 469 Updated Jun 3, 2025

Examples and guides for using the OpenAI API

Jupyter Notebook 71,101 11,895 Updated Jan 22, 2026

Open source implementation and models of One-step Diffusion with Distribution Matching Distillation

Python 180 14 Updated May 26, 2024

Awesome LLM compression research papers and tools.

1,761 117 Updated Nov 10, 2025

High-speed Large Language Model Serving for Local Deployment

C++ 8,591 477 Updated Aug 2, 2025

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 4,935 337 Updated Jan 18, 2026

Mac app for crushing tech interviews with AI

Swift 4,265 304 Updated Jan 14, 2025

A list of papers, docs, codes about efficient AIGC. This repo is aimed to provide the info for efficient AIGC research, including language and vision, we are continuously improving the project. Wel…

204 10 Updated Feb 10, 2025

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

Python 12,718 2,037 Updated Jan 24, 2026