Top 23 Python GPU Projects
-
Pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Project mention: The bug that taught me more about PyTorch than years of using it | news.ycombinator.com | 2025-10-26
He's not a core maintainer and hasn't been for years - pytorch's contributors are completely public
https://github.com/pytorch/pytorch/graphs/contributors
-
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
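In practice that optimization is driven by a JSON config passed to the engine; a minimal sketch of a ZeRO stage-2 setup follows (the specific values are illustrative choices, not recommendations):

```python
# Minimal DeepSpeed-style config as a Python dict (illustrative values).
# ZeRO stage 2 partitions optimizer state and gradients across ranks,
# cutting per-GPU memory without changing the training loop.
ds_config = {
    "train_batch_size": 32,
    "gradient_accumulation_steps": 1,
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,            # partition optimizer state + gradients
        "overlap_comm": True,  # overlap gradient reduction with backward pass
    },
    "optimizer": {"type": "AdamW", "params": {"lr": 3e-4}},
}

if __name__ == "__main__":
    import json
    print(json.dumps(ds_config, indent=2))
```

This dict is what `deepspeed.initialize(model=..., config=ds_config)` consumes; it can also be saved as a JSON file and passed via `--deepspeed_config`.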
-
scalene
Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals
Project mention: ATC/OSDI '25 Joint Keynote: Accelerating Software Dev: The LLM (R)Evolution [video] | news.ycombinator.com | 2025-09-08
https://github.com/plasma-umass/scalene
Coz: A causal profiler that tells you where to optimize your code (C/C++/Rust/Swift/Java)
-
cupy
CuPy: NumPy & SciPy for GPU
The plethora of packages, including DSLs for compute and MLIR.
https://developer.nvidia.com/how-to-cuda-python
https://cupy.dev/
-
server
The Triton Inference Server provides an optimized cloud and edge inferencing solution. (by triton-inference-server)
Project mention: Gluon: a GPU programming language based on the same compiler stack as Triton | news.ycombinator.com | 2025-09-17
Also it REALLY jams me up that this is a thing, complicating discussions: https://github.com/triton-inference-server/server
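The server implements the KServe v2 inference protocol over HTTP and gRPC; the sketch below assembles a v2-style request body (the helper name, the model URL, and the input name `INPUT0` are placeholders of mine):

```python
def build_infer_request(input_name, data, datatype="FP32"):
    """Build a KServe-v2-style inference request body for a 1-D input tensor."""
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": [len(data)],   # v2 protocol carries explicit shapes
                "datatype": datatype,   # e.g. FP32, INT64, BYTES
                "data": data,           # row-major flattened values
            }
        ]
    }

payload = build_infer_request("INPUT0", [1.0, 2.0, 3.0])
# A real call would POST this to a running server, e.g.:
#   requests.post("http://localhost:8000/v2/models/my_model/infer", json=payload)
```

The response mirrors the same structure with an `outputs` list, which is what makes the protocol portable across backends.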
-
skypilot
Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 20+ clouds, or on-prem).
Project mention: Cloud Run GPUs, now GA, makes running AI workloads easier for everyone | news.ycombinator.com | 2025-06-04
To massively increase the reliability of getting GPUs, you can use something like SkyPilot (https://github.com/skypilot-org/skypilot) to fall back across regions, clouds, or GPU choices. E.g.,
$ sky launch --gpus H100
will fall back across GCP regions, AWS, your clusters, etc. There are also options to accept any one of several GPU types, e.g. H100, H200, or A100.
Essentially the way you deal with it is to increase the infra search space.
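The same fallback idea can be expressed as a task YAML instead of CLI flags; a minimal sketch, assuming SkyPilot's set syntax for candidate accelerators (the file name and field values are illustrative):

```yaml
# task.yaml -- launch with: sky launch task.yaml
resources:
  accelerators: {H100:1, H200:1, A100:1}  # any one of these is acceptable
run: |
  python train.py
```

Listing several candidates widens the search space the scheduler can satisfy, which is exactly the reliability trick described above.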
-
ImageAI
A python library built to empower developers to build applications and systems with self-contained Computer Vision capabilities
-
BigDL
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc.
Project mention: FlashMoE: DeepSeek-R1 671B and Qwen3MoE 235B with 1~2 Intel B580 GPU in IPEX-LLM | news.ycombinator.com | 2025-05-12
-
nvitop
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
Project mention: Show HN: Sping – A HTTP/TCP Latency Tool That's Easy on the Eye | news.ycombinator.com | 2025-08-24
I've frequently found myself using [nvitop](https://github.com/XuehaiPan/nvitop) to diagnose GPU/CPU contention issues.
The two best things about it are:
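nvitop reads its numbers from NVML; a rough, dependency-free approximation of the same per-device data is to parse `nvidia-smi`'s CSV query output. The helper names below are mine, and the canned sample line stands in for what a live two-GPU box might print:

```python
import subprocess

QUERY = "index,utilization.gpu,memory.used,memory.total"

def parse_gpu_stats(csv_text):
    """Parse `nvidia-smi --query-gpu=... --format=csv,noheader,nounits` output."""
    stats = []
    for line in csv_text.strip().splitlines():
        idx, util, used, total = (field.strip() for field in line.split(","))
        stats.append({
            "index": int(idx),
            "util_pct": int(util),
            "mem_used_mib": int(used),
            "mem_total_mib": int(total),
        })
    return stats

def live_gpu_stats():
    # Requires an NVIDIA driver; raises if nvidia-smi is absent.
    out = subprocess.check_output(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_gpu_stats(out)

# Parsing a canned sample instead of a live query:
sample = "0, 87, 30512, 40960\n1, 3, 1024, 40960"
print(parse_gpu_stats(sample)[0]["util_pct"])  # → 87
```

Tools like nvitop go further by joining this with per-process information, which is what makes contention between jobs visible.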
-
executorch
Project mention: Google AI Edge – on-device cross-platform AI deployment | news.ycombinator.com | 2025-06-01
Genuine question, why should I use this to deploy models on the edge instead of executorch? https://github.com/pytorch/executorch
For context, I get to choose the tech stack for a greenfield project. I think that executorch, which belongs to the PyTorch ecosystem, will have a far more predictable future than anything Google does, so I currently lean toward executorch.
-
jittor
Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.
-
TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory utilization in both training and inference.
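The FP8 trade-off is visible from the formats alone; here is a back-of-the-envelope check of the largest finite values of the two OCP FP8 variants (pure arithmetic, no GPU required):

```python
# OCP FP8 formats: E4M3 (4 exponent / 3 mantissa bits, bias 7) and
# E5M2 (5 exponent / 2 mantissa bits, bias 15).

# E4M3 reserves only mantissa 0b111 at the top exponent for NaN, so the
# largest finite value uses exponent field 15 with mantissa 0b110.
e4m3_max = (1 + 6 / 8) * 2 ** (15 - 7)    # 1.75 * 256

# E5M2 follows IEEE conventions: the top exponent field (31) encodes
# inf/NaN, so the largest finite value uses exponent 30, mantissa 0b11.
e5m2_max = (1 + 3 / 4) * 2 ** (30 - 15)   # 1.75 * 32768

print(e4m3_max, e5m2_max)  # 448.0 57344.0
```

E4M3's narrow range (about ±448) is why FP8 training depends on per-tensor scaling factors, which libraries in this space track automatically.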
-
jetson_stats
📊 Simple package for monitoring and controlling your NVIDIA Jetson [Orin, Xavier, Nano, TX] series
-
torchrec
Pytorch domain library for recommendation systems
Project mention: Advancements in Embedding-Based Retrieval at Pinterest Homefeed | news.ycombinator.com | 2025-02-14
Nice, there are a ton of threads here to check out. For example I had not heard of
https://pytorch.org/torchrec/
Which seems to nicely package a lot of primitives I have worked with previously.
-
pygraphistry
PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer
-
Python GPU discussion
Python GPU-related posts
-
The bug that taught me more about PyTorch than years of using it
-
PyTorch 2.9 released with C ABI and better multi-GPU support
-
The 64 KB Challenge: Teaching a Tiny Net to Play Pong
-
Show HN: I Built Claude Code for CUDA in 18 Hours (Open Source)
-
Docker Was Too Slow, So We Replaced It: Nix in Production [video]
-
Wasted Open Source efforts 😮
-
Speeding up PyTorch inference by 87% on Apple with AI-generated Metal kernels
-
Index
What are some of the best open-source GPU projects in Python? This list, ranked by GitHub stars, will help you find them:
| # | Project | Stars |
|---|---|---|
| 1 | Pytorch | 94,956 |
| 2 | DeepSpeed | 40,641 |
| 3 | scalene | 13,086 |
| 4 | tvm | 12,809 |
| 5 | cupy | 10,608 |
| 6 | server | 10,005 |
| 7 | skypilot | 8,955 |
| 8 | ImageAI | 8,841 |
| 9 | BigDL | 8,445 |
| 10 | AlphaPose | 8,444 |
| 11 | nvitop | 6,271 |
| 12 | chainer | 5,908 |
| 13 | tf-quant-finance | 5,048 |
| 14 | pytorch-forecasting | 4,653 |
| 15 | gpustat | 4,286 |
| 16 | asitop | 4,245 |
| 17 | executorch | 3,490 |
| 18 | jittor | 3,212 |
| 19 | TransformerEngine | 2,912 |
| 20 | leptonai | 2,797 |
| 21 | jetson_stats | 2,416 |
| 22 | torchrec | 2,390 |
| 23 | pygraphistry | 2,361 |