Credit goes to www.libhunt.com

Python Machine Learning

Open-source Python projects categorized as Machine Learning

Top 23 Python Machine Learning Projects

  1. transformers

    🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

    Project mention: Using “ibm-granite/granite-speech-3.3–8b” 🪨 for ASR | dev.to | 2025-11-02

    python3.12 -m venv new_venv_312
    source new_venv_312/bin/activate
    pip install --upgrade pip
    pip install https://github.com/huggingface/transformers/archive/main.zip torchaudio peft soundfile torchcodec
    ### and also
    pip install librosa

  3. Pytorch

    Tensors and Dynamic neural networks in Python with strong GPU acceleration

    Project mention: The bug that taught me more about PyTorch than years of using it | news.ycombinator.com | 2025-10-26

    He's not a core maintainer and hasn't been for years - pytorch's contributors are completely public

    https://github.com/pytorch/pytorch/graphs/contributors

  4. nn

    🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

  5. scikit-learn

    scikit-learn: machine learning in Python

    Project mention: Open Source Journey | dev.to | 2025-11-01

    Start Simple, Build Confidence Project: Scikit-learn After the intense first experience with BEHAVIOR-1K, I needed something more approachable. I went straight to Scikit-learn's good first issue label and found a task that seemed manageable: changing relative imports to absolute imports in Cython files. From this

  6. Keras

    Deep Learning for humans

    Project mention: PyTorch vs TensorFlow 2025: Which one wins after 72 hours? | dev.to | 2025-08-29

    Keras 3 multi-backend

  7. yolov5

    YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

    Project mention: Labellerr YOLOv8: Cars and Number Plate Detection — Practical, Step-by-Step | dev.to | 2025-11-05

    YOLOv8 (by Ultralytics) is one of the most widely used state-of-the-art object detection models. It is known for delivering high accuracy while still being fast enough for real-time detection.

  8. Face Recognition

    The world's simplest facial recognition api for Python and the command line

    Project mention: Show HN: Real-time privacy protection for smart glasses | news.ycombinator.com | 2025-08-11

    Did you look at egoblur? It's a lot more effective at face detection than https://github.com/ageitgey/face_recognition. Granted, you'd have to do your own face matching to do exception.

  10. faceswap

    Deepfakes Software For All

  11. OpenBB

    Financial data platform for analysts, quants and AI agents.

    Project mention: OpenBB – Investment Research for Everyone, Everywhere | news.ycombinator.com | 2025-03-22
  12. ultralytics

    Ultralytics YOLO 🚀

    Project mention: Show HN: Using YOLO to Detect Office Chairs in 40M Hotel Photos | news.ycombinator.com | 2025-01-25

    They did it on their own computer. https://github.com/ultralytics/ultralytics

  13. Airflow

    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

    Project mention: What is Argo Workflows? | dev.to | 2025-11-10

    Apache Airflow - Apache's Airflow project is a popular workflow system that supports DAG-based tasks and precise scheduling. It's an extensible Python project that supports several different providers and job executors, including Kubernetes.

  14. streamlit

    Streamlit — A faster way to build and share data apps.

    Project mention: How to Build a RAG Solution with Llama Index, ChromaDB, and Ollama | dev.to | 2025-11-04

    With a few lines of Python, you can build a basic retrieval-augmented generation (RAG) solution, but it doesn’t stop here. You can extend this project to search for multiple web pages, load large documents, add a simple web UI using either Streamlit or Anvil, or even experiment with different models in Ollama.

  15. DeepSpeed

    DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

    Project mention: All Data and AI Weekly #193 - June 9, 2025 | dev.to | 2025-06-09
  16. gradio

    Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

    Project mention: The Ultimate Guide to Building Stunning AI Apps For Beginners - Gradio | dev.to | 2025-11-14

    Why Gradio is the New Superpower for Every AI Learner in 2025

  17. Ray

    Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

    Project mention: PyTorch Monarch | news.ycombinator.com | 2025-10-23

    Not currently, but it is being worked on https://github.com/ray-project/ray/issues/53976.

  18. Open-Assistant

    OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

  19. MindsDB

    Federated query engine for AI - The only MCP Server you'll ever need

    Project mention: “One Journey Ends, Another Begins — My Hacktoberfest 2025 Story” | dev.to | 2025-10-31

    Just wrapped up my Hacktoberfest project using MindsDB and Streamlit — built a CRM Semantic Search AI app! 😄 If anyone’s into open source + AI, would love feedback on my PR: Hacktoberfest 2025 PR – Add CRM Semantic Search use case (MindsDB)

  20. gym

    A toolkit for developing and comparing reinforcement learning algorithms.

  21. supervision

    We write your reusable computer vision tools. 💜

    Project mention: Show HN: Plug-and-play Python utils for any computer-vision pipeline | news.ycombinator.com | 2025-07-21
  22. paperless-ngx

    A community-supported supercharged document management system: scan, index and archive all your documents

    Project mention: Review for Synology DiskStation DS925+: A feature-packed NAS | dev.to | 2025-10-30

    Borg Backup - I use it to automatically back up my main hosted Docker services. I have publicly hosted instances of Immich, and Paperless-NGX using Docker containers. I periodically make a backup of their data folder using Borg and store it in a Borg repo. The advantage of storing the backups in a Borg repo is that it is a deduplicating archival program. So no matter how many backups you make, it will not take any extra space than the first backup, provided nothing has changed. If there is a change, only that changed chunk is backed up, just like git. Also, you can easily encrypt and/or compress while backing up. Restoring a backup is also as easy as running a single Borg command.

  23. qlib

    Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to implementing productions. Qlib supports diverse ML modeling paradigms, including supervised learning, market dynamics modeling, and RL, and is now equipped with https://github.com/microsoft/RD-Agent to automate R&D process.

    Project mention: Choosing the Right AI Model for Stock Prediction | dev.to | 2025-10-04

    After researching different AI models in Qlib (a quantitative finance platform), here's what I learned:

  24. spaCy

    💫 Industrial-strength Natural Language Processing (NLP) in Python

    Project mention: Strengthening Open-Source Integrity: My First Contribution to spaCy | dev.to | 2025-10-28

    🔗 Pull Request: #13877 — Remove spaCy Quickstart from Universe/Courses due to spam redirect

  25. pytorch-lightning

    Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
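
The Airflow entry above describes "DAG-based tasks": each task runs only after the tasks it depends on. A minimal sketch of that ordering idea, using Python's standard-library `graphlib` rather than Airflow's own API; the task names are hypothetical and chosen only for illustration:

```python
from graphlib import TopologicalSorter

# Hypothetical task names (not from any real pipeline).
# Each key maps to the set of tasks that must finish before it can start.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "train_model": {"transform"},
    "report": {"transform"},
    "publish": {"train_model", "report"},
}


def execution_order(deps):
    """Return one valid run order in which every task follows its dependencies."""
    # static_order() raises graphlib.CycleError if the graph is not acyclic.
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

A real scheduler layers timing, retries, and executors on top of this; the sketch only shows why the dependency graph has to be acyclic for an order to exist.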

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python Machine Learning related posts

  • The Ultimate Guide to Building Stunning AI Apps For Beginners - Gradio

    2 projects | dev.to | 14 Nov 2025
  • Deep universal probabilistic programming with Python and PyTorch

    1 project | news.ycombinator.com | 14 Nov 2025
  • What is Argo Workflows?

    3 projects | dev.to | 10 Nov 2025
  • TabPFN-2.5 – SOTA foundation model for tabular data

    2 projects | news.ycombinator.com | 6 Nov 2025
  • Python library for quantum computing, quantum ML, and quantum chemistry

    1 project | news.ycombinator.com | 5 Nov 2025
  • Why stop at 1M tokens when you can have 10M?

    2 projects | news.ycombinator.com | 4 Nov 2025
  • We're open-sourcing the successor of Jupyter notebook

    4 projects | news.ycombinator.com | 4 Nov 2025

Index

What are some of the best open-source Machine Learning projects in Python? This list will help you:

#   Project            Stars
1   transformers       152,508
2   Pytorch            94,956
3   nn                 64,273
4   scikit-learn       64,038
5   Keras              63,551
6   yolov5             56,018
7   Face Recognition   55,756
8   faceswap           54,691
9   OpenBB             54,534
10  ultralytics        48,563
11  Airflow            43,200
12  streamlit          42,140
13  DeepSpeed          40,641
14  gradio             40,497
15  Ray                39,825
16  Open-Assistant     37,492
17  MindsDB            37,211
18  gym                36,649
19  supervision        35,881
20  paperless-ngx      34,208
21  qlib               33,724
22  spaCy              32,785
23  pytorch-lightning  30,432
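
The ranking rule stated on this page (projects ordered by number of GitHub stars) can be reproduced with a few lines of Python. This is only a sketch: it uses four rows copied from the index, and the function name `rank_by_stars` is an assumption made for illustration.

```python
# Four (project, stars) rows copied from the index, deliberately out of order.
projects = [
    ("scikit-learn", 64038),
    ("transformers", 152508),
    ("nn", 64273),
    ("Pytorch", 94956),
]


def rank_by_stars(rows):
    """Sort (name, stars) pairs from most to fewest stars."""
    return sorted(rows, key=lambda row: row[1], reverse=True)


ranked = rank_by_stars(projects)
for position, (name, stars) in enumerate(ranked, start=1):
    # The thousands separator matches the formatting used in the index.
    print(f"{position} {name} {stars:,}")
```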
