Top 23 Python SQL Projects
-
devops-exercises
Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions
Project mention: A collection of exercises and examples for learning DevOps concepts | news.ycombinator.com | 2025-06-29
-
pandas-ai
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
-
vanna
🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using Agentic Retrieval 🔄.
Project mention: Beyond the Diff: How Deep Context Analysis Caught a Critical Bug in a 20K-Star Open Source Project | dev.to | 2025-10-20
A developer submitted PR #951 to Vanna.ai, a popular open-source text-to-SQL tool with 20,000+ stars. The change added Databricks integration: 156 lines of well-documented code supporting two connection engines (SQL warehouse and ODBC).
-
sqlmodel
An SQLModel entity backed by a database table doesn't validate its fields on creation, which is the point of Pydantic.
https://github.com/fastapi/sqlmodel/issues/52#issuecomment-1...
-
SQLAlchemy
Project mention: How to Make Websites That Will Require Lots of Your Time and Energy | news.ycombinator.com | 2025-07-28
At the very least, if you are really writing lots of INSERTs by hand, I bet you are either not quoting properly or you are writing queries with 15 placeholders, and someday you'll put one in the wrong place.
ORMs and related toolkits have come a long way since they were called the "Vietnam of Computer Science". I am a big fan of jOOQ in Java
https://www.jooq.org/
and SQLAlchemy in Python
https://www.sqlalchemy.org/
Note both of these support both an object <-> SQL mapper (usually with generated objects) that covers the case of my code sample above, and a DSL for SQL inside the host language which is delightful if you want to do code generation to make query builders and stuff like that. I work on a very complex search interface which builds out joins, subqueries, recursive CTEs, you name it, and the code is pretty easy to maintain.
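The two risks the comment names (improper quoting and a misplaced positional placeholder) can be illustrated with a minimal sketch that uses only Python's standard `sqlite3` module. This is not the SQLAlchemy or jOOQ approach the commenter recommends; the `users` table, its columns, and the `insert_user` helper are hypothetical names chosen for illustration.

```python
import sqlite3


def insert_user(conn, user):
    """Insert one row using named placeholders (hypothetical helper)."""
    # Named placeholders (:name) bind by dictionary key, so the order of
    # the supplied values cannot drift out of sync with the column list.
    # The driver also handles quoting, so "O'Brien" needs no escaping.
    conn.execute(
        "INSERT INTO users (name, email, city) VALUES (:name, :email, :city)",
        user,
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT, city TEXT)")

# Keys are deliberately given in a different order than the columns.
insert_user(conn, {"city": "Cork", "email": "o@example.com", "name": "O'Brien"})

rows = conn.execute("SELECT name, email, city FROM users").fetchall()
```

With 15 columns instead of three, the same pattern still binds each value by name, which is the failure mode positional `?` placeholders leave open.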
-
datasette
I've been using LLM-assistance for my larger open source projects - https://github.com/simonw/datasette https://github.com/simonw/llm and https://github.com/simonw/sqlite-utils - for a couple of years now.
Also literally hundreds of smaller plugins and libraries and CLI tools, see https://github.com/simonw?tab=repositories (now at 880 repos) and https://pypi.org/user/simonw/ (340 published packages).
Unlike my tools.simonwillison.net stuff, the vast majority of those products are covered by automated tests and usually have comprehensive documentation too.
-
q
Project mention: XAN: A Modern CSV-Centric Data Manipulation Toolkit for the Terminal | news.ycombinator.com | 2025-03-27
I used to use q for this sort of thing. Not sure if there are better choices now, as it has been a few years.
https://harelba.github.io/q/
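The general idea of running SQL over CSV text can be sketched with Python's standard `csv` and `sqlite3` modules. This is only an illustration of the concept, not how q is implemented or invoked; the `load_csv` helper, the `items` table, and the sample data are all assumptions made up for the example.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, text):
    """Load CSV text into a new SQLite table (hypothetical helper)."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)  # first row supplies the column names
    cols = ", ".join(f'"{c}"' for c in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    # Remaining rows are bound as parameters, never spliced into the SQL.
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', reader)


SAMPLE = "name,qty\napple,3\npear,5\napple,4\n"  # made-up data

conn = sqlite3.connect(":memory:")
load_csv(conn, "items", SAMPLE)

# CSV values arrive as text, so cast before aggregating.
totals = conn.execute(
    "SELECT name, SUM(CAST(qty AS INTEGER)) FROM items "
    "GROUP BY name ORDER BY name"
).fetchall()
```

Here `totals` holds one row per distinct `name` with the summed quantity.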
-
modin
-
sqlfluff
A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
-
countries-states-cities-database
🌍 Discover our global repository of countries, states, and cities! 🏙️ Get comprehensive data in JSON, SQL, PSQL, SQLSERVER, MONGODB, SQLITE, XML, YAML, and CSV formats. Access ISO2, ISO3 codes, country code, capital, native language, timezones (for countries), and more. #countries #states #cities
-
sqlglot
Agreed, and it's an amazingly well-maintained GitHub repo: https://github.com/tobymao/sqlglot
Big kudos to Toby and the team.
-
Mage
🧙 The modern replacement for Airflow. Mage is an open-source data pipeline tool for transforming and integrating data. https://github.com/mage-ai/mage-ai
That’s where Mage AI stood out. From the very first attempt to run it, it feels really easy and straightforward.
-
ibis
https://github.com/ibis-project/ibis and
-
Flask-AppBuilder
Simple and rapid application development framework, built on top of Flask. Includes detailed security, auto CRUD generation for your models, Google charts and much more. Demo (login with guest/welcome) - http://flaskappbuilder.pythonanywhere.com/
-
dataset
Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.
-
alembic
-
dataherald
-
ethereum-etl
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ
-
pg_activity
-
django-sql-explorer
SQL reporting that Just Works. Fast, simple, and confusion-free. Write and share queries in a delightful SQL editor, with AI assistance.
-
PyPika
PyPika is a Python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially useful for data analysis.
-
sqlmesh
-
fugue
A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.
Python SQL related posts
-
CLI to manage your SQL database schemas and migrations
-
OLAP Workload Testing and Benchmarking Suite
-
Text2SQL is dead – long live text2SQL
-
Django: One ORM to rule all databases
-
Shillelagh: Query APIs Using SQL
-
Multi-model RAG with LangChain
-
Show HN: Xorq – open compute catalog for AI
Index
What are some of the best open-source SQL projects in Python? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | devops-exercises | 79,765 |
| 2 | pandas-ai | 22,534 |
| 3 | vanna | 21,588 |
| 4 | sqlmodel | 17,122 |
| 5 | SQLAlchemy | 11,108 |
| 6 | datasette | 10,519 |
| 7 | q | 10,331 |
| 8 | modin | 10,325 |
| 9 | sqlfluff | 9,299 |
| 10 | countries-states-cities-database | 8,938 |
| 11 | sqlglot | 8,568 |
| 12 | Mage | 8,517 |
| 13 | ibis | 6,211 |
| 14 | Flask-AppBuilder | 4,918 |
| 15 | dataset | 4,827 |
| 16 | alembic | 3,747 |
| 17 | dataherald | 3,574 |
| 18 | ethereum-etl | 3,101 |
| 19 | pg_activity | 2,929 |
| 20 | django-sql-explorer | 2,853 |
| 21 | PyPika | 2,750 |
| 22 | sqlmesh | 2,708 |
| 23 | fugue | 2,122 |