Top 23 Python SQL Projects
-
devops-exercises
Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions
Project mention: A collection of exercises and examples for learning DevOps concepts | news.ycombinator.com | 2025-06-29
-
pandas-ai
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
-
vanna
🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using Agentic Retrieval 🔄.
Project mention: Beyond the Diff: How Deep Context Analysis Caught a Critical Bug in a 20K-Star Open Source Project | dev.to | 2025-10-20
A developer submitted PR #951 to Vanna.ai, a popular open-source text-to-SQL tool with 20,000+ stars. The change added Databricks integration: 156 lines of well-documented code supporting two connection engines (SQL warehouse and ODBC).
-
sqlmodel
An SQLModel entity backed by a database table doesn't validate its fields on creation, even though validation is the point of Pydantic.
https://github.com/fastapi/sqlmodel/issues/52#issuecomment-1...
-
SQLAlchemy
Project mention: How to Make Websites That Will Require Lots of Your Time and Energy | news.ycombinator.com | 2025-07-28
At the very least, if you are really writing lots of INSERTs by hand, I bet you are either not quoting properly or writing queries with 15 placeholders, and someday you'll put one in the wrong place.
ORMs and related toolkits have come a long way since they were called the "Vietnam of Computer Science". I am a big fan of JooQ in Java
https://www.jooq.org/
and SQLAlchemy in Python
https://www.sqlalchemy.org/
Note both of these support both an object <-> SQL mapper (usually with generated objects) that covers the case of my code sample above, and a DSL for SQL inside the host language which is delightful if you want to do code generation to make query builders and stuff like that. I work on a very complex search interface which builds out joins, subqueries, recursive CTEs, you name it, and the code is pretty easy to maintain.
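The worry about misplaced positional placeholders, and the appeal of generating queries from code, can be illustrated with a small sketch. It uses only Python's standard sqlite3 module rather than SQLAlchemy or jOOQ, so it shows the idea, not either library's API. The build_insert helper, the users table, and the sample row are hypothetical names made up for this illustration.

```python
import sqlite3


def build_insert(table: str, row: dict) -> str:
    """Build an INSERT with one named placeholder per column.

    Hypothetical helper for illustration only. Identifiers cannot be
    bound as parameters, so they are checked before being interpolated.
    """
    names = [table, *row]
    if not all(name.isidentifier() for name in names):
        raise ValueError("table and column names must be plain identifiers")
    columns = ", ".join(row)
    placeholders = ", ".join(f":{col}" for col in row)
    return f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

# Values are matched to columns by name, so key order cannot cause a mix-up.
row = {"email": "ada@example.com", "name": "Ada"}
sql = build_insert("users", row)
conn.execute(sql, row)

stored = conn.execute("SELECT name, email FROM users").fetchone()
print(sql)
print(stored)
```

Because each value is bound by name, adding a sixteenth column means adding one dictionary key instead of counting question marks. A full toolkit such as SQLAlchemy goes much further (joins, subqueries, dialect handling), which this sketch does not attempt.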
-
datasette
I've been using LLM assistance for my larger open source projects - https://github.com/simonw/datasette https://github.com/simonw/llm and https://github.com/simonw/sqlite-utils - for a couple of years now.
Also literally hundreds of smaller plugins and libraries and CLI tools, see https://github.com/simonw?tab=repositories (now at 880 repos) and https://pypi.org/user/simonw/ (340 published packages).
Unlike my tools.simonwillison.net stuff, the vast majority of those products are covered by automated tests and usually have comprehensive documentation too.
-
q
Project mention: XAN: A Modern CSV-Centric Data Manipulation Toolkit for the Terminal | news.ycombinator.com | 2025-03-27
I used to use q for this sort of thing. I'm not sure whether there are better choices now, as it has been a few years.
https://harelba.github.io/q/
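For readers unfamiliar with the "SQL over CSV" idea mentioned here, the following is a rough stdlib-only sketch of the general approach: load the CSV into an in-memory SQLite table, then query it. It is not q's implementation or command-line interface. The query_csv function, the data table name, and the sample rows are assumptions made up for this example.

```python
import csv
import io
import sqlite3

# Made-up sample data for the illustration.
CSV_TEXT = """name,city,amount
ada,London,30
grace,New York,45
linus,Helsinki,12
"""


def query_csv(csv_text: str, sql: str, table: str = "data") -> list:
    """Load CSV text into an in-memory SQLite table and run one query on it."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Column and table names are interpolated, so restrict them to identifiers.
    if not table.isidentifier() or not all(col.isidentifier() for col in header):
        raise ValueError("table and column names must be plain identifiers")
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {table} ({', '.join(header)})")
    marks = ", ".join("?" for _ in header)
    conn.executemany(f"INSERT INTO {table} VALUES ({marks})", reader)
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


# CSV values arrive as text, so numeric comparisons need an explicit CAST.
rows = query_csv(
    CSV_TEXT,
    "SELECT name, CAST(amount AS INTEGER) FROM data "
    "WHERE CAST(amount AS INTEGER) > 20 ORDER BY name",
)
print(rows)
```

The explicit CAST is the main gotcha in this sketch: without it, amounts are compared as strings. Dedicated tools handle type detection, delimiters, and large files, which this example ignores.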
-
-
-
sqlfluff
A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
-
countries-states-cities-database
🌍 Discover our global repository of countries, states, and cities! 🏙️ Get comprehensive data in JSON, SQL, PSQL, SQLSERVER, MONGODB, SQLITE, XML, YAML, and CSV formats. Access ISO2, ISO3 codes, country code, capital, native language, timezones (for countries), and more. #countries #states #cities
-
sqlglot
Agreed, and it's an amazingly well-maintained GitHub repo: https://github.com/tobymao/sqlglot
Big kudos to Toby and the team.
-
Mage
🧙 The modern replacement for Airflow. Mage is an open-source data pipeline tool for transforming and integrating data. https://github.com/mage-ai/mage-ai
That’s where Mage AI stood out. From the very first attempt to run it, it feels really easy and straightforward.
-
ibis
https://github.com/ibis-project/ibis and
-
Flask-AppBuilder
Simple and rapid application development framework, built on top of Flask. Includes detailed security, auto CRUD generation for your models, Google charts and much more. Demo (login with guest/welcome) - http://flaskappbuilder.pythonanywhere.com/
-
dataset
Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.
-
-
-
ethereum-etl
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ
-
-
django-sql-explorer
SQL reporting that Just Works. Fast, simple, and confusion-free. Write and share queries in a delightful SQL editor, with AI assistance.
-
PyPika
PyPika is a Python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially useful for data analysis.
-
-
fugue
A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.
-
Python SQL discussion
Python SQL related posts
-
CLI to manage your SQL database schemas and migrations
-
OLAP Workload Testing and Benchmarking Suite
-
Text2SQL is dead – long live text2SQL
-
Django: One ORM to rule all databases
-
Shillelagh: Query APIs Using SQL
-
Multi-model RAG with LangChain
-
Show HN: Xorq – open compute catalog for AI
-
A note from our sponsor - InfluxDB
www.influxdata.com | 16 Nov 2025
Index
What are some of the best open-source SQL projects in Python? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | devops-exercises | 79,855 |
| 2 | pandas-ai | 22,534 |
| 3 | vanna | 21,588 |
| 4 | sqlmodel | 17,122 |
| 5 | SQLAlchemy | 11,108 |
| 6 | datasette | 10,519 |
| 7 | q | 10,331 |
| 8 | modin | 10,325 |
| 9 | sqlfluff | 9,299 |
| 10 | countries-states-cities-database | 8,938 |
| 11 | sqlglot | 8,568 |
| 12 | Mage | 8,517 |
| 13 | ibis | 6,211 |
| 14 | Flask-AppBuilder | 4,921 |
| 15 | dataset | 4,827 |
| 16 | alembic | 3,747 |
| 17 | dataherald | 3,574 |
| 18 | ethereum-etl | 3,103 |
| 19 | pg_activity | 2,929 |
| 20 | django-sql-explorer | 2,853 |
| 21 | PyPika | 2,750 |
| 22 | sqlmesh | 2,708 |
| 23 | fugue | 2,122 |