A query executes in just 2 ms, yet its planning phase takes 500 ms. The database is reasonably sized, the query involves 9 tables, and default_statistics_target is set to only 500. Where does this discrepancy come from?
This question was recently raised on the pgsql-performance mailing list, and the investigation revealed a somewhat surprising culprit: the column statistics stored in PostgreSQL's pg_statistic table.
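Both numbers are visible directly in EXPLAIN output, since PostgreSQL reports planning and execution time separately. A minimal sketch of how to look at them on any query (the statement below is a stand-in, not the query from the report):

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM pg_class;   -- stand-in query, not the one from the report

-- The summary at the end of the output contains two independent timings:
--   Planning Time: ... ms     <- time spent building the plan
--   Execution Time: ... ms    <- time spent running it
-- With BUFFERS, a separate "Planning:" block may also appear, showing how many
-- buffer pages the planner itself had to touch.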
The Context
In PostgreSQL, query optimisation relies on various statistical measures - most common values (MCV) lists, histograms, the number of distinct values, and others - all stored in the pg_statistic table. By default (default_statistics_target = 100), these per-column arrays hold at most 100 entries. For larger tables, however, we typically need significantly more entries to obtain reliable estimates. One thousand to five thousand entries might not seem like much when they represent billions of rows, but this raises an important question: could large statistical arrays, particularly MCV lists on variable-width columns, seriously slow down query planning, even when query execution itself is nearly instantaneous?
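One way to gauge how heavy these arrays really are is to check their on-disk size through the pg_stats view. A minimal sketch (the table name is hypothetical; substitute your own):

SELECT tablename, attname, avg_width, n_distinct,
       pg_column_size(most_common_vals) AS mcv_bytes,    -- size of the stored MCV array
       pg_column_size(histogram_bounds) AS hist_bytes    -- size of the histogram array
FROM pg_stats
WHERE tablename = 'some_table'                           -- hypothetical table name
ORDER BY pg_column_size(most_common_vals) DESC NULLS LAST;

On variable-width columns the sampled values themselves are stored in these arrays, so mcv_bytes can grow far beyond what the entry count alone would suggest.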
Investigating the Problem
We're examining a typical auto-generated 1C query ('1C' is an object-relational mapping framework widely used in accounting applications) running on PostgreSQL 17.5. Notably, default_statistics_target is set to only 500, which is even below the value recommended for 1C systems (2500). The query contains 12 joins, but nine of them are spread across subplans, so the join search space of the main query is limited to three joins - quite manageable. According to the EXPLAIN output, the planner touches only five buffer pages during planning - not much.
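To rule out per-column overrides of the statistics target, it is worth checking both the server-wide setting and the column-level values; a minimal sketch (the table name is hypothetical):

SHOW default_statistics_target;

-- Column-level targets; in PostgreSQL 17 attstattarget is NULL when the column
-- simply follows the server-wide default.
SELECT attname, attstattarget
FROM pg_attribute
WHERE attrelid = 'some_table'::regclass                  -- hypothetical table name
  AND attnum > 0
  AND NOT attisdropped;

A column whose target was raised explicitly (ALTER TABLE ... ALTER COLUMN ... SET STATISTICS n) would show that value here instead of NULL.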
Interestingly, an alternative PostgreSQL fork (such forks have become increasingly popular these days) executed this query with a nearly identical execution plan, yet its planning time was considerably shorter - around 80 milliseconds. Let's use this as our control sample.
The Hunt for Root Cause
The first suspicion was obvious: perhaps the developer
[...]