A query executes in just 2 ms, yet its planning phase takes 500 ms. The database is reasonably sized, the query involves 9 tables, and default_statistics_target is set to only 500. Where does this discrepancy come from?
This question was recently raised on the pgsql-performance mailing list, and the investigation revealed a somewhat surprising culprit: the column statistics stored in PostgreSQL's pg_statistic table.
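Both numbers are visible directly in EXPLAIN output, since PostgreSQL reports planning and execution time separately. A minimal sketch of how to look at them on any query (the statement below is a stand-in, not the query from the report):

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM pg_class;   -- stand-in query, not the one from the report

-- The summary at the end of the output contains two independent timings:
--   Planning Time: ... ms     <- time spent building the plan
--   Execution Time: ... ms    <- time spent running it
-- With BUFFERS, a separate "Planning:" block may also appear, showing how many
-- buffer pages the planner itself had to touch.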
The Context
In PostgreSQL, query optimisation relies on various statistical measures - most common values (MCV) lists, histograms, the number of distinct values, and others - all stored in the pg_statistic table. By default (default_statistics_target = 100), these per-column arrays hold at most 100 entries. For larger tables, however, we typically need significantly more entries to obtain reliable estimates. One thousand to five thousand entries might not seem like much when they represent billions of rows, but this raises an important question: could large statistical arrays, particularly MCV lists on variable-width columns, seriously slow down query planning, even when query execution itself is nearly instantaneous?
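One way to gauge how heavy these arrays really are is to check their on-disk size through the pg_stats view. A minimal sketch (the table name is hypothetical; substitute your own):

SELECT tablename, attname, avg_width, n_distinct,
       pg_column_size(most_common_vals) AS mcv_bytes,    -- size of the stored MCV array
       pg_column_size(histogram_bounds) AS hist_bytes    -- size of the histogram array
FROM pg_stats
WHERE tablename = 'some_table'                           -- hypothetical table name
ORDER BY pg_column_size(most_common_vals) DESC NULLS LAST;

On variable-width columns the sampled values themselves are stored in these arrays, so mcv_bytes can grow far beyond what the entry count alone would suggest.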
Investigating the Problem
We're examining a typical auto-generated 1C query ('1C' is an object-relational mapping framework widely used in accounting applications) running on PostgreSQL 17.5. Notably, default_statistics_target is set to only 500, which is even below the value recommended for 1C systems (2500). The query contains 12 joins, but nine of them are spread across subplans, so the join search space of the main query is limited to three joins - quite manageable. According to the EXPLAIN output, the planner touches only five buffer pages during planning - not much.
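To rule out per-column overrides of the statistics target, it is worth checking both the server-wide setting and the column-level values; a minimal sketch (the table name is hypothetical):

SHOW default_statistics_target;

-- Column-level targets; in PostgreSQL 17 attstattarget is NULL when the column
-- simply follows the server-wide default.
SELECT attname, attstattarget
FROM pg_attribute
WHERE attrelid = 'some_table'::regclass                  -- hypothetical table name
  AND attnum > 0
  AND NOT attisdropped;

A column whose target was raised explicitly (ALTER TABLE ... ALTER COLUMN ... SET STATISTICS n) would show that value here instead of NULL.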
Interestingly, an alternative PostgreSQL fork (such forks have become increasingly popular these days) executed this query with a nearly identical execution plan, yet its planning time was considerably shorter - around 80 milliseconds. Let's use this as our control sample.
The Hunt for Root Cause
The first suspicion was obvious: perhaps the developer
[...]