
Why Your HA Architecture is a Lie (And That's Okay)

· 927 words · 5 minute read
If Darth Vader existed and decided to do to Earth what he did to Alderaan, everyone would lose data.

I love this quote from Robert Haas because it’s a reality check we all need. In the database world, we’re constantly sold the dream of “Five Nines” (99.999% uptime) and “Zero Data Loss” (an RPO1 of 0). We spend months building complex clusters to achieve it.

Let’s be honest: these are fairy tales. Beautiful to imagine, but they don’t exist in production. If a planet-killing laser—or even just a nasty network partition—hits your data center, your “guarantees” are gone.

My goal today isn’t to help you believe in fairy tales. It’s to help you build an architecture that actually works.

The CAP Theorem: Stop Saying “Pick Two Out of Three” 🔗

A very common misconception I see: you simply “pick two out of three” with CAP (Consistency, Availability, Partition Tolerance).

Wrong.

When there’s no network partition, you can have all three! The theorem states that during a network partition (P), you’re forced to choose between Consistency (C) and Availability (A). You can’t keep both when nodes can’t communicate.

That’s it. That’s the theorem.

Figure 1: The CAP theorem - you can have all three when there’s no partition, but must choose during a partition

PACELC: What CAP Forgot to Tell You 🔗

CAP tells us what happens during a disaster. It doesn’t describe normal operation—when your network actually behaves. This is where PACELC comes in:

  • P (Partition): If there’s a network partition, you must choose between Availability (A) and Consistency (C). You can’t have both.
  • E (Else): When there’s no partition, you must choose between Latency (L) and Consistency (C). You can’t have both.

Figure 2: PACELC extends CAP by forcing choices during both partition (left) and normal operation (right)

In PostgreSQL, PACELC is your map for every architecture decision. To describe these setups, I’ll use Dimitri Fontaine’s terminology: the Queen (primary/read-write), the Princess (standby), and the Worker (read-only replica).

Case Study 1: Standard Architecture (Async Replication) 🔗

The KISS (Keep It Simple, Stupid) setup: one Queen, one or two Princesses, asynchronous streaming replication.
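
On the Princess side this is plain streaming replication. Here is a minimal sketch of the standby configuration; the host, user, and slot names are illustrative, and on PostgreSQL 12+ you also need an empty standby.signal file in the data directory:

    # On a Princess (postgresql.conf or postgresql.auto.conf), illustrative names
    primary_conninfo = 'host=queen.internal user=replicator application_name=princess1'
    primary_slot_name = 'princess1_slot'   # optional: stops the Queen recycling WAL the Princess still needs
    hot_standby = on                        # allow read-only queries while WAL is being replayed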

CAP perspective: This is AP. During a partition, we prioritize Availability. We promote a Princess to Queen so the application keeps working, even if it means losing the last few transactions.

PACELC trade-off: During normal operation (Else), we favor Latency (L) over Consistency (C). The Queen doesn’t wait for the Princess to confirm writes. Fast, but you accept a small RPO (~200ms2).
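
If you’d rather measure that window than trust a rule of thumb, the Queen reports it directly. A minimal sketch (the lag columns assume PostgreSQL 10 or later):

    -- Run on the Queen: how far behind is each Princess?
    SELECT application_name,
           sync_state,                                          -- 'async' in this setup
           write_lag, flush_lag, replay_lag,                    -- reported as intervals
           pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
    FROM pg_stat_replication;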

Case Study 2: “Safety First” (Sync Replication) 🔗

If the standard architecture is too risky for your RPO, move to synchronous streaming replication.

CAP perspective: This is closer to CP, but not entirely. Here’s the misconception: people think all RDBMS are CP systems. That’s only true for standalone RDBMS. The moment you add replication, you’re building a distributed system, and distributed systems can’t be purely CP.

Why “closer to CP” but not entirely? There’s a window where the Princess has already applied a transaction while it is not yet visible on the Queen. If you read from the Princess at that moment, you get different data than reading from the Queen. The system isn’t purely consistent during this window.

PACELC trade-off: During normal operation (Else), we trade Latency (L) for Consistency (C). The Queen won’t commit until a Princess acknowledges the data. You’ve exchanged potential data loss for guaranteed latency penalty.
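
In PostgreSQL this boils down to two settings on the Queen. A minimal sketch, where princess1 is a hypothetical application_name that the Princess announces in its primary_conninfo:

    # postgresql.conf on the Queen
    synchronous_standby_names = 'FIRST 1 (princess1)'
    synchronous_commit = on   # wait until the Princess has flushed the commit to disk;
                              # remote_apply also waits until readers on the Princess can see it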

Case Study 3: High-Read Architecture 🔗

When read access is critical (medical records, for example), add several Workers.

The setup: You have one Queen, one synchronous Princess (with synchronous_commit and a quorum of 1), and multiple Workers. The Princess can be any of your standbys—whichever node is fastest at that moment gets crowned. PostgreSQL will pick the first standby that confirms the write. This means your “Princess” role can shift between nodes depending on network conditions and load.
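
A minimal sketch of that crowning on the Queen; princess1 and princess2 are hypothetical application_names, and the Workers are deliberately left out of the list so they can never delay a commit:

    # postgresql.conf on the Queen
    synchronous_standby_names = 'ANY 1 (princess1, princess2)'   # whichever standby answers first wins
    synchronous_commit = on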

The Workers provide read scalability without impacting write latency.

PACELC trade-off: We prioritize Consistency (C) and Read Availability at the cost of write Latency (L).

Case Study 4: Logical Bi-directional Replication 🔗

For global businesses requiring high write availability across multiple sites, we look at logical bi-directional replication.

PACELC choice: Maximum Availability (A) during a partition. During normal operation (Else), it prioritizes Latency (L) for fast local writes across geographical regions.

The reality check: This is the most complex path. Choosing A and L introduces massive complexity and conflict management challenges. Native Postgres replication is “naïve”—Shaun Thomas’s words, not mine. It lacks sophisticated conflict resolution (often “Last Update Wins,” where data is silently overwritten) and will stop entirely if a constraint is violated.
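
To make “naïve” concrete, here is a sketch of one direction of a two-site setup with illustrative names; you mirror the same two statements the other way to make it bi-directional (and, since PostgreSQL 16, add WITH (origin = none) to each subscription so changes don’t loop back):

    -- On site A (publisher):
    CREATE PUBLICATION app_pub FOR ALL TABLES;

    -- On site B (subscriber):
    CREATE SUBSCRIPTION app_sub
        CONNECTION 'host=site-a dbname=app user=replicator'
        PUBLICATION app_pub;

    -- If the same primary key is inserted on both sites, the apply worker hits a
    -- unique-constraint violation and replication stops until a human fixes it:
    -- there is no built-in conflict resolution.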

Currently, the products capable of this “miracle” are closed-source and not free.

To Conclude 🔗

Designing a database isn’t about achieving a perfect, mythical setup. It’s about choosing an acceptable trade-off. Whether you’re protecting a bank’s transactions or a hospital’s patient files, the best architecture acknowledges that failure is inevitable.

Follow the KISS principle: don’t build more than your requirements demand. The French say “qui peut le plus peut le moins” (whoever can do more can do less), but in engineering, that’s a terrible idea. “More” in engineering means more complexity, more costs, more failure modes, and more headaches you don’t need.

Stop chasing perfection. Model your architecture for reality.



  1. RPO (Recovery Point Objective): The maximum amount of data you can afford to lose, measured in time. An RPO of 0 means zero data loss. ↩︎

  2. 200ms is the typical replication lag for servers on the same local network. While network hardware has improved significantly in the last 5 years, this figure remains a safe real-world benchmark because database workloads and write volumes have grown proportionally, keeping the effective lag in this range. ↩︎