Description
You have to provide the following information whenever possible.
Describe what's wrong
I have a PostgreSQL cluster with 2 databases, let's say A and B, hosted on Kubernetes. At any given point in time, there is a Kubernetes service that points to whichever database is the primary.
I set up my MaterializedPostgreSQL engine to point to that Kubernetes service, which pointed to A at the time. Several days later, a failover occurred and B became the primary.
As a result, ClickHouse was no longer receiving changes from the WAL. The replication slot fell behind writes and WAL started to pile up on disk, causing some problems on the primary.
I had to restart the ClickHouse pod to force it to reconnect, after which it properly reconnected to the primary.
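For anyone who wants to catch this earlier, here is a rough sketch of how the slot lag could be measured from outside ClickHouse. It assumes you read `active` and `restart_lsn` from the `pg_replication_slots` view and the current position from `pg_current_wal_lsn()` (PostgreSQL 10+) with your own client. The function names and the `max_lag_bytes` threshold are hypothetical, not part of ClickHouse or PostgreSQL.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position.

    An LSN is printed as two hexadecimal numbers: the high and low
    32 bits of a 64-bit WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def slot_lag_bytes(current_wal_lsn: str, restart_lsn: str) -> int:
    """Bytes of WAL between the slot's restart_lsn and the current WAL position."""
    return lsn_to_int(current_wal_lsn) - lsn_to_int(restart_lsn)


def slot_is_stalled(active: bool, lag_bytes: int, max_lag_bytes: int) -> bool:
    """Hypothetical alert rule: no consumer attached, or lag above a threshold."""
    return (not active) or lag_bytes > max_lag_bytes
```

An alert on `slot_is_stalled` would likely have flagged the problem before disk usage became an issue, although it does not fix the missing reconnect itself.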
Does it reproduce on recent release?
Version 21.8.4.51
Enable crash reporting
Unfortunately, I lost the logs when I killed the ClickHouse pod.
How to reproduce
Described above.
Expected behavior
The MaterializedPostgreSQL engine should detect when a failover has occurred (that is, when it is now connected to a read-only replica) and reconnect.
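To illustrate the kind of check I have in mind, here is a minimal sketch. It is not how ClickHouse works internally. It relies on PostgreSQL's `pg_is_in_recovery()`, which returns true on a standby. The `fetch_scalar` and `reconnect` callables, and the function names, are hypothetical placeholders for whatever client and reconnect logic is in use.

```python
import time
from typing import Callable

# pg_is_in_recovery() is true while the server is running as a standby.
RECOVERY_CHECK_SQL = "SELECT pg_is_in_recovery()"


def connected_to_standby(fetch_scalar: Callable[[str], object]) -> bool:
    """Return True if the server behind the current connection is a standby."""
    return bool(fetch_scalar(RECOVERY_CHECK_SQL))


def watch_for_failover(
    fetch_scalar: Callable[[str], object],
    reconnect: Callable[[], None],
    max_checks: int,
    interval_seconds: float = 0.0,
) -> int:
    """Poll the connection and reconnect whenever it points at a standby.

    Returns the number of reconnects that were triggered.
    """
    reconnects = 0
    for _ in range(max_checks):
        if connected_to_standby(fetch_scalar):
            # The old primary was demoted: re-resolve the service and reconnect.
            reconnect()
            reconnects += 1
        time.sleep(interval_seconds)
    return reconnects
```

In practice `reconnect` would need to open a new connection through the Kubernetes service so that it lands on the current primary. How the replication slot should be recreated or resumed there is a separate question that this sketch does not address.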
Additional context
Tagging @kssenii. Very thankful for your work so far on this engine - it saved our team a lot of time setting up a custom sync, and is the reason we are choosing ClickHouse! Our sync time for a table of 30+ million rows in the same data center was under 10 minutes :), and our lag time has consistently been under 15 seconds even for inserts of 250k+ on PostgreSQL. Very impressive.