mirror of https://github.com/postgres/postgres.git synced 2025-12-19 17:02:53 +03:00

Add retry logic to pg_sync_replication_slots().

Previously, pg_sync_replication_slots() would finish without synchronizing
slots that didn't meet requirements, rather than failing outright. This
could leave some failover slots unsynchronized if required catalog rows or
WAL segments were missing or at risk of removal, while the standby
continued removing needed data.

To address this, the function now waits for the primary slot to advance to
a position where all required data is available on the standby before
completing synchronization. It retries cyclically until all failover slots
that existed on the primary at the start of the call are synchronized.
Slots created after the function begins are not included. If the standby
is promoted during this wait, the function exits gracefully and the
temporary slots will be removed.

Author: Ajin Cherian <itsajin@gmail.com>
Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Shveta Malik <shveta.malik@gmail.com>
Reviewed-by: Japin Li <japinli@hotmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Yilin Zhang <jiezhilove@126.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAFPTHDZAA%2BgWDntpa5ucqKKba41%3DtXmoXqN3q4rpjO9cdxgQrw%40mail.gmail.com
Amit Kapila
2025-12-15 02:50:21 +00:00
parent 33980eaa6d
commit 0d2d4a0ec3
5 changed files with 244 additions and 59 deletions
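
The retry behaviour described in the commit message can be modelled with a small toy loop. This is only an illustrative sketch in Python, not the committed C code: the function name `sync_failover_slots`, its parameters, the integer "LSN" values, and the `max_rounds` bound are all hypothetical, and the "slot is syncable once its position is not older than what the standby still retains" test is a simplification of the real requirements.

```python
def sync_failover_slots(primary, standby_oldest_lsn, advance, is_promoted,
                        max_rounds=100):
    """Toy model of the retry loop (hypothetical, not PostgreSQL source).

    primary            -- dict mapping slot name to the position it requires
    standby_oldest_lsn -- oldest position whose data the standby still has
    advance            -- callback simulating the primary moving forward
    is_promoted        -- callback returning True once the standby is promoted
    max_rounds         -- assumption: bound that keeps the toy loop finite
    """
    # Only slots that exist when the call starts are in scope.
    targets = list(primary)
    synced = set()
    for _ in range(max_rounds):
        if is_promoted():
            # Exit gracefully; the caller would discard temporary slots.
            return synced, "promoted"
        for name in targets:
            if name not in synced and primary[name] >= standby_oldest_lsn:
                synced.add(name)
        if len(synced) == len(targets):
            return synced, "done"
        # Wait for the primary slots to advance, then retry.
        advance(primary)
    return synced, "gave_up"


def demo_advance(slots):
    slots["a"] += 30               # slot "a" catches up a little each round
    slots.setdefault("late", 500)  # created after the call began


demo_primary = {"a": 50, "b": 120}
demo_synced, demo_status = sync_failover_slots(
    demo_primary, 100, demo_advance, lambda: False)
print(demo_status, sorted(demo_synced))  # prints: done ['a', 'b']
```

In the demo, slot `late` appears on the primary while the call is in progress and is therefore left out, matching the statement that slots created after the function begins are not included.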


@@ -1497,9 +1497,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
 standby server. Temporary synced slots, if any, cannot be used for
 logical decoding and must be dropped after promotion. See
 <xref linkend="logicaldecoding-replication-slots-synchronization"/> for details.
-Note that this function is primarily intended for testing and
-debugging purposes and should be used with caution. Additionally,
-this function cannot be executed if
+Note that this function cannot be executed if
 <link linkend="guc-sync-replication-slots"><varname>
 sync_replication_slots</varname></link> is enabled and the slotsync
 worker is already running to perform the synchronization of slots.


@@ -405,15 +405,13 @@ postgres=# SELECT * from pg_logical_slot_get_changes('regression_slot', NULL, NU
 periodic synchronization of failover slots, they can also be manually
 synchronized using the <link linkend="pg-sync-replication-slots">
 <function>pg_sync_replication_slots</function></link> function on the standby.
-However, this function is primarily intended for testing and debugging and
-should be used with caution. Unlike automatic synchronization, it does not
-include cyclic retries, making it more prone to synchronization failures,
-particularly during initial sync scenarios where the required WAL files
-or catalog rows for the slot might have already been removed or are at risk
-of being removed on the standby. In contrast, automatic synchronization
-via <varname>sync_replication_slots</varname> provides continuous slot
-updates, enabling seamless failover and supporting high availability.
-Therefore, it is the recommended method for synchronizing slots.
+However, unlike automatic synchronization, it does not perform incremental
+updates. It retries cyclically until all the failover slots that existed on
+primary at the start of the function call are synchronized. Any slots created
+after the function begins will not be synchronized. In contrast, automatic
+synchronization via <varname>sync_replication_slots</varname> provides
+continuous slot updates, enabling seamless failover and supporting high
+availability. Therefore, it is the recommended method for synchronizing slots.
 </para>
 </note>