mirror of https://github.com/postgres/postgres.git synced 2025-06-08 22:02:03 +03:00

Fix handling of synchronous replication for stopping WAL senders

This fixes an oversight in c6c3334, which introduced a stricter ordering
in the way WAL senders are stopped, to prevent WAL activity while the
shutdown checkpoint is created.  After all backends are stopped, all WAL
senders are requested to stop, which makes them cease any activity and
switch their state to stopping.  Once the checkpointer knows that all WAL
senders are in the stopping state, the shutdown checkpoint can begin; the
WAL senders are then activated and wait for their clients to flush the
shutdown checkpoint record.

If a subset of WAL senders is stopping and in a sync state, other WAL
senders could still be waiting for a WAL position to be synced while
committing a transaction.  However, the stopping senders would not
release waiters, potentially breaking synchronous replication
guarantees.  This commit makes sure that even stopping WAL senders are
able to release waiters properly.

On 9.4, this can also trigger an assertion failure, for example when
max_wal_senders is set to 1, because a WAL sender is then unable to find
itself in a synchronous state when the instance stops.
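
The effect on the candidate loop in SyncRepReleaseWaiters() can be sketched
in isolation.  This is a simplified model, not the PostgreSQL code: the
SketchState enum, the SketchWalSnd struct and the three functions are
hypothetical names made up for illustration, and only the state, pid and
priority checks from the loop are modeled.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the WAL sender states; only STREAMING and
 * STOPPING matter for this sketch. */
typedef enum
{
	SKETCH_STARTUP,
	SKETCH_BACKUP,
	SKETCH_CATCHUP,
	SKETCH_STREAMING,
	SKETCH_STOPPING
} SketchState;

/* Hypothetical, reduced WAL sender slot. */
typedef struct
{
	int			pid;			/* 0 means the slot is unused */
	SketchState state;
	int			sync_priority;	/* 0 means asynchronous standby */
} SketchWalSnd;

/* Candidate test as it was before the fix: streaming senders only. */
static bool
eligible_before(const SketchWalSnd *w)
{
	return w->pid != 0 &&
		w->state == SKETCH_STREAMING &&
		w->sync_priority > 0;
}

/* Candidate test after the fix: streaming or stopping senders. */
static bool
eligible_after(const SketchWalSnd *w)
{
	return w->pid != 0 &&
		(w->state == SKETCH_STREAMING || w->state == SKETCH_STOPPING) &&
		w->sync_priority > 0;
}

/*
 * Return the index of the eligible sender with the lowest positive
 * priority value, or -1 if no sender qualifies.
 */
static int
pick_sync_sender(const SketchWalSnd *snd, int n,
				 bool (*eligible) (const SketchWalSnd *))
{
	int			best = -1;
	int			priority = 0;

	for (int i = 0; i < n; i++)
	{
		if (eligible(&snd[i]) &&
			(priority == 0 || priority > snd[i].sync_priority))
		{
			priority = snd[i].sync_priority;
			best = i;
		}
	}
	return best;
}
```

With a single slot that is in the stopping state and has priority 1,
pick_sync_sender() with eligible_before finds no candidate and returns -1,
while with eligible_after it returns index 0.  Under the assumptions above,
that mirrors the max_wal_senders = 1 case, where the stopping sender could
not find itself.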

Reported-by: Paul Guo
Author: Paul Guo, Michael Paquier
Discussion: https://postgr.es/m/CAEET0ZEv8VFqT3C-cQm6byOB4r4VYWcef1J21dOX-gcVhCSpmA@mail.gmail.com
Backpatch-through: 9.4
Michael Paquier 2018-11-29 09:13:04 +09:00
parent c1a5caea82
commit b81d08d600
2 changed files with 10 additions and 5 deletions


@@ -379,10 +379,12 @@ SyncRepReleaseWaiters(void)
 	 * If this WALSender is serving a standby that is not on the list of
 	 * potential sync standbys then we have nothing to do. If we are still
 	 * starting up, still running base backup or the current flush position
-	 * is still invalid, then leave quickly also.
+	 * is still invalid, then leave quickly also. Streaming or stopping WAL
+	 * senders are allowed to release waiters.
 	 */
 	if (MyWalSnd->sync_standby_priority == 0 ||
-		MyWalSnd->state < WALSNDSTATE_STREAMING ||
+		(MyWalSnd->state != WALSNDSTATE_STREAMING &&
+		 MyWalSnd->state != WALSNDSTATE_STOPPING) ||
 		XLogRecPtrIsInvalid(MyWalSnd->flush))
 		return;
@@ -400,7 +402,8 @@ SyncRepReleaseWaiters(void)
 		volatile WalSnd *walsnd = &walsndctl->walsnds[i];
 		if (walsnd->pid != 0 &&
-			walsnd->state == WALSNDSTATE_STREAMING &&
+			(walsnd->state == WALSNDSTATE_STREAMING ||
+			 walsnd->state == WALSNDSTATE_STOPPING) &&
 			walsnd->sync_standby_priority > 0 &&
 			(priority == 0 ||
 			 priority > walsnd->sync_standby_priority) &&


@@ -2941,12 +2941,14 @@ pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
 			/*
 			 * Treat a standby such as a pg_basebackup background process
 			 * which always returns an invalid flush location, as an
-			 * asynchronous standby.
+			 * asynchronous standby. WAL sender must be streaming or
+			 * stopping.
 			 */
 			sync_priority[i] = XLogRecPtrIsInvalid(walsnd->flush) ?
 				0 : walsnd->sync_standby_priority;
-			if (walsnd->state == WALSNDSTATE_STREAMING &&
+			if ((walsnd->state == WALSNDSTATE_STREAMING ||
+				 walsnd->state == WALSNDSTATE_STOPPING) &&
 				walsnd->sync_standby_priority > 0 &&
 				(priority == 0 ||
 				 priority > walsnd->sync_standby_priority) &&