Mirror of https://github.com/postgres/postgres.git (synced 2025-11-09 06:21:09 +03:00)
Prevent panic during shutdown checkpoint

When the checkpointer writes the shutdown checkpoint, it checks afterwards whether any WAL has been written since it started and throws a PANIC if so. At that point, only walsenders are still active, so one might think this could not happen, but walsenders can also generate WAL, for instance in BASE_BACKUP and certain variants of CREATE_REPLICATION_SLOT. So they can trigger this panic if such a command is run while the shutdown checkpoint is being written.

To fix this, divide the walsender shutdown into two phases. First, the postmaster sends a SIGUSR2 signal to all walsenders. The walsenders then put themselves into the "stopping" state. In this state, they reject any new commands. (For simplicity, we reject all new commands, so that in the future we do not have to track meticulously which commands might generate WAL.) The checkpointer waits for all walsenders to reach this state before proceeding with the shutdown checkpoint. After the shutdown checkpoint is done, the postmaster sends SIGINT (previously unused) to the walsenders. This triggers the existing shutdown behavior of sending out the shutdown checkpoint record and then terminating.

Author: Michael Paquier <michael.paquier@gmail.com>
Reported-by: Fujii Masao <masao.fujii@gmail.com>
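
The two-phase shutdown described above can be modelled as a small state machine. The following is only a sketch under stated assumptions, not the code from this commit: every name in it (SketchWalSender, sketch_phase1_stop, and so on) is hypothetical, signals are replaced by plain function calls, and WAL generation is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states for this sketch; not PostgreSQL's actual enum. */
typedef enum
{
    SKETCH_RUNNING,     /* accepts replication commands */
    SKETCH_STOPPING,    /* phase 1 done: rejects all new commands */
    SKETCH_EXITED       /* phase 2 done: process has terminated */
} SketchWalSndState;

typedef struct
{
    SketchWalSndState state;
    int         wal_records_written;    /* stand-in for WAL activity */
} SketchWalSender;

/* Phase 1: models the walsender's reaction to the first signal. */
void
sketch_phase1_stop(SketchWalSender *ws)
{
    assert(ws != NULL);
    if (ws->state == SKETCH_RUNNING)
        ws->state = SKETCH_STOPPING;
}

/*
 * Models a command that might generate WAL. Once the walsender is
 * stopping, every command is rejected, whether or not it would write WAL.
 */
bool
sketch_run_command(SketchWalSender *ws)
{
    assert(ws != NULL);
    if (ws->state != SKETCH_RUNNING)
        return false;
    ws->wal_records_written++;
    return true;
}

/* The condition the checkpointer waits for before the shutdown checkpoint. */
bool
sketch_all_stopping(const SketchWalSender *ws, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (ws[i].state == SKETCH_RUNNING)
            return false;
    }
    return true;
}

/* Phase 2: models the reaction to the second signal, after the checkpoint. */
void
sketch_phase2_exit(SketchWalSender *ws)
{
    assert(ws != NULL);
    if (ws->state == SKETCH_STOPPING)
        ws->state = SKETCH_EXITED;
}
```

The point of the model is the ordering: once sketch_all_stopping() returns true, no call to sketch_run_command() can increase any WAL counter, so a checkpoint written in that window cannot observe new WAL.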
@@ -2918,7 +2918,7 @@ reaper(SIGNAL_ARGS)
              * Waken walsenders for the last time. No regular backends
              * should be around anymore.
              */
-            SignalChildren(SIGUSR2);
+            SignalChildren(SIGINT);
 
             pmState = PM_SHUTDOWN_2;
 
@@ -3656,7 +3656,9 @@ PostmasterStateMachine(void)
         /*
          * If we get here, we are proceeding with normal shutdown. All
          * the regular children are gone, and it's time to tell the
-         * checkpointer to do a shutdown checkpoint.
+         * checkpointer to do a shutdown checkpoint. All WAL senders
+         * are told to switch to a stopping state so that the shutdown
+         * checkpoint can go ahead.
          */
         Assert(Shutdown > NoShutdown);
         /* Start the checkpointer if not running */
@@ -3665,6 +3667,7 @@ PostmasterStateMachine(void)
         /* And tell it to shut down */
         if (CheckpointerPID != 0)
         {
+            SignalSomeChildren(SIGUSR2, BACKEND_TYPE_WALSND);
             signal_child(CheckpointerPID, SIGUSR2);
             pmState = PM_SHUTDOWN;
         }
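
The hunks above signal only the walsenders in the first phase and all remaining children in the second. A minimal sketch of that kind of type-filtered signalling follows. It is an illustration only: the SK_TYPE_* masks, SketchChild, and sketch_signal_some are hypothetical names, the mask values are made up for the example, and the "signal" is merely recorded in a field rather than delivered with kill().

```c
#include <signal.h>
#include <stddef.h>

/* Hypothetical child-type bitmasks for this sketch. */
#define SK_TYPE_NORMAL  0x1
#define SK_TYPE_WALSND  0x4
#define SK_TYPE_ALL     0xF

typedef struct
{
    int         pid;            /* illustrative only, never passed to kill() */
    int         type;           /* one of the SK_TYPE_* bits */
    int         last_signal;    /* records what would have been sent */
} SketchChild;

/*
 * Record signo for every child whose type matches target_mask.
 * Returns the number of children that were signalled.
 */
int
sketch_signal_some(SketchChild *children, size_t n, int signo, int target_mask)
{
    int         count = 0;

    for (size_t i = 0; i < n; i++)
    {
        if ((children[i].type & target_mask) == 0)
            continue;           /* not a targeted child type */
        children[i].last_signal = signo;
        count++;
    }
    return count;
}
```

With one regular backend and two walsenders in the array, calling sketch_signal_some(children, 3, SIGUSR2, SK_TYPE_WALSND) marks only the two walsenders, while a later call with SIGINT and SK_TYPE_ALL marks all three.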