mirror of https://github.com/postgres/postgres.git synced 2025-11-06 07:49:08 +03:00

Fix LOCK_TIMEOUT handling during parallel apply.

Previously, the parallel apply worker used SIGINT to receive a graceful
shutdown signal from the leader apply worker. However, SIGINT is also used
by the LOCK_TIMEOUT handler to trigger a query-cancel interrupt. This
overlap caused the parallel apply worker to miss LOCK_TIMEOUT signals,
leading to incorrect behavior during lock wait/contention.

This patch resolves the conflict by switching the graceful shutdown signal
from SIGINT to SIGUSR2.

Reported-by: Zane Duffield <duffieldzane@gmail.com>
Diagnosed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 16, where it was introduced
Discussion: https://postgr.es/m/CACMiCkXyC4au74kvE2g6Y=mCEF8X6r-Ne_ty4r7qWkUjRE4+oQ@mail.gmail.com
Amit Kapila
2025-09-24 04:11:53 +00:00
parent f83fe65f3f
commit e41d954da6
3 changed files with 16 additions and 10 deletions

@@ -870,10 +870,17 @@ ParallelApplyWorkerMain(Datum main_arg)
 	InitializingApplyWorker = true;
-	/* Setup signal handling. */
+	/*
+	 * Setup signal handling.
+	 *
+	 * Note: We intentionally used SIGUSR2 to trigger a graceful shutdown
+	 * initiated by the leader apply worker. This helps to differentiate it
+	 * from the case where we abort the current transaction and exit on
+	 * receiving SIGTERM.
+	 */
 	pqsignal(SIGHUP, SignalHandlerForConfigReload);
-	pqsignal(SIGINT, SignalHandlerForShutdownRequest);
 	pqsignal(SIGTERM, die);
+	pqsignal(SIGUSR2, SignalHandlerForShutdownRequest);
 	BackgroundWorkerUnblockSignals();
 	/*
@@ -972,9 +979,9 @@ ParallelApplyWorkerMain(Datum main_arg)
 	/*
 	 * The parallel apply worker must not get here because the parallel apply
-	 * worker will only stop when it receives a SIGTERM or SIGINT from the
-	 * leader, or when there is an error. None of these cases will allow the
-	 * code to reach here.
+	 * worker will only stop when it receives a SIGTERM or SIGUSR2 from the
+	 * leader, or SIGINT from itself, or when there is an error. None of these
+	 * cases will allow the code to reach here.
 	 */
 	Assert(false);
 }
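
The signal overlap described in the commit message can be illustrated with a small standalone model. This is a simplified sketch and not PostgreSQL code: the flag names, `install_handlers`, `simulate_lock_timeout`, and `simulate_leader_shutdown` are all hypothetical, and plain `signal()`/`raise()` stand in for `pqsignal()` and the real timeout machinery. It assumes the lock-timeout path delivers SIGINT to the process itself, as the message states.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical flags standing in for the worker's pending-interrupt state. */
static volatile sig_atomic_t shutdown_requested = 0;
static volatile sig_atomic_t cancel_requested = 0;

/* Models a graceful-shutdown request from the leader. */
static void handle_shutdown(int signo)
{
    (void) signo;
    shutdown_requested = 1;
}

/* Models a query-cancel interrupt, such as one raised on lock timeout. */
static void handle_cancel(int signo)
{
    (void) signo;
    cancel_requested = 1;
}

/*
 * Install handlers for one of two arrangements:
 *   shutdown_on_sigint = true  -> old layout: SIGINT means "shut down"
 *   shutdown_on_sigint = false -> new layout: SIGUSR2 means "shut down",
 *                                 SIGINT is left to the cancel handler
 */
static void install_handlers(bool shutdown_on_sigint)
{
    shutdown_requested = 0;
    cancel_requested = 0;

    if (shutdown_on_sigint)
    {
        assert(signal(SIGINT, handle_shutdown) != SIG_ERR);
        assert(signal(SIGUSR2, SIG_IGN) != SIG_ERR);
    }
    else
    {
        assert(signal(SIGINT, handle_cancel) != SIG_ERR);
        assert(signal(SIGUSR2, handle_shutdown) != SIG_ERR);
    }
}

/* Assumption: the lock-timeout path signals the process itself with SIGINT. */
static void simulate_lock_timeout(void)
{
    raise(SIGINT);
}

/* The leader's shutdown request: SIGINT in the old layout, SIGUSR2 in the new. */
static void simulate_leader_shutdown(bool shutdown_on_sigint)
{
    raise(shutdown_on_sigint ? SIGINT : SIGUSR2);
}
```

In this model, calling `install_handlers(true)` followed by `simulate_lock_timeout()` sets only `shutdown_requested`, so the timeout is never seen as a cancel. With `install_handlers(false)`, the same call sets only `cancel_requested`, and `simulate_leader_shutdown(false)` sets `shutdown_requested` independently.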