mirror of
https://github.com/postgres/postgres.git
synced 2025-04-20 00:42:27 +03:00
Provide options for postmaster to kill child processes with SIGABRT.
The postmaster normally sends SIGQUIT to force-terminate its child processes after a child crash or immediate-stop request. If that doesn't result in child exit within a few seconds, we follow it up with SIGKILL. This patch provides GUC flags that allow either of these signals to be replaced with SIGABRT. On typically-configured Unix systems, that will result in a core dump being produced for each such child. This can be useful for debugging problems, although it's not something you'd want to have on in production due to the risk of disk space bloat from lots of core files. The old postmaster -T switch, which sent SIGSTOP in place of SIGQUIT, is changed to be the same as send_abort_for_crash. As far as I can tell from the code comments, the intent of that switch was just to block things for long enough to force core dumps manually, which seems like an unnecessary extra step. (Maybe at the time, there was no way to get most kernels to produce core files with per-PID names, requiring manual core file renaming after each one. But now it's surely the hard way.) I also took the opportunity to remove the old postmaster -n (skip shmem reinit) switch, which hasn't actually done anything in decades, though the documentation still claimed it did. Discussion: https://postgr.es/m/2251016.1668797294@sss.pgh.pa.us
This commit is contained in:
parent
e2933a6e11
commit
51b5834cd5
@ -11500,6 +11500,62 @@ LOG: CleanUpLock: deleting: lock(0xb7acd844) id(24688,24696,0,0,0,1)
|
|||||||
</listitem>
|
</listitem>
|
||||||
</varlistentry>
|
</varlistentry>
|
||||||
|
|
||||||
|
<varlistentry id="guc-send-abort-for-crash" xreflabel="send_abort_for_crash">
|
||||||
|
<term><varname>send_abort_for_crash</varname> (<type>boolean</type>)
|
||||||
|
<indexterm>
|
||||||
|
<primary><varname>send_abort_for_crash</varname> configuration parameter</primary>
|
||||||
|
</indexterm>
|
||||||
|
</term>
|
||||||
|
<listitem>
|
||||||
|
<para>
|
||||||
|
By default, after a backend crash the postmaster will stop remaining
|
||||||
|
child processes by sending them <systemitem>SIGQUIT</systemitem>
|
||||||
|
signals, which permits them to exit more-or-less gracefully. When
|
||||||
|
this option is set to <literal>on</literal>,
|
||||||
|
<systemitem>SIGABRT</systemitem> is sent instead. That normally
|
||||||
|
results in production of a core dump file for each such child
|
||||||
|
process.
|
||||||
|
This can be handy for investigating the states of other processes
|
||||||
|
after a crash. It can also consume lots of disk space in the event
|
||||||
|
of repeated crashes, so do not enable this on systems you are not
|
||||||
|
monitoring carefully.
|
||||||
|
Beware that no support exists for cleaning up the core file(s)
|
||||||
|
automatically.
|
||||||
|
This parameter can only be set in
|
||||||
|
the <filename>postgresql.conf</filename> file or on the server
|
||||||
|
command line.
|
||||||
|
</para>
|
||||||
|
</listitem>
|
||||||
|
</varlistentry>
|
||||||
|
|
||||||
|
<varlistentry id="guc-send-abort-for-kill" xreflabel="send_abort_for_kill">
|
||||||
|
<term><varname>send_abort_for_kill</varname> (<type>boolean</type>)
|
||||||
|
<indexterm>
|
||||||
|
<primary><varname>send_abort_for_kill</varname> configuration parameter</primary>
|
||||||
|
</indexterm>
|
||||||
|
</term>
|
||||||
|
<listitem>
|
||||||
|
<para>
|
||||||
|
By default, after attempting to stop a child process with
|
||||||
|
<systemitem>SIGQUIT</systemitem>, the postmaster will wait five
|
||||||
|
seconds and then send <systemitem>SIGKILL</systemitem> to force
|
||||||
|
immediate termination. When this option is set
|
||||||
|
to <literal>on</literal>, <systemitem>SIGABRT</systemitem> is sent
|
||||||
|
instead of <systemitem>SIGKILL</systemitem>. That normally results
|
||||||
|
in production of a core dump file for each such child process.
|
||||||
|
This can be handy for investigating the states
|
||||||
|
of <quote>stuck</quote> child processes. It can also consume lots
|
||||||
|
of disk space in the event of repeated crashes, so do not enable
|
||||||
|
this on systems you are not monitoring carefully.
|
||||||
|
Beware that no support exists for cleaning up the core file(s)
|
||||||
|
automatically.
|
||||||
|
This parameter can only be set in
|
||||||
|
the <filename>postgresql.conf</filename> file or on the server
|
||||||
|
command line.
|
||||||
|
</para>
|
||||||
|
</listitem>
|
||||||
|
</varlistentry>
|
||||||
|
|
||||||
</variablelist>
|
</variablelist>
|
||||||
</sect1>
|
</sect1>
|
||||||
<sect1 id="runtime-config-short">
|
<sect1 id="runtime-config-short">
|
||||||
|
@ -409,24 +409,6 @@ PostgreSQL documentation
|
|||||||
</listitem>
|
</listitem>
|
||||||
</varlistentry>
|
</varlistentry>
|
||||||
|
|
||||||
<varlistentry>
|
|
||||||
<term><option>-n</option></term>
|
|
||||||
<listitem>
|
|
||||||
<para>
|
|
||||||
This option is for debugging problems that cause a server
|
|
||||||
process to die abnormally. The ordinary strategy in this
|
|
||||||
situation is to notify all other server processes that they
|
|
||||||
must terminate and then reinitialize the shared memory and
|
|
||||||
semaphores. This is because an errant server process could
|
|
||||||
have corrupted some shared state before terminating. This
|
|
||||||
option specifies that <command>postgres</command> will
|
|
||||||
not reinitialize shared data structures. A knowledgeable
|
|
||||||
system programmer can then use a debugger to examine shared
|
|
||||||
memory and semaphore state.
|
|
||||||
</para>
|
|
||||||
</listitem>
|
|
||||||
</varlistentry>
|
|
||||||
|
|
||||||
<varlistentry>
|
<varlistentry>
|
||||||
<term><option>-O</option></term>
|
<term><option>-O</option></term>
|
||||||
<listitem>
|
<listitem>
|
||||||
@ -466,14 +448,9 @@ PostgreSQL documentation
|
|||||||
This option is for debugging problems that cause a server
|
This option is for debugging problems that cause a server
|
||||||
process to die abnormally. The ordinary strategy in this
|
process to die abnormally. The ordinary strategy in this
|
||||||
situation is to notify all other server processes that they
|
situation is to notify all other server processes that they
|
||||||
must terminate and then reinitialize the shared memory and
|
must terminate, by sending them <systemitem>SIGQUIT</systemitem>
|
||||||
semaphores. This is because an errant server process could
|
signals. With this option, <systemitem>SIGABRT</systemitem>
|
||||||
have corrupted some shared state before terminating. This
|
will be sent instead, resulting in production of core dump files.
|
||||||
option specifies that <command>postgres</command> will
|
|
||||||
stop all other server processes by sending the signal
|
|
||||||
<literal>SIGSTOP</literal>, but will not cause them to
|
|
||||||
terminate. This permits system programmers to collect core
|
|
||||||
dumps from all server processes by hand.
|
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
</varlistentry>
|
</varlistentry>
|
||||||
|
@ -352,11 +352,10 @@ help(const char *progname)
|
|||||||
|
|
||||||
printf(_("\nDeveloper options:\n"));
|
printf(_("\nDeveloper options:\n"));
|
||||||
printf(_(" -f s|i|o|b|t|n|m|h forbid use of some plan types\n"));
|
printf(_(" -f s|i|o|b|t|n|m|h forbid use of some plan types\n"));
|
||||||
printf(_(" -n do not reinitialize shared memory after abnormal exit\n"));
|
|
||||||
printf(_(" -O allow system table structure changes\n"));
|
printf(_(" -O allow system table structure changes\n"));
|
||||||
printf(_(" -P disable system indexes\n"));
|
printf(_(" -P disable system indexes\n"));
|
||||||
printf(_(" -t pa|pl|ex show timings after each query\n"));
|
printf(_(" -t pa|pl|ex show timings after each query\n"));
|
||||||
printf(_(" -T send SIGSTOP to all backend processes if one dies\n"));
|
printf(_(" -T send SIGABRT to all backend processes if one dies\n"));
|
||||||
printf(_(" -W NUM wait NUM seconds to allow attach from a debugger\n"));
|
printf(_(" -W NUM wait NUM seconds to allow attach from a debugger\n"));
|
||||||
|
|
||||||
printf(_("\nOptions for single-user mode:\n"));
|
printf(_("\nOptions for single-user mode:\n"));
|
||||||
|
@ -219,16 +219,6 @@ int ReservedBackends;
|
|||||||
#define MAXLISTEN 64
|
#define MAXLISTEN 64
|
||||||
static pgsocket ListenSocket[MAXLISTEN];
|
static pgsocket ListenSocket[MAXLISTEN];
|
||||||
|
|
||||||
/*
|
|
||||||
* These globals control the behavior of the postmaster in case some
|
|
||||||
* backend dumps core. Normally, it kills all peers of the dead backend
|
|
||||||
* and reinitializes shared memory. By specifying -s or -n, we can have
|
|
||||||
* the postmaster stop (rather than kill) peers and not reinitialize
|
|
||||||
* shared data structures. (Reinit is currently dead code, though.)
|
|
||||||
*/
|
|
||||||
static bool Reinit = true;
|
|
||||||
static int SendStop = false;
|
|
||||||
|
|
||||||
/* still more option variables */
|
/* still more option variables */
|
||||||
bool EnableSSL = false;
|
bool EnableSSL = false;
|
||||||
|
|
||||||
@ -243,6 +233,8 @@ bool enable_bonjour = false;
|
|||||||
char *bonjour_name;
|
char *bonjour_name;
|
||||||
bool restart_after_crash = true;
|
bool restart_after_crash = true;
|
||||||
bool remove_temp_files_after_crash = true;
|
bool remove_temp_files_after_crash = true;
|
||||||
|
bool send_abort_for_crash = false;
|
||||||
|
bool send_abort_for_kill = false;
|
||||||
|
|
||||||
/* PIDs of special child processes; 0 when not running */
|
/* PIDs of special child processes; 0 when not running */
|
||||||
static pid_t StartupPID = 0,
|
static pid_t StartupPID = 0,
|
||||||
@ -414,6 +406,7 @@ static void report_fork_failure_to_client(Port *port, int errnum);
|
|||||||
static CAC_state canAcceptConnections(int backend_type);
|
static CAC_state canAcceptConnections(int backend_type);
|
||||||
static bool RandomCancelKey(int32 *cancel_key);
|
static bool RandomCancelKey(int32 *cancel_key);
|
||||||
static void signal_child(pid_t pid, int signal);
|
static void signal_child(pid_t pid, int signal);
|
||||||
|
static void sigquit_child(pid_t pid);
|
||||||
static bool SignalSomeChildren(int signal, int target);
|
static bool SignalSomeChildren(int signal, int target);
|
||||||
static void TerminateChildren(int signal);
|
static void TerminateChildren(int signal);
|
||||||
|
|
||||||
@ -697,7 +690,7 @@ PostmasterMain(int argc, char *argv[])
|
|||||||
* tcop/postgres.c (the option sets should not conflict) and with the
|
* tcop/postgres.c (the option sets should not conflict) and with the
|
||||||
* common help() function in main/main.c.
|
* common help() function in main/main.c.
|
||||||
*/
|
*/
|
||||||
while ((opt = getopt(argc, argv, "B:bc:C:D:d:EeFf:h:ijk:lN:nOPp:r:S:sTt:W:-:")) != -1)
|
while ((opt = getopt(argc, argv, "B:bc:C:D:d:EeFf:h:ijk:lN:OPp:r:S:sTt:W:-:")) != -1)
|
||||||
{
|
{
|
||||||
switch (opt)
|
switch (opt)
|
||||||
{
|
{
|
||||||
@ -767,11 +760,6 @@ PostmasterMain(int argc, char *argv[])
|
|||||||
SetConfigOption("max_connections", optarg, PGC_POSTMASTER, PGC_S_ARGV);
|
SetConfigOption("max_connections", optarg, PGC_POSTMASTER, PGC_S_ARGV);
|
||||||
break;
|
break;
|
||||||
|
|
||||||
case 'n':
|
|
||||||
/* Don't reinit shared mem after abnormal exit */
|
|
||||||
Reinit = false;
|
|
||||||
break;
|
|
||||||
|
|
||||||
case 'O':
|
case 'O':
|
||||||
SetConfigOption("allow_system_table_mods", "true", PGC_POSTMASTER, PGC_S_ARGV);
|
SetConfigOption("allow_system_table_mods", "true", PGC_POSTMASTER, PGC_S_ARGV);
|
||||||
break;
|
break;
|
||||||
@ -799,11 +787,10 @@ PostmasterMain(int argc, char *argv[])
|
|||||||
case 'T':
|
case 'T':
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* In the event that some backend dumps core, send SIGSTOP,
|
* This option used to be defined as sending SIGSTOP after a
|
||||||
* rather than SIGQUIT, to all its peers. This lets the wily
|
* backend crash, but sending SIGABRT seems more useful.
|
||||||
* post_hacker collect core dumps from everyone.
|
|
||||||
*/
|
*/
|
||||||
SendStop = true;
|
SetConfigOption("send_abort_for_crash", "true", PGC_POSTMASTER, PGC_S_ARGV);
|
||||||
break;
|
break;
|
||||||
|
|
||||||
case 't':
|
case 't':
|
||||||
@ -1896,20 +1883,23 @@ ServerLoop(void)
|
|||||||
|
|
||||||
/*
|
/*
|
||||||
* If we already sent SIGQUIT to children and they are slow to shut
|
* If we already sent SIGQUIT to children and they are slow to shut
|
||||||
* down, it's time to send them SIGKILL. This doesn't happen
|
* down, it's time to send them SIGKILL (or SIGABRT if requested).
|
||||||
* normally, but under certain conditions backends can get stuck while
|
* This doesn't happen normally, but under certain conditions backends
|
||||||
* shutting down. This is a last measure to get them unwedged.
|
* can get stuck while shutting down. This is a last measure to get
|
||||||
|
* them unwedged.
|
||||||
*
|
*
|
||||||
* Note we also do this during recovery from a process crash.
|
* Note we also do this during recovery from a process crash.
|
||||||
*/
|
*/
|
||||||
if ((Shutdown >= ImmediateShutdown || (FatalError && !SendStop)) &&
|
if ((Shutdown >= ImmediateShutdown || FatalError) &&
|
||||||
AbortStartTime != 0 &&
|
AbortStartTime != 0 &&
|
||||||
(now - AbortStartTime) >= SIGKILL_CHILDREN_AFTER_SECS)
|
(now - AbortStartTime) >= SIGKILL_CHILDREN_AFTER_SECS)
|
||||||
{
|
{
|
||||||
/* We were gentle with them before. Not anymore */
|
/* We were gentle with them before. Not anymore */
|
||||||
ereport(LOG,
|
ereport(LOG,
|
||||||
(errmsg("issuing SIGKILL to recalcitrant children")));
|
/* translator: %s is SIGKILL or SIGABRT */
|
||||||
TerminateChildren(SIGKILL);
|
(errmsg("issuing %s to recalcitrant children",
|
||||||
|
send_abort_for_kill ? "SIGABRT" : "SIGKILL")));
|
||||||
|
TerminateChildren(send_abort_for_kill ? SIGABRT : SIGKILL);
|
||||||
/* reset flag so we don't SIGKILL again */
|
/* reset flag so we don't SIGKILL again */
|
||||||
AbortStartTime = 0;
|
AbortStartTime = 0;
|
||||||
}
|
}
|
||||||
@ -2903,6 +2893,7 @@ pmdie(SIGNAL_ARGS)
|
|||||||
#endif
|
#endif
|
||||||
|
|
||||||
/* tell children to shut down ASAP */
|
/* tell children to shut down ASAP */
|
||||||
|
/* (note we don't apply send_abort_for_crash here) */
|
||||||
SetQuitSignalReason(PMQUIT_FOR_STOP);
|
SetQuitSignalReason(PMQUIT_FOR_STOP);
|
||||||
TerminateChildren(SIGQUIT);
|
TerminateChildren(SIGQUIT);
|
||||||
pmState = PM_WAIT_BACKENDS;
|
pmState = PM_WAIT_BACKENDS;
|
||||||
@ -3470,20 +3461,9 @@ HandleChildCrash(int pid, int exitstatus, const char *procname)
|
|||||||
/*
|
/*
|
||||||
* This worker is still alive. Unless we did so already, tell it
|
* This worker is still alive. Unless we did so already, tell it
|
||||||
* to commit hara-kiri.
|
* to commit hara-kiri.
|
||||||
*
|
|
||||||
* SIGQUIT is the special signal that says exit without proc_exit
|
|
||||||
* and let the user know what's going on. But if SendStop is set
|
|
||||||
* (-T on command line), then we send SIGSTOP instead, so that we
|
|
||||||
* can get core dumps from all backends by hand.
|
|
||||||
*/
|
*/
|
||||||
if (take_action)
|
if (take_action)
|
||||||
{
|
sigquit_child(rw->rw_pid);
|
||||||
ereport(DEBUG2,
|
|
||||||
(errmsg_internal("sending %s to process %d",
|
|
||||||
(SendStop ? "SIGSTOP" : "SIGQUIT"),
|
|
||||||
(int) rw->rw_pid)));
|
|
||||||
signal_child(rw->rw_pid, (SendStop ? SIGSTOP : SIGQUIT));
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -3514,13 +3494,8 @@ HandleChildCrash(int pid, int exitstatus, const char *procname)
|
|||||||
* This backend is still alive. Unless we did so already, tell it
|
* This backend is still alive. Unless we did so already, tell it
|
||||||
* to commit hara-kiri.
|
* to commit hara-kiri.
|
||||||
*
|
*
|
||||||
* SIGQUIT is the special signal that says exit without proc_exit
|
* We could exclude dead_end children here, but at least when
|
||||||
* and let the user know what's going on. But if SendStop is set
|
* sending SIGABRT it seems better to include them.
|
||||||
* (-T on command line), then we send SIGSTOP instead, so that we
|
|
||||||
* can get core dumps from all backends by hand.
|
|
||||||
*
|
|
||||||
* We could exclude dead_end children here, but at least in the
|
|
||||||
* SIGSTOP case it seems better to include them.
|
|
||||||
*
|
*
|
||||||
* Background workers were already processed above; ignore them
|
* Background workers were already processed above; ignore them
|
||||||
* here.
|
* here.
|
||||||
@ -3529,13 +3504,7 @@ HandleChildCrash(int pid, int exitstatus, const char *procname)
|
|||||||
continue;
|
continue;
|
||||||
|
|
||||||
if (take_action)
|
if (take_action)
|
||||||
{
|
sigquit_child(bp->pid);
|
||||||
ereport(DEBUG2,
|
|
||||||
(errmsg_internal("sending %s to process %d",
|
|
||||||
(SendStop ? "SIGSTOP" : "SIGQUIT"),
|
|
||||||
(int) bp->pid)));
|
|
||||||
signal_child(bp->pid, (SendStop ? SIGSTOP : SIGQUIT));
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -3547,11 +3516,7 @@ HandleChildCrash(int pid, int exitstatus, const char *procname)
|
|||||||
}
|
}
|
||||||
else if (StartupPID != 0 && take_action)
|
else if (StartupPID != 0 && take_action)
|
||||||
{
|
{
|
||||||
ereport(DEBUG2,
|
sigquit_child(StartupPID);
|
||||||
(errmsg_internal("sending %s to process %d",
|
|
||||||
(SendStop ? "SIGSTOP" : "SIGQUIT"),
|
|
||||||
(int) StartupPID)));
|
|
||||||
signal_child(StartupPID, (SendStop ? SIGSTOP : SIGQUIT));
|
|
||||||
StartupStatus = STARTUP_SIGNALED;
|
StartupStatus = STARTUP_SIGNALED;
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -3559,73 +3524,37 @@ HandleChildCrash(int pid, int exitstatus, const char *procname)
|
|||||||
if (pid == BgWriterPID)
|
if (pid == BgWriterPID)
|
||||||
BgWriterPID = 0;
|
BgWriterPID = 0;
|
||||||
else if (BgWriterPID != 0 && take_action)
|
else if (BgWriterPID != 0 && take_action)
|
||||||
{
|
sigquit_child(BgWriterPID);
|
||||||
ereport(DEBUG2,
|
|
||||||
(errmsg_internal("sending %s to process %d",
|
|
||||||
(SendStop ? "SIGSTOP" : "SIGQUIT"),
|
|
||||||
(int) BgWriterPID)));
|
|
||||||
signal_child(BgWriterPID, (SendStop ? SIGSTOP : SIGQUIT));
|
|
||||||
}
|
|
||||||
|
|
||||||
/* Take care of the checkpointer too */
|
/* Take care of the checkpointer too */
|
||||||
if (pid == CheckpointerPID)
|
if (pid == CheckpointerPID)
|
||||||
CheckpointerPID = 0;
|
CheckpointerPID = 0;
|
||||||
else if (CheckpointerPID != 0 && take_action)
|
else if (CheckpointerPID != 0 && take_action)
|
||||||
{
|
sigquit_child(CheckpointerPID);
|
||||||
ereport(DEBUG2,
|
|
||||||
(errmsg_internal("sending %s to process %d",
|
|
||||||
(SendStop ? "SIGSTOP" : "SIGQUIT"),
|
|
||||||
(int) CheckpointerPID)));
|
|
||||||
signal_child(CheckpointerPID, (SendStop ? SIGSTOP : SIGQUIT));
|
|
||||||
}
|
|
||||||
|
|
||||||
/* Take care of the walwriter too */
|
/* Take care of the walwriter too */
|
||||||
if (pid == WalWriterPID)
|
if (pid == WalWriterPID)
|
||||||
WalWriterPID = 0;
|
WalWriterPID = 0;
|
||||||
else if (WalWriterPID != 0 && take_action)
|
else if (WalWriterPID != 0 && take_action)
|
||||||
{
|
sigquit_child(WalWriterPID);
|
||||||
ereport(DEBUG2,
|
|
||||||
(errmsg_internal("sending %s to process %d",
|
|
||||||
(SendStop ? "SIGSTOP" : "SIGQUIT"),
|
|
||||||
(int) WalWriterPID)));
|
|
||||||
signal_child(WalWriterPID, (SendStop ? SIGSTOP : SIGQUIT));
|
|
||||||
}
|
|
||||||
|
|
||||||
/* Take care of the walreceiver too */
|
/* Take care of the walreceiver too */
|
||||||
if (pid == WalReceiverPID)
|
if (pid == WalReceiverPID)
|
||||||
WalReceiverPID = 0;
|
WalReceiverPID = 0;
|
||||||
else if (WalReceiverPID != 0 && take_action)
|
else if (WalReceiverPID != 0 && take_action)
|
||||||
{
|
sigquit_child(WalReceiverPID);
|
||||||
ereport(DEBUG2,
|
|
||||||
(errmsg_internal("sending %s to process %d",
|
|
||||||
(SendStop ? "SIGSTOP" : "SIGQUIT"),
|
|
||||||
(int) WalReceiverPID)));
|
|
||||||
signal_child(WalReceiverPID, (SendStop ? SIGSTOP : SIGQUIT));
|
|
||||||
}
|
|
||||||
|
|
||||||
/* Take care of the autovacuum launcher too */
|
/* Take care of the autovacuum launcher too */
|
||||||
if (pid == AutoVacPID)
|
if (pid == AutoVacPID)
|
||||||
AutoVacPID = 0;
|
AutoVacPID = 0;
|
||||||
else if (AutoVacPID != 0 && take_action)
|
else if (AutoVacPID != 0 && take_action)
|
||||||
{
|
sigquit_child(AutoVacPID);
|
||||||
ereport(DEBUG2,
|
|
||||||
(errmsg_internal("sending %s to process %d",
|
|
||||||
(SendStop ? "SIGSTOP" : "SIGQUIT"),
|
|
||||||
(int) AutoVacPID)));
|
|
||||||
signal_child(AutoVacPID, (SendStop ? SIGSTOP : SIGQUIT));
|
|
||||||
}
|
|
||||||
|
|
||||||
/* Take care of the archiver too */
|
/* Take care of the archiver too */
|
||||||
if (pid == PgArchPID)
|
if (pid == PgArchPID)
|
||||||
PgArchPID = 0;
|
PgArchPID = 0;
|
||||||
else if (PgArchPID != 0 && take_action)
|
else if (PgArchPID != 0 && take_action)
|
||||||
{
|
sigquit_child(PgArchPID);
|
||||||
ereport(DEBUG2,
|
|
||||||
(errmsg_internal("sending %s to process %d",
|
|
||||||
(SendStop ? "SIGSTOP" : "SIGQUIT"),
|
|
||||||
(int) PgArchPID)));
|
|
||||||
signal_child(PgArchPID, (SendStop ? SIGSTOP : SIGQUIT));
|
|
||||||
}
|
|
||||||
|
|
||||||
/* We do NOT restart the syslogger */
|
/* We do NOT restart the syslogger */
|
||||||
|
|
||||||
@ -3834,6 +3763,10 @@ PostmasterStateMachine(void)
|
|||||||
* Any required cleanup will happen at next restart. We
|
* Any required cleanup will happen at next restart. We
|
||||||
* set FatalError so that an "abnormal shutdown" message
|
* set FatalError so that an "abnormal shutdown" message
|
||||||
* gets logged when we exit.
|
* gets logged when we exit.
|
||||||
|
*
|
||||||
|
* We don't consult send_abort_for_crash here, as it's
|
||||||
|
* unlikely that dumping cores would illuminate the reason
|
||||||
|
* for checkpointer fork failure.
|
||||||
*/
|
*/
|
||||||
FatalError = true;
|
FatalError = true;
|
||||||
pmState = PM_WAIT_DEAD_END;
|
pmState = PM_WAIT_DEAD_END;
|
||||||
@ -4003,8 +3936,8 @@ signal_child(pid_t pid, int signal)
|
|||||||
case SIGINT:
|
case SIGINT:
|
||||||
case SIGTERM:
|
case SIGTERM:
|
||||||
case SIGQUIT:
|
case SIGQUIT:
|
||||||
case SIGSTOP:
|
|
||||||
case SIGKILL:
|
case SIGKILL:
|
||||||
|
case SIGABRT:
|
||||||
if (kill(-pid, signal) < 0)
|
if (kill(-pid, signal) < 0)
|
||||||
elog(DEBUG3, "kill(%ld,%d) failed: %m", (long) (-pid), signal);
|
elog(DEBUG3, "kill(%ld,%d) failed: %m", (long) (-pid), signal);
|
||||||
break;
|
break;
|
||||||
@ -4014,6 +3947,24 @@ signal_child(pid_t pid, int signal)
|
|||||||
#endif
|
#endif
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Convenience function for killing a child process after a crash of some
|
||||||
|
* other child process. We log the action at a higher level than we would
|
||||||
|
* otherwise do, and we apply send_abort_for_crash to decide which signal
|
||||||
|
* to send. Normally it's SIGQUIT -- and most other comments in this file
|
||||||
|
* are written on the assumption that it is -- but developers might prefer
|
||||||
|
* to use SIGABRT to collect per-child core dumps.
|
||||||
|
*/
|
||||||
|
static void
|
||||||
|
sigquit_child(pid_t pid)
|
||||||
|
{
|
||||||
|
ereport(DEBUG2,
|
||||||
|
(errmsg_internal("sending %s to process %d",
|
||||||
|
(send_abort_for_crash ? "SIGABRT" : "SIGQUIT"),
|
||||||
|
(int) pid)));
|
||||||
|
signal_child(pid, (send_abort_for_crash ? SIGABRT : SIGQUIT));
|
||||||
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Send a signal to the targeted children (but NOT special children;
|
* Send a signal to the targeted children (but NOT special children;
|
||||||
* dead_end children are never signaled, either).
|
* dead_end children are never signaled, either).
|
||||||
@ -4069,7 +4020,7 @@ TerminateChildren(int signal)
|
|||||||
if (StartupPID != 0)
|
if (StartupPID != 0)
|
||||||
{
|
{
|
||||||
signal_child(StartupPID, signal);
|
signal_child(StartupPID, signal);
|
||||||
if (signal == SIGQUIT || signal == SIGKILL)
|
if (signal == SIGQUIT || signal == SIGKILL || signal == SIGABRT)
|
||||||
StartupStatus = STARTUP_SIGNALED;
|
StartupStatus = STARTUP_SIGNALED;
|
||||||
}
|
}
|
||||||
if (BgWriterPID != 0)
|
if (BgWriterPID != 0)
|
||||||
|
@ -1227,6 +1227,26 @@ struct config_bool ConfigureNamesBool[] =
|
|||||||
true,
|
true,
|
||||||
NULL, NULL, NULL
|
NULL, NULL, NULL
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
{"send_abort_for_crash", PGC_SIGHUP, DEVELOPER_OPTIONS,
|
||||||
|
gettext_noop("Send SIGABRT not SIGQUIT to child processes after backend crash."),
|
||||||
|
NULL,
|
||||||
|
GUC_NOT_IN_SAMPLE
|
||||||
|
},
|
||||||
|
&send_abort_for_crash,
|
||||||
|
false,
|
||||||
|
NULL, NULL, NULL
|
||||||
|
},
|
||||||
|
{
|
||||||
|
{"send_abort_for_kill", PGC_SIGHUP, DEVELOPER_OPTIONS,
|
||||||
|
gettext_noop("Send SIGABRT not SIGKILL to stuck child processes."),
|
||||||
|
NULL,
|
||||||
|
GUC_NOT_IN_SAMPLE
|
||||||
|
},
|
||||||
|
&send_abort_for_kill,
|
||||||
|
false,
|
||||||
|
NULL, NULL, NULL
|
||||||
|
},
|
||||||
|
|
||||||
{
|
{
|
||||||
{"log_duration", PGC_SUSET, LOGGING_WHAT,
|
{"log_duration", PGC_SUSET, LOGGING_WHAT,
|
||||||
|
@ -30,6 +30,8 @@ extern PGDLLIMPORT bool enable_bonjour;
|
|||||||
extern PGDLLIMPORT char *bonjour_name;
|
extern PGDLLIMPORT char *bonjour_name;
|
||||||
extern PGDLLIMPORT bool restart_after_crash;
|
extern PGDLLIMPORT bool restart_after_crash;
|
||||||
extern PGDLLIMPORT bool remove_temp_files_after_crash;
|
extern PGDLLIMPORT bool remove_temp_files_after_crash;
|
||||||
|
extern PGDLLIMPORT bool send_abort_for_crash;
|
||||||
|
extern PGDLLIMPORT bool send_abort_for_kill;
|
||||||
|
|
||||||
#ifdef WIN32
|
#ifdef WIN32
|
||||||
extern PGDLLIMPORT HANDLE PostmasterHandle;
|
extern PGDLLIMPORT HANDLE PostmasterHandle;
|
||||||
|
Loading…
x
Reference in New Issue
Block a user