mirror of https://github.com/MariaDB/server.git (synced 2025-07-29 05:21:33 +03:00)
MDEV-31448: Killing a replica thread awaiting its GCO can hang/crash a parallel replica

The problem is that when a worker thread is (user) killed in wait_for_prior_commit, the event group may complete out-of-order since the wait for prior commit was aborted by the kill.

This fix ensures that event groups will always complete in-order, even in the error case. This is done in finish_event_group() by doing an extra wait_for_prior_commit(), if necessary, that ignores kills.

This fix supersedes the fix for MDEV-30780, so the earlier fix for that is reverted in this patch.

Also fix that an error from wait_for_prior_commit() inside finish_event_group() would not signal the error to wakeup_subsequent_commits().

Based on earlier work by Brandon Nesterenko and Andrei Elkin, with some changes to simplify the semantics of wait_for_prior_commit() and make the code more robust to future changes.

Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
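The ordering guarantee described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `ToyWaiter`, `ToyResult`, and `run_toy_replica` are hypothetical names that do not exist in the MariaDB source, and the real finish_event_group() does considerably more. The model only shows why a second wait that ignores kills keeps a killed group from completing ahead of its predecessor.

```cpp
#include <cassert>
#include <condition_variable>
#include <future>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical toy model of one event group waiting for the previous one.
// None of these names come from the MariaDB source.
struct ToyWaiter {
  std::mutex m;
  std::condition_variable cv;
  bool prior_done = false;  // previous group has completed
  bool killed = false;      // a simulated user KILL has arrived

  // Returns true if the previous group completed, false if a kill aborted
  // the wait. With allow_kill == false the kill flag is ignored.
  bool wait_for_prior(bool allow_kill) {
    std::unique_lock<std::mutex> lk(m);
    cv.wait(lk, [&] { return prior_done || (allow_kill && killed); });
    return prior_done;
  }

  void kill() {
    { std::lock_guard<std::mutex> lk(m); killed = true; }
    cv.notify_all();
  }

  void prior_completed() {
    { std::lock_guard<std::mutex> lk(m); prior_done = true; }
    cv.notify_all();
  }
};

struct ToyResult {
  std::vector<int> completion_order;
  bool first_wait_aborted = false;
};

ToyResult run_toy_replica() {
  ToyWaiter group2;  // wait state of group 2, which must complete after group 1
  ToyResult result;
  std::promise<void> kill_observed;

  group2.kill();  // the kill arrives before group 1 has completed

  std::thread worker2([&] {
    // Normal, killable wait: aborted by the kill.
    result.first_wait_aborted = !group2.wait_for_prior(true);
    kill_observed.set_value();
    // Completion step: if the earlier wait was aborted, wait again, this
    // time ignoring the kill, so that group 2 cannot overtake group 1.
    if (result.first_wait_aborted)
      group2.wait_for_prior(false);
    result.completion_order.push_back(2);
  });

  kill_observed.get_future().wait();  // group 1 is still running here
  result.completion_order.push_back(1);
  group2.prior_completed();
  worker2.join();
  return result;
}
```

Calling `run_toy_replica()` records the order in which the two toy groups complete. Without the second, non-killable wait, the worker could record group 2 as soon as the kill aborted its first wait, which is the out-of-order completion the commit message describes.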
@@ -7635,11 +7635,18 @@ wait_for_commit::register_wait_for_prior_commit(wait_for_commit *waitee)
   with register_wait_for_prior_commit(). If the commit already completed,
   returns immediately.
 
+  If ALLOW_KILL is set to true (the default), the wait can be aborted by a
+  kill. In case of kill, the wait registration is still removed, so another
+  call of unregister_wait_for_prior_commit() is needed to later retry the
+  wait. If ALLOW_KILL is set to false, then kill will be ignored and this
+  function will not return until the prior commit (if any) has called
+  wakeup_subsequent_commits().
+
   If thd->backup_commit_lock is set, release it while waiting for other threads
 */
 
 int
-wait_for_commit::wait_for_prior_commit2(THD *thd)
+wait_for_commit::wait_for_prior_commit2(THD *thd, bool allow_kill)
 {
   PSI_stage_info old_stage;
   wait_for_commit *loc_waitee;
@@ -7664,7 +7671,7 @@ wait_for_commit::wait_for_prior_commit2(THD *thd)
                   &stage_waiting_for_prior_transaction_to_commit,
                   &old_stage);
   while ((loc_waitee= this->waitee.load(std::memory_order_relaxed)) &&
-         likely(!thd->check_killed(1)))
+         (!allow_kill || likely(!thd->check_killed(1))))
     mysql_cond_wait(&COND_wait_commit, &LOCK_wait_commit);
   if (!loc_waitee)
   {
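The new loop condition in the second hunk can be read in isolation. The sketch below mirrors only its boolean shape. `ToyThd` and `keep_waiting` are hypothetical stand-ins, not MariaDB types or functions, and the toy `check_killed()` takes no argument, unlike the call in the diff. It shows that with allow_kill set to false the loop keeps waiting while a waitee is registered, and that the kill check is not evaluated at all in that case because `||` short-circuits.

```cpp
#include <cassert>

// Hypothetical stand-in for THD: counts how often the kill flag is polled.
struct ToyThd {
  bool killed = false;
  int polls = 0;
  bool check_killed() { ++polls; return killed; }
};

// Same boolean shape as the new loop condition: true means "keep waiting".
bool keep_waiting(bool has_waitee, bool allow_kill, ToyThd &thd) {
  return has_waitee && (!allow_kill || !thd.check_killed());
}
```

With a killed `ToyThd`, `keep_waiting(true, false, thd)` stays true and leaves `polls` at 0, while `keep_waiting(true, true, thd)` returns false after polling once. When no waitee is registered, the result is false regardless of the other two inputs.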