mirror of https://github.com/MariaDB/server.git synced 2025-07-29 05:21:33 +03:00

MDEV-13915: STOP SLAVE takes very long time on a busy system

The problem is that a parallel replica would not immediately stop
running/queued transactions when STOP SLAVE was issued. That is, it
allowed the current group of transactions to run to completion, and
if the last group had already started committing, transactions
belonging to the next group could also be started and run through
commit after STOP SLAVE was issued. This could lead to long waits
for all pending transactions to finish.

This patch updates a parallel replica to try to abort immediately
and roll back any ongoing transactions. The exception is
transactions which are non-transactional (e.g. those modifying
sequences or non-transactional tables): these, and any transactions
prior to them, are run to completion.

The specifics are as follows:

 1. A new stage was added to SHOW PROCESSLIST output for the SQL
Thread when it is waiting for a replica thread to either roll back or
finish its transaction before stopping. This stage presents as
“Waiting for worker thread to stop”.

 2. Worker threads which error or are killed no longer perform GCO
cleanup if there is a concurrently running prior transaction. This
is because a worker thread scheduled to run in a future GCO could be
killed and incorrectly perform cleanup of the active GCO.

 3. Refined the cases in which the FL_TRANSACTIONAL flag is added to
GTID binlog events, to disallow adding it to transactions which
modify both transactional and non-transactional engines when the
binlogging configuration allows the modifications to exist in the
same event, i.e. when using binlog_direct_non_trans_update == 0 and
binlog_format == statement.

 4. A few existing MTR tests relied on the completion of certain
transactions after issuing STOP SLAVE, and were re-recorded
(potentially with added synchronizations) under the new rollback
behavior.

Reviewed By
===========
Andrei Elkin <andrei.elkin@mariadb.com>
Brandon Nesterenko
2023-03-08 13:49:32 -07:00
parent 8de6740a2f
commit 0a99d457b3
20 changed files with 2187 additions and 24 deletions
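
The two conditions changed in wait_for_prior_commit2() can be read as plain boolean predicates. The following is only a minimal sketch of that logic, not the server code: WorkerState, keep_waiting() and skips_waitee_unregistration() are hypothetical names introduced here for illustration, and the struct fields merely model the THD and rgi_slave members referenced in the hunk below.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the state the patched function reads.
struct WorkerState {
  bool has_waitee;             // a prior transaction has not yet committed
  bool killed;                 // models thd->check_killed(1) reporting a kill
  bool is_replica_worker;      // models thd->rgi_slave being non-null
  bool worker_error;           // models thd->rgi_slave->worker_error
  bool did_mark_start_commit;  // models thd->rgi_slave->did_mark_start_commit
};

// Loop condition: keep blocking on the condition variable while a prior
// transaction is outstanding, unless the thread was killed and the caller
// did not request a forced wait.
constexpr bool keep_waiting(const WorkerState &s, bool force_wait) {
  return s.has_waitee && (!s.killed || force_wait);
}

// Post-loop condition: take the branch that leaves the waitee registered.
// This is either the normal case (no waitee left) or a replica worker that
// failed before it marked the start of its commit.
constexpr bool skips_waitee_unregistration(const WorkerState &s) {
  return !s.has_waitee ||
         (s.is_replica_worker && s.worker_error && !s.did_mark_start_commit);
}

// A few spot checks of the sketch.
inline void run_examples() {
  const WorkerState killed_worker{true, true, true, true, false};
  assert(!keep_waiting(killed_worker, false));  // kill interrupts the wait
  assert(keep_waiting(killed_worker, true));    // force_wait overrides kill
  assert(skips_waitee_unregistration(killed_worker));
}
```

Under these assumptions, a killed worker with an outstanding prior commit leaves the loop early, yet still takes the first branch, so the dependency on the prior transaction stays in place for a later forced wait.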


@@ -7639,7 +7639,7 @@ wait_for_commit::register_wait_for_prior_commit(wait_for_commit *waitee)
 */
 int
-wait_for_commit::wait_for_prior_commit2(THD *thd)
+wait_for_commit::wait_for_prior_commit2(THD *thd, bool force_wait)
 {
   PSI_stage_info old_stage;
   wait_for_commit *loc_waitee;
@@ -7664,9 +7664,24 @@ wait_for_commit::wait_for_prior_commit2(THD *thd)
                   &stage_waiting_for_prior_transaction_to_commit,
                   &old_stage);
   while ((loc_waitee= this->waitee.load(std::memory_order_relaxed)) &&
-         likely(!thd->check_killed(1)))
+         (likely(!thd->check_killed(1)) || force_wait))
     mysql_cond_wait(&COND_wait_commit, &LOCK_wait_commit);
-  if (!loc_waitee)
+  if (!loc_waitee
+#ifndef EMBEDDED_LIBRARY
+      /*
+        If a worker has been killed prior to this wait, e.g. in do_gco_wait(),
+        then it should not perform thread cleanup if there are threads which
+        have yet to commit. This is to prevent the cleanup of resources that
+        the prior RGI may need, e.g. its GCO. This is achieved by skipping
+        the unregistration of the waitee, such that each subsequent call to
+        wait_for_prior_commit() will exit early (while maintaining the
+        dependence), thus allowing the final call to
+        thd->wait_for_prior_commit() within finish_event_group() to wait.
+      */
+      || (thd->rgi_slave && (thd->rgi_slave->worker_error &&
+                             !thd->rgi_slave->did_mark_start_commit))
+#endif
+      )
   {
     if (wakeup_error)
       my_error(ER_PRIOR_COMMIT_FAILED, MYF(0));