mirror of
https://github.com/MariaDB/server.git
synced 2025-07-14 13:41:20 +03:00
MDEV-13915: STOP SLAVE takes very long time on a busy system
At STOP SLAVE, worker threads will continue applying event groups until the end of the current GCO before stopping. This is a left-over from when only conservative mode was available. In optimistic and aggressive mode, often _all_ queued events will be in the same GCO, and the slave stop will be needlessly delayed.

This patch instead records, at STOP SLAVE time, the latest (highest sub_id) event group that has started. Worker threads will then continue to apply event groups up to and including that event group, but skip any following ones. The result is that each worker thread completes its currently running event group, and then the slave stops.

If the slave is caught up, and STOP SLAVE is run in the middle of an event group that is already executing in a worker thread, then that event group will be rolled back and the slave stops immediately, as normal.

Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
@@ -395,12 +395,13 @@ do_gco_wait(rpl_group_info *rgi, group_commit_orderer *gco,
     } while (wait_count > entry->count_committing_event_groups);
   }
 
-  if (entry->force_abort && wait_count > entry->stop_count)
+  if (entry->force_abort && rgi->gtid_sub_id > entry->stop_sub_id)
   {
     /*
-      We are stopping (STOP SLAVE), and this event group is beyond the point
-      where we can safely stop. So return a flag that will cause us to skip,
-      rather than execute, the following events.
+      We are stopping (STOP SLAVE), and this event group need not be applied
+      before we can safely stop. So return a flag that will cause us to skip,
+      rather than execute, the following events. Once all queued events have
+      been skipped, the STOP SLAVE is complete (for this thread).
     */
     return true;
   }
@@ -2357,20 +2358,18 @@ rpl_parallel::wait_for_done(THD *thd, Relay_log_info *rli)
       are also executed, so that we stop at a consistent point in the binlog
       stream (per replication domain).
 
-      All event groups wait for e->count_committing_event_groups to reach
-      the value of group_commit_orderer::wait_count before starting to
-      execute. Thus, at this point we know that any event group with a
-      strictly larger wait_count are safe to skip, none of them can have
-      started executing yet. So we set e->stop_count here and use it to
-      decide in the worker threads whether to continue executing an event
-      group or whether to skip it, when force_abort is set.
+      At this point, we are holding LOCK_parallel_entry, and we know that no
+      event group after e->largest_started_sub_id has started running yet. We
+      record this value in e->stop_sub_id, and then each event group can check
+      their own sub_id against it. If their sub_id is strictly larger, then
+      that event group will be skipped.
 
       If we stop due to reaching the START SLAVE UNTIL condition, then we
       need to continue executing any queued events up to that point.
     */
     e->force_abort= true;
-    e->stop_count= rli->stop_for_until ?
-      e->count_queued_event_groups : e->count_committing_event_groups;
+    e->stop_sub_id= rli->stop_for_until ?
+      e->current_sub_id : e->largest_started_sub_id;
     mysql_mutex_unlock(&e->LOCK_parallel_entry);
     for (j= 0; j < e->rpl_thread_max; ++j)
     {
@@ -2426,7 +2425,7 @@ rpl_parallel::stop_during_until()
     e= (struct rpl_parallel_entry *)my_hash_element(&domain_hash, i);
     mysql_mutex_lock(&e->LOCK_parallel_entry);
     if (e->force_abort)
-      e->stop_count= e->count_committing_event_groups;
+      e->stop_sub_id= e->largest_started_sub_id;
     mysql_mutex_unlock(&e->LOCK_parallel_entry);
   }
 }