mirror of https://github.com/MariaDB/server.git synced 2025-07-30 16:24:05 +03:00

MDEV-6775: Wrong binlog order in parallel replication: Intermediate commit

The code in binlog group commit around wait_for_commit that controls commit
order did the wakeup of subsequent commits early, as soon as a following
transaction was put into the group commit queue, but before any such commit
had actually taken place. This causes problems with too-early wakeup of
transactions that need to wait for a prior commit, but do not take part in
the binlog group commit for one reason or another.

This patch solves the problem by moving the wakeup so that it happens only
after the binlog group commit has completed.

This requires a new solution to ensure that transactions that arrive later
than the leader are still able to participate in group commit. This patch
introduces a flag, wait_for_commit::commit_started. When this flag is set, a
waiter can queue itself in the group commit queue.

In effect, wait_for_prior_commit() is skipped only for transactions that
participate in group commit, where skipping the wait is safe. Other
transactions still wait as needed for correctness.
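
The ordering rule described above can be illustrated with a small
single-threaded model. This is only a sketch under stated assumptions, not the
server's code: the names Txn, GroupCommit, join_or_wait() and
may_proceed_outside_group() are hypothetical, and only the commit_started flag
is taken from the patch. Real locking, condition variables and error handling
are left out.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of one transaction's commit state.
struct Txn {
    std::string name;
    bool commit_started = false;  // models wait_for_commit::commit_started
    bool committed = false;       // set only after the group commit completes

    explicit Txn(std::string n) : name(std::move(n)) {}
};

// Hypothetical model of the binlog group commit queue.
struct GroupCommit {
    std::vector<Txn*> queue;          // queued transactions, in commit order
    std::vector<std::string> binlog;  // names in the order they were "written"

    // A transaction queues itself and marks its commit as started.
    void enqueue(Txn &t) {
        t.commit_started = true;
        queue.push_back(&t);
    }

    // t must commit after prior. If prior has at least started its commit,
    // t may queue itself behind it instead of waiting. Returns false when
    // t still has to wait for prior.
    bool join_or_wait(Txn &t, const Txn &prior) {
        if (prior.committed || prior.commit_started) {
            enqueue(t);
            return true;
        }
        return false;
    }

    // The leader writes the whole queue in order. Only after the write is
    // done are the transactions marked committed, which models the wakeup
    // of subsequent commits happening after the group commit completes.
    void run() {
        for (Txn *t : queue) {
            assert(t->commit_started);
            binlog.push_back(t->name);
        }
        for (Txn *t : queue) {
            t->committed = true;
        }
        queue.clear();
    }
};

// A transaction that does not take part in group commit may proceed only
// once the prior commit has really completed.
inline bool may_proceed_outside_group(const Txn &prior) {
    return prior.committed;
}
```

In this model, a transaction outside the group commit is held back until
run() has finished, while a transaction that joins the queue keeps its place
behind the earlier one.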
Kristian Nielsen
2014-11-13 10:31:20 +01:00
parent eec04fb4f6
commit d08b893b39
5 changed files with 250 additions and 136 deletions


@@ -1713,6 +1713,16 @@ struct wait_for_commit
on that function for details.
*/
bool wakeup_subsequent_commits_running;
/*
This flag can be set when a commit starts, but has not completed yet.
It is used by binlog group commit to allow a waiting transaction T2 to
join the group commit of an earlier transaction T1. When T1 has queued
itself for group commit, it will set the commit_started flag. Then when
T2 becomes ready to commit and needs to wait for T1 to commit first, T2
can queue itself before waiting, and thereby participate in the same
group commit as T1.
*/
bool commit_started;
void register_wait_for_prior_commit(wait_for_commit *waitee);
int wait_for_prior_commit(THD *thd)