mirror of https://github.com/MariaDB/server.git synced 2025-07-30 16:24:05 +03:00

MDEV-7249: Performance problem in parallel replication with multi-level slaves

Parallel replication (in 10.0 / "conservative" mode) relies on binlog group
commits to group transactions that can be safely run in parallel on the
slave. The --binlog-commit-wait-count and --binlog-commit-wait-usec options
exist to increase the number of commits per group. But in case of conflicts
between transactions, this can cause unnecessary delay and reduced throughput,
especially on a slave where commit order is fixed.

This patch adds a heuristic to reduce this problem. When transaction T1 goes
to commit, it will first wait for N transactions to queue up for a group
commit. However, if we detect that another transaction T2 is waiting for a row
lock held by T1, then we will skip the wait and let T1 commit immediately,
releasing its locks and letting T2 continue.
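
The decision described above can be sketched in a few lines of C++. This is only an illustration of the heuristic, not the code from the patch: `TxnState`, `should_commit_now` and `note_row_lock_waiter` are hypothetical names, and the two flags merely mirror the `waiting_on_group_commit` and `has_waiter` members that the diff adds.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the two per-transaction flags added by the patch.
struct TxnState {
  bool waiting_on_group_commit = false;  // queued up, waiting for more commits
  bool has_waiter = false;               // someone is blocked on our row lock
};

// Returns true if the queued transaction should stop waiting and commit.
// `queued` is the number of transactions currently queued for the group
// commit; `wait_count` plays the role of --binlog-commit-wait-count.
bool should_commit_now(const TxnState &txn, std::size_t queued,
                       std::size_t wait_count) {
  assert(wait_count > 0);
  if (queued >= wait_count)
    return true;  // the group is already large enough
  // Heuristic: a transaction blocked on our row lock cannot join the group
  // until we commit, so waiting any longer gains nothing.
  return txn.waiting_on_group_commit && txn.has_waiter;
}

// Called when another transaction starts waiting on a row lock held by
// `holder`. Returns true if `holder` is parked in the group commit wait
// and should be woken so that it can re-evaluate should_commit_now().
bool note_row_lock_waiter(TxnState &holder) {
  holder.has_waiter = true;
  return holder.waiting_on_group_commit;
}
```

In this sketch, T2 calls `note_row_lock_waiter()` on T1 just before blocking on the row lock; a `true` result means T1 is parked in the commit wait and should be signalled. A real implementation would also need to protect the flags with a mutex or similar, which is omitted here.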

On a slave, this avoids the unfortunate situation where T1 is waiting for T2
to join the group commit, but T2 is waiting for T1 to release locks, causing
no work to be done for the duration of the --binlog-commit-wait-usec timeout.

(The heuristic seems reasonable on the master as well, so it is enabled for
all transactions, not just replication transactions.)
Kristian Nielsen
2015-03-13 10:46:00 +01:00
parent bc902a2bfc
commit 184f718fef
10 changed files with 267 additions and 5 deletions


@@ -2674,6 +2674,18 @@ public:
it returned an error on master, and this is OK on the slave.
*/
bool is_slave_error;
/*
True when a transaction is queued up for binlog group commit.
Used so that if another transaction needs to wait for a row lock held by
this transaction, it can signal to trigger the group commit immediately,
skipping the normal --binlog-commit-wait-count wait.
*/
bool waiting_on_group_commit;
/*
Set true when another transaction goes to wait on a row lock held by this
transaction. Used together with waiting_on_group_commit.
*/
bool has_waiter;
/*
In case of a slave, set to the error code the master got when executing
the query. 0 if no error on the master.