mariadb/sql/rpl_parallel.cc at a2578018bf6fb166efaed33e332e1bd9dbab330b

mirror of https://github.com/MariaDB/server.git synced 2025-11-28 17:36:30 +03:00

Kristian Nielsen f27817c1d0 MDEV-7326: Server deadlock in connection with parallel replication

The bug occurs when a transaction retries after every transaction in a
group-commit batch from the master has already called mark_start_commit(). In
that case, the retrying transaction can call unmark_start_commit() after the
following batch has started running and de-allocated the GCO (group commit
orderer). After the retry, the transaction calls mark_start_commit() again on
the de-allocated GCO, and the wakeup of later GCOs can also be lost.

This was seen "in the wild" by a user, even though it is not known exactly
what circumstances can lead to retry of one transaction after all transactions
in a group have reached the commit phase.

The handling of GCO lifetime was somewhat clunky anyway. With this patch, a GCO
lives until rpl_parallel_entry::last_committed_sub_id has reached the last
transaction in the GCO. This guarantees that the GCO will still be alive when
a transaction does mark_start_commit(). Also, we now loop over the list of
active GCOs for wakeup, to ensure we do not lose a wakeup even in the
problematic case.
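
The lifetime rule and the wakeup loop described above can be illustrated with a small stand-alone sketch. This is not the code from rpl_parallel.cc: the types SketchGco and SketchEntry, the fields wait_count, count_committing and woken, and the helpers add_gco(), wake_ready() and finish_commit() are hypothetical names invented for illustration. Only last_committed_sub_id, mark_start_commit() and unmark_start_commit() are taken from the commit message, and real locking and condition variables are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <list>

// Hypothetical, simplified stand-in for a group commit orderer (GCO).
struct SketchGco {
  uint64_t last_sub_id;  // sub_id of the last transaction in this group
  uint64_t wait_count;   // commits that must have started before this group runs
  bool woken;            // set once the group has been allowed to run
};

// Hypothetical, simplified stand-in for rpl_parallel_entry.
struct SketchEntry {
  uint64_t last_committed_sub_id = 0;
  uint64_t count_committing = 0;  // transactions that reached the commit phase
  std::list<SketchGco> active;    // live GCOs, oldest first

  void add_gco(uint64_t last_sub_id, uint64_t wait_count) {
    active.push_back(SketchGco{last_sub_id, wait_count, false});
    wake_ready();
  }

  // Loop over every active GCO, not just the next one, so that a wakeup
  // cannot be lost when a retry temporarily lowered the counter.
  void wake_ready() {
    for (SketchGco &g : active)
      if (count_committing >= g.wait_count)
        g.woken = true;
  }

  void mark_start_commit() {
    ++count_committing;
    wake_ready();
  }

  // Called when a transaction rolls back in order to retry.
  void unmark_start_commit() { --count_committing; }

  // A GCO is released only once last_committed_sub_id has reached the
  // last transaction in it, so it is still alive for any retry in the group.
  void finish_commit(uint64_t sub_id) {
    if (sub_id > last_committed_sub_id)
      last_committed_sub_id = sub_id;
    while (!active.empty() &&
           active.front().last_sub_id <= last_committed_sub_id)
      active.pop_front();
  }
};

// Walks through the problematic case: a retry after the whole group
// has already reached the commit phase.
inline void demo_retry_keeps_gco_alive() {
  SketchEntry e;
  e.add_gco(2, 0);           // group 1: sub_ids 1..2
  e.add_gco(4, 2);           // group 2: sub_ids 3..4, waits for 2 commits
  e.mark_start_commit();     // transaction 1
  e.mark_start_commit();     // transaction 2, group 2 is now woken
  e.unmark_start_commit();   // transaction 2 retries
  e.finish_commit(1);        // transaction 1 commits
  assert(e.active.size() == 2);  // group 1's GCO is still alive
  e.mark_start_commit();     // transaction 2 re-does mark_start_commit()
  e.finish_commit(2);        // now group 1's GCO can be released
  assert(e.active.size() == 1);
}
```

In this sketch, calling demo_retry_keeps_gco_alive() passes both assertions: the first group's GCO survives until its last transaction has committed, so the repeated mark_start_commit() never touches freed state.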

2015-01-07 14:45:39 +01:00

65 KiB
