1
0
mirror of https://github.com/MariaDB/server.git synced 2025-07-30 16:24:05 +03:00

MDEV-34696: do_gco_wait() completes too early on InnoDB dict stats updates

Before doing mark_start_commit(), check that there is no pending deadlock
kill. If there is a pending kill, we won't commit (we will abort, roll back,
and retry). Then we should not mark the commit as started, since that could
potentially make the following GCO start too early, before we completed the
commit after the retry.

This condition could trigger in some corner cases, where InnoDB would take
temporarily table/row locks that are released again immediately, not held
until the transaction commits. This happens with dict_stats updates and
possibly auto-increment locks.

Such locks can be passed to thd_rpl_deadlock_check() and cause a deadlock
kill to be scheduled in the background. But since the blocking locks are
held only temporarily, they can be released before the background kill
happens. This way, the kill can be delayed until after mark_start_commit()
has been called. Thus we need to check the synchronous indication
rgi->killed_for_retry, not just the asynchroneous thd->killed.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
This commit is contained in:
Kristian Nielsen
2024-08-03 03:51:47 +02:00
parent 1f040ae048
commit b4c2e23954
2 changed files with 33 additions and 4 deletions

View File

@ -1450,11 +1450,23 @@ handle_rpl_parallel_thread(void *arg)
after mark_start_commit(), we have to unmark, which has at least a
theoretical possibility of leaving a window where it looks like all
transactions in a GCO have started committing, while in fact one
will need to rollback and retry. This is not supposed to be possible
(since there is a deadlock, at least one transaction should be
blocked from reaching commit), but this seems a fragile ensurance,
and there were historically a number of subtle bugs in this area.
will need to rollback and retry.
Normally this will not happen, since the kill is there to resolve a
deadlock that is preventing at least one transaction from proceeding.
One case it can happen is with InnoDB dict stats update, which can
temporarily cause transactions to block each other, but locks are
released immediately, they don't linger until commit. There could be
other similar cases, there were historically a number of subtle bugs
in this area.
But once we start the commit, we can expect that no new lock
conflicts will be introduced. So by handling any lingering deadlock
kill at this point just before mark_start_commit(), we should be
robust even towards spurious deadlock kills.
*/
if (rgi->killed_for_retry != rpl_group_info::RETRY_KILL_NONE)
wait_for_pending_deadlock_kill(thd, rgi);
if (!thd->killed)
{
DEBUG_SYNC(thd, "rpl_parallel_before_mark_start_commit");