1
0
mirror of https://github.com/MariaDB/server.git synced 2025-07-29 05:21:33 +03:00

MDEV-15326: InnoDB: Failing assertion: !other_lock

MySQL 5.7.9 (and MariaDB 10.2.2) introduced a race condition
between InnoDB transaction commit and the conversion of implicit
locks into explicit ones.

The assertion failure can be triggered with a test that runs
3 concurrent single-statement transactions in a loop on a simple
table:

CREATE TABLE t (a INT PRIMARY KEY) ENGINE=InnoDB;
thread1: INSERT INTO t SET a=1;
thread2: DELETE FROM t;
thread3: SELECT * FROM t FOR UPDATE; -- or DELETE FROM t;

The failure scenarios are like the following:
(1) The INSERT statement is being committed, waiting for lock_sys->mutex.
(2) At the time of the failure, both the DELETE and SELECT transactions
are active but have not logged any changes yet.
(3) The transaction where the !other_lock assertion fails started
lock_rec_convert_impl_to_expl().
(4) After this point, the commit of the INSERT removed the transaction from
trx_sys->rw_trx_set, in trx_erase_lists().
(5) The other transaction consulted trx_sys->rw_trx_set and determined
that there is no implicit lock. Hence, it grabbed the lock.
(6) The !other_lock assertion fails in lock_rec_add_to_queue()
for the lock_rec_convert_impl_to_expl(), because the lock was 'stolen'.
This assertion failure looks genuine, because the INSERT transaction
is still active (trx->state=TRX_STATE_ACTIVE).

The problematic step (4) was introduced in
mysql/mysql-server@e27e0e0bb7
which fixed something related to MVCC (covered by the test
innodb.innodb-read-view). Basically, it reintroduced an error
that had been mentioned in an earlier commit
mysql/mysql-server@a17be6963f:
"The active transaction was removed from trx_sys->rw_trx_set prematurely."

Our fix goes along the following lines:

(a) Implicit locks will released by assigning
trx->state=TRX_STATE_COMMITTED_IN_MEMORY as the first step.
This transition will no longer be protected by lock_sys_t::mutex,
only by trx->mutex. This idea is by Sergey Vojtovich.
(b) We detach the transaction from trx_sys before starting to release
explicit locks.
(c) All callers of trx_rw_is_active() and trx_rw_is_active_low() must
recheck trx->state after acquiring trx->mutex.
(d) Before releasing any explicit locks, we will ensure that any activity
by other threads to convert implicit locks into explicit will have ceased,
by checking !trx_is_referenced(trx). There was a glitch
in this check when it was part of lock_trx_release_locks(); at the end
we would release trx->mutex and acquire lock_sys->mutex and trx->mutex,
and fail to recheck (trx_is_referenced() is protected by trx_t::mutex).
(e) Explicit locks can be released in batches (LOCK_RELEASE_INTERVAL=1000)
just like we did before.

trx_t::state: Document that the transition to COMMITTED is only
protected by trx_t::mutex, no longer by lock_sys_t::mutex.

trx_rw_is_active_low(), trx_rw_is_active(): Document that the transaction
state should be rechecked after acquiring trx_t::mutex.

trx_t::commit_state(): New function to change a transaction to committed
state, to release implicit locks.

trx_t::release_locks(): New function to release the explicit locks
after commit_state().

lock_trx_release_locks(): Move much of the logic to the caller
(which must invoke trx_t::commit_state() and trx_t::release_locks()
as needed), and assert that the transaction will have locks.

trx_get_trx_by_xid(): Make the parameter a pointer to const.

lock_rec_other_trx_holds_expl(): Recheck trx->state after acquiring
trx->mutex, and avoid a redundant lookup of the transaction.

lock_rec_queue_validate(): Recheck impl_trx->state while holding
impl_trx->mutex.

row_vers_impl_x_locked(), row_vers_impl_x_locked_low():
Document that the transaction state must be rechecked after
trx_mutex_enter().

trx_free_prepared(): Adjust for the changes to lock_trx_release_locks().
This commit is contained in:
Marko Mäkelä
2019-09-03 12:31:37 +03:00
parent 7c79c12784
commit b07beff894
11 changed files with 280 additions and 354 deletions

View File

@ -575,9 +575,9 @@ struct xid_t {
char data[XIDDATASIZE]; // not \0-terminated !
xid_t() {} /* Remove gcc warning */
bool eq(struct xid_t *xid)
bool eq(struct xid_t *xid) const
{ return !xid->is_null() && eq(xid->gtrid_length, xid->bqual_length, xid->data); }
bool eq(long g, long b, const char *d)
bool eq(long g, long b, const char *d) const
{ return !is_null() && g == gtrid_length && b == bqual_length && !memcmp(d, data, g+b); }
void set(struct xid_t *xid)
{ memcpy(this, xid, xid->length()); }