mirror of
https://github.com/MariaDB/server.git
synced 2025-08-09 22:24:09 +03:00
sel_restore_position_for_mysql() moves the persistent cursor forward after the btr_pcur_restore_position() call if the cursor's relative position is BTR_PCUR_ON and the cursor points to a record whose field values differ from those of the stored record (plus some other conditions that are not important for this case).

This was done because btr_pcur_restore_position() sets the page_cur_mode_t mode to PAGE_CUR_LE for cursor->rel_pos == BTR_PCUR_ON before opening the cursor, so we search for a record less than or equal to the stored one. If the found record is not equal to the stored one, then it is less, and the cursor must be moved forward.

However, the stored record may have been purged, and a new record with the same key but a different value inserted, while row_search_mvcc() was suspended. In this case, when the thread is awakened, it invokes sel_restore_position_for_mysql(), which in turn invokes btr_pcur_restore_position(), which returns false because the found record does not match the stored record, and sel_restore_position_for_mysql() moves the cursor forward.

As a result, the awakened row_search_mvcc() may not see records inserted by other transactions while it slept. The mtr test case shows an example of how this can happen.

The fix is to return a special value from the persistent cursor restoring function to notify its caller that the unique fields of the restored record and the stored record are the same; in this case sel_restore_position_for_mysql() does not move the cursor forward.

Delete-marked records are processed correctly in row_search_mvcc(). Non-unique secondary indexes are "uniquified" by adding the PK, so index->n_uniq should then be index->n_fields. The fix therefore needs no additional checks.

If the transaction's read view cannot see the changes made in a secondary index record, row_search_mvcc() requests the clustered index record to check its transaction id and get the corresponding record version. After this, row_search_mvcc() commits the mtr to preserve the clustered index latching order, and starts a new mtr. Between that mtr commit and start, the secondary index pages are unlatched, and purge can remove the record stored in the cursor. This causes duplicate rows in the result set for non-locking reads, because the cursor position is restored to the previously visited record.

To solve this, the changes are simply switched off for non-locking reads. This is a simple solution, and the changes make no sense for non-locking reads anyway.

A more complex solution, but a more efficient one from the performance perspective, is to create an mtr savepoint before requesting the clustered record and roll back to that savepoint afterwards. See MDEV-27557.

One more solution is to have a per-record transaction id for secondary indexes. See MDEV-17598.

If either of those is implemented, just remove the select_lock_type argument of sel_restore_position_for_mysql().
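
The cursor-restoring decision described in this commit message can be sketched with a small stand-alone model. This is an illustration only, not InnoDB code: the names Rec, Restore, search_le(), classify(), next_to_process(), resume_scan() and the keep_same_uniq flag are all hypothetical, the index is a sorted vector instead of a B-tree, and the stored record is assumed to have been handled already before the scan was suspended.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Simplified model of a record in a unique secondary index on column c:
// `c` is the unique field, `pk` is the appended primary key value.
struct Rec {
    int c;
    int pk;
};

inline bool operator==(const Rec& a, const Rec& b) {
    return a.c == b.c && a.pk == b.pk;
}

// Index order: by c first, then by pk.
inline bool rec_less(const Rec& a, const Rec& b) {
    return std::make_pair(a.c, a.pk) < std::make_pair(b.c, b.pk);
}

// Hypothetical outcome of restoring a stored cursor position.
enum class Restore { SAME_ALL, SAME_UNIQ, NOT_SAME };

// Model of a "less or equal" search: position of the last record that is
// <= stored, or -1 if every record is greater. `index` must be sorted.
int search_le(const std::vector<Rec>& index, const Rec& stored) {
    assert(std::is_sorted(index.begin(), index.end(), rec_less));
    auto it = std::upper_bound(index.begin(), index.end(), stored, rec_less);
    return static_cast<int>(it - index.begin()) - 1;
}

// Compare the record found at `pos` with the stored record.
Restore classify(const std::vector<Rec>& index, int pos, const Rec& stored) {
    if (pos < 0) return Restore::NOT_SAME;
    const Rec& found = index[static_cast<std::size_t>(pos)];
    if (found == stored) return Restore::SAME_ALL;
    if (found.c == stored.c) return Restore::SAME_UNIQ;
    return Restore::NOT_SAME;
}

// Position of the next record an ascending scan should process.
// keep_same_uniq == true models "do not move forward when only the unique
// fields match" (locking reads in this sketch); false models "always move
// forward unless the cursor must stay" (non-locking reads in this sketch).
int next_to_process(const std::vector<Rec>& index, const Rec& stored,
                    bool keep_same_uniq) {
    const int pos = search_le(index, stored);
    const Restore r = classify(index, pos, stored);
    if (r == Restore::SAME_UNIQ && keep_same_uniq) return pos;
    return pos + 1;  // stored record already handled, or found record is less
}

// Resume the scan after restoring the cursor and collect the pk values.
std::vector<int> resume_scan(const std::vector<Rec>& index, const Rec& stored,
                             bool keep_same_uniq) {
    std::vector<int> out;
    const int start = next_to_process(index, stored, keep_same_uniq);
    for (std::size_t i = static_cast<std::size_t>(start); i < index.size(); ++i)
        out.push_back(index[i].pk);
    return out;
}
```

With the index {(10,5), (20,20), (30,30)} and the stored record (10,10), which mirrors the state after purge in the test below, resume_scan() with keep_same_uniq set to true resumes at (10,5), while false resumes at (20,20). If the scan had not yet visited (10,5), staying on it is what makes the newly inserted record visible. If (10,5) was already returned before the suspension, as in the non-locking read of the test, staying on it would return pk 5 a second time, which is why the model moves forward in that case.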
45 lines
1.6 KiB
Plaintext
--source include/have_innodb.inc
--source include/have_debug.inc
--source include/have_debug_sync.inc
--source include/count_sessions.inc

SET @saved_frequency = @@GLOBAL.innodb_purge_rseg_truncate_frequency;
SET GLOBAL innodb_purge_rseg_truncate_frequency=1;

CREATE TABLE t1 (pk int PRIMARY KEY, c int UNIQUE) ENGINE=InnoDB;

INSERT INTO t1 VALUES (10,10),(20,20),(30,30);

--connect(prevent_purge,localhost,root,,)
start transaction with consistent snapshot;
# We need this to update page's transaction id for secondary index.
UPDATE t1 SET c=300 WHERE pk = 30;

--connection default
DELETE FROM t1 WHERE pk = 10;
INSERT INTO t1 VALUES(5,10);
SET DEBUG_SYNC = "row_search_clust_unlatched SIGNAL unlatched WAIT_FOR cont";
# With the above sync point row_search_mvcc() will be blocked on the delete-marked
# record (10,10) in the secondary index just after all page latches are released.
# After this record is purged, row_search_mvcc() will be unblocked, and the cursor
# will be restored to the secondary index record (10,5). As the unique field is
# the same as in the cursor's stored record, if the bug is not fixed, the
# value 5 will appear twice in the result set.
--send SELECT pk FROM t1 FORCE INDEX (c)

--connect(con1,localhost,root,,)
SET DEBUG_SYNC = "now WAIT_FOR unlatched";
--disconnect prevent_purge
let $wait_all_purged= 1;
--source include/wait_all_purged.inc
SET DEBUG_SYNC = 'now SIGNAL cont';
--disconnect con1

--connection default
--reap

SET DEBUG_SYNC = 'RESET';
DROP TABLE t1;
SET GLOBAL innodb_purge_rseg_truncate_frequency = @saved_frequency;
--source include/wait_until_count_sessions.inc