Mirror of https://github.com/MariaDB/server.git (synced 2025-08-08 11:22:35 +03:00)
MDEV-515 Reduce InnoDB undo logging for insert into empty table
We implement an idea that was suggested by Michael 'Monty' Widenius in October 2017: When InnoDB is inserting into an empty table or partition, we can write a single undo log record TRX_UNDO_EMPTY, which will cause ROLLBACK to clear the table.

For this to work, the insert into an empty table or partition must be covered by an exclusive table lock that will be held until the transaction has been committed or rolled back, or the INSERT operation has been rolled back (and the table is empty again), in lock_table_x_unlock().

Clustered index records that are covered by the TRX_UNDO_EMPTY record will carry DB_TRX_ID=0 and DB_ROLL_PTR=1<<55, and thus they cannot be distinguished from what MDEV-12288 leaves behind after purging the history of row-logged operations.

Concurrent non-locking reads must be adjusted: If the read view was created before the INSERT into an empty table, then we must continue to imagine that the table is empty, and not try to read any records. If the read view was created after the INSERT was committed, then all records must be visible normally. To implement this, we introduce the field dict_table_t::bulk_trx_id.

This special handling only applies to the very first INSERT statement of a transaction for the empty table or partition. If a subsequent statement in the transaction is modifying the initially empty table again, we must enable row-level undo logging, so that we will be able to roll back to the start of the statement in case of an error (such as duplicate key).

INSERT IGNORE will continue to use row-level logging and locking, because implementing it would require the ability to roll back the latest row. Since the undo log that we write only allows us to roll back the entire statement, we cannot support INSERT IGNORE. We will introduce a handler::extra() parameter HA_EXTRA_IGNORE_INSERT to indicate to storage engines that INSERT IGNORE is being executed.
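
The read-view rule described above can be illustrated with a small, self-contained sketch. This is not the InnoDB code: SketchReadView, SketchTable and must_treat_as_empty() are hypothetical names, and the read view is reduced to an upper limit plus a list of transactions that were active when the view was created. Only the idea of comparing a per-table bulk_trx_id against the read view is taken from the commit message.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

using trx_id_t = std::uint64_t;

// Hypothetical, simplified read view (an assumption for illustration).
struct SketchReadView {
  trx_id_t low_limit_id;         // IDs >= this started after the view was created
  std::vector<trx_id_t> active;  // IDs that were still uncommitted at creation

  // A transaction's changes are visible if it started before the view
  // and was no longer active when the view was created.
  bool changes_visible(trx_id_t id) const {
    if (id >= low_limit_id) return false;
    return std::find(active.begin(), active.end(), id) == active.end();
  }
};

// Hypothetical stand-in for the table object; 0 means that no insert
// into an empty table has been recorded.
struct SketchTable {
  trx_id_t bulk_trx_id = 0;
};

// Returns true if a non-locking reader must behave as if the table
// were empty, that is, skip reading any records.
bool must_treat_as_empty(const SketchTable& table, const SketchReadView& view) {
  return table.bulk_trx_id != 0 && !view.changes_visible(table.bulk_trx_id);
}
```

In this sketch, a view created before or during the bulk-inserting transaction sees an empty table, while a view created after its commit reads the records normally.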
In many test cases, we add an extra record to the table, so that during the 'interesting' part of the test, row-level locking and logging will be used.

Replicas will continue to use row-level logging and locking until MDEV-24622 has been addressed. Likewise, this optimization will be disabled in Galera cluster until MDEV-24623 enables it.

dict_table_t::bulk_trx_id: The latest active or committed transaction that initiated an insert into an empty table or partition. Protected by exclusive table lock and a clustered index leaf page latch.

ins_node_t::bulk_insert: Whether bulk insert was initiated.

trx_t::mod_tables: Use C++11 style accessors (emplace instead of insert). Unlike earlier, this collection will cover also temporary tables.

trx_mod_table_time_t: Add start_bulk_insert(), end_bulk_insert(), is_bulk_insert(), was_bulk_insert().

trx_undo_report_row_operation(): Before accessing any undo log pages, invoke trx->mod_tables.emplace() in order to determine whether undo logging was disabled, or whether this is the first INSERT and we are supposed to write a TRX_UNDO_EMPTY record.

row_ins_clust_index_entry_low(): If we are inserting into an empty clustered index leaf page, set the ins_node_t::bulk_insert flag for the subsequent trx_undo_report_row_operation() call.

lock_rec_insert_check_and_lock(), lock_prdt_insert_check_and_lock(): Remove the redundant parameter 'flags' that can be checked in the caller.

btr_cur_ins_lock_and_undo(): Simplify the logic. Correctly write DB_TRX_ID,DB_ROLL_PTR after invoking trx_undo_report_row_operation().

trx_mark_sql_stat_end(), ha_innobase::extra(HA_EXTRA_IGNORE_INSERT), ha_innobase::external_lock(): Invoke trx_t::end_bulk_insert() so that the next statement will not be covered by table-level undo logging.

ReadView::changes_visible(trx_id_t) const: New accessor for the case where the trx_id_t is not read from a potentially corrupted index page but directly from the memory. In this case, we can skip a sanity check.
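
The interplay of trx_t::mod_tables, the bulk insert state and the choice of undo record might be sketched as follows. Everything here is a simplified assumption made for illustration: SketchModTable, SketchTrx, UndoAction and the string table key are hypothetical, and the real trx_mod_table_time_t and trx_undo_report_row_operation() do considerably more. The sketch only shows how one emplace() call can tell whether this is the first modification of a table in the transaction.

```cpp
#include <map>
#include <string>

// Hypothetical outcome of reporting an insert to the undo log.
enum class UndoAction {
  WriteEmptyRecord,      // write one TRX_UNDO_EMPTY record for the table
  CoveredByEmptyRecord,  // same statement: no per-row undo needed
  WriteRowUndo           // ordinary row-level undo logging
};

// Simplified stand-in for the per-table state (an assumption).
class SketchModTable {
  bool bulk_active_ = false;
public:
  void start_bulk_insert() { bulk_active_ = true; }
  void end_bulk_insert() { bulk_active_ = false; }
  bool is_bulk_insert() const { return bulk_active_; }
};

struct SketchTrx {
  std::map<std::string, SketchModTable> mod_tables;

  UndoAction report_insert(const std::string& table, bool table_is_empty) {
    // emplace() inserts only if the table was not yet modified in this
    // transaction; the bool in the result tells us which case applies.
    auto result = mod_tables.emplace(table, SketchModTable{});
    SketchModTable& state = result.first->second;
    if (result.second && table_is_empty) {
      state.start_bulk_insert();
      return UndoAction::WriteEmptyRecord;
    }
    return state.is_bulk_insert() ? UndoAction::CoveredByEmptyRecord
                                  : UndoAction::WriteRowUndo;
  }

  // At the end of each statement, table-level undo logging stops, so
  // later statements fall back to row-level undo logging.
  void end_statement() {
    for (auto& entry : mod_tables) entry.second.end_bulk_insert();
  }
};
```

Under these assumptions, the first insert into an empty table yields WriteEmptyRecord, further rows of the same statement are covered by it, and any insert after end_statement() uses row-level undo.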
row_sel(), row_sel_try_search_shortcut(), row_search_mvcc(), row_sel_try_search_shortcut_for_mysql(), row_merge_read_clustered_index(): Check dict_table_t::bulk_trx_id.

row_sel_clust_sees(): Replaces lock_clust_rec_cons_read_sees().

lock_sec_rec_cons_read_sees(): Replaced with lower-level code.

btr_root_page_init(): Refactored from btr_create().

dict_index_t::clear(), dict_table_t::clear(): Empty an index or table, for the ROLLBACK of an INSERT operation.

ROW_T_EMPTY, ROW_OP_EMPTY: Note a concurrent ROLLBACK of an INSERT into an empty table.

This is joint work with Thirunarayanan Balathandayuthapani, who created a working prototype. Thanks to Matthias Leich for extensive testing.
@@ -18,15 +18,12 @@ ddl_log_file_alter_table 0
SET DEBUG_SYNC = 'RESET';
SET DEBUG_SYNC = 'write_row_noreplace SIGNAL have_handle WAIT_FOR go_ahead';
INSERT INTO t1 VALUES(1,2,3);
# Establish session con1 (user=root)
connect con1,localhost,root,,;
connection con1;
SET DEBUG_SYNC = 'now WAIT_FOR have_handle';
SET lock_wait_timeout = 1;
ALTER TABLE t1 ROW_FORMAT=REDUNDANT;
ERROR HY000: Lock wait timeout exceeded; try restarting transaction
SET DEBUG_SYNC = 'now SIGNAL go_ahead';
# session default
connection default;
ERROR 23000: Duplicate entry '1' for key 'PRIMARY'
SELECT name, count FROM INFORMATION_SCHEMA.INNODB_METRICS WHERE subsystem = 'ddl';
@@ -37,7 +34,6 @@ ddl_online_create_index 0
ddl_pending_alter_table 0
ddl_sort_file_alter_table 0
ddl_log_file_alter_table 0
# session con1
connection con1;
SET @saved_debug_dbug = @@SESSION.debug_dbug;
SET DEBUG_DBUG = '+d,innodb_OOM_prepare_inplace_alter';
@@ -55,7 +51,6 @@ SET SESSION DEBUG = @saved_debug_dbug;
Warnings:
Warning 1287 '@@debug' is deprecated and will be removed in a future release. Please use '@@debug_dbug' instead
ALTER TABLE t1 ROW_FORMAT=REDUNDANT, ALGORITHM=INPLACE, LOCK=NONE;
# session default
connection default;
SHOW CREATE TABLE t1;
Table Create Table
@@ -67,22 +62,17 @@ t1 CREATE TABLE `t1` (
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=REDUNDANT
BEGIN;
INSERT INTO t1 VALUES(7,4,2);
# session con1
connection con1;
SET DEBUG_SYNC = 'row_log_table_apply1_before SIGNAL scanned WAIT_FOR insert_done';
ALTER TABLE t1 DROP PRIMARY KEY, ADD UNIQUE INDEX(c2);
ERROR HY000: Lock wait timeout exceeded; try restarting transaction
# session default
connection default;
COMMIT;
# session con1
connection con1;
ALTER TABLE t1 DROP PRIMARY KEY, ADD UNIQUE INDEX(c2);
ERROR 23000: Duplicate entry '4' for key 'c2'
# session default
connection default;
DELETE FROM t1 WHERE c1 = 7;
# session con1
connection con1;
ALTER TABLE t1 DROP PRIMARY KEY, ADD UNIQUE INDEX(c2), ROW_FORMAT=COMPACT,
LOCK = SHARED, ALGORITHM = INPLACE;
@@ -100,7 +90,6 @@ t1 CREATE TABLE `t1` (
UNIQUE KEY `c2_2` (`c2`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT
ALTER TABLE t1 DROP INDEX c2, ADD PRIMARY KEY(c1);
# session default
connection default;
SET DEBUG_SYNC = 'now WAIT_FOR scanned';
SELECT name, count FROM INFORMATION_SCHEMA.INNODB_METRICS WHERE subsystem = 'ddl';
@@ -114,13 +103,10 @@ ddl_log_file_alter_table 0
BEGIN;
INSERT INTO t1 VALUES(4,7,2);
SET DEBUG_SYNC = 'now SIGNAL insert_done';
# session con1
connection con1;
ERROR 23000: Duplicate entry '4' for key 'PRIMARY'
# session default
connection default;
ROLLBACK;
# session con1
connection con1;
SHOW CREATE TABLE t1;
Table Create Table
@@ -142,7 +128,6 @@ ddl_online_create_index 0
ddl_pending_alter_table 0
ddl_sort_file_alter_table 0
ddl_log_file_alter_table 0
# session default
connection default;
INSERT INTO t1 VALUES(6,3,1);
ERROR 23000: Duplicate entry '3' for key 'c2_2'
@@ -152,14 +137,12 @@ DROP INDEX c2_2 ON t1;
BEGIN;
INSERT INTO t1 VALUES(7,4,2);
ROLLBACK;
# session con1
connection con1;
KILL QUERY @id;
ERROR 70100: Query execution was interrupted
SET DEBUG_SYNC = 'row_log_table_apply1_before SIGNAL rebuilt WAIT_FOR dml_done';
SET DEBUG_SYNC = 'row_log_table_apply2_before SIGNAL applied WAIT_FOR kill_done';
ALTER TABLE t1 ROW_FORMAT=REDUNDANT;
# session default
connection default;
SET DEBUG_SYNC = 'now WAIT_FOR rebuilt';
SELECT name, count FROM INFORMATION_SCHEMA.INNODB_METRICS WHERE subsystem = 'ddl';
@@ -176,7 +159,6 @@ ROLLBACK;
SET DEBUG_SYNC = 'now SIGNAL dml_done WAIT_FOR applied';
KILL QUERY @id;
SET DEBUG_SYNC = 'now SIGNAL kill_done';
# session con1
connection con1;
ERROR 70100: Query execution was interrupted
SELECT name, count FROM INFORMATION_SCHEMA.INNODB_METRICS WHERE subsystem = 'ddl';
@@ -187,7 +169,6 @@ ddl_online_create_index 0
ddl_pending_alter_table 0
ddl_sort_file_alter_table 0
ddl_log_file_alter_table 0
# session default
connection default;
CHECK TABLE t1;
Table Op Msg_type Msg_text
@@ -212,7 +193,6 @@ WHERE variable_name = 'innodb_encryption_n_merge_blocks_decrypted');
SET @rowlog_encrypt_0=
(SELECT variable_value FROM information_schema.global_status
WHERE variable_name = 'innodb_encryption_n_rowlog_blocks_encrypted');
# session con1
connection con1;
SHOW CREATE TABLE t1;
Table Create Table
@@ -227,7 +207,6 @@ SET DEBUG_SYNC = 'row_log_table_apply1_before SIGNAL rebuilt2 WAIT_FOR dml2_done
SET lock_wait_timeout = 10;
ALTER TABLE t1 ROW_FORMAT=COMPACT
PAGE_COMPRESSED = YES PAGE_COMPRESSION_LEVEL = 1, ALGORITHM = INPLACE;
# session default
connection default;
INSERT INTO t1 SELECT 80 + c1, c2, c3 FROM t1;
INSERT INTO t1 SELECT 160 + c1, c2, c3 FROM t1;
@@ -290,7 +269,6 @@ SELECT
sort_balance @merge_encrypt_1>@merge_encrypt_0 @merge_decrypt_1>@merge_decrypt_0 @rowlog_encrypt_1>@rowlog_encrypt_0
0 0 0 0
SET DEBUG_SYNC = 'now SIGNAL dml2_done';
# session con1
connection con1;
ERROR HY000: Creating index 'PRIMARY' required more than 'innodb_online_alter_log_max_size' bytes of modification log. Please try again
SELECT name, count FROM INFORMATION_SCHEMA.INNODB_METRICS WHERE subsystem = 'ddl';
@@ -321,7 +299,6 @@ ERROR 23000: Duplicate entry '5' for key 'PRIMARY'
ALTER TABLE t1 DROP PRIMARY KEY, ADD PRIMARY KEY(c22f,c1,c4(5)),
CHANGE c2 c22f INT, CHANGE c3 c3 CHAR(255) NULL, CHANGE c1 c1 INT AFTER c22f,
ADD COLUMN c4 VARCHAR(6) DEFAULT 'Online', LOCK=NONE;
# session default
connection default;
SET DEBUG_SYNC = 'now WAIT_FOR rebuilt3';
SELECT name, count FROM INFORMATION_SCHEMA.INNODB_METRICS WHERE subsystem = 'ddl';
@@ -349,7 +326,6 @@ ddl_pending_alter_table 1
ddl_sort_file_alter_table 2
ddl_log_file_alter_table 2
SET DEBUG_SYNC = 'now SIGNAL dml3_done';
# session con1
connection con1;
SELECT name, count FROM INFORMATION_SCHEMA.INNODB_METRICS WHERE subsystem = 'ddl';
name count
@@ -405,20 +381,16 @@ SET DEBUG_SYNC = 'row_log_table_apply1_before SIGNAL c3p5_created0 WAIT_FOR ins_
ALTER TABLE t1 MODIFY c3 CHAR(255) NOT NULL, DROP COLUMN c22f,
DROP PRIMARY KEY, ADD PRIMARY KEY(c1,c4(5)),
ADD COLUMN c5 CHAR(5) DEFAULT 'tired' FIRST;
# session default
connection default;
SET DEBUG_SYNC = 'now WAIT_FOR c3p5_created0';
BEGIN;
INSERT INTO t1 VALUES(347,33101,'Pikku kakkosen posti','YLETV2');
INSERT INTO t1 VALUES(33101,347,NULL,'');
SET DEBUG_SYNC = 'now SIGNAL ins_done0';
# session con1
connection con1;
ERROR 01000: Data truncated for column 'c3' at row 323
# session default
connection default;
ROLLBACK;
# session con1
connection con1;
ALTER TABLE t1 MODIFY c3 CHAR(255) NOT NULL;
SET DEBUG_SYNC = 'row_log_table_apply1_before SIGNAL c3p5_created WAIT_FOR ins_done';
@@ -426,14 +398,12 @@ ALTER TABLE t1 DROP PRIMARY KEY, DROP COLUMN c22f,
ADD COLUMN c6 VARCHAR(1000) DEFAULT
'I love tracking down hard-to-reproduce bugs.',
ADD PRIMARY KEY c3p5(c3(5), c6(2));
# session default
connection default;
SET DEBUG_SYNC = 'now WAIT_FOR c3p5_created';
SET DEBUG_SYNC = 'ib_after_row_insert SIGNAL ins_done WAIT_FOR ddl_timed_out';
INSERT INTO t1 VALUES(347,33101,NULL,'');
ERROR 23000: Column 'c3' cannot be null
INSERT INTO t1 VALUES(347,33101,'Pikku kakkosen posti','');
# session con1
connection con1;
ERROR HY000: Lock wait timeout exceeded; try restarting transaction
SET DEBUG_SYNC = 'now SIGNAL ddl_timed_out';
@@ -445,7 +415,6 @@ ddl_online_create_index 0
ddl_pending_alter_table 0
ddl_sort_file_alter_table 6
ddl_log_file_alter_table 2
# session default
connection default;
SELECT COUNT(*) FROM t1;
COUNT(*)
@@ -463,12 +432,8 @@ c22f c1 c3 c4
5 36 36foofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoo Online
5 41 41foofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoo Online
5 46 46foofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoo Online
# session con1
connection con1;
ALTER TABLE t1 DISCARD TABLESPACE;
# Disconnect session con1
disconnect con1;
# session default
connection default;
SHOW CREATE TABLE t1;
Table Create Table
@@ -479,9 +444,25 @@ t1 CREATE TABLE `t1` (
`c4` varchar(6) NOT NULL DEFAULT 'Online',
PRIMARY KEY (`c22f`,`c1`,`c4`(5))
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=REDUNDANT
SET DEBUG_SYNC = 'RESET';
SET GLOBAL innodb_monitor_disable = module_ddl;
DROP TABLE t1;
CREATE TABLE t1 (a INT PRIMARY KEY) ENGINE=InnoDB;
connection con1;
SET DEBUG_SYNC = 'row_log_table_apply1_before SIGNAL created WAIT_FOR ins';
ALTER TABLE t1 FORCE;
connection default;
SET DEBUG_SYNC = 'now WAIT_FOR created';
BEGIN;
INSERT INTO t1 VALUES(1);
ROLLBACK;
SET DEBUG_SYNC = 'now SIGNAL ins';
connection con1;
disconnect con1;
connection default;
SELECT * FROM t1;
a
DROP TABLE t1;
SET DEBUG_SYNC = 'RESET';
SET GLOBAL innodb_file_per_table = @global_innodb_file_per_table_orig;
SET GLOBAL innodb_monitor_enable = default;
SET GLOBAL innodb_monitor_disable = default;