From b42294bc6409794bdbd2051b32fa079d81cea61d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marko=20M=C3=A4kel=C3=A4?=
Date: Fri, 11 Oct 2019 17:28:15 +0300
Subject: [PATCH] MDEV-19514 Defer change buffer merge until pages are requested

We will remove the InnoDB background operation of merging buffered
changes to secondary index leaf pages. Changes will only be merged as a
result of an operation that accesses a secondary index leaf page,
such as a SQL statement that performs a lookup via that index,
or is modifying the index. Also ROLLBACK and some background operations,
such as purging the history of committed transactions, or computing
index cardinality statistics, can cause change buffer merge.
Encryption key rotation will not perform change buffer merge.

The motivation of this change is to simplify the I/O logic and to
allow crash recovery to happen in the background (MDEV-14481).
We also hope that this will reduce the number of "mystery" crashes
due to corrupted data. Because change buffer merge will typically
take place as a result of executing SQL statements, there should be
a clearer connection between the crash and the SQL statements that
were executed when the server crashed.

In many cases, a slight performance improvement was observed.

This is joint work with Thirunarayanan Balathandayuthapani
and was tested by Axel Schwenke and Matthias Leich.

The InnoDB monitor counter innodb_ibuf_merge_usec will be removed.

On slow shutdown (innodb_fast_shutdown=0), we will continue to
merge all buffered changes (and purge all undo log history).

Two InnoDB configuration parameters will be changed as follows:

innodb_disable_background_merge: Removed.
This parameter existed only in debug builds.
All change buffer merges will use synchronous reads.

innodb_force_recovery will be changed as follows:

* innodb_force_recovery=4 will be the same as innodb_force_recovery=3
(the change buffer merge cannot be disabled; it can only happen as a
result of an operation that accesses a secondary index leaf page).
The option used to be capable of corrupting secondary index leaf pages.
Now that capability is removed, and innodb_force_recovery=4 becomes 'safe'.

* innodb_force_recovery=5 (which essentially hard-wires
SET GLOBAL TRANSACTION ISOLATION LEVEL READ UNCOMMITTED)
becomes safe to use. Bogus data can be returned to SQL, but
persistent InnoDB data files will not be corrupted further.

* innodb_force_recovery=6 (ignore the redo log files)
will be the only option that can potentially cause
persistent corruption of InnoDB data files.

Code changes:

buf_page_t::ibuf_exist: New flag, to indicate whether buffered
changes exist for a buffer pool page. Pages with pending changes
can be returned by buf_page_get_gen(). Previously, the changes
were always merged inside buf_page_get_gen() if needed.

ibuf_page_exists(const buf_page_t&): Check if buffered changes
exist for an X-latched or read-fixed page.

buf_page_get_gen(): Add the parameter allow_ibuf_merge=false.
All callers that know that they may be accessing a secondary index
leaf page must pass this parameter as allow_ibuf_merge=true,
unless it does not matter for that caller whether all buffered
changes have been applied. Assert that whenever allow_ibuf_merge
holds, the page actually is a leaf page. Attempt change buffer
merge only to secondary B-tree index leaf pages.

btr_block_get(): Add parameter 'bool merge'.
All callers of btr_block_get() should know whether the page could be
a secondary index leaf page. If it is not, we should avoid consulting
the change buffer bitmap to even consider a merge. This is the main
interface for requesting index pages from the buffer pool.

ibuf_merge_or_delete_for_page(), recv_recover_page(): Replace
buf_page_get_known_nowait() with much simpler logic, because
it is now guaranteed that the block is x-latched or read-fixed.

mlog_init_t::mark_ibuf_exist(): Renamed from mlog_init_t::ibuf_merge().
On crash recovery, we will no longer merge any buffered changes
for the pages that we read into the buffer pool during the last batch
of applying log records.

buf_page_get_gen_known_nowait(), BUF_MAKE_YOUNG, BUF_KEEP_OLD: Remove.

btr_search_guess_on_hash(): Merge buf_page_get_gen_known_nowait()
into its only remaining caller.

buf_page_make_young_if_needed(): Define as an inline function.
Add the parameter buf_pool.

buf_page_peek_if_young(), buf_page_peek_if_too_old(): Add the
parameter buf_pool.

fil_space_validate_for_mtr_commit(): Remove a bogus comment
about background merge of the change buffer.

btr_cur_open_at_rnd_pos_func(), btr_cur_search_to_nth_level_func(),
btr_cur_open_at_index_side_func(): Use narrower data types and scopes.

ibuf_read_merge_pages(): Replaces buf_read_ibuf_merge_pages().
Merge the change buffer by invoking buf_page_get_gen().
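
As a rough illustration of the deferred merge described above (not part
of this patch), the following self-contained sketch models the new
buf_page_t::ibuf_exist flag and the allow_ibuf_merge rule. PageModel,
page_get_model() and may_need_ibuf_merge() are hypothetical names
invented for this sketch; they are not InnoDB types or functions, and
the real buf_page_get_gen() does far more than this.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a buffer pool page.
// This is NOT the real buf_page_t; it only models what the sketch needs.
struct PageModel {
    bool secondary_leaf;  // is this a secondary index leaf page?
    bool ibuf_exist;      // models the new buf_page_t::ibuf_exist flag
    int  pending;         // buffered changes not yet applied to the page
    int  applied;         // changes already merged into the page
};

// Models the rule from the commit message: buffered changes are merged
// only when the caller allows it and the page is a secondary index leaf
// page that has pending changes. Otherwise the page is returned as is,
// with its pending changes still flagged.
inline void page_get_model(PageModel& page, bool allow_ibuf_merge) {
    if (allow_ibuf_merge && page.secondary_leaf && page.ibuf_exist) {
        page.applied += page.pending;
        page.pending = 0;
        page.ibuf_exist = false;
    }
}

// Mirrors the expression that the patch passes as allow_ibuf_merge in
// btr0cur.cc: height == 0 && !index->is_clust()
inline bool may_need_ibuf_merge(unsigned long height, bool is_clustered) {
    return height == 0 && !is_clustered;
}
```

In this model, a caller that does not care about buffered changes passes
allow_ibuf_merge=false and leaves the flag untouched, while a lookup that
reaches level 0 of a secondary index passes true and triggers the merge.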
---
 mysql-test/main/tc_heuristic_recover.test     |   4 +-
 .../suite/innodb/r/ibuf_not_empty.result      |   3 +-
 .../suite/innodb/r/innodb-wl5522-debug.result |   4 +-
 .../innodb/r/innodb_force_recovery.result     |  23 +-
 .../r/innodb_skip_innodb_is_tables.result     |   1 -
 mysql-test/suite/innodb/r/monitor.result      |   1 -
 .../suite/innodb/t/innodb-wl5522-debug.test   |   5 -
 .../suite/innodb/t/innodb_force_recovery.test |  25 +-
 .../innodb_zip/r/wl5522_debug_zip.result      |   4 +-
 .../suite/innodb_zip/t/wl5522_debug_zip.test  |   5 -
 ...nodb_disable_background_merge_basic.result |   4 -
 .../sys_vars/r/sysvars_innodb,32bit.rdiff     |   2 +-
 .../suite/sys_vars/r/sysvars_innodb.result    |  14 +-
 ...innodb_disable_background_merge_basic.test |  12 -
 storage/innobase/btr/btr0btr.cc               |  51 +--
 storage/innobase/btr/btr0bulk.cc              |   4 +-
 storage/innobase/btr/btr0cur.cc               | 222 ++++++------
 storage/innobase/btr/btr0defragment.cc        |   3 +-
 storage/innobase/btr/btr0pcur.cc              |   3 +-
 storage/innobase/btr/btr0scrub.cc             |  11 +-
 storage/innobase/btr/btr0sea.cc               |  63 +++-
 storage/innobase/buf/buf0buf.cc               | 325 ++++++------------
 storage/innobase/buf/buf0rea.cc               |  85 +----
 storage/innobase/dict/dict0boot.cc            |  35 +-
 storage/innobase/dict/dict0stats.cc           |  11 +-
 storage/innobase/fil/fil0fil.cc               |  10 +-
 storage/innobase/gis/gis0rtree.cc             |   4 +-
 storage/innobase/handler/ha_innodb.cc         |  19 +-
 storage/innobase/ibuf/ibuf0ibuf.cc            | 235 ++++++-------
 storage/innobase/include/btr0btr.h            |  10 +-
 storage/innobase/include/buf0buf.h            | 104 +++---
 storage/innobase/include/buf0buf.ic           |  40 +--
 storage/innobase/include/buf0rea.h            |  20 --
 storage/innobase/include/ibuf0ibuf.h          |  12 +-
 storage/innobase/include/ibuf0ibuf.ic         |   8 +-
 storage/innobase/include/srv0mon.h            |   1 -
 storage/innobase/include/srv0srv.h            |   6 +-
 storage/innobase/log/log0recv.cc              |  41 +--
 storage/innobase/row/row0merge.cc             |   2 +-
 storage/innobase/srv/srv0mon.cc               |   5 -
 storage/innobase/srv/srv0srv.cc               |  16 +-
 storage/innobase/srv/srv0start.cc             |   6 +-
 .../r/innodb_i_s_tables_disabled.result       |   1 -
 43 files changed, 549 insertions(+), 911 deletions(-)
 delete mode 100644 mysql-test/suite/sys_vars/r/innodb_disable_background_merge_basic.result
 delete mode 100644 mysql-test/suite/sys_vars/t/innodb_disable_background_merge_basic.test

diff --git a/mysql-test/main/tc_heuristic_recover.test b/mysql-test/main/tc_heuristic_recover.test index 8cbf7d61143..86fea084de8 100644 --- a/mysql-test/main/tc_heuristic_recover.test +++ b/mysql-test/main/tc_heuristic_recover.test @@ -49,7 +49,7 @@ SELECT * FROM t1; # TODO: MDEV-12700 Allow innodb_read_only startup without prior slow shutdown. --source include/kill_mysqld.inc --error 1 ---exec $MYSQLD_LAST_CMD --log-bin=master-bin --binlog-format=mixed --core-file --loose-debug-sync-timeout=300 --innodb-force-recovery=4 +--exec $MYSQLD_LAST_CMD --log-bin=master-bin --binlog-format=mixed --core-file --loose-debug-sync-timeout=300 --debug_dbug="+d,innobase_xa_fail" --let SEARCH_PATTERN= was in the XA prepared state --source include/search_pattern_in_file.inc @@ -59,7 +59,7 @@ SELECT * FROM t1; --source include/search_pattern_in_file.inc --error 1 ---exec $MYSQLD_LAST_CMD --log-bin=master-bin --binlog-format=mixed --core-file --loose-debug-sync-timeout=300 --innodb-force-recovery=4 --tc-heuristic-recover=COMMIT +--exec $MYSQLD_LAST_CMD --log-bin=master-bin --binlog-format=mixed --core-file --loose-debug-sync-timeout=300 --debug_dbug="+d,innobase_xa_fail" --tc-heuristic-recover=COMMIT --let SEARCH_PATTERN= was in the XA prepared state --source include/search_pattern_in_file.inc --let SEARCH_PATTERN= Found 1 prepared transactions!
diff --git a/mysql-test/suite/innodb/r/ibuf_not_empty.result b/mysql-test/suite/innodb/r/ibuf_not_empty.result index 667f0b2c90b..7e6099e7fea 100644 --- a/mysql-test/suite/innodb/r/ibuf_not_empty.result +++ b/mysql-test/suite/innodb/r/ibuf_not_empty.result @@ -21,7 +21,6 @@ INSERT INTO t1 SELECT 0,b,c FROM t1; # restart: --innodb-force-recovery=6 check table t1; Table Op Msg_type Msg_text -test.t1 check Warning InnoDB: Index 'b' contains #### entries, should be 4096. -test.t1 check error Corrupt +test.t1 check status OK # restart DROP TABLE t1; diff --git a/mysql-test/suite/innodb/r/innodb-wl5522-debug.result b/mysql-test/suite/innodb/r/innodb-wl5522-debug.result index 2fb6d0edff5..e367c5d3705 100644 --- a/mysql-test/suite/innodb/r/innodb-wl5522-debug.result +++ b/mysql-test/suite/innodb/r/innodb-wl5522-debug.result @@ -491,7 +491,6 @@ INDEX idx3(c4(512))) Engine=InnoDB; connect purge_control,localhost,root; START TRANSACTION WITH CONSISTENT SNAPSHOT; connection default; -SET GLOBAL innodb_disable_background_merge=ON; SET GLOBAL innodb_monitor_reset = ibuf_merges; SET GLOBAL innodb_monitor_reset = ibuf_merges_insert; INSERT INTO test_wl5522.t1(c2, c3, c4) VALUES @@ -642,6 +641,7 @@ SELECT name FROM information_schema.innodb_metrics WHERE name = 'ibuf_merges_insert' AND count = 0; name +ibuf_merges_insert FLUSH TABLES test_wl5522.t1 FOR EXPORT; backup: t1 UNLOCK TABLES; @@ -649,12 +649,10 @@ SELECT name FROM information_schema.innodb_metrics WHERE name = 'ibuf_merges' AND count > 0; name -ibuf_merges SELECT name FROM information_schema.innodb_metrics WHERE name = 'ibuf_merges_inserts' AND count > 0; name -SET GLOBAL innodb_disable_background_merge=OFF; connection purge_control; COMMIT; disconnect purge_control; diff --git a/mysql-test/suite/innodb/r/innodb_force_recovery.result b/mysql-test/suite/innodb/r/innodb_force_recovery.result index 838de9844e1..9d537126216 100644 --- a/mysql-test/suite/innodb/r/innodb_force_recovery.result +++ 
b/mysql-test/suite/innodb/r/innodb_force_recovery.result @@ -3,38 +3,26 @@ create table t2(f1 int primary key, f2 int, index idx(f2))engine=innodb; insert into t1 values(1, 2); insert into t2 values(1, 2); SET GLOBAL innodb_fast_shutdown = 0; -# Restart the server with innodb_force_recovery as 4. # restart: --innodb-force-recovery=4 select * from t1; f1 f2 1 2 +begin; insert into t1 values(2, 3); -ERROR HY000: Running in read-only mode +rollback; alter table t1 add f3 int not null, algorithm=copy; -ERROR HY000: Can't create table `test`.`t1` (errno: 165 "Table is read only") -alter table t1 add f3 int not null, algorithm=inplace; -ERROR 0A000: ALGORITHM=INPLACE is not supported. Reason: Running in read-only mode. Try ALGORITHM=COPY +alter table t1 add f4 int not null, algorithm=inplace; drop index idx on t1; -ERROR HY000: Can't create table `test`.`t1` (errno: 165 "Table is read only") -alter table t1 drop index idx, algorithm=inplace; -ERROR 0A000: ALGORITHM=INPLACE is not supported. Reason: Running in read-only mode. Try ALGORITHM=COPY update t1 set f1=3 where f2=2; -ERROR HY000: Running in read-only mode create table t3(f1 int not null)engine=innodb; -ERROR HY000: Can't create table `test`.`t3` (errno: 165 "Table is read only") drop table t3; -ERROR 42S02: Unknown table 'test.t3' rename table t1 to t3; -ERROR HY000: Error on rename of './test/t1' to './test/t3' (errno: 165 "Table is read only") +rename table t3 to t1; truncate table t1; -ERROR HY000: Table 't1' is read only -drop table t1; -ERROR HY000: Table 't1' is read only show tables; Tables_in_test t1 t2 -# Restart the server with innodb_force_recovery as 5. # restart: --innodb-force-recovery=5 select * from t2; f1 f2 @@ -65,7 +53,6 @@ show tables; Tables_in_test t1 t2 -# Restart the server with innodb_force_recovery as 6. 
# restart: --innodb-force-recovery=6 select * from t2; f1 f2 @@ -94,7 +81,6 @@ show tables; Tables_in_test t1 t2 -# Restart the server with innodb_force_recovery=2 # restart: --innodb-force-recovery=2 select * from t2; f1 f2 @@ -108,7 +94,6 @@ drop table t1; disconnect con1; connection default; # Kill the server -# Restart the server with innodb_force_recovery=3 # restart: --innodb-force-recovery=3 SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; select * from t2; diff --git a/mysql-test/suite/innodb/r/innodb_skip_innodb_is_tables.result b/mysql-test/suite/innodb/r/innodb_skip_innodb_is_tables.result index a2353913b15..912c3d77867 100644 --- a/mysql-test/suite/innodb/r/innodb_skip_innodb_is_tables.result +++ b/mysql-test/suite/innodb/r/innodb_skip_innodb_is_tables.result @@ -248,7 +248,6 @@ innodb_activity_count server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NU innodb_master_active_loops server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Number of times master thread performs its tasks when server is active innodb_master_idle_loops server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Number of times master thread performs its tasks when server is idle innodb_background_drop_table_usec server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Time (in microseconds) spent to process drop table list -innodb_ibuf_merge_usec server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Time (in microseconds) spent to process change buffer merge innodb_log_flush_usec server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Time (in microseconds) spent to flush log records innodb_mem_validate_usec server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Time (in microseconds) spent to do memory validation innodb_master_purge_usec server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Time (in microseconds) spent by master thread to purge records diff --git 
a/mysql-test/suite/innodb/r/monitor.result b/mysql-test/suite/innodb/r/monitor.result index d6784a71299..16aa8630012 100644 --- a/mysql-test/suite/innodb/r/monitor.result +++ b/mysql-test/suite/innodb/r/monitor.result @@ -214,7 +214,6 @@ innodb_activity_count disabled innodb_master_active_loops disabled innodb_master_idle_loops disabled innodb_background_drop_table_usec disabled -innodb_ibuf_merge_usec disabled innodb_log_flush_usec disabled innodb_mem_validate_usec disabled innodb_master_purge_usec disabled diff --git a/mysql-test/suite/innodb/t/innodb-wl5522-debug.test b/mysql-test/suite/innodb/t/innodb-wl5522-debug.test index b223113b6ff..e8e02d1aeba 100644 --- a/mysql-test/suite/innodb/t/innodb-wl5522-debug.test +++ b/mysql-test/suite/innodb/t/innodb-wl5522-debug.test @@ -1047,9 +1047,6 @@ connect (purge_control,localhost,root); START TRANSACTION WITH CONSISTENT SNAPSHOT; connection default; -# Disable change buffer merge from the master thread, additionally -# enable aggressive flushing so that more changes are buffered. -SET GLOBAL innodb_disable_background_merge=ON; SET GLOBAL innodb_monitor_reset = ibuf_merges; SET GLOBAL innodb_monitor_reset = ibuf_merges_insert; @@ -1112,8 +1109,6 @@ SELECT name FROM information_schema.innodb_metrics WHERE name = 'ibuf_merges_inserts' AND count > 0; -SET GLOBAL innodb_disable_background_merge=OFF; - # Enable normal operation connection purge_control; COMMIT; diff --git a/mysql-test/suite/innodb/t/innodb_force_recovery.test b/mysql-test/suite/innodb/t/innodb_force_recovery.test index fe070100c08..bd9554e6b6e 100644 --- a/mysql-test/suite/innodb/t/innodb_force_recovery.test +++ b/mysql-test/suite/innodb/t/innodb_force_recovery.test @@ -17,47 +17,33 @@ insert into t2 values(1, 2); SET GLOBAL innodb_fast_shutdown = 0; ---echo # Restart the server with innodb_force_recovery as 4. 
--let $restart_parameters= --innodb-force-recovery=4 --source include/restart_mysqld.inc let $status=`SHOW ENGINE INNODB STATUS`; select * from t1; ---error ER_READ_ONLY_MODE +begin; insert into t1 values(2, 3); +rollback; ---error ER_CANT_CREATE_TABLE alter table t1 add f3 int not null, algorithm=copy; ---error ER_ALTER_OPERATION_NOT_SUPPORTED_REASON -alter table t1 add f3 int not null, algorithm=inplace; +alter table t1 add f4 int not null, algorithm=inplace; ---error ER_CANT_CREATE_TABLE drop index idx on t1; ---error ER_ALTER_OPERATION_NOT_SUPPORTED_REASON -alter table t1 drop index idx, algorithm=inplace; ---error ER_READ_ONLY_MODE update t1 set f1=3 where f2=2; ---error ER_CANT_CREATE_TABLE create table t3(f1 int not null)engine=innodb; - ---error ER_BAD_TABLE_ERROR drop table t3; ---error ER_ERROR_ON_RENAME rename table t1 to t3; - ---error ER_OPEN_AS_READONLY +rename table t3 to t1; truncate table t1; ---error ER_OPEN_AS_READONLY -drop table t1; show tables; ---echo # Restart the server with innodb_force_recovery as 5. --let $restart_parameters= --innodb-force-recovery=5 --source include/restart_mysqld.inc let $status=`SHOW ENGINE INNODB STATUS`; @@ -98,7 +84,6 @@ create schema db; drop schema db; show tables; ---echo # Restart the server with innodb_force_recovery as 6. 
--let $restart_parameters= --innodb-force-recovery=6 --source include/restart_mysqld.inc let $status=`SHOW ENGINE INNODB STATUS`; @@ -136,7 +121,6 @@ truncate table t2; drop table t2; show tables; ---echo # Restart the server with innodb_force_recovery=2 --let $restart_parameters= --innodb-force-recovery=2 --source include/restart_mysqld.inc let $status=`SHOW ENGINE INNODB STATUS`; @@ -154,7 +138,6 @@ disconnect con1; connection default; --source include/kill_mysqld.inc ---echo # Restart the server with innodb_force_recovery=3 --let $restart_parameters= --innodb-force-recovery=3 --source include/start_mysqld.inc let $status=`SHOW ENGINE INNODB STATUS`; diff --git a/mysql-test/suite/innodb_zip/r/wl5522_debug_zip.result b/mysql-test/suite/innodb_zip/r/wl5522_debug_zip.result index 864ffba2117..4ce6fe769fd 100644 --- a/mysql-test/suite/innodb_zip/r/wl5522_debug_zip.result +++ b/mysql-test/suite/innodb_zip/r/wl5522_debug_zip.result @@ -120,7 +120,6 @@ ROW_FORMAT=COMPRESSED; connect purge_control,localhost,root; START TRANSACTION WITH CONSISTENT SNAPSHOT; connection default; -SET GLOBAL innodb_disable_background_merge=ON; SET GLOBAL innodb_monitor_reset = ibuf_merges; SET GLOBAL innodb_monitor_reset = ibuf_merges_insert; INSERT INTO test_wl5522.t1(c2, c3, c4) VALUES @@ -271,6 +270,7 @@ SELECT name FROM information_schema.innodb_metrics WHERE name = 'ibuf_merges_insert' AND count = 0; name +ibuf_merges_insert FLUSH TABLES test_wl5522.t1 FOR EXPORT; backup: t1 UNLOCK TABLES; @@ -278,12 +278,10 @@ SELECT name FROM information_schema.innodb_metrics WHERE name = 'ibuf_merges' AND count > 0; name -ibuf_merges SELECT name FROM information_schema.innodb_metrics WHERE name = 'ibuf_merges_inserts' AND count > 0; name -SET GLOBAL innodb_disable_background_merge=OFF; connection purge_control; COMMIT; disconnect purge_control; diff --git a/mysql-test/suite/innodb_zip/t/wl5522_debug_zip.test b/mysql-test/suite/innodb_zip/t/wl5522_debug_zip.test index e42f1baed74..22f729eccbe 100644 
--- a/mysql-test/suite/innodb_zip/t/wl5522_debug_zip.test +++ b/mysql-test/suite/innodb_zip/t/wl5522_debug_zip.test @@ -312,9 +312,6 @@ connect (purge_control,localhost,root); START TRANSACTION WITH CONSISTENT SNAPSHOT; connection default; -# Disable change buffer merge from the master thread, additionally -# enable aggressive flushing so that more changes are buffered. -SET GLOBAL innodb_disable_background_merge=ON; SET GLOBAL innodb_monitor_reset = ibuf_merges; SET GLOBAL innodb_monitor_reset = ibuf_merges_insert; @@ -377,8 +374,6 @@ SELECT name FROM information_schema.innodb_metrics WHERE name = 'ibuf_merges_inserts' AND count > 0; -SET GLOBAL innodb_disable_background_merge=OFF; - # Enable normal operation connection purge_control; COMMIT; diff --git a/mysql-test/suite/sys_vars/r/innodb_disable_background_merge_basic.result b/mysql-test/suite/sys_vars/r/innodb_disable_background_merge_basic.result deleted file mode 100644 index c4bf621a33d..00000000000 --- a/mysql-test/suite/sys_vars/r/innodb_disable_background_merge_basic.result +++ /dev/null @@ -1,4 +0,0 @@ -SET @orig = @@global.innodb_disable_background_merge; -SELECT @orig; -@orig -0 diff --git a/mysql-test/suite/sys_vars/r/sysvars_innodb,32bit.rdiff b/mysql-test/suite/sys_vars/r/sysvars_innodb,32bit.rdiff index a7850658aa9..60be1dc78a0 100644 --- a/mysql-test/suite/sys_vars/r/sysvars_innodb,32bit.rdiff +++ b/mysql-test/suite/sys_vars/r/sysvars_innodb,32bit.rdiff @@ -176,7 +176,7 @@ VARIABLE_SCOPE GLOBAL -VARIABLE_TYPE BIGINT UNSIGNED +VARIABLE_TYPE INT UNSIGNED - VARIABLE_COMMENT Helps to save your data in case the disk image of the database becomes corrupt. + VARIABLE_COMMENT Helps to save your data in case the disk image of the database becomes corrupt. Value 5 can return bogus data, and 6 can permanently corrupt data. 
NUMERIC_MIN_VALUE 0 NUMERIC_MAX_VALUE 6 @@ -949,7 +949,7 @@ diff --git a/mysql-test/suite/sys_vars/r/sysvars_innodb.result b/mysql-test/suite/sys_vars/r/sysvars_innodb.result index ddd47f3c819..b53c5879386 100644 --- a/mysql-test/suite/sys_vars/r/sysvars_innodb.result +++ b/mysql-test/suite/sys_vars/r/sysvars_innodb.result @@ -621,18 +621,6 @@ NUMERIC_BLOCK_SIZE NULL ENUM_VALUE_LIST OFF,ON READ_ONLY NO COMMAND_LINE_ARGUMENT OPTIONAL -VARIABLE_NAME INNODB_DISABLE_BACKGROUND_MERGE -SESSION_VALUE NULL -DEFAULT_VALUE OFF -VARIABLE_SCOPE GLOBAL -VARIABLE_TYPE BOOLEAN -VARIABLE_COMMENT Disable change buffering merges by the master thread -NUMERIC_MIN_VALUE NULL -NUMERIC_MAX_VALUE NULL -NUMERIC_BLOCK_SIZE NULL -ENUM_VALUE_LIST OFF,ON -READ_ONLY NO -COMMAND_LINE_ARGUMENT NONE VARIABLE_NAME INNODB_DISABLE_RESIZE_BUFFER_POOL_DEBUG SESSION_VALUE NULL DEFAULT_VALUE ON @@ -926,7 +914,7 @@ SESSION_VALUE NULL DEFAULT_VALUE 0 VARIABLE_SCOPE GLOBAL VARIABLE_TYPE BIGINT UNSIGNED -VARIABLE_COMMENT Helps to save your data in case the disk image of the database becomes corrupt. +VARIABLE_COMMENT Helps to save your data in case the disk image of the database becomes corrupt. Value 5 can return bogus data, and 6 can permanently corrupt data. NUMERIC_MIN_VALUE 0 NUMERIC_MAX_VALUE 6 NUMERIC_BLOCK_SIZE 0 diff --git a/mysql-test/suite/sys_vars/t/innodb_disable_background_merge_basic.test b/mysql-test/suite/sys_vars/t/innodb_disable_background_merge_basic.test deleted file mode 100644 index 9ab1a90efe1..00000000000 --- a/mysql-test/suite/sys_vars/t/innodb_disable_background_merge_basic.test +++ /dev/null @@ -1,12 +0,0 @@ -# -# Basic test for innodb_disable_background_merge. 
-# - --- source include/have_innodb.inc - -# The config variable is a debug variable --- source include/have_debug.inc - -# Check the default value -SET @orig = @@global.innodb_disable_background_merge; -SELECT @orig; diff --git a/storage/innobase/btr/btr0btr.cc b/storage/innobase/btr/btr0btr.cc index f1d899691d9..d9253809642 100644 --- a/storage/innobase/btr/btr0btr.cc +++ b/storage/innobase/btr/btr0btr.cc @@ -223,7 +223,8 @@ btr_root_block_get( return NULL; } - buf_block_t* block = btr_block_get(*index, index->page, mode, mtr); + buf_block_t* block = btr_block_get(*index, index->page, mode, false, + mtr); if (!block) { index->table->file_unreadable = true; @@ -833,7 +834,8 @@ btr_node_ptr_get_child( return btr_block_get( *index, btr_node_ptr_get_child_page_no(node_ptr, offsets), - RW_SX_LATCH, mtr); + RW_SX_LATCH, btr_page_get_level(page_align(node_ptr)) == 1, + mtr); } /************************************************************//** @@ -2498,7 +2500,6 @@ btr_attach_half_pages( { ulint prev_page_no; ulint next_page_no; - ulint level; page_t* page = buf_block_get_frame(block); page_t* lower_page; page_t* upper_page; @@ -2551,6 +2552,10 @@ btr_attach_half_pages( upper_page_zip = buf_block_get_page_zip(new_block); } + /* Get the level of the split pages */ + const ulint level = btr_page_get_level(buf_block_get_frame(block)); + ut_ad(level == btr_page_get_level(buf_block_get_frame(new_block))); + /* Get the previous and next pages of page */ prev_page_no = btr_page_get_prev(page, mtr); next_page_no = btr_page_get_next(page, mtr); @@ -2558,17 +2563,13 @@ btr_attach_half_pages( /* for consistency, both blocks should be locked, before change */ if (prev_page_no != FIL_NULL && direction == FSP_DOWN) { prev_block = btr_block_get(*index, prev_page_no, RW_X_LATCH, - mtr); + !level, mtr); } if (next_page_no != FIL_NULL && direction != FSP_DOWN) { next_block = btr_block_get(*index, next_page_no, RW_X_LATCH, - mtr); + !level, mtr); } - /* Get the level of the split pages */ - 
level = btr_page_get_level(buf_block_get_frame(block)); - ut_ad(level == btr_page_get_level(buf_block_get_frame(new_block))); - /* Build the node pointer (= node key and page address) for the upper half */ @@ -2709,7 +2710,7 @@ btr_insert_into_right_sibling( ulint max_size; next_block = btr_block_get(*cursor->index, next_page_no, RW_X_LATCH, - mtr); + page_is_leaf(page), mtr); if (UNIV_UNLIKELY(!next_block)) { return NULL; } @@ -3218,7 +3219,8 @@ void btr_level_list_remove(const buf_block_t& block, const dict_index_t& index, if (prev_page_no != FIL_NULL) { buf_block_t* prev_block = btr_block_get( - index, prev_page_no, RW_X_LATCH, mtr); + index, prev_page_no, RW_X_LATCH, page_is_leaf(page), + mtr); page_t* prev_page = buf_block_get_frame(prev_block); #ifdef UNIV_BTR_DEBUG @@ -3234,7 +3236,8 @@ void btr_level_list_remove(const buf_block_t& block, const dict_index_t& index, if (next_page_no != FIL_NULL) { buf_block_t* next_block = btr_block_get( - index, next_page_no, RW_X_LATCH, mtr); + index, next_page_no, RW_X_LATCH, page_is_leaf(page), + mtr); page_t* next_page = buf_block_get_frame(next_block); #ifdef UNIV_BTR_DEBUG @@ -4199,7 +4202,7 @@ btr_discard_page( ut_d(bool parent_is_different = false); if (left_page_no != FIL_NULL) { merge_block = btr_block_get(*index, left_page_no, RW_X_LATCH, - mtr); + true, mtr); merge_page = buf_block_get_frame(merge_block); #ifdef UNIV_BTR_DEBUG ut_a(btr_page_get_next(merge_page, mtr) @@ -4213,7 +4216,7 @@ btr_discard_page( == btr_cur_get_rec(&parent_cursor))); } else if (right_page_no != FIL_NULL) { merge_block = btr_block_get(*index, right_page_no, RW_X_LATCH, - mtr); + true, mtr); merge_page = buf_block_get_frame(merge_block); #ifdef UNIV_BTR_DEBUG ut_a(btr_page_get_prev(merge_page, mtr) @@ -4866,7 +4869,8 @@ btr_validate_level( savepoint2 = mtr_set_savepoint(&mtr); block = btr_block_get(*index, left_page_no, - RW_SX_LATCH, &mtr); + RW_SX_LATCH, false, + &mtr); page = buf_block_get_frame(block); left_page_no = 
btr_page_get_prev(page, &mtr); } @@ -4935,7 +4939,7 @@ loop: savepoint = mtr_set_savepoint(&mtr); right_block = btr_block_get(*index, right_page_no, RW_SX_LATCH, - &mtr); + !level, &mtr); right_page = buf_block_get_frame(right_block); if (btr_page_get_prev(right_page, &mtr) @@ -5109,10 +5113,11 @@ loop: &mtr, savepoint, right_block); btr_block_get(*index, parent_right_page_no, - RW_SX_LATCH, &mtr); + RW_SX_LATCH, false, &mtr); right_block = btr_block_get(*index, right_page_no, - RW_SX_LATCH, &mtr); + RW_SX_LATCH, + !level, &mtr); } btr_cur_position( @@ -5187,16 +5192,17 @@ node_ptr_fails: if (parent_right_page_no != FIL_NULL) { btr_block_get(*index, parent_right_page_no, - RW_SX_LATCH, &mtr); + RW_SX_LATCH, false, + &mtr); } } else if (parent_page_no != FIL_NULL) { btr_block_get(*index, parent_page_no, - RW_SX_LATCH, &mtr); + RW_SX_LATCH, false, &mtr); } } block = btr_block_get(*index, right_page_no, RW_SX_LATCH, - &mtr); + !level, &mtr); page = buf_block_get_frame(block); goto loop; @@ -5299,7 +5305,8 @@ btr_can_merge_with_page( index = btr_cur_get_index(cursor); page = btr_cur_get_page(cursor); - mblock = btr_block_get(*index, page_no, RW_X_LATCH, mtr); + mblock = btr_block_get(*index, page_no, RW_X_LATCH, page_is_leaf(page), + mtr); mpage = buf_block_get_frame(mblock); n_recs = page_get_n_recs(page); diff --git a/storage/innobase/btr/btr0bulk.cc b/storage/innobase/btr/btr0bulk.cc index e559f1dfc8c..f2ad31f3a5d 100644 --- a/storage/innobase/btr/btr0bulk.cc +++ b/storage/innobase/btr/btr0bulk.cc @@ -120,7 +120,7 @@ PageBulk::init() } } else { new_block = btr_block_get(*m_index, m_page_no, RW_X_LATCH, - &m_mtr); + false, &m_mtr); new_page = buf_block_get_frame(new_block); new_page_zip = buf_block_get_page_zip(new_block); @@ -1014,7 +1014,7 @@ BtrBulk::finish(dberr_t err) ut_ad(last_page_no != FIL_NULL); last_block = btr_block_get(*m_index, last_page_no, RW_X_LATCH, - &mtr); + false, &mtr); first_rec = page_rec_get_next( page_get_infimum_rec(last_block->frame)); 
ut_ad(page_rec_is_user_rec(first_rec)); diff --git a/storage/innobase/btr/btr0cur.cc b/storage/innobase/btr/btr0cur.cc index 8140eea96e1..0d24e3ffb94 100644 --- a/storage/innobase/btr/btr0cur.cc +++ b/storage/innobase/btr/btr0cur.cc @@ -248,7 +248,8 @@ btr_cur_latch_leaves( mode = latch_mode == BTR_MODIFY_LEAF ? RW_X_LATCH : RW_S_LATCH; latch_leaves.savepoints[1] = mtr_set_savepoint(mtr); get_block = btr_block_get(*cursor->index, - block->page.id.page_no(), mode, mtr); + block->page.id.page_no(), mode, + true, mtr); latch_leaves.blocks[1] = get_block; #ifdef UNIV_BTR_DEBUG ut_a(page_is_comp(get_block->frame) == page_is_comp(page)); @@ -278,7 +279,8 @@ btr_cur_latch_leaves( latch_leaves.savepoints[0] = mtr_set_savepoint(mtr); get_block = btr_block_get( - *cursor->index, left_page_no, RW_X_LATCH, mtr); + *cursor->index, left_page_no, RW_X_LATCH, + true, mtr); latch_leaves.blocks[0] = get_block; if (spatial) { @@ -295,7 +297,7 @@ btr_cur_latch_leaves( latch_leaves.savepoints[1] = mtr_set_savepoint(mtr); get_block = btr_block_get( *cursor->index, block->page.id.page_no(), - RW_X_LATCH, mtr); + RW_X_LATCH, true, mtr); latch_leaves.blocks[1] = get_block; #ifdef UNIV_BTR_DEBUG @@ -326,7 +328,7 @@ btr_cur_latch_leaves( latch_leaves.savepoints[2] = mtr_set_savepoint(mtr); get_block = btr_block_get(*cursor->index, right_page_no, RW_X_LATCH, - mtr); + true, mtr); latch_leaves.blocks[2] = get_block; #ifdef UNIV_BTR_DEBUG ut_a(page_is_comp(get_block->frame) @@ -353,7 +355,8 @@ btr_cur_latch_leaves( if (left_page_no != FIL_NULL) { latch_leaves.savepoints[0] = mtr_set_savepoint(mtr); get_block = btr_block_get( - *cursor->index, left_page_no, mode, mtr); + *cursor->index, left_page_no, mode, + true, mtr); latch_leaves.blocks[0] = get_block; cursor->left_block = get_block; #ifdef UNIV_BTR_DEBUG @@ -366,7 +369,8 @@ btr_cur_latch_leaves( latch_leaves.savepoints[1] = mtr_set_savepoint(mtr); get_block = btr_block_get(*cursor->index, - block->page.id.page_no(), mode, mtr); + 
block->page.id.page_no(), mode, + true, mtr); latch_leaves.blocks[1] = get_block; #ifdef UNIV_BTR_DEBUG ut_a(page_is_comp(get_block->frame) == page_is_comp(page)); @@ -752,18 +756,17 @@ btr_cur_optimistic_latch_leaves( goto unpin_failed; } - left_page_no = btr_page_get_prev( - buf_block_get_frame(block), mtr); + left_page_no = btr_page_get_prev(block->frame, mtr); rw_lock_s_unlock(&block->lock); cursor->left_block = left_page_no != FIL_NULL ? btr_block_get(*cursor->index, left_page_no, mode, - mtr) + page_is_leaf(block->frame), mtr) : NULL; if (buf_page_optimistic_get(mode, block, modify_clock, file, line, mtr)) { - if (btr_page_get_prev(buf_block_get_frame(block), mtr) + if (btr_page_get_prev(block->frame, mtr) == left_page_no) { buf_block_buf_fix_dec(block); *latch_mode = mode; @@ -1185,7 +1188,6 @@ btr_cur_search_to_nth_level_func( ulint up_bytes; ulint low_match; ulint low_bytes; - ulint savepoint; ulint rw_latch; page_cur_mode_t page_mode; page_cur_mode_t search_mode = PAGE_CUR_UNSUPP; @@ -1197,7 +1199,6 @@ btr_cur_search_to_nth_level_func( ulint root_height = 0; /* remove warning */ dberr_t err = DB_SUCCESS; - ulint upper_rw_latch, root_leaf_rw_latch; btr_intention_t lock_intention; bool modify_external; buf_block_t* tree_blocks[BTR_MAX_LEVELS]; @@ -1387,7 +1388,9 @@ btr_cur_search_to_nth_level_func( /* Store the position of the tree latch we push to mtr so that we know how to release it when we have latched leaf node(s) */ - savepoint = mtr_set_savepoint(mtr); + ulint savepoint = mtr_set_savepoint(mtr); + + rw_lock_type_t upper_rw_latch; switch (latch_mode) { case BTR_MODIFY_TREE: @@ -1448,7 +1451,8 @@ btr_cur_search_to_nth_level_func( upper_rw_latch = RW_NO_LATCH; } } - root_leaf_rw_latch = btr_cur_latch_for_root_leaf(latch_mode); + const rw_lock_type_t root_leaf_rw_latch = btr_cur_latch_for_root_leaf( + latch_mode); page_cursor = btr_cur_get_page_cur(cursor); @@ -1536,7 +1540,8 @@ retry_page_get: ut_ad(n_blocks < BTR_MAX_LEVELS); tree_savepoints[n_blocks] 
= mtr_set_savepoint(mtr); block = buf_page_get_gen(page_id, zip_size, rw_latch, guess, - buf_mode, file, line, mtr, &err); + buf_mode, file, line, mtr, &err, + height == 0 && !index->is_clust()); tree_blocks[n_blocks] = block; /* Note that block==NULL signifies either an error or change @@ -1681,8 +1686,9 @@ retry_page_get: tree_blocks[n_blocks]); tree_savepoints[n_blocks] = mtr_set_savepoint(mtr); - block = buf_page_get_gen(page_id, zip_size, rw_latch, NULL, - buf_mode, file, line, mtr, &err); + block = buf_page_get_gen(page_id, zip_size, + rw_latch, NULL, buf_mode, + file, line, mtr, &err); tree_blocks[n_blocks] = block; if (err != DB_SUCCESS) { @@ -2339,7 +2345,7 @@ need_opposite_intention: buf_block_t* child_block = btr_block_get( *index, page_id.page_no(), latch_mode == BTR_CONT_MODIFY_TREE - ? RW_X_LATCH : RW_SX_LATCH, mtr); + ? RW_X_LATCH : RW_SX_LATCH, false, mtr); btr_assert_not_corrupted(child_block, index); } else { ut_ad(mtr_memo_contains(mtr, block, upper_rw_latch)); @@ -2471,8 +2477,6 @@ btr_cur_open_at_index_side_func( ulint root_height = 0; /* remove warning */ rec_t* node_ptr; ulint estimate; - ulint savepoint; - ulint upper_rw_latch, root_leaf_rw_latch; btr_intention_t lock_intention; buf_block_t* tree_blocks[BTR_MAX_LEVELS]; ulint tree_savepoints[BTR_MAX_LEVELS]; @@ -2509,7 +2513,9 @@ btr_cur_open_at_index_side_func( /* Store the position of the tree latch we push to mtr so that we know how to release it when we have latched the leaf node */ - savepoint = mtr_set_savepoint(mtr); + ulint savepoint = mtr_set_savepoint(mtr); + + rw_lock_type_t upper_rw_latch; switch (latch_mode) { case BTR_CONT_MODIFY_TREE: @@ -2548,7 +2554,9 @@ btr_cur_open_at_index_side_func( upper_rw_latch = RW_NO_LATCH; } } - root_leaf_rw_latch = btr_cur_latch_for_root_leaf(latch_mode); + + const rw_lock_type_t root_leaf_rw_latch = btr_cur_latch_for_root_leaf( + latch_mode); page_cursor = btr_cur_get_page_cur(cursor); cursor->index = index; @@ -2563,22 +2571,17 @@ 
btr_cur_open_at_index_side_func( height = ULINT_UNDEFINED; for (;;) { - buf_block_t* block; - ulint rw_latch; - ut_ad(n_blocks < BTR_MAX_LEVELS); - - if (height != 0 - && (latch_mode != BTR_MODIFY_TREE - || height == level)) { - rw_latch = upper_rw_latch; - } else { - rw_latch = RW_NO_LATCH; - } - tree_savepoints[n_blocks] = mtr_set_savepoint(mtr); - block = buf_page_get_gen(page_id, zip_size, rw_latch, NULL, - BUF_GET, file, line, mtr, &err); + + const ulint rw_latch = height + && (latch_mode != BTR_MODIFY_TREE || height == level) + ? upper_rw_latch : RW_NO_LATCH; + buf_block_t* block = buf_page_get_gen(page_id, zip_size, + rw_latch, NULL, BUF_GET, + file, line, mtr, &err, + height == 0 + && !index->is_clust()); ut_ad((block != NULL) == (err == DB_SUCCESS)); tree_blocks[n_blocks] = block; @@ -2630,75 +2633,62 @@ btr_cur_open_at_index_side_func( ut_ad(height == btr_page_get_level(page)); } - if (height == level) { - if (srv_read_only_mode) { - btr_cur_latch_leaves( - block, latch_mode, cursor, mtr); - } else if (height == 0) { - if (rw_latch == RW_NO_LATCH) { - btr_cur_latch_leaves(block, latch_mode, - cursor, mtr); - } - /* In versions <= 3.23.52 we had - forgotten to release the tree latch - here. If in an index scan we had to - scan far to find a record visible to - the current transaction, that could - starve others waiting for the tree - latch. */ + if (height == 0) { + if (rw_latch == RW_NO_LATCH) { + btr_cur_latch_leaves(block, latch_mode, + cursor, mtr); + } - switch (latch_mode) { - case BTR_MODIFY_TREE: - case BTR_CONT_MODIFY_TREE: - case BTR_CONT_SEARCH_TREE: + /* In versions <= 3.23.52 we had forgotten to + release the tree latch here. If in an index + scan we had to scan far to find a record + visible to the current transaction, that could + starve others waiting for the tree latch. 
*/ + + switch (latch_mode) { + case BTR_MODIFY_TREE: + case BTR_CONT_MODIFY_TREE: + case BTR_CONT_SEARCH_TREE: + break; + default: + if (UNIV_UNLIKELY(srv_read_only_mode)) { break; - default: - if (!s_latch_by_caller) { - /* Release the tree s-latch */ - mtr_release_s_latch_at_savepoint( - mtr, savepoint, - dict_index_get_lock( - index)); - } - - /* release upper blocks */ - for (; n_releases < n_blocks; - n_releases++) { - mtr_release_block_at_savepoint( - mtr, - tree_savepoints[ - n_releases], - tree_blocks[ - n_releases]); - } } - } else { /* height != 0 */ - /* We already have the block latched. */ - ut_ad(latch_mode == BTR_SEARCH_TREE); - ut_ad(s_latch_by_caller); - ut_ad(upper_rw_latch == RW_S_LATCH); + if (!s_latch_by_caller) { + /* Release the tree s-latch */ + mtr_release_s_latch_at_savepoint( + mtr, savepoint, &index->lock); + } - ut_ad(mtr_memo_contains(mtr, block, - upper_rw_latch)); - - if (s_latch_by_caller) { - /* to exclude modifying tree operations - should sx-latch the index. */ - ut_ad(mtr_memo_contains( + /* release upper blocks */ + for (; n_releases < n_blocks; n_releases++) { + mtr_release_block_at_savepoint( mtr, - dict_index_get_lock(index), - MTR_MEMO_SX_LOCK)); - /* because has sx-latch of index, - can release upper blocks. */ - for (; n_releases < n_blocks; - n_releases++) { - mtr_release_block_at_savepoint( - mtr, - tree_savepoints[ - n_releases], - tree_blocks[ - n_releases]); - } + tree_savepoints[n_releases], + tree_blocks[n_releases]); + } + } + } else if (height == level /* height != 0 */ + && UNIV_LIKELY(!srv_read_only_mode)) { + /* We already have the block latched. */ + ut_ad(latch_mode == BTR_SEARCH_TREE); + ut_ad(s_latch_by_caller); + ut_ad(upper_rw_latch == RW_S_LATCH); + + ut_ad(mtr_memo_contains(mtr, block, upper_rw_latch)); + + if (s_latch_by_caller) { + /* to exclude modifying tree operations + should sx-latch the index. 
*/ + ut_ad(mtr_memo_contains(mtr, &index->lock, + MTR_MEMO_SX_LOCK)); + /* because has sx-latch of index, + can release upper blocks. */ + for (; n_releases < n_blocks; n_releases++) { + mtr_release_block_at_savepoint( + mtr, + tree_savepoints[n_releases], + tree_blocks[n_releases]); } } } @@ -2838,8 +2828,6 @@ btr_cur_open_at_rnd_pos_func( ulint node_ptr_max_size = srv_page_size / 2; ulint height; rec_t* node_ptr; - ulint savepoint; - ulint upper_rw_latch, root_leaf_rw_latch; btr_intention_t lock_intention; buf_block_t* tree_blocks[BTR_MAX_LEVELS]; ulint tree_savepoints[BTR_MAX_LEVELS]; @@ -2856,7 +2844,9 @@ btr_cur_open_at_rnd_pos_func( ut_ad(!(latch_mode & BTR_MODIFY_EXTERNAL)); - savepoint = mtr_set_savepoint(mtr); + ulint savepoint = mtr_set_savepoint(mtr); + + rw_lock_type_t upper_rw_latch; switch (latch_mode) { case BTR_MODIFY_TREE: @@ -2903,7 +2893,8 @@ btr_cur_open_at_rnd_pos_func( return(false); } - root_leaf_rw_latch = btr_cur_latch_for_root_leaf(latch_mode); + const rw_lock_type_t root_leaf_rw_latch = btr_cur_latch_for_root_leaf( + latch_mode); page_cursor = btr_cur_get_page_cur(cursor); cursor->index = index; @@ -2919,22 +2910,19 @@ btr_cur_open_at_rnd_pos_func( height = ULINT_UNDEFINED; for (;;) { - buf_block_t* block; page_t* page; - ulint rw_latch; ut_ad(n_blocks < BTR_MAX_LEVELS); - - if (height != 0 - && latch_mode != BTR_MODIFY_TREE) { - rw_latch = upper_rw_latch; - } else { - rw_latch = RW_NO_LATCH; - } - tree_savepoints[n_blocks] = mtr_set_savepoint(mtr); - block = buf_page_get_gen(page_id, zip_size, rw_latch, NULL, - BUF_GET, file, line, mtr, &err); + + const rw_lock_type_t rw_latch = height + && latch_mode != BTR_MODIFY_TREE + ? 
upper_rw_latch : RW_NO_LATCH; + buf_block_t* block = buf_page_get_gen(page_id, zip_size, + rw_latch, NULL, BUF_GET, + file, line, mtr, &err, + height == 0 + && !index->is_clust()); tree_blocks[n_blocks] = block; ut_ad((block != NULL) == (err == DB_SUCCESS)); @@ -7453,7 +7441,7 @@ struct btr_blob_log_check_t { if (m_op == BTR_STORE_INSERT_BULK) { mtr_x_lock(dict_index_get_lock(index), m_mtr); m_pcur->btr_cur.page_cur.block = btr_block_get( - *index, page_no, RW_X_LATCH, m_mtr); + *index, page_no, RW_X_LATCH, false, m_mtr); m_pcur->btr_cur.page_cur.rec = m_pcur->btr_cur.page_cur.block->frame + offs; diff --git a/storage/innobase/btr/btr0defragment.cc b/storage/innobase/btr/btr0defragment.cc index a4211afbb9a..631527d2f1f 100644 --- a/storage/innobase/btr/btr0defragment.cc +++ b/storage/innobase/btr/btr0defragment.cc @@ -585,7 +585,8 @@ btr_defragment_n_pages( break; } - blocks[i] = btr_block_get(*index, page_no, RW_X_LATCH, mtr); + blocks[i] = btr_block_get(*index, page_no, RW_X_LATCH, true, + mtr); } if (n_pages == 1) { diff --git a/storage/innobase/btr/btr0pcur.cc b/storage/innobase/btr/btr0pcur.cc index 9027f9a25c6..943ae996f20 100644 --- a/storage/innobase/btr/btr0pcur.cc +++ b/storage/innobase/btr/btr0pcur.cc @@ -465,7 +465,8 @@ btr_pcur_move_to_next_page( } buf_block_t* next_block = btr_block_get( - *btr_pcur_get_btr_cur(cursor)->index, next_page_no, mode, mtr); + *btr_pcur_get_btr_cur(cursor)->index, next_page_no, mode, + page_is_leaf(page), mtr); if (UNIV_UNLIKELY(!next_block)) { return; diff --git a/storage/innobase/btr/btr0scrub.cc b/storage/innobase/btr/btr0scrub.cc index 5cfd6f5f8b0..b8d3e2e50fa 100644 --- a/storage/innobase/btr/btr0scrub.cc +++ b/storage/innobase/btr/btr0scrub.cc @@ -431,7 +431,7 @@ btr_pessimistic_scrub( } /* read block variables */ - const ulint page_no = mach_read_from_4(page + FIL_PAGE_OFFSET); + const ulint page_no = block->page.id.page_no(); const ulint left_page_no = mach_read_from_4(page + FIL_PAGE_PREV); const ulint 
right_page_no = mach_read_from_4(page + FIL_PAGE_NEXT); @@ -448,12 +448,14 @@ btr_pessimistic_scrub( */ mtr->release_block_at_savepoint(scrub_data->savepoint, block); - btr_block_get(*index, left_page_no, RW_X_LATCH, mtr); + btr_block_get(*index, left_page_no, RW_X_LATCH, + page_is_leaf(page), mtr); /** * Refetch block and re-initialize page */ - block = btr_block_get(*index, page_no, RW_X_LATCH, mtr); + block = btr_block_get(*index, page_no, RW_X_LATCH, + page_is_leaf(page), mtr); page = buf_block_get_frame(block); @@ -465,7 +467,8 @@ btr_pessimistic_scrub( } if (right_page_no != FIL_NULL) { - btr_block_get(*index, right_page_no, RW_X_LATCH, mtr); + btr_block_get(*index, right_page_no, RW_X_LATCH, + page_is_leaf(page), mtr); } /* arguments to btr_page_split_and_insert */ diff --git a/storage/innobase/btr/btr0sea.cc b/storage/innobase/btr/btr0sea.cc index a220acd185c..3cc7c6b825b 100644 --- a/storage/innobase/btr/btr0sea.cc +++ b/storage/innobase/btr/btr0sea.cc @@ -882,10 +882,8 @@ btr_search_guess_on_hash( const rec_t* rec; ulint fold; index_id_t index_id; -#ifdef notdefined - btr_cur_t cursor2; - btr_pcur_t pcur; -#endif + + ut_ad(mtr->is_active()); ut_ad(!ahi_latch || rw_lock_own_flagged( ahi_latch, RW_LOCK_FLAG_X | RW_LOCK_FLAG_S)); @@ -893,11 +891,12 @@ btr_search_guess_on_hash( return(FALSE); } - ut_ad(index && info && tuple && cursor && mtr); - ut_ad(!dict_index_is_ibuf(index)); + ut_ad(!index->is_ibuf()); ut_ad(!ahi_latch || ahi_latch == btr_get_search_latch(index)); ut_ad((latch_mode == BTR_SEARCH_LEAF) || (latch_mode == BTR_MODIFY_LEAF)); + compile_time_assert(ulint{BTR_SEARCH_LEAF} == ulint{RW_S_LATCH}); + compile_time_assert(ulint{BTR_MODIFY_LEAF} == ulint{RW_X_LATCH}); /* Not supported for spatial index */ ut_ad(!dict_index_is_spatial(index)); @@ -955,16 +954,47 @@ fail: return(FALSE); } - buf_block_t* block = buf_block_from_ahi(rec); + buf_block_t* block = buf_block_from_ahi(rec); + buf_pool_t* buf_pool = buf_pool_from_block(block); if (use_latch) { + 
mutex_enter(&block->mutex); - if (!buf_page_get_known_nowait( - latch_mode, block, BUF_MAKE_YOUNG, - __FILE__, __LINE__, mtr)) { + if (buf_block_get_state(block) == BUF_BLOCK_REMOVE_HASH) { + /* Another thread is just freeing the block + from the LRU list of the buffer pool: do not + try to access this page. */ + mutex_exit(&block->mutex); goto fail; } + ut_ad(buf_block_get_state(block) == BUF_BLOCK_FILE_PAGE); + ut_ad(!block->page.file_page_was_freed); + buf_page_set_accessed(&block->page); + buf_block_buf_fix_inc(block, __FILE__, __LINE__); + mutex_exit(&block->mutex); + + buf_page_make_young_if_needed(buf_pool, &block->page); + mtr_memo_type_t fix_type; + if (latch_mode == BTR_SEARCH_LEAF) { + if (!rw_lock_s_lock_nowait(&block->lock, + __FILE__, __LINE__)) { +got_no_latch: + buf_block_buf_fix_dec(block); + goto fail; + } + fix_type = MTR_MEMO_PAGE_S_FIX; + } else { + if (!rw_lock_x_lock_func_nowait_inline( + &block->lock, __FILE__, __LINE__)) { + goto got_no_latch; + } + fix_type = MTR_MEMO_PAGE_X_FIX; + } + mtr->memo_push(block, fix_type); + + buf_pool->stat.n_page_gets++; + rw_lock_s_unlock(use_latch); buf_block_dbg_add_level(block, SYNC_TREE_NODE_FROM_HASH); @@ -1052,20 +1082,15 @@ fail: #ifdef UNIV_SEARCH_PERF_STAT btr_search_n_succ++; #endif - if (!ahi_latch && buf_page_peek_if_too_old(&block->page)) { - - buf_page_make_young(&block->page); - } - /* Increment the page get statistics though we did not really fix the page: for user info only */ - { - buf_pool_t* buf_pool = buf_pool_from_bpage(&block->page); + ++buf_pool->stat.n_page_gets; - ++buf_pool->stat.n_page_gets; + if (!ahi_latch) { + buf_page_make_young_if_needed(buf_pool, &block->page); } - return(TRUE); + return true; } /** Drop any adaptive hash index entries that point to an index page. 
diff --git a/storage/innobase/buf/buf0buf.cc b/storage/innobase/buf/buf0buf.cc index 61ab91ced56..41d242a9360 100644 --- a/storage/innobase/buf/buf0buf.cc +++ b/storage/innobase/buf/buf0buf.cc @@ -3708,28 +3708,6 @@ buf_page_make_young( buf_pool_mutex_exit(buf_pool); } -/********************************************************************//** -Moves a page to the start of the buffer pool LRU list if it is too old. -This high-level function can be used to prevent an important page from -slipping out of the buffer pool. */ -static -void -buf_page_make_young_if_needed( -/*==========================*/ - buf_page_t* bpage) /*!< in/out: buffer block of a - file page */ -{ -#ifdef UNIV_DEBUG - buf_pool_t* buf_pool = buf_pool_from_bpage(bpage); - ut_ad(!buf_pool_mutex_own(buf_pool)); -#endif /* UNIV_DEBUG */ - ut_a(buf_page_in_file(bpage)); - - if (buf_page_peek_if_too_old(bpage)) { - buf_page_make_young(bpage); - } -} - #ifdef UNIV_DEBUG /** Sets file_page_was_freed TRUE if the page is found in the buffer pool. @@ -3913,7 +3891,7 @@ got_block: mutex_exit(block_mutex); - buf_page_make_young_if_needed(bpage); + buf_page_make_young_if_needed(buf_pool, bpage); #if defined UNIV_DEBUG || defined UNIV_BUF_DEBUG ut_a(++buf_dbg_counter % 5771 || buf_validate()); @@ -4193,14 +4171,6 @@ buf_debug_execute_is_force_flush() /*==============================*/ { DBUG_EXECUTE_IF("ib_buf_force_flush", return(true); ); - - /* This is used during queisce testing, we want to ensure maximum - buffering by the change buffer. */ - - if (srv_ibuf_disable_background_merge) { - return(true); - } - return(false); } #endif /* UNIV_DEBUG || UNIV_IBUF_DEBUG */ @@ -4247,16 +4217,20 @@ buf_wait_for_read( } /** This is the general function used to get access to a database page. 
-@param[in] page_id page id -@param[in] zip_size ROW_FORMAT=COMPRESSED page size, or 0 -@param[in] rw_latch RW_S_LATCH, RW_X_LATCH, RW_NO_LATCH -@param[in] guess guessed block or NULL -@param[in] mode BUF_GET, BUF_GET_IF_IN_POOL, +@param[in] page_id page id +@param[in] zip_size ROW_FORMAT=COMPRESSED page size, or 0 +@param[in] rw_latch RW_S_LATCH, RW_X_LATCH, RW_NO_LATCH +@param[in] guess guessed block or NULL +@param[in] mode BUF_GET, BUF_GET_IF_IN_POOL, BUF_PEEK_IF_IN_POOL, BUF_GET_NO_LATCH, or BUF_GET_IF_IN_POOL_OR_WATCH -@param[in] file file name -@param[in] line line where called -@param[in] mtr mini-transaction -@param[out] err DB_SUCCESS or error code +@param[in] file file name +@param[in] line line where called +@param[in] mtr mini-transaction +@param[out] err DB_SUCCESS or error code +@param[in] allow_ibuf_merge whether to allow change buffer merge: +if set, and the page is an index leaf page for which buffered +changes exist, the changes are merged before the page is +returned to the caller @return pointer to the block or NULL */ buf_block_t* buf_page_get_gen( @@ -4268,7 +4242,8 @@ buf_page_get_gen( const char* file, unsigned line, mtr_t* mtr, - dberr_t* err) + dberr_t* err, + bool allow_ibuf_merge) { buf_block_t* block; unsigned access_time; @@ -4283,6 +4258,10 @@ buf_page_get_gen( || (rw_latch == RW_X_LATCH) || (rw_latch == RW_SX_LATCH) || (rw_latch == RW_NO_LATCH)); + ut_ad(!allow_ibuf_merge + || mode == BUF_GET + || mode == BUF_GET_IF_IN_POOL + || mode == BUF_GET_IF_IN_POOL_OR_WATCH); if (err) { *err = DB_SUCCESS; @@ -4499,11 +4478,11 @@ loop: if (fsp_is_system_temporary(page_id.space())) { /* For temporary tablespace, the mutex is being used - for synchronization between user thread and flush - thread, instead of block->lock. See buf_flush_page() - for the flush thread counterpart. */ + for synchronization between user thread and flush thread, + instead of block->lock. See buf_flush_page() for the flush + thread counterpart.
*/ BPageMutex* fix_mutex = buf_page_get_mutex( - &fix_block->page); + &fix_block->page); mutex_enter(fix_mutex); fix_block->fix(); mutex_exit(fix_mutex); @@ -4539,13 +4518,11 @@ got_block: } } - switch (buf_block_get_state(fix_block)) { - buf_page_t* bpage; - + switch (UNIV_EXPECT(buf_block_get_state(fix_block), + BUF_BLOCK_FILE_PAGE)) { case BUF_BLOCK_FILE_PAGE: - bpage = &block->page; if (fsp_is_system_temporary(page_id.space()) - && buf_page_get_io_fix(bpage) != BUF_IO_NONE) { + && buf_block_get_io_fix(block) != BUF_IO_NONE) { /* This suggests that the page is being flushed. Avoid returning reference to this page. Instead wait for the flush action to complete. */ @@ -4568,9 +4545,16 @@ evict_from_pool: return(NULL); } break; + default: + ut_error; + break; case BUF_BLOCK_ZIP_PAGE: case BUF_BLOCK_ZIP_DIRTY: + if (UNIV_UNLIKELY(mode == BUF_EVICT_IF_IN_POOL)) { + goto evict_from_pool; + } + if (mode == BUF_PEEK_IF_IN_POOL) { /* This mode is only used for dropping an adaptive hash index. There cannot be an @@ -4581,7 +4565,7 @@ evict_from_pool: return(NULL); } - bpage = &block->page; + buf_page_t* bpage = &block->page; /* Note: We have already buffer fixed this block. */ if (bpage->buf_fix_count > 1 @@ -4599,10 +4583,6 @@ evict_from_pool: goto loop; } - if (UNIV_UNLIKELY(mode == BUF_EVICT_IF_IN_POOL)) { - goto evict_from_pool; - } - /* Buffer-fix the block so that it cannot be evicted or relocated while we are attempting to allocate an uncompressed page. */ @@ -4696,35 +4676,31 @@ evict_from_pool: buf_page_mutex_exit(block); + if (!access_time && !recv_no_ibuf_operations + && ibuf_page_exists(block->page)) { + block->page.ibuf_exist = true; + } + buf_page_free_descriptor(bpage); /* Decompress the page while not holding buf_pool->mutex or block->mutex. 
*/ - { - bool success = buf_zip_decompress(block, TRUE); + if (!buf_zip_decompress(block, TRUE)) { + buf_pool_mutex_enter(buf_pool); + buf_page_mutex_enter(fix_block); + buf_block_set_io_fix(fix_block, BUF_IO_NONE); + buf_page_mutex_exit(fix_block); - if (!success) { - buf_pool_mutex_enter(buf_pool); - buf_page_mutex_enter(fix_block); - buf_block_set_io_fix(fix_block, BUF_IO_NONE); - buf_page_mutex_exit(fix_block); + --buf_pool->n_pend_unzip; + fix_block->unfix(); + buf_pool_mutex_exit(buf_pool); + rw_lock_x_unlock(&fix_block->lock); - --buf_pool->n_pend_unzip; - fix_block->unfix(); - buf_pool_mutex_exit(buf_pool); - rw_lock_x_unlock(&fix_block->lock); - - if (err) { - *err = DB_PAGE_CORRUPTED; - } - return NULL; + if (err) { + *err = DB_PAGE_CORRUPTED; } - } - - if (!access_time && !recv_no_ibuf_operations) { - ibuf_merge_or_delete_for_page( - block, block->page.id, zip_size, true); + return NULL; } buf_pool_mutex_enter(buf_pool); @@ -4742,14 +4718,6 @@ evict_from_pool: rw_lock_x_unlock(&block->lock); break; - - case BUF_BLOCK_POOL_WATCH: - case BUF_BLOCK_NOT_USED: - case BUF_BLOCK_READY_FOR_USE: - case BUF_BLOCK_MEMORY: - case BUF_BLOCK_REMOVE_HASH: - ut_error; - break; } ut_ad(block == fix_block); @@ -4876,7 +4844,7 @@ evict_from_pool: } if (mode != BUF_PEEK_IF_IN_POOL) { - buf_page_make_young_if_needed(&fix_block->page); + buf_page_make_young_if_needed(buf_pool, &fix_block->page); } #if defined UNIV_DEBUG || defined UNIV_BUF_DEBUG @@ -4905,35 +4873,49 @@ evict_from_pool: return NULL; } - mtr_memo_type_t fix_type; - - switch (rw_latch) { - case RW_NO_LATCH: - - fix_type = MTR_MEMO_BUF_FIX; - break; - - case RW_S_LATCH: - rw_lock_s_lock_inline(&fix_block->lock, 0, file, line); - - fix_type = MTR_MEMO_PAGE_S_FIX; - break; - - case RW_SX_LATCH: - rw_lock_sx_lock_inline(&fix_block->lock, 0, file, line); - - fix_type = MTR_MEMO_PAGE_SX_FIX; - break; - - default: - ut_ad(rw_latch == RW_X_LATCH); + if (allow_ibuf_merge + && mach_read_from_2(fix_block->frame + 
FIL_PAGE_TYPE) + == FIL_PAGE_INDEX + && page_is_leaf(fix_block->frame)) { rw_lock_x_lock_inline(&fix_block->lock, 0, file, line); - fix_type = MTR_MEMO_PAGE_X_FIX; - break; - } + if (fix_block->page.ibuf_exist) { + fix_block->page.ibuf_exist = false; + ibuf_merge_or_delete_for_page(fix_block, page_id, + zip_size, true); + } - mtr_memo_push(mtr, fix_block, fix_type); + if (rw_latch == RW_X_LATCH) { + mtr->memo_push(fix_block, MTR_MEMO_PAGE_X_FIX); + } else { + rw_lock_x_unlock(&fix_block->lock); + goto get_latch; + } + } else { +get_latch: + mtr_memo_type_t fix_type; + + switch (rw_latch) { + case RW_NO_LATCH: + fix_type = MTR_MEMO_BUF_FIX; + break; + case RW_S_LATCH: + rw_lock_s_lock_inline(&fix_block->lock, 0, file, line); + fix_type = MTR_MEMO_PAGE_S_FIX; + break; + case RW_SX_LATCH: + rw_lock_sx_lock_inline(&fix_block->lock, 0, file, line); + fix_type = MTR_MEMO_PAGE_SX_FIX; + break; + default: + ut_ad(rw_latch == RW_X_LATCH); + rw_lock_x_lock_inline(&fix_block->lock, 0, file, line); + fix_type = MTR_MEMO_PAGE_X_FIX; + break; + } + + mtr->memo_push(block, fix_type); + } if (mode != BUF_PEEK_IF_IN_POOL && !access_time) { /* In the case of a first access, try to apply linear @@ -4962,7 +4944,6 @@ buf_page_optimistic_get( unsigned line, /*!< in: line where called */ mtr_t* mtr) /*!< in: mini-transaction */ { - buf_pool_t* buf_pool; unsigned access_time; ibool success; @@ -4988,7 +4969,8 @@ buf_page_optimistic_get( buf_page_mutex_exit(block); - buf_page_make_young_if_needed(&block->page); + buf_pool_t* buf_pool = buf_pool_from_block(block); + buf_page_make_young_if_needed(buf_pool, &block->page); ut_ad(!ibuf_inside(mtr) || ibuf_page(block->page.id, block->zip_size(), NULL)); @@ -5049,109 +5031,6 @@ buf_page_optimistic_get( ibuf_inside(mtr)); } - buf_pool = buf_pool_from_block(block); - buf_pool->stat.n_page_gets++; - - return(TRUE); -} - -/********************************************************************//** -This is used to get access to a known database page, 
when no waiting can be -done. For example, if a search in an adaptive hash index leads us to this -frame. -@return TRUE if success */ -ibool -buf_page_get_known_nowait( -/*======================*/ - ulint rw_latch,/*!< in: RW_S_LATCH, RW_X_LATCH */ - buf_block_t* block, /*!< in: the known page */ - ulint mode, /*!< in: BUF_MAKE_YOUNG or BUF_KEEP_OLD */ - const char* file, /*!< in: file name */ - unsigned line, /*!< in: line where called */ - mtr_t* mtr) /*!< in: mini-transaction */ -{ - buf_pool_t* buf_pool; - ibool success; - - ut_ad(mtr->is_active()); - ut_ad((rw_latch == RW_S_LATCH) || (rw_latch == RW_X_LATCH)); - - buf_page_mutex_enter(block); - - if (buf_block_get_state(block) == BUF_BLOCK_REMOVE_HASH) { - /* Another thread is just freeing the block from the LRU list - of the buffer pool: do not try to access this page; this - attempt to access the page can only come through the hash - index because when the buffer block state is ..._REMOVE_HASH, - we have already removed it from the page address hash table - of the buffer pool. 
*/ - - buf_page_mutex_exit(block); - - return(FALSE); - } - - ut_a(buf_block_get_state(block) == BUF_BLOCK_FILE_PAGE); - - buf_block_buf_fix_inc(block, file, line); - - buf_page_set_accessed(&block->page); - - buf_page_mutex_exit(block); - - buf_pool = buf_pool_from_block(block); - - if (mode == BUF_MAKE_YOUNG) { - buf_page_make_young_if_needed(&block->page); - } - - ut_ad(!ibuf_inside(mtr) || mode == BUF_KEEP_OLD); - - mtr_memo_type_t fix_type; - - switch (rw_latch) { - case RW_S_LATCH: - success = rw_lock_s_lock_nowait(&block->lock, file, line); - fix_type = MTR_MEMO_PAGE_S_FIX; - break; - case RW_X_LATCH: - success = rw_lock_x_lock_func_nowait_inline( - &block->lock, file, line); - - fix_type = MTR_MEMO_PAGE_X_FIX; - break; - default: - ut_error; /* RW_SX_LATCH is not implemented yet */ - } - - if (!success) { - buf_block_buf_fix_dec(block); - return(FALSE); - } - - mtr_memo_push(mtr, block, fix_type); - -#if defined UNIV_DEBUG || defined UNIV_BUF_DEBUG - ut_a(++buf_dbg_counter % 5771 || buf_validate()); - ut_a(block->page.buf_fix_count > 0); - ut_a(buf_block_get_state(block) == BUF_BLOCK_FILE_PAGE); -#endif /* UNIV_DEBUG || UNIV_BUF_DEBUG */ - -#ifdef UNIV_DEBUG - if (mode != BUF_KEEP_OLD) { - /* If mode == BUF_KEEP_OLD, we are executing an I/O - completion routine. Avoid a bogus assertion failure - when ibuf_merge_or_delete_for_page() is processing a - page that was just freed due to DROP INDEX, or - deleting a record from SYS_INDEXES. This check will be - skipped in recv_recover_page() as well. 
*/ - - buf_page_mutex_enter(block); - ut_a(!block->page.file_page_was_freed); - buf_page_mutex_exit(block); - } -#endif /* UNIV_DEBUG */ - buf_pool->stat.n_page_gets++; return(TRUE); @@ -5258,6 +5137,7 @@ buf_page_init_low( bpage->write_size = 0; bpage->real_size = 0; bpage->slot = NULL; + bpage->ibuf_exist = false; HASH_INVALIDATE(bpage, hash); @@ -6004,9 +5884,9 @@ static dberr_t buf_page_check_corrupt(buf_page_t* bpage, fil_space_t* space) } /** Complete a read or write request of a file page to or from the buffer pool. -@param[in,out] bpage page to complete -@param[in] dblwr whether the doublewrite buffer was used (on write) -@param[in] evict whether or not to evict the page from LRU list +@param[in,out] bpage page to complete +@param[in] dblwr whether the doublewrite buffer was used (on write) +@param[in] evict whether or not to evict the page from LRU list @return whether the operation succeeded @retval DB_SUCCESS always when writing, or if a read page was OK @retval DB_TABLESPACE_DELETED if the tablespace does not exist @@ -6201,10 +6081,9 @@ release_page: && (bpage->id.space() == 0 || !is_predefined_tablespace(bpage->id.space())) && fil_page_get_type(frame) == FIL_PAGE_INDEX - && page_is_leaf(frame)) { - ibuf_merge_or_delete_for_page( - reinterpret_cast(bpage), - bpage->id, bpage->zip_size(), true); + && page_is_leaf(frame) + && ibuf_page_exists(*bpage)) { + bpage->ibuf_exist = true; } space->release_for_io(); diff --git a/storage/innobase/buf/buf0rea.cc b/storage/innobase/buf/buf0rea.cc index e2e93e2d0bb..f147bb807ce 100644 --- a/storage/innobase/buf/buf0rea.cc +++ b/storage/innobase/buf/buf0rea.cc @@ -312,7 +312,7 @@ buf_read_ahead_random(const page_id_t page_id, ulint zip_size, bool ibuf) if (bpage != NULL && buf_page_is_accessed(bpage) - && buf_page_peek_if_young(bpage)) { + && buf_page_peek_if_young(buf_pool, bpage)) { recent_blocks++; @@ -754,89 +754,6 @@ buf_read_ahead_linear(const page_id_t page_id, ulint zip_size, bool ibuf) return(count); } 
-/********************************************************************//** -Issues read requests for pages which the ibuf module wants to read in, in -order to contract the insert buffer tree. Technically, this function is like -a read-ahead function. */ -void -buf_read_ibuf_merge_pages( -/*======================*/ - bool sync, /*!< in: true if the caller - wants this function to wait - for the highest address page - to get read in, before this - function returns */ - const ulint* space_ids, /*!< in: array of space ids */ - const ulint* page_nos, /*!< in: array of page numbers - to read, with the highest page - number the last in the - array */ - ulint n_stored) /*!< in: number of elements - in the arrays */ -{ -#ifdef UNIV_IBUF_DEBUG - ut_a(n_stored < srv_page_size); -#endif - - for (ulint i = 0; i < n_stored; i++) { - fil_space_t* s = fil_space_acquire_for_io(space_ids[i]); - if (!s) { -tablespace_deleted: - /* The tablespace was not found: remove all - entries for it */ - ibuf_delete_for_discarded_space(space_ids[i]); - while (i + 1 < n_stored - && space_ids[i + 1] == space_ids[i]) { - i++; - } - continue; - } - - const ulint zip_size = s->zip_size(); - s->release_for_io(); - - const page_id_t page_id(space_ids[i], page_nos[i]); - - buf_pool_t* buf_pool = buf_pool_get(page_id); - - while (buf_pool->n_pend_reads - > buf_pool->curr_size / BUF_READ_AHEAD_PEND_LIMIT) { - os_thread_sleep(500000); - } - - dberr_t err; - - buf_read_page_low(&err, - sync && (i + 1 == n_stored), - 0, - BUF_READ_ANY_PAGE, page_id, zip_size, - true, true /* ignore_missing_space */); - - switch(err) { - case DB_SUCCESS: - case DB_ERROR: - break; - case DB_TABLESPACE_DELETED: - goto tablespace_deleted; - case DB_PAGE_CORRUPTED: - case DB_DECRYPTION_FAILED: - ib::error() << "Failed to read or decrypt " << page_id - << " for change buffer merge"; - break; - default: - ut_error; - } - } - - os_aio_simulated_wake_handler_threads(); - - if (n_stored) { - DBUG_PRINT("ib_buf", - ("ibuf merge 
read-ahead %u pages, space %u", - unsigned(n_stored), unsigned(space_ids[0]))); - } -} - /** Issues read requests for pages which recovery wants to read in. @param[in] sync true if the caller wants this function to wait for the highest address page to get read in, before this function returns diff --git a/storage/innobase/dict/dict0boot.cc b/storage/innobase/dict/dict0boot.cc index c8538543b3e..95bd9a48edd 100644 --- a/storage/innobase/dict/dict0boot.cc +++ b/storage/innobase/dict/dict0boot.cc @@ -1,7 +1,7 @@ /***************************************************************************** Copyright (c) 1996, 2017, Oracle and/or its affiliates. All Rights Reserved. -Copyright (c) 2016, 2018, MariaDB Corporation. +Copyright (c) 2016, 2019, MariaDB Corporation. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software @@ -450,36 +450,15 @@ dict_boot(void) /* Initialize the insert buffer table and index for each tablespace */ - dberr_t err = DB_SUCCESS; - - err = ibuf_init_at_db_start(); + dberr_t err = ibuf_init_at_db_start(); if (err == DB_SUCCESS) { - if (srv_read_only_mode - && srv_force_recovery != SRV_FORCE_NO_LOG_REDO - && !ibuf_is_empty()) { + /* Load definitions of other indexes on system tables */ - if (srv_force_recovery < SRV_FORCE_NO_IBUF_MERGE) { - ib::error() << "Change buffer must be empty when" - " --innodb-read-only is set!" - "You can try to recover the database with innodb_force_recovery=5"; - - err = DB_ERROR; - } else { - ib::warn() << "Change buffer not empty when --innodb-read-only " - "is set! 
but srv_force_recovery = " << srv_force_recovery - << " , ignoring."; - } - } - - if (err == DB_SUCCESS) { - /* Load definitions of other indexes on system tables */ - - dict_load_sys_table(dict_sys.sys_tables); - dict_load_sys_table(dict_sys.sys_columns); - dict_load_sys_table(dict_sys.sys_indexes); - dict_load_sys_table(dict_sys.sys_fields); - } + dict_load_sys_table(dict_sys.sys_tables); + dict_load_sys_table(dict_sys.sys_columns); + dict_load_sys_table(dict_sys.sys_indexes); + dict_load_sys_table(dict_sys.sys_fields); } mutex_exit(&dict_sys.mutex); diff --git a/storage/innobase/dict/dict0stats.cc b/storage/innobase/dict/dict0stats.cc index bd4bb261320..ad2698a1be9 100644 --- a/storage/innobase/dict/dict0stats.cc +++ b/storage/innobase/dict/dict0stats.cc @@ -1493,6 +1493,7 @@ dict_stats_analyze_index_below_cur( rec_offs_set_n_alloc(offsets2, size); rec = btr_cur_get_rec(cur); + page = page_align(rec); ut_ad(!page_rec_is_leaf(rec)); offsets_rec = rec_get_offsets(rec, index, offsets1, false, @@ -1514,9 +1515,11 @@ dict_stats_analyze_index_below_cur( dberr_t err = DB_SUCCESS; - block = buf_page_get_gen(page_id, zip_size, RW_S_LATCH, - NULL /* no guessed block */, - BUF_GET, __FILE__, __LINE__, &mtr, &err); + block = buf_page_get_gen(page_id, zip_size, + RW_S_LATCH, NULL, BUF_GET, + __FILE__, __LINE__, &mtr, &err, + !index->is_clust() + && 1 == btr_page_get_level(page)); page = buf_block_get_frame(block); @@ -3143,7 +3146,7 @@ dict_stats_update( if (!table->is_readable()) { return (dict_stats_report_error(table)); - } else if (srv_force_recovery >= SRV_FORCE_NO_IBUF_MERGE) { + } else if (srv_force_recovery > SRV_FORCE_NO_IBUF_MERGE) { /* If we have set a high innodb_force_recovery level, do not calculate statistics, as a badly corrupted index can cause a crash in it. 
*/ diff --git a/storage/innobase/fil/fil0fil.cc b/storage/innobase/fil/fil0fil.cc index d49d07defc3..4c512335112 100644 --- a/storage/innobase/fil/fil0fil.cc +++ b/storage/innobase/fil/fil0fil.cc @@ -4783,15 +4783,7 @@ fil_space_validate_for_mtr_commit( /* We are serving mtr_commit(). While there is an active mini-transaction, we should have !space->stop_new_ops. This is guaranteed by meta-data locks or transactional locks, or - dict_sys.latch (X-lock in DROP, S-lock in purge). - - However, a file I/O thread can invoke change buffer merge - while fil_check_pending_operations() is waiting for operations - to quiesce. This is not a problem, because - ibuf_merge_or_delete_for_page() would call - fil_space_acquire() before mtr_start() and - fil_space_t::release() after mtr_commit(). This is why - n_pending_ops should not be zero if stop_new_ops is set. */ + dict_sys.latch (X-lock in DROP, S-lock in purge). */ ut_ad(!space->stop_new_ops || space->is_being_truncated /* fil_truncate_prepare() */ || space->referenced()); diff --git a/storage/innobase/gis/gis0rtree.cc b/storage/innobase/gis/gis0rtree.cc index 320a23017fa..cf78a61e9c5 100644 --- a/storage/innobase/gis/gis0rtree.cc +++ b/storage/innobase/gis/gis0rtree.cc @@ -756,7 +756,7 @@ rtr_adjust_upper_level( /* Update page links of the level */ if (prev_page_no != FIL_NULL) { buf_block_t* prev_block = btr_block_get( - *index, prev_page_no, RW_X_LATCH, mtr); + *index, prev_page_no, RW_X_LATCH, false, mtr); #ifdef UNIV_BTR_DEBUG ut_a(page_is_comp(prev_block->frame) == page_is_comp(page)); ut_a(btr_page_get_next(prev_block->frame, mtr) @@ -770,7 +770,7 @@ rtr_adjust_upper_level( if (next_page_no != FIL_NULL) { buf_block_t* next_block = btr_block_get( - *index, next_page_no, RW_X_LATCH, mtr); + *index, next_page_no, RW_X_LATCH, false, mtr); #ifdef UNIV_BTR_DEBUG ut_a(page_is_comp(next_block->frame) == page_is_comp(page)); ut_a(btr_page_get_prev(next_block->frame, mtr) diff --git a/storage/innobase/handler/ha_innodb.cc 
b/storage/innobase/handler/ha_innodb.cc index 0ba2e471bca..eb1fcb577cf 100644 --- a/storage/innobase/handler/ha_innodb.cc +++ b/storage/innobase/handler/ha_innodb.cc @@ -5901,7 +5901,7 @@ initialize_auto_increment(dict_table_t* table, const Field* field) table->persistent_autoinc without autoinc_mutex protection, and there might be multiple ha_innobase::open() executing concurrently. */ - } else if (srv_force_recovery >= SRV_FORCE_NO_IBUF_MERGE) { + } else if (srv_force_recovery > SRV_FORCE_NO_IBUF_MERGE) { /* If the recovery level is set so high that writes are disabled we force the AUTOINC counter to 0 value effectively disabling writes to the table. @@ -14037,7 +14037,7 @@ ha_innobase::info_low( } } - if (srv_force_recovery >= SRV_FORCE_NO_IBUF_MERGE) { + if (srv_force_recovery > SRV_FORCE_NO_IBUF_MERGE) { goto func_exit; @@ -16683,6 +16683,9 @@ innobase_commit_by_xid( { DBUG_ASSERT(hton == innodb_hton_ptr); + DBUG_EXECUTE_IF("innobase_xa_fail", + return XAER_RMFAIL;); + if (high_level_read_only) { return(XAER_RMFAIL); } @@ -16715,6 +16718,9 @@ innobase_rollback_by_xid( { DBUG_ASSERT(hton == innodb_hton_ptr); + DBUG_EXECUTE_IF("innobase_xa_fail", + return XAER_RMFAIL;); + if (high_level_read_only) { return(XAER_RMFAIL); } @@ -19039,7 +19045,7 @@ static MYSQL_SYSVAR_ULONG(write_io_threads, srv_n_write_io_threads, static MYSQL_SYSVAR_ULONG(force_recovery, srv_force_recovery, PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY, - "Helps to save your data in case the disk image of the database becomes corrupt.", + "Helps to save your data in case the disk image of the database becomes corrupt. 
Value 5 can return bogus data, and 6 can permanently corrupt data.", NULL, NULL, 0, 0, 6, 0); static MYSQL_SYSVAR_ULONG(page_size, srv_page_size, @@ -19227,12 +19233,6 @@ static MYSQL_SYSVAR_UINT(change_buffering_debug, ibuf_debug, PLUGIN_VAR_RQCMDARG, "Debug flags for InnoDB change buffering (0=none, 1=try to buffer)", NULL, NULL, 0, 0, 1, 0); - -static MYSQL_SYSVAR_BOOL(disable_background_merge, - srv_ibuf_disable_background_merge, - PLUGIN_VAR_NOCMDARG | PLUGIN_VAR_RQCMDARG, - "Disable change buffering merges by the master thread", - NULL, NULL, FALSE); #endif /* UNIV_DEBUG || UNIV_IBUF_DEBUG */ static MYSQL_SYSVAR_ULONG(buf_dump_status_frequency, srv_buf_dump_status_frequency, @@ -19694,7 +19694,6 @@ static struct st_mysql_sys_var* innobase_system_variables[]= { MYSQL_SYSVAR(change_buffer_max_size), #if defined UNIV_DEBUG || defined UNIV_IBUF_DEBUG MYSQL_SYSVAR(change_buffering_debug), - MYSQL_SYSVAR(disable_background_merge), #endif /* UNIV_DEBUG || UNIV_IBUF_DEBUG */ #ifdef WITH_INNODB_DISALLOW_WRITES MYSQL_SYSVAR(disallow_writes), diff --git a/storage/innobase/ibuf/ibuf0ibuf.cc b/storage/innobase/ibuf/ibuf0ibuf.cc index db808df3867..f10a27d6683 100644 --- a/storage/innobase/ibuf/ibuf0ibuf.cc +++ b/storage/innobase/ibuf/ibuf0ibuf.cc @@ -28,10 +28,6 @@ Created 7/19/1997 Heikki Tuuri #include "sync0sync.h" #include "btr0sea.h" -#if defined UNIV_DEBUG || defined UNIV_IBUF_DEBUG -my_bool srv_ibuf_disable_background_merge; -#endif /* UNIV_DEBUG || UNIV_IBUF_DEBUG */ - /** Number of bits describing a single page */ #define IBUF_BITS_PER_PAGE 4 /** The start address for an insert buffer bitmap page bitmap */ @@ -257,16 +253,6 @@ const ulint IBUF_MERGE_THRESHOLD = 4; batch, in order to merge the entries for them in the insert buffer */ const ulint IBUF_MAX_N_PAGES_MERGED = IBUF_MERGE_AREA; -/** If the combined size of the ibuf trees exceeds ibuf.max_size by this -many pages, we start to contract it in connection to inserts there, using -non-synchronous contract */ 
-const ulint IBUF_CONTRACT_ON_INSERT_NON_SYNC = 0; - -/** If the combined size of the ibuf trees exceeds ibuf.max_size by this -many pages, we start to contract it in connection to inserts there, using -synchronous contract */ -const ulint IBUF_CONTRACT_ON_INSERT_SYNC = 5; - /** If the combined size of the ibuf trees exceeds ibuf.max_size by this many pages, we start to contract it synchronous contract, but do not insert */ @@ -701,9 +687,9 @@ ibuf_bitmap_get_map_page_func( buf_block_t* block = NULL; dberr_t err = DB_SUCCESS; - block = buf_page_get_gen(ibuf_bitmap_page_no_calc(page_id, zip_size), - zip_size, RW_X_LATCH, NULL, BUF_GET, - file, line, mtr, &err); + block = buf_page_get_gen( + ibuf_bitmap_page_no_calc(page_id, zip_size), + zip_size, RW_X_LATCH, NULL, BUF_GET, file, line, mtr, &err); if (err != DB_SUCCESS) { return NULL; @@ -2083,10 +2069,6 @@ void ibuf_free_excess_pages(void) /*========================*/ { - if (srv_force_recovery >= SRV_FORCE_NO_IBUF_MERGE) { - return; - } - /* Free at most a few pages at a time, so that we do not delay the requested service too much */ @@ -2369,6 +2351,40 @@ ibuf_get_merge_pages( return(volume); } +/** Merge the change buffer to some pages. 
*/ +static void ibuf_read_merge_pages(const ulint* space_ids, + const ulint* page_nos, ulint n_stored) +{ + for (ulint i = 0; i < n_stored; i++) { + const ulint space_id = space_ids[i]; + fil_space_t* s = fil_space_acquire_for_io(space_id); + if (!s) { +tablespace_deleted: + /* The tablespace was not found: remove all + entries for it */ + ibuf_delete_for_discarded_space(space_id); + while (i + 1 < n_stored + && space_ids[i + 1] == space_id) { + i++; + } + continue; + } + + const ulint zip_size = s->zip_size(); + s->release_for_io(); + mtr_t mtr; + mtr.start(); + dberr_t err; + buf_page_get_gen(page_id_t(space_id, page_nos[i]), + zip_size, RW_X_LATCH, NULL, BUF_GET, + __FILE__, __LINE__, &mtr, &err, true); + mtr.commit(); + if (err == DB_TABLESPACE_DELETED) { + goto tablespace_deleted; + } + } +} + /*********************************************************************//** Contracts insert buffer trees by reading pages to the buffer pool. @return a lower limit for the combined size in bytes of entries which @@ -2378,10 +2394,7 @@ static ulint ibuf_merge_pages( /*=============*/ - ulint* n_pages, /*!< out: number of pages to which merged */ - bool sync) /*!< in: true if the caller wants to wait for - the issued read with the highest tablespace - address to complete */ + ulint* n_pages) /*!< out: number of pages to which merged */ { mtr_t mtr; btr_pcur_t pcur; @@ -2424,15 +2437,10 @@ ibuf_merge_pages( btr_pcur_get_rec(&pcur), &mtr, space_ids, page_nos, n_pages); -#if 0 /* defined UNIV_IBUF_DEBUG */ - fprintf(stderr, "Ibuf contract sync %lu pages %lu volume %lu\n", - sync, *n_pages, sum_sizes); -#endif ibuf_mtr_commit(&mtr); btr_pcur_close(&pcur); - buf_read_ibuf_merge_pages( - sync, space_ids, page_nos, *n_pages); + ibuf_read_merge_pages(space_ids, page_nos, *n_pages); return(sum_sizes + 1); } @@ -2502,8 +2510,7 @@ ibuf_merge_space( } #endif /* UNIV_DEBUG */ - buf_read_ibuf_merge_pages( - true, spaces, pages, n_pages); + ibuf_read_merge_pages(spaces, pages, n_pages); 
} return(n_pages); @@ -2516,11 +2523,8 @@ the issued reads to complete @return a lower limit for the combined size in bytes of entries which will be merged from ibuf trees to the pages read, 0 if ibuf is empty */ -static MY_ATTRIBUTE((warn_unused_result)) -ulint -ibuf_merge( - ulint* n_pages, - bool sync) +MY_ATTRIBUTE((warn_unused_result)) +static ulint ibuf_merge(ulint* n_pages) { *n_pages = 0; @@ -2536,88 +2540,46 @@ ibuf_merge( return(0); #endif /* UNIV_DEBUG || UNIV_IBUF_DEBUG */ } else { - return(ibuf_merge_pages(n_pages, sync)); + return ibuf_merge_pages(n_pages); } } /** Contract the change buffer by reading pages to the buffer pool. -@param[in] sync whether the caller waits for -the issued reads to complete @return a lower limit for the combined size in bytes of entries which will be merged from ibuf trees to the pages read, 0 if ibuf is empty */ -static -ulint -ibuf_contract( - bool sync) +static ulint ibuf_contract() { - ulint n_pages; - - return(ibuf_merge_pages(&n_pages, sync)); + ulint n_pages; + return ibuf_merge_pages(&n_pages); } /** Contract the change buffer by reading pages to the buffer pool. -@param[in] full If true, do a full contraction based -on PCT_IO(100). If false, the size of contract batch is determined -based on the current size of the change buffer. 
@return a lower limit for the combined size in bytes of entries which will be merged from ibuf trees to the pages read, 0 if ibuf is empty */ -ulint -ibuf_merge_in_background( - bool full) +ulint ibuf_merge_all() { - ulint sum_bytes = 0; - ulint sum_pages = 0; - ulint n_pag2; - ulint n_pages; - -#if defined UNIV_DEBUG || defined UNIV_IBUF_DEBUG - if (srv_ibuf_disable_background_merge) { - return(0); - } -#endif /* UNIV_DEBUG || UNIV_IBUF_DEBUG */ - - if (full) { - /* Caller has requested a full batch */ - n_pages = PCT_IO(100); - } else { - /* By default we do a batch of 5% of the io_capacity */ - n_pages = PCT_IO(5); - - mutex_enter(&ibuf_mutex); - - /* If the ibuf.size is more than half the max_size - then we make more agreesive contraction. - +1 is to avoid division by zero. */ - if (ibuf.size > ibuf.max_size / 2) { - ulint diff = ibuf.size - ibuf.max_size / 2; - n_pages += PCT_IO((diff * 100) - / (ibuf.max_size + 1)); - } - - mutex_exit(&ibuf_mutex); - } - #if defined UNIV_DEBUG || defined UNIV_IBUF_DEBUG if (ibuf_debug) { return(0); } #endif /* UNIV_DEBUG || UNIV_IBUF_DEBUG */ - while (sum_pages < n_pages) { - ulint n_bytes; + ulint sum_bytes = 0; + ulint n_pages = PCT_IO(100); - n_bytes = ibuf_merge(&n_pag2, false); + for (ulint sum_pages = 0; sum_pages < n_pages; ) { + ulint n_pag2; + ulint n_bytes = ibuf_merge(&n_pag2); if (n_bytes == 0) { - return(sum_bytes); + break; } sum_bytes += n_bytes; - sum_pages += n_pag2; } - return(sum_bytes); + return sum_bytes; } /*********************************************************************//** @@ -2629,11 +2591,6 @@ ibuf_contract_after_insert( ulint entry_size) /*!< in: size of a record which was inserted into an ibuf tree */ { - ibool sync; - ulint sum_sizes; - ulint size; - ulint max_size; - /* Perform dirty reads of ibuf.size and ibuf.max_size, to reduce ibuf_mutex contention. 
ibuf.max_size remains constant after ibuf_init_at_db_start(), but ibuf.size should be @@ -2641,22 +2598,16 @@ ibuf_contract_after_insert( machine word, this should be OK; at worst we are doing some excessive ibuf_contract() or occasionally skipping a ibuf_contract(). */ - size = ibuf.size; - max_size = ibuf.max_size; - - if (size < max_size + IBUF_CONTRACT_ON_INSERT_NON_SYNC) { + if (ibuf.size < ibuf.max_size) { return; } - sync = (size >= max_size + IBUF_CONTRACT_ON_INSERT_SYNC); - /* Contract at least entry_size many bytes */ - sum_sizes = 0; - size = 1; + ulint sum_sizes = 0; + ulint size; do { - - size = ibuf_contract(sync); + size = ibuf_contract(); sum_sizes += size; } while (size > 0 && sum_sizes < entry_size); } @@ -3296,7 +3247,7 @@ ibuf_insert_low( #ifdef UNIV_IBUF_DEBUG fputs("Ibuf too big\n", stderr); #endif - ibuf_contract(true); + ibuf_contract(); return(DB_STRONG_FAIL); } @@ -3551,8 +3502,7 @@ func_exit: #ifdef UNIV_IBUF_DEBUG ut_a(n_stored <= IBUF_MAX_N_PAGES_MERGED); #endif - buf_read_ibuf_merge_pages(false, space_ids, - page_nos, n_stored); + ibuf_read_merge_pages(space_ids, page_nos, n_stored); } return(err); @@ -4251,6 +4201,42 @@ func_exit: return(TRUE); } +/** Check whether buffered changes exist for a page. 
+@param[in,out] block page +@return whether buffered changes exist */ +bool ibuf_page_exists(const buf_page_t& bpage) +{ + ut_ad(buf_page_get_io_fix(&bpage) == BUF_IO_READ + || recv_recovery_is_on()); + ut_ad(!fsp_is_system_temporary(bpage.id.space())); + ut_ad(buf_page_in_file(&bpage)); + ut_ad(buf_page_get_state(&bpage) != BUF_BLOCK_FILE_PAGE + || bpage.io_fix == BUF_IO_READ + || rw_lock_own(&const_cast<buf_block_t&> + (reinterpret_cast<const buf_block_t&> + (bpage)).lock, RW_LOCK_X)); + + const ulint physical_size = bpage.physical_size(); + + if (ibuf_fixed_addr_page(bpage.id, physical_size) + || fsp_descr_page(bpage.id, physical_size)) { + return false; + } + + mtr_t mtr; + bool bitmap_bits = false; + + ibuf_mtr_start(&mtr); + if (const page_t* bitmap_page = ibuf_bitmap_get_map_page( + bpage.id, bpage.zip_size(), &mtr)) { + bitmap_bits = ibuf_bitmap_page_get_bits( + bitmap_page, bpage.id, bpage.zip_size(), + IBUF_BITMAP_BUFFERED, &mtr) != 0; + } + ibuf_mtr_commit(&mtr); + return bitmap_bits; +} + /** When an index page is read from a disk to the buffer pool, this function applies any buffered operations to the page and deletes the entries from the insert buffer.
If the page is not read, but created in the buffer pool, this @@ -4286,11 +4272,9 @@ ibuf_merge_or_delete_for_page( ulint dops[IBUF_OP_COUNT]; ut_ad(block == NULL || page_id == block->page.id); - ut_ad(block == NULL || buf_block_get_io_fix(block) == BUF_IO_READ - || recv_recovery_is_on()); + ut_ad(!block || buf_block_get_state(block) == BUF_BLOCK_FILE_PAGE); - if (srv_force_recovery >= SRV_FORCE_NO_IBUF_MERGE - || trx_sys_hdr_page(page_id) + if (trx_sys_hdr_page(page_id) || fsp_is_system_temporary(page_id.space())) { return; } @@ -4391,16 +4375,12 @@ loop: &pcur, &mtr); if (block != NULL) { - ibool success; + ut_ad(rw_lock_own(&block->lock, RW_LOCK_X)); + buf_block_buf_fix_inc(block, __FILE__, __LINE__); + rw_lock_x_lock(&block->lock); mtr.set_named_space(space); - - success = buf_page_get_known_nowait( - RW_X_LATCH, block, - BUF_KEEP_OLD, __FILE__, __LINE__, &mtr); - - ut_a(success); - + mtr.memo_push(block, MTR_MEMO_PAGE_X_FIX); /* This is a user page (secondary index leaf page), but we pretend that it is a change buffer page in order to obey the latching order. 
This should be OK, @@ -4466,7 +4446,6 @@ loop: ut_ad(page_validate(block->frame, dummy_index)); switch (op) { - ibool success; case IBUF_OP_INSERT: #ifdef UNIV_IBUF_DEBUG volume += rec_get_converted_size( @@ -4515,11 +4494,11 @@ loop: ibuf_mtr_start(&mtr); mtr.set_named_space(space); - success = buf_page_get_known_nowait( - RW_X_LATCH, block, - BUF_KEEP_OLD, - __FILE__, __LINE__, &mtr); - ut_a(success); + ut_ad(rw_lock_own(&block->lock, RW_LOCK_X)); + buf_block_buf_fix_inc(block, + __FILE__, __LINE__); + rw_lock_x_lock(&block->lock); + mtr.memo_push(block, MTR_MEMO_PAGE_X_FIX); /* This is a user page (secondary index leaf page), but it should be OK diff --git a/storage/innobase/include/btr0btr.h b/storage/innobase/include/btr0btr.h index ae25ac76615..d00130dbd3d 100644 --- a/storage/innobase/include/btr0btr.h +++ b/storage/innobase/include/btr0btr.h @@ -221,12 +221,13 @@ btr_height_get( @param[in] index index tree @param[in] page page number @param[in] mode latch mode +@param[in] merge whether change buffer merge should be attempted @param[in] file file name @param[in] line line where called @param[in,out] mtr mini-transaction @return block */ inline buf_block_t* btr_block_get_func(const dict_index_t& index, ulint page, - ulint mode, + ulint mode, bool merge, const char* file, unsigned line, mtr_t* mtr) { @@ -235,7 +236,7 @@ inline buf_block_t* btr_block_get_func(const dict_index_t& index, ulint page, if (buf_block_t* block = buf_page_get_gen( page_id_t(index.table->space->id, page), index.table->space->zip_size(), mode, NULL, BUF_GET, - file, line, mtr, &err)) { + file, line, mtr, &err, merge && !index.is_clust())) { ut_ad(err == DB_SUCCESS); if (mode != RW_NO_LATCH) { buf_block_dbg_add_level(block, index.is_ibuf() @@ -260,10 +261,11 @@ inline buf_block_t* btr_block_get_func(const dict_index_t& index, ulint page, @param index index tree @param page page number @param mode latch mode +@param merge whether change buffer merge should be attempted @param mtr 
mini-transaction handle @return the block descriptor */ -# define btr_block_get(index, page, mode, mtr) \ - btr_block_get_func(index, page, mode, __FILE__, __LINE__, mtr) +# define btr_block_get(index, page, mode, merge, mtr) \ + btr_block_get_func(index, page, mode, merge, __FILE__, __LINE__, mtr) /**************************************************************//** Gets the index id field of a page. @return index id */ diff --git a/storage/innobase/include/buf0buf.h b/storage/innobase/include/buf0buf.h index 461d4f48f48..332b6d93aeb 100644 --- a/storage/innobase/include/buf0buf.h +++ b/storage/innobase/include/buf0buf.h @@ -67,16 +67,6 @@ struct fil_addr_t; if the file page has been freed. */ #define BUF_EVICT_IF_IN_POOL 20 /*!< evict a clean block if found */ /* @} */ -/** @name Modes for buf_page_get_known_nowait */ -/* @{ */ -#define BUF_MAKE_YOUNG 51 /*!< Move the block to the - start of the LRU list if there - is a danger that the block - would drift out of the buffer - pool*/ -#define BUF_KEEP_OLD 52 /*!< Preserve the current LRU - position of the block. */ -/* @} */ #define MAX_BUFFER_POOLS_BITS 6 /*!< Number of bits to representing a buffer pool ID */ @@ -132,7 +122,6 @@ enum buf_page_state { before putting to the free list */ }; - /** This structure defines information we will fetch from each buffer pool. It will be used to print table IO stats */ struct buf_pool_info_t{ @@ -357,7 +346,8 @@ NOTE! The following macros should be used instead of buf_page_get_gen, to improve debugging. Only values RW_S_LATCH and RW_X_LATCH are allowed in LA! */ #define buf_page_get(ID, SIZE, LA, MTR) \ - buf_page_get_gen(ID, SIZE, LA, NULL, BUF_GET, __FILE__, __LINE__, MTR, NULL) + buf_page_get_gen(ID, SIZE, LA, NULL, BUF_GET, __FILE__, __LINE__, MTR) + /**************************************************************//** Use these macros to bufferfix a page with no latching. Remember not to read the contents of the page unless you know it is safe. 
Do not modify @@ -366,7 +356,7 @@ error-prone programming not to set a latch, and it should be used with care. */ #define buf_page_get_with_no_latch(ID, SIZE, MTR) \ buf_page_get_gen(ID, SIZE, RW_NO_LATCH, NULL, BUF_GET_NO_LATCH, \ - __FILE__, __LINE__, MTR, NULL) + __FILE__, __LINE__, MTR) /********************************************************************//** This is the general function used to get optimistic access to a database page. @@ -380,19 +370,6 @@ buf_page_optimistic_get( const char* file, /*!< in: file name */ unsigned line, /*!< in: line where called */ mtr_t* mtr); /*!< in: mini-transaction */ -/********************************************************************//** -This is used to get access to a known database page, when no waiting can be -done. -@return TRUE if success */ -ibool -buf_page_get_known_nowait( -/*======================*/ - ulint rw_latch,/*!< in: RW_S_LATCH, RW_X_LATCH */ - buf_block_t* block, /*!< in: the known page */ - ulint mode, /*!< in: BUF_MAKE_YOUNG or BUF_KEEP_OLD */ - const char* file, /*!< in: file name */ - unsigned line, /*!< in: line where called */ - mtr_t* mtr); /*!< in: mini-transaction */ /** Given a tablespace id and page number tries to get that page. If the page is not in the buffer pool it is not loaded and NULL is returned. @@ -431,16 +408,18 @@ the same set of mutexes or latches. buf_page_t* buf_page_get_zip(const page_id_t page_id, ulint zip_size); /** This is the general function used to get access to a database page. 
-@param[in] page_id page id -@param[in] zip_size ROW_FORMAT=COMPRESSED page size, or 0 -@param[in] rw_latch RW_S_LATCH, RW_X_LATCH, RW_NO_LATCH -@param[in] guess guessed block or NULL -@param[in] mode BUF_GET, BUF_GET_IF_IN_POOL, +@param[in] page_id page id +@param[in] zip_size ROW_FORMAT=COMPRESSED page size, or 0 +@param[in] rw_latch RW_S_LATCH, RW_X_LATCH, RW_NO_LATCH +@param[in] guess guessed block or NULL +@param[in] mode BUF_GET, BUF_GET_IF_IN_POOL, BUF_PEEK_IF_IN_POOL, BUF_GET_NO_LATCH, or BUF_GET_IF_IN_POOL_OR_WATCH -@param[in] file file name of caller -@param[in] line line number of caller -@param[in,out] mtr mini-transaction -@param[out] err DB_SUCCESS or error code +@param[in] file file name +@param[in] line line where called +@param[in] mtr mini-transaction +@param[out] err DB_SUCCESS or error code +@param[in] allow_ibuf_merge Allow change buffer merge while +reading the pages from file. @return pointer to the block or NULL */ buf_block_t* buf_page_get_gen( @@ -452,7 +431,8 @@ buf_page_get_gen( const char* file, unsigned line, mtr_t* mtr, - dberr_t* err); + dberr_t* err = NULL, + bool allow_ibuf_merge = false); /** Initialize a page in the buffer pool. The page is usually not read from a file even if it cannot be found in the buffer buf_pool. This is one @@ -538,28 +518,36 @@ buf_block_get_freed_page_clock( const buf_block_t* block) /*!< in: block */ MY_ATTRIBUTE((warn_unused_result)); -/********************************************************************//** -Tells if a block is still close enough to the MRU end of the LRU list +/** Determine if a block is still close enough to the MRU end of the LRU list meaning that it is not in danger of getting evicted and also implying that it has been accessed recently. Note that this is for heuristics only and does not reserve buffer pool mutex. 
-@return TRUE if block is close to MRU end of LRU */ -UNIV_INLINE -ibool -buf_page_peek_if_young( -/*===================*/ - const buf_page_t* bpage); /*!< in: block */ -/********************************************************************//** -Recommends a move of a block to the start of the LRU list if there is danger -of dropping from the buffer pool. NOTE: does not reserve the buffer pool -mutex. -@return TRUE if should be made younger */ -UNIV_INLINE -ibool -buf_page_peek_if_too_old( -/*=====================*/ - const buf_page_t* bpage); /*!< in: block to make younger */ +@param[in] buf_pool buffer pool +@param[in] bpage buffer pool page +@return whether bpage is close to MRU end of LRU */ +inline bool buf_page_peek_if_young(const buf_pool_t* buf_pool, + const buf_page_t* bpage); + +/** Determine if a block should be moved to the start of the LRU list if +there is danger of dropping from the buffer pool. +@param[in,out] buf_pool buffer pool +@param[in] bpage buffer pool page +@return true if bpage should be made younger */ +inline bool buf_page_peek_if_too_old(buf_pool_t* buf_pool, + const buf_page_t* bpage); + +/** Move a page to the start of the buffer pool LRU list if it is too old. +@param[in,out] buf_pool buffer pool +@param[in,out] bpage buffer pool page */ +inline void buf_page_make_young_if_needed(buf_pool_t* buf_pool, + buf_page_t* bpage) +{ + if (UNIV_UNLIKELY(buf_page_peek_if_too_old(buf_pool, bpage))) { + buf_page_make_young(bpage); + } +} + /********************************************************************//** Gets the youngest modification log sequence number for a frame. Returns zero if not file page or no modification occurred yet. 
@@ -1175,7 +1163,10 @@ buf_page_init_for_read( not match */ UNIV_INTERN dberr_t -buf_page_io_complete(buf_page_t* bpage, bool dblwr = false, bool evict = false) +buf_page_io_complete( + buf_page_t* bpage, + bool dblwr = false, + bool evict = false) MY_ATTRIBUTE((nonnull)); /********************************************************************//** @@ -1619,6 +1610,9 @@ public: protected by buf_pool->zip_mutex or buf_block_t::mutex. */ # endif /* UNIV_DEBUG */ + /** Change buffer entries for the page exist. + Protected by io_fix==BUF_IO_READ or by buf_block_t::lock. */ + bool ibuf_exist; void fix() { buf_fix_count++; } uint32_t unfix() diff --git a/storage/innobase/include/buf0buf.ic b/storage/innobase/include/buf0buf.ic index 970119edd6e..699b3ee8407 100644 --- a/storage/innobase/include/buf0buf.ic +++ b/storage/innobase/include/buf0buf.ic @@ -141,21 +141,17 @@ buf_block_get_freed_page_clock( return(buf_page_get_freed_page_clock(&block->page)); } -/********************************************************************//** -Tells if a block is still close enough to the MRU end of the LRU list +/** Determine if a block is still close enough to the MRU end of the LRU list meaning that it is not in danger of getting evicted and also implying that it has been accessed recently. Note that this is for heuristics only and does not reserve buffer pool mutex. 
-@return TRUE if block is close to MRU end of LRU */ -UNIV_INLINE -ibool -buf_page_peek_if_young( -/*===================*/ - const buf_page_t* bpage) /*!< in: block */ +@param[in] buf_pool buffer pool +@param[in] bpage buffer pool page +@return whether bpage is close to MRU end of LRU */ +inline bool buf_page_peek_if_young(const buf_pool_t* buf_pool, + const buf_page_t* bpage) { - buf_pool_t* buf_pool = buf_pool_from_bpage(bpage); - /* FIXME: bpage->freed_page_clock is 31 bits */ return((buf_pool->freed_page_clock & ((1UL << 31) - 1)) < (bpage->freed_page_clock @@ -164,18 +160,16 @@ buf_page_peek_if_young( / (BUF_LRU_OLD_RATIO_DIV * 4)))); } -/********************************************************************//** -Recommends a move of a block to the start of the LRU list if there is danger -of dropping from the buffer pool. NOTE: does not reserve the buffer pool -mutex. -@return TRUE if should be made younger */ -UNIV_INLINE -ibool -buf_page_peek_if_too_old( -/*=====================*/ - const buf_page_t* bpage) /*!< in: block to make younger */ +/** Determine if a block should be moved to the start of the LRU list if +there is danger of dropping from the buffer pool. 
+@param[in,out] buf_pool buffer pool +@param[in] bpage buffer pool page +@return true if bpage should be made younger */ +inline bool buf_page_peek_if_too_old(buf_pool_t* buf_pool, + const buf_page_t* bpage) { - buf_pool_t* buf_pool = buf_pool_from_bpage(bpage); + ut_ad(!buf_pool_mutex_own(buf_pool)); + ut_ad(buf_page_in_file(bpage)); if (buf_pool->freed_page_clock == 0) { /* If eviction has not started yet, do not update the @@ -198,9 +192,9 @@ buf_page_peek_if_too_old( } buf_pool->stat.n_pages_not_made_young++; - return(FALSE); + return false; } else { - return(!buf_page_peek_if_young(bpage)); + return !buf_page_peek_if_young(buf_pool, bpage); } } diff --git a/storage/innobase/include/buf0rea.h b/storage/innobase/include/buf0rea.h index ff0ba474bb3..7653bdbe55f 100644 --- a/storage/innobase/include/buf0rea.h +++ b/storage/innobase/include/buf0rea.h @@ -100,26 +100,6 @@ which could result in a deadlock if the OS does not support asynchronous io. ulint buf_read_ahead_linear(const page_id_t page_id, ulint zip_size, bool ibuf); -/********************************************************************//** -Issues read requests for pages which the ibuf module wants to read in, in -order to contract the insert buffer tree. Technically, this function is like -a read-ahead function. */ -void -buf_read_ibuf_merge_pages( -/*======================*/ - bool sync, /*!< in: true if the caller - wants this function to wait - for the highest address page - to get read in, before this - function returns */ - const ulint* space_ids, /*!< in: array of space ids */ - const ulint* page_nos, /*!< in: array of page numbers - to read, with the highest page - number the last in the - array */ - ulint n_stored); /*!< in: number of elements - in the arrays */ - /** Issues read requests for pages which recovery wants to read in. 
@param[in] sync true if the caller wants this function to wait for the highest address page to get read in, before this function returns diff --git a/storage/innobase/include/ibuf0ibuf.h b/storage/innobase/include/ibuf0ibuf.h index ca7fa892126..356e120a7bc 100644 --- a/storage/innobase/include/ibuf0ibuf.h +++ b/storage/innobase/include/ibuf0ibuf.h @@ -317,6 +317,11 @@ ibuf_insert( ulint zip_size, que_thr_t* thr); +/** Check whether buffered changes exist for a page. +@param[in,out] bpage buffer page +@return whether buffered changes exist */ +bool ibuf_page_exists(const buf_page_t& bpage); + /** When an index page is read from a disk to the buffer pool, this function applies any buffered operations to the page and deletes the entries from the insert buffer. If the page is not read, but created in the buffer pool, this @@ -343,15 +348,10 @@ in DISCARD TABLESPACE, IMPORT TABLESPACE, or crash recovery. void ibuf_delete_for_discarded_space(ulint space); /** Contract the change buffer by reading pages to the buffer pool. -@param[in] full If true, do a full contraction based -on PCT_IO(100). If false, the size of contract batch is determined -based on the current size of the change buffer. @return a lower limit for the combined size in bytes of entries which will be merged from ibuf trees to the pages read, 0 if ibuf is empty */ -ulint -ibuf_merge_in_background( - bool full); +ulint ibuf_merge_all(); /** Contracts insert buffer trees by reading pages referring to space_id to the buffer pool. 
diff --git a/storage/innobase/include/ibuf0ibuf.ic b/storage/innobase/include/ibuf0ibuf.ic
index e2170f79579..ba772359630 100644
--- a/storage/innobase/include/ibuf0ibuf.ic
+++ b/storage/innobase/include/ibuf0ibuf.ic
@@ -44,6 +44,11 @@ ibuf_mtr_start(
 {
 	mtr_start(mtr);
 	mtr->enter_ibuf();
+
+	if (high_level_read_only || srv_read_only_mode) {
+		mtr_set_log_mode(mtr, MTR_LOG_NONE);
+	}
+
 }
 /***************************************************************//**
 Commits an insert buffer mini-transaction. */
@@ -130,8 +135,7 @@ ibuf_should_try(
 	       && !dict_index_is_clust(index)
 	       && !dict_index_is_spatial(index)
 	       && index->table->quiesce == QUIESCE_NONE
-	       && (ignore_sec_unique || !dict_index_is_unique(index))
-	       && srv_force_recovery < SRV_FORCE_NO_IBUF_MERGE);
+	       && (ignore_sec_unique || !dict_index_is_unique(index)));
 }
 
 /******************************************************************//**
diff --git a/storage/innobase/include/srv0mon.h b/storage/innobase/include/srv0mon.h
index 5a4d424981e..60a7211c166 100644
--- a/storage/innobase/include/srv0mon.h
+++ b/storage/innobase/include/srv0mon.h
@@ -397,7 +397,6 @@ enum monitor_id_t {
 	MONITOR_MASTER_ACTIVE_LOOPS,
 	MONITOR_MASTER_IDLE_LOOPS,
 	MONITOR_SRV_BACKGROUND_DROP_TABLE_MICROSECOND,
-	MONITOR_SRV_IBUF_MERGE_MICROSECOND,
 	MONITOR_SRV_LOG_FLUSH_MICROSECOND,
 	MONITOR_SRV_MEM_VALIDATE_MICROSECOND,
 	MONITOR_SRV_PURGE_MICROSECOND,
diff --git a/storage/innobase/include/srv0srv.h b/storage/innobase/include/srv0srv.h
index e97b964cb03..87094785dde 100644
--- a/storage/innobase/include/srv0srv.h
+++ b/storage/innobase/include/srv0srv.h
@@ -257,7 +257,7 @@ recovery and open all tables in RO mode instead of RW mode. We don't
 sync the max trx id to disk either. */
 extern my_bool srv_read_only_mode;
 /** Set if InnoDB operates in read-only mode or innodb-force-recovery
-is greater than SRV_FORCE_NO_TRX_UNDO. */
+is greater than SRV_FORCE_NO_IBUF_MERGE. */
 extern my_bool high_level_read_only;
 /** store to its own file each table created by an user; data
 dictionary tables are in the system tablespace 0 */
@@ -534,10 +534,6 @@ extern ulint srv_main_idle_loops;
 /** Log writes involving flush. */
 extern ulint srv_log_writes_and_flush;
 
-#if defined UNIV_DEBUG || defined UNIV_IBUF_DEBUG
-extern my_bool srv_ibuf_disable_background_merge;
-#endif /* UNIV_DEBUG || UNIV_IBUF_DEBUG */
-
 #ifdef UNIV_DEBUG
 extern my_bool innodb_evict_tables_on_commit_debug;
 extern my_bool srv_sync_debug;
diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
index 9af7dbdbdd9..47c9adeeb78 100644
--- a/storage/innobase/log/log0recv.cc
+++ b/storage/innobase/log/log0recv.cc
@@ -254,42 +254,37 @@ public:
 	{
 		ut_ad(mutex_own(&recv_sys.mutex));
 		ut_ad(recv_no_ibuf_operations);
-		for (map::iterator i= inits.begin(); i != inits.end(); i++) {
-			i->second.created = false;
+		for (auto &i : inits) {
+			i.second.created = false;
 		}
 	}
 
-	/** On the last recovery batch, merge buffered changes to those
-	pages that were initialized by buf_page_create() and still reside
-	in the buffer pool. Stale pages are not allowed in the buffer pool.
+	/** On the last recovery batch, mark whether the page contains
+	change buffered changes for the list of pages that were initialized
+	by buf_page_create() and still reside in the buffer pool.
 
 	Note: When MDEV-14481 implements redo log apply in the
 	background, we will have to ensure that buf_page_get_gen()
 	will not deliver stale pages to users (pages on which the
-	change buffer was not merged yet). Normally, the change
-	buffer merge is performed on I/O completion. Maybe, add a
-	flag to buf_page_t and perform the change buffer merge on
-	the first actual access?
+	change buffer was not merged yet).
 	@param[in,out] mtr dummy mini-transaction */
-	void ibuf_merge(mtr_t& mtr)
+	void mark_ibuf_exist(mtr_t& mtr)
 	{
 		ut_ad(mutex_own(&recv_sys.mutex));
 		ut_ad(!recv_no_ibuf_operations);
 		mtr.start();
 
-		for (map::const_iterator i= inits.begin(); i != inits.end();
-		     i++) {
-			if (!i->second.created) {
+		for (const auto& i : inits) {
+			if (!i.second.created) {
 				continue;
 			}
 			if (buf_block_t* block = buf_page_get_gen(
-				    i->first, 0, RW_X_LATCH, NULL,
+				    i.first, 0, RW_X_LATCH, NULL,
 				    BUF_GET_IF_IN_POOL, __FILE__, __LINE__,
-				    &mtr, NULL)) {
+				    &mtr)) {
 				mutex_exit(&recv_sys.mutex);
-				ibuf_merge_or_delete_for_page(
-					block, i->first,
-					block->zip_size(), true);
+				block->page.ibuf_exist = ibuf_page_exists(
+					block->page);
 				mtr.commit();
 				mtr.start();
 				mutex_enter(&recv_sys.mutex);
@@ -1995,11 +1990,9 @@ void recv_recover_page(buf_page_t* bpage)
 	x-latch on it. This is needed for the operations to
 	the page to pass the debug checks. */
 	rw_lock_x_lock_move_ownership(&block->lock);
-	buf_block_dbg_add_level(block, SYNC_NO_ORDER_CHECK);
-	ibool success = buf_page_get_known_nowait(
-		RW_X_LATCH, block, BUF_KEEP_OLD,
-		__FILE__, __LINE__, &mtr);
-	ut_a(success);
+	buf_block_buf_fix_inc(block, __FILE__, __LINE__);
+	rw_lock_x_lock(&block->lock);
+	mtr.memo_push(block, MTR_MEMO_PAGE_X_FIX);
 
 	mutex_enter(&recv_sys.mutex);
 	if (recv_sys.apply_log_recs) {
@@ -2275,7 +2268,7 @@ done:
 		mlog_init.reset();
 	} else if (!recv_no_ibuf_operations) {
 		/* We skipped this in buf_page_create(). */
-		mlog_init.ibuf_merge(mtr);
+		mlog_init.mark_ibuf_exist(mtr);
 	}
 
 	recv_sys.apply_log_recs = false;
diff --git a/storage/innobase/row/row0merge.cc b/storage/innobase/row/row0merge.cc
index ec299ba36fc..180577ecdae 100644
--- a/storage/innobase/row/row0merge.cc
+++ b/storage/innobase/row/row0merge.cc
@@ -2029,7 +2029,7 @@ end_of_index:
 				block = page_cur_get_block(cur);
 				block = btr_block_get(
 					*clust_index, next_page_no,
-					RW_S_LATCH, &mtr);
+					RW_S_LATCH, false, &mtr);
 
 				btr_leaf_page_release(page_cur_get_block(cur),
 						      BTR_SEARCH_LEAF, &mtr);
diff --git a/storage/innobase/srv/srv0mon.cc b/storage/innobase/srv/srv0mon.cc
index 8d4345d910e..6624c569533 100644
--- a/storage/innobase/srv/srv0mon.cc
+++ b/storage/innobase/srv/srv0mon.cc
@@ -1188,11 +1188,6 @@ static monitor_info_t innodb_counter_info[] =
 	 MONITOR_NONE,
 	 MONITOR_DEFAULT_START, MONITOR_SRV_BACKGROUND_DROP_TABLE_MICROSECOND},
 
-	{"innodb_ibuf_merge_usec", "server",
-	 "Time (in microseconds) spent to process change buffer merge",
-	 MONITOR_NONE,
-	 MONITOR_DEFAULT_START, MONITOR_SRV_IBUF_MERGE_MICROSECOND},
-
 	{"innodb_log_flush_usec", "server",
 	 "Time (in microseconds) spent to flush log records",
 	 MONITOR_NONE,
diff --git a/storage/innobase/srv/srv0srv.cc b/storage/innobase/srv/srv0srv.cc
index ffb87db58f0..5638b783224 100644
--- a/storage/innobase/srv/srv0srv.cc
+++ b/storage/innobase/srv/srv0srv.cc
@@ -2172,13 +2172,6 @@ srv_master_do_active_tasks(void)
 	srv_main_thread_op_info = "checking free log space";
 	log_free_check();
 
-	/* Do an ibuf merge */
-	srv_main_thread_op_info = "doing insert buffer merge";
-	counter_time = microsecond_interval_timer();
-	ibuf_merge_in_background(false);
-	MONITOR_INC_TIME_IN_MICRO_SECS(
-		MONITOR_SRV_IBUF_MERGE_MICROSECOND, counter_time);
-
 	/* Flush logs if needed */
 	srv_main_thread_op_info = "flushing log";
 	srv_sync_log_buffer_in_background();
@@ -2265,13 +2258,6 @@ srv_master_do_idle_tasks(void)
 	srv_main_thread_op_info = "checking free log space";
 	log_free_check();
 
-	/* Do an ibuf merge */
-	counter_time = microsecond_interval_timer();
-	srv_main_thread_op_info = "doing insert buffer merge";
-	ibuf_merge_in_background(true);
-	MONITOR_INC_TIME_IN_MICRO_SECS(
-		MONITOR_SRV_IBUF_MERGE_MICROSECOND, counter_time);
-
 	if (srv_shutdown_state != SRV_SHUTDOWN_NONE) {
 		return;
 	}
@@ -2335,7 +2321,7 @@ srv_shutdown(bool ibuf_merge)
 		srv_main_thread_op_info = "checking free log space";
 		log_free_check();
 		srv_main_thread_op_info = "doing insert buffer merge";
-		n_bytes_merged = ibuf_merge_in_background(true);
+		n_bytes_merged = ibuf_merge_all();
 
 		/* Flush logs if needed */
 		srv_sync_log_buffer_in_background();
diff --git a/storage/innobase/srv/srv0start.cc b/storage/innobase/srv/srv0start.cc
index 9cc77259239..46435bb0055 100644
--- a/storage/innobase/srv/srv0start.cc
+++ b/storage/innobase/srv/srv0start.cc
@@ -1311,7 +1311,7 @@ dberr_t srv_start(bool create_new_db)
 	}
 
 	high_level_read_only = srv_read_only_mode
-		|| srv_force_recovery > SRV_FORCE_NO_TRX_UNDO
+		|| srv_force_recovery > SRV_FORCE_NO_IBUF_MERGE
 		|| srv_sys_space.created_new_raw();
 
 	/* Reset the start state. */
@@ -2135,7 +2135,7 @@ files_checked:
 	/* Validate a few system page types that were left
 	uninitialized before MySQL or MariaDB 5.5. */
 	if (!high_level_read_only) {
-		ut_ad(srv_force_recovery < SRV_FORCE_NO_IBUF_MERGE);
+		ut_ad(srv_force_recovery <= SRV_FORCE_NO_IBUF_MERGE);
 		buf_block_t* block;
 		mtr.start();
 		/* Bitmap page types will be reset in
@@ -2190,7 +2190,7 @@ files_checked:
 
 		/* FIXME: Skip the following if srv_read_only_mode,
 		while avoiding "Allocated tablespace ID" warnings. */
-		if (srv_force_recovery < SRV_FORCE_NO_IBUF_MERGE) {
+		if (srv_force_recovery <= SRV_FORCE_NO_IBUF_MERGE) {
 			/* Open or Create SYS_TABLESPACES and SYS_DATAFILES
 			so that tablespace names and other metadata can be
 			found. */
diff --git a/storage/rocksdb/mysql-test/rocksdb/r/innodb_i_s_tables_disabled.result b/storage/rocksdb/mysql-test/rocksdb/r/innodb_i_s_tables_disabled.result
index 6f446a13132..9bf35793159 100644
--- a/storage/rocksdb/mysql-test/rocksdb/r/innodb_i_s_tables_disabled.result
+++ b/storage/rocksdb/mysql-test/rocksdb/r/innodb_i_s_tables_disabled.result
@@ -230,7 +230,6 @@ innodb_activity_count server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NU
 innodb_master_active_loops server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Number of times master thread performs its tasks when server is active
 innodb_master_idle_loops server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Number of times master thread performs its tasks when server is idle
 innodb_background_drop_table_usec server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Time (in microseconds) spent to process drop table list
-innodb_ibuf_merge_usec server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Time (in microseconds) spent to process change buffer merge
 innodb_log_flush_usec server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Time (in microseconds) spent to flush log records
 innodb_mem_validate_usec server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Time (in microseconds) spent to do memory validation
 innodb_master_purge_usec server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Time (in microseconds) spent by master thread to purge records