mirror of https://github.com/MariaDB/server.git synced 2025-11-28 17:36:30 +03:00

14 Commits

Author SHA1 Message Date
Marko Mäkelä
b42294bc64 MDEV-19514 Defer change buffer merge until pages are requested
We will remove the InnoDB background operation of merging buffered
changes to secondary index leaf pages. Changes will only be merged as a
result of an operation that accesses a secondary index leaf page,
such as a SQL statement that performs a lookup via that index
or modifies the index. Also ROLLBACK and some background operations,
such as purging the history of committed transactions, or computing
index cardinality statistics, can cause change buffer merge.
Encryption key rotation will not perform change buffer merge.

The motivation of this change is to simplify the I/O logic and to
allow crash recovery to happen in the background (MDEV-14481).
We also hope that this will reduce the number of "mystery" crashes
due to corrupted data. Because change buffer merge will typically
take place as a result of executing SQL statements, there should be
a clearer connection between the crash and the SQL statements that
were executed when the server crashed.

In many cases, a slight performance improvement was observed.

This is joint work with Thirunarayanan Balathandayuthapani
and was tested by Axel Schwenke and Matthias Leich.

The InnoDB monitor counter innodb_ibuf_merge_usec will be removed.

On slow shutdown (innodb_fast_shutdown=0), we will continue to
merge all buffered changes (and purge all undo log history).

Two InnoDB configuration parameters will be changed as follows:

innodb_disable_background_merge: Removed.
This parameter existed only in debug builds.
All change buffer merges will use synchronous reads.

innodb_force_recovery will be changed as follows:
* innodb_force_recovery=4 will be the same as innodb_force_recovery=3
(the change buffer merge cannot be disabled; it can only happen as
a result of an operation that accesses a secondary index leaf page).
The option used to be capable of corrupting secondary index leaf pages.
Now that capability is removed, and innodb_force_recovery=4 becomes 'safe'.
* innodb_force_recovery=5 (which essentially hard-wires
SET GLOBAL TRANSACTION ISOLATION LEVEL READ UNCOMMITTED)
becomes safe to use. Bogus data can be returned to SQL, but
persistent InnoDB data files will not be corrupted further.
* innodb_force_recovery=6 (ignore the redo log files)
will be the only option that can potentially cause
persistent corruption of InnoDB data files.
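
The innodb_force_recovery changes described above can be summarized in a small sketch. It only restates the commit message; the function names are hypothetical and do not exist in the server.

```cpp
#include <cassert>

// Hypothetical helpers that restate the commit message; not server code.

// Per the message, level 4 now behaves the same as level 3.
constexpr unsigned toy_effective_force_recovery(unsigned level) {
    return level == 4 ? 3 : level;
}

// Per the message, only level 6 (ignore the redo log files) can still
// cause persistent corruption of InnoDB data files.
constexpr bool toy_may_corrupt_data_files(unsigned level) {
    return level == 6;
}

// Per the message, level 5 can return bogus data to SQL without
// corrupting the persistent files further.
constexpr bool toy_is_safe_for_data_files(unsigned level) {
    return !toy_may_corrupt_data_files(level);
}
```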

Code changes:

buf_page_t::ibuf_exist: New flag, to indicate whether buffered
changes exist for a buffer pool page. Pages with pending changes
can be returned by buf_page_get_gen(). Previously, the changes
were always merged inside buf_page_get_gen() if needed.

ibuf_page_exists(const buf_page_t&): Check if buffered changes
exist for an X-latched or read-fixed page.

buf_page_get_gen(): Add the parameter allow_ibuf_merge=false.
All callers that know that they may be accessing a secondary index
leaf page must pass this parameter as allow_ibuf_merge=true,
unless it does not matter for that caller whether all buffered
changes have been applied. Assert that whenever allow_ibuf_merge
holds, the page actually is a leaf page. Attempt change buffer
merge only to secondary B-tree index leaf pages.
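
The interplay of the ibuf_exist flag and the allow_ibuf_merge parameter might be pictured with the following toy model. It is a minimal sketch, not InnoDB code: ToyPage, ToyBufferPool and their members are hypothetical, and latching, the change buffer bitmap and I/O are left out entirely.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy types; none of these are InnoDB's real structures.
struct ToyPage {
    bool is_secondary_leaf = false;  // only such pages may have buffered changes
    bool ibuf_exist = false;         // models the idea behind buf_page_t::ibuf_exist
    std::vector<std::string> records;
};

struct ToyBufferPool {
    std::map<int, ToyPage> pages;
    std::map<int, std::vector<std::string>> change_buffer;  // page id -> pending records

    // Buffer a change for a secondary index leaf page instead of applying it.
    void buffer_change(int page_id, const std::string& rec) {
        ToyPage& page = pages.at(page_id);
        assert(page.is_secondary_leaf);
        change_buffer[page_id].push_back(rec);
        page.ibuf_exist = true;
    }

    // Toy analogue of a page request. With the default allow_ibuf_merge=false,
    // a page with pending changes is returned as it is.
    ToyPage& get(int page_id, bool allow_ibuf_merge = false) {
        ToyPage& page = pages.at(page_id);
        if (allow_ibuf_merge) {
            // Callers may only ask for a merge on leaf pages.
            assert(page.is_secondary_leaf);
            if (page.ibuf_exist) {
                std::vector<std::string>& pending = change_buffer[page_id];
                page.records.insert(page.records.end(),
                                    pending.begin(), pending.end());
                change_buffer.erase(page_id);
                page.ibuf_exist = false;
            }
        }
        return page;
    }
};
```

In this model, a caller that does not care about buffered changes pays nothing for them, while a caller that passes allow_ibuf_merge=true sees the page with all pending records applied.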

btr_block_get(): Add parameter 'bool merge'.
All callers of btr_block_get() should know whether the page could be
a secondary index leaf page. If it is not, we should avoid consulting
the change buffer bitmap to even consider a merge. This is the main
interface to requesting index pages from the buffer pool.

ibuf_merge_or_delete_for_page(), recv_recover_page(): Replace
buf_page_get_known_nowait() with much simpler logic, because
it is now guaranteed that the block is x-latched or read-fixed.

mlog_init_t::mark_ibuf_exist(): Renamed from mlog_init_t::ibuf_merge().
On crash recovery, we will no longer merge any buffered changes
for the pages that we read into the buffer pool during the last batch
of applying log records.

buf_page_get_gen_known_nowait(), BUF_MAKE_YOUNG, BUF_KEEP_OLD: Remove.

btr_search_guess_on_hash(): Merge buf_page_get_gen_known_nowait()
into its only remaining caller.

buf_page_make_young_if_needed(): Define as an inline function.
Add the parameter buf_pool.

buf_page_peek_if_young(), buf_page_peek_if_too_old(): Add the
parameter buf_pool.

fil_space_validate_for_mtr_commit(): Remove a bogus comment
about background merge of the change buffer.

btr_cur_open_at_rnd_pos_func(), btr_cur_search_to_nth_level_func(),
btr_cur_open_at_index_side_func(): Use narrower data types and scopes.

ibuf_read_merge_pages(): Replaces buf_read_ibuf_merge_pages().
Merge the change buffer by invoking buf_page_get_gen().
2019-10-11 17:28:15 +03:00
Marko Mäkelä
d09aec7a15 MDEV-19940 Clean up INFORMATION_SCHEMA.INNODB_ tables
Shorten some VARCHAR attributes to a more reasonable length.

INNODB_METRICS: Rename the column STATUS to ENABLED, and make it Boolean.

Replace with INT(1) many Boolean attributes that were declared as VARCHAR
containing 'NO','YES','disabled','enabled','Uninitialized','Initialized'.

Replace some VARCHAR attributes with ENUM.

Replace some BIGINT with INT when 32 bits are sufficient.

Remove INNODB_SYS_TABLESPACES.SPACE_TYPE. The type of a tablespace
can be derived from the tablespace ID. A fixed number is used for
the system tablespace and the temporary tablespace. All other tablespaces
are single-table or single-partition tablespaces.
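
Deriving the tablespace type from the ID could look roughly like the sketch below. The two constant values are assumptions made for illustration and should be checked against the server source; all names here are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed values for illustration only; verify against the server source.
constexpr std::uint32_t TOY_SYSTEM_SPACE_ID = 0;
constexpr std::uint32_t TOY_TEMP_SPACE_ID = 0xFFFFFFFEU;

// Hypothetical classification matching the three cases in the message.
enum class ToySpaceType { System, Temporary, Single };

// Every ID other than the two fixed ones denotes a single-table or
// single-partition tablespace.
constexpr ToySpaceType toy_space_type(std::uint32_t space_id) {
    return space_id == TOY_SYSTEM_SPACE_ID ? ToySpaceType::System
         : space_id == TOY_TEMP_SPACE_ID   ? ToySpaceType::Temporary
                                           : ToySpaceType::Single;
}
```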

i_s_locks_row_t::lock_type, lock_get_type_str(): Remove.
This is a redundant field. Table and record locks can be
distinguished by whether i_s_locks_row_t::lock_index is NULL.
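
As a minimal sketch of why the field is redundant, the lock type can be computed from the index pointer alone. ToyLocksRow is a hypothetical stand-in that keeps only the one field needed here, and the returned strings are illustrative.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for i_s_locks_row_t, reduced to one field.
struct ToyLocksRow {
    const char* lock_index;  // index name for record locks, nullptr for table locks
};

// The type follows from lock_index, so it need not be stored separately.
inline const char* toy_lock_type(const ToyLocksRow& row) {
    return row.lock_index == nullptr ? "TABLE" : "RECORD";
}
```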

fill_trx_row(): Do not unnecessarily copy the constant strings that
trx->op_info is pointing to.

i_s_locks_row_t::lock_mode: Replace string with integer.

lock_get_mode_str(), lock_get_trx_id(), lock_get_trx(): Remove.

field_store_ulint(): Remove.
2019-07-04 00:09:16 +03:00
Monty
007f68c37f Replace ha_notify_table_changed() with notify_tabledef_changed()
Reason for the change was that ha_notify_table_changed() was done
after table open when .frm had been replaced, which caused failure
in engines that check on open whether the .frm matches the engine's
table definition.

Other changes:
- Remove the unneeded open/close call at the end of inline alter table.
  Some tests that depended on the table being in the table cache after
  ALTER TABLE had to be updated.
2019-05-23 01:20:18 +03:00
Marko Mäkelä
b3860a8621 InnoDB review fixes
Fix the formatting, and remove the MONITOR interface.
Remove unnecessary wrapper functions for the callbacks,
and replace void* with ha_innobase*.
2019-02-05 21:51:35 +02:00
Igor Babaev
33907360f5 MDEV-16188 Post-merge corrections and adjustments 2019-02-04 22:44:33 -08:00
Eugene Kosov
89337d510e MDEV-16580 Remove unused monitor counters from InnoDB
Remove one totally dead monitor.
2018-11-16 11:52:44 +03:00
Alexander Barkov
9c0f5a252b Merge remote-tracking branch 'origin/10.3' into 10.4 2018-07-25 08:25:57 +04:00
Marko Mäkelä
1748a31ae8 MDEV-16675 Unnecessary explicit lock acquisition during UPDATE or DELETE
In InnoDB, an INSERT will not create an explicit lock object. Instead,
the inserted record is initially implicitly locked by the transaction
that wrote its trx_t::id to the hidden system column DB_TRX_ID.
(Other transactions would check if DB_TRX_ID is referring to a
transaction that has not been committed.)

If a record was inserted in the current transaction, it would be
implicitly locked by that transaction. The implicit lock should be
converted to an explicit one only if some other transaction
is requesting access to the record, so that the waits-for graph can be
constructed for detecting deadlocks and lock wait timeouts.

Before this fix, InnoDB would convert implicit locks to
explicit ones, even if no conflict exists.
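
The intended behavior can be illustrated with a toy model. This is a simplified sketch and not the InnoDB lock system: ToyRecord, ToyLockSys and their members are hypothetical, and the real code deals with latches, lock modes and reference counting that are omitted here.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <utility>

using trx_id_t = std::uint64_t;

// Hypothetical record: db_trx_id plays the role of the hidden DB_TRX_ID column.
struct ToyRecord {
    int id;
    trx_id_t db_trx_id;
};

struct ToyLockSys {
    std::set<trx_id_t> active;                          // not yet committed
    std::set<std::pair<int, trx_id_t>> explicit_locks;  // (record id, holder)

    // A record is implicitly locked by the transaction that wrote DB_TRX_ID,
    // as long as that transaction has not committed. Returns 0 if none.
    trx_id_t implicit_holder(const ToyRecord& rec) const {
        return active.count(rec.db_trx_id) ? rec.db_trx_id : 0;
    }

    // Returns true if the caller itself holds the implicit lock; in that case
    // no explicit lock object is created. Otherwise, an implicit lock held by
    // another transaction is converted so that the caller can wait for it.
    bool convert_impl_to_expl(const ToyRecord& rec, trx_id_t caller) {
        const trx_id_t holder = implicit_holder(rec);
        if (holder == 0) return false;
        if (holder == caller) return true;
        explicit_locks.emplace(rec.id, holder);
        return false;
    }
};
```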

lock_rec_convert_impl_to_expl(): Return whether caller_trx
already holds an explicit lock that covers the record.

row_vers_impl_x_locked_low(): Avoid a lookup if the record matches
caller_trx->id.

lock_trx_has_expl_x_lock(): Renamed from lock_trx_has_rec_x_lock().

row_upd_clust_step(): In a debug assertion, check for implicit lock
before invoking lock_trx_has_expl_x_lock().

rw_trx_hash_t::find(): Make do_ref_count a mandatory parameter.
Assert that trx_id is not 0 (the caller should check it).

trx_sys_t::is_registered(): Only invoke find() if id != 0.

trx_sys_t::find(): Add the optional parameter do_ref_count.

lock_rec_queue_validate(): Avoid lookup for trx_id == 0.
2018-07-03 15:10:06 +03:00
Marko Mäkelä
13f7ac2269 MDEV-15705 Remove global status counter Innodb_pages0_read
MDEV-9931 introduced a counter for keeping track of reads of the
first page of InnoDB data files, because the original implementation
of data-at-rest-encryption for InnoDB introduced new code paths for
reading the pages.

Ultimately, the extra reads of the first page were removed, and
the encryption subsystem will be initialized whenever we first read
the first page of each data file, in fil_node_open_file(). It should not
be that interesting to observe how many times an InnoDB data file was
opened for the first time.
2018-05-28 09:36:18 +03:00
Eugene Kosov
c31aa75dee SQL: open TRT only after versioned write [#305][fixes #321] 2017-11-21 21:54:11 +03:00
Eugene Kosov
75cf92fac9 Tests: regenerate embedded [#302]
sys_vars.sysvars_server_embedded
funcs_1.is_key_column_usage_embedded
funcs_1.is_statistics_mysql_embedded
funcs_1.is_table_constraints_mysql_embedded
funcs_1.is_tables_mysql_embedded
funcs_1.is_columns_mysql_embedded
innodb.monitor
2017-11-17 14:28:45 +03:00
Monty
07977c13e7 Fixed monitor.test to handle statistics >= 10 2017-09-08 13:24:42 +03:00
Marko Mäkelä
ff0530ef68 MDEV-12121: Revert test adjustments for -DWITH_INNODB_AHI=OFF
Because the default build configuration of the server will remain
at -DWITH_INNODB_AHI=ON, we want to test the instrumentation.

We make and revert the test adjustments in separate commits on purpose,
so that this commit can be easily reverted later if the default
build configuration is changed to -DWITH_INNODB_AHI=OFF.
2017-03-03 17:08:06 +02:00
Marko Mäkelä
27b9989d31 MDEV-12121 Introduce build option WITH_INNODB_AHI to disable innodb_adaptive_hash_index
The InnoDB adaptive hash index sometimes degrades the performance of
InnoDB, and it is sometimes disabled to get more consistent performance.
We should have a compile-time option to disable the adaptive hash index.

Let us introduce two options:

OPTION(WITH_INNODB_AHI "Include innodb_adaptive_hash_index" ON)
OPTION(WITH_INNODB_ROOT_GUESS "Cache index root block descriptors" ON)

where WITH_INNODB_AHI always implies WITH_INNODB_ROOT_GUESS.

As part of this change, the misleadingly named function
trx_search_latch_release_if_reserved(trx) will be replaced with the macro
trx_assert_no_search_latch(trx) that will be empty unless
BTR_CUR_HASH_ADAPT is defined (cmake -DWITH_INNODB_AHI=ON).
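
A macro that compiles to nothing unless BTR_CUR_HASH_ADAPT is defined might be written along these lines. This is only a sketch of the pattern: ToyTrx, its field and the toy_ macro name are hypothetical, and the real trx_assert_no_search_latch() may check something different.

```cpp
#include <cassert>

// Hypothetical stand-in for trx_t, reduced to the one field this sketch needs.
struct ToyTrx {
    bool has_search_latch = false;
};

#ifdef BTR_CUR_HASH_ADAPT
// Adaptive hash index compiled in: check that no search latch is held.
# define toy_assert_no_search_latch(trx) assert(!(trx)->has_search_latch)
#else
// Adaptive hash index compiled out: the macro expands to nothing.
# define toy_assert_no_search_latch(trx)
#endif
```

Because the empty variant expands to nothing, call sites stay unchanged in both build configurations.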

We will also remove the unused column
INFORMATION_SCHEMA.INNODB_TRX.TRX_ADAPTIVE_HASH_TIMEOUT.
In MariaDB Server 10.1, it used to reflect the value of
trx_t::search_latch_timeout which could be adjusted during
row_search_for_mysql(). In 10.2, there is no such field.

Other than the removal of the unused column TRX_ADAPTIVE_HASH_TIMEOUT,
this is an almost non-functional change to the server when using the
default build options.

Some tests are adjusted so that they will work with both
-DWITH_INNODB_AHI=ON and -DWITH_INNODB_AHI=OFF. The test
innodb.innodb_monitor has been renamed to innodb.monitor
in order to track MySQL 5.7, and the duplicate tests
sys_vars.innodb_monitor_* are removed.
2017-03-03 16:55:50 +02:00