The InnoDB adaptive hash index is sometimes degrading the performance of
InnoDB, and it is sometimes disabled to get more consistent performance.
We should have a compile-time option to disable the adaptive hash index.
Let us introduce two options:
OPTION(WITH_INNODB_AHI "Include innodb_adaptive_hash_index" ON)
OPTION(WITH_INNODB_ROOT_GUESS "Cache index root block descriptors" ON)
where WITH_INNODB_AHI always implies WITH_INNODB_ROOT_GUESS.
As part of this change, the misleadingly named function
trx_search_latch_release_if_reserved(trx) will be replaced with the macro
trx_assert_no_search_latch(trx) that will be empty unless
BTR_CUR_HASH_ADAPT is defined (cmake -DWITH_INNODB_AHI=ON).
We will also remove the unused column
INFORMATION_SCHEMA.INNODB_TRX.TRX_ADAPTIVE_HASH_TIMEOUT.
In MariaDB Server 10.1, it used to reflect the value of
trx_t::search_latch_timeout which could be adjusted during
row_search_for_mysql(). In 10.2, there is no such field.
Other than the removal of the unused column TRX_ADAPTIVE_HASH_TIMEOUT,
this is an almost non-functional change to the server when using the
default build options.
Some tests are adjusted so that they will work with both
-DWITH_INNODB_AHI=ON and -DWITH_INNODB_AHI=OFF. The test
innodb.innodb_monitor has been renamed to innodb.monitor
in order to track MySQL 5.7, and the duplicate tests
sys_vars.innodb_monitor_* are removed.
MDEV-7618 introduced configuration parameter innodb_instrument_semaphores
in MariaDB Server 10.1. The parameter seems to only affect the rw-lock
X-latch acquisition. Extra fields are added to rw_lock_t to remember one
X-latch holder or waiter. These fields are not being consulted or reported
anywhere. This is basically only adding code bloat.
If the intention is to debug hangs or deadlocks, we have better tools for
that in the debug server, and for the non-debug server, core dumps can
reveal a lot. For example, the mini-transaction memo records the
currently held buffer block or index rw-locks, to be released at
mtr_t::commit().
The configuration parameter innodb_instrument_semaphores will be
deprecated in 10.2.5 and removed in 10.3.0.
rw_lock_t: Remove the members lock_name, file_name, line, thread_id
which did not affect any output.
to tables in the system tablespace
This is a regression caused by MDEV-11585, which accidentally
changed Tablespace::is_undo_tablespace() in an incorrect way,
causing the InnoDB system tablespace to be reported as a dedicated
undo tablespace, for which the change buffer is not applicable.
Tablespace::is_undo_tablespace(): Remove. There were only 2
calls from the function buf_page_io_complete(). Replace those
calls as appropriate.
Also, merge changes to tablespace import/export tests from
MySQL 5.7, and clean up the tests a little further, allowing
them to be run with any innodb_page_size.
Remove duplicated error injection instrumentation for the
import/export tests. In MySQL 5.7, the error injection label
buf_page_is_corrupt_failure was renamed to
buf_page_import_corrupt_failure.
fil_space_extend_must_retry(): Correct a debug assertion
(tablespaces can be extended during IMPORT), and remove a
TODO comment about compressed temporary tables that was
already addressed in MDEV-11816.
dict_build_tablespace_for_table(): Correct a comment that
no longer holds after MDEV-11816, and assert that
ROW_FORMAT=COMPRESSED can only be used in .ibd files.
dict_init_free(): Make global, and move the call from
dict_close() to srv_free(), because this is initialized
earlier than dict_sys.
innobase_space_shutdown(): Do not leak srv_allow_writes_event.
Write only one encryption key to the checkpoint page.
Use 4 bytes of nonce. Encrypt more of each redo log block,
only skipping the 4-byte field LOG_BLOCK_HDR_NO which the
initialization vector is derived from.
Issue notes, not warning messages for rewriting the redo log files.
recv_recovery_from_checkpoint_finish(): Do not generate any redo log,
because we must avoid that before rewriting the redo log files, or
otherwise a crash during a redo log rewrite (removing or adding
encryption) may end up making the database unrecoverable.
Instead, do these tasks in innobase_start_or_create_for_mysql().
Issue a firm "Missing MLOG_CHECKPOINT" error message. Remove some
unreachable code and duplicated error messages for log corruption.
LOG_HEADER_FORMAT_ENCRYPTED: A flag for identifying an encrypted redo
log format.
log_group_t::is_encrypted(), log_t::is_encrypted(): Determine
if the redo log is in encrypted format.
recv_find_max_checkpoint(): Interpret LOG_HEADER_FORMAT_ENCRYPTED.
srv_prepare_to_delete_redo_log_files(): Display NOTE messages about
adding or removing encryption. Do not issue warnings for redo log
resizing any more.
innobase_start_or_create_for_mysql(): Rebuild the redo logs also when
the encryption changes.
innodb_log_checksums_func_update(): Always use the CRC-32C checksum
if innodb_encrypt_log. If needed, issue a warning
that innodb_encrypt_log implies innodb_log_checksums.
log_group_write_buf(): Compute the checksum on the encrypted
block contents, so that transmission errors or incomplete blocks can be
detected without decrypting.
Rewrite most of the redo log encryption code. Only remember one
encryption key at a time (but remember up to 5 when upgrading from the
MariaDB 10.1 format.)
The InnoDB redo log consists of a list of files that logically form
a bigger file, as if the individual files were concatenated together.
The first file will always be written on redo log checkpoint, because
the two checkpoint pages are at the start of the single logical
redo log file.
There is no technical reason why InnoDB requires at least 2 files
to exist. Let us reduce the minimum number to 1. In that way,
restoring from backups will become easier, since InnoDB can directly
deal with a single backed-up redo log file.
Ever since MDEV-5800 enabled indexed virtual columns for InnoDB,
the InnoDB shutdown relied on close_connections() that would set
thd->killed for the InnoDB purge threads. Alas, the embedded server
shutdown is not invoking close_connections(), and thus InnoDB purge
threads fail to initiate shutdown, causing a hang.
innodb_inited: Remove. Use srv_was_started instead.
innobase_fast_shutdown: Remove. Use srv_fast_shutdown instead.
srv_running: Renamed from thd_destructor_myvar, and made global.
The value NULL means that shutdown was requested or the purge threads
should not be running because of innodb_read_only_mode=1.
innobase_init(): Set srv_was_started after ensuring that srv_running
was initialized. (In innodb_read_only mode, the purge threads are not
started and we do not care if srv_running==NULL.)
innobase_start_or_create_for_mysql(): Do not set srv_was_started.
Let it be set by the only caller innobase_init().
srv_purge_should_exit(): Check also srv_was_started and srv_running
when evaluating thd->killed.
Oracle introduced a Memcached plugin interface to the InnoDB
storage engine in MySQL 5.6. That interface is essentially a
fork of Memcached development snapshot 1.6.0-beta1 of an old
development branch 'engine-pu'.
To my knowledge, there have not been any updates to the Memcached code
between MySQL 5.6 and 5.7; only bug fixes and extensions related to
the Oracle modifications.
The Memcached plugin is not part of the MariaDB Server. Therefore it
does not make sense to include the InnoDB interfaces for the Memcached
plugin, or to have any related configuration parameters:
innodb_api_bk_commit_interval
innodb_api_disable_rowlock
innodb_api_enable_binlog
innodb_api_enable_mdl
innodb_api_trx_level
Removing this code in one commit makes it possible to easily restore
it, in case it turns out to be needed later.
Encryption stores used key_version to
FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION (offset 26)
field. Spatial indexes store RTREE Split Sequence Number
(FIL_RTREE_SPLIT_SEQ_NUM) in the same field. Both values
can't be stored in same field. Thus, current encryption
implementation does not support encrypting spatial indexes.
fil_space_encrypt(): Do not encrypt page if page type is
FIL_PAGE_RTREE (this is required for background
encryption innodb-encrypt-tables=ON).
create_table_info_t::check_table_options() Do not allow creating
table with ENCRYPTED=YES if table contains spatial index.
Galera disallow-writes feature was lost in InnoDB 5.7 merge
to 10.2. This patch restores this feature and fixes test
failure on test galera.galera_var_innodb_disallow_writes.
Remove the debug parameter innodb_force_recovery_crash that was
introduced into MySQL 5.6 by me in WL#6494 which allowed InnoDB
to resize the redo log on startup.
Let innodb.log_file_size actually start up the server, but ensure
that the InnoDB storage engine refuses to start up in each of the
scenarios.
This fixes memory leaks in tests that cause InnoDB startup to fail.
buf_pool_free_instance(): Also free buf_pool->flush_rbt, which would
normally be freed when crash recovery finishes.
fil_node_close_file(), fil_space_free_low(), fil_close_all_files():
Relax some debug assertions to tolerate !srv_was_started.
innodb_shutdown(): Renamed from innobase_shutdown_for_mysql().
Changed the return type to void. Do not assume that all subsystems
were started.
que_init(), que_close(): Remove (empty functions).
srv_init(), srv_general_init(): Remove as global functions.
srv_free(): Allow srv_sys=NULL.
srv_get_active_thread_type(): Only return SRV_PURGE if purge really
is running.
srv_shutdown_all_bg_threads(): Do not reset srv_start_state. It will
be needed by innodb_shutdown().
innobase_start_or_create_for_mysql(): Always call srv_boot() so that
innodb_shutdown() can assume that it was called. Make more subsystems
dependent on SRV_START_STATE_STAT.
srv_shutdown_bg_undo_sources(): Require SRV_START_STATE_STAT.
trx_sys_close(): Do not assume purge_sys!=NULL. Do not call
buf_dblwr_free(), because the doublewrite buffer can exist while
the transaction system does not.
logs_empty_and_mark_files_at_shutdown(): Do a faster shutdown if
!srv_was_started.
recv_sys_close(): Invoke dblwr.pages.clear() which would normally
be invoked by buf_dblwr_process().
recv_recovery_from_checkpoint_start(): Always release log_sys->mutex.
row_mysql_close(): Allow the subsystem not to exist.
Remove the debug parameter innodb_force_recovery_crash that was
introduced into MySQL 5.6 by me in WL#6494 which allowed InnoDB
to resize the redo log on startup.
Let innodb.log_file_size actually start up the server, but ensure
that the InnoDB storage engine refuses to start up in each of the
scenarios.
1. wait for thd_destructor_proxy thread to start after creating it.
this ensures that the thread is ready to receive a shutdown signal
whenever we want to send it.
2. join it at shutdown, this guarantees that no innodb THD will exist
after innobase_end().
this fixes crashes and memory leaks in main.mysqld_option_err
(were innodb was started and then immediately shut down).
crashes server
This bug is the result of merging the Oracle MySQL follow-up fix
BUG#22963169 MYSQL CRASHES ON CREATE FULLTEXT INDEX
without merging the base bug fix:
Bug#79475 Insert a token of 84 4-bytes chars into fts index causes
server crash.
Unlike the above mentioned fixes in MySQL, our fix will not change
the storage format of fulltext indexes in InnoDB or XtraDB
when a character encoding with mbmaxlen=2 or mbmaxlen=3
and the length of a word is between 128 and 84*mbmaxlen bytes.
The Oracle fix would allocate 2 length bytes for these cases.
Compatibility with other MySQL and MariaDB releases is ensured by
persisting the used maximum length in the SYS_COLUMNS table in the
InnoDB data dictionary.
This fix also removes some unnecessary strcmp() calls when checking
for the legacy default collation my_charset_latin1
(my_charset_latin1.name=="latin1_swedish_ci").
fts_create_one_index_table(): Store the actual length in bytes.
This metadata will be written to the SYS_COLUMNS table.
fts_zip_initialize(): Initialize only the first byte of the buffer.
Actually the code should not even care about this first byte, because
the length is set as 0.
FTX_MAX_WORD_LEN: Define as HA_FT_MAXCHARLEN * 4 aka 336 bytes,
not as 254 bytes.
row_merge_create_fts_sort_index(): Set the actual maximum length of the
column in bytes, similar to fts_create_one_index_table().
row_merge_fts_doc_tokenize(): Remove the redundant parameter word_dtype.
Use the actual maximum length of the column. Calculate the extra_size
in the same way as row_merge_buf_encode() does.
Problem was that implementation merged from 10.1 was incompatible
with InnoDB 5.7.
buf0buf.cc: Add functions to return should we punch hole and
how big.
buf0flu.cc: Add written page to IORequest
fil0fil.cc: Remove unneeded status call and add test is
sparse files and punch hole supported by file system when
tablespace is created. Add call to get file system
block size. Used file node is added to IORequest. Added
functions to check is punch hole supported and setting
punch hole.
ha_innodb.cc: Remove unneeded status variables (trim512-32768)
and trim_op_saved. Deprecate innodb_use_trim and
set it ON by default. Add function to set innodb-use-trim
dynamically.
dberr.h: Add error code DB_IO_NO_PUNCH_HOLE
if punch hole operation fails.
fil0fil.h: Add punch_hole variable to fil_space_t and
block size to fil_node_t.
os0api.h: Header to helper functions on buf0buf.cc and
fil0fil.cc for os0file.h
os0file.h: Remove unneeded m_block_size from IORequest
and add bpage to IORequest to know actual size of
the block and m_fil_node to know tablespace file
system block size and does it support punch hole.
os0file.cc: Add function punch_hole() to IORequest
to do punch_hole operation,
get the file system block size and determine
does file system support sparse files (for punch hole).
page0size.h: remove implicit copy disable and
use this implicit copy to implement copy_from()
function.
buf0dblwr.cc, buf0flu.cc, buf0rea.cc, fil0fil.cc, fil0fil.h,
os0file.h, os0file.cc, log0log.cc, log0recv.cc:
Remove unneeded write_size parameter from fil_io
calls.
srv0mon.h, srv0srv.h, srv0mon.cc: Remove unneeded
trim512-trim32678 status variables. Removed
these from monitor tests.
Change default to zlib, this has effect only if user has
explicitly requested page compression and then user
naturally expects that pages are really compressed
if they can be compressed.
Most notably, this includes MDEV-11623, which includes a fix and
an upgrade procedure for the InnoDB file format incompatibility
that is present in MariaDB Server 10.1.0 through 10.1.20.
In other words, this merge should address
MDEV-11202 InnoDB 10.1 -> 10.2 migration does not work
When MySQL 5.7.9 (and MariaDB Server 10.2) introduced
innodb_default_row_format and made ROW_FORMAT=DYNAMIC the default,
it became possible to create any ROW_FORMAT tables in the InnoDB
system tablespace, except ROW_FORMAT=COMPRESSED.
In MySQL 5.7, it is possible to create ROW_FORMAT=DYNAMIC
tables when TABLESPACE=innodb_system is explicitly specified.
Because MariaDB Server 10.2 does not support the MySQL 5.7
TABLESPACE=innodb_system attribute for tables, we should allow
ROW_FORMAT=DYNAMIC when innodb_file_per_table=0.
Also, remove the test innodb_zip.innodb-create-options, which was
an outdated copy of innodb_zip.create_options.
MySQL 5.7 allows temporary tables to be created in ROW_FORMAT=COMPRESSED.
The usefulness of this is questionable. WL#7899 in MySQL 8.0.0
prevents the creation of such compressed tables, so that all InnoDB
temporary tables will be located inside the predefined
InnoDB temporary tablespace.
Pick up and adjust some tests from MySQL 5.7 and 8.0.
dict_tf_to_fsp_flags(): Remove the parameter is_temp.
fsp_flags_init(): Remove the parameter is_temporary.
row_mysql_drop_temp_tables(): Remove. There cannot be any temporary
tables in InnoDB. (This never removed #sql* tables in the datadir
which were created by DDL.)
dict_table_t::dir_path_of_temp_table: Remove.
create_table_info_t::m_temp_path: Remove.
create_table_info_t::create_options_are_invalid(): Do not allow
ROW_FORMAT=COMPRESSED or KEY_BLOCK_SIZE for temporary tables.
create_table_info_t::innobase_table_flags(): Do not unnecessarily
prevent CREATE TEMPORARY TABLE with SPATIAL INDEX.
(MySQL 5.7 does allow this.)
fil_space_belongs_in_lru(): The only FIL_TYPE_TEMPORARY tablespace
is never subjected to closing least-recently-used files.
MySQL 5.7 introduced partial support for user-created shared tablespaces
(for example, import and export are not supported).
MariaDB Server does not support tablespaces at this point of time.
Let us remove most InnoDB code and data structures that is related
to shared tablespaces.
The MariaDB 10.1 page_compression is incompatible with the Oracle
implementation that was introduced in MySQL 5.7 later.
Remove the Oracle implementation. Also remove the remaining traces of
MYSQL_ENCRYPTION.
This will also remove traces of PUNCH_HOLE until it is implemented
better. The only effective call to os_file_punch_hole() was in
fil_node_create_low() to test if the operation is supported for the file.
In other words, it looks like page_compression is not working in
MariaDB 10.2, because no code equivalent to the 10.1 os_file_trim()
is enabled.
MariaDB will likely never support MySQL-style encryption for
InnoDB, because we cannot link with the Oracle encryption plugin.
This is preparation for merging MDEV-11623.
The INFORMATION_SCHEMA view INNODB_TEMP_TABLE_INFO was added to
MySQL 5.7 as part of the work to implement temporary tables
without any redo logging.
The only use case of this view was SELECT COUNT(*) in some tests,
to see how many temporary tables exist in InnoDB. The columns do
not report much useful information. For example, the table name
would not be the user-specified table name, but a generated #sql
name. Also, the session that created the table is not identified.
- Atomic writes are enabled by default
- Automatically detect if device supports atomic write and use it if
atomic writes are enabled
- Remove ATOMIC WRITE options from CREATE TABLE
- Atomic write is a device option, not a table options as the table may
crash if the media changes
- Add support for SHANNON SSD cards
Sometimes innodb_data_file_size_debug was reported as INT UNSIGNED
instead of BIGINT UNSIGNED. Make it uint instead of ulong to get
a more deterministic result.