ALTER TABLE IMPORT doesn't properly handle instant alter metadata.
This patch makes IMPORT read, parse and apply instant alter metadata at the
very beginning of operation. So, cases when source table has some metadata
and destination table doesn't have it now works fine.
DISCARD already removes instant metadata so importing normal table into
instant table worked fine before this patch.
decrypt_decompress(): decrypts and decompresses page if needed
handle_instant_metadata(): this should be the first thing to read source
table. Basically, it applies instant metadata to a destination
dict_table_t object. This is the first thing to read FSP flags so
all possible checks of it were moved to this function.
PageConverter::update_index_page(): it doesn't now read instant metadata.
This logic were moved into handle_instant_metadata()
row_import::match_flags(): this is a first part row_import::match_schema().
As a separate function it's used by handle_instant_metadata().
fil_space_t::is_full_crc32_compressed(): added convenient function
ha_innobase::discard_or_import_tablespace(): do not reload table definition
to read instant metadata because handle_instant_metadata() does it better.
The reverted code was originally added in
4e7ee166a9
ANONYMOUS_VAR: this is a handy thing to use along with make_scope_exit()
full_crc32_import.test shows different results, because no
dict_table_close() and dict_table_open_on_id() happens.
Thus, SHOW CREATE TABLE shows a little bit older table definition.
Import operation without .cfg file fails when there is mismatch of index
between metadata table and .ibd file. Moreover, MDEV-19022 shows
that InnoDB can end up with index tree where non-leaf page has only
one child page. So it is unsafe to find the secondary index root page.
This patch does the following when importing the table without .cfg file:
1) If the metadata contains more than one index then InnoDB stops
the import operation and report the user to drop all secondary
indexes before doing import operation.
2) When the metadata contain only clustered index then InnoDB finds the
index id by reading page 0 & page 3 instead of traversing the
whole tablespace.
Problem:
=========
As a part of MDEV-14398 patch, InnoDB added and removed
the tablespace from default encrypt list. But InnoDB removes
the tablespace from the default encrypt list too early due to
i) other encryption thread working on the tablespace
ii) When tablespace is being flushed at the end of
key rotation
InnoDB fails to decrypt/encrypt the tablespace since
the tablespace removed too early and it leads to
test case failure.
Solution:
=========
Avoid the removal of tablespace from default_encrypt_list
only when
1) Another active encryption thread working on tablespace
2) Eligible for tablespace key rotation
3) Tablespace is in flushing phase
Removed the workaround in encryption.innodb_encryption_filekeys test case.
Set tests to non-valgrind:
oqgraph.social
encryption.innodb-page_encryption
binlog_encryption.encrypted_master
innodb.innodb-page_compression_lz4
main.lock_multi_bug38499
main.lock_multi_bug38691
- Set the innodb_encrypt_tables variable before
timeout happens. It will add the pending tablespace
to default encrypt list and does the encryption/decryption
based on innodb_encrypt_tables.
Problem:
=======
- InnoDB iterates the fil_system space list to encrypt the
tablespace in case of key rotation. But it is not
necessary for any encryption plugin which doesn't do
key version rotation.
Solution:
=========
- Introduce a new variable called srv_encrypt_rotate to
indicate whether encryption plugin does key rotation
fil_space_crypt_t::key_get_latest_version(): Enable the
srv_encrypt_rotate only once if current key version is
higher than innodb_encyrption_rotate_key_age
fil_crypt_must_default_encrypt(): Default encryption tables
should be added to default_encryp_tables list if
innodb_encyrption_rotate_key_age is zero and encryption
plugin doesn't do key version rotation
fil_space_create(): Add the newly created space to
default_encrypt_tables list if
fil_crypt_must_default_encrypt() returns true
Removed the nondeterministic select from
innodb-key-rotation-disable test. By default,
InnoDB adds the tablespace to the rotation list and
background crypt thread does encryption of tablespace.
So these select doesn't give reliable results.
* Make Item_in_optimizer::fix_fields inherit the with_window_func
attribute of the subquery's left expression (the subquery itself
cannot have window functions that are aggregated in this select)
* Make Item_cache_wrapper::Item_cache_wrapper() inherit
with_window_func attribute of the item it is caching.
Commit b5615eff0d introduced comment in result file during shutdown.
In case of Windows for the tests involving `file_key_managment.so` as plugin-load-add the tests will be overwritten with .dll extension.
The same happens with environment variable `$FILE_KEY_MANAGMENT_SO`.
So the patch is removing the extension to be extension agnostic.
Reviewed by: wlad@mariadb.com
MDEV-25604 Atomic DDL: Binlog event written upon recovery does not
have default database
The purpose of this task is to ensure that ALTER TABLE is atomic even if
the MariaDB server would be killed at any point of the alter table.
This means that either the ALTER TABLE succeeds (including that triggers,
the status tables and the binary log are updated) or things should be
reverted to their original state.
If the server crashes before the new version is fully up to date and
commited, it will revert to the original table and remove all
temporary files and tables.
If the new version is commited, crash recovery will use the new version,
and update triggers, the status tables and the binary log.
The one execption is ALTER TABLE .. RENAME .. where no changes are done
to table definition. This one will work as RENAME and roll back unless
the whole statement completed, including updating the binary log (if
enabled).
Other changes:
- Added handlerton->check_version() function to allow the ddl recovery
code to check, in case of inplace alter table, if the table in the
storage engine is of the new or old version.
- Added handler->table_version() so that an engine can report the current
version of the table. This should be changed each time the table
definition changes.
- Added ha_signal_ddl_recovery_done() and
handlerton::signal_ddl_recovery_done() to inform all handlers when
ddl recovery has been done. (Needed by InnoDB).
- Added handlerton call inplace_alter_table_committed, to signal engine
that ddl_log has been closed for the alter table query.
- Added new handerton flag
HTON_REQUIRES_NOTIFY_TABLEDEF_CHANGED_AFTER_COMMIT to signal when we
should call hton->notify_tabledef_changed() during
mysql_inplace_alter_table. This was required as MyRocks and InnoDB
needed the call at different times.
- Added function server_uuid_value() to be able to generate a temporary
xid when ddl recovery writes the query to the binary log. This is
needed to be able to handle crashes during ddl log recovery.
- Moved freeing of the frm definition to end of mysql_alter_table() to
remove duplicate code and have a common exit strategy.
-------
InnoDB part of atomic ALTER TABLE
(Implemented by Marko Mäkelä)
innodb_check_version(): Compare the saved dict_table_t::def_trx_id
to determine whether an ALTER TABLE operation was committed.
We must correctly recover dict_table_t::def_trx_id for this to work.
Before purge removes any trace of DB_TRX_ID from system tables, it
will make an effort to load the user table into the cache, so that
the dict_table_t::def_trx_id can be recovered.
ha_innobase::table_version(): return garbage, or the trx_id that would
be used for committing an ALTER TABLE operation.
In InnoDB, table names starting with #sql-ib will remain special:
they will be dropped on startup. This may be revisited later in
MDEV-18518 when we implement proper undo logging and rollback
for creating or dropping multiple tables in a transaction.
Table names starting with #sql will retain some special meaning:
dict_table_t::parse_name() will not consider such names for
MDL acquisition, and dict_table_rename_in_cache() will treat such
names specially when handling FOREIGN KEY constraints.
Simplify InnoDB DROP INDEX.
Prevent purge wakeup
To ensure that dict_table_t::def_trx_id will be recovered correctly
in case the server is killed before ddl_log_complete(), we will block
the purge of any history in SYS_TABLES, SYS_INDEXES, SYS_COLUMNS
between ha_innobase::commit_inplace_alter_table(commit=true)
(purge_sys.stop_SYS()) and purge_sys.resume_SYS().
The completion callback purge_sys.resume_SYS() must be between
ddl_log_complete() and MDL release.
--------
MyRocks support for atomic ALTER TABLE
(Implemented by Sergui Petrunia)
Implement these SE API functions:
- ha_rocksdb::table_version()
- hton->check_version = rocksdb_check_versionMyRocks data dictionary
now stores table version for each table.
(Absence of table version record is interpreted as table_version=0,
that is, which means no upgrade changes are needed)
- For inplace alter table of a partitioned table, call the underlying
handlerton when checking if the table is ok. This assumes that the
partition engine commits all changes at once.
This happens during repair when a temporary table is opened
with HA_OPEN_COPY, which resets 'share->born_transactional', which
the encryption code did not like.
Fixed by resetting just share->now_transactional.
This was because of a wrong test in encryption code that wrote random
numbers over the LSN for pages for transactional Aria tables during repair.
The effect was that after an ALTER TABLE ENABLE KEYS of a encrypted
recovery of the tables would not work.
Fixed by changing testing of !share->now_transactional to
!share->base.born_transactional.
Other things:
- Extended Aria check_table() to check for wrong (= too big) LSN numbers.
- If check_table() failed just because of wrong LSN or TRN numbers,
a following repair table will just do a zerofill which is much faster.
- Limit number of LSN errors in one check table to MAX_LSN_ERROR (10).
- Removed old obsolete test of 'if (error_count & 2)'. Changed error_count
and warning_count from bits to numbers of errors/warnings as this is
more useful.
MDEV-25105 (commit 7a4fbb55b0)
in MariaDB 10.6 will refuse the innodb_checksum_algorithm
values none, innodb, strict_none, strict_innodb.
We will issue a deprecation warning if innodb_checksum_algorithm
is set to any of these non-default unsafe values.
innodb_checksum_algorithm=crc32 was made the default in
MySQL 5.7 and MariaDB Server 10.2, and given that older versions
of the server have reached their end of life, there is no valid
reason to use anything else than innodb_checksum_algorithm=crc32
or innodb_checksum_algorithm=strict_crc32 in MariaDB 10.3.
Reviewed by: Sergei Golubchik
Historically, InnoDB supported a buggy page checksum algorithm that did not
compute a checksum over the full page. Later, well before MySQL 4.1
introduced .ibd files and the innodb_file_per_table option, the algorithm
was corrected and the first 4 bytes of each page were redefined to be
a checksum.
The original checksum was so slow that an option to disable page checksum
was introduced for benchmarketing purposes.
The Intel Nehalem microarchitecture introduced the SSE4.2 instruction set
extension, which includes instructions for faster computation of CRC-32C.
In MySQL 5.6 (and MariaDB 10.0), innodb_checksum_algorithm=crc32 was
implemented to make of that. As that option was changed to be the default
in MySQL 5.7, a bug was found on big-endian platforms and some work-around
code was added to weaken that checksum further. MariaDB disables that
work-around by default since MDEV-17958.
Later, SIMD-accelerated CRC-32C has been implemented in MariaDB for POWER
and ARM and also for IA-32/AMD64, making use of carry-less multiplication
where available.
Long story short, innodb_checksum_algorithm=crc32 is faster and more secure
than the pre-MySQL 5.6 checksum, called innodb_checksum_algorithm=innodb.
It should have removed any need to use innodb_checksum_algorithm=none.
The setting innodb_checksum_algorithm=crc32 is the default in
MySQL 5.7 and MariaDB Server 10.2, 10.3, 10.4. In MariaDB 10.5,
MDEV-19534 made innodb_checksum_algorithm=full_crc32 the default.
It is even faster and more secure.
The default settings in MariaDB do allow old data files to be read,
no matter if a worse checksum algorithm had been used.
(Unfortunately, before innodb_checksum_algorithm=full_crc32,
the data files did not identify which checksum algorithm is being used.)
The non-default settings innodb_checksum_algorithm=strict_crc32 or
innodb_checksum_algorithm=strict_full_crc32 would only allow CRC-32C
checksums. The incompatibility with old data files is why they are
not the default.
The newest server not to support innodb_checksum_algorithm=crc32
were MySQL 5.5 and MariaDB 5.5. Both have reached their end of life.
A valid reason for using innodb_checksum_algorithm=innodb could have
been the ability to downgrade. If it is really needed, data files
can be converted with an older version of the innochecksum utility.
Because there is no good reason to allow data files to be written
with insecure checksums, we will reject those option values:
innodb_checksum_algorithm=none
innodb_checksum_algorithm=innodb
innodb_checksum_algorithm=strict_none
innodb_checksum_algorithm=strict_innodb
Furthermore, the following innochecksum options will be removed,
because only strict crc32 will be supported:
innochecksum --strict-check=crc32
innochecksum -C crc32
innochecksum --write=crc32
innochecksum -w crc32
If a user wishes to convert a data file to use a different checksum
(so that it might be used with the no-longer-supported
MySQL 5.5 or MariaDB 5.5, which do not support IMPORT TABLESPACE
nor system tablespace format changes that were made in MariaDB 10.3),
then the innochecksum tool from MariaDB 10.2, 10.3, 10.4, 10.5 or
MySQL 5.7 can be used.
Reviewed by: Thirunarayanan Balathandayuthapani