The initial fix only covered a part of Mariabackup.
This fix hardens InnoDB and XtraDB in a similar way, in order
to reduce the probability of mistaking a corrupted encrypted page
for a valid unencrypted one.
This is based on work by Thirunarayanan Balathandayuthapani.
fil_space_verify_crypt_checksum(): Assert that key_version!=0.
Let the callers guarantee that. Now that we have this assertion,
we also know that buf_page_is_zeroes() cannot hold.
Also, remove all diagnostic output and related parameters,
and let the relevant callers emit such messages.
Last but not least, validate the post-encryption checksum
according to the innodb_checksum_algorithm (only accepting
one checksum for the strict variants), and no longer
try to validate the page as if it was unencrypted.
buf_page_is_zeroes(): Move to the compilation unit of the only callers,
and declare static.
xb_fil_cur_read(), buf_page_check_corrupt(): Add a condition before
calling fil_space_verify_crypt_checksum(). This is a non-functional
change.
buf_dblwr_process(): Validate the page only as encrypted or unencrypted,
but not both.
Also, apply the MDEV-17957 changes to encrypted page checksums,
and remove error message output from the checksum function,
because these messages would be useless noise when mariabackup
is retrying reads of corrupted-looking pages, and not that
useful during normal server operation either.
The error messages in fil_space_verify_crypt_checksum()
should be refactored separately.
Background: Used encryption key_id is stored to encryption metadata
i.e. crypt_data that is stored on page 0 of the tablespace of the
table. crypt_data is created only if implicit encryption/not encryption
is requested i.e. ENCRYPTED=[YES|NO] table option is used
fil_create_new_single_table_tablespace on fil0fil.cc.
Later if encryption is enabled all tables that use default encryption
mode (i.e. no encryption table option is set) are encrypted with
default encryption key_id that is 1. See fil_crypt_start_encrypting_space on
fil0crypt.cc.
ha_innobase::check_table_options()
If default encryption is used and encryption is disabled, you may
not use nondefault encryption_key_id as it is not stored anywhere.
Stop supporting the additional *trunc.log files that were
introduced via MySQL 5.7 to MariaDB Server 10.2 and 10.3.
DB_TABLESPACE_TRUNCATED: Remove.
purge_sys.truncate: A new structure to track undo tablespace
file truncation.
srv_start(): Remove the call to buf_pool_invalidate(). It is
no longer necessary, given that we no longer access things in
ways that violate the ARIES protocol. This call was originally
added for innodb_file_format, and it may later have been necessary
for the proper function of the MySQL 5.7 TRUNCATE recovery, which
we are now removing.
trx_purge_cleanse_purge_queue(): Take the undo tablespace as a
parameter.
trx_purge_truncate_history(): Rewrite everything mostly in a
single function, replacing references to undo::Truncate.
recv_apply_hashed_log_recs(): If any redo log is to be applied,
and if the log_sys.log.subformat indicates that separately
logged truncate may have been used, refuse to proceed except if
innodb_force_recovery is set. We will still refuse crash-upgrade
if TRUNCATE TABLE was logged. Undo tablespace truncation would
only be logged in undo*trunc.log files, which we are no longer
checking for.
This is a merge from 10.2, but the 10.2 version of this will not
be pushed into 10.2 yet, because the 10.2 version would include
backports of MDEV-14717 and MDEV-14585, which would introduce
a crash recovery regression: Tables could be lost on
table-rebuilding DDL operations, such as ALTER TABLE,
OPTIMIZE TABLE or this new backup-friendly TRUNCATE TABLE.
The test innodb.truncate_crash occasionally loses the table due to
the following bug:
MDEV-17158 log_write_up_to() sometimes fails
This will change the InnoDB encrypted redo log format only.
Unencrypted redo log will keep using the MariaDB 10.3 format.
In the new encrypted redo log format, 4 additional bytes will
be reserved in the redo log block trailer for storing the
encryption key version.
For performance reasons, the encryption key rotation
(checking if the latest encryption key version is being used)
is only done at log_checkpoint().
LOG_HEADER_FORMAT_CURRENT: Remove.
LOG_HEADER_FORMAT_ENC_10_4: The encrypted 10.4 format.
LOG_BLOCK_KEY: The encryption key version field.
LOG_BLOCK_TRL_SIZE: Remove.
log_t: Add accessors framing_size(), payload_size(), trailer_offset(),
to be used instead of referring to LOG_BLOCK_TRL_SIZE.
log_crypt_t: An operation passed to log_crypt().
log_crypt(): Perform decryption, encryption, or encryption with key
rotation. Return an error if key rotation at decryption fails.
On encryption, keep using the previous key if the rotation fails.
At startup, old-format encrypted redo log may be written before
the redo log is upgraded (rebuilt) to the latest format.
log_write_up_to(): Add the parameter rotate_key=false.
log_checkpoint(): Invoke log_write_up_to() with rotate_key=true.
fil_iterate(): Invoke fil_encrypt_buf() correctly when
a ROW_FORMAT=COMPRESSED table with a physical page size of
innodb_page_size is being imported. Also, validate the page checksum
before decryption, and reduce the scope of some variables.
AbstractCallback::operator()(): Remove the parameter 'offset'.
The check for it in FetchIndexRootPages::operator() was basically
redundant and dead code since the previous refactoring.
MDEV-9931 introduced a counter for keeping track of reads of the
first page of InnoDB data files, because the original implementation
of data-at-rest-encryption for InnoDB introduced new code paths for
reading the pages.
Ultimately, the extra reads of the first page were removed, and
the encryption subsystem will be initialized whenever we first read
the first page of each data file, in fil_node_open_file(). It should not
be that interesting to observe how many times an InnoDB data file was
opened for the first time.
Added --skip-test-db option to mysql_install_db. If specified, no test
database created and relevant grants issued.
Removed --skip-auth-anonymous-user option of mysql_install_db. Now it is
covered by --skip-test-db.
Dropped some Debian patches that did the same.
Removed unused make_win_bin_dist.1, make_win_bin_dist and
mysql_install_db.pl.in.
Problem was that key rotation from encrypted to unecrypted was skipped
when encryption is disabled (i.e. set global innodb-encrypt-tables=OFF).
fil_crypt_needs_rotation
If encryption is disabled (i.e. innodb-encrypt-tables=off)
and there is tablespaces using default encryption (e.g.
system tablespace) that are still encrypted state we need
to rotate them from encrypted state to unencrypted state.
InnoDB always keeps all tablespaces in the fil_system cache.
The fil_system.LRU is only for closing file handles; the
fil_space_t and fil_node_t for all data files will remain
in main memory. Between startup to shutdown, they can only be
created and removed by DDL statements. Therefore, we can
let dict_table_t::space point directly to the fil_space_t.
dict_table_t::space_id: A numeric tablespace ID for the corner cases
where we do not have a tablespace. The most prominent examples are
ALTER TABLE...DISCARD TABLESPACE or a missing or corrupted file.
There are a few functional differences; most notably:
(1) DROP TABLE will delete matching .ibd and .cfg files,
even if they were not attached to the data dictionary.
(2) Some error messages will report file names instead of numeric IDs.
There still are many functions that use numeric tablespace IDs instead
of fil_space_t*, and many functions could be converted to fil_space_t
member functions. Also, Tablespace and Datafile should be merged with
fil_space_t and fil_node_t. page_id_t and buf_page_get_gen() could use
fil_space_t& instead of a numeric ID, and after moving to a single
buffer pool (MDEV-15058), buf_pool_t::page_hash could be moved to
fil_space_t::page_hash.
FilSpace: Remove. Only few calls to fil_space_acquire() will remain,
and gradually they should be removed.
mtr_t::set_named_space_id(ulint): Renamed from set_named_space(),
to prevent accidental calls to this slower function. Very few
callers remain.
fseg_create(), fsp_reserve_free_extents(): Take fil_space_t*
as a parameter instead of a space_id.
fil_space_t::rename(): Wrapper for fil_rename_tablespace_check(),
fil_name_write_rename(), fil_rename_tablespace(). Mariabackup
passes the parameter log=false; InnoDB passes log=true.
dict_mem_table_create(): Take fil_space_t* instead of space_id
as parameter.
dict_process_sys_tables_rec_and_mtr_commit(): Replace the parameter
'status' with 'bool cached'.
dict_get_and_save_data_dir_path(): Avoid copying the fil_node_t::name.
fil_ibd_open(): Return the tablespace.
fil_space_t::set_imported(): Replaces fil_space_set_imported().
truncate_t: Change many member function parameters to fil_space_t*,
and remove page_size parameters.
row_truncate_prepare(): Merge to its only caller.
row_drop_table_from_cache(): Assert that the table is persistent.
dict_create_sys_indexes_tuple(): Write SYS_INDEXES.SPACE=FIL_NULL
if the tablespace has been discarded.
row_import_update_discarded_flag(): Remove a constant parameter.
The following INFORMATION_SCHEMA views were unnecessarily retrieving
the data from the SYS_TABLESPACES table instead of directly fetching
it from the fil_system cache:
information_schema.innodb_tablespaces_encryption
information_schema.innodb_tablespaces_scrubbing
InnoDB always loads all tablespace metadata into memory at startup
and never evicts it while the tablespace exists.
With this fix, accessing these views will be much faster and use less
memory, and include data about all tablespaces, including undo
tablespaces.
The view information_schema.innodb_sys_tablespaces will still reflect
the contents of the SYS_TABLESPACES table.
fil_space_t::atomic_write_supported: Always set this flag for
TEMPORARY TABLESPACE and during IMPORT TABLESPACE. The page
writes during these operations are by definition not crash-safe
because they are not written to the redo log.
fil_space_t::use_doublewrite(): Determine if doublewrite should
be used.
buf_dblwr_update(): Add assertions, and let the caller check whether
doublewrite buffering is desired.
buf_flush_write_block_low(): Disable the doublewrite buffer for
the temporary tablespace and for IMPORT TABLESPACE.
fil_space_set_imported(), fil_node_open_file(), fil_space_create():
Initialize or revise the space->atomic_write_supported flag.
buf_page_io_complete(), buf_flush_write_complete(): Add the parameter
dblwr, to indicate whether doublewrite was used for writes.
buf_dblwr_sync_datafiles(): Remove an unnecessary flush of
persistent tablespaces when flushing temporary tablespaces.
(Move the call to buf_dblwr_flush_buffered_writes().)