1
0
mirror of https://github.com/MariaDB/server.git synced 2025-08-08 11:22:35 +03:00
Commit Graph

642 Commits

Author SHA1 Message Date
Marko Mäkelä
b5852ffbee MDEV-27735 Deprecate the parameter innodb_change_buffering
As a follow-up to MDEV-27734 Set innodb_change_buffering=none by default
we mark the option innodb_change_buffering deprecated, to inform users
of its future removal.
2022-02-14 10:29:18 +02:00
Marko Mäkelä
6c2a7d43df Merge mariadb-10.8.2 into 10.8 2022-02-14 08:37:00 +02:00
Sergei Golubchik
7dade0adbd Merge branch '10.7' into 10.8 2022-02-10 21:33:14 +01:00
Sergei Golubchik
65f602310c Merge branch '10.6' into 10.7 2022-02-10 21:16:50 +01:00
Sergei Golubchik
e3894f5d39 Merge branch '10.5 into 10.6 2022-02-10 21:07:03 +01:00
Sergei Golubchik
9aa3564e8a Merge branch '10.4' into 10.5 2022-02-10 21:04:51 +01:00
Sergei Golubchik
b4477ae73c Merge branch '10.3' into 10.4 2022-02-10 20:39:13 +01:00
Sergei Golubchik
a36fc80aeb Merge branch '10.2' into 10.3 2022-02-10 20:23:56 +01:00
Marko Mäkelä
a635c40648 MDEV-27774 Reduce scalability bottlenecks in mtr_t::commit()
A prominent bottleneck in mtr_t::commit() is log_sys.mutex between
log_sys.append_prepare() and log_close().

User-visible change: The minimum innodb_log_file_size will be
increased from 1MiB to 4MiB so that some conditions can be
trivially satisfied.

log_sys.latch (log_latch): Replaces log_sys.mutex and
log_sys.flush_order_mutex. Copying mtr_t::m_log to
log_sys.buf is protected by a shared log_sys.latch.
Writes from log_sys.buf to the file system will be protected
by an exclusive log_sys.latch.

log_sys.lsn_lock: Protects the allocation of log buffer
in log_sys.append_prepare().

sspin_lock: A simple spin lock, for log_sys.lsn_lock.

Thanks to Vladislav Vaintroub for suggesting this idea, and for
reviewing these changes.

mariadb-backup: Replace some use of log_sys.mutex with recv_sys.mutex.

buf_pool_t::insert_into_flush_list(): Implement sorting of flush_list
because ordering is otherwise no longer guaranteed. Ordering by LSN
is needed for the proper operation of redo log checkpoints.

log_sys.append_prepare(): Advance log_sys.lsn and log_sys.buf_free by
the length, and return the old values. Also increment write_to_buf,
which was previously done in log_close().

mtr_t::finish_write(): Obtain the buffer pointer from
log_sys.append_prepare().

log_sys.buf_free: Make the field Atomic_relaxed,
to simplify log_flush_margin(). Use only loads and stores
to avoid costly read-modify-write atomic operations.

buf_pool.flush_list_requests: Replaces
export_vars.innodb_buffer_pool_write_requests
and srv_stats.buf_pool_write_requests.
Protected by buf_pool.flush_list_mutex.

buf_pool_t::insert_into_flush_list(): Do not invoke page_cleaner_wakeup().
Let the caller do that after a batch of calls.

recv_recover_page(): Invoke a minimal part of
buf_pool.insert_into_flush_list().

ReleaseBlocks::modified: A number of pages added to buf_pool.flush_list.

ReleaseBlocks::operator(): Merge buf_flush_note_modification() here.

log_t::set_capacity(): Renamed from log_set_capacity().
2022-02-10 16:37:12 +02:00
Sergei Petrunia
5c89386fdb MDEV-17785: Window functions not working in ONLY_FULL_GROUP_BY mode
(Backport Varun Gupta's patch + edit the commit comment)

Name resolution code produced errors for valid queries with window
functions (but not for queries which used aggregate functions as
window functions).

Name resolution code worked incorrectly, because window function
objects had is_window_func_sum_expr()=false. This was so, because
mark_as_window_func_sum_expr() was only called for aggregate functions
used as window functions.

The fix is to call it for any window function.
2022-02-07 21:43:42 +03:00
Marko Mäkelä
685d958e38 MDEV-14425 Improve the redo log for concurrency
The InnoDB redo log used to be formatted in blocks of 512 bytes.
The log blocks were encrypted and the checksum was calculated while
holding log_sys.mutex, creating a serious scalability bottleneck.

We remove the fixed-size redo log block structure altogether and
essentially turn every mini-transaction into a log block of its own.
This allows encryption and checksum calculations to be performed
on local mtr_t::m_log buffers, before acquiring log_sys.mutex.
The mutex only protects a memcpy() of the data to the shared
log_sys.buf, as well as the padding of the log, in case the
to-be-written part of the log would not end in a block boundary of
the underlying storage. For now, the "padding" consists of writing
a single NUL byte, to allow recovery and mariadb-backup to detect
the end of the circular log faster.

Like the previous implementation, we will overwrite the last log block
over and over again, until it has been completely filled. It would be
possible to write only up to the last completed block (if no more
recent write was requested), or to write dummy FILE_CHECKPOINT records
to fill the incomplete block, by invoking the currently disabled
function log_pad(). This would require adjustments to some logic around
log checkpoints, page flushing, and shutdown.

An upgrade after a crash of any previous version is not supported.
Logically empty log files from a previous version will be upgraded.

An attempt to start up InnoDB without a valid ib_logfile0 will be
refused. Previously, the redo log used to be created automatically
if it was missing. Only with with innodb_force_recovery=6, it is
possible to start InnoDB in read-only mode even if the log file
does not exist. This allows the contents of a possibly corrupted
database to be dumped.

Because a prepared backup from an earlier version of mariadb-backup
will create a 0-sized log file, we will allow an upgrade from such
log files, provided that the FIL_PAGE_FILE_FLUSH_LSN in the system
tablespace looks valid.

The 512-byte log checkpoint blocks at 0x200 and 0x600 will be replaced
with 64-byte log checkpoint blocks at 0x1000 and 0x2000.

The start of log records will move from 0x800 to 0x3000. This allows us
to use 4096-byte aligned blocks for all I/O in a future revision.

We extend the MDEV-12353 redo log record format as follows.

(1) Empty mini-transactions or extra NUL bytes will not be allowed.
(2) The end-of-minitransaction marker (a NUL byte) will be replaced
with a 1-bit sequence number, which will be toggled each time when the
circular log file wraps back to the beginning.
(3) After the sequence bit, a CRC-32C checksum of all data
(excluding the sequence bit) will written.
(4) If the log is encrypted, 8 bytes will be written before
the checksum and included in it. This is part of the
initialization vector (IV) of encrypted log data.
(5) File names, page numbers, and checkpoint information will not be
encrypted. Only the payload bytes of page-level log will be encrypted.
The tablespace ID and page number will form part of the IV.
(6) For padding, arbitrary-length FILE_CHECKPOINT records may be written,
with all-zero payload, and with the normal end marker and checksum.
The minimum size is 7 bytes, or 7+8 with innodb_encrypt_log=ON.

In mariadb-backup and in Galera snapshot transfer (SST) scripts, we will
no longer remove ib_logfile0 or create an empty ib_logfile0. Server startup
will require a valid log file. When resizing the log, we will create
a logically empty ib_logfile101 at the current LSN and use an atomic rename
to replace ib_logfile0 with it. See the test innodb.log_file_size.

Because there is no mandatory padding in the log file, we are able
to create a dummy log file as of an arbitrary log sequence number.
See the test mariabackup.huge_lsn.

The parameter innodb_log_write_ahead_size and the
INFORMATION_SCHEMA.INNODB_METRICS counter log_padded will be removed.

The minimum value of innodb_log_buffer_size will be increased to 2MiB
(because log_sys.buf will replace recv_sys.buf) and the increment
adjusted to 4096 bytes (the maximum log block size).

The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed:

os_log_fsyncs
os_log_pending_fsyncs
log_pending_log_flushes
log_pending_checkpoint_writes

The following status variables will be removed:

Innodb_os_log_fsyncs (this is included in Innodb_data_fsyncs)
Innodb_os_log_pending_fsyncs (this was limited to at most 1 by design)

log_sys.get_block_size(): Return the physical block size of the log file.
This is only implemented on Linux and Microsoft Windows for now, and for
the power-of-2 block sizes between 64 and 4096 bytes (the minimum and
maximum size of a checkpoint block). If the block size is anything else,
the traditional 512-byte size will be used via normal file system
buffering.

If the file system buffers can be bypassed, a message like the following
will be issued:

InnoDB: File system buffers for log disabled (block size=512 bytes)
InnoDB: File system buffers for log disabled (block size=4096 bytes)

This has been tested on Linux and Microsoft Windows with both sizes.

On Linux, only enable O_DIRECT on the log for innodb_flush_method=O_DSYNC.
Tests in 3 different environments where the log is stored in a device
with a physical block size of 512 bytes are yielding better throughput
without O_DIRECT. This could be due to the fact that in the event the
last log block is being overwritten (if multiple transactions would
become durable at the same time, and each of will write a small
number of bytes to the last log block), it should be faster to re-copy
data from log_sys.buf or log_sys.flush_buf to the kernel buffer,
to be finally written at fdatasync() time.

The parameter innodb_flush_method=O_DSYNC will imply O_DIRECT for
data files. This option will enable O_DIRECT on the log file on Linux.
It may be unsafe to use when the storage device does not support
FUA (Force Unit Access) mode.

When the server is compiled WITH_PMEM=ON, we will use memory-mapped
I/O for the log file if the log resides on a "mount -o dax" device.
We will identify PMEM in a start-up message:

InnoDB: log sequence number 0 (memory-mapped); transaction id 3

On Linux, we will also invoke mmap() on any ib_logfile0 that resides
in /dev/shm, effectively treating the log file as persistent memory.
This should speed up "./mtr --mem" and increase the test coverage of
PMEM on non-PMEM hardware. It also allows users to estimate how much
the performance would be improved by installing persistent memory.
On other tmpfs file systems such as /run, we will not use mmap().

mariadb-backup: Eliminated several variables. We will refer
directly to recv_sys and log_sys.

backup_wait_for_lsn(): Detect non-progress of
xtrabackup_copy_logfile(). In this new log format with
arbitrary-sized blocks, we can only detect log file overrun
indirectly, by observing that the scanned log sequence number
is not advancing.

xtrabackup_copy_logfile(): On PMEM, do not modify the sequence bit,
because we are not allowed to modify the server's log file, and our
memory mapping is read-only.

trx_flush_log_if_needed_low(): Do not use the callback on pmem.
Using neither flush_lock nor write_lock around PMEM writes seems
to yield the best performance. The pmem_persist() calls may
still be somewhat slower than the pwrite() and fdatasync() based
interface (PMEM mounted without -o dax).

recv_sys_t::buf: Remove. We will use log_sys.buf for parsing.

recv_sys_t::MTR_SIZE_MAX: Replaces RECV_SCAN_SIZE.

recv_sys_t::file_checkpoint: Renamed from mlog_checkpoint_lsn.

recv_sys_t, log_sys_t: Removed many data members.

recv_sys.lsn: Renamed from recv_sys.recovered_lsn.
recv_sys.offset: Renamed from recv_sys.recovered_offset.
log_sys.buf_size: Replaces srv_log_buffer_size.

recv_buf: A smart pointer that wraps log_sys.buf[recv_sys.offset]
when the buffer is being allocated from the memory heap.

recv_ring: A smart pointer that wraps a circular log_sys.buf[] that is
backed by ib_logfile0. The pointer will wrap from recv_sys.len
(log_sys.file_size) to log_sys.START_OFFSET. For the record that
wraps around, we may copy file name or record payload data to
the auxiliary buffer decrypt_buf in order to have a contiguous
block of memory. The maximum size of a record is less than
innodb_page_size bytes.

recv_sys_t::parse(): Take the smart pointer as a template parameter.
Do not temporarily add a trailing NUL byte to FILE_ records, because
we are not supposed to modify the memory-mapped log file. (It is
attached in read-write mode already during recovery.)

recv_sys_t::parse_mtr(): Wrapper for recv_sys_t::parse().

recv_sys_t::parse_pmem(): Like parse_mtr(), but if PREMATURE_EOF would be
returned on PMEM, use recv_ring to wrap around the buffer to the start.

mtr_t::finish_write(), log_close(): Do not enforce log_sys.max_buf_free
on PMEM, because it has no meaning on the mmap-based log.

log_sys.write_to_buf: Count writes to log_sys.buf. Replaces
srv_stats.log_write_requests and export_vars.innodb_log_write_requests.
Protected by log_sys.mutex. Updated consistently in log_close().
Previously, mtr_t::commit() conditionally updated the count,
which was inconsistent.

log_sys.write_to_log: Count swaps of log_sys.buf and log_sys.flush_buf,
for writing to log_sys.log (the ib_logfile0). Replaces
srv_stats.log_writes and export_vars.innodb_log_writes.
Protected by log_sys.mutex.

log_sys.waits: Count waits in append_prepare(). Replaces
srv_stats.log_waits and export_vars.innodb_log_waits.

recv_recover_page(): Do not unnecessarily acquire
log_sys.flush_order_mutex. We are inserting the blocks in arbitary
order anyway, to be adjusted in recv_sys.apply(true).

We will change the definition of flush_lock and write_lock to
avoid potential false sharing. Depending on sizeof(log_sys) and
CPU_LEVEL1_DCACHE_LINESIZE, the flush_lock and write_lock could
share a cache line with each other or with the last data members
of log_sys.

Thanks to Matthias Leich for providing https://rr-project.org traces
for various failures during the development, and to
Thirunarayanan Balathandayuthapani for his help in debugging
some of the recovery code. And thanks to the developers of the
rr debugger for a tool without which extensive changes to InnoDB
would be very challenging to get right.

Thanks to Vladislav Vaintroub for useful feedback and
to him, Axel Schwenke and Krunal Bauskar for testing the performance.
2022-01-21 16:03:47 +02:00
Sergei Krivonos
73df7a3009 MDEV-27036: resolve duplicated key issues of JSON tracing outputs:
MDEV-27036: repeated "table" key resolve for print_explain_json

MDEV-27036: duplicated keys in best_access_path

MDEV-27036: Explain_aggr_filesort::print_json_members: resolve duplicated "filesort" member in Json object

MDEV-27036: Explain_basic_join::
            print_explain_json_interns fixed start_dups_weedout case for main.explain_json test
2021-11-26 15:11:06 +02:00
Marko Mäkelä
7e8a13d9d7 Merge 10.6 into 10.7 2021-11-19 17:45:52 +02:00
Marko Mäkelä
aaef2e1d8c MDEV-27058: Reduce the size of buf_block_t and buf_page_t
buf_page_t::frame: Moved from buf_block_t::frame.
All 'thin' buf_page_t describing compressed-only ROW_FORMAT=COMPRESSED
pages will have frame=nullptr, while all 'fat' buf_block_t
will have a non-null frame pointing to aligned innodb_page_size bytes.
This eliminates the need for separate states for
BUF_BLOCK_FILE_PAGE and BUF_BLOCK_ZIP_PAGE.

buf_page_t:🔒 Moved from buf_block_t::lock. That is, all block
descriptors will have a page latch. The IO_PIN state that was used
for discarding or creating the uncompressed page frame of a
ROW_FORMAT=COMPRESSED block is replaced by a combination of read-fix
and page X-latch.

page_zip_des_t::fix: Replaces state_, buf_fix_count_, io_fix_, status
of buf_page_t with a single std::atomic<uint32_t>. All modifications
will use store(), fetch_add(), fetch_sub(). This space was previously
wasted to alignment on 64-bit systems. We will use the following encoding
that combines a state (partly read-fix or write-fix) and a buffer-fix
count:

buf_page_t::NOT_USED=0 (previously BUF_BLOCK_NOT_USED)
buf_page_t::MEMORY=1 (previously BUF_BLOCK_MEMORY)
buf_page_t::REMOVE_HASH=2 (previously BUF_BLOCK_REMOVE_HASH)
buf_page_t::FREED=3 + fix: pages marked as freed in the file
buf_page_t::UNFIXED=1U<<29 + fix: normal pages
buf_page_t::IBUF_EXIST=2U<<29 + fix: normal pages; may need ibuf merge
buf_page_t::REINIT=3U<<29 + fix: reinitialized pages (skip doublewrite)
buf_page_t::READ_FIX=4U<<29 + fix: read-fixed pages (also X-latched)
buf_page_t::WRITE_FIX=5U<<29 + fix: write-fixed pages (also U-latched)
buf_page_t::WRITE_FIX_IBUF=6U<<29 + fix: write-fixed; may have ibuf
buf_page_t::WRITE_FIX_REINIT=7U<<29 + fix: write-fixed (no doublewrite)

buf_page_t::write_complete(): Change WRITE_FIX or WRITE_FIX_REINIT to
UNFIXED, and WRITE_FIX_IBUF to IBUF_EXIST, before releasing the U-latch.

buf_page_t::read_complete(): Renamed from buf_page_read_complete().
Change READ_FIX to UNFIXED or IBUF_EXIST, before releasing the X-latch.

buf_page_t::can_relocate(): If the page latch is being held or waited for,
or the block is buffer-fixed or io-fixed, return false. (The condition
on the page latch is new.)

Outside buf_page_get_gen(), buf_page_get_low() and buf_page_free(), we
will acquire the page latch before fix(), and unfix() before unlocking.

buf_page_t::flush(): Replaces buf_flush_page(). Optimize the
handling of FREED pages.

buf_pool_t::release_freed_page(): Assume that buf_pool.mutex is held
by the caller.

buf_page_t::is_read_fixed(), buf_page_t::is_write_fixed(): New predicates.

buf_page_get_low(): Ignore guesses that are read-fixed because they
may not yet be registered in buf_pool.page_hash and buf_pool.LRU.

buf_page_optimistic_get(): Acquire latch before buffer-fixing.

buf_page_make_young(): Leave read-fixed blocks alone, because they
might not be registered in buf_pool.LRU yet.

recv_sys_t::recover_deferred(), recv_sys_t::recover_low():
Possibly fix MDEV-26326, by holding a page X-latch instead of
only buffer-fixing the page.
2021-11-18 17:47:19 +02:00
Marko Mäkelä
0a168398a0 Merge 10.5 into 10.6 2021-11-17 15:03:47 +02:00
Marko Mäkelä
5489ce0ae1 Merge 10.4 into 10.5 2021-11-17 14:49:12 +02:00
Marko Mäkelä
70e788b1e5 Merge 10.3 into 10.4 2021-11-17 13:59:42 +02:00
Marko Mäkelä
9962cda527 Merge 10.2 into 10.3 2021-11-17 13:55:54 +02:00
Eugene Kosov
ed0a224b3d MDEV-26747 improve corruption check for encrypted tables on ALTER IMPORT
fil_space_decrypt(): change signature to return status via dberr_t only.
Also replace impossible condition with an assertion and prove it via
test cases.
2021-11-17 15:49:22 +06:00
Marko Mäkelä
06988bdcaa Merge 10.6 into 10.7 2021-11-09 09:40:29 +02:00
Marko Mäkelä
25ac047baf Merge 10.5 into 10.6 2021-11-09 09:11:50 +02:00
Marko Mäkelä
9c18b96603 Merge 10.4 into 10.5 2021-11-09 08:50:33 +02:00
Marko Mäkelä
47ab793d71 Merge 10.3 into 10.4 2021-11-09 08:40:14 +02:00
Marko Mäkelä
524b4a89da Merge 10.2 into 10.3 2021-11-09 08:26:59 +02:00
Marko Mäkelä
8c7e551da1 Remove restarts from encrypt_and_grep test 2021-11-09 08:08:29 +02:00
Sergei Golubchik
a52cd4aeda InnoDB: send "corrupted" error to the user, not only to the log 2021-10-27 15:55:14 +02:00
Marko Mäkelä
d4a89b9262 Merge 10.5 into 10.6 2021-10-27 10:06:02 +03:00
Marko Mäkelä
44f9736e0b Merge 10.4 into 10.5 2021-10-27 09:48:22 +03:00
Eugene Kosov
d74d95961a MDEV-18543 IMPORT TABLESPACE fails after instant DROP COLUMN
ALTER TABLE IMPORT doesn't properly handle instant alter metadata.
This patch makes IMPORT read, parse and apply instant alter metadata at the
very beginning of operation. So, cases when source table has some metadata
and destination table doesn't have it now works fine.

DISCARD already removes instant metadata so importing normal table into
instant table worked fine before this patch.

decrypt_decompress(): decrypts and decompresses page if needed

handle_instant_metadata(): this should be the first thing to read source
table. Basically, it applies instant metadata to a destination
dict_table_t object. This is the first thing to read FSP flags so
all possible checks of it were moved to this function.

PageConverter::update_index_page(): it doesn't now read instant metadata.
This logic were moved into handle_instant_metadata()

row_import::match_flags(): this is a first part row_import::match_schema().
As a separate function it's used by handle_instant_metadata().

fil_space_t::is_full_crc32_compressed(): added convenient function

ha_innobase::discard_or_import_tablespace(): do not reload table definition
to read instant metadata because handle_instant_metadata() does it better.
The reverted code was originally added in
4e7ee166a9

ANONYMOUS_VAR: this is a handy thing to use along with make_scope_exit()

full_crc32_import.test shows different results, because no
dict_table_close() and dict_table_open_on_id() happens.
Thus, SHOW CREATE TABLE shows a little bit older table definition.
2021-10-26 22:50:58 +06:00
Marko Mäkelä
d95361107c Merge 10.5 into 10.6 2021-09-24 14:38:52 +03:00
Marko Mäkelä
7e2b42324c Merge 10.4 into 10.5 2021-09-24 08:42:23 +03:00
Marko Mäkelä
9024498e88 Merge 10.3 into 10.4 2021-09-22 18:26:54 +03:00
Marko Mäkelä
b46cf33ab8 Merge 10.2 into 10.3 2021-09-22 18:01:41 +03:00
Thirunarayanan Balathandayuthapani
496d3dded4 MDEV-15675 encryption.innodb_encryption failed in buildbot with timeout
Test case fail to include undo tablespace while waiting for the
encryption thread to encrypt all existing tablespace
2021-09-17 19:37:57 +05:30
Marko Mäkelä
f3fcf5f45c Merge 10.5 to 10.6 2021-08-19 12:25:00 +03:00
Marko Mäkelä
4a25957274 Merge 10.4 into 10.5 2021-08-18 18:22:35 +03:00
Marko Mäkelä
f84e28c119 Merge 10.3 into 10.4 2021-08-18 16:51:52 +03:00
Marko Mäkelä
cd65845a0e Merge 10.2 into 10.3
MDEV-18734 FIXME: vcol.partition triggers ASAN heap-use-after-free
2021-08-18 12:26:58 +03:00
Thirunarayanan Balathandayuthapani
89445b64fe MDEV-26131 SEGV in ha_innobase::discard_or_import_tablespace
Import operation without .cfg file fails when there is mismatch of index
between metadata table and .ibd file. Moreover, MDEV-19022 shows
that InnoDB can end up with index tree where non-leaf page has only
one child page. So it is unsafe to find the secondary index root page.

This patch does the following when importing the table without .cfg file:

1) If the metadata contains more than one index then InnoDB stops
the import operation and report the user to drop all secondary
indexes before doing import operation.

2) When the metadata contain only clustered index then InnoDB finds the
index id by reading page 0 & page 3 instead of traversing the
whole tablespace.
2021-08-16 15:32:07 +05:30
Oleksandr Byelkin
6efb5e9f5e Merge branch '10.5' into 10.6 2021-08-02 10:11:41 +02:00
Oleksandr Byelkin
ae6bdc6769 Merge branch '10.4' into 10.5 2021-07-31 23:19:51 +02:00
Oleksandr Byelkin
7841a7eb09 Merge branch '10.3' into 10.4 2021-07-31 22:59:58 +02:00
Anel Husakovic
b30f26e3fe Record tempfiles_encrypted test failure 2021-07-22 09:19:18 +02:00
Sergei Golubchik
6190a02f35 Merge branch '10.2' into 10.3 2021-07-21 20:11:07 +02:00
Marko Mäkelä
4dfec8b230 Merge 10.5 into 10.6 2021-06-21 17:49:33 +03:00
Marko Mäkelä
a42c80bd48 Merge 10.4 into 10.5 2021-06-21 14:22:22 +03:00
Marko Mäkelä
d3e4fae797 Merge 10.3 into 10.4 2021-06-21 12:38:25 +03:00
Marko Mäkelä
241d30d3fa After-merge fixes for MDEV-14180 2021-06-21 12:09:00 +03:00
Marko Mäkelä
c9a85fb1b1 Merge 10.2 into 10.3 2021-06-21 09:07:40 +03:00
Thirunarayanan Balathandayuthapani
8c7d8b716c MDEV-14180 Automatically disable key rotation checks for file_key_managment plugin
Problem:
=======
- InnoDB iterates the fil_system space list to encrypt the
tablespace in case of key rotation. But it is not
necessary for any encryption plugin which doesn't do
key version rotation.

Solution:
=========
- Introduce a new variable called srv_encrypt_rotate to
indicate whether encryption plugin does key rotation

fil_space_crypt_t::key_get_latest_version(): Enable the
srv_encrypt_rotate only once if current key version is
higher than innodb_encyrption_rotate_key_age

fil_crypt_must_default_encrypt(): Default encryption tables
should be added to default_encryp_tables list if
innodb_encyrption_rotate_key_age is zero and encryption
plugin doesn't do key version rotation

fil_space_create(): Add the newly created space to
default_encrypt_tables list if
fil_crypt_must_default_encrypt() returns true

Removed the nondeterministic select from
innodb-key-rotation-disable test. By default,
InnoDB adds the tablespace to the rotation list and
background crypt thread does encryption of tablespace.
So these select doesn't give reliable results.
2021-06-15 13:15:32 +05:30