In key rotation, we must initialize unallocated but previously
initialized pages, so that if encryption is enabled on a table,
all clear-text data for the page will eventually be overwritten.
But we should not rotate keys on pages that were never allocated
after the data file was created.
According to the latching order rules, after acquiring the
tablespace latch, no page latches of previously allocated user pages
may be acquired. So, key rotation should check the page allocation
status after acquiring the page latch, not before. But, the latching
order rules also prohibit accessing pages that were not allocated first,
and then acquiring the tablespace latch. Such behaviour would indeed
result in a deadlock when running the following tests:
encryption.innodb_encryption-page-compression
encryption.innodb-checksum-algorithm
Because the key rotation is accessing potentially unallocated pages, it
cannot reliably check if these pages were allocated. It can only check
the page header. If the page number is zero, we can assume that the
page is unallocated.
fil_crypt_rotate_pages(): Skip pages that are known to be uninitialized.
fil_crypt_rotate_page(): Detect uninitialized pages by FIL_PAGE_OFFSET.
Page 0 is never encrypted, and on other pages that are initialized,
FIL_PAGE_OFFSET must contain the page number.
fil_crypt_is_page_uninitialized(): Remove. It suffices to check the
page number field in fil_crypt_rotate_page().
In key rotation, we must initialize unallocated but previously
initialized pages, so that if encryption is enabled on a table,
all clear-text data for the page will eventually be overwritten.
But we should not rotate keys on pages that were never allocated
after the data file was created.
According to the latching order rules, after acquiring the
tablespace latch, no page latches of previously allocated user pages
may be acquired. So, key rotation should check the page allocation
status after acquiring the page latch, not before. But, the latching
order rules also prohibit accessing pages that were not allocated first,
and then acquiring the tablespace latch. Such behaviour would indeed
result in a deadlock when running the following tests:
encryption.innodb_encryption-page-compression
encryption.innodb-checksum-algorithm
Because the key rotation is accessing potentially unallocated pages, it
cannot reliably check if these pages were allocated. It can only check
the page header. If the page number is zero, we can assume that the
page is unallocated.
fil_crypt_rotate_page(): Detect uninitialized pages by FIL_PAGE_OFFSET.
Page 0 is never encrypted, and on other pages that are initialized,
FIL_PAGE_OFFSET must contain the page number.
fil_crypt_is_page_uninitialized(): Remove. It suffices to check the
page number field in fil_crypt_rotate_page().
xdes_get_descriptor_const(): New function, to get read-only access to
the allocation descriptor.
fseg_page_is_free(): Only acquire a shared latch on the tablespace,
not an exclusive latch. Calculate the descriptor page address before
acquiring the tablespace latch. If the page number is out of bounds,
return without fetching any page. Access only one descriptor page.
fsp_page_is_free(), fsp_page_is_free_func(): Remove.
Use fseg_page_is_free() instead.
fsp_init_file_page(): Move the debug parameter into a separate function.
btr_validate_level(): Remove the unused variable "seg".
The parameter --innodb-sync-debug, which is disabled by default,
aims to find potential deadlocks in InnoDB.
When the parameter is enabled, lots of tests failed. Most of these
failures were due to bogus diagnostics. But, as part of this fix,
we are also fixing a bug in error handling code and removing dead
code, and fixing cases where an uninitialized mutex was being
locked and unlocked.
dict_create_foreign_constraints_low(): Remove an extraneous
mutex_exit() call that could cause corruption in an error handling
path. Also, do not unnecessarily acquire dict_foreign_err_mutex.
Its only purpose is to control concurrent access to
dict_foreign_err_file.
row_ins_foreign_trx_print(): Replace a redundant condition with a
debug assertion.
srv_dict_tmpfile, srv_dict_tmpfile_mutex: Remove. The
temporary file is never being written to or read from.
log_free_check(): Allow SYNC_FTS_CACHE (fts_cache_t::lock)
to be held.
ha_innobase::inplace_alter_table(), row_merge_insert_index_tuples():
Assert that no unexpected latches are being held.
sync_latch_meta_init(): Properly initialize dict_operation_lock_key
at SYNC_DICT_OPERATION. dict_sys->mutex is SYNC_DICT, and
the now-removed SRV_DICT_TMPFILE was wrongly registered at
SYNC_DICT_OPERATION.
buf_block_init(): Correctly register buf_block_t::debug_latch.
It was previously misleadingly reported as LATCH_ID_DICT_FOREIGN_ERR.
latch_level_t: Correct the relative latching order of
SYNC_IBUF_PESS_INSERT_MUTEX,SYNC_INDEX_TREE and
SYNC_FILE_FORMAT_TAG,SYNC_DICT_OPERATION to avoid bogus failures.
row_drop_table_for_mysql(): Avoid accessing btr_defragment_mutex
if the defragmentation thread has not been started. This is the
case during fts_drop_orphaned_tables() in recv_recovery_rollback_active().
fil_space_destroy_crypt_data(): Avoid acquiring fil_crypt_threads_mutex
when it is uninitialized. We may have created crypt_data before the
mutex was created, and the mutex creation would be skipped if
InnoDB startup failed or --innodb-read-only was specified.
InnoDB stopped generating the MLOG_INIT_FILE_PAGE record in
MySQL 5.7.5. Starting with MySQL 5.7.9 (which was imported to
MariaDB Server 10.2.2), the InnoDB redo log format tag prevents
crash recovery from old-format redo logs.
Remove the dead code for dealing with MLOG_INIT_FILE_PAGE.
The previous fix (commit dcdc1c6d09)
should have removed the assertion from log_close(), because every
caller that requires this assertion is already asserting that log
writes are allowed. When fil_names_clear() is called, it must be
able to write the MLOG_CHECKPOINT records. The purpose of the debug
variable recv_no_log_write is to prevent the creation of page-level
redo log records, or modifications to persistent data.
Add suppressions for the read and decompression errors.
This may be 10.3 specific and related to MDEV-13536 which increases
purge activity. But it does not hurt to suppress rarely occurring
and plausible error messages for this fault-injection test already in 10.2.
backup_release(): New function, refactored from backup_finish().
Release some resources that may have been acquired by backup_startup()
and should be released even after a failed operation.
xtrabackup_backup_low(): Refactored from xtrabackup_backup_func().
xtrabackup_backup_func(): Always call backup_release() after calling
backup_start().
The test mariabackup.incremental_backup revealed a memory leak
in have_queries_to_wait_for(). The problem is that
xb_mysql_query() is being invoked with bool use_result=true
but the result is not being freed by mysql_store_result().
There are similar leaks in other functions.
have_queries_to_wait_for(): Invoke mysql_free_result() to
clean up after the mysql_store_result() that was invoked
by xb_mysql_query().
select_incremental_lsn_from_history(): Plug the leak on failure.
kill_long_queries(): Plug the memory leak.
(This function always leaked memory when it was called.)
Problem was that if column was created in alter table when
it was refered again it was not tried to find from list
of current columns.
mysql_prepare_alter_table:
There is two cases
(1) If alter table adds a new column and then later alter
changes the field definition, there was no check from
list of new columns, instead an incorrect error was given.
(2) If alter table adds a new column and then later alter
changes the default, there was no check from list of
new columns, instead an incorrect error was given.
The fix broke mariabackup --prepare --incremental.
The restore of an incremental backup starts up (parts of) InnoDB twice.
First, all data files are discovered for applying .delta files. Then,
after the .delta files have been applied, InnoDB will be restarted
more completely, so that the redo log records will be applied via the
buffer pool.
During the first startup, the buffer pool is not initialized, and thus
trx_rseg_get_n_undo_tablespaces() must not be invoked. The apply of
the .delta files will currently assume that the --innodb-undo-tablespaces
option correctly specifies the number of undo tablespace files, just
like --backup does.
The second InnoDB startup of --prepare for applying the redo log will
properly invoke trx_rseg_get_n_undo_tablespaces().
enum srv_operation_mode: Add SRV_OPERATION_RESTORE_DELTA for
distinguishing the apply of .delta files from SRV_OPERATION_RESTORE.
srv_undo_tablespaces_init(): In mariabackup --prepare --incremental,
in the initial SRV_OPERATION_RESTORE_DELTA phase, do not invoke
trx_rseg_get_n_undo_tablespaces() because the buffer pool or the
redo logs are not available. Instead, blindly rely on the parameter
--innodb-undo-tablespaces.
fails with ERROR_INVALID_FUNCTION
This DeviceIoControl seems to happen on different boxes from time to time,
and there is not much user can do about it.
Instead of error, log a single INFO message, so it does not disturb users
much.
Page read could return DB_PAGE_CORRUPTED error that should
be reported and passed to upper layer. In case of unknown
error code we should print both number and string.
Apply this patch from upstream:
commit 2c8deddfb67f1cd41ea3d1ac95aa1aa9327e3406
Author: Yoshinori Matsunobu <yoshinorim@users.noreply.github.com>
Date: Tue Aug 15 16:21:58 2017 -0700
Set exclusive_manual_compaction = false on manual compactions
Summary:
Combining exclusive manual compaction and
non-exclusive manual compaction may hit rocksdb assertion errors.
This diff makes all MyRocks internal manual compactions non exclusive.
Closes https://github.com/facebook/mysql-5.6/pull/682
Differential Revision: D5633619
Pulled By: yoshinorim
fbshipit-source-id: a90786d
The bug was caused by a defect of the patch for the bug 11081.
The patch was actually a port of the fix this bug from the mysql
code line. Later a correction of this fix was added to the
mysql code. Here's the comment this correction was provided with:
Bug#16499751: Opening cursor on SELECT in stored procedure causes segfault
This is a regression from the fix of bug#14740889.
The fix started using another set of expressions as the source for
the temporary table used for the materialized cursor. However,
JOIN::make_tmp_tables_info() calls setup_copy_fields() which creates
an Item_copy wrapper object on top of the function being selected.
The Item_copy objects were not properly handled by create_tmp_table -
they were simply ignored. This patch creates temporary table fields
based on the underlying item of the Item_copy objects.
The test case for the bug 13346 was taken from mdev-13380.
The test mis-used MTR's "restart the server if it crashed or exited"
feature to try starting MyRocks plugin with invalid arguments.
Changed the test to use the --default-storage-engine=myisam which
allows the server to start when MyRocks fails to start.
This removes the need to "start the server with the arguments which
will caused it to fail to start", and so removes the race conditions
with MTR server restart code and mysqld.*.expect file.
Merged from mysql-wsrep-bugs following:
GCF-1058 MTR test galera.MW-86 fails on repeated runs
Wait for the sync point sync.wsrep_apply_cb to be reached before
executing the test and clearing the debug flag sync.wsrep_apply_cb.
The race scenario:
Intended behavior:
node2: set sync.wsrep_apply_cb in order to start waiting in the background INSERT
node1: INSERT start
node2 (background): INSERT start
node1: INSERT end
node2: send signal to background INSERT: "stop waiting and continue executing"
node2: clear sync.wsrep_apply_cb as no longer needed
node2 (background): consume the signal
node2 (background): INSERT end
node2: DROP TABLE
node2: check no pending signals are left - ok
What happens occasionally (unexpected):
node2: set sync.wsrep_apply_cb in order to start waiting in the background INSERT
node1: INSERT start
node2 (background): INSERT start
node1: INSERT end
// The background INSERT still has _not_ reached the place where it starts
// waiting for the signal:
// DBUG_EXECUTE_IF("sync.wsrep_apply_cb", "now wait_for...");
node2: send signal to background INSERT: "stop waiting and continue executing"
node2: clear sync.wsrep_apply_cb as no longer needed
// The background INSERT reaches DBUG_EXECUTE_IF("sync.wsrep_apply_cb", ...)
// but sync.wsrep_apply_cb has already been cleared and the "wait" code is not
// executed. The signal remains unconsumed.
node2 (background): INSERT end
node2: DROP TABLE
node2: check no pending signals are left - failure, signal.wsrep_apply_cb is
pending (not consumed)
Remove MW-360 test case as it is not intended for MariaDB (uses
MySQL GTID).
row_ins_check_foreign_constraint(): On timeout,
return DB_LOCK_WAIT_TIMEOUT instead of DB_LOCK_WAIT,
so that the lock wait will be properly terminated.
Also, replace some redundant assignments.
It looks like this bug was introduced in MySQL 5.7.8 by:
commit a97f6b91227c7e0fc3151cfe5421891e79c12d19
Author: Annamalai Gurusami <annamalai.gurusami@oracle.com>
Date: Tue Jun 9 16:02:31 2015 +0530
Bug #20953265 INNODB: FAILING ASSERTION: RESULT != FTS_INVALID
MDEV-13498 is a performance regression that was introduced in MariaDB 10.2.2
by commit fec844aca8
which introduced some Galera-specific conditions that were being
evaluated even if the write-set replication was not enabled.
MDEV-13246 Stale rows despite ON DELETE CASCADE constraint
is a correctness regression that was introduced by the same commit.
Especially the subcondition
!(parent && que_node_get_type(parent) == QUE_NODE_UPDATE)
which is equivalent to
!parent || que_node_get_type(parent) != QUE_NODE_UPDATE
makes little sense. If parent==NULL, the evaluation would proceed to the
std::find() expression, which would dereference parent. Because no SIGSEGV
was observed related to this, we can conclude that parent!=NULL always
holds. But then, the condition would be equivalent to
que_node_get_type(parent) != QUE_NODE_UPDATE
which would not make sense either, because the std::find() expression
is actually assuming the opposite when casting parent to upd_node_t*.
It looks like this condition never worked properly, or that
it was never properly tested, or both.
wsrep_must_process_fk(): Helper function to check if FOREIGN KEY
constraints need to be processed. Only evaluate the costly std::find()
expression when write-set replication is enabled.
Also, rely on operator<<(std::ostream&, const id_name_t&) and
operator<<(std::ostream&, const table_name_t&) for pretty-printing
index and table names.
row_upd_sec_index_entry(): Add !wsrep_thd_is_BF() to the condition.
This is applying part of "Galera MW-369 FK fixes"
f37b79c6da
that is described by the following part of the commit comment:
additionally: skipping wsrep_row_upd_check_foreign_constraint if thd has
BF, essentially is applier or replaying
This FK check would be needed only for populating parent row FK keys
in write set, so no use for appliers
trx_set_rw_mode(): Check the flag high_level_read_only instead
of testing srv_force_recovery (innodb_force_recovery) directly.
There is no need to prevent the creation of read-write transactions
if innodb_force_recovery=3 is used. Yes, in that mode any recovered
incomplete transactions will not be rolled back, but these transactions
will continue to hold locks on the records that they have modified.
If the new read-write transactions hit conflicts with already existing
(possibly recovered) transactions, the lock wait timeout mechanism
will work just fine.