Mutex order violation when wsrep bf thread kills a conflicting trx,
the stack is
wsrep_thd_LOCK()
wsrep_kill_victim()
lock_rec_other_has_conflicting()
lock_clust_rec_read_check_and_lock()
row_search_mvcc()
ha_innobase::index_read()
ha_innobase::rnd_pos()
handler::ha_rnd_pos()
handler::rnd_pos_by_record()
handler::ha_rnd_pos_by_record()
Rows_log_event::find_row()
Update_rows_log_event::do_exec_row()
Rows_log_event::do_apply_event()
Log_event::apply_event()
wsrep_apply_events()
and mutexes are taken in the order
lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data
When a normal KILL statement is executed, the stack is
innobase_kill_query()
kill_handlerton()
plugin_foreach_with_mask()
ha_kill_query()
THD::awake()
kill_one_thread()
and mutexes are
victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex
This patch is the plan D variant for fixing potetial mutex locking
order exercised by BF aborting and KILL command execution.
In this approach, KILL command is replicated as TOI operation.
This guarantees total isolation for the KILL command execution
in the first node: there is no concurrent replication applying
and no concurrent DDL executing. Therefore there is no risk of
BF aborting to happen in parallel with KILL command execution
either. Potential mutex deadlocks between the different mutex
access paths with KILL command execution and BF aborting cannot
therefore happen.
TOI replication is used, in this approach, purely as means
to provide isolated KILL command execution in the first node.
KILL command should not (and must not) be applied in secondary
nodes. In this patch, we make this sure by skipping KILL
execution in secondary nodes, in applying phase, where we
bail out if applier thread is trying to execute KILL command.
This is effective, but skipping the applying of KILL command
could happen much earlier as well.
This also fixed unprotected calls to wsrep_thd_abort
that will use wsrep_abort_transaction. This is fixed
by holding THD::LOCK_thd_data while we abort transaction.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
Since MDEV-18778, timezone tables get changed to innodb
to allow them to be replicated to other galera nodes.
Even without galera, timezone tables could be declared innodb.
With the standalone innodb tables, the mysql_tzinfo_to_sql takes
approximately 27 seconds.
With the transactions enabled in this patch, 1.2 seconds is
the approximate load time.
While explicit checks for the engine of the time zone tables could be
done, or checks against !opt_skip_write_binlog, non-transactional
storage engines will just ignore the transactional state without
even a warning so its safe to enact globally.
Leap seconds are pretty much ignored as they are a single insert
statement and have gone out of favour as they have caused MariaDB
stalls in the past.
When binlog is disabled, WSREP will not behave correctly when
SAVEPOINT ROLLBACK is executed since we don't register handlers for such case.
Fixed by registering WSREP handlerton for SAVEPOINT related commands.
Re-enable some Galera tests that should have been enabled.
Add client_ed25519.so to debian/libmariadb3.install;
merge e47a143fc0 correctly.
Remove a duplicated #include from wsrep_mysqld.cc.
For MDEV-15955, the fix in create_tmp_field_from_item() would cause a
compilation error. After a discussion with Alexander Barkov, the fix
was omitted and only the test case was kept.
In 10.3 and later, MDEV-15955 is fixed properly by overriding
create_tmp_field() in Item_func_user_var.
There were two problems:
(1) If user wanted same time zone information on all nodes in the Galera
cluster all updates were not replicated as time zone information was
stored on MyISAM tables. This is fixed on Galera by altering time zone
tables to InnoDB while they are modified.
(2) If user wanted different time zone information to nodes in the Galera
cluster TRUNCATE TABLE for time zone tables was replicated by Galera
destroying time zone information from other nodes. This is fixed
on Galera by introducing new option for mysql_tzinfo_to_sql_symlink
tool --skip-write-binlog to disable Galera replication while
time zone tables are modified.
Changes to be committed:
modified: mysql-test/r/mysql_tzinfo_to_sql_symlink.result
modified: mysql-test/suite/wsrep/r/mysql_tzinfo_to_sql_symlink.result
new file: mysql-test/suite/wsrep/r/mysql_tzinfo_to_sql_symlink_skip.result
new file: mysql-test/suite/wsrep/t/mysql_tzinfo_to_sql_symlink_skip.test
modified: sql/tztime.cc
Currently, running mtr with an incorrect (for example, new or
obsolete) version of wsrep_provider (for example, with the 26
version of libgalera_smm.so) leads to the failure of tests in
several suites with vague error diagnostics.
As for the galera_3nodes suite, the mtr also does not effectively
check all the prerequisites after merge with MDEV-18426 fixes.
For example, tests that using mariabackup do not check for presence
of ss and socat/nc. This is due to improper handling of relative
paths in mtr scripts.
In addition, some tests in different suites can be run without
setting the environment variables such as MTR_GALERA_TFMT, XBSTREAM,
and so on.
To eliminate all these issues, this patch makes the following changes:
1. Added auxiliary wsrep_mtr_check utility (which located in the
mysql-test/lib/My/SafeProcess subdirectory), which compares the
versions of the wsrep API that used by the server and by the wsrep
provider library, and it does this comparison safely, without
accessing the API if the versions do not match.
2. All checks related to the presence of mariabackup and utilities
that necessary for its operation transferred from the local directories
of different mtr suites (from the suite.pm files) to the main suite.pm
file. This not only reduces the amount of code and eliminates duplication
of identical code fragments, but also avoids problems due to the inability
of mtr to consider relative paths to include files when checking skip
combinations.
3. Setting the values of auxiliary environment variables that
are necessary for Galera, SST scripts and mariabackup (to work
properly) is moved to the main mysql-test-run.pl script, so as
not to duplicate this code in different suites, and to avoid
partial corrections of the same errors for different suites
(while other suites remain uncorrected).
4. Fixed duplication of the have_file_key_management.inc and
have_filekeymanagement.inc files between different suites,
these checks are also transferred to the top level.
5. Added garbd presence check and garbd path variable.
https://jira.mariadb.org/browse/MDEV-18565
Problem was that tests select INFORMATION_SCHEMA.PROCESSLIST processes
from user system user and empty state. Thus, there is not clear
state for slave threads.
Changes:
- Added new status variables that store current amount of applier threads
(wsrep_applier_thread_count) and rollbacker threads
(wsrep_rollbacker_thread_count). This will make clear how many slave threads
of certain type there is.
- Added THD state "wsrep applier idle" when applier slave thread is
waiting for work. This makes finding slave/applier threads easier.
- Added force-restart option for mtr to always restart servers between tests
to avoid race on start of the test
- Added wait_condition_with_debug to wait until the passed statement returns
true, or the operation times out. If operation times out, the additional error
statement will be executed
Changes to be committed:
new file: mysql-test/include/force_restart.inc
new file: mysql-test/include/wait_condition_with_debug.inc
modified: mysql-test/mysql-test-run.pl
modified: mysql-test/suite/galera/disabled.def
modified: mysql-test/suite/galera/r/MW-336.result
modified: mysql-test/suite/galera/r/galera_kill_applier.result
modified: mysql-test/suite/galera/r/galera_var_slave_threads.result
new file: mysql-test/suite/galera/t/MW-336.cnf
modified: mysql-test/suite/galera/t/MW-336.test
modified: mysql-test/suite/galera/t/galera_kill_applier.test
modified: mysql-test/suite/galera/t/galera_parallel_autoinc_largetrx.test
modified: mysql-test/suite/galera/t/galera_parallel_autoinc_manytrx.test
modified: mysql-test/suite/galera/t/galera_var_slave_threads.test
modified: mysql-test/suite/wsrep/disabled.def
modified: mysql-test/suite/wsrep/r/variables.result
modified: mysql-test/suite/wsrep/t/variables.test
modified: sql/mysqld.cc
modified: sql/wsrep_mysqld.cc
modified: sql/wsrep_mysqld.h
modified: sql/wsrep_thd.cc
modified: sql/wsrep_var.cc
* MDEV-18565: Galera mtr-suite fails if galera library is not installed
Currently, running mtr with an incorrect (for example, new or
obsolete) version of wsrep_provider (for example, with the 26
version of libgalera_smm.so) leads to the failure of tests in
several suites with vague error diagnostics.
As for the galera_3nodes suite, the mtr also does not effectively
check all the prerequisites after merge with MDEV-18426 fixes.
For example, tests that using mariabackup do not check for presence
of ss and socat/nc. This is due to improper handling of relative
paths in mtr scripts.
In addition, some tests in different suites can be run without
setting the environment variables such as MTR_GALERA_TFMT, XBSTREAM,
and so on.
To eliminate all these issues, this patch makes the following changes:
1. Added auxiliary wsrep_mtr_check utility (which located in the
mysql-test/lib/My/SafeProcess subdirectory), which compares the
versions of the wsrep API that used by the server and by the wsrep
provider library, and it does this comparison safely, without
accessing the API if the versions do not match.
2. All checks related to the presence of mariabackup and utilities
that necessary for its operation transferred from the local directories
of different mtr suites (from the suite.pm files) to the main suite.pm
file. This not only reduces the amount of code and eliminates duplication
of identical code fragments, but also avoids problems due to the inability
of mtr to consider relative paths to include files when checking skip
combinations.
3. Setting the values of auxiliary environment variables that
are necessary for Galera, SST scripts and mariabackup (to work
properly) is moved to the main mysql-test-run.pl script, so as
not to duplicate this code in different suites, and to avoid
partial corrections of the same errors for different suites
(while other suites remain uncorrected).
4. Fixed duplication of the have_file_key_management.inc and
have_filekeymanagement.inc files between different suites,
these checks are also transferred to the top level.
https://jira.mariadb.org/browse/MDEV-18565
* Build without additional utility in configurations without wsrep support
Backported wsrep-recover test from 10.4 to test the wsrep recovery
after database shutdown or crash. Renamed the test to wsrep-recover-v25
to avoid conflicts with existing test in 10.4 and to provide coverage
for wsrep-API v25 compatible behavior.
The test case in 10.2 is simpler because out of order prepare
step cannot happen, the commit order critical section is grabbed
before prepare phase.
Test is not recorded since it does not produce expected result.
The second line of changes related to replacing xtrabackup with
mariabackup:
1) All unnecessary references to xtrabackup are removed from
the documentation, from some comments, from the control files
that are used to prepare the packages.
2) Made corrections of the tests from the galera_3nodes suite
that mentioned xtrabackup or the old (associated with xtrabackup)
version of innobackupex.
3) Fixed flaws in the galera_3nodes mtr suite control scripts,
because of which they could not work with mariabackup.
4) Fixed numerous bugs in the SST scripts and in the mtr test
files (galera_3nodes mtr suite) that prevented the use of Galera
with IPv6 addresses.
5) Fixed flaws in tests for rsync and mysqldump (for galera_3nodes
mtr tests suite). These tests were not performed successfully without
these fixes.
https://jira.mariadb.org/browse/MDEV-17835