Problem was that tests select INFORMATION_SCHEMA.PROCESSLIST processes
from user system user and empty state. Thus, there is not clear
state for slave threads.
Changes:
- Added new status variables that store current amount of applier threads
(wsrep_applier_thread_count) and rollbacker threads
(wsrep_rollbacker_thread_count). This will make clear how many slave threads
of certain type there is.
- Added THD state "wsrep applier idle" when applier slave thread is
waiting for work. This makes finding slave/applier threads easier.
- Added force-restart option for mtr to always restart servers between tests
to avoid race on start of the test
- Added wait_condition_with_debug to wait until the passed statement returns
true, or the operation times out. If operation times out, the additional error
statement will be executed
Changes to be committed:
new file: mysql-test/include/force_restart.inc
new file: mysql-test/include/wait_condition_with_debug.inc
modified: mysql-test/mysql-test-run.pl
modified: mysql-test/suite/galera/disabled.def
modified: mysql-test/suite/galera/r/MW-336.result
modified: mysql-test/suite/galera/r/galera_kill_applier.result
modified: mysql-test/suite/galera/r/galera_var_slave_threads.result
new file: mysql-test/suite/galera/t/MW-336.cnf
modified: mysql-test/suite/galera/t/MW-336.test
modified: mysql-test/suite/galera/t/galera_kill_applier.test
modified: mysql-test/suite/galera/t/galera_parallel_autoinc_largetrx.test
modified: mysql-test/suite/galera/t/galera_parallel_autoinc_manytrx.test
modified: mysql-test/suite/galera/t/galera_var_slave_threads.test
modified: mysql-test/suite/wsrep/disabled.def
modified: mysql-test/suite/wsrep/r/variables.result
modified: mysql-test/suite/wsrep/t/variables.test
modified: sql/mysqld.cc
modified: sql/wsrep_mysqld.cc
modified: sql/wsrep_mysqld.h
modified: sql/wsrep_thd.cc
modified: sql/wsrep_var.cc
* MDEV-18565: Galera mtr-suite fails if galera library is not installed
Currently, running mtr with an incorrect (for example, new or
obsolete) version of wsrep_provider (for example, with the 26
version of libgalera_smm.so) leads to the failure of tests in
several suites with vague error diagnostics.
As for the galera_3nodes suite, the mtr also does not effectively
check all the prerequisites after merge with MDEV-18426 fixes.
For example, tests that using mariabackup do not check for presence
of ss and socat/nc. This is due to improper handling of relative
paths in mtr scripts.
In addition, some tests in different suites can be run without
setting the environment variables such as MTR_GALERA_TFMT, XBSTREAM,
and so on.
To eliminate all these issues, this patch makes the following changes:
1. Added auxiliary wsrep_mtr_check utility (which located in the
mysql-test/lib/My/SafeProcess subdirectory), which compares the
versions of the wsrep API that used by the server and by the wsrep
provider library, and it does this comparison safely, without
accessing the API if the versions do not match.
2. All checks related to the presence of mariabackup and utilities
that necessary for its operation transferred from the local directories
of different mtr suites (from the suite.pm files) to the main suite.pm
file. This not only reduces the amount of code and eliminates duplication
of identical code fragments, but also avoids problems due to the inability
of mtr to consider relative paths to include files when checking skip
combinations.
3. Setting the values of auxiliary environment variables that
are necessary for Galera, SST scripts and mariabackup (to work
properly) is moved to the main mysql-test-run.pl script, so as
not to duplicate this code in different suites, and to avoid
partial corrections of the same errors for different suites
(while other suites remain uncorrected).
4. Fixed duplication of the have_file_key_management.inc and
have_filekeymanagement.inc files between different suites,
these checks are also transferred to the top level.
https://jira.mariadb.org/browse/MDEV-18565
* Build without additional utility in configurations without wsrep support
Idea is that many users do not install galera library and do not want
to unnecessary run galera and wsrep suites. Furthermore, failures on
these suites disturb development as buildbot shows red failing column
and causes unnecessary work for those who do not care galera tests.
There will be other way to run these suites on buildbot.
--gdb now accepts an argument, it will be passed to gdb as a command.
multiple commands can be separated by a (non-standard and not escapable)
delimiter - semicolon (;).
Old usage with a bare --gdb continues to work too, of course.
Cherry-picked c47c0ca50c4 5441bbd3b1f 339b9055791
If a mtr test runs multiple servers and only some of them get
restarted on whatever reason with new command-line parameters,
then subsequent mtr test may fail, because no cleanup is performed.
Replication and Galera test suites are affected.
In the mtr script, there is a server_need_restart function
that decides whether we need to start a new mysqld process before
the new (next) test. If the mysqld parameters were changed in the
previous test - not necessarily the parameters of the primary mysqld
server, maybe even the secondary server parameters - this function
decides to start a new mysqld process. But since it does not remove
the old (changed) parameters, the new process starts with the
parameters changed by the *previous* test.
To correct this error, we must delete the modified process
parameters after checking that they have been changed during
the previous test.
This patch also simplifies and makes more stable the
galera_drop_database test, during debugging of which this
problem was detected.
https://jira.mariadb.org/browse/MDEV-17421
If a mtr test case has started two mysqld processes (replication tests),
then kills the first one and kills the second one before starting the
first (so at some point there are two mysqlds down), then the ./mtr
waiting process bricks and forgets to monitor the "expect" file of the
first mysqld, so it never gets started again, even when its contents is
changed to "restart".
A victim of this deficiency is at least galera.galera_gcache_recover.
The fix is to keep a list of all mysqlds we should wait to start, not
just one (the last one killed).