mirror of https://github.com/MariaDB/server.git synced 2025-05-02 19:25:03 +03:00

52 Commits

Author SHA1 Message Date
sjaakola
157b3a637f MDEV-23328 Server hang due to Galera lock conflict resolution
Mutex order violation when wsrep bf thread kills a conflicting trx,
the stack is

          wsrep_thd_LOCK()
          wsrep_kill_victim()
          lock_rec_other_has_conflicting()
          lock_clust_rec_read_check_and_lock()
          row_search_mvcc()
          ha_innobase::index_read()
          ha_innobase::rnd_pos()
          handler::ha_rnd_pos()
          handler::rnd_pos_by_record()
          handler::ha_rnd_pos_by_record()
          Rows_log_event::find_row()
          Update_rows_log_event::do_exec_row()
          Rows_log_event::do_apply_event()
          Log_event::apply_event()
          wsrep_apply_events()

and mutexes are taken in the order

          lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data

When a normal KILL statement is executed, the stack is

          innobase_kill_query()
          kill_handlerton()
          plugin_foreach_with_mask()
          ha_kill_query()
          THD::awake()
          kill_one_thread()

and mutexes are taken in the order

          victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex

This patch is the plan D variant for fixing the potential mutex locking
order violation exercised by BF aborting and KILL command execution.

In this approach, KILL command is replicated as TOI operation.
This guarantees total isolation for the KILL command execution
in the first node: there is no concurrent replication applying
and no concurrent DDL executing. Therefore there is no risk of
BF aborting to happen in parallel with KILL command execution
either. Potential mutex deadlocks between the different mutex
access paths with KILL command execution and BF aborting cannot
therefore happen.
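The reasoning above can be sketched in a few lines of C++. This is only an
analogy, not the patch itself: `toi_gate` is a hypothetical mutex standing in
for the isolation that TOI replication provides, and the other three members
merely mirror the mutex names from the stacks above. Once both paths are
serialized by the outer gate, their inverted inner lock orders can no longer
interleave, so the deadlock cannot occur.

```cpp
#include <mutex>

// Hypothetical stand-ins for the mutexes named in the commit message.
struct Mutexes {
  std::mutex toi_gate;    // hypothetical: models the isolation TOI provides
  std::mutex lock_sys;    // stands in for lock_sys->mutex
  std::mutex victim_trx;  // stands in for victim_trx->mutex
  std::mutex thd_data;    // stands in for victim_thread->LOCK_thd_data
  int aborts = 0;         // work counter, protected by toi_gate
};

// BF abort path: lock_sys -> victim_trx -> LOCK_thd_data.
int bf_abort_path(Mutexes& m) {
  std::lock_guard<std::mutex> gate(m.toi_gate);  // serialize with kill_path
  std::lock_guard<std::mutex> a(m.lock_sys);
  std::lock_guard<std::mutex> b(m.victim_trx);
  std::lock_guard<std::mutex> c(m.thd_data);
  return ++m.aborts;
}

// KILL path: LOCK_thd_data -> lock_sys -> victim_trx (inverted order).
int kill_path(Mutexes& m) {
  std::lock_guard<std::mutex> gate(m.toi_gate);  // serialize with bf_abort_path
  std::lock_guard<std::mutex> c(m.thd_data);
  std::lock_guard<std::mutex> a(m.lock_sys);
  std::lock_guard<std::mutex> b(m.victim_trx);
  return ++m.aborts;
}
```

Without the gate, one thread in `bf_abort_path` and another in `kill_path`
could each hold the mutex the other needs next. With it, only one path runs
at a time, whatever order the inner mutexes are taken in.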

In this approach, TOI replication is used purely as a means
to provide isolated KILL command execution in the first node.
The KILL command should not (and must not) be applied in secondary
nodes. This patch ensures that by skipping KILL execution in
secondary nodes during the applying phase, where we bail out
if an applier thread is trying to execute a KILL command.
This is effective, but the applying of the KILL command
could be skipped much earlier as well.

This also fixes unprotected calls to wsrep_thd_abort,
which uses wsrep_abort_transaction. The fix is to hold
THD::LOCK_thd_data while we abort the transaction.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-10-29 10:00:17 +03:00
Sergei Golubchik
0ab1e3914c Merge branch '10.2' into 10.3 2021-02-22 22:42:27 +01:00
Sergei Golubchik
a638f1577a Merge branch 'bb-10.2-release' into 10.2 2021-02-22 18:43:03 +01:00
Sergei Golubchik
0d55b020e1 Merge branch 'bb-10.2-release' into bb-10.3-release 2021-02-18 22:09:53 +01:00
Sergei Golubchik
ce3a2a688d make @@wsrep_provider and @@wsrep_notify_cmd read-only
this should simplify run-time cluster management
2021-02-18 19:03:01 +01:00
Jan Lindström
4d300ab1a8 MDEV-24867 : wsrep.variables MTR failed: Result length mismatch
Stabilize test case.
2021-02-17 10:28:37 +02:00
Jan Lindström
2391582ec3 Merge remote-tracking branch 10.2 into 10.3 2020-11-03 09:00:23 +02:00
Jan Lindström
94859d985e Clean up wsrep.variables 2020-11-03 08:49:10 +02:00
Marko Mäkelä
e3d692aa09 Merge 10.2 into 10.3 2020-10-22 08:26:28 +03:00
Jan Lindström
fc3b5c7db3 MDEV-17585 : wsrep.variables failed in buildbot with deadlock on CREATE USER
Stabilize test by using correct galera library and restore
original galera cluster at end.
2020-10-10 08:50:50 +03:00
Marko Mäkelä
eda719793a Merge 10.2 into 10.3 2020-01-07 12:14:35 +02:00
Jan Lindström
5824e9f8df MDEV-13569: wsrep_info.plugin failed in buildbot with "no nodes coming from prim view"
Modify configuration so that all nodes are part of galera cluster
i.e. wsrep_on=ON. Add missing wait conditions.

test changes only.
2020-01-07 08:57:30 +02:00
Jan Lindström
6cdde9ebbf MDEV-20836 : Galera test failure on wsrep.variables
Add one more wait to make sure all threads have been started.
2019-10-16 13:01:40 +03:00
Marko Mäkelä
33215edcba Resolve conflicts in wsrep.variables
This was forgotten in the merge 0f83c8878dc1389212c134f65d37a43d9d248250
because the test is disabled.
2019-07-24 15:30:27 +03:00
Eugene Kosov
0f83c8878d Merge 10.2 into 10.3 2019-07-16 18:39:21 +03:00
Jan Lindström
ec49976e38 MDEV-19746: Galera test failures because of wsrep_slave_threads identification
The problem was that tests selected processes from
INFORMATION_SCHEMA.PROCESSLIST by user "system user" and an empty state.
Thus, there was no clearly identifiable state for slave threads.

Changes:
- Added new status variables that store the current number of applier threads
(wsrep_applier_thread_count) and rollbacker threads
(wsrep_rollbacker_thread_count). This makes clear how many slave threads
of each type there are.
- Added THD state "wsrep applier idle" for when an applier slave thread is
waiting for work. This makes finding slave/applier threads easier.
- Added force-restart option for mtr to always restart servers between tests
to avoid a race at the start of the test
- Added wait_condition_with_debug to wait until the passed statement returns
true, or the operation times out. If the operation times out, the additional
error statement is executed
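
The behaviour described for wait_condition_with_debug can be sketched in C++
as a poll-until-true loop with a deadline. This is an illustration only: the
real helper is an mtr include file that runs SQL statements, whereas the
function below, its parameters and the use of callables are assumptions made
for the sketch.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical C++ analogue of the mtr include: poll `condition` until it
// returns true or `timeout` elapses. On timeout, run `on_timeout` (the
// "additional error statement") and report failure.
bool wait_condition_with_debug(const std::function<bool()>& condition,
                               const std::function<void()>& on_timeout,
                               std::chrono::milliseconds timeout,
                               std::chrono::milliseconds poll_interval) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (true) {
    if (condition()) {
      return true;  // condition met before the deadline
    }
    if (std::chrono::steady_clock::now() >= deadline) {
      break;  // timed out
    }
    std::this_thread::sleep_for(poll_interval);
  }
  on_timeout();  // emit debugging information for the failed wait
  return false;
}
```

The condition is checked before the deadline test, so it is evaluated at
least once even with a zero timeout.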

Changes to be committed:
	new file:   mysql-test/include/force_restart.inc
	new file:   mysql-test/include/wait_condition_with_debug.inc
	modified:   mysql-test/mysql-test-run.pl
	modified:   mysql-test/suite/galera/disabled.def
	modified:   mysql-test/suite/galera/r/MW-336.result
	modified:   mysql-test/suite/galera/r/galera_kill_applier.result
	modified:   mysql-test/suite/galera/r/galera_var_slave_threads.result
	new file:   mysql-test/suite/galera/t/MW-336.cnf
	modified:   mysql-test/suite/galera/t/MW-336.test
	modified:   mysql-test/suite/galera/t/galera_kill_applier.test
	modified:   mysql-test/suite/galera/t/galera_parallel_autoinc_largetrx.test
	modified:   mysql-test/suite/galera/t/galera_parallel_autoinc_manytrx.test
	modified:   mysql-test/suite/galera/t/galera_var_slave_threads.test
	modified:   mysql-test/suite/wsrep/disabled.def
	modified:   mysql-test/suite/wsrep/r/variables.result
	modified:   mysql-test/suite/wsrep/t/variables.test
	modified:   sql/mysqld.cc
	modified:   sql/wsrep_mysqld.cc
	modified:   sql/wsrep_mysqld.h
	modified:   sql/wsrep_thd.cc
	modified:   sql/wsrep_var.cc
2019-07-15 10:17:07 +03:00
Julius Goryavsky
0e89e90f42 MDEV-17835: Remove wsrep-sst-method=xtrabackup
The second line of changes related to replacing xtrabackup with
mariabackup:

1) All unnecessary references to xtrabackup are removed from
the documentation, from some comments, from the control files
that are used to prepare the packages.

2) Made corrections of the tests from the galera_3nodes suite
that mentioned xtrabackup or the old (associated with xtrabackup)
version of innobackupex.

3) Fixed flaws in the galera_3nodes mtr suite control scripts,
because of which they could not work with mariabackup.

4) Fixed numerous bugs in the SST scripts and in the mtr test
files (galera_3nodes mtr suite) that prevented the use of Galera
with IPv6 addresses.

5) Fixed flaws in tests for rsync and mysqldump (for galera_3nodes
mtr tests suite). These tests were not performed successfully without
these fixes.

https://jira.mariadb.org/browse/MDEV-17835
2019-01-22 13:28:03 +01:00
Marko Mäkelä
ae9d82c9f8 Merge 10.2 into 10.3 2018-10-11 08:22:08 +03:00
Jan Lindström
e2a1c58582 Fix test failure on wsrep.variables
SLES11 currently cannot build the latest Galera library version.
2018-10-04 07:13:30 +03:00
Jan Lindström
285969e1c6 Fix result file for wsrep.variables, which for some reason had been
recorded with a too new galera library.
2018-09-07 11:27:15 +03:00
Jan Lindström
fba683c069 MDEV-17062: Test failure on galera.MW-336
MDEV-17058: Test failure on wsrep.variables
MDEV-17060: Test failure on galera.galera_var_slave_threads

Fix incorrect calculation of increased applier (slave) threads.
Note that an increase takes effect "immediately", but we should still
use a proper wait condition to wait for it. Reducing the number of
slave threads is not immediate, as a thread will only exit after a
replication event.
2018-09-06 16:05:31 +03:00
Jan Lindström
a290b807e8 MDEV-17062: Test failure on galera.MW-336
MDEV-17058: Test failure on wsrep.variables
MDEV-17060: Test failure on galera.galera_var_slave_threads

Fix incorrect calculation of increased applier (slave) threads.
Note that an increase takes effect "immediately", but we should still
use a proper wait condition to wait for it. Reducing the number of
slave threads is not immediate, as a thread will only exit after a
replication event.
2018-08-27 16:10:33 +03:00
Vicențiu Ciorbaru
6e55236c0a Merge branch '10.0-galera' into 10.1 2018-06-12 19:39:37 +03:00
Jan Lindström
648cf7176c Merge remote-tracking branch 'origin/5.5-galera' into 10.0-galera 2018-05-07 13:49:14 +03:00
Sergei Golubchik
09b25f8596 only allow SUPER user to modify wsrep_on 2018-03-01 19:32:01 +01:00
Nirbhay Choubey
90266e8a0e Merge branch '10.0-galera' into bb-10.1-serg 2016-08-25 15:39:39 -04:00
Nirbhay Choubey
8b998a48cc Update galera version-dependent tests. 2016-08-21 16:17:04 -04:00
Nirbhay Choubey
dced5146bd Merge branch '10.0-galera' into 10.1 2015-07-14 16:05:29 -04:00
Nirbhay Choubey
3331d4e07e Merge galera tests from github.com/codership/mysql-wsrep 2015-05-08 17:43:57 -04:00
Nirbhay Choubey
4c191de323 MDEV-7560: wsrep* tests depend on the version of galera library
Added an include file to check galera library version.
2015-02-27 22:16:37 -05:00
Nirbhay Choubey
aa2904a7f4 MDEV-7560: wsrep* tests depend on the version of galera library
Added an include file to check galera library version.
2015-02-27 22:13:37 -05:00
Sergei Golubchik
8e7649867f Merge 10.0-galera into 10.1 2015-02-06 16:14:23 +01:00
Nirbhay Choubey
0105bf349a MDEV-7476: Allow SELECT to succeed even when node is not ready
Added a SESSION-only system variable "wsrep_dirty_reads" to allow SELECT
queries to pass even when the node is not prepared to accept queries
(wsrep_ready=OFF). Added a test case.
2015-01-22 18:00:37 -05:00
Nirbhay Choubey
887628acee Test changes (backported from 10.1). 2015-01-16 13:53:23 -05:00
Nirbhay Choubey
bb93d46241 Test changes (backported from 10.1). 2015-01-16 13:52:30 -05:00
Nirbhay Choubey
25aaa652c4 MDEV-6832: ER_LOCK_WAIT_TIMEOUT on SHOW STATUS
Synchronous read view should not be needed for
SHOW commands.
2014-12-31 19:46:48 -05:00
Nirbhay Choubey
952b575272 MDEV-6832: ER_LOCK_WAIT_TIMEOUT on SHOW STATUS
Synchronous read view should not be needed for
SHOW commands.
2014-12-31 19:28:20 -05:00
Nirbhay Choubey
c768af75b7 Minor modifications
- Simplified test cases in wsrep.variables
- Fixed a condition in wsrep_check_opts.cc
- Fixed an "unbound variable" in wsrep_sst_rsync
2014-10-04 13:53:33 -04:00
Jan Lindström
e44751b65f Merge revision 3882 from lp:maria/maria-10.0-galera
MDEV-6656: Test wsrep.variables hangs

  Analysis: wsrep_applier_thread shutdown signaling does not always work
  correctly, causing a timing problem where the main thread waits on a
  condition variable for a signal that all worker threads have ended.
2014-08-29 10:11:08 +03:00
Jan Lindström
f99f573dc7 MDEV-6656: Test wsrep.variables hangs
Analysis: wsrep_applier_thread shutdown signaling does not always work
correctly, causing a timing problem where the main thread waits on a
condition variable for a signal that all worker threads have ended.
2014-08-29 09:42:13 +03:00
Jan Lindström
df4dd593f2 MDEV-6247: Merge 10.0-galera to 10.1.
Merged lp:maria/maria-10.0-galera up to revision 3879.

Added new functions to the handler API to forcefully abort_transaction,
producing fake_trx_id, get_checkpoint and set_checkpoint for XA. These
were added for the future possibility of adding more storage engines that
could use galera replication.
2014-08-26 15:43:46 +03:00
Nirbhay Choubey
b77fc5a343 Merge of patch for MDEV#6399. 2014-07-12 18:21:29 -04:00
Nirbhay Choubey
3ce3647055 MDEV#6399 - Make galera test suite run with --parallel
Galera tests used default base/SST ports which led to
failures due to port conflicts when run in parallel.
Fixed by setting them to ones generated by mtr framework.
2014-07-12 18:20:45 -04:00
Nirbhay Choubey
dc377fcbc0 Merge of patch for MDEV-6411 from maria-5.5-galera. 2014-07-09 11:07:23 -04:00
Nirbhay Choubey
40bfd20180 MDEV#6411 - Setting set @@global_wsrep_sst_auth=NULL
causes crash

Fixed by properly handling the NULL values.
2014-07-09 11:04:28 -04:00
Nirbhay Choubey
93cc06b20c Fixed a warning in mtr script.
Updated wsrep.variables test.
2014-06-10 18:31:07 -04:00
Nirbhay Choubey
ab4947463e Merging changes from maria-5.5-galera and
some test fixes.

bzr merge -r3479..3493 maria-5.5-galera
2014-05-22 18:31:04 -04:00
Nirbhay Choubey
00b6fff2e7 MDEV#6206: wsrep_slave_threads subtracts from max_connections
Decoupled wsrep thread count from connection count. By doing so,
the number of wsrep threads (applier/rollbacker) would no longer
affect the threads_connected status variable and thus maximum
allowable user connections limit would be @@max_connections.

Also introduced a new status variable 'wsrep_thread_count' to hold
the number of wsrep applier/rollbacker threads.

Added a test case.
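
The decoupling described above can be sketched as two independent counters.
This is a sketch under stated assumptions, not the server code: the struct,
its members and both functions are hypothetical names chosen for the
illustration. Only the user-connection counter is compared against the
connection limit, so applier/rollbacker threads no longer consume slots.

```cpp
#include <atomic>

// Hypothetical counters: user connections and wsrep threads are tracked
// separately, so wsrep threads do not count against the connection limit.
struct ThreadCounters {
  std::atomic<int> connections{0};    // user connections only
  std::atomic<int> wsrep_threads{0};  // applier/rollbacker threads
};

// Admit a user connection only while the user-connection count is below
// max_connections. The wsrep thread count is deliberately not consulted.
bool try_accept_connection(ThreadCounters& c, int max_connections) {
  int current = c.connections.load();
  while (current < max_connections) {
    // On failure compare_exchange_weak reloads `current`, and the loop
    // re-checks the limit before retrying.
    if (c.connections.compare_exchange_weak(current, current + 1)) {
      return true;
    }
  }
  return false;
}

// Register a wsrep applier/rollbacker thread in its own counter.
void start_wsrep_thread(ThreadCounters& c) {
  ++c.wsrep_threads;
}
```

With four wsrep threads registered and a limit of two, two user connections
are still accepted and the third is refused.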
2014-05-08 14:45:00 -04:00
Nirbhay Choubey
b11be05255 MDEV#6079: xtrabackup SST failing with maria-10.0-galera
Added logic to skip changing of case for wsrep status
variable names.
2014-04-16 13:04:03 -04:00
Nirbhay Choubey
7fd382f117 Merged r3466 from maria-5.5-galera. 2014-03-27 08:17:24 -04:00