mirror of https://github.com/MariaDB/server.git synced 2025-12-07 17:42:39 +03:00
Commit Graph

1027 Commits

Author SHA1 Message Date
Jan Lindström
8a931e4d16 MDEV-17571 : Make systemd timeout behavior more compatible with long Galera SSTs
This is 10.4 version.

The idea is to create a monitor thread for both donor and joiner that
periodically extends the systemd timeout, if needed, while the SST is
being processed. In 10.4 the actual SST is executed by running the SST
script and exchanging messages over a pipe using blocking fgets. This fix
starts the monitoring thread before the SST script is started and
stops it when the SST has completed.
2020-01-22 16:55:59 +02:00
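The monitor-thread pattern described in the commit above can be sketched roughly as follows. This is a minimal Python illustration, not the server code: `run_with_timeout_extension`, `task`, and `extend_timeout` are hypothetical names, and the callback stands in for whatever mechanism actually asks systemd for more time.

```python
import threading
import time


def run_with_timeout_extension(task, extend_timeout, interval_s=0.01):
    """Run task() while a monitor thread calls extend_timeout() periodically.

    Hypothetical helper: task stands in for the blocking SST exchange,
    extend_timeout for the call that extends the service manager timeout.
    """
    done = threading.Event()

    def monitor():
        # Event.wait() returns False on timeout and True once the event is set.
        while not done.wait(interval_s):
            extend_timeout()

    thread = threading.Thread(target=monitor, daemon=True)
    thread.start()          # monitoring starts before the long task begins
    try:
        return task()
    finally:
        done.set()          # monitoring stops once the task has completed
        thread.join()


calls = []
result = run_with_timeout_extension(
    task=lambda: time.sleep(0.1) or "sst-complete",
    extend_timeout=lambda: calls.append(time.monotonic()),
)
```

The `try`/`finally` ensures the monitor is stopped even if the long-running task raises, so no timeout extensions are requested after the task is over.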
Oleksandr Byelkin
6918157e98 Merge branch '10.3' into 10.4 2020-01-21 23:15:02 +01:00
Oleksandr Byelkin
ade89fc898 Merge branch '10.2' into 10.3 2020-01-21 09:11:14 +01:00
Marko Mäkelä
ded128aa9b Merge 10.4 into 10.5 2020-01-20 16:48:56 +02:00
Jan Lindström
57ec527841 MDEV-17062 : Test failure on galera.MW-336
Add mutex protection while we calculate the required change in the
number of slave threads and create them. Add error handling.
2020-01-20 15:54:30 +02:00
Marko Mäkelä
87a61355e8 Merge 10.3 into 10.4
The MDEV-17062 fix in commit c4195305b2
was omitted.
2020-01-20 15:49:48 +02:00
Jan Lindström
90d39f2f91 MDEV-21532 : galera.galera_rsu_drop_pk MTR failed: Result content mismatch
Add wait conditions to make sure correct number of rows have
been replicated.
2020-01-20 13:46:44 +02:00
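Several of the test fixes in this log add a wait condition instead of assuming replication has already finished. The general idea, polling until a condition holds or a deadline passes, might look like the sketch below. `wait_for` and the `state` dictionary are hypothetical and only model the pattern; they are not part of the MTR framework.

```python
import threading
import time


def wait_for(condition, timeout_s=5.0, poll_s=0.01):
    """Poll condition() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


# Simulate rows that arrive on the other node a little later.
state = {"rows": 0}
threading.Timer(0.05, lambda: state.update(rows=3)).start()

replicated = wait_for(lambda: state["rows"] == 3)
timed_out = wait_for(lambda: False, timeout_s=0.05)
```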
Marko Mäkelä
6373ec3ec7 Merge 10.2 into 10.3 2020-01-18 16:56:16 +02:00
Jan Lindström
057fbfa356 Disable Galera tests failing on bb and Azure until they are fixed. 2020-01-18 09:38:48 +02:00
Jan Lindström
c4195305b2 MDEV-17062 : Test failure on galera.MW-336
Add mutex protection while we calculate the required change in the
number of slave threads and create them. Add error handling.
2020-01-17 12:51:18 +02:00
Jan Lindström
7e378a8d31 Test requires debug build from galera library. 2020-01-17 11:59:53 +02:00
Sergei Petrunia
e709eb9bf7 Merge branch '10.2' into 10.3
# Conflicts:
#	mysql-test/suite/galera/r/MW-388.result
#	mysql-test/suite/galera/t/MW-388.test
#	mysql-test/suite/innodb/r/truncate_inject.result
#	mysql-test/suite/innodb/t/truncate_inject.test
#	mysql-test/suite/rpl/r/rpl_stop_slave.result
#	mysql-test/suite/rpl/t/rpl_stop_slave.test
#	sql/sp_head.cc
#	sql/sp_head.h
#	sql/sql_lex.cc
#	sql/sql_yacc.yy
#	storage/xtradb/buf/buf0dblwr.cc
2020-01-17 00:46:40 +03:00
Jan Lindström
bb8226deab MDEV-21492 : Galera test sporadic failure on galera.galera_events2
Add wait condition for event creation.
2020-01-16 12:00:59 +02:00
Jan Lindström
a382f69e24 MDEV-21498 : wsrep.binlog_format test failed on Azure
Waiting wsrep_ready is possible only if wsrep_on=ON.
2020-01-16 08:46:45 +02:00
Jan Lindström
800d1f3010 Disable usually failing Galera tests until a real fix is found. 2020-01-15 14:55:42 +02:00
Daniele Sciascia
7d31321464 MDEV-19803 Long semaphore wait error on galera.MW-388
The long semaphore wait appeared to be caused by the following
pattern in the MTR test:

```
SET DEBUG_SYNC = "now SIGNAL wsrep_after_certification_continue";
SET DEBUG_SYNC = "now SIGNAL signal.wsrep_apply_cb";
```

Raising two signals, one right after another, caused one signal to
overwrite the other before it was consumed by the waiting thread.
This left one thread stuck until the debug sync point timed out.
2020-01-14 09:11:35 +02:00
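The overwrite described in the commit above can be illustrated with a toy model in which only one pending signal is remembered at a time. This is a simplified sketch based on the commit message, not the DEBUG_SYNC implementation; the `SingleSlotSignal` class is hypothetical.

```python
class SingleSlotSignal:
    """Toy model: at most one pending signal; a new signal replaces the old one."""

    def __init__(self):
        self.pending = None

    def signal(self, name):
        self.pending = name  # overwrites any signal not yet consumed

    def try_consume(self, name):
        if self.pending == name:
            self.pending = None
            return True
        return False


sync = SingleSlotSignal()
# Two signals raised back to back, before any waiter has run.
sync.signal("wsrep_after_certification_continue")
sync.signal("signal.wsrep_apply_cb")

first_seen = sync.try_consume("wsrep_after_certification_continue")
second_seen = sync.try_consume("signal.wsrep_apply_cb")
```

In this model the waiter for the first signal never sees it, which corresponds to the thread that stays stuck until its sync point times out.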
Daniele Sciascia
2d4b6571ec Wsrep position not updated in InnoDB after certification failures (#1432)
A certification failure followed by a clean shutdown would cause an
inconsistency between the sequence number stored in InnoDB and the
sequence number stored in the provider.
This happened both in the case of a local certification failure and in
the case where a dummy writeset is applied.
The fix consists of:
- updating wsrep position after dummy writeset is delivered in
 `Wsrep_high_priority_service::log_dummy_write_set()`
- updating wsrep position while releasing commit order in wsrep-lib
 side

Added two tests which stress the situation where a server is shut down
after a certification failure.
2020-01-14 07:33:02 +02:00
Marko Mäkelä
ca8c3be47d Merge 10.4 into 10.5 2020-01-03 16:15:40 +02:00
Jan Lindström
2cff807d3f Add have_debug to MDEV-20793 and add wait condition to
galera_parallel_autoinc_largetrx to stabilize it.
2019-12-31 11:55:44 +02:00
Teemu Ollakka
13b3d7f1f1 MDEV-20793 Assertion failed after replay.
Assertion failed in wsrep-lib after transaction replay which
failed due to conflict in certification.

- Implemented reproducible test case MDEV-20793 to reproduce the crash.
- Fixed wsrep-lib to deal with certification error during replay.
2019-12-31 11:46:55 +02:00
Marko Mäkelä
8cc15c036d Merge 10.4 into 10.5 2019-12-27 21:17:16 +02:00
Marko Mäkelä
4c25e75ce7 Merge 10.3 into 10.4 2019-12-27 18:20:28 +02:00
Marko Mäkelä
5ab70e7f68 Merge 10.2 into 10.3 2019-12-27 15:14:48 +02:00
Jan Lindström
3fbd9f1522 MDEV-20909 : Galera test failure on galera.galera_gcs_fc_limit: Server crash with signal 6
Add proper wait condition when provider options are restored.
2019-12-23 16:06:25 +02:00
Jan Lindström
17b1b8118a MDEV-21189 : Dropping partition with 'wsrep_OSU_method=RSU' and 'SESSION sql_log_bin = 0' causes the galera node to hang
Test cleanup. Best practice for using RSU is to isolate the node
up front, so this test did not reflect a real-world scenario.
2019-12-23 12:57:22 +02:00
Jan Lindström
cc9c55b2e2 MDEV-21189 : Dropping partition with 'wsrep_OSU_method=RSU' and 'SESSION sql_log_bin = 0' causes the galera node to hang
Test cleanup. Best practice for using RSU is to isolate the node
up front, so this test did not reflect a real-world scenario.
2019-12-23 12:43:15 +02:00
Jan Lindström
c3824766c5 Fortify galera_partition test. 2019-12-18 10:02:57 +02:00
Jan Lindström
088de81d96 MDEV-21335 : Galera test failure on suite wsrep
Problem was that wsrep_on was OFF.

This is 10.4 version.
2019-12-18 08:22:07 +02:00
Marko Mäkelä
28c89b7151 Merge 10.4 into 10.5 2019-12-16 07:47:17 +02:00
Marko Mäkelä
8fa759a576 Merge 10.3 into 10.4
We disable the MDEV-21189 test galera.galera_partition
because it times out.
2019-12-13 17:30:37 +02:00
Marko Mäkelä
0a20e5ab77 Merge 10.2 into 10.3 2019-12-12 14:41:51 +02:00
Oleksandr Byelkin
a15234bf4b Merge branch '10.3' into 10.4 2019-12-09 15:09:41 +01:00
Jan Lindström
59e14b9684 MDEV-21189: Dropping partition with 'wsrep_OSU_method=RSU' and 'SESSION sql_log_bin = 0' causes the galera node to hang
Found two bugs:

(1) have_committing_connections was missing a mutex unlock on one
exit path. As this function is called in a loop, the next call tried
to lock the mutex while we already owned it. This could cause a hang.

(2) wsrep_RSU_begin did set an error code when the partition to
be dropped could not be MDL-locked because of concurrent
operations, but the wrong error code was propagated to the upper
layer, causing the error to be ignored. This could also have caused
the hang.
2019-12-09 08:14:39 +02:00
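Bug (1) in the commit above is a classic leaked lock on an early return. The sketch below is a Python illustration under assumptions: the function bodies, the `committing` field, and which exit path was affected are all hypothetical, and only the shape of the bug and of the fix is modeled.

```python
import threading

lock = threading.Lock()  # non-reentrant, like a plain mutex


def has_committing_buggy(connections):
    lock.acquire()
    for conn in connections:
        if conn["committing"]:
            return True   # BUG: this exit path never releases the lock
    lock.release()
    return False


def has_committing_fixed(connections):
    with lock:            # released on every exit path
        return any(conn["committing"] for conn in connections)


conns = [{"committing": True}]

has_committing_buggy(conns)
leaked = lock.locked()    # the lock is still held after the call returned
lock.release()            # clean up so the fixed variant can run

has_committing_fixed(conns)
clean = not lock.locked()
```

With the buggy variant, a second call from the same loop would block forever on `lock.acquire()`, which matches the hang described in the commit message.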
Oleksandr Byelkin
008ee867a4 Merge branch '10.2' into 10.3 2019-12-04 17:46:28 +01:00
Jan Lindström
c9b9eb3315 MDEV-18497 : CTAS async replication from mariadb master crashes galera nodes (#1410)
In MariaDB 10.2 the master could have been configured so that there
are extra annotate events. When we peek at the next event type for CTAS
we need to skip annotate events.
2019-12-04 11:46:37 +02:00
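Peeking at the next event while skipping annotate events, as the commit above describes, could be sketched like this. The event names are plain illustrative strings and `peek_next_event_type` is a hypothetical helper, not the function used in the server.

```python
def peek_next_event_type(events, start, skip=frozenset({"ANNOTATE_ROWS"})):
    """Return the first event type at or after index start that is not skipped.

    The stream itself is not consumed; None means no such event exists.
    """
    for event_type in events[start:]:
        if event_type not in skip:
            return event_type
    return None


stream = ["QUERY(CTAS)", "ANNOTATE_ROWS", "TABLE_MAP", "WRITE_ROWS"]
next_type = peek_next_event_type(stream, 1)
no_more = peek_next_event_type(["QUERY(CTAS)", "ANNOTATE_ROWS"], 1)
```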
Oleksandr Byelkin
f7d35ffc76 Galera test fix after merge. 2019-12-03 20:39:16 +01:00
Oleksandr Byelkin
f8b5e147da Merge branch '10.1' into 10.2 2019-12-03 14:45:06 +01:00
Jan Lindström
88073dae79 MDEV-21198 : Galera test failure on galera_var_notify_cmd
Add proper wsrep sync wait.
2019-12-03 08:04:46 +02:00
Jan Lindström
9d9a2253c6 Merge remote-tracking branch 10.2 into 10.3
Conflicts:
	mysql-test/suite/galera/t/galera_binlog_event_max_size_max-master.opt
	mysql-test/suite/innodb/r/innodb-mdev-7513.result
	mysql-test/suite/innodb/t/innodb-mdev-7513.test
	mysql-test/suite/wsrep/disabled.def
	storage/innobase/ibuf/ibuf0ibuf.cc
2019-12-02 14:35:10 +02:00
Jan Lindström
c6ed37b88a MDEV-21182: Galera test failure on MW-284
galera_2nodes.cnf did not contain wsrep_on=1 on correct places. Fixed
restart options to use correct configuration.
2019-11-30 13:52:49 +02:00
seppo
38839854b7 MDEV-19572 async slave node fails to apply MyISAM only writes (#1418)
The problem happens when a MariaDB master replicates writes for only non-InnoDB
tables (e.g. writes to MyISAM table(s)). An async slave node in a Galera cluster
can apply these writes successfully, but it will, in the end, write the gtid position in
the mysql.gtid_slave_pos table. The mysql.gtid_slave_pos table uses the InnoDB engine, and
this write makes the innodb handlerton part of the replicated "transaction".
Note that the wsrep patch identifies that the write to gtid_slave_pos should not be replicated
and skips appending wsrep keys for these writes. However, as InnoDB was present
in the transaction, and there are replication events (for the MyISAM table) in the transaction
cache, but there are no appended keys, wsrep raises an error, and this makes the slave
thread stop.

The fix is simply to not treat it as an error if async slave tries to replicate a write
set with binlog events, but no keys. We just skip wsrep replication and return successfully.

This commit also contains an mtr test which forces the mysql.gtid_slave_pos table to be
of InnoDB engine, and executes a MyISAM-only write through async replication.

There is additional fix for declaring IO and background slave threads as non wsrep.
These threads should not write anything for wsrep replication, and this is just a safeguard
to make sure nothing leaks into cluster from these slave threads.
2019-11-26 08:49:50 +02:00
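The decision described in the commit above can be summarized as a small truth table. This is a sketch only: `wsrep_replication_action` and its string results are hypothetical, and the behavior for a transaction with neither keys nor binlog events is an assumption made here to keep the function total.

```python
def wsrep_replication_action(has_keys, has_binlog_events, is_async_slave):
    """Decide what to do with a transaction before wsrep replication."""
    if has_keys:
        return "replicate"
    if has_binlog_events and is_async_slave:
        # The fix: events but no keys on an async slave is not an error.
        return "skip"
    if has_binlog_events:
        return "error"
    return "skip"  # assumption: nothing to replicate at all


# MyISAM-only write applied by the async slave, plus the gtid_slave_pos write.
myisam_only = wsrep_replication_action(
    has_keys=False, has_binlog_events=True, is_async_slave=True
)
```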
Aleksey Midenkov
0c05a2ed71 Merge 10.4 into 10.5 2019-11-25 17:24:09 +03:00
seppo
4111a53079 MDEV-21096 async slave crash with gtid_log_pos table access (#1413)
The original crash happened when the async replication IO thread was updating the mysql.gtid_slave_pos table. Operations on this table should remain node local, but it appears that the protection (THD::wsrep_ignore_table flag) to prevent wsrep replication for this table was missing for innodb write_row() and update_row().
It was somewhat difficult to reproduce the issue, because mtr seems to create the affected table mysql.gtid_log_pos with the Aria engine type, and Aria engine operations will not be replicated anyhow. It looks, though, as if in a release installation the mysql.gtid_slave_pos table uses the InnoDB engine.
It was possible to trigger a somewhat related problem by running the test galera.galera_as_slave_gtid with the configuration gtid_pos_auto_engines=InnoDB. However, this test mode causes an earlier crash when the replication background thread creates an additional table, mysql.gtid_slave_pos_InnoDB; this table creation triggered wsrep TOI replication, which also failed on an assertion. Actually, async replication IO and background threads should not replicate anything to the cluster.

This pull request contains new test galera.galera_as_slave_gtid_auto_engine, which basically just runs galera.galera_as_slave_gtid with configuration of gtid_pos_auto_engines=InnoDB.
Test galera.galera_as_slave_gtid is also modified for better code reuse.
Actual fix for MDEV-21096 is in storage/innobase/handler/ha_innodb.cc, where THD::wsrep_ignore_table flag is now honored before wsrep key population.
There is an additional fix in sql/service_wsrep.cc where async replication IO and background threads are marked as non-local. This fences these threads out of wsrep replication altogether. Note that this change actually makes the use of THD::wsrep_ignore_table redundant. We may want to refactor THD::wsrep_ignore_table out in the future, if there is no other use case for it in sight.
2019-11-25 11:19:33 +02:00
Jan Lindström
c6b097ab37 Remove excessive sleep from test. 2019-11-18 15:22:01 +02:00
seppo
5c68343db7 MDEV-18497 CTAS async replication from mariadb master crashes galera nodes (#1410)
This PR contains an mtr test for reproducing a failure with replicating a create table as select statement (CTAS) through asynchronous mariadb replication to a mariadb galera cluster.
The problem happens when CTAS replication contains a create table statement followed by row events for populating the table. In such a situation, the galera node operating as mariadb replication slave will first replicate only the create table part into the cluster, and then perform another replication containing both the create table and the row events. This will lead all other nodes to fail on the duplicate table create attempt, and crash due to this failure.

The PR also contains a fix, which identifies the situation when CTAS has been replicated, and scans further in the async replication stream to see if row events follow. The slave node will replicate either a single TOI if the CTAS table is empty, or, if the CTAS table contains rows, a single bundled write set with the create table and row events.

This fix should keep the master server's GTIDs for CTAS replication in sync with the GTIDs in the galera cluster.
2019-11-18 15:18:00 +02:00
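The choice the fix above makes, one TOI for an empty CTAS table or one bundled write set when rows follow, might be modeled as below. Everything here is illustrative: `plan_ctas_replication`, the `"CREATE"` marker, and the `"ROWS"` prefix for row events are hypothetical stand-ins for the real event stream.

```python
def plan_ctas_replication(following_events):
    """Return (method, payload) for replicating a CTAS into the cluster.

    following_events lists the events that follow the create table
    statement in the async replication stream.
    """
    row_events = [ev for ev in following_events if ev.startswith("ROWS")]
    if not row_events:
        return ("TOI", ["CREATE"])                 # empty table: create only
    return ("WRITE_SET", ["CREATE"] + row_events)  # one bundled write set


empty_plan = plan_ctas_replication(["XID"])
rows_plan = plan_ctas_replication(["ROWS:1", "ROWS:2", "XID"])
```

Either way the cluster sees the create table exactly once, which is the property whose absence caused the duplicate table create failure on the other nodes.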
Oleksandr Byelkin
55b2281a5d Merge branch '10.2' into 10.3 2019-10-31 10:58:06 +01:00
Jan Lindström
cd1c10859d Fix test cases that use debug galera library.
Changes to be committed:
	modified:   mysql-test/suite/galera/r/MW-369.result
	modified:   mysql-test/suite/galera/r/MW-402.result
	modified:   mysql-test/suite/galera/r/galera#500.result
	modified:   mysql-test/suite/galera/r/galera_gcs_fragment.result
	modified:   mysql-test/suite/galera/r/mysql-wsrep#332.result
2019-10-30 10:14:56 +02:00
Marko Mäkelä
613e9e7d4d MDEV-20907 Set innodb_log_files_in_group=1 by default
Historically, InnoDB split the redo log into at least 2 files.
MDEV-12061 allowed the minimum to be innodb_log_files_in_group=1,
but it kept the default at innodb_log_files_in_group=2.

Because performance seems to be slightly better with only one log file,
and because implementing an append-only variant of the log would require
a single file, let us define the default to be 1, and have
innodb_log_file_size=96M, to retain the same default total size.
2019-10-28 17:11:10 +02:00
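The "same default total size" claim in the commit above is simple arithmetic. The check below assumes the previous defaults were 2 files of 48M each; that figure is not stated in the commit message and is an assumption here.

```python
MIB = 1024 * 1024

# Assumed previous defaults: innodb_log_files_in_group=2, innodb_log_file_size=48M
old_total = 2 * 48 * MIB

# New defaults from the commit: innodb_log_files_in_group=1, innodb_log_file_size=96M
new_total = 1 * 96 * MIB

same_total = old_total == new_total
```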
Marko Mäkelä
3043f38436 Merge 10.4 into 10.5 2019-10-28 17:10:34 +02:00
Jan Lindström
82f22d2f25 MDEV-18590 galera.versioning_trx_id: Test failure: mysqltest: Result content mismatch
Ignore warning.
2019-10-23 10:12:53 +03:00