mirror of https://github.com/MariaDB/server.git synced 2025-09-11 05:52:26 +03:00
Commit Graph

223 Commits

Author SHA1 Message Date
Marko Mäkelä
3da36fa130 Merge 10.6 into 10.11 2025-05-26 08:10:47 +03:00
Jan Lindström
7aed06887b MDEV-36512 : galera_3nodes.GCF-354: certification position less than last committed
Test changes only. Both warnings are expected and
should be suppressed because we intentionally inject
different inconsistencies on two nodes and then join
them back with membership change.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-05-20 20:59:10 +02:00
Jan Lindström
7fd5957d55 MDEV-36622 : Hang during galera_evs_suspect_timeout test
Test changes only. Add wait_condition so that all nodes
are in the expected state and add debug output if issue
does reproduce.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-05-20 20:59:10 +02:00
Julius Goryavsky
74f0b99edf Merge branch '10.6' into '10.11' 2025-04-02 06:33:39 +02:00
Denis Protivensky
c01bff4a10 MDEV-36360: Don't grab table-level X locks for applied inserts
It prevents a crash in wsrep_report_error() which happened when appliers would run
with FK and UK checks disabled and erroneously execute plain inserts as bulk inserts.

Moreover, in release builds such a behavior could lead to deadlocks between two applier
threads if a thread waiting for a table-level lock was ordered before the lock holder.
In that case the lock holder would proceed to commit order and wait forever for the
now-blocked other applier thread to commit before.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-04-02 04:50:30 +02:00
Julius Goryavsky
03c31ab099 Merge branch '10.5' into '10.6' 2025-04-02 04:43:24 +02:00
Julius Goryavsky
fa55b36c1e galera tests: corrections for garbd-related tests 2025-04-02 04:29:40 +02:00
Julius Goryavsky
41565615c5 galera: synchronization changes to stop random test failures 2025-04-02 04:29:34 +02:00
Julius Goryavsky
c61345169a galera tests: synchronization after merge 2025-03-28 02:53:59 +01:00
Marko Mäkelä
ab0f2a00b6 Merge 10.6 into 10.11 2025-03-27 08:01:47 +02:00
Julius Goryavsky
e3d7d5ca26 Merge branch '10.5' into '10.6' 2025-02-27 04:02:33 +01:00
Julius Goryavsky
04d731b6cc galera mtr tests: synchronization between versions
Added fixes to galera tests for issues found during
merging changes from 10.5 to 10.6.
2025-02-26 18:19:28 +01:00
Jan Lindström
94ef07d61e MDEV-32631 : galera_2_cluster: before_rollback(): Assertion `0' failed
Test case changes only. Add wait_conditions to make sure
nodes rejoin the cluster. Assertion itself should not be
possible anymore as we do not allow sequences on
Aria tables.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-02-18 04:51:16 +01:00
Julius Goryavsky
7b040e53cc galera mtr tests: fixes for test failures, 'cosmetic' changes and unification between versions 2025-02-12 12:25:09 +01:00
Julius Goryavsky
c35b6f133a galera mtr tests: synchronization between editions/branches (10.5) 2025-02-12 12:25:09 +01:00
Teemu Ollakka
1b146e8220 galera fix: Donor in non-Primary causes assertion in wsrep-lib
Constructed a test which makes the donor go into non-Primary configuration
before `sst_sent()` is called, causing an assertion in wsrep-lib
if the bug is present.

Updated wsrep-lib to version which contains the fix.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-02-12 12:25:09 +01:00
Julius Goryavsky
c9a6adba1e galera mtr tests: synchronization of tests between branches 2025-02-12 11:30:14 +01:00
Marko Mäkelä
3d23adb766 Merge 10.6 into 10.11 2024-11-29 13:43:17 +02:00
Marko Mäkelä
7d4077cc11 Merge 10.5 into 10.6 2024-11-29 12:37:46 +02:00
Jan Lindström
f39217da0c MDEV-35473 : Sporadic failures in the galera_3nodes.galera_evs_suspect_timeout mtr test
Remove unnecessary sleep and replace it with proper wait_conditions.
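
The difference between a fixed sleep and a wait condition can be sketched as follows. This is a Python analogy, not the mtr test code itself: `wait_condition`, `FakeCluster` and `cluster_size` are hypothetical names invented for illustration. The helper polls a predicate until it holds or a timeout expires, so the test proceeds as soon as the cluster is ready instead of guessing a delay.

```python
import time


def wait_condition(predicate, timeout=30.0, interval=0.1):
    """Poll predicate() until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeCluster:
    """Hypothetical stand-in for a node status query (assumption)."""

    def __init__(self, ready_after_polls):
        self.polls = 0
        self.ready_after_polls = ready_after_polls

    def cluster_size(self):
        # Report 2 nodes until the third one has "joined".
        self.polls += 1
        return 3 if self.polls >= self.ready_after_polls else 2


cluster = FakeCluster(ready_after_polls=3)
joined = wait_condition(lambda: cluster.cluster_size() == 3,
                        timeout=5.0, interval=0.01)
```

A fixed sleep either wastes time when the node joins quickly or is too short on a loaded test host; polling with a timeout avoids both cases.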

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-11-28 01:02:35 +01:00
Thirunarayanan Balathandayuthapani
074831ec61 Merge branch 10.5 into 10.6 2024-11-08 18:17:15 +05:30
Julius Goryavsky
db68eb69f9 MDEV-35344: post-fix correction for other galera tests 2024-11-06 04:59:10 +01:00
Julius Goryavsky
f176248d4b Merge branch '10.6' into '10.11' 2024-09-17 06:23:10 +02:00
Julius Goryavsky
80fff4c6b1 Merge branch '10.5' into '10.6' 2024-09-16 16:39:59 +02:00
Julius Goryavsky
7ee0e60bbb galera mtr tests: minor fixes to make tests more reliable 2024-09-15 05:05:03 +02:00
Julius Goryavsky
b3cc952916 galera tests: updated .result for galera_gtid_2_cluster test 2024-09-03 07:21:43 +02:00
Julius Goryavsky
d058be62b8 Merge branch '10.6' into '10.11' 2024-09-02 03:49:03 +02:00
Julius Goryavsky
bac0804d81 Merge branch '10.5' into '10.6' 2024-09-01 06:51:25 +02:00
Jan Lindström
dd64f29d6b MDEV-33897 : Galera test failure on galera_3nodes.galera_gtid_consistency
Based on logs SST was started before donor reached
Primary state. Add wait_conditions to make sure that
nodes reach Primary state before starting next node.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-01 03:01:37 +02:00
Alexey Yurchenko
731a5aba0b Use only MySQL code for TOI error vote
For TOI events specifically we have a situation where in case of the
same error different nodes may generate different messages. This may
be for two reasons:
 - different locale setting between the current client session and
   server default (we can reasonably require server locales to be
   identical on all nodes, but user can change message locale for the
   session)
 - non-deterministic course of STATEMENT execution e.g. for ALTER TABLE

On the other hand we may reasonably expect TOI event failures since
they are executed after replication, so we must ensure that voting is
consistent. For that purpose error codes should be sufficiently unique
and deterministic for TOI event failures as DDLs normally deal with
a single object, so we can merely use MySQL error codes to vote on.

Notice that this problem does not happen with regular transactional
writesets, since the originator node will always vote success and
replica nodes are assumed to have the same global locale setting.
As such different error messages indicate different errors even if
the error code is the same (e.g. ER_DUP_KEY can happen on different
rows or tables).

Use only MySQL error code (without the error message) for error voting
in case of TOI event failure.
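
A minimal sketch of the idea, under stated assumptions: the function names, the use of SHA-256 and the payload layout are all hypothetical and are not taken from the server or wsrep-lib code. The localized message texts are illustrative only. It shows why a vote built from code plus message can diverge between nodes for the same TOI failure, while a vote built from the code alone cannot.

```python
import hashlib


def vote_payload(error_code, message, is_toi):
    """Build the bytes a node would vote on (hypothetical layout)."""
    if is_toi:
        # TOI failure: vote on the error code only.
        return str(error_code).encode("utf-8")
    # Regular writeset: keep the message, since it may distinguish errors.
    return f"{error_code}:{message}".encode("utf-8")


def vote_digest(error_code, message, is_toi):
    """Digest of the vote payload (SHA-256 is an assumption)."""
    return hashlib.sha256(vote_payload(error_code, message, is_toi)).hexdigest()


# Same error (1050, ER_TABLE_EXISTS_ERROR) reported with different
# session message locales on two nodes; texts are illustrative.
node_a = (1050, "Table 't1' already exists")
node_b = (1050, "Tabelle 't1' bereits vorhanden")

toi_votes_agree = vote_digest(*node_a, is_toi=True) == vote_digest(*node_b, is_toi=True)
full_votes_agree = vote_digest(*node_a, is_toi=False) == vote_digest(*node_b, is_toi=False)
```

Here `toi_votes_agree` is true and `full_votes_agree` is false, which is the inconsistency the commit describes.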

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-01 02:58:27 +02:00
Marko Mäkelä
62bfcfd8b2 Merge 10.6 into 10.11 2024-08-14 11:36:52 +03:00
Marko Mäkelä
757c368139 Merge 10.5 into 10.6 2024-08-14 10:56:11 +03:00
Jan Lindström
71f289e5d1 MDEV-25614 : Galera test failure on GCF-354
Modified node config with longer timeouts for suspect,
inactive, install and wait_prim timeout. Increased
node_1 weight to keep it primary component when
other nodes are voted out.
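
The effect of a higher node weight can be illustrated with a simplified weighted-majority check. This is a sketch under assumptions: the weights below are hypothetical (the commit does not state the values used), and the rule ignores details such as nodes that left gracefully.

```python
def has_quorum(present_weights, previous_weights):
    """Simplified rule: the surviving nodes keep the primary component
    if their total weight is strictly more than half of the total
    weight of the previous primary component."""
    return 2 * sum(present_weights) > sum(previous_weights)


# Hypothetical weights: node_1 is weighted higher than the others.
weighted = {"node_1": 3, "node_2": 1, "node_3": 1}
equal = {"node_1": 1, "node_2": 1, "node_3": 1}

# node_1 alone after the other two nodes are voted out.
node_1_keeps_primary = has_quorum([weighted["node_1"]], weighted.values())
node_1_equal_weights = has_quorum([equal["node_1"]], equal.values())
```

With the hypothetical weights, node_1 alone holds 3 of 5 and stays primary; with equal weights it holds 1 of 3 and does not.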

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-08-04 17:59:13 +02:00
Jan Lindström
cb80ef93a9 MDEV-32778 : galera_ssl_reload failed with warning message
Fixed used configuration and added suppression for warning
message. Test case changes only.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-08-04 17:54:05 +02:00
Oleksandr Byelkin
0fe39d368a Merge branch '10.6' into 10.11 2024-07-22 15:14:50 +02:00
Julius Goryavsky
4026f04425 Merge branch 10.5 into 10.6 2024-07-09 11:56:47 +02:00
Julius Goryavsky
d0a2d4e755 galera mtr tests: correction of inaccuracies in warnings suppressions 2024-07-08 23:36:21 +02:00
Marko Mäkelä
27a3366663 Merge 10.6 into 10.11 2024-06-27 10:26:09 +03:00
Jan Lindström
ee974ca5e0 MDEV-31658 : Deadlock found when trying to get lock during applying
The problem was that there were two non-conflicting local idle
transactions in node_1 that both inserted a key into the primary key.
Then two transactions from other nodes also inserted
a key into the primary key, so that the insert from node_2 conflicted
with one of the local transactions in node_1 and there would
be a duplicate key if both were committed. For this, the insert
from the other node tries to acquire an S-lock for this record,
and because this insert is a high priority brute force (BF)
transaction, it will kill the idle local transaction.

Concurrently, second insert from node_3 conflicts the second
idle insert transaction in node_1. Again, it tries to acquire
S-lock for this record and kills idle local transaction.

At this point we have two non-conflicting high priority
transactions holding S-lock on different records in node_1.
For example like this: rec s-lock-node2-rec s-lock-node3-rec rec.

Because these high priority BF-transactions do not wait for
each other, the insert from node3, which has a later seqno than
the insert from node2, can continue. It will try to acquire
an insert intention lock for the record it tries to insert (to avoid
a duplicate key being inserted by a local transaction). However,
it will note that there is a conflicting S-lock in the same gap
between records. This will lead to a deadlock error, as we have
defined that BF-transactions may not wait for a record lock,
but we can't kill the conflicting BF-transaction because
it has a lower seqno and should commit first.

BF-transactions are executed concurrently because their
values to primary key are different i.e. they do not
conflict.

Galera certification will make sure that inserts from
other nodes i.e these high priority BF-transactions
can't insert duplicate keys. Local transactions naturally
can but they will be killed when BF-transaction
acquires required record locks.

Therefore, we can allow situation where there is conflicting
S-lock and insert intention lock regardless of their seqno
order and let both continue with no wait. This will lead
to situation where we need to allow BF-transaction
to wait when lock_rec_has_to_wait_in_queue is called
because this function is also called from
lock_rec_queue_validate and because lock is waiting
there would be assertion in ut_a(lock->is_gap()
|| lock_rec_has_to_wait_in_queue(cell, lock));

lock_wait_wsrep_kill
  Add debug sync points for BF-transactions killing
  local transaction.

wsrep_assert_no_bf_bf_wait
  Print also requested lock information

lock_rec_has_to_wait
  Add function to handle wsrep transaction lock wait
  cases.

lock_rec_has_to_wait_wsrep
  New function to handle wsrep transaction lock wait
  exceptions.

lock_rec_has_to_wait_in_queue
  Remove wsrep exception, in this function all
  conflicting locks need to wait in queue.
  Conflicts between BF and local transactions
  are handled in lock_wait.
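
The exception described above can be pictured with a toy model. This is not the InnoDB implementation of lock_rec_has_to_wait_wsrep: the `Lock` type, its fields and `bf_locks_may_coexist` are hypothetical, and real lock modes and flags are reduced to a single `kind` string. The sketch only encodes the stated rule that two BF-transactions holding an S-lock and an insert intention lock may both continue, regardless of seqno order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lock:
    """Toy record lock (hypothetical; not an InnoDB structure)."""
    is_bf: bool   # owned by a high priority (BF) transaction
    seqno: int    # replication order of the owning transaction
    kind: str     # "S", "X" or "INSERT_INTENTION"


def bf_locks_may_coexist(requested, held):
    """True if, in this toy model, the requester need not wait."""
    if not (requested.is_bf and held.is_bf):
        # Conflicts involving local transactions are handled elsewhere.
        return False
    # S-lock versus insert intention between two BF transactions:
    # allowed in either seqno order.
    return {requested.kind, held.kind} == {"S", "INSERT_INTENTION"}


s_lock_node2 = Lock(is_bf=True, seqno=10, kind="S")
insert_node3 = Lock(is_bf=True, seqno=11, kind="INSERT_INTENTION")
local_s_lock = Lock(is_bf=False, seqno=0, kind="S")
```

In this model `bf_locks_may_coexist(insert_node3, s_lock_node2)` holds, so the later insert does not hit the deadlock error, while a pair involving `local_s_lock` falls through to the normal conflict handling.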

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-06-19 14:09:11 +02:00
Marko Mäkelä
b81d717387 Merge 10.6 into 10.11 2024-06-11 12:50:10 +03:00
Marko Mäkelä
a687cf8661 Merge 10.5 into 10.6 2024-06-07 10:03:51 +03:00
Denis Protivensky
a4838721a2 MDEV-32633: Fix Galera cluster <-> native replication interaction
GTID events are applied without a running server transaction,
we need to set next transaction ID for Wsrep transaction.

The whole Galera cluster now has a single GTID value (including
the server ID throughout the cluster), fix the config accordingly.

Add force restart so that repeated MTR test execution prints
consistent GTID values, otherwise they would have been recovered
from the previous run.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-06-03 09:48:13 +02:00
Denis Protivensky
0cc9b49751 MDEV-32633: Fix Galera cluster <-> native replication interaction
It's possible to establish Galera multi-cluster setups connected
through the native replication when every Galera cluster is configured
to have a separate domain ID.
For this setup to work, we need to replace domain ID values in generated
GTID events when they are written at transaction commit to the values
configured by Wsrep replication.

At the same time, it's possible that the GTID event already contains
a correct domain ID if it comes through the native replication from
another Galera cluster.
In this case, when such an event is applied either through a native
replication slave thread or through Wsrep applier, we write GTID event
on transaction start and avoid writing it during transaction commit.

The code contained multiple problems that were fixed:
- applying GTID events didn't work because it's applied without a
running server transaction and Wsrep transaction was not started
- GTID event generation on transaction start didn't contain proper
"standalone" and "is_transactional" flags that the original applied
GTID event contained
- condition determining that GTID event is written on transaction start
to avoid writing it on commit relied on the fact that the GTID event
is the first found in transaction/statement caches, which wasn't the
case and resulted in duplicate GTID events written
- instead of relying on the caches to find a GTID event, a simple check
is introduced that follows the exact rules for checking if event is
written at transaction start as described above
- the test case is improved to check that exact GTID events are
applied after two Galera clusters have synced.
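
The rule for when the GTID event is written can be sketched as a small decision function. This is an illustration under assumptions, not the server code: `Gtid`, `plan_gtid_write` and the field names are hypothetical. An applied GTID event is written unchanged at transaction start, with its flags preserved; otherwise the locally generated GTID is written at commit with its domain ID replaced by the one configured for Wsrep replication.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Gtid:
    """Toy GTID event (hypothetical fields)."""
    domain_id: int
    server_id: int
    seq_no: int
    standalone: bool = False
    is_transactional: bool = True


def plan_gtid_write(generated: Gtid, applied: Optional[Gtid],
                    wsrep_domain_id: int) -> Tuple[str, Gtid]:
    """Return (when, gtid): when is "start" or "commit"."""
    if applied is not None:
        # Applied through a replication slave thread or a Wsrep applier:
        # write the original event at transaction start, flags included,
        # and do not write another one at commit.
        return ("start", applied)
    # Locally generated: write at commit with the cluster's domain ID.
    return ("commit", replace(generated, domain_id=wsrep_domain_id))


local = Gtid(domain_id=0, server_id=1, seq_no=5)
remote = Gtid(domain_id=2, server_id=7, seq_no=9, standalone=True)

local_plan = plan_gtid_write(local, None, wsrep_domain_id=1)
applied_plan = plan_gtid_write(local, remote, wsrep_domain_id=1)
```

Deciding from the presence of an applied event, rather than from what sits first in a cache, mirrors the "simple check" the commit message mentions and avoids writing the GTID twice.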

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-06-03 09:48:13 +02:00
Sergei Golubchik
0aae11ac28 Merge branch '10.6' into 10.11 2024-04-30 16:56:49 +02:00
Sergei Golubchik
c1f3eff53f Merge branch '10.5' into 10.6 2024-04-29 10:08:58 +02:00
Jan Lindström
b3e531a3cc MDEV-33896 : Galera test failure on galera_3nodes.MDEV-29171
Based on logs we might start SST before donor has reached
Primary state. Because this test shuts down all nodes we
need to make sure when we start nodes that previous nodes
have reached Primary state and joined the cluster.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-04-25 16:32:06 +02:00
Jan Lindström
baec63e304 MDEV-33787 : Fix Galera test failures on 10.11 2024-04-03 10:04:40 +03:00
Marko Mäkelä
64cce8d5bf Merge 10.6 into 10.11 2024-02-14 16:12:53 +02:00
Marko Mäkelä
691f923906 Merge 10.5 into 10.6 2024-02-13 20:42:59 +02:00
Marko Mäkelä
8ec12e0d6d Merge 10.4 into 10.5 2024-02-12 11:38:13 +02:00