1
0
mirror of https://github.com/MariaDB/server.git synced 2025-11-09 11:41:36 +03:00
Commit Graph

654 Commits

Author SHA1 Message Date
Marko Mäkelä
426c2a6ca1 Merge 10.11 into 11.4 2025-10-07 13:01:57 +03:00
Hemant Dangi
1dd4889d95 MDEV-25039: MDL BF-BF conflict because of foreign key
Issue:
The Commit: 0584846 'Add TL_FIRST_WRITE in SQL layer for determining R/W'
limits INSERT statements on write nodes to acquire MDL locks on it's all child
tables and thereby wsrep certification keys added, but on applier nodes it does
acquire MDL locks for all child tables. This can result into MDL BF-BF conflict
on applier node when transactions referring to parent and child tables are
executed concurrently. For example:

Tables with foreign keys: t1<-t2<-t3<-t4
Conflicting transactions: INSERT t1 and DROP TABLE t4

Wsrep certification keys taken on write node:
- for INSERT t1: t1 and t2
- for DROP TABLE t4: t4

On applier node MDL BF-BF conflict happened between two transaction because
MDL locks on t1, t2, t3 and t4 were taken for INSERT t1, which conflicted
with MDL lock on t4 taken by DROP TABLE t4.
The Wsrep certification keys helps in resolving this MDL BF-BF conflict by
prioritizing and scheduling concurrent transactions. But to generate Wsrep
certification keys it needs to open and take MDL locks on all the child tables.

The Commit: 0584846 change limits MDL lock to be taken on all child nodes for
read-only FK checks (INSERT t1). But this doesn't works on applier nodes
because Write_rows_log_event event logged for INSERT is also used to record
update (check Write_rows_log_event::get_trg_event_map()), and
therefore MDL locks is taken for all the child tables on applier node
for update and insert event.

Solution:
Additional keys for the referenced/foreign table needs to be added
to avoid potential MDL conflicts with concurrent update and DDLs.
2025-10-07 14:08:35 +05:30
Marko Mäkelä
e8ef8c0055 Merge 10.11 into 11.4 2025-09-24 13:40:09 +03:00
Jan Lindström
f9bdff6162 MDEV-37373 : InnoDB partition table disallow local GTIDs in galera
Problem was that for partitioned tables base table storage engine
is DB_TYPE_PARTITION_DB and naturally different than DB_TYPE_INNODB
so operation was not allowed in Galera.

Fixed by requesting implementing storage engine for partitioned
tables i.e. table->file->partition_ht() or if that does not exist
we can use base table storage engine. Resulting storage engine
type is then used on condition is operation allowed when
wsrep_mode=DISALLOW_LOCAL_GTID or not. Operations to InnoDB
storage engine i.e DB_TYPE_INNODB should be allowed.
2025-09-23 13:17:00 +03:00
Marko Mäkelä
257f4b30ef Merge 10.11 into 11.4 2025-09-03 10:32:56 +03:00
Julius Goryavsky
186b0643f4 Merge branch '10.6' into '10.11' 2025-08-14 03:15:04 +02:00
Julius Goryavsky
858dc5bd5c galera: fixed duplicated debug checkpoint name 2025-08-13 20:43:27 +02:00
Sergei Golubchik
c4ed889b74 Merge branch '10.11' into 11.4 2025-07-28 19:40:10 +02:00
Sergei Golubchik
053f9bcb5b Merge branch '10.6' into 10.11 2025-07-28 18:06:31 +02:00
Jan Lindström
f41acb555d MDEV-35523 : Server crashes with "WSREP: Unknown writeset version: -1"
Cluster configuration was incorrect e.g. wsrep_node_address
was missing. Therefore, Galera replication was not properly
initialized and TOI is not supported.

Fix is to check when user tries to start Galera replication
with wsrep_on=ON that Galera replication is properly
initialized and node is ready to receive operations. If
Galera replication is not properly initialized return
a error.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-06-30 00:52:34 +02:00
Oleksandr Byelkin
89c7e2b9c7 Merge branch '10.11' into 11.4 2025-06-17 09:50:22 +02:00
Oleksandr Byelkin
28d6530571 Merge branch '10.6' into 10.11 2025-06-04 14:09:23 +02:00
Monty
22024da64e MDEV-36143 Row event replication with Aria does not honour BLOCK_COMMIT
This commit fixes a bug where Aria tables are used in
(master->slave1->slave2) and a backup is taken on slave2. In this case
it is possible that the replication position in the backup, stored in
mysql.gtid_slave_pos, will be wrong. This will lead to replication
errors if one is trying to use the backup as a new slave.

Analyze:
Replicated row events are committed with trans_commit_stmt() and
thd->transaction->all.ha_list != 0.
This means that backup_commit_lock is not taken for Aria tables,
which means the rows are committed and binary logged on the slave
under BLOCK_COMMIT which should not happen.

This issue does not occur on the master as thd->transaction->all.ha_list
is == 0 under AUTO_COMMIT, which sets 'is_real_trans' and 'rw_trans'
which in turn causes backup_commit_lock to be taken.

Fixed by checking in ha_check_and_coalesce_trx_read_only() if all handlers
supports rollback and if not, then wait for BLOCK_COMMIT also for
statement commit.
2025-06-02 14:02:53 +03:00
Marko Mäkelä
3da36fa130 Merge 10.6 into 10.11 2025-05-26 08:10:47 +03:00
Hemant Dangi
d38558f99b MDEV-36117: MDL BF-BF conflict on ALTER and UPDATE with multi-level foreign key parents
Issue:
Mariadb acquires additional MDL locks on UPDATE/INSERT/DELETE statements
on table with foreign keys. For example, table t1 references t2, an
UPDATE to t1 will MDL lock t2 in addition to t1.
A replica may deliver an ALTER t1 and UPDATE t2 concurrently for
applying. Then the UPDATE may acquire MDL lock for t1, followed by a
conflict when the ALTER attempts to MDL lock on t1. Causing a BF-BF
conflict.

Solution:
Additional keys for the referenced/foreign table needs to be added
to avoid potential MDL conflicts with concurrent update and DDLs.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-05-20 20:59:10 +02:00
Marko Mäkelä
3ae8f114e2 Merge 10.11 into 11.4 2025-04-02 10:15:08 +03:00
Julius Goryavsky
74f0b99edf Merge branch '10.6' into '10.11' 2025-04-02 06:33:39 +02:00
Julius Goryavsky
b983a911e9 galera mtr tests: synchronization between branches and editions 2025-04-02 04:50:11 +02:00
Julius Goryavsky
03c31ab099 Merge branch '10.5' into '10.6' 2025-04-02 04:43:24 +02:00
Denis Protivensky
dd54ce9e10 MDEV-36116: Remove debug assert in TOI when executing thread is killed
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-04-02 04:29:40 +02:00
Julius Goryavsky
41565615c5 galera: synchronization changes to stop random test failures 2025-04-02 04:29:34 +02:00
Marko Mäkelä
f5bd250f5b Merge 10.11 into 11.4 2025-03-28 13:55:21 +02:00
Julius Goryavsky
c61345169a galera tests: synchronization after merge 2025-03-28 02:53:59 +01:00
Marko Mäkelä
ab0f2a00b6 Merge 10.6 into 10.11 2025-03-27 08:01:47 +02:00
Marko Mäkelä
49a6baec56 Merge 10.11 into 11.4 2025-03-03 11:07:56 +02:00
Julius Goryavsky
e3d7d5ca26 Merge branch '10.5' into '10.6' 2025-02-27 04:02:33 +01:00
Marko Mäkelä
0c204bfb87 Merge 10.6 into 10.11 2025-02-25 10:23:24 +02:00
Jan Lindström
b167730499 MDEV-34891 : SST failure occurs when gtid_strict_mode is enabled
Problem was that initial GTID was set on wsrep_before_prepare
out-of-order. In practice GTID was set to same as previous
executed transaction GTID. In recovery valid GTID was found
from prepared transaction and this transaction is committed
leading to fact that same GTID was executed twice.

This is fixed by setting invalid GTID at wsrep_before_prepare
and later in wsrep_before_commit actual correct GTID is set
and this setting is done while we are in commit monitor i.e.
assigment is done in order of replication.

In recovery if prepared transaction is found we check its
GTID, if it is invalid transaction will be rolled back
and if it is valid it will be committed.

Initialize gtid seqno from recovered seqno when
bootstrapping a new cluster.

Added two test cases for both mariabackup and rsync SST methods
to show that GTIDs remain consistent on cluster and that
all expected rows are in the table.

Added tests for wsrep GTID recovery with binlog on and off.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-02-18 19:30:04 +01:00
Jan Lindström
bb64a51037 MDEV-35941 : galera_bf_abort_lock_table fails with wait for metadata lock
Problem was missing case from wsrep_handle_mdl_conflict. Test case
was trying to confirm that LOCK TABLE thread is not BF-aborted.
However as case was missing it was BF-aborted. Test case passed
because BF-aborting takes time and used wait condition might
see expected thread status before it was BF-aborted. Test naturally
failed if BF-aborting was done early enough.

Fix is to add missing case for SQLCOM_LOCK_TABLES to
wsrep_handle_mdl_conflict.

Note that using LOCK TABLE is still not recomended on cluster
because it could cause cluster hang.

This is a 10.5 specific commit that will then be overridden by
another one for 10.6+.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-02-12 13:35:47 +01:00
Julius Goryavsky
7b040e53cc galera mtr tests: fixes for test failures, 'cosmetic' changes and unification between versions 2025-02-12 12:25:09 +01:00
Jan Lindström
3009b5439d MDEV-35941 : galera_bf_abort_lock_table fails with wait for metadata lock
Problem was missing case from wsrep_handle_mdl_conflict. Test case
was trying to confirm that LOCK TABLE thread is not BF-aborted.
However as case was missing it was BF-aborted. Test case passed
because BF-aborting takes time and used wait condition might
see expected thread status before it was BF-aborted. Test naturally
failed if BF-aborting was done early enough.

Fix is to add missing case for SQLCOM_LOCK_TABLES to
wsrep_handle_mdl_conflict.

Note that using LOCK TABLE is still not recomended on cluster
because it could cause cluster hang.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-02-12 01:23:41 +01:00
Marko Mäkelä
565a0cebd8 Merge 10.6 into 10.11 2025-02-10 14:45:18 +02:00
Julius Goryavsky
d6f31ed263 Merge branch '10.5' into '10.6' 2025-02-03 10:44:13 +01:00
Daniele Sciascia
10fd2c207a MDEV-35946 Assertion `thd->is_error()' failed in Sql_cmd_dml::prepare
Fix a regression that caused assertion thd->is_error() after
sync wait failures. If wsrep_sync_wait() fails make sure a appropriate
error is set. Partially revert 75dd0246f8.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-02-03 10:03:50 +01:00
Julius Goryavsky
72f21560d5 Merge branch '10.6' into '10.11' 2025-02-02 23:17:20 +01:00
Julius Goryavsky
53c693ec2f Merge branch '10.5' into '10.6' 2025-02-02 12:55:16 +01:00
Jan Lindström
22414d2ed0 MDEV-27861: Creating partitioned tables should not be allowed with wsrep_osu_method=TOI and wsrep_strict_ddl=ON
Problem was incorrect handling of partitioned tables,
because db_type == DB_TYPE_PARTITION_DB
wsrep_should_replicate_ddl incorrectly marked
DDL as not replicatable. However, in partitioned
tables we should check implementing storage engine
from table->file->partition_ht() if available because
if partition handler is InnoDB all DDL should be allowed
even with wsrep_strict_ddl. For other storage engines
DDL should not be allowed and error should be issued.

This is 10.5 version of the fix.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-02-02 04:54:42 +01:00
Sergei Golubchik
7d657fda64 Merge branch '10.11 into 11.4 2025-01-30 12:01:11 +01:00
Sergei Golubchik
e69f8cae1a Merge branch '10.6' into 10.11 2025-01-30 11:55:13 +01:00
Sergei Golubchik
066e8d6aea Merge branch '10.5' into 10.6 2025-01-29 11:17:38 +01:00
Hemant Dangi
c0b11e75ff MDEV-34218: Mariadb Galera cluster fails when replicating from Mysql 5.7 on use of DDL
Issue:
Mariadb Galera cluster fails to replicate from
Mysql 5.7 when configured with MASTER_USE_GTID=no
option for CHANGE MASTER.

HOST: mysql, mysql 5.7.44 binlog_format=ROW
HOST: m1, mariadb 10.6 GALERA NODE replicating from HOST mysql, Using_Gtid: No (log file and position)
HOST: m2 mariadb 10.6 GALERA NODE
HOST: m3 mariadb 10.6 GALERA NODE

Error on m1:
2024-05-22 16:11:07 1 [ERROR] WSREP: Vote 0 (success) on 78cebda7-1876-11ef-896b-8a58fca50d36:2565 is inconsistent with group. Leaving cluster.

Error on m2 and m3:
2024-05-22 16:11:06 2 [ERROR] Error in Log_event::read_log_event(): 'Found invalid event in binary log', data_len: 42, event_type: -94
2024-05-22 16:11:06 2 [ERROR] WSREP: applier could not read binlog event, seqno: 2565, len: 482

It fails in Gtid_log_event::is_valid() check on
secondary node when sequence number sent from primary
is 0. On primary for applier or slave thread sequence
number is set to 0, when both thd->variables.gtid_seq_no
and thd->variables.wsrep_gtid_seq_no have value 0.

Solution:
Skip adding Gtid Event on primary for applier or
slave thread when both thd->variables.gtid_seq_no
and thd->variables.wsrep_gtid_seq_no have value 0.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-01-27 17:17:11 +01:00
Nikita Malyavin
765458c93c fix my_error usage 2025-01-26 16:15:46 +01:00
sjaakola
552cba92de MDEV-35710 support for threadpool
When client connections use threadpool, i.e. configuration has:
thread_handling = pool-of-threads it turned out that during wsrep
replication shutdown, not all client connections could be closed.
Reason was that some client threads has stmt_da in state DA_EOF,
and this state was earlier used to detect if client connection
was issuing SHUTDOWN command.

To fix this, the connection executing SHUTDOWN is now detected by
looking at the actual command being executed:
thd->get_command() == COM_SHUTDOWN
During replication shutdown, all other connections but the
SHUTDOWN executor, are terminated.

This commit has new mtr test galera.galera_threadpool, which
opens a number of threadpool client connections, and then
restarts the node to verify that connections in threadpool are
terminated during shutdown.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-01-24 17:09:34 +01:00
Julius Goryavsky
50cf189717 MDEV-35018 addendum: improved warnings handling
Fixed regression after original MDEV-35018 fix related
to warnings appearing when running MDEV-26266 test and
other possible problems with warnings.
2025-01-24 17:09:21 +01:00
Daniele Sciascia
841a7d391b MDEV-35018 MDL BF-BF conflict on DROP TABLE
DROP TABLE on child and UPDATE of parent table can cause an MDL BF-BF
conflict when applied concurrently.
DROP TABLE takes MDL locks on both child and its parent table, however
it only it did not add certification keys for the parent table.
This patch adds the following:
 * Append certification keys corresponding to all parent tables
   before DROP TABLE replication.
 * Fix wsrep_append_fk_parent_table() so that it works when it is
   given a table list containing temporary tables.
 * Make sure function wsrep_append_fk_parent_table() is only called
   for local transaction. That was not the case for ALTER TABLE.
 * Add a test case that verifies that UPDATE parent depends on
   preceeding DROP TABLE child.
 * Adapt galera_ddl_fk_conflict test to work with DROP TABLE as well.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-01-24 17:02:44 +01:00
Marko Mäkelä
17f01186f5 Merge 10.11 into 11.4 2025-01-09 07:58:08 +02:00
Marko Mäkelä
3f914afd3a Merge 10.6 into 10.11 2025-01-02 12:39:56 +02:00
Julius Goryavsky
3cd9f9d1b3 Merge branch '10.5' into '10.6' 2024-12-18 05:09:23 +01:00
Daniele Sciascia
75dd0246f8 Remove error handling from wsrep_sync_wait()
Let the wsrep-lib error be set/overriden at the end of
dispatch_command().

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-12-17 09:52:32 +01:00
Kristian Nielsen
0f47db8525 Merge 10.11 -> 11.4
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-12-05 11:01:42 +01:00