1
0
mirror of https://github.com/MariaDB/server.git synced 2025-11-28 17:36:30 +03:00
Commit Graph

3788 Commits

Author SHA1 Message Date
Marko Mäkelä
47ab793d71 Merge 10.3 into 10.4 2021-11-09 08:40:14 +02:00
Marko Mäkelä
524b4a89da Merge 10.2 into 10.3 2021-11-09 08:26:59 +02:00
Oleksandr Byelkin
a19ab67318 Merge branch '10.3' into 10.4 2021-11-05 19:59:58 +01:00
Oleksandr Byelkin
a2f147af35 Merge branch '10.2' into 10.3 2021-11-05 19:58:32 +01:00
Andrei Elkin
561b6c7e51 MDEV-26833 Missed statement rollback in case transaction drops or create temporary table
When transaction creates or drops temporary tables and afterward its statement
faces an error even the transactional table statement's cached ROW
format events get involved into binlog and are visible after the transaction's commit.

Fixed with proper analysis of whether the errored-out statement needs
to be rolled back in binlog.
For instance a fact of already cached CREATE or DROP for temporary
tables by previous statements alone
does not cause to retain the being errored-out statement events in the
cache.
Conversely, if the statement creates or drops a temporary table
itself it can't be rolled back - this rule remains.
2021-11-05 19:33:28 +02:00
Andrei Elkin
42ae765960 MDEV-26833 Missed statement rollback in case transaction drops or create temporary table
When transaction creates or drops temporary tables and afterward its statement
faces an error even the transactional table statement's cached ROW
format events get involved into binlog and are visible after the transaction's commit.

Fixed with proper analysis of whether the errored-out statement needs
to be rolled back in binlog.
For instance a fact of already cached CREATE or DROP for temporary
tables by previous statements alone
does not cause to retain the being errored-out statement events in the
cache.
Conversely, if the statement creates or drops a temporary table
itself it can't be rolled back - this rule remains.
2021-10-28 19:54:03 +03:00
Alice Sherepa
4e5cf34819 rpl_get_master_version_and_clock and rpl_row_big_table_id tests are slow, so let's not run them under valgrind 2021-10-28 11:15:54 +02:00
Marko Mäkelä
489ef007be Merge 10.3 into 10.4 2021-10-21 14:57:00 +03:00
Marko Mäkelä
e4a7c15dd6 Merge 10.2 into 10.3 2021-10-21 13:41:04 +03:00
Brandon Nesterenko
2291f8ef73 MDEV-25284: Assertion `info->type == READ_CACHE || info->type == WRITE_CACHE' failed
Problem:
========
This patch addresses two issues.

First, if a CHANGE MASTER command is issued and an error happens
while locating the replica’s relay logs, the logs can be put into an
invalid state where future updates fail and future CHANGE MASTER
calls crash the server. More specifically, right before a replica
purges the relay logs (part of the `CHANGE MASTER TO` logic), the
relay log is temporarily closed with state LOG_TO_BE_OPENED. If the
server errors in-between the temporary log closure and purge, i.e.
during the function find_log_pos, the log should be closed.
MDEV-25284 reveals the log is not properly closed.

Second, upon issuing a RESET SLAVE ALL command, a slave’s GTID
filters are not cleared (DO_DOMAIN_IDS, IGNORE_DOMIAN_IDS,
IGNORE_SERVER_IDS). MySQL had a similar bug report, Bug #18816897,
which fixed this issue to clear IGNORE_SERVER_IDS after issuing
RESET SLAVE ALL in version 5.7.

Solution:
=========

To fix the first problem, the CHANGE MASTER error handling logic was
extended to transition the relay log state to LOG_CLOSED from
LOG_TO_BE_OPENED.

To fix the second problem, the RESET SLAVE ALL logic is extended to
clear the domain_id filter and ignore_server_ids.

Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
2021-10-18 10:43:51 -06:00
Oleksandr Byelkin
7841a7eb09 Merge branch '10.3' into 10.4 2021-07-31 22:59:58 +02:00
Sergei Golubchik
2575eaa502 dissapear -> disappear 2021-07-26 12:40:01 +02:00
Sergei Golubchik
6190a02f35 Merge branch '10.2' into 10.3 2021-07-21 20:11:07 +02:00
Aleksey Midenkov
22e4baaa5d MDEV-25595 DROP part of failed CREATE OR REPLACE is not written into binary log
Do log_drop_table() in case of failed mysql_prepare_create_table().
2021-07-06 00:47:41 +03:00
Marko Mäkelä
4240704abc Merge 10.3 into 10.4 2021-05-18 08:59:12 +03:00
Marko Mäkelä
ca3f497564 Merge 10.2 into 10.3, except MDEV-25682 2021-05-18 08:40:19 +03:00
Sujatha
410e3c1a9a MDEV-17515: GTID Replication in optimistic mode deadlock
Problem:
=======
In slave_parallel_mode=optimistic configuration, when admin commands and
DML operation on the same table are scheduled simultaneously for execution,
it results in lock conflict and slave server either hangs due to
deadlock or goes down with an assert.

Analysis:
========
Admin commands OPTIMIZE, REPAIR and ANALYZE are written to binary log as
ordinary transactions. When 'slave_parallel_mode' is 'optimistic' DMLs are
allowed to run in parallel. But these locks are not detected by parallel
replication deadlock detection-and-handling mechanism. At times they result
in deadlock or assertion.

Fix:
===
Flag admin commands as DDL in Gtid_log_event at the time of writing to
binary log. Add a new bit EXECUTED_TABLE_ADMIN_CMD to
'm_unsafe_rollback_flags'. During 'mysql_admin_table' command execution it
accepts a list of tables to be processed and executes them in a loop. Upon
successful execution enable 'EXECUTED_TABLE_ADMIN_CMD' bit in
thd->transaction.stmt_unsafe_rollback_flags. Gtid_log_event constructor
will notice this flag and mark the current transaction with 'FL_DDL' flag.
Gtid_log_events marked as FL_DDL will not be scheduled parallel execution,
on the slave. They will execute in isolation to prevent deadlocks.

Note: Removed the call to 'trans_commit_implicit' from 'mysql_admin_table'
function as 'mysql_execute_command' will take care of invoking
'trans_commit_implicit'.
2021-05-17 16:38:58 +05:30
Sachin Kumar
e607f3398c MDEV-25336 Parallel replication causes failed assert while restarting
Problem:- When slave is shutdown, we will get this assertion failure
sql/sql_list.h:642: void ilink::assert_linked(): Assertion `prev != 0
&& next != 0' failed.

Solution:- In close_connections when we call threads.get() it resets to
prev and next to NULL. And in parallel worker thread(handle_rpl_parallel_thread)
calls unlink_not_visible_thd() which assert on prev and next being not NULL.
.unlink_not_visible_thd() should be always called first before threads.get()
is called. To make sure worker calls unlink_not_visible_thd() in
slave_prepare_for_shutdown() we are deactivating the  worker thread pool
which in turn will close all worker threads. Since this is already done in 10.4
and 10.5 I am backPorting MDEV-20821 and MDEV-22370 to 10.2. Mdev-22370
is improving the MDEV-20821 patch.
2021-05-14 11:50:12 +01:00
Andrei Elkin
3616640a31 MDEV-20821 parallel slave server shutdown hang
Parallel slave server shutdown found to be hanging in
close_connections() triggered by shutdown due to a slave worker thread
would not be notified to exit in case the worker was sitting idle.

Fixed with destroying the worker pool earlier that is in
slave_prepare_for_shutdown() when all their driver threads have already left.
A test file is added to simulate the bug condition as well as check
multi-sourced and not-idle worker cases.
2021-05-14 11:49:26 +01:00
Marko Mäkelä
8c73fab7f7 Merge 10.3 into 10.4 2021-05-10 09:52:01 +03:00
Oleksandr Byelkin
72753d2b65 Merge branch 'bb-10.3-release' into 10.3 2021-05-07 11:51:22 +02:00
Nikita Malyavin
509e4990af Merge branch bb-10.3-release into bb-10.4-release 2021-05-05 23:03:01 +03:00
Sujatha
625e44a80d MDEV-25597: Disable rpl_semi_sync_slave_compressed_protocol.test 2021-05-05 15:46:22 +05:30
Nikita Malyavin
a8a925dd22 Merge branch bb-10.2-release into bb-10.3-release 2021-05-04 14:49:31 +03:00
Sujatha
abe6eb10a6 MDEV-16146: MariaDB slave stops with following errors.
Problem:
========
180511 11:07:58 [ERROR] Slave I/O: Unexpected master's heartbeat data:
heartbeat is not compatible with local info;the event's data: log_file_name
mysql-bin.000009 log_pos 1054262041, Error_code: 1623

Analysis:
=========
In replication setup when master server doesn't have any events to send to
slave server it sends an 'Heartbeat_log_event'. This event carries the
current binary log filename and offset details. The offset values is stored
within 4 bytes of event header. When the size of binary log is higher than
UINT32_MAX the log_pos values will not fit in 4 bytes memory.  It overflows
and hence slave stops with an error.

Fix:
===
Since we cannot extend the common_header of Log_event class, a greater than
4GB value of Log_event::log_pos is made to be transported with a HeartBeat
event's sub-header.  Log_event::log_pos in such case is set to zero to
indicate that the 8 byte sub-header is allocated in the event.

In case of cross version replication following behaviour is expected

OLD - Server without fix
NEW - Server with fix

OLD<->NEW : works bidirectionally as long as the binlog offset is
            (normally) within 4GB.

When log_pos > UINT32_MAX
OLD->NEW  : The 'log_pos' is bound to overflow and NEW slave may report
            an invalid event/incompatible heart beat event error.
NEW->OLD  : Since patched server sets log_pos=0 on overflow, OLD slave will
            report invalid event error.
2021-04-30 20:34:31 +05:30
Marko Mäkelä
90a306a7ab Merge 10.3 into 10.4 2021-04-27 08:53:50 +03:00
Sujatha
391f1aa6ee MDEV-24773: slave_compressed_protocol doesn't work properly with semi-sync replication
Back port upstream fix

commit 1800b015a1d487330f7b15f2020b887be348a66b
Author: Venkatesh Duggirala <venkatesh.duggirala@oracle.com>
Date:   Fri Sep 8 20:29:22 2017 +0530

Bug#26027024    SLAVE_COMPRESSED_PROTOCOL DOESN'T WORK WITH
SEMI-SYNC REPLICATION IN MYSQL-5.7

Analysis: In mysql-5.6, dump thread (the thread that is created
on Master after Slave requested for a binlog dump) is also used
to receive acknowledgements from the Slave and act on them accordingly.
For performance reasons, a special thread called Ack Receiver thread
is added in mysql-5.7 Semi synchronous replication plugin.
This thread does not have special handling to receive acknowledgements
if Slave has enabled compression in the protocol. Hence Master is
unable to handle any slave if Slave_compressed_protocol is enabled
on it.

Fix: Enable compress flag on the communication channels if the Slave
has Slave_compressed_protocol ON.
2021-04-26 11:09:39 +05:30
Marko Mäkelä
5008171b05 Merge 10.3 into 10.4 2021-04-14 10:33:59 +03:00
Marko Mäkelä
6e6318b29b Merge 10.2 into 10.3 2021-04-13 10:26:01 +03:00
Julius Goryavsky
d5dacca4c5 Clarified abbreviated option names in some tests, to avoid
problems with ambiguous options in the future.
2021-04-11 02:26:59 +02:00
Marko Mäkelä
44d70c01f0 Merge 10.3 into 10.4 2021-03-19 11:42:44 +02:00
Marko Mäkelä
19052b6deb Merge 10.2 into 10.3 2021-03-18 12:34:48 +02:00
Alice Sherepa
ee12b055ff reenable tests from engines/funcs 2021-03-10 09:12:57 +01:00
Sergei Golubchik
e841957416 Merge branch '10.3' into 10.4 2021-02-23 09:25:57 +01:00
Sergei Golubchik
0ab1e3914c Merge branch '10.2' into 10.3 2021-02-22 22:42:27 +01:00
Alice Sherepa
6f14230643 MDEV-11862 rpl.rpl_semi_sync_event_after_sync 'fails if there is no semisync plugin 2021-02-15 13:41:44 +01:00
Sergei Golubchik
00a313ecf3 Merge branch 'bb-10.3-release' into bb-10.4-release
Note, the fix for "MDEV-23328 Server hang due to Galera lock conflict resolution"
was null-merged. 10.4 version of the fix is coming up separately
2021-02-12 17:44:22 +01:00
Sujatha
eef4c5d378 MDEV-22741: *SAN: ERROR: AddressSanitizer: use-after-poison on address in instrings/strmake.c:36 from change_master (on optimized builds)
Problem:
========
CHANGE MASTER TO MASTER_USER='root', MASTER_SSL=0, MASTER_SSL_CA='',
  MASTER_SSL_CERT='', MASTER_SSL_KEY='', MASTER_SSL_CRL='',
  MASTER_SSL_CRLPATH='';

CHANGE MASTER TO MASTER_USER='root', MASTER_PASSWORD='', MASTER_SSL=0;

use-after-poison is reported for lex_mi->ssl_crl

File: sql_repl.cc

if (lex_mi->ssl_crl)
  strmake_buf(mi->ssl_crl, lex_mi->ssl_crl);

Analysis:
========
At the end of CHANGE MASTER statement execution, the LEX_MASTER_INFO
parameters are reset so that the next query will have a clean state. But
'ssl_crl' and 'ssl_crl_path' members of LEX_MASTER_INFO object are not
cleared during 'LEX_MASTER_INFO::reset'. Hence when a new CHANGE MASTER
statement is executed, the stale value of lex_mi->ssl_crl is used, so ASAN
reports use-after-poison.

Fix:
===
Clear 'ssl_crl' and 'ssl_crl_path' as part of 'reset'.
2021-02-03 12:18:29 +05:30
Sergei Golubchik
60ea09eae6 Merge branch '10.2' into 10.3 2021-02-01 13:49:33 +01:00
Alice Sherepa
0d9c9f49bd reenable rpl_spec_variables.test 2021-01-22 09:17:00 +01:00
Sujatha
eb75e8705d MDEV-8134: The relay-log is not flushed after the slave-relay-log.999999 showed
Problem:
========
Auto purge of relaylogs stops when relay-log-file is
'slave-relay-log.999999' and slave_parallel_threads is enabled.

Analysis:
=========
The problem is that in Relay_log_info::inc_group_relay_log_pos() function,
when two log names are compared via strcmp() function, it gives correct
result, when log name sequence numbers are of same digits(6 digits), But
when the number goes to 7 digits, a 999999 compares greater than
1000000, which is wrong, hence the bug.

Fix:
====
Extract the numeric extension part of the file name, convert it into
unsigned long and compare.

Thanks to David Zhao for the contribution.
2021-01-21 13:00:02 +05:30
Dmitry Shulga
9e4a5a81fc MDEV-24208 SHOW RELAYLOG EVENTS command is not supported in the prepared statement protocol yet
Added sending of metadata in response to preparing request for
the commands SQLCOM_SHOW_BINLOG_EVENTS, SQLCOM_SHOW_RELAYLOG_EVENTS
2021-01-13 16:16:13 +07:00
Sergei Golubchik
0d8bd7cc3a MDEV-18428 Memory: If transactional=0 is specified in CREATE TABLE, it is not possible to ALTER TABLE
* be strict in CREATE TABLE, just like in ALTER TABLE, because
  CREATE TABLE, just like ALTER TABLE, can be rolled back for any engine
* but don't auto-convert warnings into errors for engine warnings
  (handler::create) - this matches ALTER TABLE behavior
* and not when creating a default record, these errors are handled
  specially (and replaced with ER_INVALID_DEFAULT)
* always issue a Note when a non-unique key is truncated, because it's
  not a Warning that can be converted to an Error. Before this commit
  it was a Note for blobs and a Warning for all other data types.
2021-01-11 21:54:47 +01:00
Sergei Golubchik
4568a72ce4 don't do a warning for bad table options in replication slave thread
otherwise ALTER TABLE can break replication
2021-01-11 21:54:47 +01:00
Marko Mäkelä
fd5e103aa4 Merge 10.3 into 10.4 2021-01-11 10:35:06 +02:00
Marko Mäkelä
5a1a714187 Merge 10.2 into 10.3 (except MDEV-17556)
The fix of MDEV-17556 (commit e25623e78a
and commit 61a362c949) has been
omitted due to conflicts and will have to be applied separately later.
2021-01-11 09:41:54 +02:00
Alice Sherepa
df1eefb2ad MDEV-16272 rpl.rpl_semisync_ali_issues failed in buildbot, SHOW variable was done instead of waiting for the value of that variable 2021-01-07 17:53:04 +01:00
Sujatha
608b0ee52e MDEV-23033: All slaves crash once in ~24 hours and loop restart with signal 11
Problem:
=======
Upon deleting or updating a row in a parent table (with primary key), if
the child table has virtual column and an associated key with ON UPDATE
CASCADE/ON DELETE CASCADE, it will result in slave crash.

Analysis:
========
Tables which are related through foreign key require prelocking similar to
triggers. i.e If a table has triggers/foreign keys we should add all tables
and routines used by them to the prelocking set.  This prelocking happens
during 'open_and_lock_tables' call.  Each table being opened is checked for
foreign key references. If foreign key reference exists then the child
table is opened and it is linked to the table_list. Upon any modification
to  parent table its corresponding child tables are retried from table_list
and they are updated accordingly. This prelocking work fine on master.

On slave  prelocking works for following cases.
 - Statement/mixed based replication
 - In row based replication when trigger execution is enabled through
   'slave_run_triggers_for_rbr=YES/LOGGING/ENFORCE'

Otherwise it results in an assert/crash, as the parent table will not find
the corresponding child table and it will be NULL. Dereferencing NULL
pointer leads to slave server exit.

Fix:
===
Introduce a new 'slave_fk_event_map' flag similar to 'trg_event_map'. This
flag will ensure that when foreign key is enabled in row based replication
all the parent and child tables are prelocked, so that parent is able to
locate the child table.

Note: This issue is specific to slave, hence only slave needs to be
      upgraded.
2021-01-04 15:06:12 +05:30
Oleksandr Byelkin
478b83032b Merge branch '10.3' into 10.4 2020-12-25 09:13:28 +01:00
Oleksandr Byelkin
25561435e0 Merge branch '10.2' into 10.3 2020-12-23 19:28:02 +01:00