mirror of https://github.com/MariaDB/server.git synced 2025-09-13 13:47:59 +03:00
Commit Graph

37 Commits

Author SHA1 Message Date
Marko Mäkelä
ab0f2a00b6 Merge 10.6 into 10.11 2025-03-27 08:01:47 +02:00
Marko Mäkelä
191209d8ab Merge 10.5 into 10.6 2025-03-26 17:09:57 +02:00
Kristian Nielsen
b6b6bb8d36 Fix sporadic failures of rpl.rpl_gtid_crash
 - Suppress a couple of errors the slave can get as the master crashes.

 - The mysql-test-run occasionally takes 120 seconds between crashing
   the master and starting it back up for some (unknown) reason. For
   now, work around that by letting the slave try for 500 seconds to
   connect to master before giving up instead of only 100 seconds.

Reviewed-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2025-03-15 11:15:36 +01:00
Marko Mäkelä
788953463d Merge 10.6 into 10.11
Some fixes related to commit f838b2d799 and
Rows_log_event::do_apply_event() and Update_rows_log_event::do_exec_row()
for system-versioned tables were provided by Nikita Malyavin.
This was required by test versioning.rpl,trx_id,row.
2024-03-28 09:16:57 +02:00
Monty
567c097359 MDEV-33582 Add more warnings to be able to better diagnose network issues
Warnings are added to net_serv.cc when
global_system_variables.log_warnings >= 4.

When the above condition holds then:
- All communication errors from net_serv.cc are also written to the
  error log.
- When a packet cannot be read or written, a more detailed error is
  given.

Other things:
- Added detection of slaves that have hung up to Ack_receiver::run()
- vio_close() now marks the socket as closed before closing it.
  This ensures that a connection that gets a read error can check
  whether the reason was that the socket was closed.
- Add a new state to vio to be able to detect if vio is active, shutdown or
  closed. This is used to detect if the socket is closed by another thread.
- Testing of the new warnings is done in rpl_get_lock.test
- Suppress some of the new warnings in mtr to allow one to run some of
  the tests with --mysqld=--log-warnings=4. All tests in the 'rpl' suite
  can now be run with this option.
- Ensure that global.log_warnings is restored at test end in a way
  that allows one to use mtr --mysqld=--log-warnings=4.

Reviewed-by: <serg@mariadb.org>,<brandon.nesterenko@mariadb.com>
2024-03-05 20:19:49 +02:00
Oleksandr Byelkin
ced243a099 Merge branch '10.9' into 10.10 2023-08-05 20:34:09 +02:00
Angelique
996b040f93 MDEV-30232: Increase timeouts to fix sporadic fails 2023-05-15 14:22:23 +00:00
asklavou
7aace5d5da MDEV-28839: remove current_pos where not intentionally being tested
Task:
=====
Update tests to reflect MDEV-20122, deprecation of master_use_gtid=current_pos.
Change Master (CM) statements were either removed or modified with
current_pos --> slave_pos based on original intention of the test.
Reviewed by:
============
Brandon Nesterenko <brandon.nesterenko@mariadb.com>
2023-02-13 21:04:52 +00:00
Brandon Nesterenko
90c3b2835d MDEV-20122: Deprecate MASTER_USE_GTID=Current_Pos to favor new MASTER_DEMOTE_TO_SLAVE option
New Feature:
========
This feature adds a safe replacement for the
MASTER_USE_GTID=Current_Pos option of CHANGE MASTER TO:
MASTER_DEMOTE_TO_SLAVE=<bool>. The use case of Current_Pos is to
transition a master to become a slave; however, it can break the
replication state if the slave executes local transactions, because
gtid_current_pos is actively updated from both gtid_binlog_pos and
gtid_slave_pos.

MASTER_DEMOTE_TO_SLAVE changes this use case by forcing users to set
Using_Gtid=Slave_Pos and merging gtid_binlog_pos into gtid_slave_pos
once at CHANGE MASTER TO time. Note that if gtid_slave_pos is more
recent than gtid_binlog_pos (as in the case of chain replication),
the replication state should be preserved.

Additionally, deprecate the `Current_Pos` option of MASTER_USE_GTID
to suggest the safe alternative option MASTER_DEMOTE_TO_SLAVE=TRUE.

Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
2022-07-26 16:35:24 -06:00
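The one-time merge of gtid_binlog_pos into gtid_slave_pos described in MDEV-20122 can be pictured with a small sketch. This is a simplified model, not the server's implementation: it assumes GTID positions are comma-separated `domain-server-seq` strings and that "more recent" means the higher sequence number within a domain. The function names are hypothetical.

```python
def parse_gtid_pos(pos):
    """Parse 'domain-server-seq,...' into {domain_id: (domain, server, seq)}."""
    result = {}
    for item in pos.split(","):
        item = item.strip()
        if not item:
            continue
        domain, server, seq = (int(part) for part in item.split("-"))
        result[domain] = (domain, server, seq)
    return result


def format_gtid_pos(state):
    """Render the state back to a string, ordered by domain id."""
    return ",".join("-".join(str(n) for n in state[d]) for d in sorted(state))


def merge_gtid_pos(slave_pos, binlog_pos):
    """Assumed rule: per domain, keep whichever GTID has the higher seq_no.

    A slave position that is already ahead of the binlog position is kept,
    which mirrors the chain-replication note in the commit message.
    """
    merged = parse_gtid_pos(slave_pos)
    for domain, gtid in parse_gtid_pos(binlog_pos).items():
        if domain not in merged or gtid[2] > merged[domain][2]:
            merged[domain] = gtid
    return format_gtid_pos(merged)
```

For example, merging a slave position of `0-1-100,1-2-50` with a binlog position of `0-3-120` keeps the newer binlog GTID for domain 0 and leaves domain 1 untouched.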
Marko Mäkelä
d9d9c30b70 Merge 10.2 into 10.3 2020-09-22 21:12:48 +03:00
Marko Mäkelä
9d0ee2dcb7 Merge 10.1 into 10.2 2020-09-22 15:21:43 +03:00
Sujatha
a8f6bbb7a8 MDEV-9501: rpl.rpl_binlog_index, rpl.rpl_gtid_crash, rpl.rpl_stm_multi_query fail sporadically in buildbot with Master command COM_REGISTER_SLAVE failed
Analysis:
========
The slave server sends the COM_REGISTER_SLAVE command when establishing
a connection to the master. If the master is down, the command fails and a
"COM_REGISTER_SLAVE failed" warning is reported.

'rpl_binlog_index.test' shuts down the master, relocates the binary logs to a
new location, and attempts to start the master by pointing 'log-bin' to the new
location. During this process the slave threads are active. The IO thread
actively checks for the presence of the master; when it finds that the
connection is lost, it attempts a reconnect, and because the master is down the
COM_REGISTER_SLAVE command fails.

As part of the fix, stop the slave threads, then shut down the master and do the
binlog relocation. Once the master is restarted, start the slave threads and sync
them with the master. In the test, the binary logs and index files on the master
are relocated to /tmpdir, but during master restart only the --log-bin option is
provided, which is incorrect. --log-bin-index should also point to /tmpdir;
otherwise two index files are created upon master restart: one master-bin.index
in /tmpdir and a new master-bin.index as per log_basename in datadir. Because of
this the slave fails to connect to the master.

'rpl_gtid_crash.test' tests the following scenario: "crashing master, causing slave
IO thread to reconnect while SQL thread is running". When the IO thread tries to
connect to the crashed master on slow platforms, the COM_REGISTER_SLAVE command
fails. This is expected, hence the warning should be added to the suppression list.
2020-09-07 17:23:45 +05:30
Kristian Nielsen
1b54cb3b77 MDEV-12179: Per-engine mysql.gtid_slave_pos table
Intermediate commit.

Update some existing test cases to work with the new handling of
mysql.gtid_slave_pos* tables:

 - The tables are now checked during START SLAVE, which causes some
   errors or error injections to trigger differently.

 - Some test cases that play games with renaming or altering the
   mysql.gtid_slave_pos table need adjustments.
2017-04-21 10:30:17 +02:00
Marko Mäkelä
b05bf8ff0f Merge 10.1 to 10.2.
Most notably, this includes MDEV-11623, which includes a fix and
an upgrade procedure for the InnoDB file format incompatibility
that is present in MariaDB Server 10.1.0 through 10.1.20.

In other words, this merge should address
MDEV-11202 InnoDB 10.1 -> 10.2 migration does not work
2017-01-19 12:06:13 +02:00
Marko Mäkelä
a9d00db155 MDEV-11799 InnoDB can abort if the doublewrite buffer
contains a bad and a good copy

Clean up the InnoDB doublewrite buffer code.

buf_dblwr_init_or_load_pages(): Do not add empty pages to the buffer.

buf_dblwr_process(): Do consider changes to pages that are all zero.
Do not abort when finding a corrupted copy of a page in the doublewrite
buffer, because there could be multiple copies in the doublewrite buffer,
and only one of them needs to be good.
2017-01-15 18:56:56 +02:00
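The recovery rule from MDEV-11799 (a corrupted copy in the doublewrite buffer is not fatal as long as some copy of the page is good) can be illustrated with a toy model. The page layout here is an assumption made only for this sketch: the last four bytes hold a CRC-32 of the rest. Real InnoDB pages use a different format and checksum scheme, and the helper names are hypothetical.

```python
import zlib


def make_page(body):
    """Build a toy page: payload followed by a big-endian CRC-32 trailer."""
    return body + zlib.crc32(body).to_bytes(4, "big")


def page_is_valid(page):
    """Check the toy trailer checksum (assumed layout, not InnoDB's)."""
    if len(page) < 4:
        return False
    body, stored = page[:-4], int.from_bytes(page[-4:], "big")
    return zlib.crc32(body) == stored


def recover_page(copies):
    """Return the first valid copy, or None if every copy is corrupted.

    A bad copy is skipped rather than treated as a fatal error, because a
    later copy of the same page may still be intact.
    """
    for copy in copies:
        if page_is_valid(copy):
            return copy
    return None
```

With one damaged and one intact copy of the same page, `recover_page` returns the intact one instead of aborting on the first bad copy it sees.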
Sergey Vojtovich
282497dd6d MDEV-6720 - enable connection log in mysqltest by default 2016-03-31 10:11:16 +04:00
Kristian Nielsen
aa845d123c MDEV-6391: GTID binlog state not recovered if mariadb-bin.state is removed
When the server starts up, check if the master-bin.state file was lost.
If it was, recover its contents by scanning the last binlog file, thus
avoiding running with a corrupt binlog state.
2015-02-27 14:34:52 +01:00
Kristian Nielsen
e79b7ca966 MDEV-7179: rpl.rpl_gtid_crash failed in buildbot with Warning: database page corruption or a failed
I saw two test failures in rpl.rpl_gtid_crash where we get this in the error
log:

141123 12:47:54 [Note] InnoDB: Restoring possible half-written data pages 
141123 12:47:54 [Note] InnoDB: from the doublewrite buffer...
InnoDB: Warning: database page corruption or a failed
InnoDB: file read of space 6 page 3.
InnoDB: Trying to recover it from the doublewrite buffer.
141123 12:47:54 [Note] InnoDB: Recovered the page from the doublewrite buffer.

This test case deliberately crashes the server, and if this crash happens
right in the middle of writing a buffer pool page to disk, it is not
unexpected that we can get a half-written page. The page is recovered
correctly from the doublewrite buffer.

So this patch adds a suppression for this warning in the error log for this
test case.
2014-11-25 14:19:11 +01:00
Kristian Nielsen
b79685902d MDEV-6903: gtid_slave_pos is incorrect after master crash
When a master restarts, it logs a special restart format description
event in its binlog. When the slave sees this event, it knows it needs to roll
back any active partial transaction, in case the master crashed previously in
the middle of writing such transaction to its binlog.

However, there was a bug where this rollback did not reset rgi->pending_gtid.
This caused the @@gtid_slave_pos to be updated incorrectly with the GTID of
the partial transaction that was rolled back.

Fix this by always clearing rgi->pending_gtid in cleanup_context(), hopefully
preventing similar bugs from turning up in other special cases where a
transaction is rolled back during replication.

Thanks to Pavel Ivanov for tracking down the issue and providing a test case.
2014-11-25 12:19:48 +01:00
Kristian Nielsen
7671fd70c0 MDEV-7080: rpl.rpl_gtid_crash fails sporadically in buildbot
The real problem here was inconsistent handling of entry->commit_errno in
MYSQL_BIN_LOG::write_transaction_or_stmt(). Some return paths were setting it
to the value of errno, some were not. And the setting was redundant anyway,
as it is set consistently by the caller.

Fix by consistently setting it in the caller, and not in each return path in
the function.

The test failure happened because a DBUG_EXECUTE_IF() used in the test case
set an entry->commit_errno that was immediately overwritten in the caller with
whatever happened to be the value of errno. This could lead to a different error
message in the .result file.
2014-11-17 08:53:42 +01:00
Kristian Nielsen
36f50be970 MDEV-6462: Slave replicating using GTID doesn't recover correctly when master crashes in the middle of transaction
If the slave gets a reconnect in the middle of a GTID event group, normally
it will re-fetch that event group, skipping the first part that was already
queued for the SQL thread.

However, if the master crashed while writing the event group, the group is
incomplete. This patch detects this case and makes sure that the
transaction is rolled back and nothing is skipped from any following
event groups.

Similarly, a network proxy might cause the reconnect to end up on a
different master server. Detect this by noticing a different server_id,
and similarly in this case roll back the partially received group.
2014-09-02 14:07:01 +02:00
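The reconnect decision described in MDEV-6462 can be sketched as a small pure function. This is a simplified model under stated assumptions, not the server's code: it assumes the slave remembers the GTID, the master's server_id, and the number of events already queued for a partially received group, and that a mismatch in either GTID or server_id after reconnect means the partial group must be rolled back. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PartialGroup:
    """State of an event group that was only partly received (assumed model)."""
    gtid: str
    master_server_id: int
    events_queued: int


def plan_after_reconnect(
    partial: Optional[PartialGroup],
    first_gtid: str,
    master_server_id: int,
) -> Tuple[int, bool]:
    """Return (events_to_skip, rollback_needed) for the first group received.

    If the same master resends the same group, the already queued prefix is
    skipped. Otherwise the partial group is rolled back and nothing is
    skipped from the following groups.
    """
    if partial is None:
        return 0, False
    same_group = partial.gtid == first_gtid
    same_master = partial.master_server_id == master_server_id
    if same_group and same_master:
        return partial.events_queued, False
    return 0, True
```

The function only captures the decision; how the server detects that a group was left incomplete by a crash is not modelled here.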
Sergey Vojtovich
2b61466733 MDEV-6469 - rpl.rpl_gtid_basic, rpl.rpl_gtid_stop_start,
rpl.rpl_gtid_crash fail on PPC64

The GTID order in @@gtid_binlog_pos depends on internal hash order,
so it needs to be hidden for stable test output.
2014-07-22 14:54:38 +04:00
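MDEV-6469 hides the value because its element order is not deterministic. A different way to get stable output, shown here only as an illustration and not as what the commit does, is to normalise the list by sorting it numerically before comparing. The function name is hypothetical and the `domain-server-seq` format is assumed.

```python
def normalize_gtid_list(value):
    """Sort a comma-separated GTID list numerically by (domain, server, seq)."""
    gtids = [g.strip() for g in value.split(",") if g.strip()]

    def sort_key(gtid):
        return tuple(int(part) for part in gtid.split("-"))

    return ",".join(sorted(gtids, key=sort_key))
```

Two lists that differ only in element order normalise to the same string, so a comparison no longer depends on hash order.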
unknown
dd93ec5633 Merge MariaDB 10.0-base to 10.0. 2014-02-10 15:12:17 +01:00
unknown
7bb022f3cf MDEV-4726: Race in mysql-test/suite/rpl/t/rpl_gtid_stop_start.test
Some GTID test cases were using include/wait_condition.inc with a
condition like SELECT COUNT(*)=4 FROM t1 to wait for the slave to
catch up with the master. This causes races and test failures, as the
changes to the tables become visible at the COMMIT of the SQL thread
(or even before in case of MyISAM), but the changes to
@@gtid_slave_pos only become visible a little bit after the COMMIT.

Now that we have MASTER_GTID_WAIT(), just use that to sync up in a
GTID-friendly way, wrapped in nice include/save_master_gtid.inc and
include/sync_with_master_gtid.inc scripts.
2014-02-07 20:24:39 +01:00
Sergei Golubchik
d28d3ba40d 10.0-base merge 2013-12-16 13:02:21 +01:00
unknown
55a7159f53 MDEV-4982: GTID loses all binlog state after crash if InnoDB is disabled
MDEV-4725: Incorrect binlog state recovery if crash while writing event group

The binlog state was not recovered correctly if XA is not used (e.g. InnoDB
disabled), or if server crashed in the middle of writing an event group to the
binlog.

With this patch, we ensure that recovery of binlog state is done even if we do
not do the full XA binlog recovery, and we ensure that we only recover fully
written event groups into the binlog state.
2013-11-21 14:42:25 +01:00
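The rule "only recover fully written event groups into the binlog state" can be sketched over a toy event stream. The event representation is an assumption made for illustration: each event is a `(kind, payload)` tuple, a group starts with a `"gtid"` event and is complete only once a `"commit"` event follows. The real binlog format and recovery code are more involved.

```python
def recover_binlog_state(events):
    """Return {domain_id: last fully written GTID} from a toy event stream.

    A trailing group with no commit event (for example because the server
    crashed while writing it) is ignored.
    """
    state = {}
    pending = None  # GTID of the group currently being read, if any
    for kind, payload in events:
        if kind == "gtid":
            pending = payload
        elif kind == "commit" and pending is not None:
            domain_id = pending.split("-")[0]
            state[domain_id] = pending
            pending = None
    return state
```

Given two groups where the second one is cut off before its commit, the recovered state holds only the first group's GTID.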
unknown
a0fd7382bc Merge 10.0-base -> 10.0 2013-05-28 15:39:56 +02:00
unknown
ee2b7db3f8 MDEV-4478: Implement GTID "strict mode"
When @@GLOBAL.gtid_strict_mode=1, certain operations that would
otherwise result in out-of-order binlog files between servers
result in an error instead.

GTID sequence numbers are now allocated independently per domain;
this results in fewer or no holes in GTID sequences, increasing the
likelihood that diverging binlogs will be caught by the slave when
GTID strict mode is enabled.
2013-05-28 13:28:31 +02:00
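The two ideas in MDEV-4478, per-domain sequence allocation and rejecting out-of-order GTIDs in strict mode, can be modelled in a few lines. This is a toy model, not the server's logic: the class and exception names are hypothetical, and "out of order" is assumed to mean a sequence number that is not higher than the last one logged in the same domain.

```python
class OutOfOrderGtid(Exception):
    """Raised in strict mode when a GTID would go backwards in its domain."""


class BinlogState:
    def __init__(self, strict):
        self.strict = strict
        self.last_seq = {}  # domain_id -> highest seq_no logged so far

    def next_local_gtid(self, domain_id, server_id):
        """Allocate the next sequence number independently per domain."""
        seq = self.last_seq.get(domain_id, 0) + 1
        self.last_seq[domain_id] = seq
        return (domain_id, server_id, seq)

    def log_replicated(self, gtid):
        """Record a GTID received from another server."""
        domain_id, _server_id, seq = gtid
        last = self.last_seq.get(domain_id, 0)
        if seq <= last:
            if self.strict:
                raise OutOfOrderGtid(f"{gtid} is not after seq_no {last}")
            return  # non-strict: accepted, but the high-water mark is kept
        self.last_seq[domain_id] = seq
```

Because each domain has its own counter, local transactions in one domain leave no holes in another, which is what makes a backwards step a useful signal of divergence.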
unknown
1cd6eb5f94 MDEV-26: Global transaction ID.
Change the user interface to be more logical and more in line with the
expectation that it works similarly to old-style replication.

User can now explicitly choose in CHANGE MASTER whether binlog position is
taken into account (master_gtid_pos=current_pos) or not (master_gtid_pos=
slave_pos) when slave connects to master.

@@gtid_pos is replaced by three separate variables @@gtid_slave_pos (can
be set by user, replicated GTIDs only), @@gtid_binlog_pos (read only), and
@@gtid_current_pos (a combination of the two, most recent GTID within each
domain). mysql.rpl_slave_state is renamed to mysql.gtid_slave_pos to match.

This fixes MDEV-4474.
2013-05-22 17:36:48 +02:00
unknown
0e7410a154 Merge 10.0-base -> 10.0 (GTID). 2013-04-17 15:17:01 +02:00
unknown
b7363eb4ac MDEV-26: Global transaction ID.
Replace CHANGE MASTER TO ... master_gtid_pos='xxx' with a new system
variable @@global.gtid_pos.

This is more logical; @@gtid_pos is global, not per-master, and it is not
affected by RESET SLAVE.

Also rename master_gtid_pos=AUTO to master_use_gtid=1, which again is more
logical.
2013-04-05 16:20:58 +02:00
unknown
bdf6367d0e MDEV-26: Global transaction ID
More fixes for test failures in Buildbot:

 - Do not run crashing test in Valgrind.

 - FLUSH TABLES did not work to avoid errors about not closed tables when
   crashing server. Suppress the messages instead.

 - Rewrite multi-source test case to only start one pair of slave threads at a
   time, to work around the bug MDEV-4352.
2013-04-02 14:44:24 +02:00
unknown
d639aece97 MDEV-26: Global transaction ID.
More fixes for race conditions in test cases.
2013-03-29 17:20:01 +01:00
unknown
5aaf73fcaa MDEV-26: Global transaction ID.
Add tests crashing the slave in the middle of replication and checking that
replication picks up again on restart in a crash-safe way.

Fix silly code that causes crash by inserting uninitialised data into a hash.
2013-03-28 13:03:51 +01:00
unknown
b0389850a5 MDEV-26: Global transaction ID.
Test crashing the master, check that it recovers the binlog state.

Fix one bug introduced by previous commit (crash-recovered binlog state was
overwritten by loading stale binlog state file).

Fix Windows build error.
2013-03-27 19:29:59 +01:00
unknown
d9f975d08b MDEV-26: Global transaction ID
Adjust full test suite to work with GTID.

Huge patch, mainly due to having to update the .result files for all SHOW BINLOG
EVENTS and mysqlbinlog outputs, where the new GTID events pop up.

Everything was painstakingly checked to be still correct and valid .result
file updates.
2013-03-26 10:35:34 +01:00
unknown
9bb989a9d1 MDEV-26: Global transaction ID.
Fix MDEV-4275 - I/O thread restart duplicates events in the relay log.
The first time we connect to master after CHANGE MASTER or restart, we connect
from the GTID position. But then subsequent reconnects or IO thread restarts
reconnect with the old-style file/offset binlog pos from where it left off at
last disconnect. This is necessary to avoid duplicate events in the relay
logs, as there is nothing that synchronises the SQL thread update of GTID
state (multiple threads in case of multi-source) with IO thread reconnects.

Test cases.

Some small cleanups and fixes.
2013-03-21 11:03:31 +01:00