mirror of https://github.com/MariaDB/server.git synced 2025-05-13 01:01:44 +03:00

3625 Commits

Author SHA1 Message Date
Oleksandr Byelkin
9e1fb104a3 MariaDB 11.4.4 release

Merge tag '11.4' into 11.6

MariaDB 11.4.4 release
2024-11-08 07:17:00 +01:00
Brandon Nesterenko
e9a502df08 Testing fix for rpl_semi_sync_cond_var_per_thd failure 2024-10-30 08:32:19 -06:00
Oleksandr Byelkin
c770bce898 Merge branch '11.2' into 11.4 2024-10-30 15:11:17 +01:00
Oleksandr Byelkin
69d033d165 Merge branch '10.11' into 11.2 2024-10-29 16:42:46 +01:00
Oleksandr Byelkin
3d0fb15028 Merge branch '10.6' into 10.11 2024-10-29 15:24:38 +01:00
Brandon Nesterenko
1ed30e08af MDEV-34122: Assertion `entry' failed in Active_tranx::assert_thd_is_waiter
If semi-sync is switched off then on while a transaction is
in-between binlogging and waiting for an ACK, the semi-sync state of
the transaction is removed, leading to a debug assertion that
indicates the transaction tried to wait, but cannot receive an ACK
signal. More specifically, a transaction adds an entry to the
Active_tranx list during binlogging, and each entry saves the thread
that will wait for an ACK, along with the COND variable used to wake
it. When semi-sync is switched off, this list is cleared. So if the
entry is lost, the Ack_receiver thread won’t be able to find the
thread to wake up when an ACK comes in.

The fix is to ensure that the entry exists before awaiting the ACK,
and if there is no entry, skip the wait. In debug builds, an
informative message is written explaining that the transaction is
skipping its wait. Additional debug-build only logic is added to
ensure that the cause of the missing entry is due to semi-sync being
turned off and on.

Reviewed By:
============
Kristian Nielsen <knielsen@knielsen-hq.org>
2024-10-21 15:35:54 -06:00
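A minimal sketch of the MDEV-34122 fix described above, written in Python rather than the server's C++. The class `ActiveTranx`, the function `wait_for_ack`, and the use of `threading.Event` in place of the per-thread COND variable are all hypothetical stand-ins; only the idea of checking for the entry before waiting comes from the commit message.

```python
import threading


class ActiveTranx:
    """Toy model of the Active_tranx list: binlog position -> waiter."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}

    def insert(self, pos, waiter):
        # Done during binlogging, before the transaction waits for its ACK.
        with self._lock:
            self._entries[pos] = waiter

    def clear(self):
        # Models semi-sync being switched off: every entry is dropped.
        with self._lock:
            self._entries.clear()

    def has_entry(self, pos):
        with self._lock:
            return pos in self._entries

    def signal(self, pos):
        # Models the ACK receiver waking the waiter registered for pos.
        with self._lock:
            waiter = self._entries.get(pos)
        if waiter is None:
            return False
        waiter.set()
        return True


def wait_for_ack(active, pos, waiter, timeout):
    """Return 'skipped', 'acked', or 'timeout'."""
    if not active.has_entry(pos):
        # No entry means nobody could ever wake us, so skip the wait.
        return "skipped"
    return "acked" if waiter.wait(timeout) else "timeout"
```

In this model, a transaction whose entry was cleared returns immediately instead of waiting for a wake-up that can no longer arrive.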
Marko Mäkelä
43465352b9 Merge 11.4 into 11.6 2024-10-03 16:09:56 +03:00
Marko Mäkelä
b53b81e937 Merge 11.2 into 11.4 2024-10-03 14:32:14 +03:00
Marko Mäkelä
12a91b57e2 Merge 10.11 into 11.2 2024-10-03 13:24:43 +03:00
Marko Mäkelä
63913ce5af Merge 10.6 into 10.11 2024-10-03 10:55:08 +03:00
Marko Mäkelä
7e0afb1c73 Merge 10.5 into 10.6 2024-10-03 09:31:39 +03:00
Lena Startseva
0a5e4a0191 MDEV-31005: Make working cursor-protocol
Updated tests: cases with bugs or which cannot be run
with the cursor-protocol were excluded with
"--disable_cursor_protocol"/"--enable_cursor_protocol"

Fix for v.10.5
2024-09-18 18:39:26 +07:00
Brandon Nesterenko
68938d2b42 MDEV-33500 (part 2): rpl.rpl_parallel_sbm can still fail
The failing test case validates Seconds_Behind_Master for a delayed
slave, while STOP SLAVE is executed during a delay. The test fixes
initially added to the test (commit b04c8575967) added a table lock
to ensure a transaction could not finish before validating the
Seconds_Behind_Master field after SLAVE START, but did not address a
possibility that the transaction could finish before running the
STOP SLAVE command, which invalidates the validations for the rest
of the test case. Specifically, this would result in 1) a timeout in
“Waiting for table metadata lock” on the replica, which expects the
transaction to retry after slave restart and hit a lock conflict on
the locked tables (added in b04c8575967), and 2) that
Seconds_Behind_Master should have increased, but did not.

The failure can be reproduced by synchronizing the slave to the master
before the MDEV-32265 echo statement (i.e. before the SLAVE STOP).

This patch fixes the test by adding a mechanism to use DEBUG_SYNC to
synchronize a MASTER_DELAY, rather than continually increase the
duration of the delay each time the test fails on buildbot. This is
to ensure that on slow machines, a delay does not pass before the
test gets a chance to validate results. Additionally, it decreases
overall test time because the test can continue immediately after
validation, thereby bypassing the remainder of a full delay for each
transaction.
2024-09-17 06:29:20 -06:00
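The idea in the MDEV-33500 (part 2) message above, replacing an ever-longer timed delay with an explicit synchronization point, can be illustrated with a small Python sketch. This is not the MTR test or the DEBUG_SYNC facility itself: the two `threading.Event` objects are assumed stand-ins for a sync point's "reached" and "continue" signals, and `delayed_apply` and `run_test` are hypothetical names.

```python
import threading
import time


def delayed_apply(reached, resume, applied, max_delay=30.0):
    """Replica side: pause in the 'delay' until signalled (or max_delay passes)."""
    reached.set()            # announce that the delay has begun
    resume.wait(max_delay)   # stand-in for waiting on a sync signal
    applied.append("trx1")   # the transaction is applied only after the wait


def run_test():
    reached, resume, applied = threading.Event(), threading.Event(), []
    worker = threading.Thread(target=delayed_apply, args=(reached, resume, applied))
    start = time.monotonic()
    worker.start()
    # Validate while the replica is guaranteed to still be inside the delay.
    in_delay = reached.wait(5.0)
    still_pending = in_delay and not applied
    # Validation is done: let the replica continue without sitting out the delay.
    resume.set()
    worker.join(5.0)
    return still_pending, applied, time.monotonic() - start
```

Because the test decides when the delay ends, a slow machine cannot let the delay expire before validation, and the test does not have to wait out the full delay afterwards.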
Marko Mäkelä
a5b80531fb Merge 11.4 into 11.6 2024-09-04 10:38:25 +03:00
Marko Mäkelä
44733aa8cf Merge 11.2 into 11.4 2024-08-29 19:10:38 +03:00
Marko Mäkelä
e91a799458 Merge 10.11 into 11.2 2024-08-29 16:02:57 +03:00
Marko Mäkelä
cfcf27c6fe Merge 10.6 into 10.11 2024-08-29 07:47:29 +03:00
Marko Mäkelä
48becffd07 Merge 10.5 into 10.6 2024-08-27 08:52:10 +03:00
Kristian Nielsen
8642453ce6 Fix sporadic failure of test case rpl.rpl_start_stop_slave
The test was expecting the I/O thread to be in a specific state, but thread
scheduling may cause it to not yet have reached that state. So just have a
loop that waits for the expected state to occur.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-08-26 14:39:24 +02:00
Kristian Nielsen
214e6c5b3d Fix sporadic failure of test case rpl.rpl_old_master
Remove the test for MDEV-14528. This is supposed to test that parallel
replication from pre-10.0 master will update Seconds_Behind_Master. But
after MDEV-12179 the SQL thread is blocked from even beginning to fetch
events from the relay log due to FLUSH TABLES WITH READ LOCK, so the test
case is no longer testing what it was intended to. And pre-10.0 versions are
long since out of support, so it does not seem worthwhile to try to rewrite the
test to work another way.

The root cause of the test failure is MDEV-34778. Briefly, depending on
exact timing during slave stop, the rli->sql_thread_caught_up flag may end
up with different value. If it ends up as "true", this causes
Seconds_Behind_Master to be 0 during next slave start; and this caused test
case timeout as the test was waiting for Seconds_Behind_Master to become
non-zero.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-08-26 14:39:24 +02:00
Kristian Nielsen
7dc4ea5649 Fix sporadic test failure in rpl.rpl_create_drop_event
Depending on timing, an extra event run could start just when the event
scheduler is shut down and delay running until after the table has been
dropped; this would cause the test to fail with a "table does not exist"
error in the log.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-08-26 14:39:24 +02:00
Kristian Nielsen
33854d7324 Restore skipping rpl.rpl_mdev6020 under Valgrind
(Revert a change done by mistake when XtraDB was removed.)

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-08-26 14:39:24 +02:00
Brandon Nesterenko
9e845107f8 MDEV-34765: rpl.master_last_event_time_stmt fails with Result Length Mismatch
When executing a Query_log_event that is a COMMIT query,
gtid_slave_pos is updated before other replication status
variables, so when an MTR test syncs a replica with
primaries via GTID, there is a slight window where accessing
status variables, e.g. via SHOW ALL SLAVES STATUS, results
in "stale" values because gtid_slave_pos has been updated
before the *_last_event_time fields have been updated.

This patch only fixes the test by switching from using
GTIDs to using binlog file coordinates when synchronizing
replicas with their primaries.
2024-08-22 20:25:10 -06:00
Brandon Nesterenko
bd54475efa MDEV-34779: Sporadic test failure in rpl.rpl_semi_sync_cond_var_per_thd
In a merge, an mtr.call_suppression was erroneously removed
for "Got an error writing communication packets". So when
the error would expectedly occur, the test would fail.

This patch adds this suppression back.

Note that the test will still fail due to MDEV-34799, which
is to be fixed in 10.6.
2024-08-22 13:44:15 -06:00
Oleksandr Byelkin
492a7c2430 Merge branch '11.5' into 11.6 2024-08-21 15:13:47 +02:00
Oleksandr Byelkin
342fa29615 Merge branch '11.4' into 11.5 2024-08-21 11:52:54 +02:00
Kristian Nielsen
78fcb9474c Fix sporadic failure in test rpl.rpl_rotate_logs
Clarify confusing comments in the previous commit, and note that the failure
started after push of MDEV-34504.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-08-19 21:18:56 +02:00
Kristian Nielsen
5dc2fe4815 Fix sporadic failure in test rpl.rpl_rotate_logs
The test started failing after push of MDEV-31404.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-08-16 22:27:01 +02:00
Oleksandr Byelkin
d6444022ca Merge branch 'bb-11.5-release' into bb-11.6-release 2024-08-06 17:28:38 +02:00
Oleksandr Byelkin
ea75a0b600 Merge branch '11.4' into 11.5 2024-08-05 17:50:18 +02:00
Oleksandr Byelkin
1640c9b06e Merge branch '11.2' into 11.4 2024-08-04 17:27:48 +02:00
Oleksandr Byelkin
dced6cbdb6 Merge branch '11.1' into 11.2 2024-08-03 09:50:16 +02:00
Oleksandr Byelkin
80abd847da Merge branch '10.11' into 11.1 2024-08-03 09:32:42 +02:00
Oleksandr Byelkin
8f020508c8 Merge branch '10.5' into 10.6 2024-08-03 09:04:24 +02:00
Brandon Nesterenko
001608de7e MDEV-15393: Fix rpl_mysqldump_gtid_slave_pos
The slave would try to sync_with_master_gtid.inc,
but the master never actually saved its gtid position
so the test would move on too quickly.
2024-07-31 14:17:46 -06:00
Monty
25b5c63905 MDEV-33856: Alternative Replication Lag Representation via Received/Executed Master Binlog Event Timestamps
This commit adds 3 new status variables to 'show all slaves status':

- Master_last_event_time ; timestamp of the last event read from the
  master by the IO thread.
- Slave_last_event_time ; Master timestamp of the last event committed
  on the slave.
- Master_Slave_time_diff ; the difference of the above two timestamps.

All the above variables are NULL until the slave has started and the
slave has read one query event from the master that changes data.

- Added information_schema.slave_status, which allows us to remove:
   - show_master_info(), show_master_info_get_fields(),
     send_show_master_info_data(), show_all_master_info()
   - class Sql_cmd_show_slave_status.
   - Protocol::store(I_List<i_string_pair>* str_list) as it is not
     used anymore.
- Changed old SHOW SLAVE STATUS and SHOW ALL SLAVES STATUS to
  use the SELECT code path, as all other SHOW ... STATUS commands.

Other things:
- Xid_log_time is set to time of commit to allow slave that reads the
  binary log to calculate Master_last_event_time and
  Slave_last_event_time.
  This is needed as there is no 'exec_time' for row events.
- Fixed that Load_log_event calculates exec_time identically to
  Query_event.
- Updated RESET SLAVE to reset Master/Slave_last_event_time
- Updated SQL thread's update on first transaction read-in to
  only update Slave_last_event_time on group events.
- Fixed possible (unlikely) bugs in sql_show.cc ...old_format() functions
  if allocation of 'field' would fail.

Reviewed By:
Brandon Nesterenko <brandon.nesterenko@mariadb.com>
Kristian Nielsen <knielsen@knielsen-hq.org>
2024-07-25 08:57:27 -06:00
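To make the relationship between the three MDEV-33856 status variables above concrete, here is a small Python model. It is a sketch under assumptions, not the server code: the commit message only says the third value is "the difference of the above two timestamps", so the sign (master minus slave) and the unit (whole seconds) used here are assumptions, as is the `SlaveStatus` class itself.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class SlaveStatus:
    # Both values stay None (NULL) until the slave has read its first
    # data-changing query event from the master.
    master_last_event_time: Optional[datetime] = None
    slave_last_event_time: Optional[datetime] = None

    @property
    def master_slave_time_diff(self) -> Optional[int]:
        """Assumed definition: master time minus slave time, in seconds."""
        if self.master_last_event_time is None or self.slave_last_event_time is None:
            return None
        delta = self.master_last_event_time - self.slave_last_event_time
        return int(delta.total_seconds())
```

A value of 0 in this model means the slave has committed the last event it received; a growing value means received events are queuing up faster than they are applied.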
Andrei
c944cd6fec MDEV-15393 post-push: complete rpl_mysqldump_gtid_slave_pos fixes.
Added a
  --source include/save_master_gtid.inc
that the previous commit missed.
2024-07-22 20:52:26 +03:00
Oleksandr Byelkin
0fe39d368a Merge branch '10.6' into 10.11 2024-07-22 15:14:50 +02:00
Oleksandr Byelkin
a938503cfb Merge branch '10.5' into 10.6 2024-07-20 08:12:42 +02:00
Andrei
b8f92ade57 MDEV-15393 gtid_slave_pos duplicate key errors after mysqldump restore
When mysqldump is run to dump the `mysql` system database, it generates
INSERT statements into the table `mysql.gtid_slave_pos`.
After running the backup script,
those inserts did not produce the expected gtid state on the slave. In
particular, the maximum of mysql.gtid_slave_pos.sub_id did not make
it into
   rpl_global_gtid_slave_state.last_sub_id

an in-memory object that is supposed to match the current state of the
table. And that was regardless of whether the --gtid option was specified
or not. Later, when the backup recipient server starts as a slave
in *non-gtid* mode, this desynchronization may lead to a duplicate key
error.

This effect is corrected for --gtid mode mysqldump/mariadb-dump only,
as follows.  The fixes ensure the insert block of the dump
script is followed by a "summing-up" SET @global.gtid_slave_pos
assignment.

For the implementation part, note that a deferred print-out of
SET-gtid_slave_pos and associated comments is preferred over relocating
the entire blocks if (opt_master,slave_data &&
do_show_master,slave_status) ...  because of compatibility
concerns. Namely, an error inside do_show_*() is handled in the new code
the same way, and as early, as before.

A regression test can be run in how-to-reproduce mode as well.
One affected mtr test was observed:
the rpl_mysqldump_slave.result "mismatch" now shows the new policy of
deferring the print of SET-gtid_slave_pos in action.
2024-07-19 21:44:12 +03:00
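The failure mode in the MDEV-15393 message above (table rows restored by plain INSERTs while the in-memory last_sub_id stays behind) can be reproduced in a toy Python model. Everything here is a hypothetical simplification: `GtidSlaveState`, its methods, and the key layout are illustrative, and `set_gtid_slave_pos` only models the resynchronizing effect attributed to the trailing SET statement, not what the server does internally.

```python
class DuplicateKey(Exception):
    pass


class GtidSlaveState:
    def __init__(self):
        self.table = {}       # (domain_id, sub_id) -> (server_id, seq_no)
        self.last_sub_id = 0  # in-memory counter meant to track the table

    def raw_insert(self, domain_id, sub_id, server_id, seq_no):
        """A plain INSERT replayed from a dump: touches only the table."""
        key = (domain_id, sub_id)
        if key in self.table:
            raise DuplicateKey(key)
        self.table[key] = (server_id, seq_no)

    def record_gtid(self, domain_id, server_id, seq_no):
        """Replication applying an event: takes the next sub_id from memory."""
        self.last_sub_id += 1
        self.raw_insert(domain_id, self.last_sub_id, server_id, seq_no)
        return self.last_sub_id

    def set_gtid_slave_pos(self):
        """Modelled effect of the 'summing-up' SET: resync the counter."""
        self.last_sub_id = max((sub_id for _, sub_id in self.table), default=0)
```

Without the resync step, the first replicated event after a restore reuses sub_id 1 and collides with a restored row; with it, the counter continues from the table's maximum.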
Oleksandr Byelkin
9af2caca33 Merge branch '10.5' into 10.6 2024-07-18 16:25:33 +02:00
Brandon Nesterenko
a061ae1079 MDEV-33921: Fix rpl_xa_empty_transaction.test
The test was missing a save_master_gtid.inc on the master,
leading to the slave thinking it was in sync after executing
sync_with_master_gtid.inc, despite not having executed the
latest transaction. This skipped transaction, XA COMMIT,
was supposed to raise an error because its XID could not
be found, and have that error ignored because the replication filters
would filter out the target database. However, if the slave
was able to stop before executing the transaction, then
the replication filter is reset (to empty), and when the
slave is later restarted, that transaction's error would
no longer be ignored.

Additionally, as the test cases added in MDEV-33921 rely
on GTID synchronization, the test cases now force
master_use_gtid=slave_pos for consistency.
2024-07-17 16:38:26 -06:00
Yuchen Pei
f071b7620b Merge branch '10.5' into 10.6 2024-07-16 15:54:22 +08:00
Daniel Black
e8bcc4e455 MDEV-34568 rpl.rpl_mdev12179 - correct for Windows
Simplify in an attempt to avoid:

mysqltest: At line 275: File already exist: on the write_file
lines.

Using write_line as that's what a lot of other tests
do for writing small bits to an expect file.

Review thanks to Vladislav Vaintroub
2024-07-12 12:55:28 +02:00
Brandon Nesterenko
632dd304c7 MDEV-34554: rpl_change_master_demote sporadically fails on buildbot
MDEV-34274 did not fix the test failure. The test has a START SLAVE
UNTIL condition, where we can't use sync_with_master_gtid.inc,
wait_for_slave_to_start.inc, or wait_for_slave_to_stop.inc because
our MTR connection thread races with the start/stop of the SQL/IO
threads. So instead, for slave start, we prove the threads started
by waiting for the connection count to increase by 2; and for slave
stop, we wait for the processlist count to return to its pre start
slave number.
2024-07-11 14:45:12 -06:00
Brandon Nesterenko
fa80449725 MDEV-34274: Test rpl.rpl_change_master_demote frequently fails on buildbot with "IO thread should not be running..."
Note this is a backport of 8c8b3ab784b884754efae8182539e1c8752831d2
from 11.1.

The test rpl.rpl_change_master_demote used a `sleep 1` command
to give time for a START SLAVE UNTIL to start the slave threads
and wait for them to automatically die by UNTIL.  On machines
with heavy load (especially MSAN bb builders), one second was
not enough, and the test would fail due to the IO thread
still being up.

This patch fixes the test by replacing the sleep with specific
conditions to wait for. The test cannot wait for the IO or SQL
threads to start, as it would be possible that they would be
started and stopped by the time the MTR executor would check
the slave status. So instead, we test for proof that they
existed via the Connections status variable being incremented
by at least 2 (Connections just shows the global thread id).
At this point, we still can't use the wait_for_slave_to_stop
helper, as the SQL/IO_Running fields of SHOW SLAVE STATUS
may not be updated yet. So instead, we use
information_schema.processlist, which would show the presence
of the Slave_SQL/IO threads. So to "wait for the slave to stop",
we wait for the Slave_SQL/IO threads to be gone from the
processlist.
2024-07-11 09:06:23 -06:00
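The two-phase wait described in the MDEV-34274 message above (first prove the slave threads existed via the Connections counter, then wait for them to leave the processlist) can be sketched in Python. This is an illustration only: `FakeServer`, `wait_until`, and `slave_started_and_stopped` are hypothetical, and the fake server simply replays a scripted sequence of snapshots instead of querying a real instance.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.01):
    """Poll predicate until it holds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return predicate()


class FakeServer:
    """Replays (connections, processlist commands) snapshots, one per poll."""

    def __init__(self, snapshots):
        self._snapshots = list(snapshots)
        self._i = 0

    def poll(self):
        snap = self._snapshots[min(self._i, len(self._snapshots) - 1)]
        self._i += 1
        return snap


def slave_started_and_stopped(server, baseline, timeout=5.0):
    # Phase 1: two new connection ids prove both slave threads were created,
    # even if they have already exited by the time we look.
    started = wait_until(lambda: server.poll()[0] >= baseline + 2, timeout)
    # Phase 2: the slave has stopped once neither thread is in the processlist.
    slave_cmds = {"Slave_IO", "Slave_SQL"}
    stopped = wait_until(lambda: not (slave_cmds & set(server.poll()[1])), timeout)
    return started and stopped
```

Checking the counter rather than the thread state avoids the race where the threads start and stop between two polls.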
Monty
e0cff1e72b Fixed failure in rpl.rpl_change_master_demote : "IO thread should not be running..."
The issue was that the test did not take into account that the IO thread
could have been in COMMAND=Connecting state, which happens before the
COMMAND=Slave_IO state.

The test is a bit fragile as it depends on the COMMAND state being
synchronised with the Slave_IO_State, which is not the case.

I added a new proc state and some more information to the error
output to be able to diagnose future failures more easily.
2024-07-11 11:15:47 +03:00
Alexander Barkov
36eba98817 MDEV-19123 Change default charset from latin1 to utf8mb4
Changing the default server character set from latin1 to utf8mb4.
2024-07-11 10:21:07 +04:00
Brandon Nesterenko
ea9869504d MDEV-33921: Replication breaks when filtering two-phase XA transactions
There are two problems.

First, replication fails when XA transactions are used where the
slave has replicate_do_db set and the client has touched a different
database when running DML such as inserts. This is because XA
commands are not treated as keywords, and are thereby not exempt
from the replication filter. The effect of this is that during an XA
transaction, if its logged “use db” from the master is filtered out
by the replication filter, then XA END will be ignored, yet its
corresponding XA PREPARE will be executed in an invalid state,
thereby breaking replication.

Second, if the slave replicates an XA transaction which results in
an empty transaction, the XA START through XA PREPARE first phase of
the transaction won’t be binlogged, yet the XA COMMIT will be
binlogged. This will break replication in chain configurations.

The first problem is fixed by treating XA commands in
Query_log_event as keywords, thus allowing them to bypass the
replication filter. Note that Query_log_event::is_trans_keyword() is
changed to accept a new parameter to define its mode, to either
check for XA commands or regular transaction commands, but not both.
In addition, mysqlbinlog is adapted to use this mode so its
--database filter does not remove XA commands from its output.

The second problem is fixed by overwriting the XA state in the XID
cache to be XA_ROLLBACK_ONLY, so at commit time, the server knows to
rollback the transaction and skip its binlogging. If the xid cache
is cleared before an XA transaction receives its completion command
(e.g. on server shutdown), then before reporting ER_XAER_NOTA when
the completion command is executed, the filter is first checked if
the database is ignored, and if so, the error is ignored.

Reviewed By:
============
Kristian Nielsen <knielsen@knielsen-hq.org>
Andrei Elkin <andrei.elkin@mariadb.com>
2024-07-10 14:37:39 -06:00
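The first MDEV-33921 fix above, letting XA commands bypass the replication filter the same way transaction keywords do, can be sketched as follows. This is a Python illustration, not the C++ in Query_log_event: the keyword list, the prefix matching, and the `passes_filter` helper are assumptions; only the idea of a mode parameter that selects either XA commands or regular transaction commands comes from the commit message.

```python
TRANS_KEYWORDS = ("BEGIN", "COMMIT", "ROLLBACK", "SAVEPOINT")  # assumed list


def is_trans_keyword(query: str, is_xa: bool) -> bool:
    """Check for XA commands or regular transaction commands, never both."""
    q = query.lstrip().upper()
    if is_xa:
        return q.startswith("XA ")
    return any(q == kw or q.startswith(kw + " ") for kw in TRANS_KEYWORDS)


def passes_filter(query: str, current_db: str, replicate_do_db: set) -> bool:
    """Hypothetical filter: keywords always pass, other statements go by database."""
    if is_trans_keyword(query, is_xa=False) or is_trans_keyword(query, is_xa=True):
        return True
    return current_db in replicate_do_db
```

With this rule, an XA END logged under a filtered-out database still reaches the slave, so the following XA PREPARE no longer runs in an invalid state.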
Monty
dd99780967 MDEV-34504 PURGE BINARY LOGS not working anymore
PURGE BINARY LOGS did not always purge binary logs. This commit fixes
some of the issues and adds notifications if a binary log cannot be
purged.

User visible changes:
- 'PURGE BINARY LOGS TO log_name' and 'PURGE BINARY LOGS BEFORE date'
  worked differently. 'TO' ignored 'slave_connections_needed_for_purge'
  while 'BEFORE' did not. Now both versions ignore the
  'slave_connections_needed_for_purge' variable.
- 'PURGE BINARY LOGS ..' commands now return a 'note' if a binary log cannot
   be deleted, like
   Note 1375 Binary log 'master-bin.000004' is not purged because it is
             the current active binlog
- Automatic binary log purges, based on date or size, will write a
  note to the error log if a binary log matching the size or date
  cannot yet be deleted.
- If 'slave_connections_needed_for_purge' is set from a config or
  command line, it is set to 0 if Galera is enabled and 1 otherwise
  (old default). This ensures that automatic binary log purge works
  with Galera as before the addition of
  'slave_connections_needed_for_purge'.
  If the variable is changed to 0, a warning will be printed to the error
  log.

Code changes:
- Added THD argument to several purge_logs related functions that needed
  THD.
- Added 'interactive' options to purge_logs functions. This allowed
  me to remove testing of sql_command == SQLCOM_PURGE.
- Changed purge_logs_before_date() to first check if log is applicable
  before calling can_purge_logs(). This ensures we do not get a
  notification for logs that do not match the remove criteria.
- MYSQL_BIN_LOG::can_purge_log() will write notifications to the user
  or error log if a log cannot yet be removed.
- log_in_use() will return reason why a binary log cannot be removed.

Changes to keep code consistent:
- Moved checking of binlog_format for Galera to be after Galera is
  initialized (The old check never worked). If Galera is enabled
  we now change the binlog_format to ROW, with a warning, instead of
  aborting the server. If this change happens a warning will be printed to
  the error log.
- Print a warning if Galera or FLASHBACK changes the binlog_format
  to ROW. Before it was done silently.

Reviewed by: Sergei Golubchik <serg@mariadb.com>,
             Kristian Nielsen <knielsen@knielsen-hq.org>
2024-07-10 18:50:08 +03:00
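The ordering change described in the MDEV-34504 code changes above (check whether a log matches the purge criteria first, and only then ask whether it can be purged, so that non-matching logs produce no notification) can be sketched in Python. This is a simplified model, not the MYSQL_BIN_LOG code: the `BinLog` class, the `in_use_by` reason string, and the return shape are hypothetical; the wording of the "current active binlog" note follows the example in the commit message.

```python
from dataclasses import dataclass


@dataclass
class BinLog:
    name: str
    mtime: int            # last-modified time, arbitrary integer units
    active: bool = False  # True for the log currently being written
    in_use_by: str = ""   # hypothetical reason, e.g. a reader still needs it


def can_purge_log(log, notes):
    """Return True if the log may be removed; otherwise record why not."""
    if log.active:
        notes.append(
            f"Binary log '{log.name}' is not purged because it is "
            "the current active binlog"
        )
        return False
    if log.in_use_by:
        notes.append(f"Binary log '{log.name}' is not purged because {log.in_use_by}")
        return False
    return True


def purge_logs_before_date(logs, before):
    purged, notes = [], []
    for log in logs:
        if log.mtime >= before:
            continue  # not applicable: skip silently, no notification
        if can_purge_log(log, notes):
            purged.append(log.name)
    return purged, notes
```

For example, purging everything older than time 250 from three logs dated 100, 200 (still in use), and 300 (active) removes the first, reports one note for the second, and says nothing about the third.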