1
0
mirror of https://github.com/MariaDB/server.git synced 2025-09-11 05:52:26 +03:00
Commit Graph

3 Commits

Author SHA1 Message Date
Kristian Nielsen
b6b6bb8d36 Fix sporadic failures of rpl.rpl_gtid_crash
- Suppress a couple errors the slave can get as the master crashes.

 - The mysql-test-run occasionally takes 120 seconds between crashing
   the master and starting it back up for some (unknown) reason. For
   now, work-around that by letting the slave try for 500 seconds to
   connect to master before giving up instead of only 100 seconds.

Reviewed-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2025-03-15 11:15:36 +01:00
Kristian Nielsen
0c249ad718 MDEV-30232: rpl.rpl_gtid_crash fails sporadically in BB
The root cause of the failure is a bug in the Linux network stack:

  https://lore.kernel.org/netdev/87sf0ldk41.fsf@urd.knielsen-hq.org/T/#u

If the slave does a connect(2) at the exact same time that kill -9 of the
master process closes the listening socket, the FIN or RST packet is lost in
the kernel, and the slave ends up timing out waiting for the initial
communication from the server. This timeout defaults to
--slave-net-timeout=120, which causes include/master_gtid_wait.inc to time
out first and fail the test.

Work-around this problem by reducing the --slave-net-timeout for this test
case. If this problem turns up in other tests, we can consider reducing the
default value for all tests.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-04-20 13:41:08 +02:00
Kristian Nielsen
df2db86341 MDEV-7430: rpl.rpl_gtid_crash still fails in buildbot
The problem was a too low timeout for slave reconnect. It was set to 9 seconds
(10 retries with 1 second in-between). This is occasinally too short on some
Buildbot hosts, when the test crashes and restarts the master while the slave
IO thread is running.

Fix by increasing --master-retry-count for this test.
2015-01-15 15:55:09 +01:00