This will make mysql-test-run.pl try to schedule these long-running
(> 60 seconds) tests early in --parallel runs, which helps avoid that
the testsuite gets stuck with a few long-running tests at the end
while most other test workers are idle.
This speed up mtr --parallel=96 with 25 seconds for me.
Reviewed-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Fix some random test failures following MDEV-32168 push.
Don't blindly set $rpl_only_running_threads in many places. Instead explicit
stop only the IO or SQL thread, as appropriate. Setting it interfered with
rpl_end.inc in some cases. Rather than clearing it afterwards, better to
not set it at all when it is not needed, removing ambiguity in the test
about the state of the replication threads.
Don't fail the test if include/stop_slave_io.inc finds an error in the IO
thread after stop. Such errors can be simply because slave stop happened in
the middle of the IO thread's initial communication with the master.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
This commit makes replicas crash-safe by default by changing the
Using_Gtid value to be Slave_Pos on a fresh slave start and after
RESET SLAVE is issued. If the primary server does not support GTIDs
(i.e., version < 10), the replica will fall back to Using_Gtid=No on
slave start and after RESET SLAVE.
The following additional informational messages/warnings are added:
1. When Using_Gtid is automatically changed. That is, if RESET
SLAVE reverts Using_Gtid back to Slave_Pos, or Using_Gtid is
inferred to No from a CHANGE MASTER TO given with log coordinates
without MASTER_USE_GTID.
2. If options are ignored in CHANGE MASTER TO. If CHANGE MASTER TO
is given with log coordinates, yet also specifies
MASTER_USE_GTID=Slave_Pos, a warning message is given that the log
coordinate options are ignored.
Additionally, an MTR macro has been added for RESET SLAVE,
reset_slave.inc, which provides modes/options for resetting a slave
in log coordinate or gtid modes. When in log coordinates mode, the
macro will execute CHANGE MASTER TO MASTER_USE_GTID=No after the
RESET SLAVE command. When in GTID mode, an extra parameter,
reset_slave_keep_gtid_state, can be set to reset or preserve the
value of gtid_slave_pos.
Reviewed By:
===========
Andrei Elkin <andrei.elkin@mariadb.com>
On a slow builder, a delay between binlog events on master could
occur, which would cause a heartbeat which is not expected by the
test. The solution is to monitor the timing of binlog events
on the master and only perform the heartbeat check if no critical
delays have happened.
Additionally, an unused variable was removed (this change is
unrelated to the bugfix).
This makes it clear that the error code has nothing to do with errno.
mysql-test/include/mtr_warnings.sql:
Fixed suppression for new slave error messages
mysql-test/lib/My/Test.pm:
Use 'send' instead of 'print' to avoid errors about "wrong class ... back attempt"
mysql-test/lib/v1/mtr_report.pl:
Fixed suppression for new slave error messages
mysql-test/mysql-test-run.pl:
Fixed suppression for new slave error messages
Removed warning from perl 5.16.2 about arrays
mysql-test/r/flush_read_lock.result:
Fixed suppression for new slave error messages
sql/rpl_reporting.cc:
Instead of writing "Errcode" to the log for Slave errors, use "Internal MariaDB error code"
Fixed some mtr test problems
dbug/tests.c:
Fixed compiler warnings
mysql-test/r/handlersocket.result:
Fixed that plugin_license is written
mysql-test/suite/innodb/t/innodb_bug60196.test:
Force sorted results as it was sometimes different on windows
mysql-test/suite/rpl/t/rpl_heartbeat_basic.test:
Prolong test as this failed on windows
mysql-test/t/handlersocket.test:
Fixed that plugin_license is written
plugin/handler_socket/handlersocket/handlersocket.cpp:
Use maria_declare_plugin
plugin/handler_socket/handlersocket/mysql_incl.hpp:
Fixed compiler warning
plugin/handler_socket/libhsclient/auto_addrinfo.hpp:
Fixed compiler warning
sql/handler.h:
Fixed typo
sql/sql_plugin.cc:
Fixed bug that caused plugin library name twice in error message
storage/maria/ma_checkpoint.c:
Fixed compiler warning
storage/maria/ma_loghandler.c:
Fixed compiler warning
unittest/mysys/base64-t.c:
Fixed compiler warning
unittest/mysys/bitmap-t.c:
Fixed compiler warning
unittest/mysys/my_malloc-t.c:
Fixed compiler warning
rpl_heartbeat_basic test fails sporadically on pushbuild because did
not received all heartbeats from slave in circular replication.
MASTER_HEARTBEAT_PERIOD had the default value (slave_net_timeout/2) so
wait on "Heartbeat event received on master", that only waits for 1
minute, sometimes timeout before heartbeat arrives. Fixed setting a
smaller period value.
Currently, rpl_semi_sync is failing in PB due to the warning message:
"Slave SQL: slave SQL thread is being stopped in the middle of "
"applying of a group having updated a non-transaction table; "
"waiting for the group completion ..."
The problem started happening after the fix for BUG#11762407 what was
automatically suppressing some warning messages.
To fix the current issue, we suppress the aforementioned warning message
and exploit the opportunity to make the sentence clearer.
Non-determinism of the test was caused by lack of setting a proper value to hb period,
actually fixed by BUG@50767.
These fixes aim at possible non-determinism in comparison of received
hb events by master and slave in the circular part of the test.
Even though the HB periods ratio was choosen to be as high as 10, it's still incorrect
to compare number of hb-events basing only a relation between their periods.
Yet another issue is relatively short 60 secs timeout of wait_for_status_var.inc
makes valgrind runs to fail.
Fixed with deploying wait_for_slave_io_to_start afront of calling wait_for_status_var.
The test is made runnable only with MIXED binlog-format as it has close to 1 min
total exec time and there is nothing format specific in it.
mysql-test/suite/rpl/r/rpl_heartbeat_basic.result:
results are changed.
mysql-test/suite/rpl/t/rpl_heartbeat_basic.test:
Reducing the test env to run in only with MIXED mode;
Simplifying logics of the circular setup to verify only
that HB flows both directions.
fails in PB sporadically)
The IO thread can concurrently access the relay log IO_CACHE
while another thread is performing an FLUSH LOGS procedure.
FLUSH LOGS closes and reopens the relay log and while doing so it
(re)initializes its IO_CACHE. During this procedure the IO_CACHE
mutex is also reinitialized, which can cause problems if some
other thread (namely the IO THREAD) is concurrently accessing it
at the time .
This patch fixes the problem by extending the interface of the
flush_master_info function to also include a second paramater,
"need_relay_log_lock", stating whether the thread should grab the
relay log lock or not before actually flushing the relay log.
Also, IO thread now calls flush_master_info with this flag set
when it flushes master info with in the event read_event loop.
Finally, we also increase loop time in rpl_heartbeat_basic test
case, so that the number of calls to flush logs doubles, stressing
this part of the code a little more.
mysql-test/suite/rpl/t/rpl_heartbeat_basic.test:
Doubled the number of iterations on the FLUSH LOGS loop by
doubling the time available to perform all iterations.
sql/repl_failsafe.cc:
Updating flush_master_info call so that it uses two parameters
instead of one.
sql/rpl_mi.cc:
Updating flush_master_info call so that it uses two parameters
instead of one.
sql/rpl_mi.h:
Changed flush_master_info interface. Now takes a second parameter
instead of just one. The second parameter is: need_lock_relay_log.
sql/rpl_rli.cc:
Small fix in comment.
sql/slave.cc:
Updating flush_master_info call so that it uses two parameters
instead of one.
sql/sql_repl.cc:
Updating flush_master_info call so that it uses two parameters
instead of one.
Linux x86_64 debug
Two test cases fail because the suppression for the unsafe
warning needs to be updated (BUG@39934 refactored this part and
these changes are only in mysql-next-mr - this is why we notice
them now when merging in next-mr). This is the case for
rpl_nondeterministic_functions and rpl_misc_functions test
cases. rpl_stm_binlog_direct test case is not needed in version >
5.1. The rpl_heartbeat_basic test case fails because patch for
BUG@50397 removed the CHANGE MASTER in the slave that would set
it's period to 1/10 of the master. This would cause the test
assertion to fail.
The fixes for the issues described above are:
- rpl_misc_functions - updated suppression message
- rpl_nondeterministic_functions - updated suppression message
- rpl_stm_binlog_direct - removed the test case (it is not
needed in versions > 5.1)
- rpl_heartbeat_basic - deployed instruction:
CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD=0.1;
Resetting the master before stopping the slave was generating the message
"[ERROR] Slave I/O: Got fatal error 1236 from master when reading data from
binary log: 'could not find next log', Error_code: 1236". In consequence,
the test case was failing because the message had not been suppressed.
To circumvent the failure, we rewrote the test stopping the slave before
resetting the master. We prefer this alternative rather than suppressing
the message.
The issue appears when number of heartbeat events non-zero before start of test
block. But really we need to check that no new events has received during test block.
So I did following:
1. Replace absolute values by diff of values
2. Increase heartbeat period from 1.5 to 5 sec