Problem : The basic problem is the way the thread sleeps in mysql-5.5 and also in mysql-5.1
when we execute a stop slave on windows platform.
On windows platform if the stop slave is executed after the master dies, we have
this long wait before the stop slave return a value. This is because there is a
sleep of the thread. The sleep is uninterruptable in the two above version,
which was fixed by Davi patch for the BUG#11765860 for mysql-trunk. Backporting
his patch for mysql-5.5 fixes the problem.
Solution : A new pair of mutex and condition variable is introduced to synchronize thread
sleep and finalization. A new mutex is required because the slave threads are
terminated while holding the slave thread locks (run_lock), which can not be
relinquished during termination as this would affect the lock order.
mysql-test/suite/rpl/r/rpl_start_stop_slave.result:
The result file associated with the test added.
mysql-test/suite/rpl/t/rpl_start_stop_slave.test:
A test to check the new functionality.
sql/rpl_mi.cc:
The constructor using the new mutex and condition variables for the master_info.
sql/rpl_mi.h:
The condition variable and mutex have been added for the master_info.
sql/rpl_rli.cc:
The constructor using the new mutex and condition variables for the realy_log_info.
sql/rpl_rli.h:
The condition variable and mutex have been added for the relay_log_info.
sql/slave.cc:
Use a timed wait on a condition variable to implement a interruptible sleep.
The wait is registered with the THD object so that the thread will be woken
up if killed.
If an error message contains '\' backslash it is displayed correctly
through show-slave-status or
query_get_value(SHOW SLAVE STATUS, Last_IO_Error, 1);. But when
SELECT REPLACE(...) is applied backslash is escaped resulting in a
different test output.
Disabled backslash escape on show_slave_status.inc and replaced '\' for
'/' using replace_regex function in order to achieve the same test
output when different path separators are used.
The server crashes when receiving a COM_BINLOG_DUMP command with a position of 0 or
larger than the file size.
The execution proceeds to an error block having the last read replication coordinates
pointer be NULL and its dereferencing crashed the server.
Fixed with making "public" previously used only for heartbeat coordinates.
mysql-test/extra/rpl_tests/rpl_start_stop_slave.test:
regression test for bug#3593869-64035 is added.
mysql-test/suite/rpl/r/rpl_cant_read_event_incident.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/r/rpl_log_pos.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/r/rpl_manual_change_index_file.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/r/rpl_packet.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/r/rpl_stm_start_stop_slave.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/t/rpl_stm_start_stop_slave.test:
Slave is stopped by bug#3593869-64035 tests so
-let $rpl_only_running_threads= 1 is set prior to rpl_end.
sql/share/errmsg-utf8.txt:
Increasing the max length of explanatory message to 512.
sql/sql_repl.cc:
Making `coord' to carry the last read from binlog event coordinates
regardless of heartbeat.
Renaming, small cleanup and simplifying the code after if (coord) becomes unnecessary.
Adding yet another 3rd pair of coordinates - the starting replication -
into error text.
Test extra/rpl_tests/rpl_extra_col_master.test (used by
rpl_extra_col_master_*) ends with the active connection pointing to the
slave. Thus, the two last tests never succeed in changing the binlog
format of the master away from 'row'. With correct active connection
(master) tests fail for binlog 'statement' and 'mixed' formats.
Tests rpl_extra_col_master_* only run when binary log format is
row. Statement and mix replication do not make sense in this
tests since it will try to execute statements on columns that do
not exist. This fix is basically a backport from mysql-5.5, see
changes done as part of BUG 39934.
There was memory leak when running some tests on PB2.
The reason of the failure is an early return from change_master()
that was supposed to deallocate a dyn-array.
Actually the same bug58915 was fixed in trunk with relocating the dyn-array
destruction into THD::cleanup_after_query() which can't be bypassed.
The current patch backports magne.mahre@oracle.com-20110203101306-q8auashb3d7icxho
and adds two optimizations: were done: the static buffer for the dyn-array to base on,
and the array initialization is called precisely when it's necessary rather than
per each CHANGE-MASTER as before.
mysql-test/suite/rpl/t/rpl_empty_master_host.test:
the test is binlog-format insensitive so it will be run with MIXED mode only.
mysql-test/suite/rpl/t/rpl_server_id_ignore.test:
the test is binlog-format insensitive so it will be run with MIXED mode only.
sql/sql_class.cc:
relocating the dyn-array
destruction into THD::cleanup_after_query().
sql/sql_lex.cc:
LEX.mi zero initialization is done in LEX().
sql/sql_lex.h:
Optimization for repl_ignore_server_ids to base on a static buffer
which size is chosen to fit to most common use cases.
sql/sql_repl.cc:
dyn-array destruction is relocated to THD::cleanup_after_query().
sql/sql_yacc.yy:
Refining logics of Lex->mi.repl_ignore_server_ids initialization.
The array is initialized once a corresponding option in CHANGE MASTER token sequence
is found.
Refactored the test case: hardened and extended it. Created test inc file
to abstract the task of relocating binlogs.
Also, disabled it on windows for not cluttering the test case any further,
as it depends heavily on doing filesystem operations and path handling.
mysql-test/include/relocate_binlogs.inc:
Auxiliar include file that performs the relocation of binary logs
listed in an index file.
There was memory leak when running some tests on PB2.
The reason of the failure is an early return from change_master()
that was supposed to deallocate a dyn-array.
Fixed with relocating the dyn-array's destructor at ~LEX() that is
the end of the session, per Gleb's patch idea.
Two optimizations were done: the static buffer for the dyn-array to base on,
and the array initialization is called precisely when it's necessary rather than
per each CHANGE-MASTER as before.
mysql-test/suite/rpl/t/rpl_empty_master_host.test:
the test is binlog-format insensitive so it will be run with MIXED mode only.
sql/sql_lex.cc:
the new flag is initialized.
sql/sql_lex.h:
A new bool flag new member to LEX.mi is added to stay UP since after
LEX.mi.repl_ignore_server_ids dynarray initialization was called
for the first time on the session. So it is set once and its life time
is session.
The array is destroyed at the end of the session.
sql/sql_repl.cc:
dyn-array destruction is relocated to ~LEX.
sql/sql_yacc.yy:
Refining logics of Lex->mi.repl_ignore_server_ids initialization.
The array is initialized once a corresponding option in CHANGE MASTER token sequence
is found.
The fact of initialization is memorized into the new flag.
BIN LOG HAS BEEN MOVED
When moving the binary/relay log files from one location to
another and restarting the server with a different log-bin or
relay-log paths, would cause the startup process to abort. The
root cause was that the server would not be able to find the log
files because it would consider old paths for entries in the
index file instead of the new location. What's even worse, the
relative paths would not be considered relative to the path
provided in log-bin and relay-log, but to mysql_data_dir.
We fix the cases where the server contains relative paths. When
the server is reading from the index file, it checks whether the
entry contains relative paths. If it does, we replace it with the
absolute path set in log-bin/relay-log option. Absolute paths
remain unchanged and the index must be manually edited to
consider the new log-bin and/or relay-log path (this should be
documented). This is a fix for a GA version, that does not break
behavior (that much).
For development versions, we should go with Zhenxing's approach
that removes paths altogether from index files.
mysql-test/include/begin_include_file.inc:
Added parameter to keep the begin_include_file.inc silent. Useful when
including scripts that contain platform dependent parameters, for example:
--let $rpl_server_parameters=--log-bin=$tmpdir/slave-bin --relay-log=$tmpdir/slave-relay-bin
--let $keep_include_silent=1
source include/rpl_start_server.inc;
--let $keep_include_silent=0
We want the paths ($tmpdir/slave-bin and $tmpdir/slave-relay-bin) not to be in the
result file.
mysql-test/suite/rpl/t/rpl_binlog_index.test:
Test case.
sql/log.cc:
When finding the corresponding log entry in the index file, we first
normalize the paths before doing the comparison. This will make relative
paths to be turned into absolute paths (based on the opt_bin_logname or
opt_relay_logname) and then compared against also, expanded paths entered,
through CHANGE MASTER for instance.
sql/log.h:
Added normalize_binlog_name, which turns relative paths, into absolute paths
given the parameter: is_relay_log ? opt_relay_logname : opt_bin_logname .
sql/mysqld.cc:
Exposing opt_bin_logname.
sql/mysqld.h:
Exposing opt_bin_logname.
When passing an empty user to the connect function will cause
valgrind warnings. Seems that the client code is not prepared
to handle empty users. On 5.6 this can even be triggered by
START SLAVE PASSWORD='...'; i.e., without setting USER='...' on
the START SLAVE command (see WL#4143 for details on the new
additional START SLAVE commands).
To fix this, we disallow empty users when configuring the slave
connection parameters (this decision might be revisited if the
client code accepts empty users in the future).
sql/slave.cc:
We throw an error if an empty user is supplied to the connection
function.
SCAN/CPU) => SLAVE FAILURE
When a statement containing a large amount of ROWs to be applied on
the slave, and the slave's table does not contain a PK, it can take a
considerable amount of time to find and change all the rows that are
to be changed.
The proper slave enhancement will be implemented in WL 5597. However,
in this bug we are making it clear to the user what the problem is, by
printing a message to the error log if the execution time, for a given
statement in RBR, takes more than LONG_FIND_ROW_THRESHOLD (set to 60
seconds). This shall help the DBA to diagnose what's happening when
facing a slave server that is quiet for no apparent reason...
The note is only printed to the error log if log_warnings is set to be
greater than 1.
sql/log_event.cc:
Core of the patch.
In Rows_log_event::do_apply_event, sets STMT start
timestamp.
In Rows_log_event::find_row, if a PK is not used, then the start
timestamp is checked to see if the time spent on this STMT is enough
to justify the printing of a note (only if it was not printed before).
sql/log_event.h:
Defining LONG_FIND_ROW_THRESHOLD.
sql/rpl_rli.cc:
Resets long_find_row state in rli->context_cleanup().
sql/rpl_rli.h:
Two new rli properties that are necessary to control when to
emit a note in the error log: 1) timestamp that states when the
ROW statement started; 2) flag indicating whether the note has
been emitted for the current statement or not. Also deployed
accessors.
post-push fixes for show_slave_io_error= 1 of wait_for_slave_io_error.inc;
Unix and win format path specifically so few tests have to change show_slave_io_error
to zero.
The bug case is similar to one fixed earlier bug_49536.
Deadlock involving LOCK_log appears to be possible because the purge running thread
is holding LOCK_log whereas there is no sense of doing that and which fact was
exploited by the earlier bug fixes.
Fixed with small reengineering of rotate_and_purge(), adding two new methods and
setting up a policy to execute those instead of the former
rotate_and_purge(RP_LOCK_LOG_IS_ALREADY_LOCKED).
The policy for using rotate(), purge() is that if the caller acquires LOCK_log itself,
it should call rotate(), release the mutex and run purge().
Side effect of this patch is refining error message of bug@11747416 to print
the whole path.
mysql-test/suite/rpl/r/rpl_cant_read_event_incident.result:
the file name printing is changed to a relative path instead of just the file name.
mysql-test/suite/rpl/r/rpl_log_pos.result:
the file name printing is changed to a relative path instead of just the file name.
mysql-test/suite/rpl/r/rpl_manual_change_index_file.result:
the file name printing is changed to a relative path instead of just the file name.
mysql-test/suite/rpl/r/rpl_packet.result:
the file name printing is changed to a relative path instead of just the file name.
mysql-test/suite/rpl/r/rpl_rotate_purge_deadlock.result:
new result file is added.
mysql-test/suite/rpl/t/rpl_cant_read_event_incident.test:
The test of that bug can't satisfy windows and unix backslash interpretation so windows
execution is chosen to bypass.
mysql-test/suite/rpl/t/rpl_rotate_purge_deadlock-master.opt:
new opt file is added.
mysql-test/suite/rpl/t/rpl_rotate_purge_deadlock.test:
regression test is added as well as verification of a
possible side effect of the fixes is tried.
sql/log.cc:
LOCK_log is never taken during execution of log purging routine.
The former MYSQL_BIN_LOG::rotate_and_purge is made to necessarily
acquiring and releasing LOCK_log.
If caller takes the mutex itself it has to use a new rotate(), purge()
methods combination and to never let purge() be run with LOCK_log grabbed.
split apart to allow
the caller to chose either it
Simulation of concurrently rotating/purging threads is added.
sql/log.h:
new rotate(), purge() methods are added to be used instead of
the former rotate_and_purge(RP_LOCK_LOG_IS_ALREADY_LOCKED).
rotate_and_purge() signature is changed. Caller should not call rotate_and_purge()
but rather {rotate(), purge()} if LOCK_log is acquired by it.
sql/rpl_injector.cc:
changes to reflect the new rotate_and_purge() signature.
sql/sql_class.h:
unnecessary constants are removed.
sql/sql_parse.cc:
changes to reflect the new rotate_and_purge() signature.
sql/sql_reload.cc:
changes to reflect the new rotate_and_purge() signature.
sql/sql_repl.cc:
followup for bug@11747416: the file name printing is changed to a relative
path instead of just the file name.
Binary log of master can get a partially logged event if the server
runs out of disk space and, while waiting for some space to be freed,
is shut down (or crashes). If the server is not stopped, it will just
wait endlessly for space to be freed, thus no partial event anomaly
occurs. The restarted master server has had a dubious policy to send
the incomplete event to slave which it apparently can't handle.
Although an error was printed out the fact of sending with unclear
error message is a source of confusion.
Actually the problem of presence an incomplete event in the binary log
was already fixed by WL 5493 (which was merged to our current trunk
branch, major version 5.6). The fix makes the server truncate the
binary log on server restart and recovery.
However 5.5 master can't do that. So the current issue is a problem of
sending incomplete events to the slave by 5.5 master.
It is fixed in this patch by changing the policy so that only complete
events are pushed by the dump thread to the IO thread. In addition,
the error text that master sends to the slave when an incomplete event
is found, now states that incomplete event may have been caused by an
out-of-disk space situation and provides coordinates of
the first and the last event bytes read.
mysql-test/std_data/bug11747416_32228_binlog.000001:
a binlog is added with the last event written partly.
mysql-test/suite/rpl/r/rpl_cant_read_event_incident.result:
new result file is added.
mysql-test/suite/rpl/r/rpl_log_pos.result:
results updated.
mysql-test/suite/rpl/r/rpl_manual_change_index_file.result:
results updated.
mysql-test/suite/rpl/r/rpl_packet.result:
results updated.
mysql-test/suite/rpl/t/rpl_cant_read_event_incident.test:
regression test for bug#11747416 : 32228 A disk full makes binary log corrupt
is added.
sql/share/errmsg-utf8.txt:
Increasing the explanatory part of ER_MASTER_FATAL_ERROR_READING_BINLOG error message twice
in order to fit to the updated version which carries some more info.
sql/sql_repl.cc:
Error text indicating a failure of reading from binlog that master delivers to the slave
is made more clear;
A policy to regard a partial event to send it out to the slave anyway is removed.
Problem: The following statements can cause the slave to go out of sync
if logged in statement format:
INSERT IGNORE...SELECT
INSERT ... SELECT ... ON DUPLICATE KEY UPDATE
REPLACE ... SELECT
UPDATE IGNORE :
CREATE ... IGNORE SELECT
CREATE ... REPLACE SELECT
Background: Since the order of the rows returned by the SELECT
statement or otherwise may differ on master and slave, therefore
the above statements may cuase the salve to go out of sync with
the master.
Fix:
Issue a warning when statements like the above are exectued and
the bin-logging format is statement. If the logging format is mixed,
use row based logging. Marking a statement as unsafe has been
done in the sql/sql_parse.cc instead of sql/sql_yacc.cc, because while
parsing for a token has been done we cannot be sure if the parsing
of the other tokens has been done as well.
Six new warning messages has been added for each unsafe statement.
binlog.binlog_unsafe.test has been updated to incoporate these additional unsafe statments.
******
BUG#11758262 - 50439: MARK INSERT...SEL...ON DUP KEY UPD,REPLACE...SEL,CREATE...[IGN|REPL] SEL
Problem: The following statements can cause the slave to go out of sync
if logged in statement format:
INSERT IGNORE...SELECT
INSERT ... SELECT ... ON DUPLICATE KEY UPDATE
REPLACE ... SELECT
UPDATE IGNORE :
CREATE ... IGNORE SELECT
CREATE ... REPLACE SELECT
Background: Since the order of the rows returned by the SELECT
statement or otherwise may differ on master and slave, therefore
the above statements may cuase the salve to go out of sync with
the master.
Fix:
Issue a warning when statements like the above are exectued and
the bin-logging format is statement. If the logging format is mixed,
use row based logging. Marking a statement as unsafe has been
done in the sql/sql_parse.cc instead of sql/sql_yacc.cc, because while
parsing for a token has been done we cannot be sure if the parsing
of the other tokens has been done as well.
Six new warning messages has been added for each unsafe statement.
binlog.binlog_unsafe.test has been updated to incoporate these additional unsafe statments.
mysql-test/extra/rpl_tests/rpl_insert_duplicate.test:
Test removed: Added the test to rpl.rpl_insert_ignore.test
******
Test removed: the test is redundant as the same is being tested in rpl.rpl_insert_ignore.
mysql-test/extra/rpl_tests/rpl_insert_id.test:
Warnings disabled for the unsafe statements.
mysql-test/extra/rpl_tests/rpl_insert_ignore.test:
1. Disabled warnings while for unsafe statements
2. As INSERT...IGNORE is an unsafe statement, an insert ignore not changing any rows,
will not be logged in the binary log, in the ROW and MIXED modes. It will however be logged
in STATEMENT mode.
mysql-test/r/commit_1innodb.result:
updated result file
******
updated result file
mysql-test/suite/binlog/r/binlog_stm_blackhole.result:
Updated result file.
mysql-test/suite/binlog/r/binlog_unsafe.result:
updated result file
mysql-test/suite/binlog/t/binlog_unsafe.test:
added tests for the statements marked as unsafe.
mysql-test/suite/rpl/r/rpl_insert_duplicate.result:
File Removed :Result file of rpl_insert_duplicate, which has been removed.
mysql-test/suite/rpl/r/rpl_insert_ignore.result:
Added the content of rpl.rpl_insert_duplicate here.
mysql-test/suite/rpl/r/rpl_insert_select.result:
Result file removed as the corresponding test has beenn removed.
mysql-test/suite/rpl/r/rpl_known_bugs_detection.result:
Updated result file.
mysql-test/suite/rpl/t/rpl_insert_duplicate.test:
File Removed: this was a wrapper for rpl.rpl_insert_duplicate.test, which has been removed.
mysql-test/suite/rpl/t/rpl_insert_select.test:
File Removed: This test became redundant after this fix, This test showed how INSERT IGNORE...SELECT break replication, which has been handled in this fix.
mysql-test/suite/rpl/t/rpl_known_bugs_detection.test:
Since all the tests are statement based bugs are being tested, having mixed format
forces the event to be written in row format. When the statement and causes the
test to fail as certain known bugs do not occur when the even is logged in row format.
sql/share/errmsg-utf8.txt:
added 6 new Warning messages.
******
added 6 new Warning messages.
sql/sql_lex.cc:
Added 6 new error Identifier [ER_BINLOG_STMT_UNSAFEE_*]
sql/sql_lex.h:
Added 6 new BINLOG_STMT_UNSAFE_* enums to identify the type of unsafe statement dealt with in this bug.
******
Added 6 new BINLOG_STMT_UNSAFE_* enums to identify the type of unsafe statement dealt with in this bug.
sql/sql_parse.cc:
added check for specific queries and marked them as unsafe.
******
added check for specific queries and marked them as unsafe.
The server crashes if it processes table map events that are
corrupted, especially if they map different tables to the same
identifier. This could happen, for instance, due to BUG 56226.
We fix this by checking whether the table map has already been
mapped before actually applying the event. If it has been mapped
with different settings an error is raised and the slave SQL
thread stops. If it has been mapped with same settings the event
is skipped. If the table is set to be ignored by the filtering
rules, there is no change in behavior: the event is skipped and
ids are not checked.
mysql-test/suite/rpl/t/rpl_row_corruption.test:
Added a simple test case that checks both cases:
- multiple table maps with the same identifier
- multiple table maps with the same identifier, but only one
is processed (the others are filtered out)
VM-WIN2003-32-A, SLES10-IA64-A
The test case waits for master_pos_wait not to timeout, which
means that the deadlock between SQL and IO threads was
succesfully and automatically dealt with.
However, very rarely, master_pos_wait reports a timeout. This
happens because the time set for master_pos_wait to wait was
too small (6 seconds). On slow test env this could be a
problem.
We fix this by setting the timeout inline with the one used
in sync_slave_with_master (300 seconds). In addition we
refactored the test case and refined some comments.
Manual merged mysql-5.1-gca into latest mysql-5.5.
Conflicts
=========
Text conflict in mysql-test/suite/rpl/r/rpl_relayspace.result
Text conflict in mysql-test/suite/rpl/t/rpl_relayspace.test
If LOAD DATA INFILE featured a SET clause, the name=value pairs
would be regenerated using item::print. Unfortunately, that code
is mostly optimized for EXPLAIN EXTENDED output and such, and can
not be relied on to return valid SQL.
We now name each value its original, user-supplied form and use
that to create LOAD DATA INFILE statements for statement-based
replication.
mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result:
minor change in syntactic sugar
mysql-test/suite/rpl/r/rpl_loaddatalocal.result:
add test case
mysql-test/suite/rpl/t/rpl_loaddatalocal.test:
add test case
sql/sql_load.cc:
Do not try to item::print values in LOAD DATA INFILE's
SET clause; they might not even be valid SQL at this
point. Use our saved version instead.
sql/sql_yacc.yy:
If LOAD DATA INFILE has SET name=val clauses, tag the
individual val-parts with the user's version so we can
later replicate that, rather than the smashed pieces
we'd get from item::print once the optimizer's through
with our poor values.
- add support for choosing the engine of test
table(t1) with $engine_type
- add primary key to the test table(t1) to support
replication of BLOB/TEXT (also with ENGINE=ndb)
- change the suppression since the warning printed to error log
now says "Column 1"
IN WAIT_SHOW_CONDITION)
There was a typo in the name of one of the parameters to the
include file wait_show_condition. The parameter name was being
set to "connection" instead of "condition".
We fix this typo, improve one instruction in the test case and
deploy parameter checks inside wait_show_condition.inc.
Manual merge from mysql-5.1 into mysql-5.5.
Conflicts
=========
Text conflict in mysql-test/suite/rpl/t/rpl_row_until.test
Text conflict in sql/handler.h
Text conflict in storage/archive/ha_archive.cc
ARE NOT BEING HONORED
max_allowed_packet works in conjunction with net_buffer_length.
max_allowed_packet is an upper bound of net_buffer_length.
So it doesn't make sense to set the upper limit lower than the value.
Added a warning (using ER_UNKNOWN_ERRROR and a specific message)
when this is done (in the log at startup and when setting either
max_allowed_packet or the net_buffer_length variables)
Added a test case.
Fixed several tests that broke the above rule.
The slave was not able to find the correct row in the innodb
table, because the row fetched from the innodb table would not
match the before image. This happened because the (don't care)
bytes in the NULLed fields would change once the row was stored
in the storage engine (from zero to the default value). This
would make bulk memory comparison (using memcmp) to fail.
We fix this by taking a preventing measure and avoiding memcmp
for tables that contain nullable fields. Therefore, we protect
the slave search routine from engines that return arbitrary
values for don't care bytes (in the nulled fields). Instead, the
slave thread will only check null_bits and those fields that are
not set to NULL when comparing the before image against the
storage engine row.
mysql-test/extra/rpl_tests/rpl_record_compare.test:
Added test case to the include file so that this is tested
with more than one engine.
mysql-test/suite/rpl/r/rpl_row_rec_comp_innodb.result:
Result update.
mysql-test/suite/rpl/r/rpl_row_rec_comp_myisam.result:
Result update.
mysql-test/suite/rpl/t/rpl_row_rec_comp_myisam.test:
Moved the include file last, so that the result from
BUG#11766865 is not intermixed with the result for
BUG#11760454.
sql/log_event.cc:
Skips memory comparison if the table has nullable
columns and compares only non-nulled fields in the
field comparison loop.
Currently, rpl_semi_sync is failing in PB due to the warning message:
"Slave SQL: slave SQL thread is being stopped in the middle of "
"applying of a group having updated a non-transaction table; "
"waiting for the group completion ..."
The problem started happening after the fix for BUG#11762407 what was
automatically suppressing some warning messages.
To fix the current issue, we suppress the aforementioned warning message
and exploit the opportunity to make the sentence clearer.
There is a race between two threads: user thread and the dump
thread. The former sets a debug instruction that makes the latter wait
before processing an Xid event. There can be cases that the dump
thread has not yet processed the previous Xid event, causing it to
wait one Xid event too soon, thus causing sync_slave_with_master never
to resume.
We fix this by moving the instructions that set the debug variable
after calling sync_slave_with_master.
Problem: the test failed because errors were found in the error log.
The test case contains suppressions for an old version of the error message,
but the format of the error message has changed without updating the suppression.
Fix: Update the suppression. Also small fixes to improve the test.
mysql-test/suite/rpl/r/rpl_slave_load_remove_tmpfile.result:
update result file
mysql-test/suite/rpl/t/rpl_slave_load_remove_tmpfile-slave.opt:
Use variables instead of .opt files to avoid server restarts.
mysql-test/suite/rpl/t/rpl_slave_load_remove_tmpfile.test:
1. To fix the bug, we update the regular expression in mtr.add_suppression
so that it matches the real error text.
2. Use wait_for_slave_sql_error.inc when we wait for an error.
This makes the test easier to understand and will produce better
debug info if the test fails.
3. Use server variables instead of command line options to set
the @@GLOBAL.DEBUG variable. This avoids server restarts when
running the test suite.
4. Clarify the comment at the top of the file and add bug reference.