Major replication test framework cleanup. This does the following:
- Ensure that all tests clean up the replication state when they
finish, by making check-testcase check the output of SHOW SLAVE STATUS.
This implies:
- Slave must not be running after test finished. This is good
because it removes the risk for sporadic errors in subsequent
tests when a test forgets to sync correctly.
- Slave SQL and IO errors must be cleared when test ends. This is
good because we will notice if a test gets an unexpected error in
the slave threads near the end.
- We no longer have to clean up before a test starts.
- Ensure that all tests that wait for an error in one of the slave
threads waits for a specific error. It is no longer possible to
source wait_for_slave_[sql|io]_to_stop.inc when there is an error
in one of the slave threads. This is good because:
- If a test expects an error but there is a bug that causes
another error to happen, or if it stops the slave thread without
an error, then we will notice.
- When developing tests, wait_for_*_to_[start|stop].inc will fail
immediately if there is an error in the relevant slave thread.
Before this patch, we had to wait for the timeout.
- Remove duplicated and repeated code for setting up unusual replication
topologies. Now, there is a single file that is capable of setting
up arbitrary topologies (include/rpl_init.inc, but
include/master-slave.inc is still available for the most common
topology). Tests can now end with include/rpl_end.inc, which will clean
up correctly no matter what topology is used. The topology can be
changed with include/rpl_change_topology.inc.
- Improved debug information when tests fail. This includes:
- debug info is printed on all servers configured by include/rpl_init.inc
- User can set $rpl_debug=1, which makes auxiliary replication files
print relevant debug info.
- Improved documentation for all auxiliary replication files. Now they
describe purpose, usage, parameters, and side effects.
- Many small code cleanups:
- Made have_innodb.inc output a sensible error message.
- Moved contents of rpl000017-slave.sh into rpl000017.test
- Added mysqltest variables that expose the current state of
disable_warnings/enable_warnings and friends.
- Too many to list here: see per-file comments for details.
MTR sporadically reported that rpl_do_grant does not
clean up after itself.
We fix this by backporting BUG 50984 fix. This deploys
missing synchronization between master and slave.
Additionally, it also fixes the check_testcase for
rpl_tmp_table_and_DDL.
Stored routine DDL statements use statement-based replication
regardless of the current binlog format. The problem here was
that if a DDL statement failed during metadata lock acquisition
or opening of mysql.proc, the binlog format would not be reset
before returning. So the following DDL or DML statements are
binlogged with a wrong binlog format, which causes the slave
to stop.
The problem can be resolved by grabbing an exclusive MDL lock firstly
instead of clearing the current binlog format. So that the binlog
format will not be affected when the lock grab returns directly with
an error. The same way is taken to open a proc table for update.
We found that there are some tests that are not cleaning
up properly:
1. rpl_tmp_table_and_DDL
2. rpl_do_grant
3. rpl_sync
For #1 and #2 we found that the slave would not, for some
cases, replicate all the instructions the master processed
in the cleanup section. We fix these by deploying some
synchronization commands in the test cases so that slave
processes all clean up instructions.
As for #3, this is tracked as part of another bug
(BUG@50442).
In RBR, DDL statement will change binlog format to non row-based
format before it is binlogged, but the binlog format was not be
restored, and then manipulating a temporary table can not reset binlog
format to row-based format rightly. So that the manipulated statement
is binlogged with statement-based format.
To fix the problem, restore the state of binlog format after the DDL
statement is binlogged.