Problem: the master executed a statement that would fail on the slave
(namely, DROP USER 'create_rout_db'@'localhost').
The test then did:
    --let $rpl_only_running_threads= 1
    --source include/rpl_reset.inc
rpl_reset.inc calls rpl_sync.inc, which first checks which of
the slave threads are running and then syncs only those threads.
If the SQL thread fails after the check but before the sync, the
sync fails. So there was a race in the test, and it failed on some
slow hosts.
Fix: Don't replicate the failing statement.
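The race described above is a check-then-act pattern. The following is a
minimal Python sketch of it, not the actual rpl_sync.inc logic: the
SlaveThread class, sync_running_threads() and its stops_after_check
parameter are hypothetical names used only to make the window between
the check and the sync explicit.

```python
class SlaveThread:
    """Hypothetical stand-in for a slave IO or SQL thread."""

    def __init__(self, name):
        self.name = name
        self.running = True


def sync_running_threads(threads, stops_after_check=None):
    """Check which threads run, then sync only those (check-then-act)."""
    # Step 1: the check.
    to_sync = [t for t in threads if t.running]

    # Race window: on a slow host a thread can stop with an error here.
    # stops_after_check simulates that event deterministically.
    if stops_after_check is not None:
        stops_after_check.running = False

    # Step 2: the sync. It assumes the result of step 1 is still valid.
    for t in to_sync:
        if not t.running:
            raise RuntimeError(
                f"sync failed: {t.name} thread stopped after the check")
    return [t.name for t in to_sync]
```

If the thread has already stopped before the check, it is simply skipped
and the sync succeeds; only a stop inside the window makes the sync fail.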
Major replication test framework cleanup. This does the following:
- Ensure that all tests clean up the replication state when they
  finish, by making check-testcase check the output of SHOW SLAVE STATUS.
  This implies:
   - The slave must not be running after the test has finished. This is
     good because it removes the risk of sporadic errors in subsequent
     tests when a test forgets to sync correctly.
   - Slave SQL and IO errors must be cleared when the test ends. This is
     good because we will notice if a test gets an unexpected error in
     the slave threads near the end.
   - We no longer have to clean up before a test starts.
- Ensure that all tests that wait for an error in one of the slave
  threads wait for a specific error. It is no longer possible to
  source wait_for_slave_[sql|io]_to_stop.inc when there is an error
  in one of the slave threads. This is good because:
   - If a test expects an error but there is a bug that causes
     another error to happen, or if it stops the slave thread without
     an error, then we will notice.
   - When developing tests, wait_for_*_to_[start|stop].inc will fail
     immediately if there is an error in the relevant slave thread.
     Before this patch, we had to wait for the timeout.
- Remove duplicated and repeated code for setting up unusual replication
  topologies. Now, there is a single file that is capable of setting
  up arbitrary topologies (include/rpl_init.inc, but
  include/master-slave.inc is still available for the most common
  topology). Tests can now end with include/rpl_end.inc, which will clean
  up correctly no matter what topology is used. The topology can be
  changed with include/rpl_change_topology.inc.
- Improved debug information when tests fail. This includes:
   - Debug info is printed on all servers configured by
     include/rpl_init.inc.
   - The user can set $rpl_debug=1, which makes auxiliary replication
     files print relevant debug info.
- Improved documentation for all auxiliary replication files. Now they
  describe purpose, usage, parameters, and side effects.
- Many small code cleanups:
   - Made have_innodb.inc output a sensible error message.
   - Moved contents of rpl000017-slave.sh into rpl000017.test.
   - Added mysqltest variables that expose the current state of
     disable_warnings/enable_warnings and friends.
   - Too many to list here: see per-file comments for details.
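The "wait for a specific error" rule above can be illustrated with a
small Python sketch. This is not the code of the wait_for_*.inc files:
wait_for_thread_to_stop() and the get_status callable, which returns a
(running, last_errno) pair, are hypothetical, and the error numbers used
with them are arbitrary.

```python
import time


def wait_for_thread_to_stop(get_status, expected_errno=0,
                            timeout=5.0, poll=0.01):
    """Wait until the thread stops; fail at once on an unexpected error.

    get_status is a hypothetical callable returning (running, last_errno).
    """
    deadline = time.monotonic() + timeout
    while True:
        running, errno = get_status()
        # Any error other than the expected one fails immediately,
        # instead of waiting for the timeout.
        if errno != 0 and errno != expected_errno:
            raise AssertionError(
                f"unexpected error {errno}, expected {expected_errno}")
        if not running:
            # Stopping without the expected error is also a failure.
            if errno != expected_errno:
                raise AssertionError(
                    f"stopped with error {errno}, expected {expected_errno}")
            return errno
        if time.monotonic() >= deadline:
            raise TimeoutError("thread did not stop in time")
        time.sleep(poll)
```

With this shape, a test that expects one error notices both a different
error and a clean stop, and neither case has to run into the timeout.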
"Grantor" columns' data is lost when replicating mysql.tables_priv.
The slave SQL thread used its default user ''@'' as the grantor of
GRANT|REVOKE statements executing on it.
In this patch, the current user is put in the query log event for all
GRANT and REVOKE statements, and the SQL thread uses the user in the
query log event as the grantor.
MTR sporadically reported that rpl_do_grant does not
clean up after itself.
We fix this by backporting the fix for BUG 50984. This adds the
missing synchronization between master and slave.
Additionally, it fixes the check_testcase for
rpl_tmp_table_and_DDL.
Conflicts:
Text conflict in configure.in
Text conflict in dbug/dbug.c
Text conflict in mysql-test/r/ps.result
Text conflict in mysql-test/t/ps.test
Text conflict in sql/CMakeLists.txt
Text conflict in sql/ha_ndbcluster.cc
Text conflict in sql/mysqld.cc
Text conflict in sql/sql_plugin.cc
Text conflict in sql/sql_table.cc
A failed REVOKE statement is logged with error=0, thus causing
the slave to stop. The slave should not stop, as this was an
expected error. Given that the execution failed on the master as
well, the error code should be logged so that the slave can replay
the statement, get an error, and compare it with the master's
execution outcome. If the errors match, then the slave can proceed
with replication, as the error it got when replaying the statement
was expected.
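The comparison described above can be sketched in a few lines of Python.
This is only an illustration of the rule, not the server's code:
slave_can_proceed() is a hypothetical name, and FAILED is an arbitrary
nonzero error code.

```python
def slave_can_proceed(logged_errno, replay_errno):
    """True if replaying the event gave the outcome the master logged."""
    return logged_errno == replay_errno


FAILED = 1141  # arbitrary nonzero code, used only for illustration

# The bug: the master logged error=0 although the statement failed,
# so the slave's replay error does not match and the slave stops.
print(slave_can_proceed(0, FAILED))

# The intended behavior: the master logs the real code, the slave gets
# the same error on replay, and replication continues.
print(slave_can_proceed(FAILED, FAILED))
```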
In this particular case, the bug surfaces because the error code
is pushed to the THD diagnostics area after the event is written to
the binary log. Therefore, the event would be logged while the THD
diagnostics area was still clean, so its error code would be 0
instead of the actual error.
We fix this by moving the error reporting ahead of the call to
the routine that writes the event to the binary log.
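The ordering problem can be shown with a toy model in Python. It is a
sketch under the assumption stated in the text, namely that the logged
error code is read from the diagnostics area at the moment the event is
written. The Thd class and both functions are hypothetical and are not
the server's THD or its binlog API.

```python
class Thd:
    """Toy model of a session: a diagnostics area and a binary log."""

    def __init__(self):
        self.diag_errno = 0   # 0 means the diagnostics area is clean
        self.binlog = []      # list of (query, logged error code)

    def report_error(self, errno):
        self.diag_errno = errno

    def write_binlog(self, query):
        # The logged error code is whatever the diagnostics area holds
        # at the moment the event is written.
        self.binlog.append((query, self.diag_errno))


def failed_revoke_before_fix(thd, query, errno):
    thd.write_binlog(query)   # diagnostics area still clean: logs 0
    thd.report_error(errno)   # error is reported too late


def failed_revoke_after_fix(thd, query, errno):
    thd.report_error(errno)   # report the error first
    thd.write_binlog(query)   # the event now carries the real code
```

Calling the first function leaves an event with error code 0 in the toy
binary log; calling the second leaves an event with the reported code.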
We found that there are some tests that are not cleaning
up properly:
1. rpl_tmp_table_and_DDL
2. rpl_do_grant
3. rpl_sync
For #1 and #2 we found that the slave would not, in some
cases, replicate all the instructions the master processed
in the cleanup section. We fix these by adding
synchronization commands to the test cases so that the slave
processes all cleanup instructions.
As for #3, this is tracked as part of another bug
(BUG@50442).
{PROCEDURE|FUNCTION} FROM ...'
The master would hit an assertion when the binary log was
active. This was because the thread's diagnostics
area was being cleared before writing to the binlog,
regardless of whether mysql_routine_grant returned an error.
When mysql_routine_grant returned an error, the return
value and the diagnostics area contents would
mismatch. Consequently, my_ok would not be called and no
error would be signaled in the diagnostics area, eventually
triggering the assertion in net_end_statement.
We fix this by not clearing the diagnostics area at binlogging
time.
The mysql.procs_priv table itself does not get replicated.
Instead, a routine privilege record is inserted into
mysql.procs_priv as a side effect of CREATE FUNCTION/PROCEDURE
statements, depending on the current user's privileges.
The current user of the SQL thread has GLOBAL_ACL, so no
mysql.procs_priv check is needed when it creates, alters, or
executes routines, and a user with GLOBAL_ACL does not insert
a routine privilege record into mysql.procs_priv when creating
a routine.
Fixed by switching the current user of the SQL thread to the
definer user if the definer user exists on the slave, which
populates procs_priv. Otherwise the SQL thread user is kept and
procs_priv remains unchanged.
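The selection rule in the fix can be sketched as follows. This Python
model is a simplification and not the server's code: all names are
hypothetical, procs_priv is a plain list, and the definer is assumed to
be an ordinary user without global privileges, so that creating a
routine as that user adds a privilege record.

```python
SQL_THREAD_USER = "''@''"  # default user of the slave SQL thread


def user_for_routine(definer, slave_users):
    """Use the definer if it exists on the slave, else the SQL thread user."""
    return definer if definer in slave_users else SQL_THREAD_USER


def replay_create_routine(routine, definer, slave_users, procs_priv):
    """Replay a CREATE routine statement and return the user it ran as."""
    user = user_for_routine(definer, slave_users)
    if user != SQL_THREAD_USER:
        # Running as the definer: a routine privilege record is inserted.
        procs_priv.append((user, routine))
    # Running as the SQL thread user (GLOBAL_ACL): procs_priv is untouched.
    return user
```

For example, replaying a routine whose definer exists on the slave adds
one entry to the list, while a missing definer leaves the list as it was.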