~40% bugfixed(*) applied
~40$ bugfixed reverted (incorrect or we're not buggy)
~20% bugfixed applied, despite us being not buggy
(*) only changes in the server code, e.g. not cmakefiles
Problem: Uninstallation of semi sync plugin causes replication to
break.
Analysis: A semisync enabled replication is mutual agreement between
Master and Slave when the connection (I/O thread) is established.
Once I/O thread is started and if semisync is enabled on both
master and slave, master appends special magic header to events
using semisync plugin functions and sends it to slave. And slave
expects that each event will have that special magic header format
and reads those bytes using semisync plugin functions.
When semi sync replication is in use if users execute
uninstallation of the plugin on master, slave gets confused while
interpreting that event's content because it expects special
magic header at the beginning of the event. Slave SQL thread will
be stopped with "Missing magic number in the header" error.
Similar problem will happen if uninstallation of the plugin happens
on slave when semi sync replication is in in use. Master sends
the events with magic header and slave does not know about the
added magic header and thinks that it received a corrupted event.
Hence slave SQL thread stops with "Found corrupted event" error.
Fix: Uninstallation of semisync plugin will be blocked when semisync
replication is in use and will throw 'ER_UNKNOWN_ERROR' error.
To detect that semisync replication is in use, this patch uses
semisync status variable values.
> On Master, it checks for 'Rpl_semi_sync_master_status' to be OFF
before allowing the uninstallation of rpl_semi_sync_master plugin.
>> Rpl_semi_sync_master_status is OFF when
>>> there is no dump thread running
>>> there are no semisync slaves
> On Slave, it checks for 'Rpl_semi_sync_slave_status' to be OFF
before allowing the uninstallation of rpl_semi_sync_slave plugin.
>> Rpl_semi_sync_slave_status is OFF when
>>> there is no I/O thread running
>>> replication is asynchronous replication.
Give error for wrong parameters to CHANGE MASTER
Extend MASTER_PASSWORD and MASTER_HOST lengths
mysql-test/suite/rpl/r/rpl_password_boundaries.result:
Test length of MASTER_PASSWORD, MASTER_HOST and MASTER_USER
mysql-test/suite/rpl/r/rpl_semi_sync.result:
Use different password than user name for better test coverage
mysql-test/suite/rpl/t/rpl_password_boundaries.test:
Test length of MASTER_PASSWORD, MASTER_HOST and MASTER_USER
mysql-test/suite/rpl/t/rpl_semi_sync.test:
Use different password than user name for better test coverage
sql/rpl_mi.h:
Extend MASTER_PASSWORD and MASTER_HOST lengths
sql/sql_repl.cc:
Give error for wrong parameters to CHANGE MASTER
sql/sql_repl.h:
Extend MASTER_PASSWORD and MASTER_HOST lengths
mysql-test-run auto-disables all optional plugins.
mysql-test/include/default_client.cnf:
no @OPT.plugindir anymore
mysql-test/include/default_mysqld.cnf:
don't disable plugins manually - mtr can do it better
mysql-test/suite/innodb/t/innodb_bug47167.test:
mtr now uses suite-dir as an include path
mysql-test/suite/innodb/t/innodb_file_format.test:
mtr now uses suite-dir as an include path
mysql-test/t/partition_binlog.test:
this test uses partitions
storage/example/mysql-test/mtr/t/source.result:
update results. as mysqltest includes the correct overlayed include
storage/innobase/handler/ha_innodb.cc:
the assert is wrong
Currently, rpl_semi_sync is failing in PB due to the warning message:
"Slave SQL: slave SQL thread is being stopped in the middle of "
"applying of a group having updated a non-transaction table; "
"waiting for the group completion ..."
The problem started happening after the fix for BUG#11762407 what was
automatically suppressing some warning messages.
To fix the current issue, we suppress the aforementioned warning message
and exploit the opportunity to make the sentence clearer.
After stopped slave, it is possible that the Dump thread on master
is still running and has locked the semi-sync master plugin, and when
uninstalling the semi-sync master plugin, a plugin busy warning could
be generated.
Fixed by disabling the warnings when uninstalling semi-sync plugin
on master.
The root cause of the crash is that a TranxNode is freed before it is used.
A TranxNode is allocated and inserted into the active list each time
a log event is written and flushed into the binlog file.
The memory for TranxNode is allocated with thd_alloc and will be freed
at the end of the statement. The after_commit/after_rollback callback
was supposed to be called before the end of each statement and remove the node from
the active list. However this assumption is not correct in all cases(e.g. call
'CREATE TEMPORARY TABLE myisam_t SELECT * FROM innodb_t' in a transaction
and delete all temporary tables automatically when a session closed),
and can cause the memory allocated for TranxNode be freed
before it was removed from the active list. So The TranxNode pointer in the active
list would become a wild pointer and cause the crash.
After this patch, We have a class called a TranxNodeAllocate which manages the memory
for allocating and freeing TranxNode. It uses my_malloc to allocate memory.
sql/rpl_handler.cc:
params are not initialized.
CMakeLists.txt:
Add plugin/semisync subdirectory
mysql-test/mysql-test-run.pl:
Check for semisync dll for Windows
mysql-test/suite/rpl/r/rpl_semi_sync.result:
Update result file
mysql-test/suite/rpl/t/rpl_semi_sync.test:
Test semi-sync on Windows
plugin/semisync/semisync_master.cc:
Define gettimeofday for Windows
Remove functions that no longer needed
Fix warning suppressions
mysql-test/suite/rpl/t/rpl_semi_sync.test:
Fix warning suppressions
plugin/semisync/semisync_slave.cc:
Remove functions that no longer needed
plugin/semisync/semisync_slave.h:
Remove functions that no longer needed
Add an option to control whether the master should keep waiting
until timeout when it detected that there is no semi-sync slave
available.
The bool option 'rpl_semi_sync_master_wait_no_slave' is 1 by
defalt, and will keep waiting until timeout. When set to 0, the
master will switch to asynchronous replication immediately when
no semi-sync slave is available.
Semi-sync status were not reset by FLUSH STATUS, this was because
all semi-sync status variables are defined as SHOW_FUNC and FLUSH
STATUS could only reset SHOW_LONG type variables.
This problem is fixed by change all status variables that should
be reset by FLUSH STATUS from SHOW_FUNC to SHOW_LONG.
After the fix, the following status variables will be reset by
FLUSH STATUS:
Rpl_semi_sync_master_yes_tx
Rpl_semi_sync_master_no_tx
Note: normally, FLUSH STATUS itself will be written into binlog
and be replicated, so after FLUSH STATS, one of
Rpl_semi_sync_master_yes_tx
Rpl_semi_sync_master_no_tx
can be 1 dependent on the semi-sync status. So it's recommended
to use FLUSH NO_WRITE_TO_BINLOG STATUS to avoid this.
Semi-sync uses an extra connection from slave to master to send
replies, this is a normal client connection, and used a normal
SET query to set the reply information on master, which is visible
to user and may cause some confusion and complaining.
This problem is fixed by using the method of sending reply by
using the same connection that is used by master dump thread to
send binlog to slave. Since now the semi-sync plugins are integrated
with the server code, it is not a problem to use the internal net
interfaces to do this.
The master dump thread will mark the event requires a reply and
wait for the reply when the event just sent is the last event
of a transaction and semi-sync status is ON; And the slave will
send a reply to master when it received such an event that requires
a reply.