RPL_ROTATE_LOGS has been failing sporadically in what seems a
problem related to routines that update the coordinates. However,
the test lacks proper assert statments and because of this the
debug information upon failure simply points to the content
mismatch between the test and the result file.
Not as a solution, but as a improvement to the test to better
debug this failure, new assert statments were added to the test.
@rpl_rotate_logs.test
Added new assert statments reducing the
dependency on the result file.
@rpl_rotate_logs.result
Added new content to the result file to
match the test changes
Problem: The problem with the test is that the slave returns
from start_slave.inc call too early before the list
is actually actualised. This caused the slave stale
data to be reported.
Fix: Added a wait in the test till the slave's IO status is
changed to "Waiting for master to send event" which
which ensures the list is correctly updated.
When a binlog is replayed into a server, e.g.:
$ mysqlbinlog binlog.000001 | mysql
it sets a pseudo slave mode on the client connection in order to server
be able to read binlog events, there is, a format description event is
needed to correctly read following events.
Also this pseudo slave mode applies to the current connection
replication rules that are needed to correctly apply binlog events.
If a binlog dump is sourced on a connection, this pseudo slave mode will
remains after it, what will apply unexpected rules from customer
perspective to following commands.
Added a new SET statement to binlog dump that will unset pseudo slave
mode at the end of dump file.
QUOTING IN REPLICATION
Problem: Misquoting or unquoted identifiers may lead to
incorrect statements to be logged to the binary log.
Fix: we use specialized functions to append quoted identifiers in
the statements generated by the server.
rpl_cant_read_event_incident:
Slave applies updates from bug11747416_32228_binlog.000001 file which
contains a CREATE TABLE t statement and an incident, when SQL thread is
running slowly IO thread may reach the incident before SQL thread
executes the create table statement.
Execute "drop table if exists t" and also perform a RESET MASTER to
clean slave binary logs.
rpl_bug41902:
Error "MYSQL_BIN_LOG::purge_logs was called with file
./master-bin.000001 not listed in the index." suppression is not
considering windows path, there is ".\master-bin.000001".
Changed suppression to: "MYSQL_BIN_LOG::purge_logs was called with file
..master-bin.000001 not listed in the index", to match ".\" and "./".
updating the result file. Because a multi-row insert now reserves the
auto increment values before hand, if any explicitly specified auto
increment values are there, then some of the reserved values are lost.
Problem
========
SQL statements close to the size of max_allowed_packet produce binary
log events larger than max_allowed_packet.
The reason why this failure is occuring is because the event length is
more than the total size of the max_allowed_packet + max_event_header
length. Now since the event length exceeds this size master Dump
thread is unable to send the packet on to the slave.
That can happen e.g with row-based replication in Update_rows event.
Fix
====
The problem was fixed by increasing the max_allowed_packet for the
slave's threads (IO/SQL) by increasing it to 1GB.
This is done using the new server option included which is used to
regulate the max_allowed_packet of the slave thread (IO/SQL).
This causes the large packets to be received by the slave and apply
it successfully.
sql/log_event.h:
Added the new option in the log_event.h file.
sql/mysqld.cc:
Added a new option to the server.
sql/slave.cc:
Increasing the session max_allowed_packet to a large value ,
i.e. not taking global(max_allowed) into consideration, for the slave's threads.
Problem - The failure on PB2 is possbily due to the port number being still in
use even after the server restarts which is not reflected in the
server restart.
Fix - The problem is fixed by starting the servers forcefully using the option
file and also the parameters for the server restart is passed correctly.
mysql-test/suite/rpl/t/rpl_report_port-master.opt:
Option file for the master.
The function mysql_show_binlog_events has a local stack variable
'LOG_INFO linfo;', which is assigned to thd->current_linfo, however
this variable goes out of scope and is destroyed before clean
thd->current_linfo.
The problem is solved by moving 'LOG_INFO linfo;' to function scope.
BUG#11761686 insert_id event is not filtered.
Two issues are covered.
INSERT into autoincrement field which is not the first part in the composed primary key
is unsafe by autoincrement logging design. The case is specific to MyISAM engine
because Innodb does not allow such table definition.
However no warnings and row-format logging in the MIXED mode was done, and
that is fixed.
Int-, Rand-, User-var log-events were not filtered along with their parent
query that made possible them to screw up execution context of the following
query.
Fixed with deferring their execution until the parent query.
******
Bug#11754117
Post review fixes.
mysql-test/suite/rpl/r/rpl_auto_increment_bug45679.result:
a new result file is added.
mysql-test/suite/rpl/r/rpl_filter_tables_not_exist.result:
results updated.
mysql-test/suite/rpl/t/rpl_auto_increment_bug45679.test:
regression test for BUG#11754117-45670 is added.
mysql-test/suite/rpl/t/rpl_filter_tables_not_exist.test:
regression test for filtering issue of BUG#11754117 - 45670 is added.
sql/log_event.cc:
Logics are added for deferring and executing events associated
with the Query event.
sql/log_event.h:
Interface to deferred events batch execution is added.
sql/rpl_rli.cc:
initialization for new RLI members is added.
sql/rpl_rli.h:
New members to RLI are added to facilitate deferred events gathering
and execution control;
two general character RLI cleanup methods are constructed.
sql/rpl_utility.cc:
Deferred_log_events methods are difined.
sql/rpl_utility.h:
A new class Deferred_log_events is defined to implement
IRU events gathering, execution and cleanup.
sql/slave.cc:
Necessary changes to initialize `rli->deferred_events' and prevent
deferred event deletion in the main read-exec branch.
sql/sql_base.cc:
A new safe-check function for multi-part pk with auto-increment is defined
and deployed in lock_tables().
sql/sql_class.cc:
Initialization for a new member and replication cleanups are added
to THD class.
sql/sql_class.h:
THD class receives a new member to hold a specific execution
context for slave applier.
sql/sql_parse.cc:
Execution of the deferred event in started prior to its parent query.
PROBLEM:
--------
When binary log statements are replayed on the slave, BEGIN is represented
in com_counters but COMMIT is not. Similarly in 'ROW' based replication
'INSERT','UPDATE',and 'DELETE' com_counters are not getting incremented
when the binary log statements are replayed at slave.
ANALYSIS:
---------
In 'ROW' based replication for COMMIT,INSERT,UPDATE and DELETE operations
following special events are invoked.
Xid_log_event,Write_rows_log_event,Update_rows_log_event,Update_rows_log_event.
The above mentioned events doesn't go through the parser where the
'COM_COUNTERS' are incremented.
FIX:
-----
Increment statements are added at appropriate events.
Respective functions are listed below.
'Xid_log_event::do_apply_event'
'Write_rows_log_event::do_before_row_operations'
'Update_rows_log_event::do_before_row_operations'
'Delete_rows_log_event::do_before_row_operations'
sql/log_event.cc:
Added code to increment counts for 'COM_INSERT','COM_UPDATE',
'COM_DELETE' and 'COM_COMMIT'during ROW based replicaiton
Problem - this failure occured in the test added for the fix of the
bug-13333431. The basic problem of the failure was the
value of the report_port which persisted even after the end
of the test (ie. rpl_end.inc). So this causes the assertion
in the test to fail if it is executed again.
Fix - restarted the server with the default value being passed to the
report_port after testing the two expected case so that in the
next run of the test we will not encounter the previous value of
report_port.
mysql-test/suite/rpl/r/rpl_report_port.result:
Updated the corresponding result file.
mysql-test/suite/rpl/t/rpl_report_port-slave.opt:
Removed the slave option file.
mysql-test/suite/rpl/t/rpl_report_port.test:
Added the restart server option before ending the test.
Description: When the table has more than one unique or primary key,
INSERT... ON DUP KEY UPDATE statement is sensitive to the order in which
the storage engines checks the keys. Depending on this order, the storage
engine may determine different rows to mysql, and hence mysql can update
different rows on master and slave.
Solution: We mark INSERT...ON DUP KEY UPDATE on a table with more than on unique
key as unsafe therefore the event will be logged in row format if it is available
(ROW/MIXED). If only STATEMENT format is available, a warning will be thrown.
mysql-test/suite/binlog/r/binlog_unsafe.result:
Updated result file
mysql-test/suite/binlog/t/binlog_unsafe.test:
Added test to check for warning being thrown when the unsafe statement is executed
mysql-test/suite/rpl/r/rpl_known_bugs_detection.result:
Updated result file
sql/share/errmsg-utf8.txt:
Added new warning message
sql/sql_base.cc:
check for tables in the query with more than one UNIQUE KEY and INSERT ON DUPLICATE KEY UPDATE, and mark such statements unsafe.
BUG#64503: mysql frequently ignores --relay-log-space-limit
When the SQL thread goes to sleep, waiting for more events, it sets
the flag ignore_log_space_limit to true. This gives the IO thread a
chance to queue some more events and ultimately the SQL thread will be
able to purge the log once it is rotated. By then the SQL thread
resets the ignore_log_space_limit to false. However, between the time
the SQL thread has set the ignore flag and the time it resets it, the
IO thread will be queuing events in the relay log, possibly going way
over the limit.
This patch makes the IO and SQL thread to synchronize when they reach
the space limit and only ask for one event at a time. Thus the SQL
thread sets ignore_log_space_limit flag and the IO thread resets it to
false everytime it processes one more event. In addition, everytime
the SQL thread processes the next event, and the limit has been
reached, it checks if the IO thread should rotate. If it should, it
instructs the IO thread to rotate, giving the SQL thread a chance to
purge the logs (freeing space). Finally, this patch removes the
resetting of the ignore_log_space_limit flag from purge_first_log,
because this is now reset by the IO thread every time it processes the
next event when the limit has been reached.
If the SQL thread is in a transaction, it cannot purge so, there is no
point in asking the IO thread to rotate. The only thing it can do is
to ask for more events until the transaction is over (then it can ask
the IO to rotate and purge the log right away). Otherwise, there would
be a deadlock (SQL would not be able to purge and IO thread would not
be able to queue events so that the SQL would finish the transaction).
Fix - Changed the implementation of the condition check from the result file
to using an assert.
mysql-test/suite/rpl/r/rpl_report_port.result:
Updated the result file.
mysql-test/suite/rpl/t/rpl_report_port.test:
Changed from the a condtional check to an assert.
Problem - The default port number shown in SHOW SLAVE HOSTS is always 3306
though the slave is actually listening on a different port number.
This is a problem as the user can not be sure whether this port
value can be trusted and so client trying to read replication
topology can get confused.
Fix - 3306 ceases to be the default value of report-port. Moreover report-port
does not have a static default any longer.
Instead we initialize report-port to 0 as the new default value and change
it based on two checks :
1) If report_port is not set, the slave reports the port number its listening
on. (i.e. if report-port is not set we get the actual value of the slave's
port number).
2) If report-port is set, we show the value report-port is set to, as the slave's
port number.
mysql-test/include/show_slave_hosts.inc:
A .inc file is added to use show slave hosts in the new test added.
mysql-test/r/mysqld--help-notwin.result:
Updated the result file to show the default value passed for report-port.
mysql-test/suite/rpl/r/rpl_report_port.result:
The result file for the new test that is added.
mysql-test/suite/rpl/r/rpl_show_slave_hosts.result:
Updated the result file to show the default value passed for report-port.
mysql-test/suite/rpl/t/rpl_report_port-slave.opt:
Option file for the new test added.
mysql-test/suite/rpl/t/rpl_report_port.test:
Added a test to check the correct functionality of report-port.
We check this by running the replication twice.
In the first run we do not set the value of report-port through the opt file
and get the actual port number of the slave's port.
We then restart the server with report-port set to some value (in this case 9000)
and check the value reported for the slave's port number.
mysql-test/suite/sys_vars/t/report_port_basic.test:
Update the test file to show the value for report-port. It is replaced with
SLAVE_PORT as the actual value of the report-port will change with each run.
sql/mysqld.cc:
Changed the value reported by report port :
1. If the value for report-port is not set we assign report-port to be the
actual port number of the slave (mysqld_port).
2. If report-port is set we get the value set for the report-port.
sql/sys_vars.cc:
Passed 0 as the default value of the report-port.
PROBLEM: After WL 4144, when using MyISAM Merge tables, the routine
open_and_lock_tables will append to the list of tables to lock, the
base tables that make up the MERGE table. This has two side-effects in
replication:
1. On the master side, we log additional table maps for the base
tables, since they appear in the list of locked tables, even
though we don't really use them at the slave.
2. On the slave side, when opening a MERGE table while applying a
ROW event, additional tables are appended to the list of tables
to lock.
Side-effect #1 is not harmful. It's just that when using MyISAM Merge
tables a few table maps more may be logged.
Side-effect #2, is harmful, because the list rli->tables_to_lock is an
extended structure from TABLE_LIST in which the extra fields are
filled from the table maps that are processed. Since
open_and_lock_tables appends tables to the list after all table map
events have been processed we end up with entries without
replication/table map data on them. Thus when trying to access that
info for these extra tables, the server will crash.
SOLUTION: We fix side-effect #2 by making sure that we access the
replication part of the structure for those in the list that were
accounted for when processing the correspondent table map events. All
in all, we never go beyond rli->tables_to_lock_count.
We also deploy an assertion when clearing rli->tables_to_lock, making
sure that the base tables are not in the list anymore (were closed in
close_thread_tables).
Fixed a typo in the comment.
Fixing test cases which were previouslyno throwing due
disable warnings macro.
sql/sql_base.cc:
Change in indentation and fixing a typo in the comment.
Problem: Statements that write to tables with auto_increment columns
based on the selection from another table, may lead to master
and slave going out of sync, as the order in which the rows
are retrieved from the table may differ on master and slave.
Solution: We mark writing to a table with auto_increment table
based on the rows selected from another table as unsafe. This
will cause the execution of such statements to throw a warning
and forces the statement to be logged in ROW if the logging
format is mixed.
Changes:
1. All the statements that writes to a table with auto_increment
column(s) based on the rows fetched from another table, will now
be unsafe.
2. CREATE TABLE with SELECT will now be unsafe.
sql/share/errmsg-utf8.txt:
Added new warning messages.
sql/sql_base.cc:
-Created function to check statements that write to
tables with auto_increment column and has select.
-Marked all the statements that write to a table
with auto_increment column based on rows fetched
from other table(s) as unsafe.
sql/sql_table.cc:
mark CREATE TABLE[with auto_increment column] as unsafe.
Problem: Statements that write to tables with auto_increment columns
based on the selection from another table, may lead to master
and slave going out of sync, as the order in which the rows
are retrived from the table may differ on master and slave.
Solution: We mark writing to a table with auto_increment table
as unsafe. This will cause the execution of such statements to
throw a warning and forces the statement to be logged in ROW if
the logging format is mixed.
Changes:
1. All the statements that writes to a table with auto_increment
column(s) based on the rows fetched from another table, will now
be unsafe.
2. CREATE TABLE with SELECT will now be unsafe.
sql/share/errmsg-utf8.txt:
Added new Warning messages
sql/sql_base.cc:
created a new function that checks for select + write on a autoinc table
made all such statements to be unsafe.
sql/sql_parse.cc:
made create autoincremnet tabble + select unsafe
rpl_heartbeat_basic test fails sporadically on pushbuild because did
not received all heartbeats from slave in circular replication.
MASTER_HEARTBEAT_PERIOD had the default value (slave_net_timeout/2) so
wait on "Heartbeat event received on master", that only waits for 1
minute, sometimes timeout before heartbeat arrives. Fixed setting a
smaller period value.
Problem : The basic problem is the way the thread sleeps in mysql-5.5 and also in mysql-5.1
when we execute a stop slave on windows platform.
On windows platform if the stop slave is executed after the master dies, we have
this long wait before the stop slave return a value. This is because there is a
sleep of the thread. The sleep is uninterruptable in the two above version,
which was fixed by Davi patch for the BUG#11765860 for mysql-trunk. Backporting
his patch for mysql-5.5 fixes the problem.
Solution : A new pair of mutex and condition variable is introduced to synchronize thread
sleep and finalization. A new mutex is required because the slave threads are
terminated while holding the slave thread locks (run_lock), which can not be
relinquished during termination as this would affect the lock order.
mysql-test/suite/rpl/r/rpl_start_stop_slave.result:
The result file associated with the test added.
mysql-test/suite/rpl/t/rpl_start_stop_slave.test:
A test to check the new functionality.
sql/rpl_mi.cc:
The constructor using the new mutex and condition variables for the master_info.
sql/rpl_mi.h:
The condition variable and mutex have been added for the master_info.
sql/rpl_rli.cc:
The constructor using the new mutex and condition variables for the realy_log_info.
sql/rpl_rli.h:
The condition variable and mutex have been added for the relay_log_info.
sql/slave.cc:
Use a timed wait on a condition variable to implement a interruptible sleep.
The wait is registered with the THD object so that the thread will be woken
up if killed.
If an error message contains '\' backslash it is displayed correctly
through show-slave-status or
query_get_value(SHOW SLAVE STATUS, Last_IO_Error, 1);. But when
SELECT REPLACE(...) is applied backslash is escaped resulting in a
different test output.
Disabled backslash escape on show_slave_status.inc and replaced '\' for
'/' using replace_regex function in order to achieve the same test
output when different path separators are used.
A follow-up patch corrects max sizes of printed strings and changes llstr() to %lld.
Credits go to Davi who provided a great feedback.
sql/share/errmsg-utf8.txt:
Max size for the whole message is 512 so a part of - like '%-.512s' should be less,
reduction to 320 is safe and with good chances won't cut off a part of a rather log
message in Last_IO_Error = 'Got fatal error 1236 ...'
sql/sql_repl.cc:
llstr() is replaced by %lld.
The server crashes when receiving a COM_BINLOG_DUMP command with a position of 0 or
larger than the file size.
The execution proceeds to an error block having the last read replication coordinates
pointer be NULL and its dereferencing crashed the server.
Fixed with making "public" previously used only for heartbeat coordinates.
mysql-test/extra/rpl_tests/rpl_start_stop_slave.test:
regression test for bug#3593869-64035 is added.
mysql-test/suite/rpl/r/rpl_cant_read_event_incident.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/r/rpl_log_pos.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/r/rpl_manual_change_index_file.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/r/rpl_packet.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/r/rpl_stm_start_stop_slave.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/t/rpl_stm_start_stop_slave.test:
Slave is stopped by bug#3593869-64035 tests so
-let $rpl_only_running_threads= 1 is set prior to rpl_end.
sql/share/errmsg-utf8.txt:
Increasing the max length of explanatory message to 512.
sql/sql_repl.cc:
Making `coord' to carry the last read from binlog event coordinates
regardless of heartbeat.
Renaming, small cleanup and simplifying the code after if (coord) becomes unnecessary.
Adding yet another 3rd pair of coordinates - the starting replication -
into error text.
Test extra/rpl_tests/rpl_extra_col_master.test (used by
rpl_extra_col_master_*) ends with the active connection pointing to the
slave. Thus, the two last tests never succeed in changing the binlog
format of the master away from 'row'. With correct active connection
(master) tests fail for binlog 'statement' and 'mixed' formats.
Tests rpl_extra_col_master_* only run when binary log format is
row. Statement and mix replication do not make sense in this
tests since it will try to execute statements on columns that do
not exist. This fix is basically a backport from mysql-5.5, see
changes done as part of BUG 39934.
Refactored the test case: hardened and extended it. Created test inc file
to abstract the task of relocating binlogs.
Also, disabled it on windows for not cluttering the test case any further,
as it depends heavily on doing filesystem operations and path handling.
mysql-test/include/relocate_binlogs.inc:
Auxiliar include file that performs the relocation of binary logs
listed in an index file.