post-push fixes for show_slave_io_error= 1 of wait_for_slave_io_error.inc;
Unix and win format path specifically so few tests have to change show_slave_io_error
to zero.
The bug case is similar to one fixed earlier bug_49536.
Deadlock involving LOCK_log appears to be possible because the purge running thread
is holding LOCK_log whereas there is no sense of doing that and which fact was
exploited by the earlier bug fixes.
Fixed with small reengineering of rotate_and_purge(), adding two new methods and
setting up a policy to execute those instead of the former
rotate_and_purge(RP_LOCK_LOG_IS_ALREADY_LOCKED).
The policy for using rotate(), purge() is that if the caller acquires LOCK_log itself,
it should call rotate(), release the mutex and run purge().
Side effect of this patch is refining error message of bug@11747416 to print
the whole path.
Binary log of master can get a partially logged event if the server
runs out of disk space and, while waiting for some space to be freed,
is shut down (or crashes). If the server is not stopped, it will just
wait endlessly for space to be freed, thus no partial event anomaly
occurs. The restarted master server has had a dubious policy to send
the incomplete event to slave which it apparently can't handle.
Although an error was printed out the fact of sending with unclear
error message is a source of confusion.
Actually the problem of presence an incomplete event in the binary log
was already fixed by WL 5493 (which was merged to our current trunk
branch, major version 5.6). The fix makes the server truncate the
binary log on server restart and recovery.
However 5.5 master can't do that. So the current issue is a problem of
sending incomplete events to the slave by 5.5 master.
It is fixed in this patch by changing the policy so that only complete
events are pushed by the dump thread to the IO thread. In addition,
the error text that master sends to the slave when an incomplete event
is found, now states that incomplete event may have been caused by an
out-of-disk space situation and provides coordinates of
the first and the last event bytes read.
Problem: The following statements can cause the slave to go out of sync
if logged in statement format:
INSERT IGNORE...SELECT
INSERT ... SELECT ... ON DUPLICATE KEY UPDATE
REPLACE ... SELECT
UPDATE IGNORE :
CREATE ... IGNORE SELECT
CREATE ... REPLACE SELECT
Background: Since the order of the rows returned by the SELECT
statement or otherwise may differ on master and slave, therefore
the above statements may cuase the salve to go out of sync with
the master.
Fix:
Issue a warning when statements like the above are exectued and
the bin-logging format is statement. If the logging format is mixed,
use row based logging. Marking a statement as unsafe has been
done in the sql/sql_parse.cc instead of sql/sql_yacc.cc, because while
parsing for a token has been done we cannot be sure if the parsing
of the other tokens has been done as well.
Six new warning messages has been added for each unsafe statement.
binlog.binlog_unsafe.test has been updated to incoporate these additional unsafe statments.
******
BUG#11758262 - 50439: MARK INSERT...SEL...ON DUP KEY UPD,REPLACE...SEL,CREATE...[IGN|REPL] SEL
Problem: The following statements can cause the slave to go out of sync
if logged in statement format:
INSERT IGNORE...SELECT
INSERT ... SELECT ... ON DUPLICATE KEY UPDATE
REPLACE ... SELECT
UPDATE IGNORE :
CREATE ... IGNORE SELECT
CREATE ... REPLACE SELECT
Background: Since the order of the rows returned by the SELECT
statement or otherwise may differ on master and slave, therefore
the above statements may cuase the salve to go out of sync with
the master.
Fix:
Issue a warning when statements like the above are exectued and
the bin-logging format is statement. If the logging format is mixed,
use row based logging. Marking a statement as unsafe has been
done in the sql/sql_parse.cc instead of sql/sql_yacc.cc, because while
parsing for a token has been done we cannot be sure if the parsing
of the other tokens has been done as well.
Six new warning messages has been added for each unsafe statement.
binlog.binlog_unsafe.test has been updated to incoporate these additional unsafe statments.
Before BUG#28796, an empty host was used to identify that an instance was no
longer a slave. However, BUG#28796 changed this behavior and one cannot set
an empty host. Besides, a RESET SLAVE only cleans up information on the next
event to retrieve from the master, disables ssl and resets heartbeat period.
So a call to SHOW SLAVE STATUS after issuing a RESET SLAVE still returns some
valid information, such as host, port, user and password.
To fix this problem, we have introduced the command RESET SLAVE ALL that does
what a regular RESET SLAVE does and also clears host, port, user and password
information thus allowing users to identify when an instance is no longer a
slave.
The server crashes if it processes table map events that are
corrupted, especially if they map different tables to the same
identifier. This could happen, for instance, due to BUG 56226.
We fix this by checking whether the table map has already been
mapped before actually applying the event. If it has been mapped
with different settings an error is raised and the slave SQL
thread stops. If it has been mapped with same settings the event
is skipped. If the table is set to be ignored by the filtering
rules, there is no change in behavior: the event is skipped and
ids are not checked.
VM-WIN2003-32-A, SLES10-IA64-A
The test case waits for master_pos_wait not to timeout, which
means that the deadlock between SQL and IO threads was
succesfully and automatically dealt with.
However, very rarely, master_pos_wait reports a timeout. This
happens because the time set for master_pos_wait to wait was
too small (6 seconds). On slow test env this could be a
problem.
We fix this by setting the timeout inline with the one used
in sync_slave_with_master (300 seconds). In addition we
refactored the test case and refined some comments.
Manual merged mysql-5.1-gca into latest mysql-5.5.
Conflicts
=========
Text conflict in mysql-test/suite/rpl/r/rpl_relayspace.result
Text conflict in mysql-test/suite/rpl/t/rpl_relayspace.test
If LOAD DATA INFILE featured a SET clause, the name=value pairs
would be regenerated using item::print. Unfortunately, that code
is mostly optimized for EXPLAIN EXTENDED output and such, and can
not be relied on to return valid SQL.
We now name each value its original, user-supplied form and use
that to create LOAD DATA INFILE statements for statement-based
replication.
- add support for choosing the engine of test
table(t1) with $engine_type
- add primary key to the test table(t1) to support
replication of BLOB/TEXT (also with ENGINE=ndb)
- change the suppression since the warning printed to error log
now says "Column 1"
IN WAIT_SHOW_CONDITION)
There was a typo in the name of one of the parameters to the
include file wait_show_condition. The parameter name was being
set to "connection" instead of "condition".
We fix this typo, improve one instruction in the test case and
deploy parameter checks inside wait_show_condition.inc.
Manual merge from mysql-5.1 into mysql-5.5.
Conflicts
=========
Text conflict in mysql-test/suite/rpl/t/rpl_row_until.test
Text conflict in sql/handler.h
Text conflict in storage/archive/ha_archive.cc
ARE NOT BEING HONORED
max_allowed_packet works in conjunction with net_buffer_length.
max_allowed_packet is an upper bound of net_buffer_length.
So it doesn't make sense to set the upper limit lower than the value.
Added a warning (using ER_UNKNOWN_ERRROR and a specific message)
when this is done (in the log at startup and when setting either
max_allowed_packet or the net_buffer_length variables)
Added a test case.
Fixed several tests that broke the above rule.
The slave was not able to find the correct row in the innodb
table, because the row fetched from the innodb table would not
match the before image. This happened because the (don't care)
bytes in the NULLed fields would change once the row was stored
in the storage engine (from zero to the default value). This
would make bulk memory comparison (using memcmp) to fail.
We fix this by taking a preventing measure and avoiding memcmp
for tables that contain nullable fields. Therefore, we protect
the slave search routine from engines that return arbitrary
values for don't care bytes (in the nulled fields). Instead, the
slave thread will only check null_bits and those fields that are
not set to NULL when comparing the before image against the
storage engine row.
configuration wizard to fail
Made the fields mysql.user.plugin and mysql.user.authentication_string
nullable to conform with some older clients doing inserts instead of
using the commands.
Currently, rpl_semi_sync is failing in PB due to the warning message:
"Slave SQL: slave SQL thread is being stopped in the middle of "
"applying of a group having updated a non-transaction table; "
"waiting for the group completion ..."
The problem started happening after the fix for BUG#11762407 what was
automatically suppressing some warning messages.
To fix the current issue, we suppress the aforementioned warning message
and exploit the opportunity to make the sentence clearer.
There is a race between two threads: user thread and the dump
thread. The former sets a debug instruction that makes the latter wait
before processing an Xid event. There can be cases that the dump
thread has not yet processed the previous Xid event, causing it to
wait one Xid event too soon, thus causing sync_slave_with_master never
to resume.
We fix this by moving the instructions that set the debug variable
after calling sync_slave_with_master.
Problem: the test failed because errors were found in the error log.
The test case contains suppressions for an old version of the error message,
but the format of the error message has changed without updating the suppression.
Fix: Update the suppression. Also small fixes to improve the test.