ARE NOT BEING HONORED
max_allowed_packet works in conjunction with net_buffer_length.
max_allowed_packet is an upper bound of net_buffer_length.
So it doesn't make sense to set the upper limit lower than the value.
Added a warning (using ER_UNKNOWN_ERRROR and a specific message)
when this is done (in the log at startup and when setting either
max_allowed_packet or the net_buffer_length variables)
Added a test case.
Fixed several tests that broke the above rule.
The slave was not able to find the correct row in the innodb
table, because the row fetched from the innodb table would not
match the before image. This happened because the (don't care)
bytes in the NULLed fields would change once the row was stored
in the storage engine (from zero to the default value). This
would make bulk memory comparison (using memcmp) to fail.
We fix this by taking a preventing measure and avoiding memcmp
for tables that contain nullable fields. Therefore, we protect
the slave search routine from engines that return arbitrary
values for don't care bytes (in the nulled fields). Instead, the
slave thread will only check null_bits and those fields that are
not set to NULL when comparing the before image against the
storage engine row.
configuration wizard to fail
Made the fields mysql.user.plugin and mysql.user.authentication_string
nullable to conform with some older clients doing inserts instead of
using the commands.
Currently, rpl_semi_sync is failing in PB due to the warning message:
"Slave SQL: slave SQL thread is being stopped in the middle of "
"applying of a group having updated a non-transaction table; "
"waiting for the group completion ..."
The problem started happening after the fix for BUG#11762407 what was
automatically suppressing some warning messages.
To fix the current issue, we suppress the aforementioned warning message
and exploit the opportunity to make the sentence clearer.
There is a race between two threads: user thread and the dump
thread. The former sets a debug instruction that makes the latter wait
before processing an Xid event. There can be cases that the dump
thread has not yet processed the previous Xid event, causing it to
wait one Xid event too soon, thus causing sync_slave_with_master never
to resume.
We fix this by moving the instructions that set the debug variable
after calling sync_slave_with_master.
Problem: the test failed because errors were found in the error log.
The test case contains suppressions for an old version of the error message,
but the format of the error message has changed without updating the suppression.
Fix: Update the suppression. Also small fixes to improve the test.
With --mem if fails with
+UNEXPECTED ERROR NUMBER: 1290
In var/log/mysqld.2.err we have:
[ERROR] LOAD DATA INFILE in the slave SQL Thread can only read from --slave-load-tmpdir. Please, report a bug.
[ERROR] Slave SQL: Error 'The MySQL server is running with the --slave-load-tmpdir option so it cannot execute this statement' on query. Default database: 'test'. Query: 'LOAD DATA INFILE '../../tmp/SQL_LOAD-2-1-1.data' INTO TABLE `t1` FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\' LINES TERMINATED BY '\n' (`a`, `b`)', Error_code: 1290
getcwd() in the server yields something like: /dev/shm/var_auto_iv5Q/mysqld.2/data
There is one part of the test case that needs to break
and re-establish the circular topology. For this the test
stops the slave threads on a couple of servers and restarts
them with START SLAVE. However, no check is done on the
status of the IO or SQL threads before proceeding with
the subsequent commands.
Because rpl_only_running_threads is set to 1 this can lead
to silently not syncing all slave threads as expected,
ultimately resulting in unexpected results (and consequently
on a failing test run).
We fix this by replacing the START SLAVE instructions with
calls to --source include/start_slave.inc, which will wait
for the slave threads to be running (show 'Yes' in
Slave_IO|SQL_Running fields of SHOW SLAVE STATUS) before
proceeding. Additionally, we change rpl_sync.inc to make the
IO thread report that it is running when its running status
is any other than 'No'.
In SBR, if a statement does not fail, it is always written to the binary
log, regardless if rows are changed or not. If there is a failure, a
statement is only written to the binary log if a non-transactional (.e.g.
MyIsam) engine is updated.
INSERT ON DUPLICATE KEY UPDATE and INSERT IGNORE were not following the
rule above and were not written to the binary log, if then engine was
Innodb.
There are two calls to read_log_event() on master in mysql_binlog_send().
Each call reads 19 bytes in this test case and the error of the second
read_log_event() is reported to the slave.
The second read_log_event() starts from position 94 (75 + 19) to 113
(75 + 19 + 19). Usually, there are two events in the binary log:
. 0 - 3 - Header
. 4 - 105 - Format Descriptor Event
. 106 - 304 - Query Event
and both reads fail because operations are reading from invalid positions
as expected.
However, mysql_binlog_send() does not use the same IO_CACHE that is used to
write into binary log (i.e. mysql_bin_log.log_file) for the hot binary log.
It opens the binary log file directly by calling open_binlog() and creates a
separated IO_CACHE. So there is a possibly that after a master has flushed
the binary log file, the content has been cached by the filesystem, and has
not updated the disk file. If this happens, then a slave will only see part
of the file, and thus the second read_log_event() will report event truncated
error.
To fix the problem, if the first read_log_event() has failed, we ensure that
the second one will try to read from the same position.
rpl_packet got a timeout failure sporadically on PB when stopping
slave. The real reason of this bug is that STOP SLAVE stopped
IO thread first and then stopped SQL thread. It was
possible that IO thread stopped after replicating part of a
transaction which SQL thread was executing. SQL thread would
be hung if the transaction could not be rolled back safely.
After this patch, STOP SLAVE will stop SQL thread first and then stop IO
thread, which guarantees that IO thread will fetch the reset of the
events of the transaction that SQL thread is executing, so that SQL
thread can finish the transaction if it cannot be rolled back safely.
Added below auxiliary files to make the test code neater.
restart_slave_sql.inc
rpl_connection_master.inc
rpl_connection_slave.inc
rpl_connection_slave1.inc
On windows, an #endif in a wrong place was causing an early
return from mysql_load and thus the LOAD DATA LOCAL was not
executed. This problem was fixed by moving the #endif to the
right place.
The following code was missing
if ((stat_info.st_mode & S_IFIFO) == S_IFIFO)
is_fifo = 1;
which is required to properly configure and read from the
IO_CACHE when a named pipe is used. So it was re-introduced
before the #endif.
The test started failing on the same day patch for BUG 49978 was
pushed. BUG 49978 changed part of the replication testing
infrastructure in mysql-test-run. This caused the test to fail
sporadically with result differences on relay log file
names. When the test fails the relay-log filenames are shifted by
one, eg:
-show relaylog events in 'slave-relay-bin.000002' from <binlog_start>;
+show relaylog events in 'slave-relay-bin.000003' from <binlog_start>;
The problem was caused by a bad cleanup when using the include
files:
- include/setup_fake_relay_log.inc
- include/cleanup_fake_relay_log.inc
Which would leave a spurious relay-log file around (not listed in
slave-relay-bin.index), causing the server to shift the name of
the relay logs by one, even if cleaning up with RESET SLAVE.
We fix this by removing the relay-log file when it is not needed
anymore, ie at setup time and after recreating the fake relay-log
index.
Additionally, to make the affected test more resilient, we
deployed a call to rpl_reset.inc (which resets both master and
slave, including log files) before actually running the test
case.
Finally, appart from the reported bug, we also fix: (a) an
unrelated issue with the failing test itself - in some cases, the
test was not setting the log file name to use when it should;
(b) one typo.
Problem: master executed a statement that would fail on slave
(namely, DROP USER 'create_rout_db'@'localhost').
Then the test did:
--let $rpl_only_running_threads= 1
--source include/rpl_reset.inc
rpl_reset.inc calls rpl_sync.inc, which first checks which of
the threads are running and then syncs those threads that are
running. If the SQL thread fails after the check, the sync will
fail. So there was a race in the test and it failed on some
slow hosts.
Fix: Don't replicate the failing statement.
Normally, auto_increment value is generated for the column by
inserting either NULL or 0 into it. NO_AUTO_VALUE_ON_ZERO
suppresses this behavior for 0 so that only NULL generates
the auto_increment value. This behavior is also followed by
a slave, specifically by the SQL Thread, when applying events
in the statement format from a master. However, when applying
events in the row format, the flag was ignored thus causing
an assertion failure:
"Assertion failed: next_insert_id == 0, file .\handler.cc"
In fact, we never need to generate a auto_increment value for
the column when applying events in row format on slave. So we
don't allow it to happen by using 'MODE_NO_AUTO_VALUE_ON_ZERO'.
Refactoring: Get rid of all the sql_mode checks to rows_log_event
when applying it for avoiding problems caused by the inconsistency
of the sql_mode on slave and master as the sql_mode is not set for
Rows_log_event.
Normally, auto_increment value is generated for the column by
inserting either NULL or 0 into it. NO_AUTO_VALUE_ON_ZERO
suppresses this behavior for 0 so that only NULL generates
the auto_increment value. This behavior is also followed by
a slave, specifically by the SQL Thread, when applying events
in the statement format from a master. However, when applying
events in the row format, the flag was ignored thus causing
an assertion failure:
"Assertion failed: next_insert_id == 0, file .\handler.cc"
In fact, we never need to generate a auto_increment value for
the column when applying events in row format on slave. So we
don't allow it to happen by using 'MODE_NO_AUTO_VALUE_ON_ZERO'.
Refactoring: Get rid of all the sql_mode checks to rows_log_event
when applying it for avoiding problems caused by the inconsistency
of the sql_mode on slave and master as the sql_mode is not set for
Rows_log_event.