The replication slave sets first error 1913 and immediately after error
1595. Thus it is possible, but unlikely, to get 1913. The original test
seems to realise this, but uses an invalid error code - my guess is
that this was a temporary code used in a feature tree, which was then
forgotten to be fixed when merged to main. The removed "1923" is
something committed by mistake during tests.
Fixed a typo in the comment.
Fixing test cases which were previouslyno throwing due
disable warnings macro.
sql/sql_base.cc:
Change in indentation and fixing a typo in the comment.
Problem: Statements that write to tables with auto_increment columns
based on the selection from another table, may lead to master
and slave going out of sync, as the order in which the rows
are retrieved from the table may differ on master and slave.
Solution: We mark writing to a table with auto_increment table
based on the rows selected from another table as unsafe. This
will cause the execution of such statements to throw a warning
and forces the statement to be logged in ROW if the logging
format is mixed.
Changes:
1. All the statements that writes to a table with auto_increment
column(s) based on the rows fetched from another table, will now
be unsafe.
2. CREATE TABLE with SELECT will now be unsafe.
sql/share/errmsg-utf8.txt:
Added new warning messages.
sql/sql_base.cc:
-Created function to check statements that write to
tables with auto_increment column and has select.
-Marked all the statements that write to a table
with auto_increment column based on rows fetched
from other table(s) as unsafe.
sql/sql_table.cc:
mark CREATE TABLE[with auto_increment column] as unsafe.
- mysql-test-run.pl --valgrind complains when all tests succeed.
- perfschema.all_instances fail on non-linux, where ENABLE_TEMP_POOL
is not set and therefore BITMAP mutex is not used.
- MDEV-132: main.mysqldump fails because it depends on exact size of stdio
buffers.
- MDEV-99: rpl.rpl_cant_read_event_incident fails due to a race where the
slave manages to connect while the test case is in the middle of setting up
the master, causing the slave to replicate extra/wrong events.
- MDEV-133: rpl.rpl_rotate_purge_deadlock fails because it issues a
DEBUG_SYNC SIGNAL immediately followed by RESET; this means that sometimes
the intended receipient has no time to see the signal before it is cleared
by the RESET, causing wait to timeout.
Problem: Statements that write to tables with auto_increment columns
based on the selection from another table, may lead to master
and slave going out of sync, as the order in which the rows
are retrived from the table may differ on master and slave.
Solution: We mark writing to a table with auto_increment table
as unsafe. This will cause the execution of such statements to
throw a warning and forces the statement to be logged in ROW if
the logging format is mixed.
Changes:
1. All the statements that writes to a table with auto_increment
column(s) based on the rows fetched from another table, will now
be unsafe.
2. CREATE TABLE with SELECT will now be unsafe.
sql/share/errmsg-utf8.txt:
Added new Warning messages
sql/sql_base.cc:
created a new function that checks for select + write on a autoinc table
made all such statements to be unsafe.
sql/sql_parse.cc:
made create autoincremnet tabble + select unsafe
storage/innobase/handler/handler0alter.cc:
for NEWDATE key_type says unsigned, thus col->prtype says unsigned,
but field->flags says signed. Use the same flag for value retrieval
that was used for value storage.
rpl_heartbeat_basic test fails sporadically on pushbuild because did
not received all heartbeats from slave in circular replication.
MASTER_HEARTBEAT_PERIOD had the default value (slave_net_timeout/2) so
wait on "Heartbeat event received on master", that only waits for 1
minute, sometimes timeout before heartbeat arrives. Fixed setting a
smaller period value.
Don't log updates to performance schema in replication log.
Ensure that we don't call ha_update after ha_index_or_rnd_end() is called on slave.
.bzrignore:
Ignore some generated files
mysql-test/include/show_slave_status.inc:
Ensure that ./ is removed from file names
mysql-test/suite/perfschema/r/binlog_mix.result:
Updated results
mysql-test/suite/perfschema/r/binlog_row.result:
Updated results
mysql-test/suite/perfschema/r/binlog_stmt.result:
Updated results
mysql-test/suite/rpl/r/rpl_cant_read_event_incident.result:
Updated results
mysql-test/suite/rpl/r/rpl_performance_schema.result:
Ensure that we don't crash slave when we update performance schema
mysql-test/suite/rpl/t/rpl_performance_schema.test:
Ensure that we don't crash slave when we update performance schema
sql/log_event.cc:
Ensure that we don't call ha_update after ha_index_or_rnd_end() is called.
Remove old code that is not needed anymore (like restarting read loop over all rows if no matcing row is found)
Simplify code
sql/log_event_old.cc:
Ensure that we don't call ha_update after ha_index_or_rnd_end() is called.
storage/myisam/ha_myisam.cc:
More DBUG_PRINT
storage/perfschema/ha_perfschema.h:
Don't log updates to performance schema in replication log.
Problem : The basic problem is the way the thread sleeps in mysql-5.5 and also in mysql-5.1
when we execute a stop slave on windows platform.
On windows platform if the stop slave is executed after the master dies, we have
this long wait before the stop slave return a value. This is because there is a
sleep of the thread. The sleep is uninterruptable in the two above version,
which was fixed by Davi patch for the BUG#11765860 for mysql-trunk. Backporting
his patch for mysql-5.5 fixes the problem.
Solution : A new pair of mutex and condition variable is introduced to synchronize thread
sleep and finalization. A new mutex is required because the slave threads are
terminated while holding the slave thread locks (run_lock), which can not be
relinquished during termination as this would affect the lock order.
mysql-test/suite/rpl/r/rpl_start_stop_slave.result:
The result file associated with the test added.
mysql-test/suite/rpl/t/rpl_start_stop_slave.test:
A test to check the new functionality.
sql/rpl_mi.cc:
The constructor using the new mutex and condition variables for the master_info.
sql/rpl_mi.h:
The condition variable and mutex have been added for the master_info.
sql/rpl_rli.cc:
The constructor using the new mutex and condition variables for the realy_log_info.
sql/rpl_rli.h:
The condition variable and mutex have been added for the relay_log_info.
sql/slave.cc:
Use a timed wait on a condition variable to implement a interruptible sleep.
The wait is registered with the THD object so that the thread will be woken
up if killed.
nes prefixed with .\ or ./
- Add my_basename() to mysys.
- Do not compile files that are not needed on Windows (my_addr_resolve, an
d safemalloc related stuff it it is not used)
Avoids linker warnings about compilation of essentially empty files.
If an error message contains '\' backslash it is displayed correctly
through show-slave-status or
query_get_value(SHOW SLAVE STATUS, Last_IO_Error, 1);. But when
SELECT REPLACE(...) is applied backslash is escaped resulting in a
different test output.
Disabled backslash escape on show_slave_status.inc and replaced '\' for
'/' using replace_regex function in order to achieve the same test
output when different path separators are used.
A follow-up patch corrects max sizes of printed strings and changes llstr() to %lld.
Credits go to Davi who provided a great feedback.
sql/share/errmsg-utf8.txt:
Max size for the whole message is 512 so a part of - like '%-.512s' should be less,
reduction to 320 is safe and with good chances won't cut off a part of a rather log
message in Last_IO_Error = 'Got fatal error 1236 ...'
sql/sql_repl.cc:
llstr() is replaced by %lld.
The server crashes when receiving a COM_BINLOG_DUMP command with a position of 0 or
larger than the file size.
The execution proceeds to an error block having the last read replication coordinates
pointer be NULL and its dereferencing crashed the server.
Fixed with making "public" previously used only for heartbeat coordinates.
mysql-test/extra/rpl_tests/rpl_start_stop_slave.test:
regression test for bug#3593869-64035 is added.
mysql-test/suite/rpl/r/rpl_cant_read_event_incident.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/r/rpl_log_pos.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/r/rpl_manual_change_index_file.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/r/rpl_packet.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/r/rpl_stm_start_stop_slave.result:
results updated (error mess format is changed).
mysql-test/suite/rpl/t/rpl_stm_start_stop_slave.test:
Slave is stopped by bug#3593869-64035 tests so
-let $rpl_only_running_threads= 1 is set prior to rpl_end.
sql/share/errmsg-utf8.txt:
Increasing the max length of explanatory message to 512.
sql/sql_repl.cc:
Making `coord' to carry the last read from binlog event coordinates
regardless of heartbeat.
Renaming, small cleanup and simplifying the code after if (coord) becomes unnecessary.
Adding yet another 3rd pair of coordinates - the starting replication -
into error text.
Test extra/rpl_tests/rpl_extra_col_master.test (used by
rpl_extra_col_master_*) ends with the active connection pointing to the
slave. Thus, the two last tests never succeed in changing the binlog
format of the master away from 'row'. With correct active connection
(master) tests fail for binlog 'statement' and 'mixed' formats.
Tests rpl_extra_col_master_* only run when binary log format is
row. Statement and mix replication do not make sense in this
tests since it will try to execute statements on columns that do
not exist. This fix is basically a backport from mysql-5.5, see
changes done as part of BUG 39934.
- ensure that mtr supressions table is flushed before doing controlled crash and restart
- use DBUG_SUICIDE() rather than abort() in partition tests - avoids a crash message/warning
- disable perfschema all_instances test on Windows- there are legitimate reasons for output to be different on Unix (some different threads, some different locks), the differences are expected to grow in the future, e.g with threadpool.
There was memory leak when running some tests on PB2.
The reason of the failure is an early return from change_master()
that was supposed to deallocate a dyn-array.
Actually the same bug58915 was fixed in trunk with relocating the dyn-array
destruction into THD::cleanup_after_query() which can't be bypassed.
The current patch backports magne.mahre@oracle.com-20110203101306-q8auashb3d7icxho
and adds two optimizations: were done: the static buffer for the dyn-array to base on,
and the array initialization is called precisely when it's necessary rather than
per each CHANGE-MASTER as before.
mysql-test/suite/rpl/t/rpl_empty_master_host.test:
the test is binlog-format insensitive so it will be run with MIXED mode only.
mysql-test/suite/rpl/t/rpl_server_id_ignore.test:
the test is binlog-format insensitive so it will be run with MIXED mode only.
sql/sql_class.cc:
relocating the dyn-array
destruction into THD::cleanup_after_query().
sql/sql_lex.cc:
LEX.mi zero initialization is done in LEX().
sql/sql_lex.h:
Optimization for repl_ignore_server_ids to base on a static buffer
which size is chosen to fit to most common use cases.
sql/sql_repl.cc:
dyn-array destruction is relocated to THD::cleanup_after_query().
sql/sql_yacc.yy:
Refining logics of Lex->mi.repl_ignore_server_ids initialization.
The array is initialized once a corresponding option in CHANGE MASTER token sequence
is found.
Fix typo causing too low timeout value for wait_for_slave_param.inc.
Fix binlog checksums following 5.5 merge.
Make sure the rpl suite can run with --mysqld=--binlog-checksum=CRC32
Fix a number of problems in the code when checksums are enabled.
* rename all debugging related command-line options
and variables to start from "debug-", and made them all
OFF by default.
* replace "MySQL" with "MariaDB" in error messages
* "Cast ... converted ... integer to it's ... complement"
is now a note, not a warning
* @@query_cache_strip_comments now has a session scope,
not global.
Refactored the test case: hardened and extended it. Created test inc file
to abstract the task of relocating binlogs.
Also, disabled it on windows for not cluttering the test case any further,
as it depends heavily on doing filesystem operations and path handling.
mysql-test/include/relocate_binlogs.inc:
Auxiliar include file that performs the relocation of binary logs
listed in an index file.
There was memory leak when running some tests on PB2.
The reason of the failure is an early return from change_master()
that was supposed to deallocate a dyn-array.
Fixed with relocating the dyn-array's destructor at ~LEX() that is
the end of the session, per Gleb's patch idea.
Two optimizations were done: the static buffer for the dyn-array to base on,
and the array initialization is called precisely when it's necessary rather than
per each CHANGE-MASTER as before.
mysql-test/suite/rpl/t/rpl_empty_master_host.test:
the test is binlog-format insensitive so it will be run with MIXED mode only.
sql/sql_lex.cc:
the new flag is initialized.
sql/sql_lex.h:
A new bool flag new member to LEX.mi is added to stay UP since after
LEX.mi.repl_ignore_server_ids dynarray initialization was called
for the first time on the session. So it is set once and its life time
is session.
The array is destroyed at the end of the session.
sql/sql_repl.cc:
dyn-array destruction is relocated to ~LEX.
sql/sql_yacc.yy:
Refining logics of Lex->mi.repl_ignore_server_ids initialization.
The array is initialized once a corresponding option in CHANGE MASTER token sequence
is found.
The fact of initialization is memorized into the new flag.
BIN LOG HAS BEEN MOVED
When moving the binary/relay log files from one location to
another and restarting the server with a different log-bin or
relay-log paths, would cause the startup process to abort. The
root cause was that the server would not be able to find the log
files because it would consider old paths for entries in the
index file instead of the new location. What's even worse, the
relative paths would not be considered relative to the path
provided in log-bin and relay-log, but to mysql_data_dir.
We fix the cases where the server contains relative paths. When
the server is reading from the index file, it checks whether the
entry contains relative paths. If it does, we replace it with the
absolute path set in log-bin/relay-log option. Absolute paths
remain unchanged and the index must be manually edited to
consider the new log-bin and/or relay-log path (this should be
documented). This is a fix for a GA version, that does not break
behavior (that much).
For development versions, we should go with Zhenxing's approach
that removes paths altogether from index files.
mysql-test/include/begin_include_file.inc:
Added parameter to keep the begin_include_file.inc silent. Useful when
including scripts that contain platform dependent parameters, for example:
--let $rpl_server_parameters=--log-bin=$tmpdir/slave-bin --relay-log=$tmpdir/slave-relay-bin
--let $keep_include_silent=1
source include/rpl_start_server.inc;
--let $keep_include_silent=0
We want the paths ($tmpdir/slave-bin and $tmpdir/slave-relay-bin) not to be in the
result file.
mysql-test/suite/rpl/t/rpl_binlog_index.test:
Test case.
sql/log.cc:
When finding the corresponding log entry in the index file, we first
normalize the paths before doing the comparison. This will make relative
paths to be turned into absolute paths (based on the opt_bin_logname or
opt_relay_logname) and then compared against also, expanded paths entered,
through CHANGE MASTER for instance.
sql/log.h:
Added normalize_binlog_name, which turns relative paths, into absolute paths
given the parameter: is_relay_log ? opt_relay_logname : opt_bin_logname .
sql/mysqld.cc:
Exposing opt_bin_logname.
sql/mysqld.h:
Exposing opt_bin_logname.
When passing an empty user to the connect function will cause
valgrind warnings. Seems that the client code is not prepared
to handle empty users. On 5.6 this can even be triggered by
START SLAVE PASSWORD='...'; i.e., without setting USER='...' on
the START SLAVE command (see WL#4143 for details on the new
additional START SLAVE commands).
To fix this, we disallow empty users when configuring the slave
connection parameters (this decision might be revisited if the
client code accepts empty users in the future).
sql/slave.cc:
We throw an error if an empty user is supplied to the connection
function.
SCAN/CPU) => SLAVE FAILURE
When a statement containing a large amount of ROWs to be applied on
the slave, and the slave's table does not contain a PK, it can take a
considerable amount of time to find and change all the rows that are
to be changed.
The proper slave enhancement will be implemented in WL 5597. However,
in this bug we are making it clear to the user what the problem is, by
printing a message to the error log if the execution time, for a given
statement in RBR, takes more than LONG_FIND_ROW_THRESHOLD (set to 60
seconds). This shall help the DBA to diagnose what's happening when
facing a slave server that is quiet for no apparent reason...
The note is only printed to the error log if log_warnings is set to be
greater than 1.
sql/log_event.cc:
Core of the patch.
In Rows_log_event::do_apply_event, sets STMT start
timestamp.
In Rows_log_event::find_row, if a PK is not used, then the start
timestamp is checked to see if the time spent on this STMT is enough
to justify the printing of a note (only if it was not printed before).
sql/log_event.h:
Defining LONG_FIND_ROW_THRESHOLD.
sql/rpl_rli.cc:
Resets long_find_row state in rli->context_cleanup().
sql/rpl_rli.h:
Two new rli properties that are necessary to control when to
emit a note in the error log: 1) timestamp that states when the
ROW statement started; 2) flag indicating whether the note has
been emitted for the current statement or not. Also deployed
accessors.
post-push fixes for show_slave_io_error= 1 of wait_for_slave_io_error.inc;
Unix and win format path specifically so few tests have to change show_slave_io_error
to zero.
The bug case is similar to one fixed earlier bug_49536.
Deadlock involving LOCK_log appears to be possible because the purge running thread
is holding LOCK_log whereas there is no sense of doing that and which fact was
exploited by the earlier bug fixes.
Fixed with small reengineering of rotate_and_purge(), adding two new methods and
setting up a policy to execute those instead of the former
rotate_and_purge(RP_LOCK_LOG_IS_ALREADY_LOCKED).
The policy for using rotate(), purge() is that if the caller acquires LOCK_log itself,
it should call rotate(), release the mutex and run purge().
Side effect of this patch is refining error message of bug@11747416 to print
the whole path.
mysql-test/suite/rpl/r/rpl_cant_read_event_incident.result:
the file name printing is changed to a relative path instead of just the file name.
mysql-test/suite/rpl/r/rpl_log_pos.result:
the file name printing is changed to a relative path instead of just the file name.
mysql-test/suite/rpl/r/rpl_manual_change_index_file.result:
the file name printing is changed to a relative path instead of just the file name.
mysql-test/suite/rpl/r/rpl_packet.result:
the file name printing is changed to a relative path instead of just the file name.
mysql-test/suite/rpl/r/rpl_rotate_purge_deadlock.result:
new result file is added.
mysql-test/suite/rpl/t/rpl_cant_read_event_incident.test:
The test of that bug can't satisfy windows and unix backslash interpretation so windows
execution is chosen to bypass.
mysql-test/suite/rpl/t/rpl_rotate_purge_deadlock-master.opt:
new opt file is added.
mysql-test/suite/rpl/t/rpl_rotate_purge_deadlock.test:
regression test is added as well as verification of a
possible side effect of the fixes is tried.
sql/log.cc:
LOCK_log is never taken during execution of log purging routine.
The former MYSQL_BIN_LOG::rotate_and_purge is made to necessarily
acquiring and releasing LOCK_log.
If caller takes the mutex itself it has to use a new rotate(), purge()
methods combination and to never let purge() be run with LOCK_log grabbed.
split apart to allow
the caller to chose either it
Simulation of concurrently rotating/purging threads is added.
sql/log.h:
new rotate(), purge() methods are added to be used instead of
the former rotate_and_purge(RP_LOCK_LOG_IS_ALREADY_LOCKED).
rotate_and_purge() signature is changed. Caller should not call rotate_and_purge()
but rather {rotate(), purge()} if LOCK_log is acquired by it.
sql/rpl_injector.cc:
changes to reflect the new rotate_and_purge() signature.
sql/sql_class.h:
unnecessary constants are removed.
sql/sql_parse.cc:
changes to reflect the new rotate_and_purge() signature.
sql/sql_reload.cc:
changes to reflect the new rotate_and_purge() signature.
sql/sql_repl.cc:
followup for bug@11747416: the file name printing is changed to a relative
path instead of just the file name.
sql/sql_insert.cc:
CREATE ... IF NOT EXISTS may do nothing, but
it is still not a failure. don't forget to my_ok it.
******
CREATE ... IF NOT EXISTS may do nothing, but
it is still not a failure. don't forget to my_ok it.
sql/sql_table.cc:
small cleanup
******
small cleanup
Binary log of master can get a partially logged event if the server
runs out of disk space and, while waiting for some space to be freed,
is shut down (or crashes). If the server is not stopped, it will just
wait endlessly for space to be freed, thus no partial event anomaly
occurs. The restarted master server has had a dubious policy to send
the incomplete event to slave which it apparently can't handle.
Although an error was printed out the fact of sending with unclear
error message is a source of confusion.
Actually the problem of presence an incomplete event in the binary log
was already fixed by WL 5493 (which was merged to our current trunk
branch, major version 5.6). The fix makes the server truncate the
binary log on server restart and recovery.
However 5.5 master can't do that. So the current issue is a problem of
sending incomplete events to the slave by 5.5 master.
It is fixed in this patch by changing the policy so that only complete
events are pushed by the dump thread to the IO thread. In addition,
the error text that master sends to the slave when an incomplete event
is found, now states that incomplete event may have been caused by an
out-of-disk space situation and provides coordinates of
the first and the last event bytes read.
mysql-test/std_data/bug11747416_32228_binlog.000001:
a binlog is added with the last event written partly.
mysql-test/suite/rpl/r/rpl_cant_read_event_incident.result:
new result file is added.
mysql-test/suite/rpl/r/rpl_log_pos.result:
results updated.
mysql-test/suite/rpl/r/rpl_manual_change_index_file.result:
results updated.
mysql-test/suite/rpl/r/rpl_packet.result:
results updated.
mysql-test/suite/rpl/t/rpl_cant_read_event_incident.test:
regression test for bug#11747416 : 32228 A disk full makes binary log corrupt
is added.
sql/share/errmsg-utf8.txt:
Increasing the explanatory part of ER_MASTER_FATAL_ERROR_READING_BINLOG error message twice
in order to fit to the updated version which carries some more info.
sql/sql_repl.cc:
Error text indicating a failure of reading from binlog that master delivers to the slave
is made more clear;
A policy to regard a partial event to send it out to the slave anyway is removed.
on Windows, kill <connection> can and almost always will return client error 2013 ("Lost connection...") on the killed connection,
On the server side, if connection is "sleeping" KILL will close the socket, thus socket error on client is expected.