Implement binlog_gtid_pos() function. This will be used so that
the slave can obtain the gtid position automatically from first
connect with old-style position - then MASTER_GTID_POS=AUTO will
work the next time. Can also be used by mysqldump --master-data
to give the current gtid position directly.
When CHANGE MASTER fails, it may or may not have already added
the Master_info * to the index. Implement logic that properly
handles removal and freeing in both cases.
FROM MYSQL_BINLOG_SEND
As part Bug #11747416 A DISK FULL MAKES BINARY LOG CORRUPT,
reading the variable "binlog_can_be_corrupted" was removed
In the existing code the value of this variable is only set,
never read. And also this issue causing compiler warnings.
So the variable is completely redundant and should be removed.
sql/sql_repl.cc:
Removing dead code
FROM MYSQL_BINLOG_SEND
As part Bug #11747416 A DISK FULL MAKES BINARY LOG CORRUPT,
reading the variable "binlog_can_be_corrupted" was removed
In the existing code the value of this variable is only set,
never read. And also this issue causing compiler warnings.
So the variable is completely redundant and should be removed.
Give error for wrong parameters to CHANGE MASTER
Extend MASTER_PASSWORD and MASTER_HOST lengths
mysql-test/suite/rpl/r/rpl_password_boundaries.result:
Test length of MASTER_PASSWORD, MASTER_HOST and MASTER_USER
mysql-test/suite/rpl/r/rpl_semi_sync.result:
Use different password than user name for better test coverage
mysql-test/suite/rpl/t/rpl_password_boundaries.test:
Test length of MASTER_PASSWORD, MASTER_HOST and MASTER_USER
mysql-test/suite/rpl/t/rpl_semi_sync.test:
Use different password than user name for better test coverage
sql/rpl_mi.h:
Extend MASTER_PASSWORD and MASTER_HOST lengths
sql/sql_repl.cc:
Give error for wrong parameters to CHANGE MASTER
sql/sql_repl.h:
Extend MASTER_PASSWORD and MASTER_HOST lengths
=== Problem ===
The test is dependent on binlog positions and checks
to see if the command 'START SLAVE' functions correctly
with the 'UNTIL' clause added to it. The 'UNTIL' clause
is added to specify that the slave should start and run
until the SQL thread reaches a given point in the master
binary log or in the slave relay log.
The test uses hard coded values for MASTER_LOG_POS and
RELAY_LOG_POS, instead of extracting it using
query_get_value() function. There is a test
'rpl.rpl_row_until' which does the similar thing but uses
query_get_value() function to set the values of
MASTER_LOG_POS/ RELAY_LOG_POS. To be precise,
rpl.rpl_row_until is a modified version of
engines/func.rpl_row_until.test.
The use of hard coded values may lead the slave to stop at a position
which may differ from the expected position in the binlog file,
an example being the failure of engines/funcs.rpl_row_until in
mysql-5.1 given as:
"query 'select * from t2' failed. Table 'test.t2' doesn't exist".
In this case, the slave actually ran a couple of extra commands
as a result of which the slave first deleted the table and then
ran a select query on table, leading to the above mentioned failure.
=== Fix ===
1) Fixed the code for failure seen in rpl.rpl_row_until.
This test was also failing although the symptoms of
failure were different.
2) Copied the contents from rpl.rpl_row_until into
into engines/funcs.rpl.rpl_row_until.
3) Updated engines/funcs.rpl_row_until.result accordingly.
mysql-test/suite/engines/funcs/r/rpl_row_until.result:
modified to accomodate the changes in corresponding
test file.
mysql-test/suite/engines/funcs/t/disabled.def:
removed from the list of disabled tests.
mysql-test/suite/engines/funcs/t/rpl_row_until.test:
fixed rpl.rpl_row_until and copied its content to
engines/funcs.rpl_row_until. The reason being both
are same tests but rpl.rpl_row_until is an
updated version.
mysql-test/suite/rpl/t/disabled.def:
removed from the list of disabled tests.
sql/sql_repl.cc:
Added a check to catch an improper combination
of arguements passed to 'START SLAVE UNTIL'. Earlier,
START SLAVE UNTIL MASTER_LOG_FILE='master-bin.000001',
MASTER_LOG_POS=561, RELAY_LOG_POS=12;
passed. It is now detected and an error is reported.
=== Problem ===
The test is dependent on binlog positions and checks
to see if the command 'START SLAVE' functions correctly
with the 'UNTIL' clause added to it. The 'UNTIL' clause
is added to specify that the slave should start and run
until the SQL thread reaches a given point in the master
binary log or in the slave relay log.
The test uses hard coded values for MASTER_LOG_POS and
RELAY_LOG_POS, instead of extracting it using
query_get_value() function. There is a test
'rpl.rpl_row_until' which does the similar thing but uses
query_get_value() function to set the values of
MASTER_LOG_POS/ RELAY_LOG_POS. To be precise,
rpl.rpl_row_until is a modified version of
engines/func.rpl_row_until.test.
The use of hard coded values may lead the slave to stop at a position
which may differ from the expected position in the binlog file,
an example being the failure of engines/funcs.rpl_row_until in
mysql-5.1 given as:
"query 'select * from t2' failed. Table 'test.t2' doesn't exist".
In this case, the slave actually ran a couple of extra commands
as a result of which the slave first deleted the table and then
ran a select query on table, leading to the above mentioned failure.
=== Fix ===
1) Fixed the code for failure seen in rpl.rpl_row_until.
This test was also failing although the symptoms of
failure were different.
2) Copied the contents from rpl.rpl_row_until into
into engines/funcs.rpl.rpl_row_until.
3) Updated engines/funcs.rpl_row_until.result accordingly.
Now slave can connect to master, sending start position as slave state
rather than old-style binlog name/position.
This enables to switch to a new master by changing just connection
information, replication slave GTID state ensures that slave starts
at the correct point in the new master.
MDEV-26: Global transaction id, partial commit
Change server_id to be a session variable.
User with SUPER can set it to binlog with different server_id.
Implement backward-compatible ::server_id mirror for plugins.
This allows us to avoid calculating variables (including those involving mutex) that doesn't match the given
wildcard in SHOW STATUS LIKE '...'
Removed all references to active_mi that could cause problems for multi-source replication.
Added START|STOP ALL SLAVES
Added SHOW ALL SLAVES STATUS
include/mysql/plugin.h:
Added SHOW_SIMPLE_FUNC
include/mysql/plugin_audit.h.pp:
Updated .pp file
include/mysql/plugin_auth.h.pp:
Updated .pp file
include/mysql/plugin_ftparser.h.pp:
Updated .pp file
mysql-test/suite/multi_source/info_logs.result:
New columns in SHOW ALL SLAVES STATUS
mysql-test/suite/multi_source/info_logs.test:
Test new syntax
mysql-test/suite/multi_source/simple.result:
New columns in SHOW ALL SLAVES STATUS
mysql-test/suite/multi_source/simple.test:
test new syntax
mysql-test/suite/multi_source/syntax.result:
Updated result
mysql-test/suite/multi_source/syntax.test:
Test new syntax
mysql-test/suite/rpl/r/rpl_skip_replication.result:
Updated result
plugin/semisync/semisync_master_plugin.cc:
SHOW_FUNC -> SHOW_SIMPLE_FUNC
sql/item_create.cc:
Simplify code
sql/lex.h:
Added SLAVES keyword
sql/log.cc:
Constant -> define
sql/log_event.cc:
Added comment
sql/mysqld.cc:
SHOW_FUNC -> SHOW_SIMPLE_FUNC
Made slave_retried_trans, slave_received_heartbeats and heartbeat_period multi-source safe
Clear variable denied_connections and slave_retried_transactions on startup
sql/mysqld.h:
Added slave_retried_transactions
sql/rpl_mi.cc:
create_signed_file_name -> create_logfile_name_with_suffix
Added start_all_slaves() and stop_all_slaves()
sql/rpl_mi.h:
Added prototypes
sql/rpl_rli.cc:
create_signed_file_name -> create_logfile_name_with_suffix
added executed_entries
sql/rpl_rli.h:
Added executed_entries
sql/share/errmsg-utf8.txt:
More and better error messages
sql/slave.cc:
Added more fields to SHOW ALL SLAVES STATUS
Added slave_retried_transactions
Changed constants -> defines
sql/sql_class.h:
Added comment
sql/sql_insert.cc:
active_mi.rli -> thd->rli_slave
sql/sql_lex.h:
Added SQLCOM_SLAVE_ALL_START & SQLCOM_SLAVE_ALL_STOP
sql/sql_load.cc:
active_mi.rli -> thd->rli_slave
sql/sql_parse.cc:
Added START|STOP ALL SLAVES
sql/sql_prepare.cc:
Added SQLCOM_SLAVE_ALL_START & SQLCOM_SLAVE_ALL_STOP
sql/sql_reload.cc:
Made REFRESH RELAY LOG multi-source safe
sql/sql_repl.cc:
create_signed_file_name -> create_logfile_name_with_suffix
Don't send my_ok() from start_slave() or stop_slave() so that we can call it for all master connections
sql/sql_show.cc:
Compare wild cards early for all variables
sql/sql_yacc.yy:
Added START|STOP ALL SLAVES
Added SHOW ALL SLAVES STATUS
sql/sys_vars.cc:
Made replicate_events_marked_for_skip,slave_net_timeout and rpl_filter multi-source safe.
sql/sys_vars.h:
Simplify Sys_var_rpl_filter
Changed names of multi-source log files so that original suffixes are kept.
include/my_sys.h:
Added fn_ext2(), which returns pointer to last '.' in file name
mysql-test/extra/rpl_tests/rpl_max_relay_size.test:
Updated test
mysql-test/suite/multi_source/info_logs-master.opt:
Test with strange file names
mysql-test/suite/multi_source/info_logs.result:
Updated results
mysql-test/suite/multi_source/info_logs.test:
Changed to test with complex names to be able to verify the filename generator code
mysql-test/suite/multi_source/relaylog_events.result:
Updated results
mysql-test/suite/multi_source/reset_slave.result:
Updated results
mysql-test/suite/multi_source/skip_counter.result:
Updated results
mysql-test/suite/multi_source/skip_counter.test:
Added testing of max_relay_log_size
mysql-test/suite/rpl/r/rpl_row_max_relay_size.result:
Updated results
mysql-test/suite/rpl/r/rpl_stm_max_relay_size.result:
Updated results
mysql-test/suite/sys_vars/r/max_relay_log_size_basic.result:
Updated results
mysql-test/suite/sys_vars/t/max_relay_log_size_basic.test:
Updated results
mysys/mf_fn_ext.c:
Added fn_ext2(), which returns pointer to last '.' in file name
sql/log.cc:
Removed some wrong casts
sql/log.h:
Updated comment to reflect new code
sql/log_event.cc:
Updated DBUG_PRINT
sql/mysqld.cc:
Added that max_relay_log_size copies it's values from max_binlog_size
sql/mysqld.h:
Removed max_relay_log_size
sql/rpl_mi.cc:
Changed names of multi-source log files so that original suffixes are kept.
sql/rpl_mi.h:
Updated prototype
sql/rpl_rli.cc:
Updated comment to reflect new code
Made max_relay_log_size depending on master connection.
sql/rpl_rli.h:
Made max_relay_log_size depending on master connection.
sql/set_var.h:
Made option global so that one can check and change min & max values (sorry Sergei)
sql/sql_class.h:
Made max_relay_log_size depending on master connection.
sql/sql_repl.cc:
Updated calls to create_signed_file_name()
sql/sys_vars.cc:
Made max_relay_log_size depending on master connection.
Made old code more reusable
sql/sys_vars.h:
Changed Sys_var_multi_source_uint to ulong to be able to handle max_relay_log_size
Made old code more reusable
- Added parameter to reset_logs() so that one can specify if new logs should be created.
mysql-test/include/setup_fake_relay_log.inc:
There is no orphan relay log files anymore
mysql-test/mysql-test-run.pl:
Added multi_source to test suite
mysql-test/suite/multi_source/info_logs.result:
New test
mysql-test/suite/multi_source/info_logs.test:
New test
mysql-test/suite/multi_source/my.cnf:
Added log-warnings to get more information to the log files
mysql-test/suite/multi_source/relaylog_events.result:
Added cleanup
mysql-test/suite/multi_source/relaylog_events.test:
Added cleanup
mysql-test/suite/multi_source/reset_slave.result:
Updated results after improved RESET SLAVE
mysql-test/suite/multi_source/simple.result:
Updated results after improved RESET SLAVE
mysql-test/suite/multi_source/simple.test:
Syncronize positions before show full slave status
mysql-test/suite/rpl/r/rpl_row_show_relaylog_events.result:
Updated results after improved RESET SLAVE (we now use less relay log files)
mysql-test/suite/rpl/r/rpl_stm_mix_show_relaylog_events.result:
Updated results after improved RESET SLAVE (we now use less relay log files)
sql/log.cc:
Added parameter to reset_logs() so that one can specify if new logs should be created.
sql/log.h:
Added parameter to reset_logs()
sql/rpl_mi.cc:
Create Master_info_index::index_file_names once at init
More DBUG_PRINT
Give error if Master_info_index::check_duplicate_master_info fails
sql/rpl_rli.cc:
If we do a full reset, don't create any new relay log files.
sql/share/errmsg-utf8.txt:
Improved error message if connection exists
sql/sql_parse.cc:
Fixed memory leak
sql/sql_repl.cc:
check_duplicate_master_info() now generates an error
Added parameter to reset_logs()
Documentation of the feature can be found at: http://kb.askmonty.org/en/multi-source-replication/
This code is based on code from Taobao, developed by Plinux
BUILD/SETUP.sh:
Added -Wno-invalid-offsetof to get rid of warning of offsetof() on C++ class (safe in the contex we use it)
client/mysqltest.cc:
Added support for error names starting with 'W'
Added connection_name support to --sync_with_master
cmake/maintainer.cmake:
Added -Wno-invalid-offsetof to get rid of warning of offsetof() on C++ class (safe in the contex we use it)
mysql-test/r/mysqltest.result:
Updated results
mysql-test/r/parser.result:
Updated results
mysql-test/suite/multi_source/my.cnf:
Setup of multi-master tests
mysql-test/suite/multi_source/simple.result:
Simple basic test of multi-source functionality
mysql-test/suite/multi_source/simple.test:
Simple basic test of multi-source functionality
mysql-test/suite/multi_source/syntax.result:
Test of multi-source syntax
mysql-test/suite/multi_source/syntax.test:
Test of multi-source syntax
mysql-test/suite/rpl/r/rpl_rotate_logs.result:
Updated results because of new error messages
mysql-test/t/parser.test:
Updated test as master_pos_wait() now takes more arguments than before
sql/event_scheduler.cc:
No reason to initialize slave_thread (it's guaranteed to be zero here)
sql/item_create.cc:
Added connection_name argument to master_pos_wait()
Simplified code
sql/item_func.cc:
Added connection_name argument to master_pos_wait()
sql/item_func.h:
Added connection_name argument to master_pos_wait()
sql/log.cc:
Added tag "Master 'connection_name'" to slave errors that has a connection name.
sql/mysqld.cc:
Added variable mysqld_server_initialized so that other functions can test if server is fully initialized.
Free all slave data in one place (fewer ifdef's)
Removed not needed call to close_active_mi()
Initialize slaves() later in startup to ensure that everthing is really initialized when slaves start.
Made status variable slave_running multi-source safe
sql/mysqld.h:
Added mysqld_server_initialized
sql/rpl_mi.cc:
Store connection name and cmp_connection_name (only used for show full slave status) in Master_info
Added code for Master_info_index, which handles storage of multi-master information
Don't write the empty "" connection_name to multi-master.info file. This is handled by the original code.
sql/rpl_mi.h:
Added connection_name and Master_info_index
sql/rpl_rli.cc:
Added connection_name to relay log files.
sql/rpl_rli.h:
Fixed type of slave_skip_counter as we now access it directly in sys_vars.cc, so it must be uint
sql/share/errmsg-utf8.txt:
Added new error messages needed for multi-source
Added multi-source name to error ER_MASTER_INFO and WARN_NO_MASTER_INFO
sql/slave.cc:
Moved things a bit around to make it easier to handle error conditions.
Create a global master_info_index and add the "" connection to it
Ensure that new Master_info doesn't fail.
Don't call terminate_slave_threads(active_mi..) on end_slave() as this is now done automaticly when deleting master_info_index.
Delete not needed function close_active_mi(). One can achive same thing by calling end_slave().
Added support for SHOW FULL SLAVE STATUS (show status for all master connections with connection_name as first column)
sql/slave.h:
Added new prototypes
sql/sql_base.cc:
More DBUG_PRINT
sql/sql_class.cc:
Reset thd->connection_name and thd-->default_master_connection
sql/sql_class.h:
Added thd->connection_name and thd-->default_master_connection
Added slave_skip_count to variables to make changing the @@sql_slave_skip_count variable thread safe
sql/sql_const.h:
Added MAX_CONNECTION_NAME
sql/sql_lex.cc:
Reset 'lex->verbose' (to simplify some sql_yacc.yy code)
sql/sql_lex.h:
Added connection_name
sql/sql_parse.cc:
Added support for connection_name to all SLAVE commands.
- Instead of using active_mi, we now get the current Master_info from master_info_index.
- Create new replication threads with CHANGE MASTER
- Added support for show_all_master_info()
sql/sql_reload.cc:
Made reset/full slave use master_info_index->get_master_info() instead of active_mi.
If one uses 'RESET SLAVE "connection_name" all' the connection is removed from master_info_index.
sql/sql_repl.cc:
sql_slave_skip_counter is moved to thd->variables to make it thread safe and fix some bugs with it
Add connection name to relay log files.
Added connection name to errors.
Added some logging for multi-master if log_warnings > 1
stop_slave():
- Don't check if thd is set. It's guaranteed to always be set.
change_master():
- Check for duplicate connection names in change_master()
- Check for wrong arguments first in file (to simplify error handling)
- Register new connections in master_info_index
sql/sql_yacc.yy:
Added optional connection_name to a all relevant master/slave commands
sql/strfunc.cc:
my_global.h shoud always be included first.
sql/sys_vars.cc:
Added variable default_master_connection
Made variable sql_slave_skip_counter multi-source safe
sql/sys_vars.h:
Added Sys_var_session_lexstring (needed for default_master_connection)
Added Sys_var_multi_source_uint (needed for sql_slave_skip_counter).
and small collateral changes
mysql-test/lib/My/Test.pm:
somehow with "print" we get truncated writes sometimes
mysql-test/suite/perfschema/r/digest_table_full.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/perfschema/r/dml_handler.result:
host table is not ported over yet
mysql-test/suite/perfschema/r/information_schema.result:
host table is not ported over yet
mysql-test/suite/perfschema/r/nesting.result:
this differs, because we don't rewrite general log queries, and multi-statement
packets are logged as a one entry. this result file is identical to what mysql-5.6.5
produces with the --log-raw option.
mysql-test/suite/perfschema/r/relaylog.result:
MariaDB modifies the binlog index file directly, while MySQL 5.6 has a feature "crash-safe binlog index" and modifies a special "crash-safe" shadow copy of the index file and then moves it over. That's why this test shows "NONE" index file writes in MySQL and "MANY" in MariaDB.
mysql-test/suite/perfschema/r/server_init.result:
MariaDB initializes the "manager" resources from the "manager" thread, and starts this thread only when --flush-time is not 0. MySQL 5.6 initializes "manager" resources unconditionally on server startup.
mysql-test/suite/perfschema/r/stage_mdl_global.result:
this differs, because MariaDB disables query cache when query_cache_size=0. MySQL does not
do that, and this causes useless mutex locks and waits.
mysql-test/suite/perfschema/r/statement_digest.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/perfschema/r/statement_digest_consumers.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/perfschema/r/statement_digest_long_query.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/rpl/r/rpl_mixed_drop_create_temp_table.result:
will be updated to match 5.6 when alfranio.correia@oracle.com-20110512172919-c1b5kmum4h52g0ni and anders.song@greatopensource.com-20110105052107-zoab0bsf5a6xxk2y are merged
mysql-test/suite/rpl/r/rpl_non_direct_mixed_mixing_engines.result:
will be updated to match 5.6 when anders.song@greatopensource.com-20110105052107-zoab0bsf5a6xxk2y is merged
two tests still fail:
main.innodb_icp and main.range_vs_index_merge_innodb
call records_in_range() with both range ends being open
(which triggers an assert)
Keep track of how many pending XIDs (transactions that are prepared in
storage engine and written into binlog, but not yet durably committed
on disk in the engine) there are in each binlog.
When the count of one binlog drops to zero, write a new binlog checkpoint
event, telling which is the oldest binlog with pending XIDs.
When doing XA recovery after a crash, check the last binlog checkpoint
event, and scan all binlog files from that point onwards for XIDs that
must be committed if found in prepared state inside engine.
Remove the code in binlog rotation that waits for all prepared XIDs to
be committed before writing a new binlog file (this is no longer necessary
when recovery can scan multiple binlog files).
Add function to replace arbitrary event with dummy event.
Add code which uses this to fix the bug that enabling row_annotate events
on the master breaks slaves which do not request such events.
Add that slaves set a variable @mariadb_slave_capability to inform the
master in a robust way about which events it can, and cannot, handle.
Add tests.
Problem
========
Replication breaks in the cases if the event length exceeds
the size of master Dump thread's max_allowed_packet.
The reason why this failure is occuring is because the event length is
more than the total size of the max_allowed_packet, on addition of the
max_event_header length exceeds the max_allowed_packet of the DUMP thread.
This causes the Dump thread to break replication and throw an error.
That can happen e.g with row-based replication in Update_rows event.
Fix
====
The problem is fixed in 2 steps:
1.) The Dump thread limit to read event is increased to the upper limit
i.e. Dump thread reads whatever gets logged in the binary log.
2.) On the slave side we increase the the max_allowed_packet for the
slave's threads (IO/SQL) by increasing it to 1GB.
This is done using the new server option (slave_max_allowed_packet)
included, is used to regulate the max_allowed_packet of the
slave thread (IO/SQL) by the DBA, and facilitates the sending of
large packets from the master to the slave.
This causes the large packets to be received by the slave and apply
it successfully.
sql/log_event.cc:
The max_allowed_packet is not evaluated to the new option
slave_max_allowed_packet after the fix.
sql/log_event.h:
Added the new option in the log_event.h file.
sql/mysqld.cc:
Added a new option to the server.
sql/slave.cc:
Increasing the session max_allowed_packet to a large value,
i.e. not taking global(max_allowed) into consideration, for the slave's threads.
sql/sql_repl.cc:
The dump thread's max_allowed_packet is set to the upper limit
which makes it independent and it now reads whatever gets
logged in the binary log.
Problem
========
Replication breaks in the cases if the event length exceeds
the size of master Dump thread's max_allowed_packet.
The reason why this failure is occuring is because the event length is
more than the total size of the max_allowed_packet, on addition of the
max_event_header length exceeds the max_allowed_packet of the DUMP thread.
This causes the Dump thread to break replication and throw an error.
That can happen e.g with row-based replication in Update_rows event.
Fix
====
The problem is fixed in 2 steps:
1.) The Dump thread limit to read event is increased to the upper limit
i.e. Dump thread reads whatever gets logged in the binary log.
2.) On the slave side we increase the the max_allowed_packet for the
slave's threads (IO/SQL) by increasing it to 1GB.
This is done using the new server option (slave_max_allowed_packet)
included, is used to regulate the max_allowed_packet of the
slave thread (IO/SQL) by the DBA, and facilitates the sending of
large packets from the master to the slave.
This causes the large packets to be received by the slave and apply
it successfully.
Problem
========
Replication breaks in the cases if the event length exceeds
the size of master Dump thread's max_allowed_packet.
The reason why this failure is occuring is because the event length is
more than the total size of the max_allowed_packet, on addition of the
max_event_header length exceeds the max_allowed_packet of the DUMP thread.
This causes the Dump thread to break replication and throw an error.
That can happen e.g with row-based replication in Update_rows event.
Fix
====
The problem is fixed in 2 steps:
1.) The Dump thread limit to read event is increased to the upper limit
i.e. Dump thread reads whatever gets logged in the binary log.
2.) On the slave side we increase the the max_allowed_packet for the
slave's threads (IO/SQL) by increasing it to 1GB.
This is done using the new server option (slave_max_allowed_packet)
included, is used to regulate the max_allowed_packet of the
slave thread (IO/SQL) by the DBA, and facilitates the sending of
large packets from the master to the slave.
This causes the large packets to be received by the slave and apply
it successfully.
sql/log_event.cc:
The max_allowed_packet is not evaluated to the new option
slave_max_allowed_packet after the fix.
sql/log_event.h:
Added the new option in the log_event.h file.
sql/mysqld.cc:
Added a new option to the server.
sql/slave.cc:
Increasing the session max_allowed_packet to a large value,
i.e. not taking global(max_allowed) into consideration, for the slave's threads.
sql/sql_repl.cc:
The dump thread's max_allowed_packet is set to the upper limit
which makes it independent and it now reads whatever gets
logged in the binary log.
Problem
========
Replication breaks in the cases if the event length exceeds
the size of master Dump thread's max_allowed_packet.
The reason why this failure is occuring is because the event length is
more than the total size of the max_allowed_packet, on addition of the
max_event_header length exceeds the max_allowed_packet of the DUMP thread.
This causes the Dump thread to break replication and throw an error.
That can happen e.g with row-based replication in Update_rows event.
Fix
====
The problem is fixed in 2 steps:
1.) The Dump thread limit to read event is increased to the upper limit
i.e. Dump thread reads whatever gets logged in the binary log.
2.) On the slave side we increase the the max_allowed_packet for the
slave's threads (IO/SQL) by increasing it to 1GB.
This is done using the new server option (slave_max_allowed_packet)
included, is used to regulate the max_allowed_packet of the
slave thread (IO/SQL) by the DBA, and facilitates the sending of
large packets from the master to the slave.
This causes the large packets to be received by the slave and apply
it successfully.
IN THE ERROR LOG
Problem:
Using mysqlbinlog with the --read-from-remote-server option as shown below
prints a message in error log for each call. This happens for 5.5 and above
versions
mysqlbinlog -uroot -p --read-from-remote-server --host=localhost test
Message in error log file is given below:
120312 10:27:57 [Note] Start binlog_dump to slave_server(0), pos(test, 4)
The problem is that it can fill up the error log if the command is called
very often.
Analysis:
The below mentioned print function is called from "mysql_binlog_send" function
which causes the "Start binlog_dump..." string to be printed in error log file.
sql_print_information("Start binlog_dump to master_thread_id(%lu)
slave_server(%d)..."
Fix:
A condition has been added in such a way that the 'sql_print_information'
will be invoked only when the "log_warnings" variable is set to >1
otherwise don't call the 'sql_print_information' function.
IN THE ERROR LOG
Problem:
Using mysqlbinlog with the --read-from-remote-server option as shown below
prints a message in error log for each call. This happens for 5.5 and above
versions
mysqlbinlog -uroot -p --read-from-remote-server --host=localhost test
Message in error log file is given below:
120312 10:27:57 [Note] Start binlog_dump to slave_server(0), pos(test, 4)
The problem is that it can fill up the error log if the command is called
very often.
Analysis:
The below mentioned print function is called from "mysql_binlog_send" function
which causes the "Start binlog_dump..." string to be printed in error log file.
sql_print_information("Start binlog_dump to master_thread_id(%lu)
slave_server(%d)..."
Fix:
A condition has been added in such a way that the 'sql_print_information'
will be invoked only when the "log_warnings" variable is set to >1
otherwise don't call the 'sql_print_information' function.
Problem
========
SQL statements close to the size of max_allowed_packet produce binary
log events larger than max_allowed_packet.
The reason why this failure is occuring is because the event length is
more than the total size of the max_allowed_packet + max_event_header
length. Now since the event length exceeds this size master Dump
thread is unable to send the packet on to the slave.
That can happen e.g with row-based replication in Update_rows event.
Fix
====
The problem was fixed by increasing the max_allowed_packet for the
slave's threads (IO/SQL) by increasing it to 1GB.
This is done using the new server option included which is used to
regulate the max_allowed_packet of the slave thread (IO/SQL).
This causes the large packets to be received by the slave and apply
it successfully.
sql/log_event.h:
Added the new option in the log_event.h file.
sql/mysqld.cc:
Added a new option to the server.
sql/slave.cc:
Increasing the session max_allowed_packet to a large value ,
i.e. not taking global(max_allowed) into consideration, for the slave's threads.
Problem
========
SQL statements close to the size of max_allowed_packet produce binary
log events larger than max_allowed_packet.
The reason why this failure is occuring is because the event length is
more than the total size of the max_allowed_packet + max_event_header
length. Now since the event length exceeds this size master Dump
thread is unable to send the packet on to the slave.
That can happen e.g with row-based replication in Update_rows event.
Fix
====
The problem was fixed by increasing the max_allowed_packet for the
slave's threads (IO/SQL) by increasing it to 1GB.
This is done using the new server option included which is used to
regulate the max_allowed_packet of the slave thread (IO/SQL).
This causes the large packets to be received by the slave and apply
it successfully.
PROBLEM:
Threads end-up in deadlock due to locks acquired as described
below,
con1: Run Query on a table.
It is important that this SELECT must back-off while
trying to open the t1 and enter into wait_for_condition().
The SELECT then is blocked trying to lock mysys_var->mutex
which is held by con3. The very significant fact here is
that mysys_var->current_mutex will still point to LOCK_open,
even if LOCK_open is no longer held by con1 at this point.
con2: Try dropping table used in con1 or query some table.
It will hold LOCK_open and be blocked trying to lock
kernel_mutex held by con4.
con3: Try killing the query run by con1.
It will hold THD::LOCK_thd_data belonging to con1 while
trying to lock mysys_var->current_mutex belonging to con1.
But current_mutex will point to LOCK_open which is held
by con2.
con4: Get innodb engine status
It will hold kernel_mutex, trying to lock THD::LOCK_thd_data
belonging to con1 which is held by con3.
So while technically only con2, con3 and con4 participate in the
deadlock, con1's mysys_var->current_mutex pointing to LOCK_open
is a vital component of the deadlock.
CYCLE = (THD::LOCK_thd_data -> LOCK_open ->
kernel_mutex -> THD::LOCK_thd_data)
FIX:
LOCK_thd_data has responsibility of protecting,
1) thd->query, thd->query_length
2) VIO
3) thd->mysys_var (used by KILL statement and shutdown)
4) THD during thread delete.
Among above responsibilities, 1), 2)and (3,4) seems to be three
independent group of responsibility. If there is different LOCK
owning responsibility of (3,4), the above mentioned deadlock cycle
can be avoid. This fix introduces LOCK_thd_kill to handle
responsibility (3,4), which eliminates the deadlock issue.
Note: The problem is not found in 5.5. Introduction MDL subsystem
caused metadata locking responsibility to be moved from TDC/TC to
MDL subsystem. Due to this, responsibility of LOCK_open is reduced.
As the use of LOCK_open is removed in open_table() and
mysql_rm_table() the above mentioned CYCLE does not form.
Revision ID for changes,
open_table() = dlenev@mysql.com-20100727133458-m3ua9oslnx8fbbvz
mysql_rm_table() = jon.hauglid@oracle.com-20101116100012-kxep9txz2fxy3nmw
PROBLEM:
Threads end-up in deadlock due to locks acquired as described
below,
con1: Run Query on a table.
It is important that this SELECT must back-off while
trying to open the t1 and enter into wait_for_condition().
The SELECT then is blocked trying to lock mysys_var->mutex
which is held by con3. The very significant fact here is
that mysys_var->current_mutex will still point to LOCK_open,
even if LOCK_open is no longer held by con1 at this point.
con2: Try dropping table used in con1 or query some table.
It will hold LOCK_open and be blocked trying to lock
kernel_mutex held by con4.
con3: Try killing the query run by con1.
It will hold THD::LOCK_thd_data belonging to con1 while
trying to lock mysys_var->current_mutex belonging to con1.
But current_mutex will point to LOCK_open which is held
by con2.
con4: Get innodb engine status
It will hold kernel_mutex, trying to lock THD::LOCK_thd_data
belonging to con1 which is held by con3.
So while technically only con2, con3 and con4 participate in the
deadlock, con1's mysys_var->current_mutex pointing to LOCK_open
is a vital component of the deadlock.
CYCLE = (THD::LOCK_thd_data -> LOCK_open ->
kernel_mutex -> THD::LOCK_thd_data)
FIX:
LOCK_thd_data has responsibility of protecting,
1) thd->query, thd->query_length
2) VIO
3) thd->mysys_var (used by KILL statement and shutdown)
4) THD during thread delete.
Among above responsibilities, 1), 2)and (3,4) seems to be three
independent group of responsibility. If there is different LOCK
owning responsibility of (3,4), the above mentioned deadlock cycle
can be avoid. This fix introduces LOCK_thd_kill to handle
responsibility (3,4), which eliminates the deadlock issue.
Note: The problem is not found in 5.5. Introduction MDL subsystem
caused metadata locking responsibility to be moved from TDC/TC to
MDL subsystem. Due to this, responsibility of LOCK_open is reduced.
As the use of LOCK_open is removed in open_table() and
mysql_rm_table() the above mentioned CYCLE does not form.
Revision ID for changes,
open_table() = dlenev@mysql.com-20100727133458-m3ua9oslnx8fbbvz
mysql_rm_table() = jon.hauglid@oracle.com-20101116100012-kxep9txz2fxy3nmw
The function mysql_show_binlog_events has a local stack variable
'LOG_INFO linfo;', which is assigned to thd->current_linfo, however
this variable goes out of scope and is destroyed before clean
thd->current_linfo.
The problem is solved by moving 'LOG_INFO linfo;' to function scope.