XID_STATE->XID.KEY(),
XID_STATE->XID.KEY_LENGTH())==0
This bug is a regression of bug#11759534 - 51855: RACE CONDITION
IN XA START.
The reason for regression is that the changes that fixes the original
bug wasn't merged from mysql-5.1 into mysql-5.5 and mysql-trunk.
Only null-merge was done for the patch changeset.
To incorporate lost changes the manual merge have been done.
Additionally the call of trans_rolback() was added into trans_xa_start()
in case if xid_cache_insert is failed() after transaction has been started.
If we don't call trans_rollback() we would never reset the flag
SERVER_STATUS_IN_TRANS in THD::server_status and therefore all subsequent
attempts to execute XA START in the connection where the error was occurred
will be failed since thd->in_active_multi_stmt_transaction() will return
the true every time when trans_xa_start is called.
The latest changes were absent in patch for mysql-5.1
When master and slave have different schemas, in particular different
AUTO_INCREMENT columns, INSERT_ID events logged for a given table on
master may be applied to a different table on slave on SBR, e.g.:
master has one table (t1) with one auto-inc column and another table
(t2) without auto-inc column, on slave t1 does not have auto-inc
column (despite having the same columns) and t2 has a auto-inc
column. The INSERT_ID that is intended for t1, since t1 on slave
doesn't have auto-inc column is used on t2, causing consistency
problems.
To fix this incorrect behaviour, auto-inc interval allocation via
INSERT_ID is made effectively terminated at the end of top-level
statements on slave and binlog replay.
THREAD POOLING STRESS TEST
PROBLEM:
Connection stress tests which consists of concurrent
kill connections interleaved with mysql ping queries
cause the mysqld server which uses thread pool scheduler
to crash.
FIX:
Killing a connection involves shutdown and close of client
socket and this can cause EPOLLHUP(or EPOLLERR) events to be
to be queued and handled after disarming and cleanup of
of the connection object (THD) is being done.We disarm the
the connection by modifying the epoll mask to zero which
ensure no events come and release the ownership of waiting
thread that collect events and then do the cleanup of THD.
object.As per the linux kernel epoll source code (
http://lxr.linux.no/linux+*/fs/eventpoll.c#L1771), EPOLLHUP
(or EPOLLERR) can't be masked even if we set EPOLL mask
to zero. So we disarm the connection and thus prevent
execution of any query processing handler/queueing to
client ctx. queue by removing the client fd from the epoll
set via EPOLL_CTL_DEL. Also there is a race condition which
involve the following threads:
1) Thread X executing KILL CONNECTION Y and is in THD::awake
and using mysys_var (holding LOCK_thd_data).
2) Thread Y in tp_process_event executing and is being killed.
3) Thread Z receives KILL flag internally and possible call
the tp_thd_cleanup function which set thread session variable
and changing mysys_var.
The fix for the above race is to set thread session variable
under LOCK_thd_data.
We also do not call THD::awake if we found the thread in the
thread list that is to be killed but it's KILL_CONNECTION flag
set thus avoiding any possible concurrent cleanup. This patch
is approved by Mikael Ronstrom via email review.
VARIABLES
Analysis:
-------------
After executing the query, new value of the user defined
variables are set in the function "select_dumpvar::send_data".
"select_dumpvar::send_data" first calls function
"Item_func_set_user_var::save_item_result()". This function
checks the nullness of the Item_field passed as parameter
to it and saves it. The nullness of item is stored with
arg[0]'s null_value flag. Then "select_dumpvar::send_data" calls
"Item_func_set_user_var::update()" which notices null
result that was saved and calls "Item_func_set_user_var::
update_hash". But here null_value is not set and args[0]
is different from that given to function "Item_func_set_user_var::
set_item_result()". This causes "Item_func_set_user_var::
update_hash" function to believe that its getting non-null value.
"user_var_entry::length" set to 0 and hence "user_var_entry::value"
is made to point to extra_area allocated in "user_var_entry".
And "Item_func_set_user_var::update_hash" tries to write
at memory beyond extra_area for result type DECIMAL. Because of
this invalid write issue is reported by Valgrind.
Before this bug was introduced, we avoided this problem by
creating "Item_func_set_user_var" object with the same
Item_field as arg[0] and as parameter to
Item_func_set_user_var::save_item_result(). But now
they are refering to different args[0]. Because of this
null_value flag set in parameter Item_field in function
"Item_func_set_user_var::save_item_result()" is not
reflected in "Item_func_set_user_var" object.
Fix:
------------
This issue is reported on versions 5.5.24. Issue does not exists
in 5.5.23, 5.1, 5.6 and trunk.
This issue was introduced by
revid:georgi.kodinov@oracle.com-20120309130449-82e3bs5v3et1x0ef (fix for
bug #12408412), which was pushed into 5.5 and later releases. This patch
has later been reversed in 5.6 and trunk by
revid:norvald.ryeng@oracle.com-20121010135242-xj34gg73h04hrmyh (fix for
bug #14664077). Backported this patch in 5.5 also to fix this issue.
sql/item_func.cc:
here unsigned value is converted to signed value.
sql/item_func.h:
last_insert_id() gives an auto_incremented value which can be
positive only,so defined it as a unsigned longlong sets the
unsigned_flag to 1.
MDEV-26: Global transaction id, partial commit
Change server_id to be a session variable.
User with SUPER can set it to binlog with different server_id.
Implement backward-compatible ::server_id mirror for plugins.
mysql-test/suite/innodb/include/restart_and_reinit.inc:
drop and recreate mysql.innodb* tables when deleting innodb table spaces
mysql-test/t/ssl_8k_key-master.opt:
with loose- prefix ssl errors are ignored
sql-common/client.c:
compiler warnings
sql/field.cc:
use the new function
sql/item.cc:
don't convert time to double or decimal via longlong,
this loses sub-second part.
Use dedicated functions.
sql/item.h:
incorrect cast_to_int type for params
sql/item_strfunc.cc:
use the new function
sql/lex.h:
unused
sql/my_decimal.h:
helper macro
sql/sql_plugin.cc:
workaround for a compiler warning
sql/sql_yacc.yy:
unused
sql/transaction.cc:
fix the merge for SERVER_STATUS_IN_TRANS_READONLY protocol flag
storage/sphinx/CMakeLists.txt:
compiler warnings
Documentation of the feature can be found at: http://kb.askmonty.org/en/multi-source-replication/
This code is based on code from Taobao, developed by Plinux
BUILD/SETUP.sh:
Added -Wno-invalid-offsetof to get rid of warning of offsetof() on C++ class (safe in the contex we use it)
client/mysqltest.cc:
Added support for error names starting with 'W'
Added connection_name support to --sync_with_master
cmake/maintainer.cmake:
Added -Wno-invalid-offsetof to get rid of warning of offsetof() on C++ class (safe in the contex we use it)
mysql-test/r/mysqltest.result:
Updated results
mysql-test/r/parser.result:
Updated results
mysql-test/suite/multi_source/my.cnf:
Setup of multi-master tests
mysql-test/suite/multi_source/simple.result:
Simple basic test of multi-source functionality
mysql-test/suite/multi_source/simple.test:
Simple basic test of multi-source functionality
mysql-test/suite/multi_source/syntax.result:
Test of multi-source syntax
mysql-test/suite/multi_source/syntax.test:
Test of multi-source syntax
mysql-test/suite/rpl/r/rpl_rotate_logs.result:
Updated results because of new error messages
mysql-test/t/parser.test:
Updated test as master_pos_wait() now takes more arguments than before
sql/event_scheduler.cc:
No reason to initialize slave_thread (it's guaranteed to be zero here)
sql/item_create.cc:
Added connection_name argument to master_pos_wait()
Simplified code
sql/item_func.cc:
Added connection_name argument to master_pos_wait()
sql/item_func.h:
Added connection_name argument to master_pos_wait()
sql/log.cc:
Added tag "Master 'connection_name'" to slave errors that has a connection name.
sql/mysqld.cc:
Added variable mysqld_server_initialized so that other functions can test if server is fully initialized.
Free all slave data in one place (fewer ifdef's)
Removed not needed call to close_active_mi()
Initialize slaves() later in startup to ensure that everthing is really initialized when slaves start.
Made status variable slave_running multi-source safe
sql/mysqld.h:
Added mysqld_server_initialized
sql/rpl_mi.cc:
Store connection name and cmp_connection_name (only used for show full slave status) in Master_info
Added code for Master_info_index, which handles storage of multi-master information
Don't write the empty "" connection_name to multi-master.info file. This is handled by the original code.
sql/rpl_mi.h:
Added connection_name and Master_info_index
sql/rpl_rli.cc:
Added connection_name to relay log files.
sql/rpl_rli.h:
Fixed type of slave_skip_counter as we now access it directly in sys_vars.cc, so it must be uint
sql/share/errmsg-utf8.txt:
Added new error messages needed for multi-source
Added multi-source name to error ER_MASTER_INFO and WARN_NO_MASTER_INFO
sql/slave.cc:
Moved things a bit around to make it easier to handle error conditions.
Create a global master_info_index and add the "" connection to it
Ensure that new Master_info doesn't fail.
Don't call terminate_slave_threads(active_mi..) on end_slave() as this is now done automaticly when deleting master_info_index.
Delete not needed function close_active_mi(). One can achive same thing by calling end_slave().
Added support for SHOW FULL SLAVE STATUS (show status for all master connections with connection_name as first column)
sql/slave.h:
Added new prototypes
sql/sql_base.cc:
More DBUG_PRINT
sql/sql_class.cc:
Reset thd->connection_name and thd-->default_master_connection
sql/sql_class.h:
Added thd->connection_name and thd-->default_master_connection
Added slave_skip_count to variables to make changing the @@sql_slave_skip_count variable thread safe
sql/sql_const.h:
Added MAX_CONNECTION_NAME
sql/sql_lex.cc:
Reset 'lex->verbose' (to simplify some sql_yacc.yy code)
sql/sql_lex.h:
Added connection_name
sql/sql_parse.cc:
Added support for connection_name to all SLAVE commands.
- Instead of using active_mi, we now get the current Master_info from master_info_index.
- Create new replication threads with CHANGE MASTER
- Added support for show_all_master_info()
sql/sql_reload.cc:
Made reset/full slave use master_info_index->get_master_info() instead of active_mi.
If one uses 'RESET SLAVE "connection_name" all' the connection is removed from master_info_index.
sql/sql_repl.cc:
sql_slave_skip_counter is moved to thd->variables to make it thread safe and fix some bugs with it
Add connection name to relay log files.
Added connection name to errors.
Added some logging for multi-master if log_warnings > 1
stop_slave():
- Don't check if thd is set. It's guaranteed to always be set.
change_master():
- Check for duplicate connection names in change_master()
- Check for wrong arguments first in file (to simplify error handling)
- Register new connections in master_info_index
sql/sql_yacc.yy:
Added optional connection_name to a all relevant master/slave commands
sql/strfunc.cc:
my_global.h shoud always be included first.
sql/sys_vars.cc:
Added variable default_master_connection
Made variable sql_slave_skip_counter multi-source safe
sql/sys_vars.h:
Added Sys_var_session_lexstring (needed for default_master_connection)
Added Sys_var_multi_source_uint (needed for sql_slave_skip_counter).
and small collateral changes
mysql-test/lib/My/Test.pm:
somehow with "print" we get truncated writes sometimes
mysql-test/suite/perfschema/r/digest_table_full.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/perfschema/r/dml_handler.result:
host table is not ported over yet
mysql-test/suite/perfschema/r/information_schema.result:
host table is not ported over yet
mysql-test/suite/perfschema/r/nesting.result:
this differs, because we don't rewrite general log queries, and multi-statement
packets are logged as a one entry. this result file is identical to what mysql-5.6.5
produces with the --log-raw option.
mysql-test/suite/perfschema/r/relaylog.result:
MariaDB modifies the binlog index file directly, while MySQL 5.6 has a feature "crash-safe binlog index" and modifies a special "crash-safe" shadow copy of the index file and then moves it over. That's why this test shows "NONE" index file writes in MySQL and "MANY" in MariaDB.
mysql-test/suite/perfschema/r/server_init.result:
MariaDB initializes the "manager" resources from the "manager" thread, and starts this thread only when --flush-time is not 0. MySQL 5.6 initializes "manager" resources unconditionally on server startup.
mysql-test/suite/perfschema/r/stage_mdl_global.result:
this differs, because MariaDB disables query cache when query_cache_size=0. MySQL does not
do that, and this causes useless mutex locks and waits.
mysql-test/suite/perfschema/r/statement_digest.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/perfschema/r/statement_digest_consumers.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/perfschema/r/statement_digest_long_query.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/rpl/r/rpl_mixed_drop_create_temp_table.result:
will be updated to match 5.6 when alfranio.correia@oracle.com-20110512172919-c1b5kmum4h52g0ni and anders.song@greatopensource.com-20110105052107-zoab0bsf5a6xxk2y are merged
mysql-test/suite/rpl/r/rpl_non_direct_mixed_mixing_engines.result:
will be updated to match 5.6 when anders.song@greatopensource.com-20110105052107-zoab0bsf5a6xxk2y is merged
two tests still fail:
main.innodb_icp and main.range_vs_index_merge_innodb
call records_in_range() with both range ends being open
(which triggers an assert)
PROBLEM:
mysql provides a feature where in a session which is
idle for a period specified by the wait_timeout variable
(whose value is in seconds), the session is closed
This feature is not present when we use thread pool.
FIX:
This patch implements the interface functions which is
required to implement the wait_timeout functionality
in the thread pool plugin.
KEY UPDATES WITH A LIMIT OF 1
Problem: The unsafety warning for statements such as
update...limit1 where pk=1 are thrown when binlog-format
= STATEMENT,despite of the fact that such statements are
actually safe. this leads to filling up of the disk space
with false warnings.
Solution: This is not a complete fix for the problem, but
prevents the disks from getting filled up. This should
therefore be regarded as a workaround. In the future this
should be overriden by server general suppress/filtering
framework. It should also be noted that another worklog is
supposed to defeat this case's artificial unsafety.
We use a warning suppression mechanism to detect warning flood,
enable the suppression, and disable this when the average
warnings/second has reduced to acceptable limits.
Activation: The supression for LIMIT unsafe statements are
activated when the last 50 warnings were logged in less
than 50 seconds.
Supression: Once activated this supression will prevent the
individual warnings to be logged in the error log, but print
the warning for every 50 warnings with the note:
"The last warning was repeated N times in last S seconds"
Noteworthy is the fact that this supression works only on the
error logs and the warnings seen by the clients will remain as
it is (i.e. one warning/ unsafe statement)
Deactivation: The supression will be deactivated once the
average # of warnings/sec have gone down to the acceptable limits.
sql/sql_class.cc:
Added code to supress warning while logging them to error-log.
- Add Monty Program Ab copyright in new files
- Change Apc_target::make_apc_call() to accept a C++-style
functor (instead of C-style function + parameter)
- Make SHOW EXPLAIN code take into account that st_select_lex object without joins can be
a full-featured SELECTs which were already executed and cleaned up.
Analysis:
-------------
If server is started with limit of MAX_CONNECTIONS and
MAX_USER_CONNECTIONS then only MAX_USER_CONNECTIONS of any particular
users can be connected to server and total MAX_CONNECTIONS of client can
be connected to server.
Server maintains a counter for total CONNECTIONS and total CONNECTIONS
from particular user.
Here, MAX_CONNECTIONS of connections are created to server. Out of this
MAX_CONNECTIONS, connections from particular user (say USER1) are
also created. The connections from USER1 is lesser than
MAX_USER_CONNECTIONS. After that there was one more connection request from
USER1. Since USER1 can still create connections as he havent reached
MAX_USER_CONNECTIONS, server increments counter of CONNECTIONS per user.
As server already has MAX_CONNECTIONS of connections, next check to total
CONNECTION count fails. In this case control is returned WITHOUT
decrementing the CONNECTIONS per user. So the counter per user CONNECTIONS goes
on incrementing for each attempt until current connections are closed.
And because of this counter per CONNECTIONS reached MAX_USER_CONNECTIONS.
So, next connections form USER1 user always returns with MAX_USER_CONNECTION
limit error, even when total connection to sever are less than MAX_CONNECTIONS.
Fix:
-------------
This issue is occurred because of not handling counters properly in the
server. Changed the code to handle per user connection counters properly.
PROBLEM:
Threads end-up in deadlock due to locks acquired as described
below,
con1: Run Query on a table.
It is important that this SELECT must back-off while
trying to open the t1 and enter into wait_for_condition().
The SELECT then is blocked trying to lock mysys_var->mutex
which is held by con3. The very significant fact here is
that mysys_var->current_mutex will still point to LOCK_open,
even if LOCK_open is no longer held by con1 at this point.
con2: Try dropping table used in con1 or query some table.
It will hold LOCK_open and be blocked trying to lock
kernel_mutex held by con4.
con3: Try killing the query run by con1.
It will hold THD::LOCK_thd_data belonging to con1 while
trying to lock mysys_var->current_mutex belonging to con1.
But current_mutex will point to LOCK_open which is held
by con2.
con4: Get innodb engine status
It will hold kernel_mutex, trying to lock THD::LOCK_thd_data
belonging to con1 which is held by con3.
So while technically only con2, con3 and con4 participate in the
deadlock, con1's mysys_var->current_mutex pointing to LOCK_open
is a vital component of the deadlock.
CYCLE = (THD::LOCK_thd_data -> LOCK_open ->
kernel_mutex -> THD::LOCK_thd_data)
FIX:
LOCK_thd_data has responsibility of protecting,
1) thd->query, thd->query_length
2) VIO
3) thd->mysys_var (used by KILL statement and shutdown)
4) THD during thread delete.
Among above responsibilities, 1), 2)and (3,4) seems to be three
independent group of responsibility. If there is different LOCK
owning responsibility of (3,4), the above mentioned deadlock cycle
can be avoid. This fix introduces LOCK_thd_kill to handle
responsibility (3,4), which eliminates the deadlock issue.
Note: The problem is not found in 5.5. Introduction MDL subsystem
caused metadata locking responsibility to be moved from TDC/TC to
MDL subsystem. Due to this, responsibility of LOCK_open is reduced.
As the use of LOCK_open is removed in open_table() and
mysql_rm_table() the above mentioned CYCLE does not form.
Revision ID for changes,
open_table() = dlenev@mysql.com-20100727133458-m3ua9oslnx8fbbvz
mysql_rm_table() = jon.hauglid@oracle.com-20101116100012-kxep9txz2fxy3nmw
BUG#11761686 insert_id event is not filtered.
Two issues are covered.
INSERT into autoincrement field which is not the first part in the composed primary key
is unsafe by autoincrement logging design. The case is specific to MyISAM engine
because Innodb does not allow such table definition.
However no warnings and row-format logging in the MIXED mode was done, and
that is fixed.
Int-, Rand-, User-var log-events were not filtered along with their parent
query that made possible them to screw up execution context of the following
query.
Fixed with deferring their execution until the parent query.
******
Bug#11754117
Post review fixes.
mysql-test/suite/rpl/r/rpl_auto_increment_bug45679.result:
a new result file is added.
mysql-test/suite/rpl/r/rpl_filter_tables_not_exist.result:
results updated.
mysql-test/suite/rpl/t/rpl_auto_increment_bug45679.test:
regression test for BUG#11754117-45670 is added.
mysql-test/suite/rpl/t/rpl_filter_tables_not_exist.test:
regression test for filtering issue of BUG#11754117 - 45670 is added.
sql/log_event.cc:
Logics are added for deferring and executing events associated
with the Query event.
sql/log_event.h:
Interface to deferred events batch execution is added.
sql/rpl_rli.cc:
initialization for new RLI members is added.
sql/rpl_rli.h:
New members to RLI are added to facilitate deferred events gathering
and execution control;
two general character RLI cleanup methods are constructed.
sql/rpl_utility.cc:
Deferred_log_events methods are difined.
sql/rpl_utility.h:
A new class Deferred_log_events is defined to implement
IRU events gathering, execution and cleanup.
sql/slave.cc:
Necessary changes to initialize `rli->deferred_events' and prevent
deferred event deletion in the main read-exec branch.
sql/sql_base.cc:
A new safe-check function for multi-part pk with auto-increment is defined
and deployed in lock_tables().
sql/sql_class.cc:
Initialization for a new member and replication cleanups are added
to THD class.
sql/sql_class.h:
THD class receives a new member to hold a specific execution
context for slave applier.
sql/sql_parse.cc:
Execution of the deferred event in started prior to its parent query.
mysql-test/suite/innodb/t/group_commit_crash.test:
remove autoincrement to avoid rbr being used for insert ... select
mysql-test/suite/innodb/t/group_commit_crash_no_optimize_thread.test:
remove autoincrement to avoid rbr being used for insert ... select
mysys/my_addr_resolve.c:
a pointer to a buffer is returned to the caller -> the buffer cannot be on the stack
mysys/stacktrace.c:
my_vsnprintf() is ok here, in 5.5