PROBLEM:
Threads end up in a deadlock because of the locks acquired as described
below:
con1: Run Query on a table.
It is important that this SELECT backs off while
trying to open t1 and enters wait_for_condition().
The SELECT then is blocked trying to lock mysys_var->mutex
which is held by con3. The very significant fact here is
that mysys_var->current_mutex will still point to LOCK_open,
even if LOCK_open is no longer held by con1 at this point.
con2: Try dropping table used in con1 or query some table.
It will hold LOCK_open and be blocked trying to lock
kernel_mutex held by con4.
con3: Try killing the query run by con1.
It will hold THD::LOCK_thd_data belonging to con1 while
trying to lock mysys_var->current_mutex belonging to con1.
But current_mutex will point to LOCK_open which is held
by con2.
con4: Get innodb engine status
It will hold kernel_mutex, trying to lock THD::LOCK_thd_data
belonging to con1 which is held by con3.
So while technically only con2, con3 and con4 participate in the
deadlock, con1's mysys_var->current_mutex pointing to LOCK_open
is a vital component of the deadlock.
CYCLE = (THD::LOCK_thd_data -> LOCK_open ->
kernel_mutex -> THD::LOCK_thd_data)
FIX:
LOCK_thd_data is responsible for protecting:
1) thd->query, thd->query_length
2) VIO
3) thd->mysys_var (used by KILL statement and shutdown)
4) THD during thread delete.
Among the above responsibilities, 1), 2) and (3,4) appear to be three
independent groups. If a different lock owns
responsibility (3,4), the deadlock cycle described above
can be avoided. This fix introduces LOCK_thd_kill to handle
responsibility (3,4), which eliminates the deadlock.
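The cycle and its removal can be checked with a small wait-for map. This is only an illustrative sketch: the `find_cycle` helper and the dictionaries are hypothetical, and the "after" map assumes that KILL takes LOCK_thd_kill while the engine status query still takes LOCK_thd_data, as the description above suggests.

```python
from typing import Dict, List, Optional


def find_cycle(waits_for: Dict[str, str]) -> Optional[List[str]]:
    """Return a cycle in a 'held lock -> wanted lock' map, or None."""
    for start in waits_for:
        path = [start]
        node = waits_for.get(start)
        # Follow the chain until it ends or revisits a lock on the path.
        while node is not None and node not in path:
            path.append(node)
            node = waits_for.get(node)
        if node is not None:
            # The chain came back to a lock already on the path: a cycle.
            return path[path.index(node):] + [node]
    return None


# Before the fix: each entry reads "holder of this lock waits for that lock".
before = {
    "THD::LOCK_thd_data": "LOCK_open",     # con3 (KILL)
    "LOCK_open": "kernel_mutex",           # con2 (DROP TABLE)
    "kernel_mutex": "THD::LOCK_thd_data",  # con4 (engine status)
}

# After the fix (assumed): KILL holds LOCK_thd_kill instead of LOCK_thd_data.
after = {
    "THD::LOCK_thd_kill": "LOCK_open",     # con3 (KILL)
    "LOCK_open": "kernel_mutex",           # con2 (DROP TABLE)
    "kernel_mutex": "THD::LOCK_thd_data",  # con4 (engine status)
}

cycle_before = find_cycle(before)
cycle_after = find_cycle(after)
```

With the split lock, the chain ends at THD::LOCK_thd_data, which nobody in the chain holds, so no thread waits on itself.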
Note: The problem is not found in 5.5. The introduction of the MDL subsystem
moved metadata locking responsibility from TDC/TC to
the MDL subsystem, which reduced the responsibility of LOCK_open.
Because LOCK_open is no longer used in open_table() and
mysql_rm_table(), the CYCLE described above does not form.
Revision IDs for the changes:
open_table() = dlenev@mysql.com-20100727133458-m3ua9oslnx8fbbvz
mysql_rm_table() = jon.hauglid@oracle.com-20101116100012-kxep9txz2fxy3nmw
BUG#11761686 insert_id event is not filtered.
Two issues are covered.
An INSERT into an autoincrement field that is not the first part of a composite primary key
is unsafe by autoincrement logging design. The case is specific to the MyISAM engine
because InnoDB does not allow such a table definition.
However, no warning was issued and row-format logging was not used in MIXED mode;
that is fixed.
Int-, Rand- and User-var log events were not filtered along with their parent
query, which allowed them to corrupt the execution context of the following
query.
Fixed by deferring their execution until the parent query.
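The deferral idea can be sketched as follows. This is a simplified model, not the server code: the `Event` and `Applier` classes and the `is_filtered` callback are hypothetical names, and the real logic lives in Deferred_log_events and the log event classes.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Event:
    kind: str      # "intvar", "rand", "uservar" or "query"
    payload: str


@dataclass
class Applier:
    is_filtered: Callable[[Event], bool]  # hypothetical replication filter
    deferred: List[Event] = field(default_factory=list)
    executed: List[str] = field(default_factory=list)

    def apply(self, ev: Event) -> None:
        if ev.kind in ("intvar", "rand", "uservar"):
            # Do not touch the execution context yet; wait for the parent.
            self.deferred.append(ev)
            return
        if not self.is_filtered(ev):
            # Parent query is accepted: run its deferred events first.
            for d in self.deferred:
                self.executed.append(d.payload)
            self.executed.append(ev.payload)
        # Accepted or filtered, the batch belongs to this query only.
        self.deferred.clear()


applier = Applier(is_filtered=lambda ev: "ignored_tbl" in ev.payload)
for ev in [
    Event("intvar", "SET INSERT_ID=7"),
    Event("query", "INSERT INTO ignored_tbl VALUES (NULL)"),
    Event("intvar", "SET INSERT_ID=5"),
    Event("query", "INSERT INTO t1 VALUES (NULL)"),
]:
    applier.apply(ev)
```

In this run the first INSERT is filtered, so its INSERT_ID event is dropped with it and cannot leak into the second INSERT.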
******
Bug#11754117
Post review fixes.
mysql-test/suite/rpl/r/rpl_auto_increment_bug45679.result:
a new result file is added.
mysql-test/suite/rpl/r/rpl_filter_tables_not_exist.result:
results updated.
mysql-test/suite/rpl/t/rpl_auto_increment_bug45679.test:
regression test for BUG#11754117-45670 is added.
mysql-test/suite/rpl/t/rpl_filter_tables_not_exist.test:
regression test for filtering issue of BUG#11754117 - 45670 is added.
sql/log_event.cc:
Logic is added for deferring and executing events associated
with the Query event.
sql/log_event.h:
Interface to deferred events batch execution is added.
sql/rpl_rli.cc:
initialization for new RLI members is added.
sql/rpl_rli.h:
New members to RLI are added to facilitate deferred events gathering
and execution control;
two general-purpose RLI cleanup methods are added.
sql/rpl_utility.cc:
Deferred_log_events methods are defined.
sql/rpl_utility.h:
A new class Deferred_log_events is defined to implement
IRU events gathering, execution and cleanup.
sql/slave.cc:
Necessary changes to initialize `rli->deferred_events' and prevent
deferred event deletion in the main read-exec branch.
sql/sql_base.cc:
A new safe-check function for multi-part pk with auto-increment is defined
and deployed in lock_tables().
sql/sql_class.cc:
Initialization for a new member and replication cleanups are added
to THD class.
sql/sql_class.h:
THD class receives a new member to hold a specific execution
context for slave applier.
sql/sql_parse.cc:
Execution of the deferred event is started prior to its parent query.
mysql-test/suite/innodb/t/group_commit_crash.test:
remove autoincrement to avoid rbr being used for insert ... select
mysql-test/suite/innodb/t/group_commit_crash_no_optimize_thread.test:
remove autoincrement to avoid rbr being used for insert ... select
mysys/my_addr_resolve.c:
a pointer to a buffer is returned to the caller -> the buffer cannot be on the stack
mysys/stacktrace.c:
my_vsnprintf() is ok here, in 5.5
BUG#64503: mysql frequently ignores --relay-log-space-limit
When the SQL thread goes to sleep, waiting for more events, it sets
the flag ignore_log_space_limit to true. This gives the IO thread a
chance to queue some more events and ultimately the SQL thread will be
able to purge the log once it is rotated. By then the SQL thread
resets the ignore_log_space_limit to false. However, between the time
the SQL thread has set the ignore flag and the time it resets it, the
IO thread will be queuing events in the relay log, possibly going way
over the limit.
This patch makes the IO and SQL threads synchronize when they reach
the space limit and only ask for one event at a time. Thus the SQL
thread sets the ignore_log_space_limit flag and the IO thread resets it to
false every time it processes one more event. In addition, every time
the SQL thread processes the next event, and the limit has been
reached, it checks if the IO thread should rotate. If it should, it
instructs the IO thread to rotate, giving the SQL thread a chance to
purge the logs (freeing space). Finally, this patch removes the
resetting of the ignore_log_space_limit flag from purge_first_log,
because this is now reset by the IO thread every time it processes the
next event when the limit has been reached.
If the SQL thread is in a transaction, it cannot purge, so there is no
point in asking the IO thread to rotate. The only thing it can do is
to ask for more events until the transaction is over (then it can ask
the IO to rotate and purge the log right away). Otherwise, there would
be a deadlock (SQL would not be able to purge and IO thread would not
be able to queue events so that the SQL would finish the transaction).
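The one-event-at-a-time handshake can be modelled roughly like this. It is a single-threaded sketch under stated assumptions: the `RelaySpace` class, its method names and the `rotate_requested` flag are hypothetical, and only `ignore_log_space_limit` is a name taken from the description above.

```python
class RelaySpace:
    """Toy model of the relay-log space handshake (no real threads)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0
        self.ignore_log_space_limit = False
        self.rotate_requested = False  # hypothetical flag name

    def io_queue_event(self, size: int) -> bool:
        """IO thread: queue one event if allowed; True if it was queued."""
        if self.used >= self.limit and not self.ignore_log_space_limit:
            return False  # must wait for the SQL thread
        self.used += size
        # The permission covers exactly one event, so reset it right away.
        self.ignore_log_space_limit = False
        return True

    def sql_out_of_events(self, in_transaction: bool) -> None:
        """SQL thread: needs more events while the limit is reached."""
        if self.used >= self.limit:
            self.ignore_log_space_limit = True
            if not in_transaction:
                # Only outside a transaction can the log be purged,
                # so only then is a rotation worth asking for.
                self.rotate_requested = True


rs = RelaySpace(limit=100)
first = rs.io_queue_event(60)        # below the limit
second = rs.io_queue_event(60)       # still allowed, now over the limit
blocked = rs.io_queue_event(10)      # limit reached, flag not set
rs.sql_out_of_events(in_transaction=True)
one_more = rs.io_queue_event(10)     # exactly one event is let through
blocked_again = rs.io_queue_event(10)
```

The point of the design is visible in the last two calls: the IO thread gets one extra event per request instead of an open-ended window.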
locked until we have finished clean up.
Previously, the code released the lock without marking that the thread
was running. This allowed a new slave thread to start while the old one
was still in the middle of cleaning up, causing assertions and probably
general mayhem.
Problem : The root cause is the way the thread sleeps in mysql-5.5 and mysql-5.1
when STOP SLAVE is executed on the Windows platform.
On Windows, if STOP SLAVE is executed after the master dies, there is
a long wait before STOP SLAVE returns. This is because the thread is
sleeping, and the sleep is uninterruptible in the two versions above.
This was fixed by Davi's patch for BUG#11765860 in mysql-trunk. Backporting
his patch to mysql-5.5 fixes the problem.
Solution : A new pair of mutex and condition variable is introduced to synchronize thread
sleep and finalization. A new mutex is required because the slave threads are
terminated while holding the slave thread locks (run_lock), which can not be
relinquished during termination as this would affect the lock order.
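The general technique, a timed wait on a condition variable guarded by its own mutex, might look like the sketch below. The `InterruptibleSleeper` class and its method names are hypothetical and only illustrate the pattern; they are not the server's API.

```python
import threading
import time


class InterruptibleSleeper:
    def __init__(self) -> None:
        # Dedicated mutex, deliberately separate from any "run" lock.
        self._cond = threading.Condition(threading.Lock())
        self._killed = False

    def sleep(self, seconds: float) -> bool:
        """Sleep up to `seconds`; return True if interrupted by kill()."""
        deadline = time.monotonic() + seconds
        with self._cond:
            while not self._killed:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                # Timed wait: wakes on notify or when the time is up.
                self._cond.wait(remaining)
            return self._killed

    def kill(self) -> None:
        """Wake the sleeper early, for example on STOP SLAVE."""
        with self._cond:
            self._killed = True
            self._cond.notify_all()


sleeper = InterruptibleSleeper()
timer = threading.Timer(0.05, sleeper.kill)  # simulate a stop request
timer.start()
started = time.monotonic()
interrupted = sleeper.sleep(5.0)
elapsed = time.monotonic() - started
timer.join()
```

The loop re-checks the flag after every wake-up, so spurious wake-ups do not end the sleep early and a kill issued before the wait starts is not lost.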
mysql-test/suite/rpl/r/rpl_start_stop_slave.result:
The result file associated with the test added.
mysql-test/suite/rpl/t/rpl_start_stop_slave.test:
A test to check the new functionality.
sql/rpl_mi.cc:
The constructor using the new mutex and condition variables for the master_info.
sql/rpl_mi.h:
The condition variable and mutex have been added for the master_info.
sql/rpl_rli.cc:
The constructor using the new mutex and condition variables for the relay_log_info.
sql/rpl_rli.h:
The condition variable and mutex have been added for the relay_log_info.
sql/slave.cc:
Use a timed wait on a condition variable to implement an interruptible sleep.
The wait is registered with the THD object so that the thread will be woken
up if killed.
Fix typo causing too low timeout value for wait_for_slave_param.inc.
Fix binlog checksums following 5.5 merge.
Make sure the rpl suite can run with --mysqld=--binlog-checksum=CRC32
Fix a number of problems in the code when checksums are enabled.
dbug/tests.c:
Added __attribute__((unused)) to get rid of compiler warning
server-tools/instance-manager/guardian.cc:
Added __attribute__((unused)) to get rid of compiler warning
sql/filesort.cc:
Added __attribute__((unused)) to get rid of compiler warning
sql/slave.cc:
Added __attribute__((unused)) to get rid of compiler warning
sql/sql_load.cc:
Added __attribute__((unused)) to get rid of compiler warning
sql/sql_table.cc:
Added __attribute__((unused)) to get rid of compiler warning
storage/maria/ma_blockrec.c:
Added __attribute__((unused)) to get rid of compiler warning
storage/maria/ma_check.c:
Added missing cast
storage/maria/ma_loghandler.c:
Added __attribute__((unused)) to get rid of compiler warning
storage/maria/ma_recovery.c:
Added __attribute__((unused)) to get rid of compiler warning
storage/pbxt/src/cache_xt.cc:
Added __attribute__((unused)) to get rid of compiler warning
storage/xtradb/fil/fil0fil.c:
Removed unused variable
storage/xtradb/handler/ha_innodb.cc:
Use unused variable
vio/viosocket.c:
Remove usage of unused variable
vio/viosslfactories.c:
Added cast
Passing an empty user to the connect function causes
valgrind warnings. It seems that the client code is not prepared
to handle empty users. On 5.6 this can even be triggered by
START SLAVE PASSWORD='...'; i.e., without setting USER='...' on
the START SLAVE command (see WL#4143 for details on the new
additional START SLAVE commands).
To fix this, we disallow empty users when configuring the slave
connection parameters (this decision might be revisited if the
client code accepts empty users in the future).
sql/slave.cc:
We throw an error if an empty user is supplied to the connection
function.
sql/sql_insert.cc:
CREATE ... IF NOT EXISTS may do nothing, but
it is still not a failure. Don't forget to my_ok it.
sql/sql_table.cc:
small cleanup
Treat ER_CONNECTION_KILLED the same as ER_SERVER_SHUTDOWN in replication (to get rid of a possible warning in the error log)
sql/slave.cc:
Treat ER_CONNECTION_KILLED the same as ER_SERVER_SHUTDOWN
- If USER is given, all threads for that user are signaled
- If SOFT is used then the KILL will not be sent to the handler. This can be used to avoid interrupting critical operations in the handler, like 'REPAIR'.
Internally added more kill signals. This gives us more information about why a query/connection was killed.
- KILL_SERVER is used when the server is going down. In this case the user gets ER_SHUTDOWN as the reason the connection was killed.
- Changed the signals to be numbered in the correct order, which makes it easier to test how a signal should affect the code.
- New error message ER_CONNECTION_KILLED if the connection was killed by 'KILL CONNECTION'. Before, we got error ER_SHUTDOWN.
Renamed the unused parameters KILL_QUERY & KILL_CONNECTION to mysql_kill() so they do not conflict with defines in the server
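One way to picture ordered kill signals with a HARD bit is the sketch below. All numeric values, the `KILL_HARD_BIT` constant and the helper functions are assumptions made for illustration; the source only states that the signals are numbered in order, that a HARD bit decides whether the storage engine is signaled, and which error names are involved.

```python
from enum import IntEnum

KILL_HARD_BIT = 1  # hypothetical encoding: lowest bit marks a HARD kill


class KillLevel(IntEnum):
    # Hypothetical values, ordered by severity so that ranges can be tested.
    NOT_KILLED = 0
    KILL_QUERY = 4
    KILL_QUERY_HARD = 5
    KILL_CONNECTION = 6
    KILL_CONNECTION_HARD = 7
    KILL_SERVER = 8
    KILL_SERVER_HARD = 9


def notify_storage_engine(level: KillLevel) -> bool:
    """Only a HARD kill is passed on to the handler."""
    return bool(level & KILL_HARD_BIT)


def aborts_connection(level: KillLevel) -> bool:
    """Ordered numbering turns this into a single comparison."""
    return level >= KillLevel.KILL_CONNECTION


def killed_error_name(level: KillLevel) -> str:
    """Pick an error name from the kill reason (names only, no codes)."""
    if level >= KillLevel.KILL_SERVER:
        return "ER_SERVER_SHUTDOWN"
    if level >= KillLevel.KILL_CONNECTION:
        return "ER_CONNECTION_KILLED"
    if level >= KillLevel.KILL_QUERY:
        return "ER_QUERY_INTERRUPTED"  # not mentioned above; assumed here
    return ""
```

With this layout, a soft kill such as `KillLevel.KILL_QUERY` still stops the query at the SQL layer but leaves a running handler operation alone.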
include/mysql.h.pp:
Updated file
include/mysql_com.h:
Renamed the unused parameters KILL_QUERY & KILL_CONNECTION to mysql_kill() so they do not conflict with defines in the server
mysql-test/r/kill.result:
Added test of KILL USER
mysql-test/suite/rpl/r/rpl_stm_000001.result:
Updated error code
mysql-test/suite/rpl/t/rpl_stm_000001.test:
Updated error codes
mysql-test/t/flush_read_lock_kill.test:
Updated error codes
mysql-test/t/kill.test:
Added test of KILL USER
plugin/handler_socket/handlersocket/database.cpp:
Removed THD:: from KILL
sql/debug_sync.cc:
Removed THD:: from KILL
sql/event_scheduler.cc:
Removed THD:: from KILL
sql/filesort.cc:
Removed THD:: from KILL
sql/ha_ndbcluster_binlog.cc:
Removed THD:: from KILL
sql/handler.cc:
Removed THD:: from KILL
Simplify code.
sql/lex.h:
Added new keywords HARD | SOFT
sql/log.cc:
Removed THD:: from KILL
Added testing of new error ER_CONNECTION_KILLED
sql/log_event.cc:
Removed THD:: from KILL
Added testing of new error ER_CONNECTION_KILLED
sql/mysql_priv.h:
Added new prototypes
sql/mysqld.cc:
Removed THD:: from KILL
Use KILL_SERVER_HARD signal on shutdown.
sql/scheduler.cc:
Removed THD:: from KILL
Simplify test if connection should be killed
sql/share/errmsg.txt:
New error message ER_CONNECTION_KILLED
sql/slave.cc:
Removed THD:: from KILL
sql/sp_head.cc:
Removed THD:: from KILL
sql/sql_base.cc:
Removed THD:: from KILL
sql/sql_cache.cc:
Removed THD:: from KILL
sql/sql_class.cc:
Removed THD:: from KILL
Added killed_errno()
Only signal kill to storage engine if HARD bit is set.
sql/sql_class.h:
Move KILL options out from THD to make them easier to use in sql_yacc.yy
sql/sql_connect.cc:
Removed THD:: from KILL
sql/sql_delete.cc:
Removed THD:: from KILL
sql/sql_error.cc:
Removed THD:: from KILL
sql/sql_insert.cc:
Removed THD:: from KILL
Simplified testing if thread is killed.
sql/sql_lex.h:
Added kill options to st_lex
sql/sql_load.cc:
Removed THD:: from KILL
sql/sql_parse.cc:
Added kill options to st_lex
Simplified and optimized testing of thd->killed at end of query
Added support for KILL USER
Extended sql_kill() to allow use of more kill signals.
sql/sql_repl.cc:
Removed THD:: from KILL
sql/sql_show.cc:
Removed THD:: from KILL
Simplified testing if query/connection was killed
sql/sql_table.cc:
Removed THD:: from KILL
sql/sql_update.cc:
Removed THD:: from KILL
sql/sql_yacc.yy:
Added support for new KILL syntax: KILL [HARD|SOFT] [CONNECTION|QUERY] [ID | USER user_name]
storage/archive/ha_archive.cc:
Simplify compilation
storage/maria/ha_maria.cc:
Removed THD:: from KILL
Connection of slave to master using a replication account which authenticates
with an external plugin was not possible.
Fixed by making sure that the CLIENT_PLUGIN_AUTH capability is set when client connects using mysql_real_connect(). Also, a plugin-dir path used by client library to locate authentication plugins is set based on the analogous server setting. This is done in connect_to_master() function before a call to mysql_real_connect().
Give more information when finding an error in a MyISAM table.
When killing system thread, use KILL_SYSTEM_THREAD instead of KILL_CONNECTION to make it easier to ignore the signal in sensitive context (like auto-repair)
Added new kill level: KILL_SERVER, which will in the future be used to signal a kill caused by shutdown.
Add more warnings about killed connections when warning level > 3
include/myisamchk.h:
Added counting of printed info/notes
mysys/mf_iocache.c:
Remove duplicate assignment
sql/handler.cc:
Added test of KILL_SERVER
sql/log.cc:
Ignore new 'kill' error ER_NEW_ABORTING_CONNECTION when requesting query error code.
sql/mysqld.cc:
Add more warnings for killed connections when warning level > 3
sql/scheduler.cc:
Added checks for new kill signals
sql/slave.cc:
Ignore new kill signal ER_NEW_ABORTING_CONNECTION
sql/sp_head.cc:
Fixed assignment to bool
Added testing of new kill signals
sql/sql_base.cc:
Use KILL_SYSTEM_THREAD to auto-kill system threads
sql/sql_class.cc:
Add more warnings for killed connections when warning level > 3
thd_killed() now ignores KILL_BAD_DATA and THD::KILL_SYSTEM_THREAD as these should not abort sensitive operations.
sql/sql_class.h:
Added KILL_SYSTEM_THREAD and KILL_SERVER
sql/sql_connect.cc:
Added handling of KILL_SERVER
sql/sql_insert.cc:
Use KILL_SYSTEM_THREAD to auto-kill system threads
Added handling of KILL_SERVER
sql/sql_parse.cc:
Add more warnings for killed connections when warning level > 3
Added checking that thd->abort_on_warning is reset at end of query.
sql/sql_show.cc:
Update condition for when a query is 'killed'
storage/myisam/ha_myisam.cc:
Added counting of info/notes printed
storage/myisam/mi_check.c:
Always print an error if we find data errors when checking/repairing a MyISAM table.
When a repair was killed, don't retry repair.
Added assert if sort_get_next_record() returned an error without an error message.
Removed nonsense check "if (sort_param->read_cache.error < 0)" in repair.
storage/myisam/myisamchk.c:
Added counting of notes printed
storage/pbxt/src/thread_xt.cc:
Better error message.
Background: Backporting the fix for BUG 11752963 to the MySQL 5.1 branch.
Problem: The fix for bug 11752963 was only available for trunk and the 5.5 branch.
A partial fix has been pushed to the 5.1 branch as well.
Fix: Backported the fixes for bug 11752963 to the 5.1 branch.
1. Made all major changes needed to bring the 5.1 branch in line with 5.5 and the trunk.
2. Skipped the partial patch that was already applied to the 5.1 branch.
sql/rpl_rli.h:
Made inited volatile (see inline comments)
sql/slave.cc:
backported all changes from the fix of BUG#11752963.