The reason for the deadlock was an improper exit from
MDL_context::wait_for_locks() which caused mysys_var->current_mutex to remain
LOCK_mdl even though LOCK_mdl was no longer held by that connection.
This could for example lead to a deadlock in the following way:
1) INSERT DELAYED tries to open a table but fails, and trying to recover it
calls wait_for_locks().
2) Due to a pending exclusive request, wait_for_locks() fails and exits without
resetting mysys_var->current_mutex for the delayed insert handler thread. So it
continues to point to LOCK_mdl.
3) The handler thread manages to open a table.
4) A different connection takes LOCK_open and tries to take LOCK_mdl.
5) FLUSH TABLES from a third connection notices that the handler thread has a
table open, and tries to kill it. This involves locking mysys_var->current_mutex
while having LOCK_open locked. Since current_mutex mistakenly points to LOCK_mdl,
we have a deadlock.
This patch makes sure MDL_EXIT_COND() is called before exiting wait_for_locks().
This clears mysys->current_mutex which resolves the issue.
An assert is added to recover_from_failed_open_table_attempt() after
wait_for_locks() is called, to check that current_mutex is indeed reset.
With this assert in place, existing tests in (e.g.) mdl_sync.test will fail
without this patch.
The reason for the deadlock was an improper exit from
MDL_context::wait_for_locks() which caused mysys_var->current_mutex to remain
LOCK_mdl even though LOCK_mdl was no longer held by that connection.
This could for example lead to a deadlock in the following way:
1) INSERT DELAYED tries to open a table but fails, and trying to recover it
calls wait_for_locks().
2) Due to a pending exclusive request, wait_for_locks() fails and exits without
resetting mysys_var->current_mutex for the delayed insert handler thread. So it
continues to point to LOCK_mdl.
3) The handler thread manages to open a table.
4) A different connection takes LOCK_open and tries to take LOCK_mdl.
5) FLUSH TABLES from a third connection notices that the handler thread has a
table open, and tries to kill it. This involves locking mysys_var->current_mutex
while having LOCK_open locked. Since current_mutex mistakenly points to LOCK_mdl,
we have a deadlock.
This patch makes sure MDL_EXIT_COND() is called before exiting wait_for_locks().
This clears mysys->current_mutex which resolves the issue.
An assert is added to recover_from_failed_open_table_attempt() after
wait_for_locks() is called, to check that current_mutex is indeed reset.
With this assert in place, existing tests in (e.g.) mdl_sync.test will fail
without this patch.
-----------------------------------------------------------
2630.28.28 Magne Mahre 2008-12-05
Bug #38661 'all threads hang in "opening tables" or "waiting for table"
and cpu is at 100%'
Concurrent execution of FLUSH TABLES statement and at least two statements
using the same table might have led to live-lock which caused all three
connections to stall and hog 100% of CPU.
tdc_wait_for_old_versions() wrongly assumed that there cannot be a share
with an old version and no used TABLE instances and thus was failing to
perform wait in situation when such old share was cached in MDL subsystem
thanks to a still active metadata lock on the table. So it might have
happened that two or more connections simultaneously executing statements
which involve table being flushed managed to prevent each other from
waiting in this function by keeping shared metadata lock on the table
constantly active (i.e. one of the statements managed to take/hold this
lock while other statements were calling tdc_wait_for_old_versions()).
Thus they were forcing each other to loop infinitely in open_tables() -
close_thread_tables_for_reopen() - tdc_wait_for_old_versions() cycle
causing CPU hogging.
This patch fixes this problem by removing this false assumption from
tdc_wait_for_old_versions().
Note that the problem is specific only for server versions >= 6.0.
No test case is submitted for this test, as the test infrastructure
hasn't got the necessary primitives to test the behaviour. The
manifestation is that throughput will decrease to a low level
(possibly 0) after some time, and stay at that level. Several
transactions will not complete.
Manual testing can be done by running the code submitted by Shane
Bester attached to the bug report. If the bug persists, the
transaction thruput will almost immediately drop to near zero
(shown as the transaction count output from the test program staying
on a close to constant value, instead of increasing rapidly).
-----------------------------------------------------------
2630.28.28 Magne Mahre 2008-12-05
Bug #38661 'all threads hang in "opening tables" or "waiting for table"
and cpu is at 100%'
Concurrent execution of FLUSH TABLES statement and at least two statements
using the same table might have led to live-lock which caused all three
connections to stall and hog 100% of CPU.
tdc_wait_for_old_versions() wrongly assumed that there cannot be a share
with an old version and no used TABLE instances and thus was failing to
perform wait in situation when such old share was cached in MDL subsystem
thanks to a still active metadata lock on the table. So it might have
happened that two or more connections simultaneously executing statements
which involve table being flushed managed to prevent each other from
waiting in this function by keeping shared metadata lock on the table
constantly active (i.e. one of the statements managed to take/hold this
lock while other statements were calling tdc_wait_for_old_versions()).
Thus they were forcing each other to loop infinitely in open_tables() -
close_thread_tables_for_reopen() - tdc_wait_for_old_versions() cycle
causing CPU hogging.
This patch fixes this problem by removing this false assumption from
tdc_wait_for_old_versions().
Note that the problem is specific only for server versions >= 6.0.
No test case is submitted for this test, as the test infrastructure
hasn't got the necessary primitives to test the behaviour. The
manifestation is that throughput will decrease to a low level
(possibly 0) after some time, and stay at that level. Several
transactions will not complete.
Manual testing can be done by running the code submitted by Shane
Bester attached to the bug report. If the bug persists, the
transaction thruput will almost immediately drop to near zero
(shown as the transaction count output from the test program staying
on a close to constant value, instead of increasing rapidly).
----------------------------------------------------
2736.2.10 Michael Widenius 2008-10-22
Fix for bug#39395 Maria: ma_extra.c:286: maria_extra:
Assertion `share->reopen == 1' failed
sql/sql_base.cc:
Race condition in wait_while_table_is_used() where a table used
by another connection could be forced closed, but there was no protection against the other thread re-opening the table and trying to lock it
again before the table was name locked by original thread.
(diagnostics_area)
Execution of CREATE TABLE ... SELECT statement was not atomic in
the sense that concurrent statements trying to affect its target
table might have sneaked in between the moment when the table was
created and moment when it was filled according to SELECT clause.
This resulted in inconsistent binary log, unexpected target table
contents. In cases when concurrent statement was a DDL statement
CREATE TABLE ... SELECT might have failed with ER_CANT_LOCK error.
In more detail:
Due to premature metadata lock downgrade which occured after CREATE
TABLE SELECT statement created table but before it managed to obtain
table-level lock on it other statements were allowed to open, lock
and change target table in the middle of CREATE TABLE SELECT
execution. This also meant that it was possible that CREATE TABLE
SELECT would wait in mysql_lock_tables() when it was called for newly
created table and that this wait could have been aborted by concurrent
DDL. The latter led to execution of unexpected branch of code and
CREATE TABLE SELECT ending with ER_CANT_LOCK error.
The premature downgrade occured because open_table(), which was called
for newly created table, decided that it is OK to downgrade metadata
lock from exclusive to shared since table exists, even although it
was not acquired within this call.
This fix ensures that open_table() does not downgrade metadata lock
if it is not acquired during its current invocation.
Testing:
The bug is exposed in a race condition, and is thus difficult to
expose in a standard mysql-test-run test case. Instead, a stress
test using the Random Query Generator (https://launchpad.net/randgen)
will trip the problem occasionally.
% perl runall.pl \
--basedir=<build dir> \
--mysqld=--table-lock-wait-timeout=5 \
--mysqld=--skip-safemalloc \
--grammar=conf/maria_bulk_insert.yy \
--reporters=ErrorLog,Backtrace,WinPackage \
--mysqld=--log-output=file \
--queries=100000 \
--threads=10 \
--engine=myisam
Note: You will need a debug build to expose the bug
When the bug is tripped, the server will abort and dump core.
Backport from 6.0-codebase (revid: 2617.53.4)
(diagnostics_area)
Execution of CREATE TABLE ... SELECT statement was not atomic in
the sense that concurrent statements trying to affect its target
table might have sneaked in between the moment when the table was
created and moment when it was filled according to SELECT clause.
This resulted in inconsistent binary log, unexpected target table
contents. In cases when concurrent statement was a DDL statement
CREATE TABLE ... SELECT might have failed with ER_CANT_LOCK error.
In more detail:
Due to premature metadata lock downgrade which occured after CREATE
TABLE SELECT statement created table but before it managed to obtain
table-level lock on it other statements were allowed to open, lock
and change target table in the middle of CREATE TABLE SELECT
execution. This also meant that it was possible that CREATE TABLE
SELECT would wait in mysql_lock_tables() when it was called for newly
created table and that this wait could have been aborted by concurrent
DDL. The latter led to execution of unexpected branch of code and
CREATE TABLE SELECT ending with ER_CANT_LOCK error.
The premature downgrade occured because open_table(), which was called
for newly created table, decided that it is OK to downgrade metadata
lock from exclusive to shared since table exists, even although it
was not acquired within this call.
This fix ensures that open_table() does not downgrade metadata lock
if it is not acquired during its current invocation.
Testing:
The bug is exposed in a race condition, and is thus difficult to
expose in a standard mysql-test-run test case. Instead, a stress
test using the Random Query Generator (https://launchpad.net/randgen)
will trip the problem occasionally.
% perl runall.pl \
--basedir=<build dir> \
--mysqld=--table-lock-wait-timeout=5 \
--mysqld=--skip-safemalloc \
--grammar=conf/maria_bulk_insert.yy \
--reporters=ErrorLog,Backtrace,WinPackage \
--mysqld=--log-output=file \
--queries=100000 \
--threads=10 \
--engine=myisam
Note: You will need a debug build to expose the bug
When the bug is tripped, the server will abort and dump core.
Backport from 6.0-codebase (revid: 2617.53.4)
2630.16.14 Sergei Golubchik 2008-08-25
fixed a crash in partition tests
introduced by HA_EXTRA_PREPARE_FOR_DROP patch
sql/sql_base.cc:
Don't call ::extra() for closed tables.
Bug #46654 False deadlock on concurrent DML/DDL with partitions,
inconsistent behavior
The problem was that if one connection is running a multi-statement
transaction which involves a single partitioned table, and another
connection attempts to alter the table, the first connection gets
ER_LOCK_DEADLOCK and cannot proceed anymore, even when the ALTER TABLE
statement in another connection has timed out or failed.
The reason for this was that the prepare phase for ALTER TABLE for
partitioned tables removed all instances of the table from the table
definition cache before it started waiting on the lock. The transaction
running in the first connection would notice this and report ER_LOCK_DEADLOCK.
This patch changes the prep_alter_part_table() ALTER TABLE code so that
tdc_remove_table() is no longer called. Instead, only the TABLE instance
changed by prep_alter_part_table() is marked as needing reopen.
The patch also removes an unnecessary call to tdc_remove_table() from
mysql_unpack_partition() as the changed TABLE object is destroyed by the
caller at a later point.
Test case added in partition_sync.test.
Bug #46654 False deadlock on concurrent DML/DDL with partitions,
inconsistent behavior
The problem was that if one connection is running a multi-statement
transaction which involves a single partitioned table, and another
connection attempts to alter the table, the first connection gets
ER_LOCK_DEADLOCK and cannot proceed anymore, even when the ALTER TABLE
statement in another connection has timed out or failed.
The reason for this was that the prepare phase for ALTER TABLE for
partitioned tables removed all instances of the table from the table
definition cache before it started waiting on the lock. The transaction
running in the first connection would notice this and report ER_LOCK_DEADLOCK.
This patch changes the prep_alter_part_table() ALTER TABLE code so that
tdc_remove_table() is no longer called. Instead, only the TABLE instance
changed by prep_alter_part_table() is marked as needing reopen.
The patch also removes an unnecessary call to tdc_remove_table() from
mysql_unpack_partition() as the changed TABLE object is destroyed by the
caller at a later point.
Test case added in partition_sync.test.
Bug#42546 Backup: RESTORE fails, thinking it finds an existing table
The problem occured when a MDL locking conflict happened for a non-existent
table between a CREATE and a INSERT statement. The code for CREATE
interpreted this lock conflict to mean that the table existed,
which meant that the statement failed when it should not have.
The problem could occur for CREATE TABLE, CREATE TABLE LIKE and
ALTER TABLE RENAME.
This patch fixes the problem for CREATE TABLE and CREATE TABLE LIKE.
It is based on code backported from the mysql-6.1-fk tree written
by Dmitry Lenev. CREATE now uses normal open_and_lock_tables() code
to acquire exclusive locks. This means that for the test case in the bug
description, CREATE will wait until INSERT completes so that it can
get the exclusive lock. This resolves the reported bug.
The patch also prohibits CREATE TABLE and CREATE TABLE LIKE under
LOCK TABLES. Note that this is an incompatible change and must
be reflected in the documentation. Affected test cases have been
updated.
mdl_sync.test contains tests for CREATE TABLE and CREATE TABLE LIKE.
Fixing the issue for ALTER TABLE RENAME is beyond the scope of this
patch. ALTER TABLE cannot be prohibited from working under LOCK TABLES
as this could seriously impact customers and a proper fix would require
a significant rewrite.
Bug#42546 Backup: RESTORE fails, thinking it finds an existing table
The problem occured when a MDL locking conflict happened for a non-existent
table between a CREATE and a INSERT statement. The code for CREATE
interpreted this lock conflict to mean that the table existed,
which meant that the statement failed when it should not have.
The problem could occur for CREATE TABLE, CREATE TABLE LIKE and
ALTER TABLE RENAME.
This patch fixes the problem for CREATE TABLE and CREATE TABLE LIKE.
It is based on code backported from the mysql-6.1-fk tree written
by Dmitry Lenev. CREATE now uses normal open_and_lock_tables() code
to acquire exclusive locks. This means that for the test case in the bug
description, CREATE will wait until INSERT completes so that it can
get the exclusive lock. This resolves the reported bug.
The patch also prohibits CREATE TABLE and CREATE TABLE LIKE under
LOCK TABLES. Note that this is an incompatible change and must
be reflected in the documentation. Affected test cases have been
updated.
mdl_sync.test contains tests for CREATE TABLE and CREATE TABLE LIKE.
Fixing the issue for ALTER TABLE RENAME is beyond the scope of this
patch. ALTER TABLE cannot be prohibited from working under LOCK TABLES
as this could seriously impact customers and a proper fix would require
a significant rewrite.
------------------------------------------------------------
revno: 2617.68.25
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-next-bg-pre2-2
timestamp: Wed 2009-09-16 18:26:50 +0400
message:
Follow-up for one of pre-requisite patches for fixing bug #30977
"Concurrent statement using stored function and DROP FUNCTION
breaks SBR".
Made enum_mdl_namespace enum part of MDL_key class and removed MDL_
prefix from the names of enum members. In order to do the latter
changed name of PROCEDURE symbol to PROCEDURE_SYM (otherwise macro
which was automatically generated for this symbol conflicted with
MDL_key::PROCEDURE enum member).
------------------------------------------------------------
revno: 2617.68.25
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-next-bg-pre2-2
timestamp: Wed 2009-09-16 18:26:50 +0400
message:
Follow-up for one of pre-requisite patches for fixing bug #30977
"Concurrent statement using stored function and DROP FUNCTION
breaks SBR".
Made enum_mdl_namespace enum part of MDL_key class and removed MDL_
prefix from the names of enum members. In order to do the latter
changed name of PROCEDURE symbol to PROCEDURE_SYM (otherwise macro
which was automatically generated for this symbol conflicted with
MDL_key::PROCEDURE enum member).
------------------------------------------------------------
revno: 2617.68.24
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-next-bg-pre2-2
timestamp: Wed 2009-09-16 17:25:29 +0400
message:
Pre-requisite patch for fixing bug #30977 "Concurrent statement
using stored function and DROP FUNCTION breaks SBR".
Added MDL_request for stored routine as member to Sroutine_hash_entry
in order to be able perform metadata locking for stored routines in
future (Sroutine_hash_entry is an equivalent of TABLE_LIST class for
stored routines).
(WL#4284, follow up fixes).
sql/mdl.cc:
Introduced version of MDL_request::init() method which initializes
lock request using pre-built MDL key.
MDL_key::table_name/table_name_length() getters were
renamed to reflect the fact that MDL_key objects are
now created not only for tables.
sql/mdl.h:
Extended enum_mdl_namespace enum with values which correspond
to namespaces for stored functions and triggers.
Renamed MDL_key::table_name/table_name_length() getters
to MDL_key::name() and name_length() correspondingly to
reflect the fact that MDL_key objects are now created
not only for tables.
Added MDL_key::mdl_namespace() getter.
Also added version of MDL_request::init() method which
initializes lock request using pre-built MDL key.
sql/sp.cc:
Added MDL_request for stored routine as member to Sroutine_hash_entry.
Changed code to use MDL_key from this request as a key for LEX::sroutines
set. Removed separate "key" member from Sroutine_hash_entry as it became
unnecessary.
sql/sp.h:
Added MDL_request for stored routine as member to Sroutine_hash_entry
in order to be able perform metadata locking for stored routines in
future (Sroutine_hash_entry is an equivalent of TABLE_LIST class for
stored routines).
Removed Sroutine_hash_entry::key member as now we can use MDL_key from
this request as a key for LEX::sroutines set.
sql/sp_head.cc:
Removed sp_name::m_sroutines_key member and set_routine_type() method.
Since key for routine in LEX::sroutines set has no longer sp_name::m_qname
as suffix we won't save anything by creating it at sp_name construction
time.
Adjusted sp_name constructor used for creating temporary objects for
lookups in SP-cache to accept MDL_key as parameter and to avoid any
memory allocation.
Finally, removed sp_head::m_soutines_key member for reasons similar
to why sp_name::m_sroutines_key was removed
sql/sp_head.h:
Removed sp_name::m_sroutines_key member and set_routine_type() method.
Since key for routine in LEX::sroutines set has no longer sp_name::m_qname
as suffix we won't save anything by creating it at sp_name construction
time.
Adjusted sp_name constructor used for creating temporary objects for
lookups in SP-cache to accept MDL_key as parameter and to avoid any
memory allocation.
Finally, removed sp_head::m_soutines_key member for reasons similar
to why sp_name::m_sroutines_key was removed.
sql/sql_base.cc:
Adjusted code to the fact that we now use MDL_key from
Sroutine_hash_entry::mdl_request as a key for LEX::sroutines set.
MDL_key::table_name/table_name_length() getters were
renamed to reflect the fact that MDL_key objects are
now created not only for tables.
sql/sql_trigger.cc:
sp_add_used_routine() now takes MDL_key as parameter as now we use
instance of this class as a key for LEX::sroutines set.
------------------------------------------------------------
revno: 2617.68.24
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-next-bg-pre2-2
timestamp: Wed 2009-09-16 17:25:29 +0400
message:
Pre-requisite patch for fixing bug #30977 "Concurrent statement
using stored function and DROP FUNCTION breaks SBR".
Added MDL_request for stored routine as member to Sroutine_hash_entry
in order to be able perform metadata locking for stored routines in
future (Sroutine_hash_entry is an equivalent of TABLE_LIST class for
stored routines).
(WL#4284, follow up fixes).
revno: 2617.68.23
committer: Dmitry Lenev <dlenev@mysql.com>
branch nick: mysql-next-bg-pre1
timestamp: Wed 2009-09-16 09:34:42 +0400
message:
Pre-requisite patch for fixing bug #30977 "Concurrent statement
using stored function and DROP FUNCTION breaks SBR".
CREATE TABLE SELECT statements take exclusive metadata lock on table
being created. Invariant of metadata locking subsystem states that
such lock should be taken before taking any kind of shared locks.
Once metadata locks on stored routines are introduced statements like
"CREATE TABLE ... SELECT f1()" will break this invariant by taking
shared locks on routines before exclusive lock on target table.
To avoid this, open_tables() is reworked to process tables which are
directly used by the statement before stored routines are processed.
sql/sql_base.cc:
Refactored open_tables() implementation to process stored routines
only after tables which are directly used by statement were processed.
To achieve this moved handling of routines in open_tables() out of
loop which iterates over tables to a new separate loop. And in its
turn this allowed to split handling of particular table or view to
an auxiliary function, which made code in open_tables() simpler and
more easy to understand.