1
0
mirror of https://github.com/MariaDB/server.git synced 2025-07-30 16:24:05 +03:00

Patch that refactors global read lock implementation and fixes

bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ
LOCK" and bug #54673 "It takes too long to get readlock for
'FLUSH TABLES WITH READ LOCK'".

The first bug manifested itself as a deadlock which occurred
when a connection, which had some table open through HANDLER
statement, tried to update some data through DML statement
while another connection tried to execute FLUSH TABLES WITH
READ LOCK concurrently.

What happened was that FTWRL in the second connection managed
to perform first step of GRL acquisition and thus blocked all
upcoming DML. After that it started to wait for table open
through HANDLER statement to be flushed. When the first connection
tried to execute DML it has started to wait for GRL/the second
connection creating deadlock.

The second bug manifested itself as starvation of FLUSH TABLES
WITH READ LOCK statements in cases when there was a constant
stream of concurrent DML statements (in two or more
connections).

This has happened because requests for protection against GRL
which were acquired by DML statements were ignoring presence of
pending GRL and thus the latter was starved.

This patch solves both these problems by re-implementing GRL
using metadata locks.

Similar to the old implementation acquisition of GRL in new
implementation is two-step. During the first step we block
all concurrent DML and DDL statements by acquiring global S
metadata lock (each DML and DDL statement acquires global IX
lock for its duration). During the second step we block commits
by acquiring global S lock in COMMIT namespace (commit code
acquires global IX lock in this namespace).

Note that unlike in old implementation acquisition of
protection against GRL in DML and DDL is semi-automatic.
We assume that any statement which should be blocked by GRL
will either open and acquires write-lock on tables or acquires
metadata locks on objects it is going to modify. For any such
statement global IX metadata lock is automatically acquired
for its duration.

The first problem is solved because waits for GRL become
visible to deadlock detector in metadata locking subsystem
and thus deadlocks like one in the first bug become impossible.

The second problem is solved because global S locks which
are used for GRL implementation are given preference over
IX locks which are acquired by concurrent DML (and we can
switch to fair scheduling in future if needed).

Important change:
FTWRL/GRL no longer blocks DML and DDL on temporary tables.
Before this patch behavior was not consistent in this respect:
in some cases DML/DDL statements on temporary tables were
blocked while in others they were not. Since the main use cases
for FTWRL are various forms of backups and temporary tables are
not preserved during backups we have opted for consistently
allowing DML/DDL on temporary tables during FTWRL/GRL.

Important change:
This patch changes thread state names which are used when
DML/DDL of FTWRL is waiting for global read lock. It is now
either "Waiting for global read lock" or "Waiting for commit
lock" depending on the stage on which FTWRL is.

Incompatible change:
To solve deadlock in events code which was exposed by this
patch we have to replace LOCK_event_metadata mutex with
metadata locks on events. As result we have to prohibit
DDL on events under LOCK TABLES.

This patch also adds extensive test coverage for interaction
of DML/DDL and FTWRL.

Performance of new and old global read lock implementations
in sysbench tests were compared. There were no significant
difference between new and old implementations.
This commit is contained in:
Dmitry Lenev
2010-11-11 20:11:05 +03:00
parent aee8fce233
commit 378cdc58c1
75 changed files with 5492 additions and 1243 deletions

View File

@ -0,0 +1,158 @@
#
# SUMMARY
# Check that a statement is compatible with FLUSH TABLES WITH READ LOCK.
#
# PARAMETERS
# $con_aux1 Name of the 1st aux connection to be used by this script.
# $con_aux2 Name of the 2nd aux connection to be used by this script.
# $statement The statement to be checked.
# $cleanup_stmt The statement to be run in order to revert effects of
# the statement to be checked.
# $skip_3rd_chk Skip the 3rd stage of checking. The purpose of the third
# stage is to check that metadata locks taken by this
# statement are compatible with metadata locks taken
# by FTWRL.
#
# EXAMPLE
# flush_read_lock.test
#
--disable_result_log
--disable_query_log
# Reset DEBUG_SYNC facility for safety.
set debug_sync= "RESET";
#
# First, check that the statement can be run under FTWRL.
#
flush tables with read lock;
--disable_abort_on_error
--eval $statement
--enable_abort_on_error
let $err= $mysql_errno;
if (!$err)
{
--echo Success: Was able to run '$statement' under FTWRL.
unlock tables;
if (`SELECT "$cleanup_stmt" <> ""`)
{
--eval $cleanup_stmt;
}
}
if ($err)
{
--echo Error: Wasn't able to run '$statement' under FTWRL!
unlock tables;
}
#
# Then check that this statement won't be blocked by FTWRL
# that is active in another connection.
#
connection $con_aux1;
flush tables with read lock;
connection default;
--send_eval $statement;
connection $con_aux1;
--enable_result_log
--enable_query_log
let $wait_condition=
select count(*) = 0 from information_schema.processlist
where info = "$statement";
--source include/wait_condition.inc
--disable_result_log
--disable_query_log
if ($success)
{
--echo Success: Was able to run '$statement' with FTWRL active in another connection.
connection default;
# Apparently statement was successfully executed and so
# was not blocked by FTWRL.
# To be safe against wait_condition.inc succeeding due to
# races let us first reap the statement being checked to
# ensure that it has been successfully executed.
--reap
connection $con_aux1;
unlock tables;
connection default;
}
if (!$success)
{
--echo Error: Wasn't able to run '$statement' with FTWRL active in another connection!
unlock tables;
connection default;
--reap
}
if (`SELECT "$cleanup_stmt" <> ""`)
{
--eval $cleanup_stmt;
}
if (`SELECT "$skip_3rd_check" = ""`)
{
#
# Finally, let us check that FTWRL will succeed if this statement
# is active but has already closed its tables.
#
connection default;
set debug_sync='execute_command_after_close_tables SIGNAL parked WAIT_FOR go';
--send_eval $statement;
connection $con_aux1;
set debug_sync="now WAIT_FOR parked";
--send flush tables with read lock
connection $con_aux2;
--enable_result_log
--enable_query_log
let $wait_condition=
select count(*) = 0 from information_schema.processlist
where info = "flush tables with read lock";
--source include/wait_condition.inc
--disable_result_log
--disable_query_log
if ($success)
{
--echo Success: Was able to run FTWRL while '$statement' was active in another connection.
connection $con_aux1;
# Apparently FTWRL was successfully executed and so was not blocked by
# the statement being checked. To be safe against wait_condition.inc
# succeeding due to races let us first reap the FTWRL to ensure that it
# has been successfully executed.
--reap
unlock tables;
set debug_sync="now SIGNAL go";
connection default;
--reap
}
if (!$success)
{
--echo Error: Wasn't able to run FTWRL while '$statement' was active in another connection!
set debug_sync="now SIGNAL go";
connection default;
--reap
connection $con_aux1;
--reap
unlock tables;
connection default;
}
set debug_sync= "RESET";
if (`SELECT "$cleanup_stmt" <> ""`)
{
--eval $cleanup_stmt;
}
}
--enable_result_log
--enable_query_log

View File

@ -0,0 +1,155 @@
#
# SUMMARY
# Check that a statement is incompatible with FLUSH TABLES WITH READ LOCK.
#
# PARAMETERS
# $con_aux1 Name of the 1st aux connection to be used by this script.
# $con_aux2 Name of the 2nd aux connection to be used by this script.
# $statement The statement to be checked.
# $cleanup_stmt1 The 1st statement to be run in order to revert effects
# of statement to be checked.
# $cleanup_stmt2 The 2nd statement to be run in order to revert effects
# of statement to be checked.
# $skip_3rd_chk Skip the 3rd stage of checking. The purpose of the third
# stage is to check that metadata locks taken by this
# statement are incompatible with metadata locks taken
# by FTWRL.
#
# EXAMPLE
# flush_read_lock.test
#
--disable_result_log
--disable_query_log
# Reset DEBUG_SYNC facility for safety.
set debug_sync= "RESET";
#
# First, check that the statement cannot be run under FTWRL.
#
flush tables with read lock;
--disable_abort_on_error
--eval $statement
--enable_abort_on_error
let $err= $mysql_errno;
if ($err)
{
--echo Success: Was not able to run '$statement' under FTWRL.
unlock tables;
}
if (!$err)
{
--echo Error: Was able to run '$statement' under FTWRL!
unlock tables;
if (`SELECT "$cleanup_stmt1" <> ""`)
{
--eval $cleanup_stmt1;
}
if (`SELECT "$cleanup_stmt2" <> ""`)
{
--eval $cleanup_stmt2;
}
}
#
# Then check that this statement is blocked by FTWRL
# that is active in another connection.
#
connection $con_aux1;
flush tables with read lock;
connection default;
--send_eval $statement;
connection $con_aux1;
--enable_result_log
--enable_query_log
let $wait_condition=
select count(*) = 1 from information_schema.processlist
where (state = "Waiting for global read lock" or
state = "Waiting for commit lock") and
info = "$statement";
--source include/wait_condition.inc
--disable_result_log
--disable_query_log
if ($success)
{
--echo Success: '$statement' is blocked by FTWRL active in another connection.
}
if (!$success)
{
--echo Error: '$statement' wasn't blocked by FTWRL active in another connection!
}
unlock tables;
connection default;
--reap
if (`SELECT "$cleanup_stmt1" <> ""`)
{
--eval $cleanup_stmt1;
}
if (`SELECT "$cleanup_stmt2" <> ""`)
{
--eval $cleanup_stmt2;
}
if (`SELECT "$skip_3rd_check" = ""`)
{
#
# Finally, let us check that FTWRL will not succeed if this
# statement is active but has already closed its tables.
#
connection default;
--eval set debug_sync='execute_command_after_close_tables SIGNAL parked WAIT_FOR go';
--send_eval $statement;
connection $con_aux1;
set debug_sync="now WAIT_FOR parked";
--send flush tables with read lock
connection $con_aux2;
--enable_result_log
--enable_query_log
let $wait_condition=
select count(*) = 1 from information_schema.processlist
where (state = "Waiting for global read lock" or
state = "Waiting for commit lock") and
info = "flush tables with read lock";
--source include/wait_condition.inc
--disable_result_log
--disable_query_log
if ($success)
{
--echo Success: FTWRL is blocked when '$statement' is active in another connection.
}
if (!$success)
{
--echo Error: FTWRL isn't blocked when '$statement' is active in another connection!
}
set debug_sync="now SIGNAL go";
connection default;
--reap
connection $con_aux1;
--reap
unlock tables;
connection default;
set debug_sync= "RESET";
if (`SELECT "$cleanup_stmt1" <> ""`)
{
--eval $cleanup_stmt1;
}
if (`SELECT "$cleanup_stmt2" <> ""`)
{
--eval $cleanup_stmt2;
}
}
--enable_result_log
--enable_query_log

View File

@ -1545,8 +1545,6 @@ lock table not_exists_write read;
--echo # We still have the read lock.
--error ER_CANT_UPDATE_WITH_READLOCK
drop table t1;
handler t1 read next;
handler t1 close;
handler t1 open;
select a from t2;
handler t1 read next;

View File

@ -101,7 +101,7 @@ if (`SELECT '$wait_for_all' = '1'`)
if (!$found)
{
echo # Timeout in include/wait_show_condition.inc for $wait_condition;
echo # Timeout in include/wait_show_condition.inc for $condition;
echo # show_statement : $show_statement;
echo # field : $field;
echo # condition : $condition;