subsystem. Fix a number of caveats that the previous
implementation suffered from, including unprotected
access to shared data and lax resource accounting
(share->ref_count) that could lead to deadlocks.
The new implementation still suffers from a number
of potential deadlocks in some edge cases, and is
still not enabled by default, especially since performance
testing has shown that it gives only marginal (not even
exceeding measuring accuracy) improvements.
@todo:
- Remove calls to close_cached_tables() with REFRESH_FAST,
and have_lock, because they break the MDL cache.
- rework FLUSH TABLES <list> to not use close_cached_tables()
- make sure that whenever we set TABLE_SHARE::version to
0 we free MDL cache references to it.
an atomic counter"
Split the large LOCK_open section in open_table().
Do not call open_table_from_share() under LOCK_open.
Remove thd->version.
This fixes
Bug#50589 "Server hang on a query evaluated using a temporary
table"
Bug#51557 "LOCK_open and kernel_mutex are not happy together"
Bug#49463 "LOCK_table and innodb are not nice when handler
instances are created".
This patch affects storage engines that rely on
the ha_open() PSEA method being called under LOCK_open.
In particular:
1) NDB is broken and left unfixed. NDB relies on LOCK_open
being kept as part of ha_open(), since it uses auto-discovery.
While previously the NDB open code was race-prone, now
it simply fails on asserts.
2) HEAP engine had a race in ha_heap::open(): a share
for the same table could be added twice to the list
of shares, or a dangling reference to a share could be
stored in a HEAP handler. This patch aims to address this
problem by 'pinning' the newly created share in the
internal HEAP engine share list until at least one
handler instance is created using that share.
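The 'pinning' idea can be illustrated with a toy model. This is
only a sketch under stated assumptions, not the HEAP engine code:
ShareRegistry, create_or_get(), attach_handler() and
detach_handler() are hypothetical names, and a Python dict stands
in for the engine's internal share list.

```python
import threading


class ShareRegistry:
    """Toy model of an engine-internal share list (all names hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._shares = {}  # table name -> share record

    def create_or_get(self, name):
        # Look up and insert under one lock, so the same table
        # can never be added to the list twice.
        with self._lock:
            share = self._shares.get(name)
            if share is None:
                # A new share starts pinned: it has no handlers yet,
                # but must not be freed before the first one attaches.
                share = {"name": name, "handlers": 0, "pinned": True}
                self._shares[name] = share
            return share

    def attach_handler(self, share):
        with self._lock:
            share["handlers"] += 1
            share["pinned"] = False  # the first handler releases the pin

    def detach_handler(self, share):
        with self._lock:
            share["handlers"] -= 1
            # Free only an unpinned share that has no handlers left.
            if share["handlers"] == 0 and not share["pinned"]:
                self._shares.pop(share["name"], None)

    def __contains__(self, name):
        with self._lock:
            return name in self._shares
```

In this model a share created by create_or_get() stays in the
list until a handler has attached and later detached, so no
window exists in which it can be freed or duplicated.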
make tdc_refresh_version an atomic counter".
To avoid orphaned TABLE_SHARE objects left in the
cache, make sure that wherever we set table->s->version
we take care of removing all unused table share objects
from the table cache.
Always set table->s->version under LOCK_open, to make sure
that no other connection sees an old value of the
version and adds the table to unused_tables list.
Add an assert to table_def_unuse_table() that we never
'unuse' a table of a share that has an old version.
With this patch, only three places are left in the code
that manipulate table->s->version:
- tdc_remove_table(). In most cases we have an X mdl lock
in tdc_remove_table(); the two remaining cases where we
don't are 'FLUSH TABLE' and mysql_admin_table().
- sql_view.cc - a crude hack that needs a separate fix
- initial assignment from refresh_version in table.cc.
mutex protecting thd->open_tables".
We should not manipulate table->s->version outside the
table definition cache code, but use the TDC API
to achieve the desired result.
Fix one violation: close_all_tables_for_name().
thd->open_tables"
thd->open_tables list is not normally accessed concurrently
except for one case: when the connection has open SQL
HANDLER tables, and we want to perform a DDL on the table,
we want to abort waits on MyISAM thr_lock of those connections
that prevent the DDL from proceeding, and iterate
over thd->open_tables list to find out the tables on which
the thread is waiting.
In 5.5 we mostly use deadlock detection and soft deadlock
prevention, as opposed to the "hard" deadlock prevention
of 5.1, which would abort any transaction that
may cause a deadlock. The only remaining case where
neither deadlock detection nor deadlock prevention
is implemented in 5.5 is HANDLER SQL, where we use
the good old thr_lock_abort() technique from 5.1.
Thus, replace use of LOCK_open to protect thd->open_tables
with thd->LOCK_ha_data (a lock protecting various session
private data).
This is a port of the work done for 5.5.4 for review
and inclusion into 5.5.5.
1) No mutex and no function call if we're not using
plugins.
2) If we're above the table definition cache limit,
delete the oldest unused share, not the share on our hands.
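Point 2 can be sketched as follows. This is a simplified
illustration, not the table definition cache code: ShareCache,
acquire() and release() are hypothetical names, and reference
counting is reduced to a plain integer per share.

```python
from collections import OrderedDict


class ShareCache:
    """Toy share cache that evicts the oldest unused entry (hypothetical)."""

    def __init__(self, limit):
        self.limit = limit
        self.shares = {}             # name -> reference count
        self.unused = OrderedDict()  # names with ref count 0, oldest first

    def acquire(self, name):
        if name in self.shares:
            self.unused.pop(name, None)  # no longer unused
            self.shares[name] += 1
        else:
            self.shares[name] = 1
        # Above the limit: drop the oldest unused share, never the
        # one that was just handed out (it is not in self.unused).
        while len(self.shares) > self.limit and self.unused:
            victim, _ = self.unused.popitem(last=False)
            del self.shares[victim]
        return name

    def release(self, name):
        self.shares[name] -= 1
        if self.shares[name] == 0:
            self.unused[name] = None  # becomes the newest unused share
```

If every cached share is in use, the loop finds nothing to evict
and the cache temporarily stays above its limit.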
locks for DML statements and changes the way MDL locks
are acquired/granted in contended case.
Instead of backing off when a lock conflict is encountered
and waiting for it to go away before restarting the open_tables()
process, we now wait for the lock to be released without releasing
any previously acquired locks. If the conflicting lock goes away,
we resume opening tables. If waiting leads to a deadlock, we
try to resolve it by backing off and restarting open_tables()
immediately.
As a result, both waiting for the possibility to acquire and
acquiring a metadata lock now always happen within the
same MDL API call. This made it possible to make releasing a lock
and granting it to the most appropriate pending request an
atomic operation.
Thanks to this, it became possible to wake up during release
of a lock only those waiters whose requests can be satisfied
at the moment, and to wake up only one waiter in the case
when granting its request would prevent all other requests
from being satisfied. This solves the thundering herd problem
which occurred in cases when we were releasing some lock and
woke up many waiters for SNRW or X locks (this was the issue
in bug#52289 "performance regression for MyISAM in sysbench
OLTP_RW test").
This also made it possible to implement fairer (FIFO) scheduling
among waiters with the same priority.
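The release-and-grant behaviour described above can be sketched
with a toy lock. This is an illustration only, not the MDL
implementation: ToyLock and compatible() are hypothetical, only
two lock kinds ("S" shared, "X" exclusive) are modelled, and
priorities are ignored so the queue is strictly FIFO.

```python
from collections import deque


def compatible(kind, granted_kinds):
    # Toy rule: shared requests coexist; exclusive needs the lock to itself.
    if kind == "X":
        return len(granted_kinds) == 0
    return "X" not in granted_kinds


class ToyLock:
    def __init__(self):
        self.granted = {}       # owner -> kind
        self.waiting = deque()  # (owner, kind) in arrival order

    def request(self, owner, kind):
        # Grant at once only if nobody is queued ahead (keeps FIFO order).
        if not self.waiting and compatible(kind, list(self.granted.values())):
            self.granted[owner] = kind
            return True
        self.waiting.append((owner, kind))
        return False

    def release(self, owner):
        """Release and grant to waiters in one step; return the owners woken."""
        del self.granted[owner]
        woken = []
        while self.waiting:
            next_owner, next_kind = self.waiting[0]
            if not compatible(next_kind, list(self.granted.values())):
                break  # the head must keep waiting, and so must those behind it
            self.waiting.popleft()
            self.granted[next_owner] = next_kind
            woken.append(next_owner)
        return woken
```

With waiters queued as S, S, X, S behind an exclusive holder, the
first release wakes only the two leading S requests; the X waiter
is woken alone once they are gone, and no waiter is woken just to
find that it still cannot proceed.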
It also opens the door for introducing new types of requests
for metadata locks such as low-prio SNRW lock which is
necessary in order to support LOCK TABLES LOW_PRIORITY WRITE.
Note that after this patch the server can sometimes report
an ER_LOCK_DEADLOCK error in cases where it was not reported
before. In particular, we will always report this error if
waiting for a conflicting lock happened in the middle of
a transaction and resulted in a deadlock. Before this patch
the error was not reported if the deadlock could have been
resolved by backing off all metadata locks acquired by
the current statement.
Conflicts:
Text conflict in mysql-test/r/archive.result
Contents conflict in mysql-test/r/innodb_bug38231.result
Text conflict in mysql-test/r/mdl_sync.result
Text conflict in mysql-test/suite/binlog/t/disabled.def
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_binlog_format_errors.result
Text conflict in mysql-test/t/archive.test
Contents conflict in mysql-test/t/innodb_bug38231.test
Text conflict in mysql-test/t/mdl_sync.test
Text conflict in sql/sp_head.cc
Text conflict in sql/sql_show.cc
Text conflict in sql/table.cc
Text conflict in sql/table.h
transactional SELECT and ALTER TABLE ... REBUILD PARTITION".
Make open flags part of Open_table_context.
This simplifies some code and (in future) makes it
possible to enforce the invariant that we don't, say,
request a back-off on the table when the
MYSQL_OPEN_IGNORE_FLUSH flag is set.
transactional SELECT and ALTER TABLE ... REBUILD PARTITION".
Move declarations of sql_base.cc classes to sql_base.h
(previously declared in sql_class.h).
Became possible after a header file split.
without FOR UPDATE is causing a lock".
SELECT statements with subqueries referencing InnoDB tables
were acquiring shared locks on rows in these tables when they
were executed in REPEATABLE-READ mode and with statement or
mixed mode binary logging turned on.
This was a regression which was introduced when fixing
bug 39843.
The problem was that for tables belonging to subqueries
the parser set TL_READ_DEFAULT as a lock type. When
statement/mixed binary logging was in use, this
type of lock was converted to a TL_READ_NO_INSERT lock at
open_tables() time and caused the InnoDB engine to acquire
shared locks on reads from these tables. Although in some
cases such behavior was correct (e.g. for subqueries in
DELETE), in the case of SELECT it caused unnecessary locking.
This patch implements a minimal version of the fix for the
specific problem described in the bug report, which is supposed
to be not too risky for pushing into the 5.1 tree.
The 5.5 tree already contains a more appropriate solution
which also addresses other related issues like bug 53921
"Wrong locks for SELECTs used stored functions may lead
to broken SBR".
This patch tries to solve the problem by ensuring that
TL_READ_DEFAULT lock which is set in the parser for
tables participating in subqueries at open_tables()
time is interpreted as TL_READ_NO_INSERT or TL_READ.
TL_READ is used only if we know that this is a SELECT
and that this particular table is not used by a stored
function.
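The rule in the previous paragraph can be written down as a small
decision function. This is a sketch only: resolve_read_lock() and
its parameters are hypothetical, the lock types are modelled as
strings, and the binary logging format checks are left out.

```python
TL_READ_DEFAULT = "TL_READ_DEFAULT"
TL_READ = "TL_READ"
TL_READ_NO_INSERT = "TL_READ_NO_INSERT"


def resolve_read_lock(lock_type, is_select, used_by_stored_function):
    """Map the parser-only TL_READ_DEFAULT to a concrete lock type."""
    if lock_type != TL_READ_DEFAULT:
        return lock_type  # explicitly chosen lock types are left untouched
    # The weaker lock is used only when both conditions are known to hold.
    if is_select and not used_by_stored_function:
        return TL_READ
    return TL_READ_NO_INSERT
```

Under this rule a subquery table in a plain SELECT resolves to
TL_READ, while the same table in a DELETE, or one reached through
a stored function, keeps the stronger TL_READ_NO_INSERT.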
Test coverage is added for both InnoDB and MyISAM.
This patch introduces an "incompatible" change in locking
scheme for subqueries used in SELECT ... FOR UPDATE and
SELECT ... IN SHARE MODE.
In 4.1 (as well as in 5.0 and 5.1 before fix for bug 39843)
the server would use a snapshot InnoDB read for subqueries
in SELECT FOR UPDATE and SELECT ... IN SHARE MODE statements,
regardless of whether the binary log is on or off.
If the user required a different type of read (i.e. locking
read), they could request it explicitly by providing a FOR
UPDATE/IN SHARE MODE clause for each individual subquery.
The patch for bug 39843 broke this behaviour (which was not
documented or tested), and started to use locking reads for
all subqueries in SELECT ... FOR UPDATE/IN SHARE MODE.
This patch restores 4.1 behaviour.
This patch should be mostly null-merged into 5.5 tree.
The thd->variables.option_bits & OPTION_BIN_LOG is currently abused:
it's both a system variable and an implementation switch. The current
approach to this option bit breaks the session variable encapsulation.
Besides, it is allowed to change @@session.sql_log_bin within a
transaction, which may lead to a transaction being logged incorrectly.
To fix the problems, we created a thd->variables variable to represent
"sql_log_bin" and prohibited its update inside a transaction or
sub-statement.
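The restriction can be pictured with a toy session object. This
is a sketch only: Session, SessionError and set_sql_log_bin() are
hypothetical names and do not correspond to the server's classes
or error codes.

```python
class SessionError(Exception):
    """Raised when a session variable may not be changed (hypothetical)."""


class Session:
    def __init__(self):
        self.sql_log_bin = True       # dedicated variable, not an option bit
        self.in_transaction = False
        self.in_sub_statement = False

    def set_sql_log_bin(self, value):
        # Changing the setting mid-transaction could leave part of the
        # transaction logged and part of it not, so the update is refused.
        if self.in_transaction or self.in_sub_statement:
            raise SessionError("cannot change sql_log_bin here")
        self.sql_log_bin = bool(value)
```

The check sits in the setter, so code that only reads the
variable never has to care whether a transaction is active.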
SELECT and ALTER TABLE ... REBUILD PARTITION".
ALTER TABLE on an InnoDB table (including partitioned tables)
acquired exclusive locks on rows of the table being altered.
When there was a concurrent transaction which did
locking reads from this table, this sometimes led to a
deadlock which was detected neither by the MDL subsystem nor
by the InnoDB engine (and was reported only after exceeding
innodb_lock_wait_timeout).
This problem stemmed from the fact that ALTER TABLE acquired
a TL_WRITE_ALLOW_READ lock on the table being altered. This lock
was interpreted as a write lock, and thus for the table being
altered the handler::external_lock() method was called with
F_WRLCK as an argument. As a result, the InnoDB engine treated
ALTER TABLE as an operation which is going to change data
and acquired LOCK_X locks on rows being read from the old
version of the table.
When there was a transaction which had already acquired
an SR metadata lock on the table and some LOCK_S locks on its rows
(e.g. by using it in a subquery of a DML statement), a concurrent
ALTER TABLE was blocked at the moment when it tried to
acquire a LOCK_X lock before reading one of these rows.
The transaction's attempt to acquire an SW metadata lock on
the table being altered led to a deadlock, since it had to wait
for ALTER TABLE to release its SNW lock. This deadlock was not
detected and got resolved only after the timeout expired,
because the waits were happening in two different subsystems.
Similar deadlocks could have occurred in other situations.
This patch tries to solve the problem by changing the ALTER TABLE
implementation to use a TL_READ_NO_INSERT lock instead of
TL_WRITE_ALLOW_READ. After this step handler::external_lock()
is called with F_RDLCK as an argument and the InnoDB engine
correctly interprets ALTER TABLE as an operation which only
reads data from the original version of the table. Thanks to this,
ALTER TABLE acquires only LOCK_S locks on rows it reads.
This, in turn, causes inter-subsystem deadlocks to go
away, as all potential lock conflicts and thus deadlocks will
be limited to the metadata locking subsystem:
- When ALTER TABLE reads rows from the table being altered it
can't encounter any locks which conflict with LOCK_S row
locks. There should be no concurrent transactions holding
LOCK_X row locks: such a transaction would first have had to
acquire an SW metadata lock on the table, which would have
conflicted with ALTER's SNW lock.
- Vice versa, when DML which runs concurrently with ALTER
TABLE tries to lock a row, it should be requesting only a LOCK_S
lock, which is compatible with the locks acquired by ALTER,
as otherwise such DML must own an SW metadata lock on the table,
which would be incompatible with ALTER's SNW lock.
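The two-sided argument above can be checked mechanically on a toy
model. This is a sketch, not server code: only the SR, SW and SNW
metadata lock types named in this message are modelled, all
function names are hypothetical, and the rule that only an SW
holder takes LOCK_X row locks is taken from the reasoning above.

```python
# Pairs of metadata lock types that may be held on one table at once.
# SW vs SNW conflict as stated above; SNW vs SNW is assumed to conflict too.
MDL_COMPATIBLE = {
    frozenset(["SR"]),
    frozenset(["SW"]),
    frozenset(["SR", "SW"]),
    frozenset(["SR", "SNW"]),
}


def mdl_compatible(a, b):
    return frozenset([a, b]) in MDL_COMPATIBLE


def strongest_row_lock(mdl_type):
    # Only a holder of SW may take LOCK_X row locks; readers (SR) and
    # ALTER TABLE (SNW, after this patch) take at most LOCK_S.
    return "LOCK_X" if mdl_type == "SW" else "LOCK_S"


def can_block_on_rows(mdl_a, mdl_b):
    """True if two concurrent holders could block each other on a row lock."""
    if not mdl_compatible(mdl_a, mdl_b):
        return False  # they never run concurrently: MDL serializes them
    return "LOCK_X" in (strongest_row_lock(mdl_a), strongest_row_lock(mdl_b))
```

In this model ALTER TABLE (SNW) can never block on a row lock
against either kind of concurrent statement, so any wait it
takes part in is visible to the metadata locking subsystem.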
The problem was that TRUNCATE TABLE didn't take an exclusive
lock on a table if it resorted to truncating via delete of
all rows in the table. Specifically for InnoDB tables, this
could break proper isolation as InnoDB ends up aborting some
granted locks when truncating a table.
The solution is to take an exclusive metadata lock before
TRUNCATE TABLE can proceed. This guarantees that no other
transaction is using the table.
Incompatible change: Truncate via delete no longer fails
if sql_safe_updates is activated (this was an undocumented
side effect).
transactional SELECT and ALTER TABLE ... REBUILD PARTITION".
The goal of this patch is to decouple type of metadata
lock acquired for table by open_tables() from type of
table-level lock to be acquired on it.
To achieve this we change the approach to how we determine what
type of metadata lock should be acquired on a table to be opened.
Now, instead of inferring it at open_tables() time from flags
and the type of table-level lock, we rely on the type of metadata
lock being properly set at parsing time and not changed
afterwards.
FOR UPDATE is causing a lock".
This patch tries to address problems which were exposed
during backporting of original patch to 5.1 tree.
- It ensures that we don't change locking behavior of simple
SELECT statements on InnoDB tables when they are executed
under LOCK TABLES ... READ and with @@innodb_table_locks=0.
Also we no longer pass TL_READ_DEFAULT/TL_WRITE_DEFAULT
lock types, which are supposed to be parser-only, to
handler::start_stmt() method.
- It makes check_/no_concurrent_insert.inc auxiliary scripts
more robust against changes in test cases that use them
and also ensures that they don't unnecessarily change
environment of caller.
mode
Post-push fix after backporting the patch to 5.1-bugteam:
1 - changed the name of some variables to be equivalent to pe.
2 - fixed that patch to mark a statement as unsafe when both a
self-logging engine and a regular engine are accessed and one
of them is updated.
Fix for bug #46947 "Embedded SELECT without FOR UPDATE is
causing a lock", with after-review fixes.
SELECT statements with subqueries referencing InnoDB tables
were acquiring shared locks on rows in these tables when they
were executed in REPEATABLE-READ mode and with statement or
mixed mode binary logging turned on.
This was a regression which was introduced when fixing
bug 39843.
The problem was that for tables belonging to subqueries
the parser set TL_READ_DEFAULT as a lock type. When
statement/mixed binary logging was in use, this
type of lock was converted to a TL_READ_NO_INSERT lock at
open_tables() time and caused the InnoDB engine to acquire
shared locks on reads from these tables. Although in some
cases such behavior was correct (e.g. for subqueries in
DELETE), in the case of SELECT it caused unnecessary locking.
This patch tries to solve this problem by rethinking our
approach to how we handle locking for SELECT and subqueries.
Now we always set the TL_READ_DEFAULT lock type for all cases
when we read data. Then at open_tables() time this lock
is interpreted as TL_READ_NO_INSERT or TL_READ depending
on whether this statement as a whole, or the call to a function
which uses the particular table, should be written to the
binary log or not (if yes, then the statement should be properly
serialized with concurrent statements and a stronger lock
should be acquired).
Test coverage is added for both InnoDB and MyISAM.
This patch introduces an "incompatible" change in locking
scheme for subqueries used in SELECT ... FOR UPDATE and
SELECT ... IN SHARE MODE.
In 4.1 the server would use a snapshot InnoDB read for
subqueries in SELECT FOR UPDATE and SELECT ... IN SHARE MODE
statements, regardless of whether the binary log is on or off.
If the user required a different type of read (i.e. locking read),
they could request it explicitly by providing a FOR UPDATE/IN SHARE MODE
clause for each individual subquery.
One of the patches for 5.0 broke this behaviour (which was not documented
or tested), and started to use locking reads for all subqueries in SELECT ...
FOR UPDATE/IN SHARE MODE. This patch restores 4.1 behaviour.
Adding my_global.h first in all files using
NO_EMBEDDED_ACCESS_CHECKS.
Correcting a merge problem resulting from a changed definition
of check_some_access compared to the original patches.
This patch:
- Moves all definitions from the mysql_priv.h file into
header files for the component where the variable is
defined
- Creates header files if the component lacks one
- Eliminates all include directives from mysql_priv.h
- Eliminates all circular include cycles
- Renames time.cc to sql_time.cc
- Renames mysql_priv.h to sql_priv.h
Conflicts:
Text conflict in client/mysqlbinlog.cc
Text conflict in mysql-test/Makefile.am
Text conflict in mysql-test/collections/default.daily
Text conflict in mysql-test/r/mysqlbinlog_row_innodb.result
Text conflict in mysql-test/suite/rpl/r/rpl_typeconv_innodb.result
Text conflict in mysql-test/suite/rpl/t/rpl_get_master_version_and_clock.test
Text conflict in mysql-test/suite/rpl/t/rpl_row_create_table.test
Text conflict in mysql-test/suite/rpl/t/rpl_slave_skip.test
Text conflict in mysql-test/suite/rpl/t/rpl_typeconv_innodb.test
Text conflict in mysys/charset.c
Text conflict in sql/field.cc
Text conflict in sql/field.h
Text conflict in sql/item.h
Text conflict in sql/item_func.cc
Text conflict in sql/log.cc
Text conflict in sql/log_event.cc
Text conflict in sql/log_event_old.cc
Text conflict in sql/mysqld.cc
Text conflict in sql/rpl_utility.cc
Text conflict in sql/rpl_utility.h
Text conflict in sql/set_var.cc
Text conflict in sql/share/Makefile.am
Text conflict in sql/sql_delete.cc
Text conflict in sql/sql_plugin.cc
Text conflict in sql/sql_select.cc
Text conflict in sql/sql_table.cc
Text conflict in storage/example/ha_example.h
Text conflict in storage/federated/ha_federated.cc
Text conflict in storage/myisammrg/ha_myisammrg.cc
Text conflict in storage/myisammrg/myrg_open.c
concurrent I_S query
There were two problems:
1) MYSQL_LOCK_IGNORE_FLUSH also ignored name locks
2) there was a race between abort_and_upgrade_locks and
alter_close_tables
(i.e. remove_table_from_cache and
close_data_files_and_morph_locks)
which allowed the table to be opened with the MYSQL_LOCK_IGNORE_FLUSH flag,
resulting in renaming a partition that was already in use,
which could cause the table to be unusable.
The solution was to not allow IGNORE_FLUSH to skip waiting for
a name-locked table,
and to not release the LOCK_open mutex between the
calls to remove_table_from_cache and
close_data_files_and_morph_locks, by merging the functions
abort_and_upgrade_locks and alter_close_tables.
DDL no longer aborts mysql_lock_tables(), and hence
we no longer need to support need_reopen flag of this
call.
Remove the flag, and all the code in the server
that was responsible for handling the case when
it was set. This made it possible to simplify:
open_and_lock_tables_derived(), the delayed thread,
multi-update.
Rename MYSQL_LOCK_IGNORE_FLUSH to MYSQL_OPEN_IGNORE_FLUSH,
since we now only support this flag in open_table().
Rename MYSQL_LOCK_PERF_SCHEMA to MYSQL_LOCK_LOG_TABLE,
to avoid confusion.
Move the wait for the global read lock for cases
when we do updates in SELECT f1() or DO (UPDATE) to
open_table() from mysql_lock_tables(). When waiting
for the read lock, we could raise need_reopen flag,
which is no longer present in mysql_lock_tables().
Since the block responsible for waiting for GRL
was moved, MYSQL_LOCK_IGNORE_GLOBAL_READ_LOCK
was renamed to MYSQL_OPEN_IGNORE_GLOBAL_READ_LOCK.
Conflicts:
Text conflict in client/mysqlbinlog.cc
Text conflict in mysql-test/r/explain.result
Text conflict in mysql-test/r/subselect.result
Text conflict in mysql-test/r/subselect3.result
Text conflict in mysql-test/r/type_datetime.result
Text conflict in sql/share/Makefile.am
+ failing statements
The implicit DROP event for a temporary table is not getting the
LOG_EVENT_THREAD_SPECIFIC_F flag, because, in the previously
executed statement in the same thread, which might even be a
failed statement, the thread_specific_used flag is set to
FALSE (in mysql_reset_thd_for_next_command) and not set to TRUE
before the connection is shut down. This means that the implicit
DROP event will take the FALSE value from thread_specific_used and
will not set LOG_EVENT_THREAD_SPECIFIC_F in the event header. As
a consequence, mysqlbinlog will not print the pseudo_thread_id
from the DROP event, because one of the requirements for the
printout is that this flag is set.
We fix this by setting thread_specific_used whenever we are
binlogging a DROP in close_temporary_tables, and resetting it to
its previous value afterward.
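The set-then-restore pattern of the fix can be sketched as
follows. This is an illustration only: Thd, the context manager
force_thread_specific() and log_implicit_drop() are hypothetical
stand-ins, and the logged event is reduced to a dictionary.

```python
from contextlib import contextmanager


class Thd:
    """Toy connection state holding only the flag of interest."""

    def __init__(self):
        self.thread_specific_used = False


@contextmanager
def force_thread_specific(thd):
    saved = thd.thread_specific_used
    thd.thread_specific_used = True
    try:
        yield
    finally:
        # Restore the previous value even if logging fails.
        thd.thread_specific_used = saved


def log_implicit_drop(thd, events):
    with force_thread_specific(thd):
        # The event copies the flag at the moment it is written.
        events.append({
            "query": "DROP TEMPORARY TABLE ...",
            "thread_specific": thd.thread_specific_used,
        })
```

The event is always written with the flag set, while the state
left behind by the previous statement is unchanged afterwards.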
The problem was that ALTER TABLE on a merge table which was locked
using LOCK TABLE ... WRITE, by mistake gave
ER_TABLE_NOT_LOCKED_FOR_WRITE.
During opening of the table to be ALTERed, open_table() tried to
get an upgradable metadata lock. In LOCK TABLES mode, this lock
must already exist (i.e. taken by LOCK TABLE) as new locks of this
type cannot be acquired for fear of deadlock. So in LOCK TABLES
mode, open_table() tried to find an existing upgradable lock for
the table to be altered.
The problem was that open_table() also tried to find upgradable
metadata locks for children of merge tables even if no such
locks are needed to execute ALTER TABLE on merge tables.
This patch fixes the problem by making sure that open tables code
only searches for upgradable metadata locks for the merge table
and not for the merge children tables.
The patch also fixes a related bug where an upgradable metadata
lock was acquired outside of LOCK TABLES mode even if the table in
question was temporary. This bug meant that LOCK TABLES or DDL on
temporary tables by mistake could be blocked/aborted by locks held
on base tables with the same table name by other connections.
Test cases added to merge.test and lock_multi.test.
This patch prevents system threads and system table accesses from
using user-specified values for "lock_wait_timeout". Instead all
such accesses are done using the default value (1 year).
This prevents background tasks (such as replication, events,
accessing stored function definitions, logging, reading time-zone
information, etc.) from failing in cases where the global value
of "lock_wait_timeout" is set very low.
The patch also simplifies the open tables API. Rather than adding
another convenience function for opening and locking system tables,
this patch removes most of the existing convenience functions for
open_and_lock_tables_derived(). Before, open_and_lock_tables() was
a convenience function that enforced derived tables handling, while
open_and_lock_tables_derived() was the main function where derived
tables handling was optional. Now, this convenience function is
gone and the main function is renamed to open_and_lock_tables().
No test case added as it would have required the use of --sleep to
check that system threads and system tables have a different timeout
value from the user-specified "lock_wait_timeout" system variable.
an INFORMATION_SCHEMA table
When a prepared statement using a merged view containing an information
schema table was executed, a metadata lock of the view was not taken.
This meant that it was possible for concurrent view DDL to execute,
thereby breaking the binary log. For example, it was possible
for DROP VIEW to appear in the binary log before a query using the view.
This also happened when a statement in a stored routine was executed a
second time.
For such views, the information schema table is merged into the view
during the prepare phase (or first execution of a statement in a routine).
The problem was that we took a short cut and were not executing full-blown
view opening during subsequent executions of the statement. As a result,
a metadata lock on the view was not taken to protect the view definition.
This patch resolves the problem by making sure a metadata lock is taken
for views even after information schema tables are merged into them.
Test case added to view.test.
This patch introduces timeouts for metadata locks.
The timeout is specified in seconds using the new dynamic system
variable "lock_wait_timeout" which has both GLOBAL and SESSION
scopes. Allowed values range from 1 to 31536000 seconds (= 1 year).
The default value is 1 year.
The new server parameter "lock-wait-timeout" can be used to set
the default value at server startup.
"lock_wait_timeout" applies to all statements that use metadata locks.
These include DML and DDL operations on tables, views, stored procedures
and stored functions. They also include LOCK TABLES, FLUSH TABLES WITH
READ LOCK and HANDLER statements.
The patch also changes thr_lock.c code (table data locks used by MyISAM
and other simplistic engines) to use the same system variable.
InnoDB row locks are unaffected.
One exception to the handling of the "lock_wait_timeout" variable
is delayed inserts. All delayed inserts are executed with a timeout
of 1 year regardless of the setting for the global variable. As the
connection issuing the delayed insert gets no notification of
delayed insert timeouts, we want to avoid unnecessary timeouts.
It's important to note that the timeout value is used for each lock
acquired and that one statement can take more than one lock.
A statement can therefore block for longer than the lock_wait_timeout
value before reporting a timeout error. When lock timeout occurs,
ER_LOCK_WAIT_TIMEOUT is reported.
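The effect of applying the timeout per lock can be shown with a
little arithmetic. This is a sketch only: worst_case_block_time()
is a hypothetical helper, and it assumes that a statement waits
for its locks one after another.

```python
MIN_TIMEOUT = 1                        # seconds
MAX_TIMEOUT = 365 * 24 * 60 * 60       # one year = 31536000 seconds


def worst_case_block_time(lock_wait_timeout, locks_taken):
    """Upper bound on the blocking time of one statement, in seconds.

    Each wait may last up to the timeout, so a statement that
    waits for several locks in turn can block for up to the sum.
    """
    if not MIN_TIMEOUT <= lock_wait_timeout <= MAX_TIMEOUT:
        raise ValueError("timeout outside the allowed range")
    return lock_wait_timeout * locks_taken
```

For example, with a timeout of 10 seconds a statement that has to
wait for three locks may block for up to 30 seconds before the
timeout error is reported.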
Test case added to lock_multi.test.
A closely related problem, hardly worth a new bug report:
Removed a spurious call to:
thd->set_current_stmt_binlog_format_row_if_mixed()
in sql_base.cc:lock_tables().