--Bug#52157 various crashes and assertions with multi-table update, stored function
--Bug#54475 improper error handling causes cascading crashing failures in innodb/ndb
--Bug#57703 create view cause Assertion failed: 0, file .\item_subselect.cc, line 846
--Bug#57352 valgrind warnings when creating view
--Recently discovered problem when a nested materialized derived table is used
before being populated and it leads to incorrect result
We have several modes when we should disable subquery evaluation.
The reasons for disabling are different. It could be
uselessness of the evaluation as in case of 'CREATE VIEW'
or 'PREPARE stmt', or we should disable subquery evaluation
if tables are not locked yet as it happens in bug#54475, or
too early evaluation of subqueries can lead to wrong result
as it happened in Bug#19077.
Main problem is that if subquery items are treated as const
they are evaluated in ::fix_fields(), ::fix_length_and_dec()
of the parental items as a lot of these methods have
Item::val_...() calls inside.
We have to make subqueries non-const to prevent unnecessary
subquery evaluation. At the moment we have different methods
for this. Here is a list of these modes:
1. PREPARE stmt;
We use UNCACHEABLE_PREPARE flag.
It is set during parsing in sql_parse.cc, mysql_new_select() for
each SELECT_LEX object and cleared at the end of PREPARE in
sql_prepare.cc, init_stmt_after_parse(). If this flag is set
subquery becomes non-const and evaluation does not happen.
2. CREATE|ALTER VIEW, SHOW CREATE VIEW, I_S tables which
process FRM files
We use LEX::view_prepare_mode field. We set it before
view preparation and check this flag in
::fix_fields(), ::fix_length_and_dec().
Some bugs are fixed using this approach,
some are not(Bug#57352, Bug#57703). The problem here is
that we have a lot of ::fix_fields(), ::fix_length_and_dec()
where we use Item::val_...() calls for const items.
3. Derived tables with subquery = wrong result(Bug19077)
The reason of this bug is too early subquery evaluation.
It was fixed by adding Item::with_subselect field
The check of this field in appropriate places prevents
const item evaluation if the item have subquery.
The fix for Bug19077 fixes only the problem with
convert_constant_item() function and does not cover
other places(::fix_fields(), ::fix_length_and_dec() again)
where subqueries could be evaluated.
Example:
CREATE TABLE t1 (i INT, j BIGINT);
INSERT INTO t1 VALUES (1, 2), (2, 2), (3, 2);
SELECT * FROM (SELECT MIN(i) FROM t1
WHERE j = SUBSTRING('12', (SELECT * FROM (SELECT MIN(j) FROM t1) t2))) t3;
DROP TABLE t1;
4. Derived tables with subquery where subquery
is evaluated before table locking(Bug#54475, Bug#52157)
Suggested solution is following:
-Introduce new field LEX::context_analysis_only with the following
possible flags:
#define CONTEXT_ANALYSIS_ONLY_PREPARE 1
#define CONTEXT_ANALYSIS_ONLY_VIEW 2
#define CONTEXT_ANALYSIS_ONLY_DERIVED 4
-Set/clean these flags when we perform
context analysis operation
-Item_subselect::const_item() returns
result depending on LEX::context_analysis_only.
If context_analysis_only is set then we return
FALSE that means that subquery is non-const.
As all subquery types are wrapped by Item_subselect
it allow as to make subquery non-const when
it's necessary.
bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ
LOCK" and bug #54673 "It takes too long to get readlock for
'FLUSH TABLES WITH READ LOCK'".
The first bug manifested itself as a deadlock which occurred
when a connection, which had some table open through HANDLER
statement, tried to update some data through DML statement
while another connection tried to execute FLUSH TABLES WITH
READ LOCK concurrently.
What happened was that FTWRL in the second connection managed
to perform first step of GRL acquisition and thus blocked all
upcoming DML. After that it started to wait for table open
through HANDLER statement to be flushed. When the first connection
tried to execute DML it has started to wait for GRL/the second
connection creating deadlock.
The second bug manifested itself as starvation of FLUSH TABLES
WITH READ LOCK statements in cases when there was a constant
stream of concurrent DML statements (in two or more
connections).
This has happened because requests for protection against GRL
which were acquired by DML statements were ignoring presence of
pending GRL and thus the latter was starved.
This patch solves both these problems by re-implementing GRL
using metadata locks.
Similar to the old implementation acquisition of GRL in new
implementation is two-step. During the first step we block
all concurrent DML and DDL statements by acquiring global S
metadata lock (each DML and DDL statement acquires global IX
lock for its duration). During the second step we block commits
by acquiring global S lock in COMMIT namespace (commit code
acquires global IX lock in this namespace).
Note that unlike in old implementation acquisition of
protection against GRL in DML and DDL is semi-automatic.
We assume that any statement which should be blocked by GRL
will either open and acquires write-lock on tables or acquires
metadata locks on objects it is going to modify. For any such
statement global IX metadata lock is automatically acquired
for its duration.
The first problem is solved because waits for GRL become
visible to deadlock detector in metadata locking subsystem
and thus deadlocks like one in the first bug become impossible.
The second problem is solved because global S locks which
are used for GRL implementation are given preference over
IX locks which are acquired by concurrent DML (and we can
switch to fair scheduling in future if needed).
Important change:
FTWRL/GRL no longer blocks DML and DDL on temporary tables.
Before this patch behavior was not consistent in this respect:
in some cases DML/DDL statements on temporary tables were
blocked while in others they were not. Since the main use cases
for FTWRL are various forms of backups and temporary tables are
not preserved during backups we have opted for consistently
allowing DML/DDL on temporary tables during FTWRL/GRL.
Important change:
This patch changes thread state names which are used when
DML/DDL of FTWRL is waiting for global read lock. It is now
either "Waiting for global read lock" or "Waiting for commit
lock" depending on the stage on which FTWRL is.
Incompatible change:
To solve deadlock in events code which was exposed by this
patch we have to replace LOCK_event_metadata mutex with
metadata locks on events. As result we have to prohibit
DDL on events under LOCK TABLES.
This patch also adds extensive test coverage for interaction
of DML/DDL and FTWRL.
Performance of new and old global read lock implementations
in sysbench tests were compared. There were no significant
difference between new and old implementations.
temp table
This patch introduces two key changes in the replication's behavior.
Firstly, it reverts part of BUG#51894 which puts any update to temporary tables
into the trx-cache. Now, updates to temporary tables are handled according to
the type of their engines as a regular table.
Secondly, an unsafe mixed statement, (i.e. a statement that access transactional
table as well non-transactional or temporary table, and writes to any of them),
are written into the trx-cache in order to minimize errors in the execution when
the statement logging format is in use.
Such changes has a direct impact on which statements are classified as unsafe
statements and thus part of BUG#53259 is reverted.
locks on the table
Fixing the partitioning specifics after TRUNCATE TABLE in
bug-42643 was fixed.
Reorganize of code to decrease the size of the giant switch
in mysql_execute_command, and to prepare for future parser
reengineering. Moved code into Sql_statement objects.
Updated patch according to davi's review comments.
After BUG#36649, warnings for sub-statements are cleared when a
new sub-statement is started. This is problematic since it suppresses
warnings for unsafe statements in some cases. It is important that we
always give a warning to the client, because the user needs to know
when there is a risk that the slave goes out of sync.
We fixed the problem by generating warning messages for unsafe statements
while returning from a stored procedure, function, trigger or while
executing a top level statement.
We also started checking unsafeness when both performance and log tables are
used. This is necessary after the performance schema which does a distinction
between performance and log tables.
With statement- or mixed-mode logging, "LOAD DATA INFILE" queries
are written to the binlog using special types of log events.
When mysqlbinlog reads such events, it re-creates the file in a
temporary directory with a generated filename and outputs a
"LOAD DATA INFILE" query where the filename is replaced by the
generated file. The temporary file is not deleted by mysqlbinlog
after termination.
To fix the problem, in mixed mode we go to row-based. In SBR, we
document it to remind user the tmpfile is left in a temporary
directory.
/*![:version:] Query Code */, where [:version:] is a sequence of 5
digits representing the mysql server version(e.g /*!50200 ... */),
is a special comment that the query in it can be executed on those
servers whose versions are larger than the version appearing in the
comment. It leads to a security issue when slave's version is larger
than master's. A malicious user can improve his privileges on slaves.
Because slave SQL thread is running with SUPER privileges, so it can
execute queries that he/she does not have privileges on master.
This bug is fixed with the logic below:
- To replace '!' with ' ' in the magic comments which are not applied on
master. So they become common comments and will not be applied on slave.
- Example:
'INSERT INTO t1 VALUES (1) /*!10000, (2)*/ /*!99999 ,(3)*/
will be binlogged as
'INSERT INTO t1 VALUES (1) /*!10000, (2)*/ /* 99999 ,(3)*/
This patch also fixes Bug#55452 "SET PASSWORD is
replicated twice in RBR mode".
The goal of this patch is to remove the release of
metadata locks from close_thread_tables().
This is necessary to not mistakenly release
the locks in the course of a multi-step
operation that involves multiple close_thread_tables()
or close_tables_for_reopen().
On the same token, move statement commit outside
close_thread_tables().
Other cleanups:
Cleanup COM_FIELD_LIST.
Don't call close_thread_tables() in COM_SHUTDOWN -- there
are no open tables there that can be closed (we leave
the locked tables mode in THD destructor, and this
close_thread_tables() won't leave it anyway).
Make open_and_lock_tables() and open_and_lock_tables_derived()
call close_thread_tables() upon failure.
Remove the calls to close_thread_tables() that are now
unnecessary.
Simplify the back off condition in Open_table_context.
Streamline metadata lock handling in LOCK TABLES
implementation.
Add asserts to ensure correct life cycle of
statement transaction in a session.
Remove a piece of dead code that has also become redundant
after the fix for Bug 37521.
Essentially, the problem is that safemalloc is excruciatingly
slow as it checks all allocated blocks for overrun at each
memory management primitive, yielding a almost exponential
slowdown for the memory management functions (malloc, realloc,
free). The overrun check basically consists of verifying some
bytes of a block for certain magic keys, which catches some
simple forms of overrun. Another minor problem is violation
of aliasing rules and that its own internal list of blocks
is prone to corruption.
Another issue with safemalloc is rather the maintenance cost
as the tool has a significant impact on the server code.
Given the magnitude of memory debuggers available nowadays,
especially those that are provided with the platform malloc
implementation, maintenance of a in-house and largely obsolete
memory debugger becomes a burden that is not worth the effort
due to its slowness and lack of support for detecting more
common forms of heap corruption.
Since there are third-party tools that can provide the same
functionality at a lower or comparable performance cost, the
solution is to simply remove safemalloc. Third-party tools
can provide the same functionality at a lower or comparable
performance cost.
The removal of safemalloc also allows a simplification of the
malloc wrappers, removing quite a bit of kludge: redefinition
of my_malloc, my_free and the removal of the unused second
argument of my_free. Since free() always check whether the
supplied pointer is null, redudant checks are also removed.
Also, this patch adds unit testing for my_malloc and moves
my_realloc implementation into the same file as the other
memory allocation primitives.
strict aliasing violations.
One somewhat major source of strict-aliasing violations and
related warnings is the SQL_LIST structure. For example,
consider its member function `link_in_list` which takes
a pointer to pointer of type T (any type) as a pointer to
pointer to unsigned char. Dereferencing this pointer, which
is done to reset the next field, violates strict-aliasing
rules and might cause problems for surrounding code that
uses the next field of the object being added to the list.
The solution is to use templates to parametrize the SQL_LIST
structure in order to deference the pointers with compatible
types. As a side bonus, it becomes possible to remove quite
a few casts related to acessing data members of SQL_LIST.
1) No mutex and no function call if we're not using
plugins.
2) If we're above the table definition cache limit,
delete the oldest unused share, not the share on our hands.
Conflicts:
Text conflict in mysql-test/r/archive.result
Contents conflict in mysql-test/r/innodb_bug38231.result
Text conflict in mysql-test/r/mdl_sync.result
Text conflict in mysql-test/suite/binlog/t/disabled.def
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_binlog_format_errors.result
Text conflict in mysql-test/t/archive.test
Contents conflict in mysql-test/t/innodb_bug38231.test
Text conflict in mysql-test/t/mdl_sync.test
Text conflict in sql/sp_head.cc
Text conflict in sql/sql_show.cc
Text conflict in sql/table.cc
Text conflict in sql/table.h
transactional SELECT and ALTER TABLE ... REBUILD PARTITION".
The goal of this patch is to decouple type of metadata
lock acquired for table by open_tables() from type of
table-level lock to be acquired on it.
To achieve this we change approach to how we determine what
type of metadata lock should be acquired on table to be open.
Now instead of inferring it at open_tables() time from flags
and type of table-level lock we rely on that type of metadata
lock is properly set at parsing time and is not changed
further.
Item_hex_string::Item_hex_string
The status of memory allocation in the Lex_input_stream (called
from the Parser_state constructor) was not checked which led to
a parser crash in case of the out-of-memory error.
The solution is to introduce new init() member function in
Parser_state and Lex_input_stream so that status of memory
allocation can be returned to the caller.
multiquery packet).
Background:
- a query can contain multiple SQL statements;
- the server frees resources allocated to process a query when the
whole query is handled. In other words, resources allocated to process
one SQL statement from a multi-statement query are freed when all SQL
statements are handled.
The problem was that the parser allocated a buffer of size of the whole
query for each SQL statement in a multi-statement query. Thus, if a query
had many SQL-statements (so, the query was long), but each SQL statement
was short, ther parser tried to allocate huge amount of memory (number of
small SQL statements * length of the whole query).
The memory was allocated for a so-called "cpp buffer", which is intended to
store pre-processed SQL statement -- SQL text without version specific
comments.
The fix is to allocate memory for the "cpp buffer" once for all SQL
statements (once for a query).
Conflicts:
Text conflict in mysql-test/r/explain.result
Text conflict in mysql-test/t/explain.test
Text conflict in sql/net_serv.cc
Text conflict in sql/sp_head.cc
Text conflict in sql/sql_priv.h
Fix for bug #46947 "Embedded SELECT without FOR UPDATE is
causing a lock", with after-review fixes.
SELECT statements with subqueries referencing InnoDB tables
were acquiring shared locks on rows in these tables when they
were executed in REPEATABLE-READ mode and with statement or
mixed mode binary logging turned on.
This was a regression which were introduced when fixing
bug 39843.
The problem was that for tables belonging to subqueries
parser set TL_READ_DEFAULT as a lock type. In cases when
statement/mixed binary logging at open_tables() time this
type of lock was converted to TL_READ_NO_INSERT lock at
open_tables() time and caused InnoDB engine to acquire
shared locks on reads from these tables. Although in some
cases such behavior was correct (e.g. for subqueries in
DELETE) in case of SELECT it has caused unnecessary locking.
This patch tries to solve this problem by rethinking our
approach to how we handle locking for SELECT and subqueries.
Now we always set TL_READ_DEFAULT lock type for all cases
when we read data. When at open_tables() time this lock
is interpreted as TL_READ_NO_INSERT or TL_READ depending
on whether this statement as a whole or call to function
which uses particular table should be written to the
binary log or not (if yes then statement should be properly
serialized with concurrent statements and stronger lock
should be acquired).
Test coverage is added for both InnoDB and MyISAM.
This patch introduces an "incompatible" change in locking
scheme for subqueries used in SELECT ... FOR UPDATE and
SELECT .. IN SHARE MODE.
In 4.1 the server would use a snapshot InnoDB read for
subqueries in SELECT FOR UPDATE and SELECT .. IN SHARE MODE
statements, regardless of whether the binary log is on or off.
If the user required a different type of read (i.e. locking read),
he/she could request so explicitly by providing FOR UPDATE/IN SHARE MODE
clause for each individual subquery.
On of the patches for 5.0 broke this behaviour (which was not documented
or tested), and started to use locking reads fora all subqueries in SELECT ...
FOR UPDATE/IN SHARE MODE. This patch restored 4.1 behaviour.
The problem was that a syntactically invalid trigger could cause
the server to crash when trying to list triggers. The crash would
happen due to a mishap in the backup/restore procedure that should
protect parser items which are not associated with the trigger. The
backup/restore is used to isolate the parse tree (and context) of
a statement from the load (and parsing) of a trigger. In this case,
a error during the parsing of a trigger could cause the improper
backup/restore sequence.
The solution is to properly restore the original statement context
before the parser is exited due to syntax errors in the trigger body.
This patch:
- Moves all definitions from the mysql_priv.h file into
header files for the component where the variable is
defined
- Creates header files if the component lacks one
- Eliminates all include directives from mysql_priv.h
- Eliminates all circular include cycles
- Rename time.cc to sql_time.cc
- Rename mysql_priv.h to sql_priv.h
BUG#46364 introduced the flag binlog_direct_non_transactional_updates which
would make N-changes to be written to the binary log upon committing the
statement when "ON". On the other hand, when "OFF" the option was supposed
to mimic the behavior in 5.1. However, the implementation was not mimicking
the behavior correctly and the following bugs popped up:
Case #1: N-changes executed within a transaction would go into
the S-cache. When later in the same transaction a
T-change occurs, N-changes following it were written
to the T-cache instead of the S-cache. In some cases,
this raises problems. For example, a
Table_map_log_event being written initially into the
S-cache, together with the initial N-changes, would be
absent from the T-cache. This would log N-changes
orphaned from a Table_map_log_event (thence discarded
at the slave). (MIXED and ROW)
Case #2: When rolling back a transaction, the N-changes that
might be in the T-cache were disregarded and
truncated along with the T-changes. (MIXED and ROW)
Case #3: When a MIXED statement (TN) is ahead of any other
T-changes in the transaction and it fails, it is kept
in the T-cache until the transaction ends. This is
not the case in 5.1 or Betony (5.5.2). In these, the
failed TN statement would be written to the binlog at
the same instant it had failed and not deferred until
transaction end. (SBR)
To fix these problems, we have decided to do what follows:
For Case #1 and #2, we circumvent them:
1. by not letting binlog_direct_non_transactional_updates
affect MIXED and RBR. These modes will keep the behavior
provided by WL#2687. Although this will make Celosia to
behave differently from 5.1, an execution will be always
safe under such modes in the sense that slaves will never
go out sync. In 5.1, using either MIXED or ROW while
mixing N-statements and T-statements was not safe.
For Case #3, we don't actually fix it. We:
1. keep it and make all MIXED statements whether they end
up failing or not or whether they are up front in the
transaction or after some transactional change to always
be stored in the T-cache. This means that it is written
to the binary log on transaction commit/rollback only.
2. We make the warning message even more specific about the
MIXED statement and SBR.
Reading from a self-logging engine and updating a transactional engine such as Innodb
generates changes that are written to the binary log in the statement format and may
make slaves diverge. In the mixed mode, such changes should be written to the binary
log in the row format.
Note that the issue does not happen if we mix a self-logging engine and MyIsam
as this case is caught by checking the mixture of non-transactional and transactional
engines.
So, we classify a mixed statement where one reads from NDB and writes into another
engine as unsafe:
if (multi_engine && flags_some_set & HA_HAS_OWN_BINLOGGING)
lex->set_stmt_unsafe(LEX::BINLOG_STMT_UNSAFE_MULTIPLE_ENGINES_AND_SELF_LOGGING_ENGINE);
Conflicts:
Text conflict in .bzr-mysql/default.conf
Text conflict in mysql-test/suite/rpl/r/rpl_slow_query_log.result
Text conflict in mysql-test/suite/rpl/t/rpl_slow_query_log.test
Conflict adding files to server-tools. Created directory.
Conflict because server-tools is not versioned, but has versioned children. Versioned directory.
Conflict adding files to server-tools/instance-manager. Created directory.
Conflict because server-tools/instance-manager is not versioned, but has versioned children. Versioned directory.
Contents conflict in server-tools/instance-manager/options.cc
Text conflict in sql/mysqld.cc
Grouping by a subquery in a query with a distinct aggregate
function lead to a wrong result (wrong and unordered
grouping values).
There are two related problems:
1) The query like this:
SELECT (SELECT t1.a) aa, COUNT(DISTINCT b) c
FROM t1 GROUP BY aa
returned wrong result, because the outer reference "t1.a"
in the subquery was substituted with the Item_ref item.
The Item_ref item obtains data from the result_field object
that refreshes once after the end of each group. This data
is not applicable to filesort since filesort() doesn't care
about groups (and doesn't update result_field objects with
copy_fields() and so on). Also that data is not applicable
to group separation algorithm: end_send_group() checks every
record with test_if_group_changed() that evaluates Item_ref
items, but it refreshes those Item_ref-s only after the end
of group, that is a vicious circle and the grouped column
values in the output are shifted.
Fix: if
a) we grouping by a subquery and
b) that subquery has outer references to FROM list
of the grouping query,
then we substitute these outer references with
Item_direct_ref like references under aggregate
functions: Item_direct_ref obtains data directly
from the current record.
2) The query with a non-trivial grouping expression like:
SELECT (SELECT t1.a) aa, COUNT(DISTINCT b) c
FROM t1 GROUP BY aa+0
also returned wrong result, since JOIN::exec() substitutes
references to top-level aliases in SELECT list with Item_copy
caching items. Item_copy items have same refreshing policy
as Item_ref items, so the whole groping expression with
Item_copy inside returns wrong result in filesort() and
end_send_group().
Fix: include aliased items into GROUP BY item tree instead
of Item_ref references to them.
The auto-inc unsafe warning makes sense even though it's just
one auto-inc table could be involved via a trigger or a stored
function.
However its content was not updated by bug@45677 fixes continuing to mention
two tables whereas the fixes refined semantics of replication of auto_increment
in stored routine.
Fixed with updating the error message, renaming the error and an internal unsafe-condition
constants.
A documentation notice
======================
Inserting into an autoincrement column in a stored function or a trigger
is unsafe for replication.
Even with just one autoincrement column, if the routine is invoked more than
once slave is not guaranteed to execute the statement graph same way as
the master.
And since it's impossible to estimate how many times a routine can be invoked at
the query pre-execution phase (see lock_tables), the statement is marked
pessimistically unsafe.
Conflicts:
Text conflict in .bzr-mysql/default.conf
Text conflict in mysql-test/extra/rpl_tests/rpl_loaddata.test
Text conflict in mysql-test/r/mysqlbinlog2.result
Text conflict in mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result
Text conflict in mysql-test/suite/binlog/r/binlog_unsafe.result
Text conflict in mysql-test/suite/rpl/r/rpl_insert_id.result
Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result
Text conflict in mysql-test/suite/rpl/r/rpl_stm_auto_increment_bug33029.result
Text conflict in mysql-test/suite/rpl/r/rpl_udf.result
Text conflict in mysql-test/suite/rpl/t/rpl_slow_query_log.test
Text conflict in sql/field.h
Text conflict in sql/log.cc
Text conflict in sql/log_event.cc
Text conflict in sql/log_event_old.cc
Text conflict in sql/mysql_priv.h
Text conflict in sql/share/errmsg.txt
Text conflict in sql/sp.cc
Text conflict in sql/sql_acl.cc
Text conflict in sql/sql_base.cc
Text conflict in sql/sql_class.h
Text conflict in sql/sql_db.cc
Text conflict in sql/sql_delete.cc
Text conflict in sql/sql_insert.cc
Text conflict in sql/sql_lex.cc
Text conflict in sql/sql_lex.h
Text conflict in sql/sql_load.cc
Text conflict in sql/sql_table.cc
Text conflict in sql/sql_update.cc
Text conflict in sql/sql_view.cc
Conflict adding files to storage/innobase. Created directory.
Conflict because storage/innobase is not versioned, but has versioned children. Versioned directory.
Conflict adding file storage/innobase. Moved existing file to storage/innobase.moved.
Conflict adding files to storage/innobase/handler. Created directory.
Conflict because storage/innobase/handler is not versioned, but has versioned children. Versioned directory.
Contents conflict in storage/innobase/handler/ha_innodb.cc