CHEAP SQ: Valgrind warnings "Memory lost" with IN and EXISTS nested subquery, materialization+semijoin
Analysis:
The memory leak was a result of the interaction of semi-join optimization
with early optimization of constant subqueries. The function:
setup_jtbm_semi_joins() created a dummy temporary table "dummy_table"
in order to make some JOIN_TAB objects complete. Normally, such temporary
tables are freed inside JOIN_TAB::cleanup.
However, the inner-most subquery is pre-optimized, which allows the
optimization fo the MAX subquery to determine that its WHERE is TRUE,
and thus to compute the result of the MAX during optimization. This
ultimately allows the optimize phase of the outer query to find that
it WHERE clause is FALSE. Once JOIN::optimize finds that the result
set is empty, it sets zero_result_cause, and returns *before* it ever
reached make_join_statistics(). As a result the query plan has no
JOIN_TABs at all. Since the temporary table is supposed to be cleanup
via JOIN_TAB::cleanup, this never happens because there is no JOIN_TAB
for this table. Hence we get a memory leak.
Solution:
Whenever there are no JOIN_TABs, iterate over all table reference in
JOIN::join_list, and free the ones that contain semi-join temporary
tables.
Fixed by backport of:
------------------------------------------------------------
revno: 3402.50.156
committer: Jon Olav Hauglid <jon.hauglid@oracle.com>
branch nick: mysql-trunk-test
timestamp: Wed 2012-02-08 14:10:23 +0100
message:
Bug#13417754 ASSERT IN ROW_DROP_DATABASE_FOR_MYSQL DURING DROP SCHEMA
This assert could be triggered if an InnoDB table was being moved
to a different database using ALTER TABLE ... RENAME, while this
database concurrently was being dropped by DROP DATABASE.
The reason for the problem was that no metadata lock was taken
on the target database by ALTER TABLE ... RENAME.
DROP DATABASE was therefore not blocked and could remove
the database while ALTER TABLE ... RENAME was executing. This
could cause the assert in InnoDB to be triggered.
This patch fixes the problem by taking a IX metadata lock on
the target database before ALTER TABLE ... RENAME starts
moving a table to a different database.
Note that this problem did not occur with RENAME TABLE which
already takes the correct metadata locks.
Also note that this patch slightly changes the behavior of
ALTER TABLE ... RENAME. Before, the statement would abort and
return an error if a lock on the target table name could not
be taken immediately. With this patch, ALTER TABLE ... RENAME
will instead block and wait until the lock can be taken
(or until we get a lock timeout). This also means that it is
possible to get ER_LOCK_DEADLOCK errors in this situation
since we allow ALTER TABLE ... RENAME to wait and not just
abort immediately.
Analysis:
When a subquery that needs a temp table is executed during
the prepare or optimize phase of the outer query, at the end
of the subquery execution all the JOIN_TABs of the subquery
are replaced by a new JOIN_TAB that selects from the temp table.
However that temp table has no corresponding TABLE_LIST.
Once EXPLAIN execution reaches its last phase, it tries to print
the names of the subquery tables through its TABLE_LISTs, but in
the case of this bug there is no such TABLE_LIST (it is NULL),
hence a crash.
Solution:
The fix is to block subquery evaluation inside
Item_func_like::fix_fields and Item_func_like::select_optimize()
using the Item::is_expensive() test.
Problem
========
Replication breaks in the cases if the event length exceeds
the size of master Dump thread's max_allowed_packet.
The reason why this failure is occuring is because the event length is
more than the total size of the max_allowed_packet, on addition of the
max_event_header length exceeds the max_allowed_packet of the DUMP thread.
This causes the Dump thread to break replication and throw an error.
That can happen e.g with row-based replication in Update_rows event.
Fix
====
The problem is fixed in 2 steps:
1.) The Dump thread limit to read event is increased to the upper limit
i.e. Dump thread reads whatever gets logged in the binary log.
2.) On the slave side we increase the the max_allowed_packet for the
slave's threads (IO/SQL) by increasing it to 1GB.
This is done using the new server option (slave_max_allowed_packet)
included, is used to regulate the max_allowed_packet of the
slave thread (IO/SQL) by the DBA, and facilitates the sending of
large packets from the master to the slave.
This causes the large packets to be received by the slave and apply
it successfully.
sql/log_event.cc:
The max_allowed_packet is not evaluated to the new option
slave_max_allowed_packet after the fix.
sql/log_event.h:
Added the new option in the log_event.h file.
sql/mysqld.cc:
Added a new option to the server.
sql/slave.cc:
Increasing the session max_allowed_packet to a large value,
i.e. not taking global(max_allowed) into consideration, for the slave's threads.
sql/sql_repl.cc:
The dump thread's max_allowed_packet is set to the upper limit
which makes it independent and it now reads whatever gets
logged in the binary log.
Analysis:
The fix for lp:944706 introduces early subquery optimization.
While a subquery is being optimized some of its predicates may be
removed. In the test case, the EXISTS subquery is constant, and is
evaluated to TRUE. As a result the whole OR is TRUE, and thus the
correlated condition "b = alias1.b" is optimized away. The subquery
becomes non-correlated.
The subquery cache is designed to work only for correlated subqueries.
If constant subquery optimization is disallowed, then the constant
subquery is not evaluated, the subquery remains correlated, and its
execution is cached. As a result execution is fast.
However, when the constant subquery was optimized away, it was neither
cached by the subquery cache, nor it was cached by the internal subquery
caching. The latter was due to the fact that the subquery still appeared
as correlated to the subselect_XYZ_engine::exec methods, and they
re-executed the subquery on each call to Item_subselect::exec.
Solution:
The solution is to update the correlated status of the subquery after it has
been optimized. This status consists of:
- st_select_lex::is_correlated
- Item_subselect::is_correlated
- SELECT_LEX::uncacheable
- SELECT_LEX_UNIT::uncacheable
The status is updated by st_select_lex::update_correlated_cache(), and its
caller st_select_lex::optimize_unflattened_subqueries. The solution relies
on the fact that the optimizer already called
st_select_lex::update_used_tables() for each subquery. This allows to
efficiently update the correlated status of each subquery without walking
the whole subquery tree.
Notice that his patch is an improvement over MySQL 5.6 and older, where
subqueries are not pre-optimized, and the above analysis is not possible.
Optimizator fails using index with ST_Within(g, constant_poly).
per-file comments:
mysql-test/r/gis-rt-precise.result
test result fixed.
mysql-test/r/gis-rtree.result
test result fixed.
mysql-test/suite/maria/r/maria-gis-rtree-dynamic.result
test result fixed.
mysql-test/suite/maria/r/maria-gis-rtree-trans.result
test result fixed.
mysql-test/suite/maria/r/maria-gis-rtree.result
test result fixed.
storage/maria/ma_rt_index.c
Use MBR_INTERSECT mode when optimizing the select WITH ST_Within.
storage/myisam/rt_index.c
Use MBR_INTERSECT mode when optimizing the select WITH ST_Within.
- In JOIN::exec(), make the having->update_used_tables() call before we've
made the JOIN::cleanup(full=true) call. The latter frees SJ-Materialization
structures, which correlated subquery predicate items attempt to walk afterwards.
Details:
- Archive storage engine file access were not instrumented and thus
were not shown in PS tables.
Fix:
- Added instrumentation code by using PS Apis for I/O.
Analysis:
The problem in the original MySQL bug is that the range optimizer
performs its analysis in a separate MEM_ROOT object that is freed
after the range optimzier is done. During range analysis get_mm_tree
calls Item_func_like::select_optimize, which in turn evaluates its
right argument. In the test case the right argument is a subquery.
In MySQL, subqueries are optimized lazyly, thus the call to val_str
triggers optimization for the subquery. All objects needed by the
subquery plan end up in the temporary MEM_ROOT used by the range
optimizer. When execution ends, the JOIN::cleanup process tries to
cleanup objects of the subquery plan, but all these objects are gone
with the temporary MEM_ROOT. The solution for MySQL is to switch the
mem_root.
In MariaDB with the patch for bug lp:944706, all constant subqueries
that may be used by the optimization process are preoptimized. Therefore
Item_func_like::select_optimize only triggers subquery execution, and
the above problem is not present.
The patch however adds a test whether the evaluated right argument of
the LIKE predicate is expensive. This is consistent with our approach
not to evaluate expensive expressions during optimization.
This is a backport of the (unchaged) fix for MySQL bug #11764372, 57197.
Analysis:
When the outer query finishes its main execution and computes GROUP BY,
it needs to construct a new temporary table (and a corresponding JOIN) to
execute the last DISTINCT operation. At this point JOIN::exec calls
JOIN::join_free, which calls JOIN::cleanup -> TMP_TABLE_PARAM::cleanup
for both the outer and the inner JOINs. The call to the inner
TMP_TABLE_PARAM::cleanup sets copy_field = NULL, but not copy_field_end.
The final execution phase that computes the DISTINCT invokes:
evaluate_join_record -> end_write -> copy_funcs
The last function copies the results of all functions into the temp table.
copy_funcs walks over all functions in join->tmp_table_param.items_to_copy.
In this case items_to_copy contains both assignments to user variables.
The process of copying user variables invokes Item_func_set_user_var::check
which in turn re-evaluates the arguments of the user variable assignment.
This in turn triggers re-evaluation of the subquery, and ultimately
copy_field.
However, the previous call to TMP_TABLE_PARAM::cleanup for the subquery
already set copy_field to NULL but not its copy_field_end. This results
in a null pointer access, and a crash.
Fix:
Set copy_field_end and save_copy_field_end to null when deleting
copy fields in TMP_TABLE_PARAM::cleanup().
updating the result file. Because a multi-row insert now reserves the
auto increment values before hand, if any explicitly specified auto
increment values are there, then some of the reserved values are lost.
Analysis:
The optimizer detects an empty result through constant table optimization.
Then it calls return_zero_rows(), which in turns calls inderctly
Item_maxmin_subselect::no_rows_in_result(). The latter method set "value=0",
however "value" is pointer to Item_cache, and not just an integer value.
All of the Item_[maxmin | singlerow]_subselect::val_XXX methods does:
if (forced_const)
return value->val_real();
which of course crashes when value is a NULL pointer.
Solution:
When the optimizer discovers an empty result set, set
Item_singlerow_subselect::value to a FALSE constant Item instead of NULL.
Handle the 'set read_only=1' in lighter way, than the FLUSH TABLES READ LOCK;
For the transactional engines we don't wait for operations on that tables to finish.
per-file comments:
mysql-test/r/read_only_innodb.result
MDEV-136 Non-blocking "set read_only".
test result updated.
mysql-test/t/read_only_innodb.test
MDEV-136 Non-blocking "set read_only".
test case added.
sql/mysql_priv.h
MDEV-136 Non-blocking "set read_only".
The close_cached_tables_set_readonly() declared.
sql/set_var.cc
MDEV-136 Non-blocking "set read_only".
Call close_cached_tables_set_readonly() for the read_only::set_var.
sql/sql_base.cc
MDEV-136 Non-blocking "set read_only".
Parameters added to the close_cached_tables implementation,
close_cached_tables_set_readonly declared.
Prevent blocking on the transactional tables if the
set_readonly_mode is on.
Problem
========
SQL statements close to the size of max_allowed_packet produce binary
log events larger than max_allowed_packet.
The reason why this failure is occuring is because the event length is
more than the total size of the max_allowed_packet + max_event_header
length. Now since the event length exceeds this size master Dump
thread is unable to send the packet on to the slave.
That can happen e.g with row-based replication in Update_rows event.
Fix
====
The problem was fixed by increasing the max_allowed_packet for the
slave's threads (IO/SQL) by increasing it to 1GB.
This is done using the new server option included which is used to
regulate the max_allowed_packet of the slave thread (IO/SQL).
This causes the large packets to be received by the slave and apply
it successfully.
sql/log_event.h:
Added the new option in the log_event.h file.
sql/mysqld.cc:
Added a new option to the server.
sql/slave.cc:
Increasing the session max_allowed_packet to a large value ,
i.e. not taking global(max_allowed) into consideration, for the slave's threads.
- make make_cond_after_sjm() correctly handle OR clauses where one branch refers to the semi-join table
while the other branch refers to the non-semijoin table.
The cause for this bug is that the method JOIN::get_examined_rows iterates over all
JOIN_TABs of the join assuming they are just a sequence. In the query above, the
innermost subquery is merged into its parent query. When we call
JOIN::get_examined_rows for the second-level subquery, the iteration that
assumes sequential order of join tabs goes outside the join_tab array and calls
the method JOIN_TAB::get_examined_rows on uninitialized memory.
The fix is to iterate over JOIN_TABs in a way that takes into account the nested
semi-join structure of JOIN_TABs. In particular iterate as select_describe.
The patch enables back constant subquery execution during
query optimization after it was disabled during the development
of MWL#89 (cost-based choice of IN-TO-EXISTS vs MATERIALIZATION).
The main idea is that constant subqueries are allowed to be executed
during optimization if their execution is not expensive.
The approach is as follows:
- Constant subqueries are recursively optimized in the beginning of
JOIN::optimize of the outer query. This is done by the new method
JOIN::optimize_constant_subqueries(). This is done so that the cost
of executing these queries can be estimated.
- Optimization of the outer query proceeds normally. During this phase
the optimizer may request execution of non-expensive constant subqueries.
Each place where the optimizer may potentially execute an expensive
expression is guarded with the predicate Item::is_expensive().
- The implementation of Item_subselect::is_expensive has been extended
to use the number of examined rows (estimated by the optimizer) as a
way to determine whether the subquery is expensive or not.
- The new system variable "expensive_subquery_limit" controls how many
examined rows are considered to be not expensive. The default is 100.
In addition, multiple changes were needed to make this solution work
in the light of the changes made by MWL#89. These changes were needed
to fix various crashes and wrong results, and legacy bugs discovered
during development.
If we did nothing in resolving unique table conflict we should not retry (it leed to infinite loop).
Now we retry (recheck) unique table check only in case if we materialized a table.
- Let fix_semijoin_strategies_for_picked_join_order() set
POSITION::prefix_record_count for POSITION records that it copies from
SJ_MATERIALIZATION_INFO::tables.
(These records do not have prefix_record_count set, because they are optimized
as joins-inside-semijoin-nests, without full advance_sj_state() processing).
Fixed some mtr test problems
dbug/tests.c:
Fixed compiler warnings
mysql-test/r/handlersocket.result:
Fixed that plugin_license is written
mysql-test/suite/innodb/t/innodb_bug60196.test:
Force sorted results as it was sometimes different on windows
mysql-test/suite/rpl/t/rpl_heartbeat_basic.test:
Prolong test as this failed on windows
mysql-test/t/handlersocket.test:
Fixed that plugin_license is written
plugin/handler_socket/handlersocket/handlersocket.cpp:
Use maria_declare_plugin
plugin/handler_socket/handlersocket/mysql_incl.hpp:
Fixed compiler warning
plugin/handler_socket/libhsclient/auto_addrinfo.hpp:
Fixed compiler warning
sql/handler.h:
Fixed typo
sql/sql_plugin.cc:
Fixed bug that caused plugin library name twice in error message
storage/maria/ma_checkpoint.c:
Fixed compiler warning
storage/maria/ma_loghandler.c:
Fixed compiler warning
unittest/mysys/base64-t.c:
Fixed compiler warning
unittest/mysys/bitmap-t.c:
Fixed compiler warning
unittest/mysys/my_malloc-t.c:
Fixed compiler warning
Fix is done by doing an autocommit in truncate table inside Aria
storage/maria/ha_maria.cc:
Force a commit for TRUNCATE TABLE inside lock tables
Check that we don't call TRUNCATE with concurrent inserts going on.
Make ha_maria::implict_commit faster when we don't have Aria tables in the transaction.
(Most of the patch is just re-indentation because I removed an if level)