This bug happened because the executor tried to use a wrong
TABLE REF object when building access keys. It constructed
keys from fields of a materialized table from a ref object
created to construct keys from the fields of the underlying
base table. This could happen only when materialized table
was created for a non-correlated IN subquery and only
when the materialized table used for lookups.
In this case we are guaranteed to be able to construct the
keys from the fields of tables that would be outer tables
for the tables of the IN subquery.
The patch makes sure that no ref objects constructed from
fields of materialized lookup tables are to be used.
WITH A VARIABLE AND ORDER BY
Bug#16035412 MYSQL SERVER 5.5.29 WRONG SORTING USING COMPLEX INDEX
This is a fix for a regression introduced by Bug#12667154:
Bug#12667154 attempted to fix a performance problem with subqueries
that did filesort. For doing filesort, the optimizer creates a quick
select object to use when building the sort index. This quick select
object was deleted after the first call to create_sort_index(). Thus,
for queries where the subquery was executed multiple times, the quick
object was only used for the first execution. For all later executions
of the subquery, filesort used a complete table scan for building the
sort index. The fix for Bug#12667154 tried to fix this by not deleting
the quick object after the first execution of create_sort_index() so
that it would be re-used for building the sort index by the following
executions of the subquery.
This regression introduced in Bug#12667154 is that due to not deleting
the quick select object after building the sort index, the quick
object could in some cases be used also during the second phase of the
execution of the subquery instead of using the created sort
index. This caused wrong results to be returned.
The fix for this issue is to delete the reference to the select object
after it has been used in create_sort_index(). In this way the select
and quick objects will not be available when doing the second phase
of the execution of the select operation. To ensure that the select
object can be re-used for the following executions of the subquery
we make a copy of the select pointer. This is used for restoring the
select object after the select operation is completed.
mysql-test/suite/innodb/r/innodb_mysql.result:
Changed explain output: The explain now contains "Using where" since we
have restored the select pointer after doing the filesort operation.
sql/sql_select.cc:
Change create_sort_index() so that it always sets the pointer to
the select object to NULL. This is done in order to avoid that the
select->quick object can be used when execution the main part of
the select operation.
sql/sql_select.h:
New member in JOIN_TAB: saved_select. Used by create_sort_index to
make a backup copy of the select pointer.
The problem was that in debugging binaries it try to print item to assign human readable name to the item.
But subquery item was already freed (join_free/cleanup with full cleanup) so Item_field refers to temporary
table which memory had been already freed.
MDEV-567: Wrong result from a query with correlated subquery if ICP is allowed:
backport the fix developed for SHOW EXPLAIN:
revision-id: psergey@askmonty.org-20120719115219-212cxmm6qvf0wlrb
branch nick: 5.5-show-explain-r21
timestamp: Thu 2012-07-19 15:52:19 +0400
BUG#992942 & MDEV-325: Pre-liminary commit for testing
and adjust it so that it handles DS-MRR scans correctly.
This task fixes an ineffeciency that is a remainder from MySQL 5.0/5.1. There, subqueries
were optimized in a lazy manner, when executed for the first time. During this lazy optimization
it may happen that the server finds a more efficient subquery engine, and substitute the current
engine of the query being executed with the new engine. This required re-execution of the engine.
MariaDB 5.3 pre-optimizes subqueries in almost all cases, and the engine is chosen in most cases,
except when subquery materialization found that it must use partial matching. In this case, the
current code was performing one extra re-execution although it was not needed at all. The patch
performs the re-execution only if the engine was changed while executing.
In addition the patch performs small cleanup by removing "enum store_key_result" because it is
essentially a boolean, and the code that uses it already maps it to a boolean.
and small collateral changes
mysql-test/lib/My/Test.pm:
somehow with "print" we get truncated writes sometimes
mysql-test/suite/perfschema/r/digest_table_full.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/perfschema/r/dml_handler.result:
host table is not ported over yet
mysql-test/suite/perfschema/r/information_schema.result:
host table is not ported over yet
mysql-test/suite/perfschema/r/nesting.result:
this differs, because we don't rewrite general log queries, and multi-statement
packets are logged as a one entry. this result file is identical to what mysql-5.6.5
produces with the --log-raw option.
mysql-test/suite/perfschema/r/relaylog.result:
MariaDB modifies the binlog index file directly, while MySQL 5.6 has a feature "crash-safe binlog index" and modifies a special "crash-safe" shadow copy of the index file and then moves it over. That's why this test shows "NONE" index file writes in MySQL and "MANY" in MariaDB.
mysql-test/suite/perfschema/r/server_init.result:
MariaDB initializes the "manager" resources from the "manager" thread, and starts this thread only when --flush-time is not 0. MySQL 5.6 initializes "manager" resources unconditionally on server startup.
mysql-test/suite/perfschema/r/stage_mdl_global.result:
this differs, because MariaDB disables query cache when query_cache_size=0. MySQL does not
do that, and this causes useless mutex locks and waits.
mysql-test/suite/perfschema/r/statement_digest.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/perfschema/r/statement_digest_consumers.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/perfschema/r/statement_digest_long_query.result:
md5 hashes of statement digests differ, because yacc token codes are different in mariadb
mysql-test/suite/rpl/r/rpl_mixed_drop_create_temp_table.result:
will be updated to match 5.6 when alfranio.correia@oracle.com-20110512172919-c1b5kmum4h52g0ni and anders.song@greatopensource.com-20110105052107-zoab0bsf5a6xxk2y are merged
mysql-test/suite/rpl/r/rpl_non_direct_mixed_mixing_engines.result:
will be updated to match 5.6 when anders.song@greatopensource.com-20110105052107-zoab0bsf5a6xxk2y is merged
The fix backports from MWL#182: Explain running statements the logic that
saves the original JOIN_TAB array of a query plan after optimization. This
array is later used during EXPLAIN to iterate over the original JOIN plan
nodes in the cases when this plan could be changed by early subquery
execution during the optimization phase of the outer query.
- The problem was that create_ref_for_key() would act differently, depending on
whether we're running EXPLAIN or the actual query.
- As the first step, fixed the EXPLAIN printout not to depend on actions in create_ref_for_key().
The patch enables back constant subquery execution during
query optimization after it was disabled during the development
of MWL#89 (cost-based choice of IN-TO-EXISTS vs MATERIALIZATION).
The main idea is that constant subqueries are allowed to be executed
during optimization if their execution is not expensive.
The approach is as follows:
- Constant subqueries are recursively optimized in the beginning of
JOIN::optimize of the outer query. This is done by the new method
JOIN::optimize_constant_subqueries(). This is done so that the cost
of executing these queries can be estimated.
- Optimization of the outer query proceeds normally. During this phase
the optimizer may request execution of non-expensive constant subqueries.
Each place where the optimizer may potentially execute an expensive
expression is guarded with the predicate Item::is_expensive().
- The implementation of Item_subselect::is_expensive has been extended
to use the number of examined rows (estimated by the optimizer) as a
way to determine whether the subquery is expensive or not.
- The new system variable "expensive_subquery_limit" controls how many
examined rows are considered to be not expensive. The default is 100.
In addition, multiple changes were needed to make this solution work
in the light of the changes made by MWL#89. These changes were needed
to fix various crashes and wrong results, and legacy bugs discovered
during development.
Analysis:
The reason for the wrong result is the interaction between constant
optimization (in this case 1-row table) and subquery optimization.
- First the outer query is optimized, and 'make_join_statistics' finds that
table t2 has one row, reads that row, and marks the whole table as constant.
This also means that all fields of t2 are constant.
- Next, we optimize the subquery in the end of the outer 'make_join_statistics'.
The field 'f2' is considered constant, with value '3'. The subquery predicate
is rewritten as the constant TRUE.
- The outer query execution detects early that the whole query result is empty
and calls 'return_zero_rows'. Since the query is with implicit grouping, we
have to produce one row with special values for the aggregates (depending on
each aggregate function), and NULL values for all non-aggregate fields. This
function calls 'no_rows_in_result' to set each aggregate function to the
default value when it aggregates over an empty result, and then calls
'send_data', which in turn evaluates each Item in the SELECT list.
- When evaluation reaches the subquery predicate, it executes the subquery
with field 'f2' having a constant value '3', and the subquery produces the
incorrect result '7'.
Solution:
Implement Item::no_rows_in_result for all subquery predicates. In order to
make this work, it is also needed to make all val_* methods of all subquery
predicates respect the Item_subselect::forced_const flag. Otherwise subqueries
are executed anyways, and override the default value set by no_rows_in_result
with whatever result is produced from the subquery evaluation.
- Remove all references of MAX_TABLES from JOIN struct and make these dynamic
- Updated Join_plan_state to allocate just as many elements as it's needed
sql/opt_subselect.cc:
Optimized version of Join_plan_state
sql/sql_select.cc:
Set join->positions and join->best_positions dynamicly
Don't call update_virtual_fields() if table->vfield is not set.
sql/sql_select.h:
Remove all references of MAX_TABLES from JOIN struct and Join_plan_state and make these dynamic
- Disable use of join cache when we're using FirstMatch strategy, and the join
order is such that subquery's inner tables are interleaved with outer. Join
buffering code is incapable of handling such join orders.
- The testcase requires use of @@debug_optimizer_prefer_join_prefix to hit the bug,
but I'm pushing it anyway (including the mention of the variable in .test file),
so that it can be found and enabled when/if we get something comparable in the
main tree.
The problem was that LooseScan execution code assumed that tab->key holds
the index used for looseScan. This is only true when range or full index
scan are used. In case of ref access, the index is in tab->ref.key (and
tab->index==0 which explains how LooseScan passed tests with ref access: they
used one index)
Fixed by setting/using loosescan_key, which always the correct index#.
The function subselect_uniquesubquery_engine::copy_ref_key has to take into
account that when EXPLAIN is processed the array of store_key object created
for any TABLE_REF may contain elements for constant items. These items should
be ignored by thefunction.
Problem: When building the condition for JOIN::outer_ref_cond the optimizer forgot to take into account
that this condition could depend on constant tables as well.