Open issues:
- A better fix for #57688; Igor is working on this
- Test failure in index_merge_innodb.test ; Igor promised to look at this
- Some Innodb tests fails (need to merge with latest xtradb) ; Kristian promised to look at this.
- Failing tests: innodb_plugin.innodb_bug56143 innodb_plugin.innodb_bug56632 innodb_plugin.innodb_bug56680 innodb_plugin.innodb_bug57255
- Werror is disabled; Should be enabled after merge with xtradb.
Analysis:
Single-row subqueries are not considered expensive and are
evaluated both during EXPLAIN in to detect errors like
"Subquery returns more than 1 row", and during optimization to
perform constant optimization.
The cause for the failed ASSERT is in JOIN::join_free, where we set
bool full= (!select_lex->uncacheable && !thd->lex->describe);
Thus for EXPLAIN statements full == FALSE, and as a result the call to
JOIN::cleanup doesn't call JOIN_TAB::cleanup which should have
called table->disable_keyread().
Solution:
Consider all kinds of subquery predicates as expensive.
Merge 5.3-mwl89 into 5.3 main.
There is one remaining test failure in this merge:
innodb_mysql_lock2. All other tests have been checked to
deliver the same results/explains as 5.3-mwl89, including
the few remaining wrong results.
The set of Ordered keys of a rowid merge engine is dense. Thus when
we decide not to create a key for a column that has only NULLs, this
column shouldn't be counted.
Notice that the caller has already precomputed the correct total
number of keys that should be created.
The cause for this bug is that MariaDB 5.3 still processes derived tables
(subqueries in the FROM clause) by fully executing them during the parse
phase. This will be remedied by MWL#106 once merged into the main 5.3.
The assert statement is triggered when MATERIALIZATION is ON for EXPLAIN
EXTENDED for derived tables with an IN subquery as follows:
- mysql_parse calls JOIN::exec for the derived table as if it is regular
execution (not explain).
- When materialization is ON, this call goes all the way to
subselect_hash_sj_engine::exec, which creates a partial match engine
because of NULL presence.
- In order to proceed with normal execution, the hash_sj engine substitutes
itself with the created partial match engine.
- After the parse phase it turns out that this execution was part of
EXPLAIN EXTENDED, which in turn calls
Item_cond::print -> ... -> Item_subselect::print,
which calls engine->print().
Since subselect_hash_sj_engine::exec substituted the current
Item_subselect engine with a partial match engine, eventually we call
its ::print() method. However the partial match engines are designed only
for execution, hence there is no implementation of this print() method.
The fix temporarily removes the assert, until this code is merged with
MWL#106.
create_sort_index() function overwrites original JOIN_TAB::type field.
At re-execution of subquery overwritten JOIN_TAB::type(JT_ALL) is
used instead of JT_FT. It misleads test_if_skip_sort_order() and
the function tries to find suitable key for the order that should
not be allowed for FULLTEXT(JT_FT) table.
The fix is to restore JOIN_TAB strucures for subselect on re-execution
for EXPLAIN.
Additional fix:
Update TABLE::maybe_null field which
affects list_contains_unique_index() behaviour as it
could have the value(maybe_null==TRUE) based on the
assumption that this join is outer
(see setup_table_map() func).
mysql-test/r/explain.result:
test case
mysql-test/t/explain.test:
test case
sql/item_subselect.cc:
Make subquery uncacheable in case of EXPLAIN. It allows to keep
original JOIN_TAB::type(see JOIN::save_join_tab) and restore it
on re-execution.
sql/sql_select.cc:
-restore JOIN_TAB strucures for subselect on re-execution for EXPLAIN
-Update TABLE::maybe_null field as it could have
the value(maybe_null==TRUE) based on the assumption
that this join is outer(see setup_table_map() func).
This change is not related to the crash problem but
affects EXPLAIN results in the test case.
create_sort_index() function overwrites original JOIN_TAB::type field.
At re-execution of subquery overwritten JOIN_TAB::type(JT_ALL) is
used instead of JT_FT. It misleads test_if_skip_sort_order() and
the function tries to find suitable key for the order that should
not be allowed for FULLTEXT(JT_FT) table.
The fix is to restore JOIN_TAB strucures for subselect on re-execution
for EXPLAIN.
Additional fix:
Update TABLE::maybe_null field which
affects list_contains_unique_index() behaviour as it
could have the value(maybe_null==TRUE) based on the
assumption that this join is outer
(see setup_table_map() func).
The crash happens because original join table is replaced with temporary table
at execution stage and later we attempt to use this temporary table in
select_describe. It might happen that
Item_subselect::update_used_tables() method which sets const_item flag
is not called by some reasons (no where/having conditon in subquery for example).
It prevents JOIN::join_tmp creation and breaks original join.
The fix is to call ::update_used_tables() before ::const_item() check.
mysql-test/r/ps.result:
test case
mysql-test/t/ps.test:
test case
sql/item_subselect.cc:
call ::update_used_tables() before ::const_item() check.
The crash happens because original join table is replaced with temporary table
at execution stage and later we attempt to use this temporary table in
select_describe. It might happen that
Item_subselect::update_used_tables() method which sets const_item flag
is not called by some reasons (no where/having conditon in subquery for example).
It prevents JOIN::join_tmp creation and breaks original join.
The fix is to call ::update_used_tables() before ::const_item() check.
Phase 3: Implementation of re-optimization of subqueries with injected predicates
and cost comparison between Materialization and IN->EXISTS strategies.
The commit contains the following known problems:
- The implementation of EXPLAIN has not been re-engineered to reflect the
changes in subquery optimization. EXPLAIN for subqueries is called during
the execute phase, which results in different code paths during JOIN::optimize
and thus in differing EXPLAIN messages for constant/system tables.
- There are some valgrind warnings that need investigation
- Several EXPLAINs with minor differences need to be reconsidered after fixing
the EXPLAIN problem above.
This patch also adds one extra optimizer_switch: 'in_to_exists' for complete
manual control of the subquery execution strategies.
to 5.5 (removed one test case as it is no longer valid).
mysql-test/r/select.result:
Removed a part of the test case for bug#48291 since it is not
valid anymore. The comments for the removed part were actually
describing a side-effect from the problem addressed by the
addendum patch for bug #54190.
mysql-test/t/select.test:
Removed a part of the test case for bug#48291 since it is not
valid anymore. The comments for the removed part were actually
describing a side-effect from the problem addressed by the
addendum patch for bug #54190.
result
Row subqueries producing no rows were not handled as UNKNOWN
values in row comparison expressions.
That was a result of the following two problems:
1. Item_singlerow_subselect did not mark the resulting row
value as NULL/UNKNOWN when no rows were produced.
2. Arg_comparator::compare_row() did not take into account that
a whole argument may be NULL rather than just individual scalar
values.
Before bug#34384 was fixed, the above problems were hidden
because an uninitialized (i.e. without any stored value) cached
object would appear as NULL for scalar values in a row subquery
returning an empty result. After the fix
Arg_comparator::compare_row() would try to evaluate
uninitialized cached objects.
Fixed by removing the aforementioned problems.
mysql-test/r/row.result:
Added a test case for bug #54190.
mysql-test/r/subselect.result:
Updated the result for a test relying on wrong behavior.
mysql-test/t/row.test:
Added a test case for bug #54190.
sql/item_cmpfunc.cc:
If either of the argument rows is NULL, return NULL as the
result of comparison.
sql/item_subselect.cc:
Adjust null_value for Item_singlerow_subselect depending on
whether a row has been produced by the row subquery.
result
Row subqueries producing no rows were not handled as UNKNOWN
values in row comparison expressions.
That was a result of the following two problems:
1. Item_singlerow_subselect did not mark the resulting row
value as NULL/UNKNOWN when no rows were produced.
2. Arg_comparator::compare_row() did not take into account that
a whole argument may be NULL rather than just individual scalar
values.
Before bug#34384 was fixed, the above problems were hidden
because an uninitialized (i.e. without any stored value) cached
object would appear as NULL for scalar values in a row subquery
returning an empty result. After the fix
Arg_comparator::compare_row() would try to evaluate
uninitialized cached objects.
Fixed by removing the aforementioned problems.
The EXISTS transformation has additional switches to catch the known corner
cases that appear when transforming an IN predicate into EXISTS. Guarded
conditions are used which are deactivated when a NULL value is seen in the
outer expression's row. When the inner query block supplies NULL values,
however, they are filtered out because no distinction is made between the
guarded conditions; guarded NOT x IS NULL conditions in the HAVING clause that
filter out NULL values cannot be de-activated in isolation from those that
match values or from the outer expression or NULL's.
The above problem is handled by making the guarded conditions remember whether
they have rejected a NULL value or not, and index access methods are taking
this into account as well.
The bug consisted of
1) Not resetting the property for every nested loop iteration on the inner
query's result.
2) Not propagating the NULL result properly from inner query to IN optimizer.
3) A hack that may or may not have been needed at some point. According to a
comment it was aimed to fix#2 by returning NULL when FALSE was actually
the result. This caused failures when #2 was properly fixed. The hack is
now removed.
The fix resolves all three points.
The EXISTS transformation has additional switches to catch the known corner
cases that appear when transforming an IN predicate into EXISTS. Guarded
conditions are used which are deactivated when a NULL value is seen in the
outer expression's row. When the inner query block supplies NULL values,
however, they are filtered out because no distinction is made between the
guarded conditions; guarded NOT x IS NULL conditions in the HAVING clause that
filter out NULL values cannot be de-activated in isolation from those that
match values or from the outer expression or NULL's.
The above problem is handled by making the guarded conditions remember whether
they have rejected a NULL value or not, and index access methods are taking
this into account as well.
The bug consisted of
1) Not resetting the property for every nested loop iteration on the inner
query's result.
2) Not propagating the NULL result properly from inner query to IN optimizer.
3) A hack that may or may not have been needed at some point. According to a
comment it was aimed to fix#2 by returning NULL when FALSE was actually
the result. This caused failures when #2 was properly fixed. The hack is
now removed.
The fix resolves all three points.
The bug is a result of the following change by Monty:
Revision Id: monty@askmonty.org-20100716073301-gstby2062nqd42qv
Timestamp: Fri 2010-07-16 10:33:01 +0300
Where Monty changed the queues interface and implementation.
The fix adjusts the queue_remove call to the new interface.
mysql-test/r/subselect_partial_match.result:
Added new file for tests related to MWL#89.
mysql-test/t/subselect_partial_match.test:
Added new file for tests related to MWL#89.
Step2 in the separation of the creation of IN->EXISTS equi-join conditions from
their injection. The goal of this separation is to make it possible that the
IN->EXISTS conditions can be used for cost estimation without actually modifying
the subquery.
This patch separates row_value_in_to_exists_transformer() into two methods:
- create_row_value_in_to_exists_cond(), and
- inject_row_value_in_to_exists_cond()
The patch performs minimal refactoring of the code so that it is easier to solve
problems resulting from the separation. There is a lot to be simplified in this
code, but this will be done separately.
Step1 in the separation of the creation of IN->EXISTS equi-join conditions from
their injection. The goal of this separation is to make it possible that the
IN->EXISTS conditions can be used for cost estimation without actually modifying
the subquery.
This patch separates single_value_in_to_exists_transformer() into two methods:
- create_single_value_in_to_exists_cond(), and
- inject_single_value_in_to_exists_cond()
The patch performs minimal refactoring of the code so that it is easier to solve
problems resulting from the separation. There is a lot to be simplified in this
code, but this will be done separately.
The issue was that we didn't always check result of ha_rnd_init() which caused a problem for handlers that returned an error in this code.
- Changed prototype of ha_rnd_init() to ensure that we get a compile warning if result is not checked.
- Added ha_rnd_init_with_error() that prints error on failure.
- Checked all usage of ha_rnd_init() and ensure we generate an error message on failures.
- Changed init_read_record() to return 1 on failure.
sql/create_options.cc:
Fixed wrong printf
sql/event_db_repository.cc:
Check result from init_read_record()
sql/events.cc:
Check result from init_read_record()
sql/filesort.cc:
Check result from ha_rnd_init()
sql/ha_partition.cc:
Check result from ha_rnd_init()
sql/ha_partition.h:
Fixed compiler warning
sql/handler.cc:
Added ha_rnd_init_with_error()
Check result from ha_rnd_init()
sql/handler.h:
Added ha_rnd_init_with_error()
Changed prototype of ha_rnd_init() to ensure that we get a compile warning if result is not checked
sql/item_subselect.cc:
Check result from ha_rnd_init()
sql/log.cc:
Check result from ha_rnd_init()
sql/log_event.cc:
Check result from ha_rnd_init()
sql/log_event_old.cc:
Check result from ha_rnd_init()
sql/mysql_priv.h:
init_read_record() now returns error code on failure
sql/opt_range.cc:
Check result from ha_rnd_init()
sql/records.cc:
init_read_record() now returns error code on failure
Check result from ha_rnd_init()
sql/sql_acl.cc:
Check result from init_read_record()
sql/sql_cursor.cc:
Print error if ha_rnd_init() fails
sql/sql_delete.cc:
Check result from init_read_record()
sql/sql_help.cc:
Check result from init_read_record()
sql/sql_plugin.cc:
Check result from init_read_record()
sql/sql_select.cc:
Check result from ha_rnd_init()
Print error if ha_rnd_init() fails.
sql/sql_servers.cc:
Check result from init_read_record()
sql/sql_table.cc:
Check result from init_read_record()
sql/sql_udf.cc:
Check result from init_read_record()
sql/sql_update.cc:
Check result from init_read_record()
storage/example/ha_example.cc:
Don't return error on rnd_init()
storage/ibmdb2i/ha_ibmdb2i.cc:
Removed not relevant comment
1. Changed the lazy optimization for subqueries that can be
materialized into bottom-up optimization during the optimization of
the main query.
The main change is implemented by the method
Item_in_subselect::setup_engine.
All other changes were required to correct problems resulting from
changing the order of optimization. Most of these problems followed
the same pattern - there are some shared structures between a
subquery and its parent query. Depending on which one is optimized
first (parent or child query), these shared strucutres may get
different values, thus resulting in an inconsistent query plan.
2. Changed the code-generation for subquery materialization to be
performed in runtime memory for each (re)execution, instead of in
statement memory (once per prepared statement).
- Item_in_subselect::setup_engine() no longer creates materialization
related objects in statement memory.
- Merged subselect_hash_sj_engine::init_permanent and
subselect_hash_sj_engine::init_runtime into
subselect_hash_sj_engine::init, which is called for each
(re)execution.
- Fixed deletion of the temp table accordingly.
mysql-test/r/subselect_mat.result:
Adjusted changed EXPLAIN because of earlier optimization of subqueries.
Fixed compiler warnings.
queues.h and queues.c are now based on the UNIREG code and thus made BSD.
Fix code to use new queue() interface. This mostly affects how you access elements in the queue.
If USE_NET_CLEAR is not set, don't clear connection from unexpected characters. This should give a speed up when doing a lot of fast queries.
Fixed some code in ma_ft_boolean_search.c that had not made it from myisam/ft_boolean_search.c
include/queues.h:
Use UNIREG code base (BSD)
Changed init_queue() to take all initialization arguments.
New interface to access elements in queue
include/thr_alarm.h:
Changed to use time_t instead of ulong (portability)
Added index_in_queue, to be able to remove random element from queue in O(1)
mysys/queues.c:
Use UNIREG code base (BSD)
init_queue() and reinit_queue() now takes more initialization arguments. (No need for init_queue_ex() anymore)
Now one can tell queue_insert() to store in the element a pointer to where element is in queue. This allows one to remove elements from queue in O(1) instead of O(N)
mysys/thr_alarm.c:
Use new option in queue() to allow fast removal of elements.
Do less inside LOCK_alarm mutex.
This should give a major speed up of thr_alarm usage when there is many threads
sql/create_options.cc:
Fixed wrong printf
sql/event_queue.cc:
Use new queue interface()
sql/filesort.cc:
Use new queue interface()
sql/ha_partition.cc:
Use new queue interface()
sql/ha_partition.h:
Fixed compiler warning
sql/item_cmpfunc.cc:
Fixed compiler warning
sql/item_subselect.cc:
Use new queue interface()
Removed not used variable
sql/net_serv.cc:
If USE_NET_CLEAR is not set, don't clear connection from unexpected characters.
This should give a speed up when doing a lot of fast queries at the disadvantage that if there is a bug in the client protocol the connection will be dropped instead of being unnoticed.
sql/opt_range.cc:
Use new queue interface()
Fixed compiler warnings
sql/uniques.cc:
Use new queue interface()
storage/maria/ma_ft_boolean_search.c:
Copy code from myisam/ft_boolean_search.c
Use new queue interface()
storage/maria/ma_ft_nlq_search.c:
Use new queue interface()
storage/maria/ma_sort.c:
Use new queue interface()
storage/maria/maria_pack.c:
Use new queue interface()
Use queue_fix() instead of own loop to fix queue.
storage/myisam/ft_boolean_search.c:
Use new queue interface()
storage/myisam/ft_nlq_search.c:
Use new queue interface()
storage/myisam/mi_test_all.sh:
Remove temporary file from last run
storage/myisam/myisampack.c:
Use new queue interface()
Use queue_fix() instead of own loop to fix queue.
storage/myisam/sort.c:
Use new queue interface()
storage/myisammrg/myrg_queue.c:
Use new queue interface()
storage/myisammrg/myrg_rnext.c:
Use new queue interface()
storage/myisammrg/myrg_rnext_same.c:
Use new queue interface()
storage/myisammrg/myrg_rprev.c:
Use new queue interface()
libmysqld/Makefile.am:
The new file added.
mysql-test/r/index_merge_myisam.result:
subquery_cache optimization option added.
mysql-test/r/myisam_mrr.result:
subquery_cache optimization option added.
mysql-test/r/subquery_cache.result:
The subquery cache tests added.
mysql-test/r/subselect3.result:
Subquery cache switched off to avoid changing read statistics.
mysql-test/r/subselect3_jcl6.result:
Subquery cache switched off to avoid changing read statistics.
mysql-test/r/subselect_no_mat.result:
subquery_cache optimization option added.
mysql-test/r/subselect_no_opts.result:
subquery_cache optimization option added.
mysql-test/r/subselect_no_semijoin.result:
subquery_cache optimization option added.
mysql-test/r/subselect_sj.result:
subquery_cache optimization option added.
mysql-test/r/subselect_sj_jcl6.result:
subquery_cache optimization option added.
mysql-test/t/subquery_cache.test:
The subquery cache tests added.
mysql-test/t/subselect3.test:
Subquery cache switched off to avoid changing read statistics.
sql/CMakeLists.txt:
The new file added.
sql/Makefile.am:
The new files added.
sql/item.cc:
Expression cache item (Item_cache_wrapper) added.
Item_ref and Item_field fixed for correct usage of result field and fast resolwing in SP.
sql/item.h:
Expression cache item (Item_cache_wrapper) added.
Item_ref and Item_field fixed for correct usage of result field and fast resolwing in SP.
sql/item_cmpfunc.cc:
Subquery cache added.
sql/item_cmpfunc.h:
Subquery cache added.
sql/item_subselect.cc:
Subquery cache added.
sql/item_subselect.h:
Subquery cache added.
sql/item_sum.cc:
Registration of subquery parameters added.
sql/mysql_priv.h:
subquery_cache optimization option added.
sql/mysqld.cc:
subquery_cache optimization option added.
sql/opt_range.cc:
Fix due to subquery cache.
sql/opt_subselect.cc:
Parameters of the function cahnged.
sql/procedure.h:
.h file guard added.
sql/sql_base.cc:
Registration of subquery parameters added.
sql/sql_class.cc:
Option to allow add indeces to temporary table.
sql/sql_class.h:
Item iterators added.
Option to allow add indeces to temporary table.
sql/sql_expression_cache.cc:
Expression cache for caching subqueries added.
sql/sql_expression_cache.h:
Expression cache for caching subqueries added.
sql/sql_lex.cc:
Registration of subquery parameters added.
sql/sql_lex.h:
Registration of subqueries and subquery parameters added.
sql/sql_select.cc:
Subquery cache added.
sql/sql_select.h:
Subquery cache added.
sql/sql_union.cc:
A new parameter to the function added.
sql/sql_update.cc:
A new parameter to the function added.
sql/table.cc:
Procedures to manage temporarty tables index added.
sql/table.h:
Procedures to manage temporarty tables index added.
storage/maria/ha_maria.cc:
Fix of handler to allow destoy a table in case of error during the table creation.
storage/maria/ha_maria.h:
.h file guard added.
storage/myisam/ha_myisam.cc:
Fix of handler to allow destoy a table in case of error during the table creation.
- Item_in_subselect::init_left_expr_cache() should not try to
guess whether the left expression is accessed "over the grouping operation"
(i.e. the subselect is evaluated after the grouping while the left_expr is
an Item_ref that wraps an expression from before the grouping). Instead,
let new_Cached_item not to try accessing item->real_item() when creating
left expr cache.