At the second execution of the PS
1. mark_as_dependent() is called with the same parameters as at the first
execution (select#4 and select#3)
2. as outer_select (select#3) has been already merged at the first
execution of PS it cannot be reached using the outer_select() function
anymore (and so can not stop iteration).
3. as a result all selects towards the top level select including the
select for 'ca' are marked as uncacheable.
4. Marked uncacheable it executed incorrectly triggering filling its
temporary table several times and using freed memory at the end.
To avoid the problem we use name resolution context to go "up".
NOTE: problem also exists in 10.2 but has no visible effect on execution.
That is why the problem is fixed in 10.2.
The patch also add debug logging of important procedures and
better specify parameters types of st_select_lex::mark_as_dependent.
derived table / view by equality
Now rows of a materialized derived table are always put into a
temporary table before join operation. If BNLH is used to join this
table with the result of a partial join then both operands of the
join are actually put into main memory. In most cases this is not
efficient.
We could avoid this by sending the rows of the derived table directly
to the join operation. However this kind of data flow is not supported
yet.
Fixed by not allowing usage of hash join algorithm to join a materialized
derived table if it's joined by an equality predicate of the form
f=e where f is a field of the derived table.
(Backport to 5.3)
(Attempt #2)
- Don't attempt to use BKA for materialized derived tables. The
table is neither filled nor fully opened yet, so attempt to
call handler->multi_range_read_info() causes crash.
- TABLE::create_key_part_by_field() should not set PART_KEY_FLAG in field->flags
= The reason is that it is used by hash join code which calls it to create a hash
table lookup structure. It doesn't create a real index.
= Another caller of the function is TABLE::add_tmp_key(). Made it to set the flag itself.
- The differences in join_cache.result could also be observed before this patch: one
could put "FLUSH TABLES" before the queries and get exactly the same difference.
(Attempt #2)
- Don't attempt to use BKA for materialized derived tables. The
table is neither filled nor fully opened yet, so attempt to
call handler->multi_range_read_info() causes crash.
Analysis:
The fix for lp:944706 introduces early subquery optimization.
While a subquery is being optimized some of its predicates may be
removed. In the test case, the EXISTS subquery is constant, and is
evaluated to TRUE. As a result the whole OR is TRUE, and thus the
correlated condition "b = alias1.b" is optimized away. The subquery
becomes non-correlated.
The subquery cache is designed to work only for correlated subqueries.
If constant subquery optimization is disallowed, then the constant
subquery is not evaluated, the subquery remains correlated, and its
execution is cached. As a result execution is fast.
However, when the constant subquery was optimized away, it was neither
cached by the subquery cache, nor it was cached by the internal subquery
caching. The latter was due to the fact that the subquery still appeared
as correlated to the subselect_XYZ_engine::exec methods, and they
re-executed the subquery on each call to Item_subselect::exec.
Solution:
The solution is to update the correlated status of the subquery after it has
been optimized. This status consists of:
- st_select_lex::is_correlated
- Item_subselect::is_correlated
- SELECT_LEX::uncacheable
- SELECT_LEX_UNIT::uncacheable
The status is updated by st_select_lex::update_correlated_cache(), and its
caller st_select_lex::optimize_unflattened_subqueries. The solution relies
on the fact that the optimizer already called
st_select_lex::update_used_tables() for each subquery. This allows to
efficiently update the correlated status of each subquery without walking
the whole subquery tree.
Notice that his patch is an improvement over MySQL 5.6 and older, where
subqueries are not pre-optimized, and the above analysis is not possible.
Problem was that now we can merge derived table (subquery in the FROM clause).
Fix: in case of detected conflict and presence of derived table "over" the table which cased the conflict - try materialization strategy.