It's better to prohibit pushdown of conditions that involve
regexp_substr() and regexp_replace() into materialized derived
tables / views until proper implementations of the get_copy()
virtual method are provided for those functions.
If the method SELECT_LEX::mark_const_derived() is called for a select
that is used in the specification of a materialized derived table / view D
then method should not set the flag fill_me for D on when
the flag JOIN::with_two_phase_optimization is set on for this select.
This patch corrects the code of the patch for mdev-13369 that
introduced the splitting technique when using materialized
derived tables / views with GROUP BY. The second actual parameters
of the call of the method JOIN::reoptimize() in the function
JOIN::push_splitting_cond_into_derived() was calculated incorrectly.
This could cause different failures for queries using derived tables
or views with GROUP BY when their FROM lists contained empty or
single-row tables.
Currently condition pushdown into materialized views / derived tables
is not implemented yet (see mdev-12387) and grouping views are
optimized early when subqueries are converted to semi-joins in
convert_join_subqueries_to_semijoins(). If a subquery that is converted
to a semi-join uses a grouping view this view is optimized in two phases.
For such a view V only the first phase of optimization is done after
the conversion of subqueries of the outer join into semi-joins.
At the same time the reference of the view V appears in the join
expression of the outer join. In fixed code there was an attempt to push
conditions into this view and to optimize it after this. This triggered
the second phase of the optimization of the view and it was done
prematurely. The second phase of the optimization for the materialized
view is supposed to be called after the splitting condition is pushed
into the view in the call of JOIN::improve_chosen_plan for the outer
join.
The fix blocks the attempt to push conditions into splittable views
if they have been already partly optimized and the following
optimization for them.
The test case of the patch shows that the code for mdev-13369
basically supported the splitting technique for materialized views /
derived tables.
The patch also replaces the name of the state JOIN::OPTIMIZATION_IN_STAGE_2
for JOIN::OPTIMIZATION_PHASE_1_DONE and fixes a bug in
TABLE_LIST::fetch_number_of_rows()
It allows to push conditions into derived with window functions not
only in the cases when the window specifications of these window
functions use the same partition, but also in the cases when the window
functions use partitions that share only some fields. In these
cases only the conditions over the common fields are pushed.
with window functions (mdev-10855).
This patch just modified the function pushdown_cond_for_derived()
to support this feature.
Some test cases demonstrating this optimization were added to
derived_cond_pushdown.test.
"Optimization for equi-joins of derived tables with GROUP BY"
should be considered rather as a 'proof of concept'.
The task itself is targeted at an optimization that employs re-writing
equi-joins with grouping derived tables / views into lateral
derived tables. Here's an example of such transformation:
select t1.a,t.max,t.min
from t1 [left] join
(select a, max(t2.b) max, min(t2.b) min from t2
group by t2.a) as t
on t1.a=t.a;
=>
select t1.a,tl.max,tl.min
from t1 [left] join
lateral (select a, max(t2.b) max, min(t2.b) min from t2
where t1.a=t2.a) as t
on 1=1;
The transformation pushes the equi-join condition t1.a=t.a into the
derived table making it dependent on table t1. It means that for
every row from t1 a new derived table must be filled out. However
the size of any of these derived tables is just a fraction of the
original derived table t. One could say that transformation 'splits'
the rows used for the GROUP BY operation into separate groups
performing aggregation for a group only in the case when there is
a match for the current row of t1.
Apparently the transformation may produce a query with a better
performance only in the case when
- the GROUP BY list refers only to fields returned by the derived table
- there is an index I on one of the tables T used in FROM list of
the specification of the derived table whose prefix covers the
the fields from the proper beginning of the GROUP BY list or
fields that are equal to those fields.
Whether the result of the re-writing can be executed faster depends
on many factors:
- the size of the original derived table
- the size of the table T
- whether the index I is clustering for table T
- whether the index I fully covers the GROUP BY list.
This patch only tries to improve the chosen execution plan using
this transformation. It tries to do it only when the chosen
plan reaches the derived table by a key whose prefix covers
all the fields of the derived table produced by the fields of
the table T from the GROUP BY list.
The code of the patch does not evaluates the cost of the improved
plan. If certain conditions are met the transformation is applied.
When an equality that can be pushed into a materialized derived
table / view is extracted from multiple equalities and their
operands are cloned then if they have some pointers to Item_equal
objects those pointers must be set to NULL in the clones. Anyway
they are not valid in the pushed predicates.
This patch fills in a serious flaw in the
code that supports condition pushdown into
materialized views / derived tables.
If a predicate happened to contain a reference
to a mergeable view / derived table and it does
not depended directly on the target materialized
view / derived table then the predicate was not
considered as a subject to pusdown to this view
/ derived table.
The fields st_select_lex::cond_pushed_into_where and
st_select_lex::cond_pushed_into_having should be re-initialized
for the unit specifying a derived table at every re-execution
of the query that uses this derived table, because the result
of condition pushdown may be different for different executions.
The fix for bug mdev-11488 introduced the virtual method
convert_to_basic_const_item for the class Item_cache.
The implementation of this method for the class Item_cache_str
was not quite correct: the server could crash if the cached item
was null.
A similar problem could appear for the implementation of
this method for the class Item_cache_decimal. Although I could not
reproduce the problem I decided to change the code appropriately.
When a condition containing NULLIF is pushed into a materialized
view/derived table the clone of the Item_func_nullif item must
be processed in a special way to guarantee that the first argument
points to the same item as the third argument.
The patch for bug mdev-10882 tried to fix it by providing an
implementation of the virtual method build_clone for the class
Item_cache. It's turned out that it is not easy provide a valid
implementation for Item_cache::build_clone(). At the same time
if the condition that can be pushed into a materialized view
contains a cached item this item can be substituted for a basic
constant of the same value. In such a way we can avoid building
proper clones for Item_cache objects when constructing pushdown
conditions.
There were no implementations for the virtual functions
exclusive_dependence_on_table_processor and
exclusive_dependence_on_table_processor. As a result
the procedure pushdown_cond_for_derived erroneously
detected some conditions with outer references as pushable
into materialized view / derived table.
The class Item_func_nop_all missed an implementation
of the virtual method get_copy.
As a result if the condition that can be pushed into
into a materialized view / derived table contained
an ANY subselect then the pushdown condition was built
incorrectly.
In a general case the conditions with outer fields cannot
be pushed into materialized views / derived tables.
However if the outer field in the condition refers to a
single row table then the condition may be pushable.
In this case a special care should be taken for outer
fields when pushing the condition into a materialized view /
derived table.
If a materialized derived table / view is specified by a unit
with SELECTs containing ORDER BY ... LIMIT then condition pushdown
cannot be done for these SELECTs.
If a materialized derived table / view is specified by a unit
containing global ORDER BY ... LIMIT then condition pushdown
cannot be done for this unit.
The condition pushed into WHERE/HAVING of a materialized
view/derived table may differ for different executions of
the same prepared statement. That's why the should be
ANDed with the existing WHERE/HAVING conditions only after all
permanent transformations of these conditions has been
performed.
This bug in the code of Item_ref::build_clone could
cause corruption of items in where conditions.
Also made sure that equality predicates extracted
from multiple equality items to be pushed into
materialized views were cloned.
for materialized views and derived tables: there were no
push-down if the view was defined as union of selects
without aggregation. Added test cases with such unions.
Adjusted result files after the merge of the code for mdev-9197.