splittable derived
If one of joined tables of the processed query is a materialized derived
table (or view or CTE) with GROUP BY clause then under some conditions it
can be subject to split optimization. With this optimization new equalities
are injected into the WHERE condition of the SELECT that specifies this
derived table. The injected equalities are generated for all join orders
with which the split optimization can employed. After the best join order
has been chosen only certain of this equalities are really needed. The
others can be safely removed. If it's not done and some of injected
equalities involve expressions over semi-joins with look-up access then
the query may return a wrong result set.
This patch effectively removes equalities injected for split optimization
that are needed only at the optimization stage and not needed for execution.
Approved by serg@mariadb.com
This bug could cause a crash for any query that used a derived table/view/CTE
whose specification was a SELECT with a GROUP BY clause and a FROM list
containing 32 or more table references.
The problem appeared only in the cases when the splitting optimization
could be applied to such derived table/view/CTE.
Do not materialize a semi-join nest if it contains a materialized derived
table /view that potentially can be subject to the split optimization.
Splitting of materialization of such nest would help, but currently there
is no code to support this technique.
If a splittable materialized derived table / view T is used in a inner nest
of an outer join with impossible ON condition then T is marked as a
constant table. Yet the execution plan to build T is still searched for
in spite of the fact that is not needed. So it should be set.
The function JOIN_TAB::choose_best_splitting() did not take into account
that for some tables whose fields were used in the GROUP BY list of
the specification of a splittable materialized derived there might exist
no elements in the array ext_keyuses_for_splitting.
The optimizer erroneously allowed to use join cache when joining a
splittable materialized table together with splitting optimization.
As a consequence in some rare cases the server returned wrong result
sets for queries with materialized derived.
This patch allows to use either join cache without usage of splitting
technique for materialization of a splittable derived table or splitting
without usage of join cache when joining such table. The costs the these
alternatives are compared and the best variant is chosen.
The bug was in the in the code of JOIN::check_for_splittable_materialized()
where the structures describing the fields of a materialized derived
table that potentially could be used in split optimization were build.
As a result of this bug some fields that were not usable for splitting
were detected as usable. This could trigger crashes further in
st_join_table::choose_best_splitting().
t1.pk IS NOT NULL where pk is a PRIMARY KEY
For equalites in the WHERE clause we create a keyuse array that contains the set of all equalities.
For each KEYUSE inside the keyuse array we have a field "null_rejecting"
which tells that the equality will not hold if either the left or right
hand side of the equality is NULL.
If the equality is NULL rejecting then we accordingly add a NOT NULL condition for the field present in
the item val(present in the KEYUSE struct) when we are doing ref access.
For the optimization of splitting with GROUP BY we always set the null_rejecting to TRUE and we are doing ref access on
the GROUP by field. This does create a problem when the equality is NOT NULL rejecting. This happens in this case as
in the equality we have the right hand side as t1.pk where pk is a PRIMARY KEY , hence it is NOT NULLABLE. So we should have
null rejecting set to FALSE for such a case.
The crash happened because JOIN::check_for_splittable_materialized()
called by mistake the function JOIN_TAB::is_inner_table_of_outer_join()
instead of the function TABLE_LIST::is_inner_table_of_outer_join().
The former cannot be called before the call of make_outerjoin_info().
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. change local/member variables to size_t
when appropriate.
This fix excludes rocksdb, spider,spider, sphinx and connect for now.
in JOIN::inject_best_splitting_cond
The value of SplM_opt_info::last_plan should be set to NULL
before any search for a splitting plan for a splittable
materialized table.
Do not try to apply the splitting optimization to a materialized
derived if it's specified by a select with impossible where
or if all joined tables of this select are constant.