Part of:
MDEV-28073 Slow query performance in MariaDB when using many tables
s->key_dependent has a list of tables that are compared with key fields
in the current table. However it does not take into account if a key
field could be resolved by another table.
This is because MariaDB expands 'join_tab->keyuse' to include all generated
comparisons.
For example:
SELECT * from t1,t2,t3 where t1.key=t2.key and t2.key=t3.key
In this case keyuse for t1 includes t2.key and t3.key and key_dependent
contains 't2.map | t3.map'
If we in best_extension_by_limited_search() consider t2,t1 then t1's
key is fully defined, but we cannot do any prune of plans as
s->key_dependent indicates that t3 is still needed.
Fixed by calculating in best_access_patch the current key_dependent map
of tables that is needed to satisfy all keys. This allows us to prune
more bad plans earlier as soon as all keys can be used.
We also set key_dependent to 0 if we found an EQ_REF key, as this an
optimal key for the table and there is no reason to check more keys.
best_extension_by_limited_search() assumes that tables should be sorted
according to size to be able to quickly disregard bad plans. However the
current usage of swap_variables() will change the table order to a not
sorted one for the next recursive call. This breaks the assumtion and
causes performance issues when using many tables (we have to examine
many more plans).
This patch fixes this by ensuring that the original table order is kept
for the not yet used tables when best_extension_by_limited_search() is
called.
This was done by always calling swap_variables() for each table and
restoring the original table order at exit.
Some test changed:
- In a majority of the test the change was that two "identical tables"
where swapped and the optimzer is now using the first/smaller table
- In few test the table order was changed. The new plan looks identical
or slighly better than the original.
(Try 2)
The code that updates semi-join optimization state for a join order prefix
had several bugs. The visible effect was bad optimization for FirstMatch or
LooseScan strategies: they either weren't considered when they should have
been, or considered when they shouldn't have been.
In order to hit the bug, the optimizer needs to consider several different
join prefixes in a certain order. Queries with "obvious" query plans which
prune all join orders except one are not affected.
Internally, the bugs in updates of semi-join state were:
1. restore_prev_sj_state() assumed that
"we assume remaining_tables doesnt contain @tab"
which wasn't true.
2. Another bug in this function: it did remove bits from
join->cur_sj_inner_tables but never added them.
3. greedy_search() adds tables into the join prefix but neglects to update
the semi-join optimization state. (It does update nested outer join
state, see this call:
check_interleaving_with_nj(best_table)
but there's no matching call to update the semi-join state.
(This wasn't visible because most of the state is in the POSITION
structure which is updated. But there is also state in JOIN, too)
The patch:
- Fixes all of the above
- Adds JOIN::dbug_verify_sj_inner_tables() which is used to verify the
state is correct at every step.
- Renames advance_sj_state() to optimize_semi_joins().
= Introduces update_sj_state() which ideally should have been called
"advance_sj_state" but I didn't reuse the name to not create confusion.
Main fix was replacing read_time+= with read_time
I also did updated the 'identical' code in optimize_straight_join) and
best_extension_by_limited_search() to make them eaiser to compare.
Reviewer: Sergei Petrunia <sergey@mariadb.com>
(Try 2) (Cherry-pick back into 10.3)
The code that updates semi-join optimization state for a join order prefix
had several bugs. The visible effect was bad optimization for FirstMatch or
LooseScan strategies: they either weren't considered when they should have
been, or considered when they shouldn't have been.
In order to hit the bug, the optimizer needs to consider several different
join prefixes in a certain order. Queries with "obvious" query plans which
prune all join orders except one are not affected.
Internally, the bugs in updates of semi-join state were:
1. restore_prev_sj_state() assumed that
"we assume remaining_tables doesnt contain @tab"
which wasn't true.
2. Another bug in this function: it did remove bits from
join->cur_sj_inner_tables but never added them.
3. greedy_search() adds tables into the join prefix but neglects to update
the semi-join optimization state. (It does update nested outer join
state, see this call:
check_interleaving_with_nj(best_table)
but there's no matching call to update the semi-join state.
(This wasn't visible because most of the state is in the POSITION
structure which is updated. But there is also state in JOIN, too)
The patch:
- Fixes all of the above
- Adds JOIN::dbug_verify_sj_inner_tables() which is used to verify the
state is correct at every step.
- Renames advance_sj_state() to optimize_semi_joins().
= Introduces update_sj_state() which ideally should have been called
"advance_sj_state" but I didn't reuse the name to not create confusion.
The warning comes from copying POSITION objects where 'type' is not
initialized. This does not affect any production code as when 'type'
was compared/used, it was always initialized.
Removed by initializing the type variable in the constructor
- In best_extension_by_limited_search(), do not check for
"(remaining_tables & real_table_bit)", it is guaranteed to be true.
Make it an assert.
- In (!idx || check_interleaving_with_nj())", remove the !idx part.
This check made sense only in the original version of this function.
- "micro optimization" in check_interleaving_with_nj().
The issue was that best_extension_by_limited_search() had to go through
too many plans with the same cost as there where many EQ_REF tables.
Fixed by shortcutting EQ_REF (AND REF) when the result only contains one
row. This got the optimization time down from hours to sub seconds.
The only known downside with this patch is that in some cases a table
with ref and 1 record may be used before on EQ_REF table. The faster
optimzation phase should compensate for this.
Fixed failing main.default on Windows
(to trigger an assert the test needed a debug build without
safemalloc, as 0xa5 happened to have the important bit set "correctly")
(This is the assert that was added in fix for MDEV-26047)
Table elimination may remove an ON expression from an outer join.
However SELECT_LEX::update_used_tables() will still call
item->walk(&Item::eval_not_null_tables)
for eliminated expressions. If the subquery is constant and cheap
Item_cond_and will attempt to evaluate it, which will trigger an
assert.
The fix is not to call update_used_tables() or eval_not_null_tables()
for ON expressions that were eliminated.
This reverts commit 5ba77222e9
but keeps the test. A different fix for
MDEV-21028 Server crashes in Query_arena::set_query_arena upon SELECT from view
internal temporary tables should use THD as expr_area
The cause of the bug is overflow of uint16 KEY_PART_INFO::length and/or
uint16 KEY_PART_INFO::store_length. The solution is to increase the size
of those variables to the 'uint' type (which is 32-bit long)
If JOIN::create_postjoin_aggr_table encounters errors during execution
then free_tmp_table() is then called twice for JOIN_TAB::aggr.
The solution is to initialize JOIN_TAB::aggr only on successful completion
of JOIN::create_postjoin_aggr_table
Second execution of a prepared statement for a query containing a constant
subquery with union that can be optimized away, could result in server abnormal
termination for debug build or incorrect result set output for release build.
For example, the following test case crashes a server built with debug on second
run of the statement EXECUTE stmt
CREATE TABLE t1 (a INT);
PREPARE stmt FROM 'EXPLAIN SELECT * FROM t1 HAVING 6 IN ( SELECT 6 UNION SELECT 5 )';
EXECUTE stmt;
EXECUTE stmt;
The reason for incorrect result set output or abnormal server termination
is careless working with the data member fake_select_lex->options inside
the function mysql_explain_union(). Once the flag SELECT_DESCRIBE is set in
the data member fake_select_lex->option before calling the methods
SELECT_LEX_UNIT::prepare/SELECT_LEX_UNIT::execute
the original value of the option is no longer restored.
As a consequence, next time the prepared statement is re-executed we have
the fake_select_lex with the flag SELECT_DESCRIBE set in the data member
fake_select_lex->option, that is incorrect. In result, the method
Item_subselect::assigned()
is not invoked during evaluation of a constant condition (constant subquery
with union) that being performed on OPTIMIZE phase of query handling.
This leads to the fact that records in the temporary table are not deleted
before calling
table->file->ha_enable_indexes(HA_KEY_SWITCH_ALL)
in the method st_select_lex_unit::optimize().
In result table->file->ha_enable_indexes(HA_KEY_SWITCH_ALL) returns error
and DBUG_ASSERT(0) is fired.
Stack trace to the line where the error generated on re-enabling indexes
for next subselect iteration is below:
st_select_lex_unit::optimize (at sql_union.cc:954)
handler::ha_enable_indexes (at handler.cc:4338)
ha_heap::enable_indexes (at ha_heap.cc:519)
heap_enable_indexes (at hp_clear.c:164)
The code snippet to clarify raising the error is also listed:
int heap_enable_indexes(HP_INFO *info)
{
int error= 0;
HP_SHARE *share= info->s;
if (share->data_length || share->index_length)
error= HA_ERR_CRASHED; <<== set error the value HA_ERR_CRASHED
since share->data_length != 0
To fix this issue the original value of unit->fake_select_lex->options
has to be saved before setting the flag SELECT_DESCRIBE and restored
on return from invocation of SELECT_LEX_UNIT::prepare/SELECT_LEX_UNIT::execute
The problem was that "group_min_max optimization" does not work if
some aggregate functions, like COUNT(*), is used.
The function get_best_group_min_max() is using the join->sum_funcs
array to check which aggregate functions are used.
The bug was that aggregates in HAVING where not yet added to
join->sum_funcs at the time get_best_group_min_max() was called.
Fixed by populate join->sum_funcs already in prepare, which means that
all sum functions will be in join->sum_funcs in get_best_group_min_max().
A benefit of this approach is that we can remove several calls to
make_sum_func_list() from the code and simplify the function.
I removed some wrong setting of 'sort_and_group'.
This variable is set when alloc_group_fields() is called, as part
of allocating the cache needed by end_send_group() and does not need
to be set by other functions.
One problematic thing was that Spider is using *join->sum_funcs to detect
at which stage the optimizer is and do internal calculations of aggregate
functions. Updating join->sum_funcs early caused Spider to fail when trying
to find min/max values in opt_sum_query().
Fixed by temporarily resetting sum_funcs during opt_sum_query().
Reviewer: Sergei Petrunia