1
0
mirror of https://github.com/MariaDB/server.git synced 2025-07-27 18:02:13 +03:00

MDEV-28749: restore_prev_nj_state() doesn't update cur_sj_inner_tables correctly

(Try 2) (Cherry-pick back into 10.3)

The code that updates semi-join optimization state for a join order prefix
had several bugs. The visible effect was bad optimization for FirstMatch or
LooseScan strategies: they either weren't considered when they should have
been, or considered when they shouldn't have been.

In order to hit the bug, the optimizer needs to consider several different
join prefixes in a certain order. Queries with "obvious" query plans which
prune all join orders except one are not affected.

Internally, the bugs in updates of semi-join state were:
1. restore_prev_sj_state() assumed that
  "we assume remaining_tables doesnt contain @tab"
  which wasn't true.
2. Another bug in this function: it did remove bits from
   join->cur_sj_inner_tables but never added them.
3. greedy_search() adds tables into the join prefix but neglects to update
   the semi-join optimization state. (It does update nested outer join
   state, see this call:
     check_interleaving_with_nj(best_table)
   but there's no matching call to update the semi-join state.
   (This wasn't visible because most of the state is in the POSITION
    structure which is updated. But there is also state in JOIN, too)

The patch:
- Fixes all of the above
- Adds JOIN::dbug_verify_sj_inner_tables() which is used to verify the
  state is correct at every step.
- Renames advance_sj_state() to optimize_semi_joins().
  = Introduces update_sj_state() which ideally should have been called
    "advance_sj_state" but I didn't reuse the name to not create confusion.
This commit is contained in:
Sergei Petrunia
2022-06-06 22:21:22 +03:00
parent 392e744aec
commit 19c721631e
6 changed files with 118 additions and 33 deletions

View File

@ -7721,6 +7721,10 @@ choose_plan(JOIN *join, table_map join_tables)
{
choose_initial_table_order(join);
}
/*
Note: constant tables are already in the join prefix. We don't
put them into the cur_sj_inner_tables, though.
*/
join->cur_sj_inner_tables= 0;
if (straight_join)
@ -8023,8 +8027,8 @@ optimize_straight_join(JOIN *join, table_map join_tables)
read_time= COST_ADD(read_time,
COST_ADD(join->positions[idx].read_time,
record_count / (double) TIME_FOR_COMPARE));
advance_sj_state(join, join_tables, idx, &record_count, &read_time,
&loose_scan_pos);
optimize_semi_joins(join, join_tables, idx, &record_count, &read_time,
&loose_scan_pos);
join_tables&= ~(s->table->map);
double pushdown_cond_selectivity= 1.0;
@ -8201,6 +8205,12 @@ greedy_search(JOIN *join,
/* This has been already checked by best_extension_by_limited_search */
DBUG_ASSERT(!is_interleave_error);
/*
Also, update the semi-join optimization state. Information about the
picked semi-join operation is in best_pos->...picker, but we need to
update the global state in the JOIN object, too.
*/
update_sj_state(join, best_table, idx, remaining_tables);
/* find the position of 'best_table' in 'join->best_ref' */
best_idx= idx;
@ -8983,8 +8993,8 @@ best_extension_by_limited_search(JOIN *join,
current_record_count /
(double) TIME_FOR_COMPARE));
advance_sj_state(join, remaining_tables, idx, &current_record_count,
&current_read_time, &loose_scan_pos);
optimize_semi_joins(join, remaining_tables, idx, &current_record_count,
&current_read_time, &loose_scan_pos);
/* Expand only partial plans with lower cost than the best QEP so far */
if (current_read_time >= join->best_read)