1
0
mirror of https://github.com/MariaDB/server.git synced 2025-07-29 05:21:33 +03:00

Improve pruning in greedy_search by sorting tables during search

MDEV-28073 Slow query performance in MariaDB when using many tables

The faster we can find a good query plan, the more options we have for
finding and pruning (ignoring) bad plans.

This patch adds sorting of plans to best_extension_by_limited_search().
The plans, from best_access_path() are sorted according to the numbers
of found rows.  This allows us to faster find 'good tables' and we are
thus able to eliminate 'bad plans' faster.

One side effect of this patch is that if two tables have equal cost,
the table that which was used earlier in the query is preferred.
This allows users to improve plans by reordering eq_ref tables in the
order they would like them to be uses.

Result changes caused by the patch:
- Traces are different as now we print the cost for using tables before
  we start considering them in the plan.
- Table order are changed for some plans. In most cases this is because
  the plans are equal and tables are in this case sorted according to
  their usage in the original query.
- A few plans was changed as the optimizer was able to find a better
  plan (that was pruned by the original code).

Other things:

- Added a new statistic variable: "optimizer_join_prefixes_check_calls",
  which counts number of calls to best_extension_by_limited_search().
  This can be used to check the prune efficiency in greedy_search().
- Added variable "JOIN_TAB::embedded_dependent" to be able to handle
  XX IN (SELECT..) in the greedy_optimizer.  The idea is that we
  should prune a table if any of the tables in embedded_dependent is
  not yet read.
- When using many tables in a query, there will be some additional
  memory usage as we need to pre-allocate table of
  table_count*table_count*sizeof(POSITION) objects (POSITION is 312
  bytes for now) to hold the pre-calculated best_access_path()
  information.  This memory usage is offset by the expected
  performance improvement when using many tables in a query.
- Removed the code from an earlier patch to keep the table order in
  join->best_ref in the original order.  This is not needed anymore as we
  are now sorting the tables for each best_extension_by_limited_search()
  call.
This commit is contained in:
Monty
2022-05-31 17:36:32 +03:00
committed by Oleg Smirnov
parent 213772f99a
commit b3c74bdc1f
33 changed files with 2816 additions and 1591 deletions

View File

@ -637,6 +637,7 @@ SELECT t0.a,t0.b,t1.a,t1.b,t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,
(t8.b=t9.b OR t8.c IS NULL) AND
(t9.a=1);
--sorted_result
SELECT t0.a,t0.b,t1.a,t1.b,t2.a,t2.b,t3.a,t3.b,t4.a,t4.b,
t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b,t9.a,t9.b
FROM t0,t1