1
0
mirror of https://github.com/MariaDB/server.git synced 2025-07-29 05:21:33 +03:00

MDEV-31356: Range cost calculations does not take into account join_buffer

This patch also fixes
MDEV-31391 Assertion `((best.records_out) == 0.0 ... failed

Cost changes caused by this change:
- range queries with join buffer now have a notable smaller cost.
- range ranges are bit more expensive as the MULTI_RANGE_COST is now
  properly applied to it in all cases (this extra cost is equal to a
  key lookup).
- table scan cost is slight smaller as we now assume data is cached in
  the engine after the first scan pass. (We did this before for range
  scans and other access methods).
- partition tables had wrong values for max_row_blocks and
  max_index_blocks.  Correcting this, causes range access on
  partitioned tables to have slightly higher cost because of the
  increased estimated IO.
- Using first match + join buffer caused 'filtered' to be calcualted
  wrong.  (Only affected EXPLAIN, not query costs).
- Added cost_without_join_buffer to optimizer_trace.
- check_quick_select() adjusted the number of rows according to persistent
  statistics, but did not adjust cost. Now fixed.

The big change in the patch are:

- In best_access_path(), where we now are using storing the cost in
  'ALL_READ_COST cost' and only converting it to a double at the end.
   This allows us to more exactly calculate the effect of the join_cache.
- In JOIN_TAB::estimate_scan_time(), store the cost also in a
  ALL_READ_COST object.

One of effect if this change is that when joining very small tables:

t1    some_access_method
t2    range
t3    ALL         Use join buffer

This is swiched to

t1      some_access_method
t3      ALL
t2      range      use join buffer

Both plans has the same cost, but as table scan in this case has less
cost than rang, the table scan will be considered first and thus have
precidence.

Test case changes:
- optimizer_trace          - Addition of cost_without_join_buffer
- subselect_mat_cost_bugs  - Small tables and scan versus range
- range & range_mrr_icp    - Range + join_cache is faster than ref
- optimizer_trace          - cost_without_join_buffer, smaller scan cost,
                             range setup cost.
- mrr                      - range+join_buffer used as smaller cost
This commit is contained in:
Monty
2023-05-26 17:26:42 +03:00
parent 7079386587
commit 07b02ab40e
23 changed files with 444 additions and 195 deletions

View File

@ -46,3 +46,23 @@ SELECT * FROM information_schema.TABLES
# Cleanup
DROP VIEW v;
DROP TABLE t1, t2, t3, t4, t5, t1000, t2000;
--echo #
--echo # MDEV-31391 Assertion `((best.records_out) == 0.0 &&
--echo # (best.records) == 0.0) ||
--echo # (best.records_out)/(best.records) < 1.0000001' failed
--echo #
CREATE TABLE t1 (a INT, b INT, PRIMARY KEY(a), KEY(b)) ENGINE=Aria;
INSERT INTO t1 VALUES
(1,13),(2,22),(3,8),(4,88),(5,6),(7,21),(9,64),(10,14),(11,15),(12,8),
(6,20),(8,39),(13,0),(14,3),(15,54),(16,85),(17,1),(18,1),(19,0),(20,0);
CREATE TABLE t2 (c INT) ENGINE=Aria;
INSERT INTO t2 VALUES (1),(2),(3);
EXPLAIN SELECT a FROM t1 JOIN t2 WHERE a = b AND c <> 7 GROUP BY a HAVING a != 6 AND a <= 9;
SELECT a FROM t1 JOIN t2 WHERE a = b AND c <> 7 GROUP BY a HAVING a != 6 AND a <= 9;
DROP TABLE t1, t2;
--echo #
--echo # End of 11.0 tests
--echo #