The problem was the mysql_derived_prepare() did not correctly set
'distinct' when creating a temporary derivated table.
Fixed by separating checking for distinct for queries with and without
UNION.
Other things:
- Fixed bug in generate_derived_keys_for_table() where we set the wrong
bit for join_tab->keys
- Cleaned up JOIN::drop_unused_derived_keys()
- Changed TABLE::use_index() to keep unique keys and update
share->key_parts
Author: Sergei Petrunia <sergey@mariadb.com>, monty@mariadb.org
The main difference in code path between EQ_REF and REF is that for
REF we have to do an extra read_next on the index to check that there
is no more matching rows.
Before this patch we added a preference of EQ_REF by ensuring that REF
would always estimate to find at least 2 rows.
This patch adds the cost of the extra key read_next to REF access and
removes the code that limited REF to at least 2 rows. For some queries
this can have a big effect as the total estimated rows will be halved
for each REF table with 1 rows.
multi_range cost calculations are also changed to take into account
the difference between EQ_REF and REF.
The effect of the patch to the test suite:
- About 80 test case changed
- Almost all changes where for EXPLAIN where estimated rows for REF
where changed from 2 to 1.
- A few test cases using explain extended had a change of 'filtered'.
This is because of the estimated rows are now closer to the
calculated selectivity.
- A very few test had a change of table order.
This is because the change of estimated rows from 2 to 1 or the small
cost change for REF
(main.subselect_sj_jcl6, main.group_by, main.dervied_cond_pushdown,
main.distinct, main.join_nested, main.order_by, main.join_cache)
- No key statistics and the estimated rows are now smaller which cased
estimated filtering to be lower.
(main.subselect_sj_mat)
- The number of total rows are halved.
(main.derived_cond_pushdown)
- Plans with 1 row changed to use RANGE instead of REF.
(main.group_min_max)
- ALL changed to REF
(main.key_diff)
- Key changed from ref + index_only to PRIMARY key for InnoDB, as
OPTIMIZER_ROW_LOOKUP_COST + OPTIMIZER_ROW_NEXT_FIND_COST is smaller than
OPTIMIZER_KEY_LOOKUP_COST + OPTIMIZER_KEY_NEXT_FIND_COST.
(main.join_outer_innodb)
- Cost changes printouts
(main.opt_trace*)
- Result order change
(innodb_gis.rtree)
If a derived table has SELECT DISTINCT, provide index statistics for it so that the join optimizer in the
upper select knows that ref access to the table will produce one row.