eliminate_item_equal() uses quick_fix_field() for Item objects it creates.
It computes some of their attributes on its own (see update_used_tables()
call) but it doesn't update not_null_tables_cache.
Recompute not_null_tables_cache also. Not computing it is currently
harmless, except for producing MSAN error when some other code
propagates the wrong value of not_null_tables_cache to other item.
In MariaDB up to 10.11, the test_if_cheaper_ordering() code (that tries
to optimizer how GROUP BY is executed) assumes that if a table scan is used
then if there is any index usable by GROUP BY it will be used.
The reason MySQL 10.4 provides a better plan is because of two differences:
- Plans using 'ref' has a cost of 1/10 of what it should be (as a
protection against table scans). This is why 'ref' is used in 10.4
and not in 10.5.
- When 'ref' is used, then GROUP BY will not use an index for GROUP BY.
In MariaDB 10.5 the chosen plan is a table scan (as it calculated to be
faster) but as 'ref' is not used, the test_if_cheaper_ordering()
optimizer phase decides (as ref is not usd) to use an index for GROUP BY,
which has bad performance.
Description of fix:
- All new code is protected by the "optimizer_adjust_secondary_key_costs"
variable, which is now a bit map, and is only executed if the option
"disable_forced_index_in_group_by" set.
- Corrects GROUP BY handling in test_if_cheaper_ordering() by making
the choise of using and index with GROUP BY cost based instead of rule
based.
- Adds TIME_FOR_COMPARE to all costs, when using group by, to make
read_time, index_scan_time and range_cost comparable.
Other things:
- Made optimizer_adjust_secondary_key_costs a bit map (compatible with old
code).
Notes:
Current code ignores costs for the algorithm used when doing GROUP
BY on the first table:
- Create an in-memory temporary table for handling group by and doing a
filesort of the result file
We can probably in 10.6 continue to ignore this cost.
This patch should NOT be merged to 11.0 series (not needed in 11.0).
When executing a statement of the form
SELECT AGGR_FN(DISTINCT c1, c2,..,cn) FROM t1,
where AGGR_FN is an aggregate function such as COUNT(), AVG() or SUM(),
and a unique index exists on table t1 covering some or all of the
columns (c1, c2,..,cn), the retrieved values are inherently unique.
Consequently, the need for de-duplication imposed by the DISTINCT
clause can be eliminated, leading to optimization of aggregation
operations.
This optimization applies under the following conditions:
- only one table involved in the join (not counting const tables)
- some arguments of the aggregate function are fields
(not functions/subqueries)
This optimization extends to queries of the form
SELECT AGGR_FN(c1, c2,..,cn) GROUP BY cx,..cy
when a unique index covers some or all of the columns
(c1, c2,..cn, cx,..cy)
This patch fixes the issue with passing the DEFAULT or IGNORE values to
positional parameters for some kind of SQL statements to be executed
as prepared statements.
The main idea of the patch is to associate an actual value being passed
by the USING clause with the positional parameter represented by
the Item_param class. Such association must be performed on execution of
UPDATE statement in PS/SP mode. Other corner cases that results in
server crash is on handling CREATE TABLE when positional parameter
placed after the DEFAULT clause or CALL statement and passing either
the value DEFAULT or IGNORE as an actual value for the positional parameter.
This case is fixed by checking whether an error is set in diagnostics
area at the function pack_vcols() on return from the function pack_expression()
optimizer-adjust_secondary_key_costs is added to provide 2 small
adjustments to the 10.x optimizer cost model. This can be used in the
case where the optimizer wrongly uses a secondary key instead of a
clustered primary key.
The reason behind this change is that MariaDB 10.x does not take into
account that for engines like InnoDB, that scanning a primary key can be
up to 7x faster than scanning a secondary key + read the row data trough
the primary key.
The different values for optimizer_adjust_secondary_key_costs are:
optimizer_adjust_secondary_key_costs=0
- No changes to current model
optimizer_adjust_secondary_key_costs=1
- Ensure that the cost of of secondary indexes has a cost of at
least 5x times the cost of a clustered primary key (if one exists).
This disables part of the worst_seek optimization described below.
optimizer_adjust_secondary_key_costs=2
- Disable "worst_seek optimization" and adjust filter cost slightly
(add cost of 1 if filter is used).
The idea behind 'worst_seek optimization' is that we limit the
cost for all non clustered ref access to the least of:
- best-rows-by-range (or all rows in no range found) / 10
- scan-time-table (roughly number of file blocks to scan table) * 3
In addition we also do not try to use rowid_filter if number of rows
estimated for 'ref' access is less than the worst_seek limitation.
The idea is that worst_seek is trying to take into account that if
we do a lot of accesses through a key, this is likely to be cached.
However it only does this for secondary keys, and not for clustered
keys or index only reads.
The effect of the worst_seek are:
- In some cases 'ref' will have a much lower cost than range or using
a clustered key.
- Some possible rowid filters for secondary keys will be ignored.
When implementing optimizer_adjust_secondary_key_costs=2, I noticed
that there is a slightly different costs for how ref+filter and
range+filter are calculated. This caused a lot of range and
range+filter to change to ref+filter, which is not good as
range+filter provides the optimizer a better estimate of how many
accepted rows there will be in the result set.
Adding a extra small cost (1 seek) when using filter mitigated the
above problems in almost all cases.
This patch should not be applied to MariaDB 11.0 as worst_seeks is
removed in 11.0 and the cost calculation for clustered keys, secondary
keys, index scan and filter is more exact.
Test case changes for --optimizer-adjust_secondary_key_costs=1
(Fix secondary key costs to be 5x of primary key):
- stat_tables_innodb:
- Complex change (probably ok as number of rows are really small)
- ref over 1 row changed to range over 10 rows with join buffer
- ref over 5 rows changed to eq_ref
- secondary ref over 1 row changed to ref of primary key over 4 rows
- Change of key to use longer key with index pushdown (a little
bit worse but not significant).
- Change to use secondary (1 row) -> primary (4 rows)
- rowid_filter_innodb:
- index_merge (2 rows) & ref (1) -> all (23 rows) -> primary eq_ref.
Test case changes for --optimizer-adjust_secondary_key_costs=2
(remove of worst_seeks & adjust filter cost):
- stat_tables_innodb:
- Join order change (probably ok as number of rows are really small)
- ref (5 rows) & ref(1 row) changed to range (10 rows & join buffer)
& eq_ref.
- selectivity_innodb:
- ref -> ref|filter (ok)
- rowid_filter_innodb:
- ref -> ref|filter (ok)
- range|filter (64 rows) changed to ref|filter (128 rows).
ok as ref|filter outputs wrong number of rows in explain.
- range, range_mrr_icp:
-ref (500 rows -> ALL (1000 rows) (ok)
- select_pkeycache, select, select_jcl6:
- ref|filter (2 rows) -> ref (2 rows) (ok)
- selectivity:
- ref -> ref_filter (ok)
- range:
- Change of 'filtered' but no stat or plan change (ok)
- selectivity:
- ref -> ref+filter (ok)
- Change of filtered but no plan change (ok)
- join_nested_jcl6:
- range -> ref|filter (ok as only 2 rows)
- subselect3, subselect3_jcl6:
- ref_or_null (4 rows) -> ALL (10 rows) (ok)
- Index_subquery (4 rows) -> ALL (10 rows) (ok)
- partition_mrr_myisam, partition_mrr_aria and partition_mrr_innodb:
- Uses ALL instead of REF for a key value that is the same for > 50%
of rows. (good)
order_by_innodb:
- range (200 rows) -> ref (20 rows)+filesort (ok)
- subselect_sj2_mat:
- One test changed. One ALL removed and replaced with eq_ref. Likely
to be better.
- join_cache:
- Changed ref over 60% of the rows to use hash join (ok)
- opt_tvc:
- Changed to use eq_ref instead of ref with plan change (probably ok)
- opt_trace:
- No worst/max seeks clipping (good).
- Almost double range_scan_time and index_scan_time (ok).
- rowid_filter:
- ref -> ref|filtered (ok)
- range|filter (77 rows) changed to ref|filter (151 rows). Proably
ok as ref|filter outputs wrong number of rows in explain.
Reviewer: Sergei Petrunia <sergey@mariadb.com>
In the case of calcuating cost for a ref access for which there exists
a usable range, the variable keyread_tmp would always be 0.
If there is also another index that could be used as a filter, the cost
of that filter would be wrong.
In many cases 'the worst_seeks optimzation' would disable the filter
from getting used, which is why this bug has not been noticed before.
The next commit, which allows one to disable worst_seeks, will have a
test case for this bug.
Statements affect by this bug are all SQL statements that
1) prefixed with "EXPLAIN"
2) have a lower level join structure created for a union subquery.
A bug in select_describe() passed an incorrect "result" object to
mysql_explain_union(), resulting in unpredictable behaviour and
out of context calls.
Reviewed by: Oleksandr Byelkin, sanja@mariadb.com
Enable unusable key notes for non-equality predicates:
<, <=, =>, >, BETWEEN, IN, LIKE
Note, in some scenarios it displays duplicate notes, e.g.
for queries with ORDER BY:
SELECT * FROM t1
WHERE indexed_string_column >= 10
ORDER BY indexed_string_column
LIMIT 5;
This should be tolarable. Getting rid of the diplicate note
completely would need a much more complex patch, which is
not desiable in 10.6.
Details:
- Changing RANGE_OPT_PARAM::note_unusable_keys from bool
to a new data type Item_func::Bitmap, so the caller can
choose with a better granuality which predicates
should raise unusable key notes inside the range optimizer:
a. all predicates (=, <=>, <, <=, =>, >, BETWEEN, IN, LIKE)
b. all predicates except equality (=, <=>)
c. none of the predicates
"b." is needed because in some scenarios equality predicates (=, <=>)
send unusable key notes at an earlier stage, before the range optimizer,
during update_ref_and_keys(). Calling the range optimizer with
"all predicates" would produce duplicate notes for = and <=> in such cases.
- Fixing get_quick_record_count() to call the range optimizer
with "all predicates except equality" instead of "none of the predicates".
Before this change the range optimizer suppressed all notes for
non-equality predicates: <, <=, =>, >, BETWEEN, IN, LIKE.
This actually fixes the reported problem.
- Fixing JOIN::make_range_rowid_filters() to call the range optimizer
with "all predicates except equality" instead of "all predicates".
Before this change the range optimizer produced duplicate notes
for = and <=> during a rowid_filter optimization.
- Cleanup:
Adding the op_collation argument to Field::raise_note_cannot_use_key_part()
and displaying the operation collation rather than the argument collation
in the unusable key note. This is important for operations with more than
two arguments: BETWEEN and IN, e.g.:
SELECT * FROM t1
WHERE column_utf8mb3_general_ci
BETWEEN 'a' AND 'b' COLLATE utf8mb3_unicode_ci;
SELECT * FROM t1
WHERE column_utf8mb3_general_ci
IN ('a', 'b' COLLATE utf8mb3_unicode_ci);
The note for 'a' now prints utf8mb3_unicode_ci as the collation.
which is the collation of the entire operation:
Cannot use key key1 part[0] for lookup:
"`column_utf8mb3_general_ci`" of collation `utf8mb3_general_ci` >=
"'a'" of collation `utf8mb3_unicode_ci`
Before this change it printed the collation of 'a',
so the note was confusing:
Cannot use key key1 part[0] for lookup:
"`column_utf8mb3_general_ci`" of collation `utf8mb3_general_ci` >=
"'a'" of collation `utf8mb3_general_ci`"
During the 10.5->10.6 merge please use the 10.6 code on conflicts.
This is the 10.5 version of the patch (a backport of the 10.6 version).
Unlike 10.6 version, it makes changes in plugin/type_inet/sql_type_inet.*
rather than in sql/sql_type_fixedbin.h
Item_bool_rowready_func2, Item_func_between, Item_func_in
did not check if a not-NULL argument of an arbitrary data type
can produce a NULL value on conversion to INET6.
This caused a crash on DBUG_ASSERT() in conversion failures,
because the function returned SQL NULL for something that
has Item::maybe_null() equal to false.
Adding setting NULL-ability in such cases.
Details:
- Removing the code in Item_func::setup_args_and_comparator()
performing character set aggregation with optional narrowing.
This aggregation is done inside Arg_comparator::set_cmp_func_string().
So this code was redundant
- Removing Item_func::setup_args_and_comparator() as it git simplified to
just to two lines:
convert_const_compared_to_int_field(thd);
return cmp->set_cmp_func(thd, this, &args[0], &args[1], true);
Using these lines directly in:
- Item_bool_rowready_func2::fix_length_and_dec()
- Item_func_nullif::fix_length_and_dec()
- Adding a new virtual method:
- Type_handler::Item_bool_rowready_func2_fix_length_and_dec().
- Adding tests detecting if the data type conversion can return SQL NULL into
the following methods of Type_handler_inet6:
- Item_bool_rowready_func2_fix_length_and_dec
- Item_func_between_fix_length_and_dec
- Item_func_in_fix_comparator_compatible_types
test_if_skip_sort_order() should check that the 'select' pointer
(=tab->select) is not NULL before dereferencing it when invoking
the test_quick_select method.
The check was erroneously removed by:
1c88ac60cf Simple cleanup of removing QQ comments from sql_select.cc
Fix the issue introduced in ec2574fd8f, fix for MDEV-31983:
get_quick_record_count() must set quick_count=0 when it got
IMPOSSIBLE_RANGE from test_quick_select.
Failure to do so will cause an assertion in 11.0, when the number of
quick select rows (0) is checked to be lower than the number of
found_records (which is capped up to 1).
Pushing down statements to FederatedX engine is implemented by
printing either SELECT_LEX or SELECT_LEX_UNIT into a string and
sending that string to the engine. In the case of pushing down a
single SELECT having a CTE (WITH clause) there was a problem, because
normally single SELECTs were printed using SELECT_LEX::print().
But CTEs are stored in the upper unit of the SELECT_LEX -
SELECT_LEX_UNIT, so they were not unfolded in the string produced.
The solution is to invoke SELECT_LEX_UNIT::print() when pushing down
single SELECT statements (but not those which are parts of units),
so the possible CTEs are unfolded and printed.
Reviewed by Sergei Petrunia (sergey@mariadb.com)
MDEV-32441 SENT_ROWS shows random wrong values when stored function
is selected.
MDEV-32281 EXAMINED_ROWS is not populated in
information_schema.processlist upon SELECT.
Added ROWS_SENT to information_schema.processlist
This is to have the same information as Percona server (SENT_ROWS)
To ensure that information_schema.processlist has correct values for
sent_rows and examined_rows I introduced two new variables to hold the
total counts so far. This was needed as stored functions and stored
procedures will reset the normal counters to be able to count rows for
each statement individually for slow query log.
Other things:
- Selects with functions shows in processlist the total examined_rows
and sent_rows by the main statement and all functions.
- Stored procedures shows in processlist examined_rows and sent_rows
per stored procedure statement.
- Fixed some double accounting for sent_rows and examined_rows.
- HANDLER operations now also supports send_rows and examined_rows.
- Display sizes for MEMORY_USED, MAX_MEMORY_USED, EXAMINED_ROWS and
QUERY_ID in information_schema.processlist changed to 10 characters.
- EXAMINED_ROWS and SENT_ROWS changed to bigint.
- INSERT RETURNING and DELETE RETURNING now updates SENT_ROWS.
- As thd is always up to date with examined_rows, we do not need
to handle examined row counting for unions or filesort.
- I renamed SORT_INFO::examined_rows to m_examined_rows to ensure that
we don't get bugs in merges that tries to use examined_rows.
- Removed calls of type "thd->set_examined_row_count(0)" as they are
not needed anymore.
- Removed JOIN::join_examined_rows
- Removed not used functions:
THD::set_examined_row_count()
- Made inline some functions that where called for each row.
This commit addresses column naming issues with CTEs in the use of prepared
statements and stored procedures. Usage of either prepared statements or
procedures with Common Table Expressions and column renaming may be affected.
There are three related but different issues addressed here.
1) First execution issue. Consider the following
prepare s from "with cte (col1, col2) as (select a as c1, b as c2 from t
order by c1) select col1, col2 from cte";
execute s;
After parsing, items in the select are named (c1,c2), order by (and group by)
resolution is performed, then item names are set to (col1, col2).
When the statement is executed, context analysis is again performed, but
resolution of elements in the order by statement will not be able to find c1,
because it was renamed to col1 and remains this way.
The solution is to save the names of these items during context resolution
before they have been renamed. We can then reset item names back to those after
parsing so first execution can resolve items referred to in order and group by
clauses.
2) Second Execution Issue
When the derived table contains more than one select 'unioned' together we could
reasonably think that dealing with only items in the first select (which
determines names in the resultant table) would be sufficient. This can lead to
a different problem. Consider
prepare st from "with cte (c1,c2) as
(select a as col1, sum(b) as col2 from t1 where a > 0 group by col1
union select a as col3, sum(b) as col4 from t2 where b > 2 group by col3)
select * from cte where c1=1";
When the optimizer (only run during the first execution) pushes the outside
condition "c1=1" into every select in the derived table union, it renames the
items to make the condition valid. In this example, this leaves the first item
in the second select named 'c1'. The second execution will now fail 'group by'
resolution.
Again, the solution is to save the names during context analysis, resetting
before subsequent resolution, but making sure that we save/reset the item
names in all the selects in this union.
3) Memory Leak
During parsing Item::set_name() is used to allocate memory in the statement
arena. We cannot use this call during statement execution as this represents
a memory leak. We directly set the item list names to those in the column list
of this CTE (also allocated during parsing).
Approved by Igor Babaev <igor@mariadb.com>
ref->null_rejecting is a key_part_map. we need to check
the bit corresponding to the particular store_key.
Note that there are no store_key objects for const ref parts.