This is 11.0 part of the fix: in 11.0, get_costs_for_tables() calls
best_access_path() for all possible tables, for each call it saves
a POSITION object with the access method and "loose_scan_pos"
POSITION object.
The latter is saved even if there is no possible LooseScan plan. Saving
is done by copying POSITION objects which may generate a spurious
UBSan error.
EXPLAIN EXTENDED should always print the field item used in the left part
of an equality expression from the SET clause of an update statement as a
reference to table column.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
The problem was a wrong assert. I changed it to match the code in
best_access_path().
The given test case was a bit tricky for the optimizer, which first
decided on using a index scan (because of force index), but then
test_if_skip_sort_order() decided to use range anyway to handle
distinct.
This was caused of two minor issues:
- get_quick_record_count() returned the number of rows for range with
least cost, when it should have returned the minum number of rows
for any range.
- When changing REF to RANGE, we also changed records_out, which
should not be done (number of records in the result will not
change).
The above change can cause a small change in row estimates where the
optimizer chooses a clustered key with more rows than a range one
secondary key (unlikely case).
The problem was that when JOIN_TAB::remove_duplicates() noticed there
can only be one possible row in the output, it adjusted limits but
didn't take into account any possible offset.
Fixed by not adjusting limit offset when setting one-row-limit.
When a query does implicit grouping and join operation produces an empty
result set, a NULL-complemented row combination is generated.
However, constant table fields still show non-NULL values.
What happens in the is that end_send_group() is called with a
const row but without any rows matching the WHERE clause.
This last part is shown by 'join->first_record' not being set.
This causes item->no_rows_in_result() to be called for all items to reset
all sum functions to their initial state. However fields are not set
to NULL.
The used fix is to produce NULL-complemented records for constant tables
as well. Also, reset the constant table's records back in case we're
in a subquery which may get re-executed.
An alternative fix would have item->no_rows_in_result() also work
with Item_field objects.
There is some other issues with the code:
- join->no_rows_in_result_called is used but never set.
- Tables that are used with group functions are not properly marked as
maybe_null, which is required if the table rows should be regarded as
null-complemented (not existing).
- The code that tries to detect if mixed_implicit_grouping should be set
didn't take into account all usage of fields and sum functions.
- Item_func::restore_to_before_no_rows_in_result() called the wrong
function.
- join->clear() does not use a table_map argument to clear_tables(),
which caused it to ignore constant tables.
- unclear_tables() does not correctly restore status to what is
was before clear_tables().
Main bug fix was to always use a table_map argument to clear_tables() and
always use join->clear() and clear_tables() together with unclear_tables().
Other fixes:
- Fixed Item_func::restore_to_before_no_rows_in_result()
- Set 'join->no_rows_in_result_called' when no_rows_in_result_set()
is called.
- Removed not used argument from setup_end_select_func().
- More code comments
- Ensure that end_send_group() modifies the same fields as are in the
result set.
- Changed return_zero_rows() to use pointers instead of references,
similar to the rest of the code.
Reviewer: Sergei Petrunia <sergey@mariadb.com>
The problem was trying to access JOIN_TAB::select which is set to NULL
when using the filesort. The correct way is accessing either
JOIN_TAB::select or JOIN_TAB::filesort->select depending on whether
the filesort is used.
This commit introduces member function JOIN_TAB::get_sql_select()
encapsulating that check so the code duplication is eliminated.
The new condition (s->table->quick_keys.is_set(best_key->key))
was added to best_access_path() to eliminate a Valgrind error.
The cause of that error was using TRASH_ALLOC(quick_key_parts)
instead of bzero(quick_key_parts); hence, accessing
s->table->quick_key_parts[best_key->key]) without prior checking
for quick_keys.is_set() might have caused reading "dirty" memory
EXPLAIN EXTENDED should always print the field item used in the left part
of an equality expression from the SET clause of an update statement as a
reference to table column.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
The code in create_internal_tmp_table() didn't take into account that
now temporary (derived) tables may have multiple indexes:
- one index due to duplicate removal
= In this example created by conversion of big-IN(...) into subquery
= this index might be converted into a "unique constraint" if the key
length is too large.
- one index added by derived_with_keys optimization.
Make create_internal_tmp_table() handle multiple indexes.
Before this patch, use of a unique constraint was indicated in
TABLE_SHARE::uniques. This was ok as unique constraint was the only index
in the table. Now it's no longer the case so TABLE_SHARE::uniques is removed
and replaced with an in-memory-only flag HA_UNIQUE_HASH.
This patch is based on Monty's patch.
Co-Author: Monty <monty@mariadb.org>
The problem, introduced in patch for MDEV-26301:
When check_join_cache_usage() decides not to use join buffer, it must
adjust the access method accordingly. For BNL-H joins this means switching
from pseudo-"ref access"(with index=MAX_KEY) to some other access method.
Failing to do this will cause assertions down the line when code that is
not aware of BNL-H will try to initialize index use for ref access with
index=MAX_KEY.
The fix is to follow the regular code path to disable the join buffer for
the join_tab ("goto no_join_cache") instead of just returning from
check_join_cache_usage().
Introduces @@optimizer_switch flag: hash_join_cardinality
When this option is on, use EITS statistics to produce tighter bounds
for hash join output cardinality.
This patch is an extension / replacement to a similar patch in 10.6
New features:
- optimizer_switch hash_join_cardinality is on by default
- records_out is set to fanout when HASH is used
- Fixed bug in is_eits_usable: The function did not work with views
This patch optimizes the number of refills for the lateral derived table
to which a materialized derived table subject to split optimization is
is converted. This optimized number of refills is now considered as the
expected number of refills of the materialized derived table when searching
for the best possible splitting of the table.
When a query does implicit grouping and join operation produces an empty
result set, a NULL-complemented row combination is generated.
However, constant table fields still show non-NULL values.
What happens in the is that end_send_group() is called with a
const row but without any rows matching the WHERE clause.
This last part is shown by 'join->first_record' not being set.
This causes item->no_rows_in_result() to be called for all items to reset
all sum functions to their initial state. However fields are not set
to NULL.
The used fix is to produce NULL-complemented records for constant tables
as well. Also, reset the constant table's records back in case we're
in a subquery which may get re-executed.
An alternative fix would have item->no_rows_in_result() also work
with Item_field objects.
There is some other issues with the code:
- join->no_rows_in_result_called is used but never set.
- Tables that are used with group functions are not properly marked as
maybe_null, which is required if the table rows should be regarded as
null-complemented (not existing).
- The code that tries to detect if mixed_implicit_grouping should be set
didn't take into account all usage of fields and sum functions.
- Item_func::restore_to_before_no_rows_in_result() called the wrong
function.
- join->clear() does not use a table_map argument to clear_tables(),
which caused it to ignore constant tables.
- unclear_tables() does not correctly restore status to what is
was before clear_tables().
Main bug fix was to always use a table_map argument to clear_tables() and
always use join->clear() and clear_tables() together with unclear_tables().
Other fixes:
- Fixed Item_func::restore_to_before_no_rows_in_result()
- Set 'join->no_rows_in_result_called' when no_rows_in_result_set()
is called.
- Removed not used argument from setup_end_select_func().
- More code comments
- Ensure that end_send_group() modifies the same fields as are in the
result set.
- Changed return_zero_rows() to use pointers instead of references,
similar to the rest of the code.
When processing a query over a mergeable view at some conditions checked
at prepare stage it may be decided to materialize the view rather than
to merge it. Before this patch in such case the field 'derived' of the
TABLE_LIST structure created for the view remained set to 0. As a result
the guard condition preventing range analysis for materialized views did
not work properly. This led to a call of some handler method for the
temporary table created to contain the view's records that was supposed
to be used only for opened tables. However temporary tables created for
materialization of derived tables or views are not opened yet when range
analysis is performed.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
Introduce @@optimizer_switch flag: hash_join_cardinality
When it is on, use EITS statistics to produce tighter bounds for
hash join output cardinality.
Amended by Monty.
Reviewed by: Monty <monty@mariadb.org>
Fix-up for commit 476b24d084
Author: Monty
Date: Thu Feb 16 14:19:33 2023 +0200
MDEV-20057 Distinct SUM on CROSS JOIN and grouped returns wrong result
which misses initializing of sorder->suffix_length.
In this commit the initialization is implemented by passing
MY_ZEROFILL flag to the allocation of SORT_FIELD elements
Now the same rule applied to vews and derived tables. So we should
allow merge of views (and derived) in queries with rownum, because
it do not change results, only makes query plans better.