Partial commit of the greater MDEV-34348 scope.
MDEV-34348: MariaDB is violating clang-16 -Wcast-function-type-strict
The functions queue_compare, qsort2_cmp, and qsort_cmp2
all had similar interfaces, and were used interchangable
and unsafely cast to one another.
This patch consolidates the functions all into the
qsort_cmp2 interface.
Reviewed By:
============
Marko Mäkelä <marko.makela@mariadb.com>
The code in best_access_path() uses PREV_BITS(uint, N) to
compute a bitmap of all keyparts: {keypart0, ... keypart{N-1}).
The problem is that PREV_BITS($type, N) macro code can't handle the case
when N=<number of bits in $type).
Also, why use PREV_BITS(uint, ...) for key part map computations when
we could have used PREV_BITS(key_part_map) ?
Fixed both:
- Change PREV_BITS(type, N) to handle any N in [0; n_bits(type)].
- Change PREV_BITS() to use key_part_map when computing key_part_map bitmaps.
Stop skipping const items when selecting but skip them when storing
their results to spider row to avoid storing in mismatching temporary
table fields.
Skip auxiliary fields in SELECTing, and do not store
the (non-existing) results to the corresponding temporary table
accordingly.
When there are BOTH auxiliary fields AND const items in the auxiliary
field items, do not use the spider GBH. This is a rare occasion if it
happens at all and not worth the added complexity to cover it.
Use the original item (item_ptr) in constructing GROUP BY and ORDER
BY, which also means using item->name instead of field->field_name as
aliases in constructing SELECT items. This fixes spurious regressions
caused by the above changes in some tests using ORDER BY, such as
mdev_24517.test. As a by-product, this also fixes MDEV-29546.
Therefore we update mdev_29008.test to include the MDEV-29546 case.
(Variant 2b: call greedy_search() twice, correct handling for limited
search_depth)
Modify the join optimizer to specifically try to produce join orders that
can short-cut their execution for ORDER BY..LIMIT clause.
The optimization is controlled by @@optimizer_join_limit_pref_ratio.
Default value 0 means don't construct short-cutting join orders.
Other value means construct short-cutting join order, and prefer it only
if it promises speedup of more than #value times.
In Optimizer Trace, look for these names:
* join_limit_shortcut_is_applicable
* join_limit_shortcut_plan_search
* join_limit_shortcut_choice
When executing a statement of the form
SELECT AGGR_FN(DISTINCT c1, c2,..,cn) FROM t1,
where AGGR_FN is an aggregate function such as COUNT(), AVG() or SUM(),
and a unique index exists on table t1 covering some or all of the
columns (c1, c2,..,cn), the retrieved values are inherently unique.
Consequently, the need for de-duplication imposed by the DISTINCT
clause can be eliminated, leading to optimization of aggregation
operations.
This optimization applies under the following conditions:
- only one table involved in the join (not counting const tables)
- some arguments of the aggregate function are fields
(not functions/subqueries)
This optimization extends to queries of the form
SELECT AGGR_FN(c1, c2,..,cn) GROUP BY cx,..cy
when a unique index covers some or all of the columns
(c1, c2,..cn, cx,..cy)
(Variant#3: Allow cross-charset comparisons, use a special
CHARSET_INFO to create lookup keys. Review input addressed.)
Equalities that compare utf8mb{3,4}_general_ci strings, like:
WHERE ... utf8mb3_key_col=utf8mb4_value (MB3-4-CMP)
can now be used to construct ref[const] access and also participate
in multiple-equalities.
This means that utf8mb3_key_col can be used for key-lookups when
compared with an utf8mb4 constant, field or expression using '=' or
'<=>' comparison operators.
This is controlled by optimizer_switch='cset_narrowing=on', which is
OFF by default.
IMPLEMENTATION
Item value comparison in (MB3-4-CMP) is done using utf8mb4_general_ci.
This is valid as any utf8mb3 value is also an utf8mb4 value.
When making index lookup value for utf8mb3_key_col, we do "Charset
Narrowing": characters that are in the Basic Multilingual Plane (=BMP) are
copied as-is, as they can be represented in utf8mb3. Characters that are
outside the BMP cannot be represented in utf8mb3 and are replaced
with U+FFFD, the "Replacement Character".
In utf8mb4_general_ci, the Replacement Character compares as equal to any
character that's not in BMP. Because of this, the constructed lookup value
will find all index records that would be considered equal by the original
condition (MB3-4-CMP).
Approved-by: Monty <monty@mariadb.org>
The fix is to return 3-state value from Range_rowid_filter::build()
call:
1. The filter was built successfully;
2. The filter was not built, but the error was not fatal, i.e. there is
no need to rollback transaction. For example, if the size of container to
storevrow ids is not enough;
3. The filter was not built because of fatal error, for example,
deadlock or lock wait timeout from storage engine. In this case we
should stop query plan execution and roll back transaction.
Reviewed by: Sergey Petrunya
There was no actual execution of the SQL of a pushed derived table,
which caused "r_rows" to be always displayed as 0 and "r_total_time_ms"
to show inaccurate numbers.
This commit makes a derived table SQL to be executed by the storage
engine, so the server is able to calculate the number of rows returned
and measure the execution time more accurately
The code in choose_best_splitting() assumed that the join prefix is
in join->positions[].
This is not necessarily the case. This function might be called when
the join prefix is in join->best_positions[], too.
Follow the approach from best_access_path(), which calls this function:
pass the current join prefix as an argument,
"const POSITION *join_positions" and use that.
When a query does implicit grouping and join operation produces an empty
result set, a NULL-complemented row combination is generated.
However, constant table fields still show non-NULL values.
What happens in the is that end_send_group() is called with a
const row but without any rows matching the WHERE clause.
This last part is shown by 'join->first_record' not being set.
This causes item->no_rows_in_result() to be called for all items to reset
all sum functions to their initial state. However fields are not set
to NULL.
The used fix is to produce NULL-complemented records for constant tables
as well. Also, reset the constant table's records back in case we're
in a subquery which may get re-executed.
An alternative fix would have item->no_rows_in_result() also work
with Item_field objects.
There is some other issues with the code:
- join->no_rows_in_result_called is used but never set.
- Tables that are used with group functions are not properly marked as
maybe_null, which is required if the table rows should be regarded as
null-complemented (not existing).
- The code that tries to detect if mixed_implicit_grouping should be set
didn't take into account all usage of fields and sum functions.
- Item_func::restore_to_before_no_rows_in_result() called the wrong
function.
- join->clear() does not use a table_map argument to clear_tables(),
which caused it to ignore constant tables.
- unclear_tables() does not correctly restore status to what is
was before clear_tables().
Main bug fix was to always use a table_map argument to clear_tables() and
always use join->clear() and clear_tables() together with unclear_tables().
Other fixes:
- Fixed Item_func::restore_to_before_no_rows_in_result()
- Set 'join->no_rows_in_result_called' when no_rows_in_result_set()
is called.
- Removed not used argument from setup_end_select_func().
- More code comments
- Ensure that end_send_group() modifies the same fields as are in the
result set.
- Changed return_zero_rows() to use pointers instead of references,
similar to the rest of the code.
Reviewer: Sergei Petrunia <sergey@mariadb.com>
The problem was trying to access JOIN_TAB::select which is set to NULL
when using the filesort. The correct way is accessing either
JOIN_TAB::select or JOIN_TAB::filesort->select depending on whether
the filesort is used.
This commit introduces member function JOIN_TAB::get_sql_select()
encapsulating that check so the code duplication is eliminated.
The new condition (s->table->quick_keys.is_set(best_key->key))
was added to best_access_path() to eliminate a Valgrind error.
The cause of that error was using TRASH_ALLOC(quick_key_parts)
instead of bzero(quick_key_parts); hence, accessing
s->table->quick_key_parts[best_key->key]) without prior checking
for quick_keys.is_set() might have caused reading "dirty" memory
The problem was that join_buffer_size conflicted with
join_buffer_space_limit, which caused the query to be run without join
buffer. However this caused wrong results as the optimizer assumed
that hash+join buffer would ensure that the equi-join condition
would be satisfied, and didn't check it itself.
Fixed by not using join_buffer_space_limit when
optimize_join_buffer_size=off. This matches the documentation at
https://mariadb.com/kb/en/block-based-join-algorithms
Other things:
- Removed not used variable JOIN_TAB::join_buffer_size_limit
- Give an error if we cannot allocate a join buffer. This can
only happen if the join_buffer variables are wrongly configured or
we are running out of memory.
In the future, instead of returning an error, we could properly
convert the query plan that uses BNL-H join into one that doesn't
use join buffering:
make sure the equi-join condition is checked where appropriate.
Reviewer: Sergei Petrunia <sergey@mariadb.com>
This patch optimizes the number of refills for the lateral derived table
to which a materialized derived table subject to split optimization is
is converted. This optimized number of refills is now considered as the
expected number of refills of the materialized derived table when searching
for the best possible splitting of the table.
When a query does implicit grouping and join operation produces an empty
result set, a NULL-complemented row combination is generated.
However, constant table fields still show non-NULL values.
What happens in the is that end_send_group() is called with a
const row but without any rows matching the WHERE clause.
This last part is shown by 'join->first_record' not being set.
This causes item->no_rows_in_result() to be called for all items to reset
all sum functions to their initial state. However fields are not set
to NULL.
The used fix is to produce NULL-complemented records for constant tables
as well. Also, reset the constant table's records back in case we're
in a subquery which may get re-executed.
An alternative fix would have item->no_rows_in_result() also work
with Item_field objects.
There is some other issues with the code:
- join->no_rows_in_result_called is used but never set.
- Tables that are used with group functions are not properly marked as
maybe_null, which is required if the table rows should be regarded as
null-complemented (not existing).
- The code that tries to detect if mixed_implicit_grouping should be set
didn't take into account all usage of fields and sum functions.
- Item_func::restore_to_before_no_rows_in_result() called the wrong
function.
- join->clear() does not use a table_map argument to clear_tables(),
which caused it to ignore constant tables.
- unclear_tables() does not correctly restore status to what is
was before clear_tables().
Main bug fix was to always use a table_map argument to clear_tables() and
always use join->clear() and clear_tables() together with unclear_tables().
Other fixes:
- Fixed Item_func::restore_to_before_no_rows_in_result()
- Set 'join->no_rows_in_result_called' when no_rows_in_result_set()
is called.
- Removed not used argument from setup_end_select_func().
- More code comments
- Ensure that end_send_group() modifies the same fields as are in the
result set.
- Changed return_zero_rows() to use pointers instead of references,
similar to the rest of the code.
In block-nl-join, add:
- r_loops - this shows how many incoming record combinations this
query plan node had.
- r_effective_rows - this shows the average number of matching rows
that this table had for each incoming record combination. This is
comparable with r_rows in non-blocked access methods.
For BNL-joins, it is always equal to
$.table.r_rows * $.table.r_filtered
For BNL-H joins the value cannot be computed from other values
Reviewed by: Monty <monty@mariadb.org>
This patch is the result of running
run-clang-tidy -fix -header-filter=.* -checks='-*,modernize-use-equals-default' .
Code style changes have been done on top. The result of this change
leads to the following improvements:
1. Binary size reduction.
* For a -DBUILD_CONFIG=mysql_release build, the binary size is reduced by
~400kb.
* A raw -DCMAKE_BUILD_TYPE=Release reduces the binary size by ~1.4kb.
2. Compiler can better understand the intent of the code, thus it leads
to more optimization possibilities. Additionally it enabled detecting
unused variables that had an empty default constructor but not marked
so explicitly.
Particular change required following this patch in sql/opt_range.cc
result_keys, an unused template class Bitmap now correctly issues
unused variable warnings.
Setting Bitmap template class constructor to default allows the compiler
to identify that there are no side-effects when instantiating the class.
Previously the compiler could not issue the warning as it assumed Bitmap
class (being a template) would not be performing a NO-OP for its default
constructor. This prevented the "unused variable warning".