The handler::clone() call did not work with read-only tables like S3.
It gave a wrong error message (out of memory instead of a permission
error) and aborted the query.
The issue was that the clone call passed a wrong parameter to ha_open().
This is now fixed. I also changed the clone call to provide the correct
error message if things fail.
This patch fixes an 'out of memory' error when using the S3 engine
for queries that could use multiple indexes together to find the matching
rows, like the following:
SELECT * FROM t1 WHERE key1 = 99 OR key2 = 2
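As an illustration of the error-propagation part of the fix (hypothetical
names; this is not the actual handler API), the clone should open with the
caller's original flags and surface the engine's own error:

  #include <memory>

  struct MockEngine {                       // hypothetical stand-in for a handler
    int ha_open(int open_flags) {
      // A read-only medium such as S3 rejects write-mode opens with its
      // own error code; opening with wrong flags surfaced as out-of-memory.
      const int WRITE_MODE= 1, ERR_READ_ONLY= 165;   // illustrative values
      return (open_flags & WRITE_MODE) ? ERR_READ_ONLY : 0;
    }
  };

  // Pass the *original* open flags through and report the engine's error.
  std::unique_ptr<MockEngine> clone_handler(int original_flags, int *error)
  {
    auto copy= std::make_unique<MockEngine>();
    if ((*error= copy->ha_open(original_flags)))
      return nullptr;                       // caller reports the real error
    return copy;
  }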
Disallow range optimization for BETWEEN when casting one of the arguments
from STRING to a numeric type would be required to construct a range for
the query.
Adds a new method on Item_func_between, can_optimize_range_const,
which allows range optimization only when the types of the arguments
to BETWEEN permit it.
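Roughly, the check can be thought of like this (simplified types, not the
server's actual Item hierarchy):

  enum class ValType { NUMBER, STRING };

  // Simplified model of can_optimize_range_const(): building a range is
  // only safe when no argument needs a STRING-to-numeric cast, because
  // such a cast does not preserve the index ordering.
  static bool can_optimize_range_const(ValType field, ValType low, ValType high)
  {
    return field == low && field == high;   // no implicit cast required
  }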
Use reclength, because rec_buff_length is the actual reclength plus
padding; using it could cause an ASAN unknown-crash, presumably due
to a memory violation.
Loose index scan currently only supports ASC keys. When searching for
the next MAX value, the search starts from the rightmost range and
moves towards the left. If the key has "moved" as the result of a
successful ha_index_read_map() call when handling a previous range, we
check if the key is to the left of the current range, and if so, we
can skip to the next range.
The existing check on whether the loop has iterated at least once is
not sufficient, as an unsuccessful ha_index_read_map() often (always?)
does not "move" the key.
If a query has many OR-ed constructs which can use multiple indexes, e.g.
(key1=1 AND key2=10) OR
(key1=2 AND key2=20) OR
(key1=3 AND key2=30) OR
...
the range optimizer would construct and then discard a lot of potential
index_merge plans. This process:
1. is CPU-intensive
2. can hit the @@optimizer_max_sel_args limitation after which all
potential range or index_merge plans are discarded.
The fix is to apply a heuristic: if there is an OR clause with more than
MAX_OR_ELEMENTS_FOR_INDEX_MERGE=100 branches (hard-coded constant),
disallow construction of index_merge plans for the OR branches.
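In essence (the constant name is from this patch; the helper around it is
hypothetical):

  static const unsigned MAX_OR_ELEMENTS_FOR_INDEX_MERGE= 100;

  // Skip building index_merge plans once the OR clause is too wide for
  // the planning cost to pay off.
  static bool allow_index_merge_for_or(unsigned n_or_branches)
  {
    return n_or_branches <= MAX_OR_ELEMENTS_FOR_INDEX_MERGE;
  }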
Partial commit of the greater MDEV-34348 scope.
MDEV-34348: MariaDB is violating clang-16 -Wcast-function-type-strict
The functions queue_compare, qsort2_cmp, and qsort_cmp2
all had similar interfaces and were used interchangeably,
unsafely cast to one another.
This patch consolidates all of these functions into the
qsort_cmp2 interface.
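For illustration, assuming a qsort_cmp2-style signature with a context
argument (the exact MariaDB typedef may differ): comparison sites now
implement one signature directly instead of casting between incompatible
function-pointer types, which is undefined behaviour and exactly what
-Wcast-function-type-strict diagnoses.

  #include <cstring>

  // Assumed shape of the consolidated interface.
  typedef int (*qsort_cmp2)(void *cmp_arg, const void *a, const void *b);

  // A comparator written against that one interface, no casts needed.
  static int cmp_uint(void *, const void *a, const void *b)
  {
    unsigned x, y;
    memcpy(&x, a, sizeof(x));
    memcpy(&y, b, sizeof(y));
    return x < y ? -1 : x > y;
  }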
Reviewed By:
============
Marko Mäkelä <marko.makela@mariadb.com>
Search conditions were evaluated using val_int(), which was wrong.
Fixing the code to use val_bool() instead.
Details:
- Adding a new item_base_t::IS_COND flag which marks Items used
as <search condition> in WHERE, HAVING, JOIN ON, CASE WHEN clauses.
The flag is set at parse time.
These expressions must be evaluated using val_bool() rather than val_int().
Note, the optimizer creates more Items which are used as search conditions.
Most of these items are not marked with IS_COND yet. This is OK for now,
but eventually these Items can also be fixed to have the flag.
- Adding a method Item::is_cond() which tests if the Item has the IS_COND flag.
- Implementing Item_cache_bool. It evaluates the cached expression using
val_bool() rather than val_int().
Overriding Type_handler_bool::Item_get_cache() to create Item_cache_bool.
- Implementing Item::save_bool_in_field(). It uses val_bool() rather than
val_int() to evaluate the expression.
- Implementing Type_handler_bool::Item_save_in_field()
using Item::save_bool_in_field().
- Fixing all Item_bool_func descendants to implement a virtual val_bool()
rather than a virtual val_int().
- To find places where val_int() should be fixed to val_bool(), a few
DBUG_ASSERT(!is_cond()) asserts were added into the val_int() implementations
of selected (most frequent) classes:
Item_field
Item_str_func
Item_datefunc
Item_timefunc
Item_datetimefunc
Item_cache_bool
Item_bool_func
Item_func_hybrid_field_type
Item_basic_constant descendants
- Fixing all places where a DBUG_ASSERT() fired during an "mtr" run
to use val_bool() instead of val_int().
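As a greatly simplified model of the Item_cache_bool idea (the real class
is part of the Item hierarchy; all names below are stand-ins):

  // The cache is filled with val_bool(), never val_int(), so what is
  // stored is a genuine SQL boolean together with its NULL flag.
  struct BoolExpr
  {
    virtual bool val_bool(bool *is_null)= 0;
    virtual ~BoolExpr() {}
  };

  struct CachedBool
  {
    bool value= false, null_value= false, cached= false;
    bool evaluate(BoolExpr *expr)
    {
      if (!cached) { value= expr->val_bool(&null_value); cached= true; }
      return value;
    }
  };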
When calculate_cond_selectivity_for_table() takes into account multi-
column selectivities from range access, it tries to account for the fact
that the selectivity of some columns may already have been taken into account.
For example, for range access on IDX1 using {kp1, kp2}, the selectivity
of restrictions on "kp2" might have already been taken into account
to some extent.
So, the code tries to "discount" that using rec_per_key[] estimates.
This seems to be wrong and unreliable: the "discounting" may produce a
selectivity multiplier which implies that the overall selectivity
of range access on IDX1 was greater than 1.
Do a conservative fix: if we arrive at the conclusion that the selectivity
of range access on a condition in IDX1 is greater than 1.0, clip it down to 1.
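The conservative clip amounts to this (hypothetical helper name):

  // Selectivity is a probability; estimates above 1.0 are artifacts of
  // the rec_per_key[] discounting, so clamp before using them.
  static double clip_selectivity(double estimated)
  {
    return estimated > 1.0 ? 1.0 : estimated;
  }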
my_like_range*() can create keys longer than Field::char_length().
This caused warnings during print_range().
Fix:
Suppressing warnings in print_range().
The optimizer deals with Rowid Filters this way:
1. First, range optimizer is invoked. It saves information
about all potential range accesses.
2. A query plan is chosen. Suppose, it uses a Rowid Filter on
index $IDX.
3. JOIN::make_range_rowid_filters() calls the range optimizer
again to create a quick select on index $IDX which will be used
to populate the rowid filter.
The problem: the KILL command catches the query in step #3. The quick
select is not created, which causes a crash.
Fixed by checking whether the query was killed. Note: the problem also
affects 10.6, even though error handling for
SQL_SELECT::test_quick_select is different there.
(Variant for 10.6: return error code from SQL_SELECT::test_quick_select)
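The shape of the fix, sketched with hypothetical types (the real code
consults the THD kill flag before relying on the quick select):

  struct ThdLike { bool killed; };          // stand-in for class THD

  // Bail out cleanly when the statement was killed between the two range
  // optimizer invocations, instead of dereferencing a quick select that
  // was never created.
  static bool make_range_rowid_filter(ThdLike *thd, const void *quick)
  {
    if (thd->killed || quick == nullptr)
      return true;                          // error: no filter, no crash
    return false;                           // proceed to populate filter
  }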
Some fixes related to commit f838b2d799, and to
Rows_log_event::do_apply_event() and Update_rows_log_event::do_exec_row()
for system-versioned tables, were provided by Nikita Malyavin.
They were required by the test versioning.rpl,trx_id,row.
The fix for MDEV-33502 (Slowdown when running nested statement with many
partitions) caused this error, as I failed to take big-endian
architectures into account.
This patch also introduces bitmap_import() and bitmap_export() to be used
when one wants to store bitmaps in files/logs in a portable way.
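For illustration only (not the actual bitmap_export() signature): a
portable export writes each word in a fixed byte order so big- and
little-endian hosts produce identical bytes:

  #include <cstdint>
  #include <cstddef>

  // Serialize each 64-bit word least-significant byte first, regardless
  // of host endianness.
  static void export_bitmap(const uint64_t *words, size_t n_words,
                            unsigned char *out)
  {
    for (size_t i= 0; i < n_words; i++)
      for (int b= 0; b < 8; b++)
        *out++= (unsigned char) (words[i] >> (8 * b));
  }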
Reviewed-by: Kristian Nielsen <knielsen@knielsen-hq.org>
MDEV-33502 Slowdown when running nested statement with many partitions
This change was made to help some MariaDB users with close to
10000 bits in their bitmaps.
- Change the underlying storage to be 64-bit instead of 32-bit.
- This reduces the number of loops needed to scan bitmaps.
- This can cause some bitmaps to be 4 bytes larger.
- Ensure that all unused top bits are always 0 (this simplifies the code, as
the last 64-bit word is no longer a special case).
- Use my_find_first_bit() to find the first set bit, which is much faster
than scanning through the bitmap byte by byte and then bit by bit.
Other things:
- Added a bool to remember whether my_bitmap_init() allocated the bitmap
array. my_bitmap_free() will only free arrays it allocated itself.
This allowed me to remove setting 'bitmap=0' before calling
my_bitmap_free() in cases where the bitmaps were allocated externally.
- my_bitmap_init() sets bitmap to 0 in case of failure.
- Added 'universal' asserts to most bitmap functions.
- Change all remaining calls to bitmap_init() to my_bitmap_init().
- This finishes a change started in 2014.
- Changed all usage of uint32 in my_bitmap.h to my_bitmap_map.
- Updated bitmap_copy() to handle bitmaps of different size.
- Removed const from bitmap_exists_intersection() as this caused casts
on all usage.
- Removed not used function bitmap_set_above().
- Renamed create_last_word_mask() to create_last_bit_mask() (to match
name changes in my_bitmap.cc)
- Extended bitmap-t with test for more bitmap functions.
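For illustration, word-at-a-time search for the first set bit (the real
my_find_first_bit() may differ; __builtin_ctzll is a GCC/Clang builtin):

  #include <cstdint>
  #include <cstddef>

  // Let the CPU locate the bit inside the first non-zero 64-bit word;
  // far fewer iterations than byte-by-byte, then bit-by-bit scanning.
  static size_t find_first_bit(const uint64_t *words, size_t n_words)
  {
    for (size_t i= 0; i < n_words; i++)
      if (words[i])
        return i * 64 + (size_t) __builtin_ctzll(words[i]);
    return n_words * 64;                    // no bit set
  }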
Variant#3: moved the logic out of create_key_parts_for_pseudo_indexes
Range Analyzer (get_mm_tree functions) can only process up to MAX_KEY=64
indexes. The problem was that calculate_cond_selectivity_for_table used
it to estimate selectivities for columns, and since a table can
have > MAX_KEY columns, would invoke Range Analyzer with more than MAX_KEY
"pseudo-indexes".
Fixed by making calculate_cond_selectivity_for_table() run Range
Analyzer with at most MAX_KEY pseudo-indexes. If there are more
columns to process, Range Analyzer will be invoked multiple times.
Also made this change:
- param.real_keynr[0]= 0;
+ MEM_UNDEFINED(&param.real_keynr, sizeof(param.real_keynr));
Range Analyzer should make no use of real_keynr when it is run with
pseudo-indexes.
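The batching idea, with hypothetical names:

  static const unsigned MAX_KEY_SKETCH= 64;   // stand-in for MAX_KEY

  // Invoke the analyzer once per chunk of at most MAX_KEY pseudo-indexes
  // instead of once with more indexes than it can represent.
  static void analyze_all_columns(unsigned n_columns,
                                  void (*run_analyzer)(unsigned first,
                                                       unsigned count))
  {
    for (unsigned first= 0; first < n_columns; first+= MAX_KEY_SKETCH)
    {
      unsigned count= n_columns - first;
      if (count > MAX_KEY_SKETCH)
        count= MAX_KEY_SKETCH;
      run_analyzer(first, count);
    }
  }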
Enable unusable key notes for non-equality predicates:
<, <=, >=, >, BETWEEN, IN, LIKE
Note, in some scenarios it displays duplicate notes, e.g.
for queries with ORDER BY:
SELECT * FROM t1
WHERE indexed_string_column >= 10
ORDER BY indexed_string_column
LIMIT 5;
This should be tolerable. Getting rid of the duplicate note
completely would need a much more complex patch, which is
not desirable in 10.6.
Details:
- Changing RANGE_OPT_PARAM::note_unusable_keys from bool
to a new data type Item_func::Bitmap, so the caller can
choose with better granularity which predicates
should raise unusable key notes inside the range optimizer:
a. all predicates (=, <=>, <, <=, >=, >, BETWEEN, IN, LIKE)
b. all predicates except equality (=, <=>)
c. none of the predicates
"b." is needed because in some scenarios equality predicates (=, <=>)
send unusable key notes at an earlier stage, before the range optimizer,
during update_ref_and_keys(). Calling the range optimizer with
"all predicates" would produce duplicate notes for = and <=> in such cases.
- Fixing get_quick_record_count() to call the range optimizer
with "all predicates except equality" instead of "none of the predicates".
Before this change the range optimizer suppressed all notes for
non-equality predicates: <, <=, >=, >, BETWEEN, IN, LIKE.
This actually fixes the reported problem.
- Fixing JOIN::make_range_rowid_filters() to call the range optimizer
with "all predicates except equality" instead of "all predicates".
Before this change the range optimizer produced duplicate notes
for = and <=> during a rowid_filter optimization.
- Cleanup:
Adding the op_collation argument to Field::raise_note_cannot_use_key_part()
and displaying the operation collation rather than the argument collation
in the unusable key note. This is important for operations with more than
two arguments: BETWEEN and IN, e.g.:
SELECT * FROM t1
WHERE column_utf8mb3_general_ci
BETWEEN 'a' AND 'b' COLLATE utf8mb3_unicode_ci;
SELECT * FROM t1
WHERE column_utf8mb3_general_ci
IN ('a', 'b' COLLATE utf8mb3_unicode_ci);
The note for 'a' now prints utf8mb3_unicode_ci as the collation,
which is the collation of the entire operation:
Cannot use key key1 part[0] for lookup:
"`column_utf8mb3_general_ci`" of collation `utf8mb3_general_ci` >=
"'a'" of collation `utf8mb3_unicode_ci`
Before this change it printed the collation of 'a',
so the note was confusing:
Cannot use key key1 part[0] for lookup:
"`column_utf8mb3_general_ci`" of collation `utf8mb3_general_ci` >=
"'a'" of collation `utf8mb3_general_ci`"
When QUICK_GROUP_MIN_MAX_SELECT is initialized or reset,
it stores the prefix of the last group of the index chosen for
retrieving data (last_value). Later, when looping through records
in the get_next() method, the server checks whether the retrieved
group is the last one, and if so, finishes processing.
At the same time, it looks like there is no need for that additional
check, since the method next_prefix() returns HA_ERR_KEY_NOT_FOUND
or HA_ERR_END_OF_FILE when there are no more satisfying records.
If we do not perform the check, we do not need to retrieve and
store last_value either.
This commit removes the use of last_value from QUICK_GROUP_MIN_MAX_SELECT.
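Sketched with stand-in error codes, the loop can simply stop on the
handler's own end-of-data results:

  // Illustrative values; the real codes are defined in the handler API.
  static const int ERR_KEY_NOT_FOUND= 120, ERR_END_OF_FILE= 137;

  // next_prefix() already signals exhaustion, so no comparison against a
  // stored last_value prefix is needed.
  static bool group_scan_done(int next_prefix_result)
  {
    return next_prefix_result == ERR_KEY_NOT_FOUND ||
           next_prefix_result == ERR_END_OF_FILE;
  }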
Reviewer: Sergei Petrunia <sergey@mariadb.com>