handler::clone() call did not work with read only tables like S3.
It gave a wrong error message (out of memory instead of a permission
error) and aborted the query.
The issue was that the clone call had a wrong parameter to ha_open().
This now fixed. I also changed the clone call to provide the correct
error message if things fails.
This patch fixes an 'out of memory' error when using the S3 engine
for queries that could use multiple indexes together to find the matching
rows, like the following:
SELECT * FROM t1 WHERE key1 = 99 OR key2 = 2
This commit introduces:
- the infrastructure for optimizer hints;
- hints for join buffering: BNL(), NO_BNL(), BKA(), NO_BKA();
- NO_ICP() hint for disabling index condition pushdown;
- MRR(), MO_MRR() hint for multi-range reads control;
- NO_RANGE_OPTIMIZATION() for disabling range optimization;
- QB_NAME() for assigning names for query blocks.
Extend loose index scan to support descending indexes.
This is achieved by removing a block skipping creating loose index
scan plan for descending index, as well as generalising the execution
of such plans.
The generalisation applies to all levels looking for min/max in loose
index scan. In the highest level (get_next), generalise min and max to
first and last, so that it still proceeds in the direction agreeing
with the index parity. In the lower levels, combine next_min and
next_max methods into next_min_max, and combine next_min_in_range and
next_max_in_range into next_min_max_in_range. This retains existing
logic of these functions and reduces code duplication, while allowing
handling of all four combinations (min, max) x (asc index, desc
index).
Disallow range optimization for BETWEEN when casting one of the arguments
from STRING to a numeric type would be required to construct a range for
the query.
Adds a new method on Item_func_between called can_optimize_range_const
which allows range optimization when the types of the arguments to BETWEEN
would permit it.
Use reclength because rec_buff_length is the actual reclength with
padding, whose use could cause ASAN unknown-crash, presumably caused
by memory violation.
Allows index condition pushdown for reverse ordered scans, a previously
disabled feature due to poor performance. This patch adds a new
API to the handler class called set_end_range which allows callers to
tell the handler what the end of the index range will be when scanning.
Combined with a pushed index condition, the handler can scan the index
efficiently and not read beyond the end of the given range. When
checking if the pushed index condition matches, the handler will also
check if scanning has reached the end of the provided range and stop if
so.
If we instead only enabled ICP for reverse ordered scans without
also calling this new API, then the handler would perform unnecessary
index condition checks. In fact this would continue until the end of
the index is reached.
These changes are agnostic of storage engine. That is, any storage
engine that supports index condition pushdown will inhereit this new
behavior as it is implemented in the SQL and storage engine
API layers.
The partitioned tables storage meta-engine (ha_partition) adds an
override of set_end_range which recursively calls set_end_range on its
child storage engine (handler) implementations.
This commit updates the test made in an earlier commit to show that
ICP matches happen for the reverse ordered case.
This patch is based on changes written by Olav Sandstaa in
MySQL commit da1d92fd46071cd86de61058b6ea39fd9affcd87
Loose index scan currently only supports ASC key. When searching for
the next MAX value, the search starts from the rightmost range and
moves towards the left. If the key has "moved" as the result of a
successful ha_index_read_map() call when handling a previous range, we
check if the key is to the left of the current range, and if so, we
can skip to the next range.
The existing check on whether the loop has iterated at least once is
not sufficient, as an unsuccessful ha_index_read_map() often (always?)
does not "move" the key.
If a query has many OR-ed constructs which can use multiple indexes
(key1=1 AND key2=10) OR
(key1=2 AND key2=20) OR
(key1=3 AND key2=30) OR
...
The range optimizer would construct and then discard a lot of potential
index_merge plans. This process
1. is CPU-intensive
2. can hit the @@optimizer_max_sel_args limitation after which all
potential range or index_merge plans are discarded.
The fix is to apply a heuristic: if there is an OR clause with more than
MAX_OR_ELEMENTS_FOR_INDEX_MERGE=100 branches (hard-coded constant),
disallow construction of index_merge plans for the OR branches.
normalize_cond() translated `WHERE col` into `WHERE col<>0`
But the opetator "not equal to 0" does not necessarily exists
for all data types.
For example, the query:
SELECT * FROM t1 WHERE inet6col;
was translated to:
SELECT * FROM t1 WHERE inet6col<>0;
which further failed with this error:
ERROR : Illegal parameter data types inet6 and bigint for operation '<>'
This patch changes the translation from `col<>0` to `col IS TRUE`.
So now
SELECT * FROM t1 WHERE inet6col;
gets translated to:
SELECT * FROM t1 WHERE inet6col IS TRUE;
Details:
1. Implementing methods:
- Field_longstr::val_bool()
- Field_string::val_bool()
- Item::val_int_from_val_str()
If the input contains bad data,
these methods raise a better error message:
Truncated incorrect BOOLEAN value
Before the change, the error was:
Truncated incorrect DOUBLE value
2. Fixing normalize_cond() to generate Item_func_istrue/Item_func_isfalse
instances instead of Item_func_ne/Item_func_eq
3. Making Item_func_truth sargable, so it uses the range optimizer.
Implementing the following methods:
- get_mm_tree(), get_mm_leaf(), add_key_fields() in Item_func_truth.
- get_func_mm_tree(), for all Item_func_truth descendants.
4. Implementing the method negated_item() for all Item_func_truth
descendants, so the negated item has a chance to be sargable:
For example,
WHERE NOT col IS NOT FALSE -- this notation is not sargable
is now translated to:
WHERE col IS FALSE -- this notation is sargable
Make Item_func_eq of the following forms sargable by updating the relevant range
analysis methods:
1. substr(col, 1, n) = str
2. str = substr(col, 1, n)
3. left(col, n) = str
4. str = left(col, n)
where col is a indexed column and str is a const and inexpensive item
of length n.
We do this by factoring out Item_func_like::get_mm_leaf() and apply it
to a string obtained from escaping str and then appending a wildcard
"%" to it.
The addition of the two Functype enums, LEFT_FUNC and SUBSTR_FUNC,
requires changes in the spider group by handler to continue handling
LEFT and SUBSTR correctly.
Co-authored-by: Yuchen Pei <ycp@mariadb.com>
Co-authored-by: Sergei Petrunia <sergey@mariadb.com>
Partial commit of the greater MDEV-34348 scope.
MDEV-34348: MariaDB is violating clang-16 -Wcast-function-type-strict
The functions queue_compare, qsort2_cmp, and qsort_cmp2
all had similar interfaces, and were used interchangable
and unsafely cast to one another.
This patch consolidates the functions all into the
qsort_cmp2 interface.
Reviewed By:
============
Marko Mäkelä <marko.makela@mariadb.com>
create templates
thd->alloc<X>(n) to use instead of (X*)thd->alloc(sizeof(X)*n)
and the same for thd->calloc(). By the default the type is char,
so old usage of thd->alloc(size) works too.
the information about index algorithm was stored in two
places inconsistently split between both.
BTREE index could have key->algorithm == HA_KEY_ALG_BTREE, if the user
explicitly specified USING BTREE or HA_KEY_ALG_UNDEF, if not.
RTREE index had key->algorithm == HA_KEY_ALG_RTREE
and always had key->flags & HA_SPATIAL
FULLTEXT index had key->algorithm == HA_KEY_ALG_FULLTEXT
and always had key->flags & HA_FULLTEXT
HASH index had key->algorithm == HA_KEY_ALG_HASH or HA_KEY_ALG_UNDEF
long unique index always had key->algorithm == HA_KEY_ALG_LONG_HASH
In this commit:
All indexes except BTREE and HASH always have key->algorithm
set, HA_SPATIAL and HA_FULLTEXT flags are not used anymore (except
for storage to keep frms backward compatible).
As a side effect ALTER TABLE now detects FULLTEXT index renames correctly
Single-table UPDATE/DELETE didn't provide outer_lookup_keys value for
subqueries. This didn't allow to make a meaningful choice between
IN->EXISTS and Materialization strategies for subqueries.
Fix this:
* Make UPDATE/DELETE save Sql_cmd_dml::scanned_rows,
* Then, subquery's JOIN::choose_subquery_plan() can fetch it from
there for outer_lookup_keys
Details:
UPDATE/DELETE now calls select_lex->optimize_unflattened_subqueries()
twice, like SELECT does (first call optimize_constant_subquries() in
JOIN::optimize_inner(), then call optimize_unflattened_subqueries() in
JOIN::optimize_stage2()):
1. Call with const_only=true before any optimizations. This allows
range optimizer and others to use the values of cheap const
subqueries.
2. Call it with const_only=false after range optimizer, partition
pruning, etc. outer_lookup_keys value is provided, so it's possible to
pick a good subquery strategy.
Note: PROTECT_STATEMENT_MEMROOT requires that first SP execution
performs subquery optimization for all subqueries, even for degenerate
query plans like "Impossible WHERE". Due to that, we ensure that the
call to optimize_unflattened_subqueries (with const_only=false) even
for degenerate query plans still happens, as was the case before this
change.
Search conditions were evaluated using val_int(), which was wrong.
Fixing the code to use val_bool() instead.
Details:
- Adding a new item_base_t::IS_COND flag which marks Items used
as <search condition> in WHERE, HAVING, JOIN ON, CASE WHEN clauses.
The flag is at the parse time.
These expressions must be evaluated using val_bool() rather than val_int().
Note, the optimizer creates more Items which are used as search conditions.
Most of these items are not marked with IS_COND yet. This is OK for now,
but eventually these Items can also be fixed to have the flag.
- Adding a method Item::is_cond() which tests if the Item has the IS_COND flag.
- Implementing Item_cache_bool. It evaluates the cached expression using
val_bool() rather than val_int().
Overriding Type_handler_bool::Item_get_cache() to create Item_cache_bool.
- Implementing Item::save_bool_in_field(). It uses val_bool() rather than
val_int() to evaluate the expression.
- Implementing Type_handler_bool::Item_save_in_field()
using Item::save_bool_in_field().
- Fixing all Item_bool_func descendants to implement a virtual val_bool()
rather than a virtual val_int().
- To find places where val_int() should be fixed to val_bool(), a few
DBUG_ASSERT(!is_cond()) where added into val_int() implementations
of selected (most frequent) classes:
Item_field
Item_str_func
Item_datefunc
Item_timefunc
Item_datetimefunc
Item_cache_bool
Item_bool_func
Item_func_hybrid_field_type
Item_basic_constant descendants
- Fixing all places where DBUG_ASSERT() happened during an "mtr" run
to use val_bool() instead of val_int().
When calculate_cond_selectivity_for_table() takes into account multi-
column selectivities from range access, it tries to take-into account
that selectivity for some columns may have been already taken into account.
For example, for range access on IDX1 using {kp1, kp2}, the selectivity
of restrictions on "kp2" might have already been taken into account
to some extent.
So, the code tries to "discount" that using rec_per_key[] estimates.
This seems to be wrong and unreliable: the "discounting" may produce a
rselectivity_multiplier number that hints that the overall selectivity
of range access on IDX1 was greater than 1.
Do a conservative fix: if we arrive at conclusion that selectivity of
range access on condition in IDX1 >1.0, clip it down to 1.
my_like_range*() can create longer keys than Field::char_length().
This caused warnings during print_range().
Fix:
Suppressing warnings in print_range().