1
0
mirror of https://github.com/MariaDB/server.git synced 2025-10-25 18:38:00 +03:00
Commit Graph

8170 Commits

Author SHA1 Message Date
Sergei Petrunia
c8d040938a MDEV-34720: Poor plan choice for large JOIN with ORDER BY and small LIMIT
(Variant 2b: call greedy_search() twice, correct handling for limited
 search_depth)

Modify the join optimizer to specifically try to produce join orders that
can short-cut their execution for ORDER BY..LIMIT clause.

The optimization is controlled by @@optimizer_join_limit_pref_ratio.
Default value 0 means don't construct short-cutting join orders.
Other value means construct short-cutting join order, and prefer it only
if it promises speedup of more than #value times.

In Optimizer Trace, look for these names:
* join_limit_shortcut_is_applicable
* join_limit_shortcut_plan_search
* join_limit_shortcut_choice
2024-09-02 16:37:18 +03:00
Sergei Petrunia
819765a47d Code cleanup in test_if_skip_sort_order()
Added comments
Moved a part into find_indexes_matching_order().
2024-09-02 16:37:18 +03:00
Sergei Petrunia
9020baf126 Trivial fix: Make test_if_cheaper_ordering() use actual_rec_per_key()
Discovered this while working on MDEV-34720: test_if_cheaper_ordering()
uses rec_per_key, while the original estimate for the access method
is produced in best_access_path() by using actual_rec_per_key().

Make test_if_cheaper_ordering() also use actual_rec_per_key().
Also make several getter function "const" to make this compile.
Also adjusted the testcase to handle this (the change backported from
11.0)
2024-08-25 16:05:00 +03:00
Marko Mäkelä
757c368139 Merge 10.5 into 10.6 2024-08-14 10:56:11 +03:00
Oleksandr Byelkin
8f020508c8 Merge branch '10.5' into 10.6 2024-08-03 09:04:24 +02:00
Galina Shalygina
d072a29601 MDEV-23983: Crash caused by query containing constant having clause
Before this patch the crash occured when a single row dataset is used and
Item::remove_eq_conds() is called for HAVING. This function is not supposed
to be called after the elimination of multiple equalities.

To fix this problem instead of Item::remove_eq_conds() Item::val_int() is
used. In this case the optimizer tries to evaluate the condition for the
single row dataset and discovers impossible HAVING immediately. So, the
execution phase is skipped.

Approved by Igor Babaev <igor@maridb.com>
2024-08-01 12:18:29 +02:00
Oleg Smirnov
c91aeb3771 MDEV-34634 Types mismatch when cloning items causes debug assertion
New runtime diagnostic introduced with MDEV-34490 has detected
that `Item_int_with_ref` incorrectly returns an instance of its ancestor
class `Item_int`. This commit fixes that.

In addition, this commit reverts a part of the diagnostic related
to `clone_item()` checks. As it turned out, `clone_item()` is not required
to return an object of the same class as the cloned one. For example,
look at `Item_param::clone_item()`: it can return objects of `Item_null`,
`Item_int`, `Item_string`, etc, depending on the object state.
So the runtime type diagnostic is not applicable to `clone_item()` and
is disabled with this commit.

As the similar diagnostic failures are expected to appear again
in the future, this commit introduces a new test file in the main suite:
item_types.test, and new test cases may be added to this file

Reviewer: Oleksandr Byelkin <sanja@mariadb.com>
2024-07-23 20:11:28 +07:00
Yuchen Pei
f071b7620b Merge branch '10.5' into 10.6 2024-07-16 15:54:22 +08:00
Oleg Smirnov
405613ebb5 MDEV-34490 get_copy() and build_clone() may return an instance of an ancestor class instead of a copy/clone
The `Item` class methods `get_copy()`, `build_clone()`, and `clone_item()`
face an issue where they may be defined in a descendant class
(e.g., `Item_func`) but not in a further descendant (e.g., `Item_func_child`).
This can lead to scenarios where `build_clone()`, when operating on an
instance of `Item_func_child` with a pointer to the base class (`Item`),
returns an instance of `Item_func` instead of `Item_func_child`.

Since this limitation cannot be resolved at compile time, this commit
introduces runtime type checks for the copy/clone operations.
A debug assertion will now trigger in case of a type mismatch.

`get_copy()`, `build_clone()`, and `clone_item()` are no more virtual,
but virtual `do_get_copy()`, `do_build_clone()`, and `do_clone_item()`
are added to the protected section of the class `Item`.

Additionally, const qualifiers have been added to certain methods
to enhance code reliability.

Reviewer: Oleksandr Byelkin <sanja@mariadb.com>
2024-07-15 18:25:57 +07:00
Oleg Smirnov
aae3233c4f MDEV-34041 Display additional information for materialized subqueries in EXPLAIN/ANALYZE FORMAT=JSON
This commits adds the "materialization" block to the output of
EXPLAIN/ANALYZE FORMAT=JSON when materialized subqueries are involved
into processing. In the case of ANALYZE additional runtime information
is displayed, such as:
  - chosen strategy of materialization
  - number of partial match/index lookup loops
  - sizes of partial match buffers
2024-07-11 17:40:39 +07:00
Dave Gosselin
02e38e2ece MDEV-33971 NAME_CONST in WHERE clause replaced by inner item
Improve performance of queries like
  SELECT * FROM t1 WHERE field = NAME_CONST('a', 4);
by, in this example, replacing the WHERE clause with field = 4
in the case of ref access.

The rewrite is done during fix_fields and we disambiguate this
case from other cases of NAME_CONST by inspecting where we are
in parsing.  We rely on THD::where to accomplish this.  To
improve performance there, we change the type of THD::where to
be an enumeration, so we can avoid string comparisons during
Item_name_const::fix_fields.  Consequently, this patch also
changes all usages of THD::where to conform likewise.
2024-07-10 17:23:43 -04:00
Marko Mäkelä
0076eb3d4e Merge 10.5 into 10.6 2024-06-24 13:09:47 +03:00
Sergei Petrunia
a2066b2400 MDEV-30651: Assertion `sel->quick' in make_range_rowid_filters
The optimizer deals with Rowid Filters this way:

1. First, range optimizer is invoked. It saves information
   about all potential range accesses.
2. A query plan is chosen. Suppose, it uses a Rowid Filter on
   index $IDX.
3. JOIN::make_range_rowid_filters() calls the range optimizer
again to create a quick select on index $IDX which will be used
to populate the rowid filter.

The problem: KILL command catches the query in step #3. Quick
Select is not created which causes a crash.

Fixed by checking if query was killed. Note: the problem also
affects 10.6, even if error handling for
SQL_SELECT::test_quick_select is different there.
2024-06-17 14:08:32 +03:00
Sergei Petrunia
ef9e3e73ed MDEV-30651: Assertion `sel->quick' in make_range_rowid_filters
(Variant for 10.6: return error code from SQL_SELECT::test_quick_select)
The optimizer deals with Rowid Filters this way:

1. First, range optimizer is invoked. It saves information
   about all potential range accesses.
2. A query plan is chosen. Suppose, it uses a Rowid Filter on
   index $IDX.
3. JOIN::make_range_rowid_filters() calls the range optimizer
again to create a quick select on index $IDX which will be used
to populate the rowid filter.

The problem: KILL command catches the query in step #3. Quick
Select is not created which causes a crash.

Fixed by checking if query was killed.
2024-06-17 12:50:43 +03:00
Sergei Petrunia
b47bd3f8bf MDEV-33875: ORDER BY DESC causes ROWID Filter slowdown
Rowid Filter cannot be used with reverse-ordered scans, for the
same reason as IndexConditionPushdown cannot be.

test_if_skip_sort_order() already has logic to disable ICP when
setting up a reverse-ordered scan. Added logic to also disable
Rowid Filter in this case, factored out the code into
prepare_for_reverse_ordered_access(), and added a comment describing
the cause of this limitation.
2024-06-17 09:50:32 +03:00
Marko Mäkelä
27834ebc91 Merge 10.5 into 10.6 2024-06-10 15:22:15 +03:00
Alexander Barkov
246c0b3a35 MDEV-34227 On startup: UBSAN: runtime error: applying non-zero offset in JOIN::make_aggr_tables_info in sql/sql_select.cc
Avoid undefined behaviour (applying offset to nullptr).
The reported scenario is covered in mysql-test/connect-no-db.test
No new tests needed.
2024-06-10 12:50:52 +04:00
Dmitry Shulga
5e6c122427 MDEV-33769: Memory leak found in the test main.rownum run with --ps-protocol against a server built with the option -DWITH_PROTECT_STATEMENT_MEMROOT
A memory leak happens on the second execution of a query that run in PS mode
and uses the function ROWNUM().

A memory leak took place on allocation of an instance of the class Item_int
for storing a limit value that is performed at the function set_limit_for_unit
indirectly called from JOIN::optimize_inner. Typical trace to the place where
the memory leak occurred is below:
 JOIN::optimize_inner
  optimize_rownum
   process_direct_rownum_comparison
    set_limit_for_unit
     new (thd->mem_root) Item_int(thd, lim, MAX_BIGINT_WIDTH);

To fix this memory leak, calling of the function optimize_rownum()
has to be performed only once on first execution and never called
after that. To control it, the new data member
  first_rownum_optimization
added into the structure st_select_lex.
2024-05-13 17:07:48 +07:00
Sergei Golubchik
7b53672c63 Merge branch '10.5' into 10.6 2024-05-08 20:06:00 +02:00
Sergei Petrunia
40b3525fcc MDEV-28621: group by optimization incorrectly removing subquery where subject buried in a function
Workaround patch: Do not remove GROUP BY clause when it has
subquer(ies) in it.

remove_redundant_subquery_clauses() removes redundant GROUP BY clause
from queries in form:
  expr IN (SELECT no_aggregates GROUP BY ...)
  expr {CMP} {ALL|ANY|SOME} (SELECT no_aggregates GROUP BY ...)
This hits problems when the GROUP BY clause itself has subquer(y/ies).

This patch is just a workaround: it disables removal of GROUP BY clause
if the clause has one or more subqueries in it.

Tests:
- subselect_elimination.test has all known crashing cases.
- subselect4.result, insert_select.result are updated.
Note that in some cases results of SELECT are changed too (not just
EXPLAINs). These are caused by non-deterministic SQL: when running a
query like:

  x > ANY( SELECT col1 FROM t1 GROUP BY constant_expression)

without removing the GROUP BY, the executor is free to pick the value
of t1.col1 from any row in the GROUP BY group (denote it $COL1_VAL).
Then, it computes x > ANY(SELECT $COL1_VAL).

When running the same query and removing the GROUP BY:

   x > ANY( SELECT col1 FROM t1)

the executor will actually check all rows of t1.
2024-05-07 21:25:22 +02:00
Sergei Golubchik
22b3ba9312 MDEV-25102 UNIQUE USING HASH error after ALTER ... DISABLE KEYS
on disable_indexes(HA_KEY_SWITCH_NONUNIQ_SAVE) the engine does
not know that the long unique is logically unique, because on the
engine level it is not. And the engine disables it,

Change the disable_indexes/enable_indexes API. Instead of the enum
mode, send a key_map of indexes that should be enabled. This way the
server will decide what is unique, not the engine.
2024-05-06 17:16:10 +02:00
Alexander Barkov
c6e3fe29d4 MDEV-30646 View created via JSON_ARRAYAGG returns incorrect json object
Backporting add782a13e from 10.6, this fixes the problem.
2024-04-29 13:47:45 +04:00
Marko Mäkelä
829cb1a49c Merge 10.5 into 10.6 2024-04-17 14:14:58 +03:00
Oleksandr Byelkin
9b18275623 Merge branch '10.4' into 10.5 2024-04-16 11:04:14 +02:00
Yuchen Pei
f9e0ebeca4 MDEV-33742 Do not create group by handler when all tables are constant 2024-04-08 14:35:36 +10:00
Sergei Petrunia
8cc36fb743 MDEV-21102: Server crashes in JOIN_CACHE::write_record_data upon EXPLAIN with subqueries
JOIN_CACHE has a light-weight initialization mode that's targeted at
EXPLAINs. In that mode, JOIN_CACHE objects are not able to execute.

Light-weight mode was used whenever the statement was an EXPLAIN. However
the EXPLAIN can execute subqueries, provided they enumerate less than
@@expensive_subquery_limit rows.

Make sure we use light-weight initialization mode only when the select is
more expensive @@expensive_subquery_limit.

Also add an assert into JOIN_CACHE::put_record() which prevents its use
if it was initialized for EXPLAIN only.
2024-04-04 10:30:09 +03:00
Sergei Golubchik
55cea0c2a6 Merge branch '10.5' into 10.6 2024-03-14 19:52:08 +01:00
Sergei Petrunia
9d5a8bd663 MDEV-33665: MSAN failure due to uninitialized Item_func::not_null_tables_cache
eliminate_item_equal() uses quick_fix_field() for Item objects it creates.
It computes some of their attributes on its own (see update_used_tables()
call) but it doesn't update not_null_tables_cache.

Recompute not_null_tables_cache also. Not computing it is currently
harmless, except for producing MSAN error when some other code
propagates the wrong value of not_null_tables_cache to other item.
2024-03-14 16:11:40 +03:00
Sergei Golubchik
f71d7f2f0f Merge branch '10.5' into 10.6 2024-03-13 21:02:34 +01:00
Marko Mäkelä
f703e72bd8 Merge 10.4 into 10.5 2024-03-11 10:08:20 +02:00
Kristian Nielsen
5707f1efda MDEV-33468: Crash due to missing stack overrun check in two recursive functions
Thanks to Yury Chaikou for finding this problem (and the fix).

Reviewed-by: Monty <monty@mariadb.org>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-02-16 12:48:30 +01:00
Marko Mäkelä
691f923906 Merge 10.5 into 10.6 2024-02-13 20:42:59 +02:00
Monty
3907345e22 MDEV-33306 Optimizer choosing incorrect index in 10.6, 10.5 but not in 10.4
In MariaDB up to 10.11, the test_if_cheaper_ordering() code (that tries
to optimizer how GROUP BY is executed) assumes that if a table scan is used
then if there is any index usable by GROUP BY it will be used.

The reason MySQL 10.4 provides a better plan is because of two differences:
- Plans using 'ref' has a cost of 1/10 of what it should be (as a
  protection against table scans). This is why 'ref' is used in 10.4
  and not in 10.5.
- When 'ref' is used, then GROUP BY will not use an index for GROUP BY.

In MariaDB 10.5 the chosen plan is a table scan (as it calculated to be
faster) but as 'ref' is not used, the test_if_cheaper_ordering()
optimizer phase decides (as ref is not usd) to use an index for GROUP BY,
which has bad performance.

Description of fix:
- All new code is protected by the "optimizer_adjust_secondary_key_costs"
  variable, which is now a bit map, and is only executed if the option
  "disable_forced_index_in_group_by" set.
- Corrects GROUP BY handling in test_if_cheaper_ordering() by making
  the choise of using and index with GROUP BY cost based instead of rule
  based.
- Adds TIME_FOR_COMPARE to all costs, when using group by, to make
  read_time, index_scan_time and range_cost comparable.

Other things:
- Made optimizer_adjust_secondary_key_costs a bit map (compatible with old
  code).

Notes:
Current code ignores costs for the algorithm used when doing GROUP
BY on the first table:
  - Create an in-memory temporary table for handling group by and doing a
    filesort of the result file
We can probably in 10.6 continue to ignore this cost.

This patch should NOT be merged to 11.0 series (not needed in 11.0).
2024-02-12 16:43:00 +02:00
Marko Mäkelä
8ec12e0d6d Merge 10.4 into 10.5 2024-02-12 11:38:13 +02:00
Rex
36f51d9748 MDEV-29179 Condition pushdown from HAVING into WHERE is not shown in optimizer trace
JOIN::optimize_inner(), Condition pushdown from HAVING into WHERE
	  not shown in optimizer trace.
2024-02-11 22:21:32 +01:00
Oleg Smirnov
15623c7f29 MDEV-30660 Aggregation functions fail to leverage uniqueness property
When executing a statement of the form
  SELECT AGGR_FN(DISTINCT c1, c2,..,cn) FROM t1,
where AGGR_FN is an aggregate function such as COUNT(), AVG() or SUM(),
and a unique index exists on table t1 covering some or all of the
columns (c1, c2,..,cn), the retrieved values are inherently unique.
Consequently, the need for de-duplication imposed by the DISTINCT
clause can be eliminated, leading to optimization of aggregation
operations.
This optimization applies under the following conditions:
  - only one table involved in the join (not counting const tables)
  - some arguments of the aggregate function are fields
        (not functions/subqueries)

This optimization extends to queries of the form
  SELECT AGGR_FN(c1, c2,..,cn) GROUP BY cx,..cy
when a unique index covers some or all of the columns
(c1, c2,..cn, cx,..cy)
2024-02-10 14:54:03 +07:00
Dmitry Shulga
e48bd474a2 MDEV-15703: Crash in EXECUTE IMMEDIATE 'CREATE OR REPLACE TABLE t1 (a INT DEFAULT ?)' USING DEFAULT
This patch fixes the issue with passing the DEFAULT or IGNORE values to
positional parameters for some kind of SQL statements to be executed
as prepared statements.

The main idea of the patch is to associate an actual value being passed
by the USING clause with the positional parameter represented by
the Item_param class. Such association must be performed on execution of
UPDATE statement in PS/SP mode. Other corner cases that results in
server crash is on handling CREATE TABLE when positional parameter
placed after the DEFAULT clause or CALL statement and passing either
the value DEFAULT or IGNORE as an actual value for the positional parameter.
This case is fixed by checking whether an error is set in diagnostics
area at the function pack_vcols() on return from the function pack_expression()
2024-02-08 09:21:54 +01:00
Monty
6f65e08277 MDEV-33118 optimizer_adjust_secondary_key_costs variable
optimizer-adjust_secondary_key_costs is added to provide 2 small
adjustments to the 10.x optimizer cost model. This can be used in the
case where the optimizer wrongly uses a secondary key instead of a
clustered primary key.

The reason behind this change is that MariaDB 10.x does not take into
account that for engines like InnoDB, that scanning a primary key can be
up to 7x faster than scanning a secondary key + read the row data trough
the primary key.

The different values for optimizer_adjust_secondary_key_costs are:

optimizer_adjust_secondary_key_costs=0
- No changes to current model

optimizer_adjust_secondary_key_costs=1
- Ensure that the cost of of secondary indexes has a cost of at
  least 5x times the cost of a clustered primary key (if one exists).
  This disables part of the worst_seek optimization described below.

optimizer_adjust_secondary_key_costs=2
- Disable "worst_seek optimization" and adjust filter cost slightly
  (add cost of 1 if filter is used).

The idea behind 'worst_seek optimization' is that we limit the
cost for all non clustered ref access to the least of:
- best-rows-by-range (or all rows in no range found) / 10
- scan-time-table (roughly number of file blocks to scan table) * 3

In addition we also do not try to use rowid_filter if number of rows
estimated for 'ref' access is less than the worst_seek limitation.

The idea is that worst_seek is trying to take into account that if
we do a lot of accesses through a key, this is likely to be cached.
However it only does this for secondary keys, and not for clustered
keys or index only reads.

The effect of the worst_seek are:
- In some cases 'ref' will have a much lower cost than range or using
  a clustered key.
- Some possible rowid filters for secondary keys will be ignored.

When implementing optimizer_adjust_secondary_key_costs=2, I noticed
that there is a slightly different costs for how ref+filter and
range+filter are calculated.  This caused a lot of range and
range+filter to change to ref+filter, which is not good as
range+filter provides the optimizer a better estimate of how many
accepted rows there will be in the result set.
Adding a extra small cost (1 seek) when using filter mitigated the
above problems in almost all cases.

This patch should not be applied to MariaDB 11.0 as worst_seeks is
removed in 11.0 and the cost calculation for clustered keys, secondary
keys, index scan and filter is more exact.

Test case changes for --optimizer-adjust_secondary_key_costs=1
(Fix secondary key costs to be 5x of primary key):

- stat_tables_innodb:
  - Complex change (probably ok as number of rows are really small)
    - ref over 1 row changed to range over 10 rows with join buffer
    - ref over 5 rows changed to eq_ref
    - secondary ref over 1 row changed to ref of primary key over 4 rows
    - Change of key to use longer key with index pushdown (a little
      bit worse but not significant).
  - Change to use secondary (1 row) -> primary (4 rows)
- rowid_filter_innodb:
  - index_merge (2 rows) & ref (1) -> all (23 rows) -> primary eq_ref.

Test case changes for --optimizer-adjust_secondary_key_costs=2
(remove of worst_seeks & adjust filter cost):

- stat_tables_innodb:
  - Join order change (probably ok as number of rows are really small)
  - ref (5 rows) & ref(1 row) changed to range (10 rows & join buffer)
    & eq_ref.
- selectivity_innodb:
  - ref -> ref|filter  (ok)
- rowid_filter_innodb:
  - ref -> ref|filter (ok)
  - range|filter (64 rows) changed to ref|filter (128 rows).
    ok as ref|filter outputs wrong number of rows in explain.
- range, range_mrr_icp:
  -ref (500 rows -> ALL (1000 rows) (ok)
- select_pkeycache, select, select_jcl6:
  - ref|filter (2 rows) -> ref (2 rows) (ok)
- selectivity:
  - ref -> ref_filter (ok)
- range:
  - Change of 'filtered' but no stat or plan change (ok)
- selectivity:
 - ref -> ref+filter (ok)
 - Change of filtered but no plan change (ok)
- join_nested_jcl6:
  - range -> ref|filter (ok as only 2 rows)
- subselect3, subselect3_jcl6:
  - ref_or_null (4 rows) -> ALL (10 rows) (ok)
  - Index_subquery (4 rows) -> ALL (10 rows)  (ok)
- partition_mrr_myisam, partition_mrr_aria and partition_mrr_innodb:
  - Uses ALL instead of REF for a key value that is the same for > 50%
    of rows.  (good)
order_by_innodb:
  - range (200 rows) -> ref (20 rows)+filesort (ok)
- subselect_sj2_mat:
  - One test changed. One ALL removed and replaced with eq_ref. Likely
    to be better.
- join_cache:
  - Changed ref over 60% of the rows to use hash join (ok)
- opt_tvc:
  - Changed to use eq_ref instead of ref with plan change (probably ok)
- opt_trace:
  - No worst/max seeks clipping (good).
  - Almost double range_scan_time and index_scan_time (ok).
- rowid_filter:
  - ref -> ref|filtered (ok)
  - range|filter (77 rows) changed to ref|filter (151 rows).  Proably
    ok as ref|filter outputs wrong number of rows in explain.

Reviewer: Sergei Petrunia <sergey@mariadb.com>
2024-01-23 13:03:11 +02:00
Monty
c49fd90263 Ensure keyread_tmp variable is assigned.
In the case of calcuating cost for a ref access for which there exists
a usable range, the variable keyread_tmp would always be 0.
If there is also another index that could be used as a filter, the cost
of that filter would be wrong.
In many cases 'the worst_seeks optimzation' would disable the filter
from getting used, which is why this bug has not been noticed before.

The next commit, which allows one to disable worst_seeks, will have a
test case for this bug.
2024-01-23 13:03:11 +02:00
Daniel Black
4ef9c9bb75 Merge remote-tracking branch 10.5 into 10.6
Notably MDEV-33290, columnstore disable stays in 10.5.
2024-01-23 15:25:42 +11:00
Rex
117388225c MDEV-33165 Incorrect result interceptor passed to mysql_explain_union()
Statements affect by this bug are all SQL statements that
1) prefixed with "EXPLAIN"
2) have a lower level join structure created for a union subquery.

A bug in select_describe() passed an incorrect "result" object to
mysql_explain_union(), resulting in unpredictable behaviour and
out of context calls.

Reviewed by: Oleksandr Byelkin, sanja@mariadb.com
2024-01-23 08:57:15 +13:00
Alexander Barkov
4ced4898fd MDEV-32958 Unusable key notes do not get reported for some operations
Enable unusable key notes for non-equality predicates:
   <, <=, =>, >, BETWEEN, IN, LIKE

Note, in some scenarios it displays duplicate notes, e.g.
for queries with ORDER BY:

  SELECT * FROM t1
  WHERE    indexed_string_column >= 10
  ORDER BY indexed_string_column
  LIMIT 5;

This should be tolarable. Getting rid of the diplicate note
completely would need a much more complex patch, which is
not desiable in 10.6.

Details:

- Changing RANGE_OPT_PARAM::note_unusable_keys from bool
  to a new data type Item_func::Bitmap, so the caller can
  choose with a better granuality which predicates
  should raise unusable key notes inside the range optimizer:
    a. all predicates (=, <=>, <, <=, =>, >, BETWEEN, IN, LIKE)
    b. all predicates except equality (=, <=>)
    c. none of the predicates

  "b." is needed because in some scenarios equality predicates (=, <=>)
  send unusable key notes at an earlier stage, before the range optimizer,
  during update_ref_and_keys(). Calling the range optimizer with
  "all predicates" would produce duplicate notes for = and <=> in such cases.

- Fixing get_quick_record_count() to call the range optimizer
  with "all predicates except equality" instead of "none of the predicates".
  Before this change the range optimizer suppressed all notes for
  non-equality predicates: <, <=, =>, >, BETWEEN, IN, LIKE.
  This actually fixes the reported problem.

- Fixing JOIN::make_range_rowid_filters() to call the range optimizer
  with "all predicates except equality" instead of "all predicates".
  Before this change the range optimizer produced duplicate notes
  for = and <=> during a rowid_filter optimization.

- Cleanup:
  Adding the op_collation argument to Field::raise_note_cannot_use_key_part()
  and displaying the operation collation rather than the argument collation
  in the unusable key note. This is important for operations with more than
  two arguments: BETWEEN and IN, e.g.:

    SELECT * FROM t1
    WHERE column_utf8mb3_general_ci
          BETWEEN 'a' AND 'b' COLLATE utf8mb3_unicode_ci;

    SELECT * FROM t1
    WHERE column_utf8mb3_general_ci
          IN ('a', 'b' COLLATE utf8mb3_unicode_ci);

    The note for 'a' now prints utf8mb3_unicode_ci as the collation.
    which is the collation of the entire operation:

      Cannot use key key1 part[0] for lookup:
      "`column_utf8mb3_general_ci`" of collation `utf8mb3_general_ci` >=
      "'a'" of collation `utf8mb3_unicode_ci`

    Before this change it printed the collation of 'a',
    so the note was confusing:

      Cannot use key key1 part[0] for lookup:
      "`column_utf8mb3_general_ci`" of collation `utf8mb3_general_ci` >=
      "'a'" of collation `utf8mb3_general_ci`"
2023-12-11 08:55:27 +04:00
Alexander Barkov
f436b4a523 MDEV-32879 Server crash in my_decimal::operator= or unexpected ER_DUP_ENTRY upon comparison with INET6 and similar types
During the 10.5->10.6 merge please use the 10.6 code on conflicts.

This is the 10.5 version of the patch (a backport of the 10.6 version).
Unlike 10.6 version, it makes changes in plugin/type_inet/sql_type_inet.*
rather than in sql/sql_type_fixedbin.h

Item_bool_rowready_func2, Item_func_between, Item_func_in
did not check if a not-NULL argument of an arbitrary data type
can produce a NULL value on conversion to INET6.

This caused a crash on DBUG_ASSERT() in conversion failures,
because the function returned SQL NULL for something that
has Item::maybe_null() equal to false.

Adding setting NULL-ability in such cases.

Details:

- Removing the code in Item_func::setup_args_and_comparator()
  performing character set aggregation with optional narrowing.
  This aggregation is done inside Arg_comparator::set_cmp_func_string().
  So this code was redundant

- Removing Item_func::setup_args_and_comparator() as it git simplified to
  just to two lines:
    convert_const_compared_to_int_field(thd);
    return cmp->set_cmp_func(thd, this, &args[0], &args[1], true);
  Using these lines directly in:
    - Item_bool_rowready_func2::fix_length_and_dec()
    - Item_func_nullif::fix_length_and_dec()

- Adding a new virtual method:
  - Type_handler::Item_bool_rowready_func2_fix_length_and_dec().

- Adding tests detecting if the data type conversion can return SQL NULL into
  the following methods of Type_handler_inet6:
  - Item_bool_rowready_func2_fix_length_and_dec
  - Item_func_between_fix_length_and_dec
  - Item_func_in_fix_comparator_compatible_types
2023-11-28 07:26:39 +04:00
Marko Mäkelä
44b9e41694 Merge 10.5 into 10.6 2023-11-17 13:07:35 +02:00
Rex
8b509a5d64 Merge 10.4 into 10.5 2023-11-16 06:43:24 +12:00
Oleksandr Byelkin
b83c379420 Merge branch '10.5' into 10.6 2023-11-08 15:57:05 +01:00
Oleksandr Byelkin
6cfd2ba397 Merge branch '10.4' into 10.5 2023-11-08 12:59:00 +01:00
Sergei Petrunia
1697747461 MDEV-32682: Assertion `range->rows >= s->found_records' failed in best_access_path
Fix the issue introduced in ec2574fd8f, fix for MDEV-31983:

get_quick_record_count() must set quick_count=0 when it got
IMPOSSIBLE_RANGE from test_quick_select.

Failure to do so will cause an assertion in 11.0, when the number of
quick select rows (0) is checked to be lower than the number of
found_records (which is capped up to 1).
2023-11-08 12:08:23 +01:00
Rex
eb8053b377 MDEV-31995 Bogus error executing PS for query using CTE with renaming of columns
This commit addresses column naming issues with CTEs in the use of prepared
statements and stored procedures. Usage of either prepared statements or
procedures with Common Table Expressions and column renaming may be affected.

There are three related but different issues addressed here.

1) First execution issue. Consider the following

prepare s from "with cte (col1, col2) as (select a as c1, b as c2 from t
order by c1) select col1, col2 from cte";
execute s;

After parsing, items in the select are named (c1,c2), order by (and group by)
resolution is performed, then item names are set to (col1, col2).
When the statement is executed, context analysis is again performed, but
resolution of elements in the order by statement will not be able to find c1,
because it was renamed to col1 and remains this way.

The solution is to save the names of these items during context resolution
before they have been renamed. We can then reset item names back to those after
parsing so first execution can resolve items referred to in order and group by
clauses.

2) Second Execution Issue

When the derived table contains more than one select 'unioned' together we could
reasonably think that dealing with only items in the first select (which
determines names in the resultant table) would be sufficient.  This can lead to
a different problem.  Consider

prepare st from "with cte (c1,c2) as
  (select a as col1, sum(b) as col2 from t1 where a > 0 group by col1
    union select a as col3, sum(b) as col4 from t2 where b > 2 group by col3)
  select * from cte where c1=1";

When the optimizer (only run during the first execution) pushes the outside
condition "c1=1" into every select in the derived table union, it renames the
items to make the condition valid.  In this example, this leaves the first item
in the second select named 'c1'.  The second execution will now fail 'group by'
resolution.

Again, the solution is to save the names during context analysis, resetting
before subsequent resolution, but making sure that we save/reset the item
names in all the selects in this union.

3) Memory Leak

During parsing Item::set_name() is used to allocate memory in the statement
arena.  We cannot use this call during statement execution as this represents
a memory leak.  We directly set the item list names to those in the column list
of this CTE (also allocated during parsing).

Approved by Igor Babaev <igor@mariadb.com>
2023-10-30 16:47:18 +12:00
Rex
ab6139ddc0 MDEV-32612 Assertion `tab->select->quick' failed in test_if_skip_sort_order
Fixup for MDEV-31983, incorrect test for checking ability to use quick select.

Approved by Sergei Petrunia
2023-10-29 07:26:18 +12:00