1
0
mirror of https://github.com/MariaDB/server.git synced 2025-11-09 11:41:36 +03:00
Commit Graph

8661 Commits

Author SHA1 Message Date
Oleksandr Byelkin
c770bce898 Merge branch '11.2' into 11.4 2024-10-30 15:11:17 +01:00
Oleg Smirnov
bf9662f6fa MDEV-35275 Unexpected WARN_SORTING_ON_TRUNCATED_LENGTH or assertion failure in diagnostics area
MDEV-27277 added warnings on truncation during sorting for SELECTs
but did not for DML operations. However, UPDATEs and DELETEs may also
perform sorting and thus produce warnings. This commit fixes that
2024-10-30 18:47:11 +07:00
Oleksandr Byelkin
69d033d165 Merge branch '10.11' into 11.2 2024-10-29 16:42:46 +01:00
Oleksandr Byelkin
3d0fb15028 Merge branch '10.6' into 10.11 2024-10-29 15:24:38 +01:00
Oleksandr Byelkin
f00711bba2 Merge branch '10.5' into 10.6 2024-10-29 14:20:03 +01:00
Sergei Petrunia
284593413f MDEV-35253: xa_prepare_unlock_unmodified fails: shift exponent 32 is too large
The code in best_access_path() uses PREV_BITS(uint, N) to
compute a bitmap of all keyparts: {keypart0, ... keypart{N-1}).

The problem is that PREV_BITS($type, N) macro code can't handle the case
when N=<number of bits in $type).
Also, why use PREV_BITS(uint, ...) for key part map computations when
we could have used PREV_BITS(key_part_map) ?

Fixed both:
- Change PREV_BITS(type, N) to handle any N in [0; n_bits(type)].
- Change PREV_BITS() to use key_part_map when computing key_part_map bitmaps.
2024-10-25 18:02:14 +03:00
Yuchen Pei
4b6922a315 MDEV-25008: UPDATE/DELETE: Cost-based choice IN->EXISTS vs Materialization
Single-table UPDATE/DELETE didn't provide outer_lookup_keys value for
subqueries. This didn't allow to make a meaningful choice between
IN->EXISTS and Materialization strategies for subqueries.

Fix this:
* Make UPDATE/DELETE save Sql_cmd_dml::scanned_rows,
* Then, subquery's JOIN::choose_subquery_plan() can fetch it from
there for outer_lookup_keys

Details:
UPDATE/DELETE now calls select_lex->optimize_unflattened_subqueries()
twice, like SELECT does (first call optimize_constant_subquries() in
JOIN::optimize_inner(), then call optimize_unflattened_subqueries() in
JOIN::optimize_stage2()):
1. Call with const_only=true before any optimizations. This allows
range optimizer and others to use the values of cheap const
subqueries.
2. Call it with const_only=false after range optimizer, partition
pruning, etc. outer_lookup_keys value is provided, so it's possible to
pick a good subquery strategy.

Note: PROTECT_STATEMENT_MEMROOT requires that first SP execution
performs subquery optimization for all subqueries, even for degenerate
query plans like "Impossible WHERE". Due to that, we ensure that the
call to optimize_unflattened_subqueries (with const_only=false) even
for degenerate query plans still happens, as was the case before this
change.
2024-10-23 23:51:24 +11:00
Oleg Smirnov
6bd1cb0ea0 MDEV-34880 Incorrect result for query with derived table having TEXT field
When a derived table which has distinct values and BLOB fields is
materialized, an index is created over all columns to ensure only
unique values are placed to the result.
This index is created in a special mode HA_UNIQUE_HASH to support BLOBs.
Later the optimizer may incorrectly choose this index to retrieve values
from the derived table, although such type of index cannot be used
for data retrieval.

This commit excludes HA_UNIQUE_HASH indexes from adding to
`JOIN::keyuse` array thus preventing their subsequent usage for
data retrieval
2024-10-23 17:55:00 +07:00
Oleg Smirnov
fd87e01f38 MDEV-27277 Add a warning when max_sort_length is reached
During a query execution some sorting and grouping operations
on strings may be involved. System variable max_sort_length defines
the maximum number of bytes to use when comparing strings during
sorting/grouping. Thus, the comparable parts of strings may be less
than their actual size, so the results of the query may be not
sorted/grouped properly.
To indicate that some comparisons were done on a truncated lengths,
a new warning has been introduced with this commit.
2024-10-22 22:39:36 +07:00
Alexander Barkov
e1cd3c4033 MDEV-12252 ROW data type for stored function return values
Adding support for the ROW data type in the stored function RETURNS clause:

- explicit ROW(..members...) for both sql_mode=DEFAULT and sql_mode=ORACLE

  CREATE FUNCTION f1() RETURNS ROW(a INT, b VARCHAR(32)) ...

- anchored "ROW TYPE OF [db1.]table1" declarations for sql_mode=DEFAULT

  CREATE FUNCTION f1() RETURNS ROW TYPE OF test.t1 ...

- anchored "[db1.]table1%ROWTYPE" declarations for sql_mode=ORACLE

  CREATE FUNCTION f1() RETURN test.t1%ROWTYPE ...

Adding support for anchored scalar data types in RETURNS clause:

- "TYPE OF [db1.]table1.column1" for sql_mode=DEFAULT

  CREATE FUNCTION f1() RETURNS TYPE OF test.t1.column1;

- "[db1.]table1.column1" for sql_mode=ORACLE

  CREATE FUNCTION f1() RETURN test.t1.column1%TYPE;

Details:

- Adding a new sql_mode_t parameter to
    sp_head::create()
    sp_head::sp_head()
    sp_package::create()
    sp_package::sp_package()
  to guarantee early initialization of sp_head::m_sql_mode.
  Before this change, this member was not initialized at all during
  CREATE FUNCTION/PROCEDURE/PACKAGE statements, and was not used.
  Now it needs to be initialized to write properly the
  mysql.proc.returns column, according to the create time sql_mode.

- Code refactoring to make the things simpler and functions smaller:

  * Adding a new method
    Field_row::row_create_fields(THD *thd, List<Spvar_definition> *list)
    to make a Virtual_tmp_table with Fields for ROW members
    from an explicit definition.

  * Adding a new method
    Field_row::row_create_fields(THD *thd, const Spvar_definition &def)
    to make a Virtual_tmp_table with Fields for ROW members
    from an explicit or a table anchored definition.

  * Adding a new method
    Item_args::add_array_of_item_field(THD *thd, const Virtual_tmp_table &vtable)
    to create and array of Item_field corresponding to all Field instances
    in a Virtual_tmp_table

  * Removing Item_field_row::row_create_items(). It was decomposed
    into the new methods described above.

  * Moving the code from the loop body in sp_rcontext::init_var_items()
    into a separate method Spvar_definition::make_item_field_row(),
    to make the code clearer (smaller functions).
    make_item_field_row() itself uses the new methods described above.

- Changing the data type of sp_head::m_return_field_def
  from Column_definition to Spvar_definition.
  So now it supports not only SQL column field types,
  but also explicit ROW and anchored ROW data types,
  as well as anchored column types.

- Adding a new Column_definition parameter to sp_head::create_result_field().
  Before this patch, create_result_field() took the definition only
  from m_return_field_def. Now it's also called with a local Column_definition
  variable which contains the explicit definition resolved from an
  anchored defition.

- Modifying sql_yacc.yy to support the new grammar.
  Adding new helper methods:
    * sf_return_fill_definition_row()
    * sf_return_fill_definition_rowtype_of()
    * sf_return_fill_definition_type_of()

- Fixing tests in:
  * Virtual_tmp_table::setup_field_pointers() in sql_select.cc
  * Send_field::normalize() in field.h
  * store_column_type()
  to prevent calling Type_handler_row::field_type(),
  which is implemented a DBUG_ASSERT(0).
  Before this patch the affected methods and functions were called only
  for scalar data types. Now ROW is also possible.

- Adding a new virtual method Field::cols()

- Overriding methods:
   Item_func_sp::cols()
   Item_func_sp::element_index()
   Item_func_sp::check_cols()
   Item_func_sp::bring_value()
  to support the ROW data type.

- Extending the rule sp_return_type to support
  * explicit ROW and anchored ROW data types
  * anchored scalar data types

- Overriding Field_row::sql_type() to print
  the data type of an explicit ROW.
2024-10-21 07:59:29 +04:00
Sergei Petrunia
a68e74b5a4 MDEV-35164: optimizer_join_limit_pref_ratio: assertion when the ORDER BY table becomes constant
Assertion failure has happened due to this scenario:

A query was ran with optimizer_join_limit_pref_ratio=1.
The query had "ORDER BY t1.col LIMIT N".
The optimizer set join->limit_shortcut_applicable=1.
Then, table t1 was marked as constant.
The code in choose_query_plan() still set join->limit_optimization_mode=1
which caused the optimizer to only consider t1 as the first non-const table.
But t1 was already put into the join prefix as the constant table.
The optimizer couldn't produce any join order at all and crashed.

Fixed by not searching for shortcut plan if ORDER BY table is a constant.
We will not try to do sorting anyway in this case (and LIMIT short-cutting
will be done for any join order).
2024-10-18 15:42:05 +03:00
Sergei Petrunia
0540eac05c MDEV-35180: ref_to_range rewrite causes poor query plan
(Variant 2: only allow rewrite for ref(const))

make_join_select() has a "ref_to_range" rewrite: it would rewrite
any ref access to a range access on the same index if the latter uses
more keyparts.
It seems, he initial intent of this was to fix poor query plan choice
in cases like

  t.keypart1=const AND t.keypart2 < 'foo'

Due to deficiency in cost model, ref access could be picked while range
would enumerate fewer rows and be cheaper.
However, the condition also forces a rewrite in cases like:

  t.keypart1=prev_table.col AND t.keypart1<='foo' AND t.keypart2<'bar'

Here, it can be that
* keypart1=prev_table.col is highly selective
* (keypart1, keypart2) <= ('foo', 'bar') is not at all selective.

Still, the rewrite would be made and poor query plan chosen.
Fixed this by only doing the rewrite if ref access was ref(const)
so we can be certain that quick select also used these restrictions
and will scan a subset of rows that ref access would scan.
2024-10-18 13:37:04 +03:00
Sergei Petrunia
9849e3f948 MDEV-35072: Assertion with optimizer_join_limit_pref_ratio and 1-table select
Pre-11.0 variant:

1. In recompute_join_cost_with_limit(), add an assertion that that
partial_join_cost >= 0.0.

2. best_extension_by_limited_search() subtracts COST_EPS from
   join->best_read. But it is not subtracted from
   join->positions[0].read_time, add it back.

2. We could get very small negative partial_join_cost due to rounding
  errors. For fraction=1.0, we were computing essentially this (denote
  as EXPR-1):

  $row_read_cost + $where_cost - ($row_read_cost + $where_cost)

which should compute to 0.
But the computation was done in the following order (left-to-right):
EXPR-2:

  ($row_read_cost + $where_cost) - $row_read_cost - $where_cost

this produced a value of -1.1102230246251565e-16 due to a rounding
error. Change the computation use EXPR-1 instead of EXPR-2.
2024-10-15 15:56:41 +03:00
Sergei Petrunia
5619f29384 Replace 0.001 with symbolic name COST_EPS
optimize_straight_join and best_extension_by_limited_search()
use 0.001 to make choice between plans with identical cost deterministic.

Use COST_EPS instead of 0.001, like it's done in newer versions.
2024-10-15 15:56:41 +03:00
Sergei Petrunia
66b8d32b75 MDEV-35072: Assertion with optimizer_join_limit_pref_ratio and 1-table select
Variant for 11.2+:

In recompute_join_cost_with_limit(), do not subtract the cost of checking
the WHERE:
   pos->records_read* WHERE_COST_THD(join->thd)

It is already included in pos->read_time.

Also added comments about difference between this fix and the pre-11.2
variant.
2024-10-15 15:01:29 +03:00
Yuchen Pei
cd5577ba4a Merge branch '10.5' into 10.6 2024-10-15 16:00:44 +11:00
Yuchen Pei
77ed235d50 MDEV-26345 Spider GBH should execute original queries on the data node
Stop skipping const items when selecting but skip them when storing
their results to spider row to avoid storing in mismatching temporary
table fields.

Skip auxiliary fields in SELECTing, and do not store
the (non-existing) results to the corresponding temporary table
accordingly.

When there are BOTH auxiliary fields AND const items in the auxiliary
field items, do not use the spider GBH. This is a rare occasion if it
happens at all and not worth the added complexity to cover it.

Use the original item (item_ptr) in constructing GROUP BY and ORDER
BY, which also means using item->name instead of field->field_name as
aliases in constructing SELECT items. This fixes spurious regressions
caused by the above changes in some tests using ORDER BY, such as
mdev_24517.test. As a by-product, this also fixes MDEV-29546.
Therefore we update mdev_29008.test to include the MDEV-29546 case.
2024-10-15 15:36:12 +11:00
Rex
10008b3d3e MDEV-31466 Add optional correlation column list for derived tables
Extend derived table syntax to support column name assignment.
(subquery expression) [as|=] ident [comma separated column name list].
Prior to this patch, the optional comma separated column name list is
not supported.

Processing within the unit of the subquery expression will use
original column names, outside the unit will use the new names.

For example, in the query

select a1, a2 from
  (select c1, c2, c3 from t1 where c2 > 0) as dt (a1, a2, a3)
where a2 > 10;

we see the second column of the derived table dt being used both within,
(where c2 > 0), and outside, (where a2 > 10), the specification.
Both conditions apply to t1.c2.

When multiple unit preparations are required, such as when being used within
a prepared statement or procedure, original column names are needed for
correct resolution. Original names are reset within mysql_derived_reinit().

Item_holder items, used for result tables in both TVC and union preparations
are renamed before use within st_select_lex_unit::prepare().

During wildcard expansion, if column names are present, items names are
set directly after creation.

Reviewed by Igor Babaev (igor@mariadb.com)
2024-10-15 06:08:46 +12:00
Oleksandr Byelkin
a79c9b3812 MDEV-35135 Assertion `!is_cond()' failed in Item_bool_func::val_int / do_select
Change val_int with val_bool when it is a condition.
2024-10-14 09:36:17 +02:00
Oleksandr Byelkin
1d0e94c55f Merge branch '10.5' into 10.6 2024-10-09 08:38:48 +02:00
Sergei Golubchik
3ea71a2c8e MDEV-16699 heap-use-after-free in group_concat with compressed or GIS columns
Field_blob::store() has special code for GROUP_CONCAT temporary table
(to store blob values in Blob_mem_storage - this prevents them
from being freed/overwritten when a next row is read).

Field_geom and Field_blob_compressed inherit from Field_blob but they
have their own ::store() method without this special Blob_mem_storage
support.

Considering that non-grouping CONCAT() of such fields converts
them to plain BLOB, let's do the same for GROUP_CONCAT. To do it,
Item_func_group_concat::setup will signal that it's creating
a temporary table for GROUP_CONCAT, and Field_blog::make_new_field()
override will create base Field_blob when under group concat.
2024-10-08 15:31:02 +02:00
Alexander Barkov
a931da82fa MDEV-34123 CONCAT Function Returns Unexpected Empty Set in Query
Search conditions were evaluated using val_int(), which was wrong.
Fixing the code to use val_bool() instead.

Details:
- Adding a new item_base_t::IS_COND flag which marks Items used
  as <search condition> in WHERE, HAVING, JOIN ON, CASE WHEN clauses.
  The flag is at the parse time.
  These expressions must be evaluated using val_bool() rather than val_int().

  Note, the optimizer creates more Items which are used as search conditions.
  Most of these items are not marked with IS_COND yet. This is OK for now,
  but eventually these Items can also be fixed to have the flag.

- Adding a method Item::is_cond() which tests if the Item has the IS_COND flag.

- Implementing Item_cache_bool. It evaluates the cached expression using
  val_bool() rather than val_int().
  Overriding Type_handler_bool::Item_get_cache() to create Item_cache_bool.

- Implementing Item::save_bool_in_field(). It uses val_bool() rather than
  val_int() to evaluate the expression.

- Implementing Type_handler_bool::Item_save_in_field()
  using Item::save_bool_in_field().

- Fixing all Item_bool_func descendants to implement a virtual val_bool()
  rather than a virtual val_int().

- To find places where val_int() should be fixed to val_bool(), a few
  DBUG_ASSERT(!is_cond()) where added into val_int() implementations
  of selected (most frequent) classes:

  Item_field
  Item_str_func
  Item_datefunc
  Item_timefunc
  Item_datetimefunc
  Item_cache_bool
  Item_bool_func
  Item_func_hybrid_field_type
  Item_basic_constant descendants

- Fixing all places where DBUG_ASSERT() happened during an "mtr" run
  to use val_bool() instead of val_int().
2024-10-08 11:58:46 +02:00
Marko Mäkelä
43465352b9 Merge 11.4 into 11.6 2024-10-03 16:09:56 +03:00
Marko Mäkelä
b53b81e937 Merge 11.2 into 11.4 2024-10-03 14:32:14 +03:00
Marko Mäkelä
12a91b57e2 Merge 10.11 into 11.2 2024-10-03 13:24:43 +03:00
Marko Mäkelä
63913ce5af Merge 10.6 into 10.11 2024-10-03 10:55:08 +03:00
Marko Mäkelä
7e0afb1c73 Merge 10.5 into 10.6 2024-10-03 09:31:39 +03:00
Yuchen Pei
ba7088d462 Merge '11.4' into 11.6 2024-10-03 15:59:20 +10:00
Sergei Golubchik
b1bbdbab9e cleanup: remove redundant if()
likely, a result of auto-merge of two fixes in different versions
2024-10-01 18:29:11 +02:00
Oleksandr Byelkin
8d810e9426 MDEV-29537 Creation of view with UNION and SELECT ... FOR UPDATE in definition is failed with error
lock_type is writen in the last SELECT of the unit even if it parsed last,
so it should be printed last from the last select of the unit.
2024-10-01 11:07:45 +02:00
Sergei Petrunia
d67c88946f Debugging: add dbug_print_join_prefix() to use in best_access_path
A call to

  dbug_print_join_prefix(join_positions, idx, s)

returns a const char* ponter to string with current join prefix,
including the table being added to it.
2024-09-25 16:53:59 +03:00
Yuchen Pei
1f7f406b7b Merge branch '11.2' into 11.4 2024-09-18 11:27:53 +10:00
Sergei Petrunia
5cd3fa81ef Merge 10.11 -> 11.2 2024-09-17 12:34:33 +03:00
Sergei Petrunia
2c3b298337 Merge 11.2 into 11.4 2024-09-09 14:40:02 +03:00
Sergei Petrunia
abd98336d2 Merge 10.11 -> 11.2 2024-09-09 13:50:38 +03:00
Yuchen Pei
d002b1f503 Merge branch '10.6' into 10.11 2024-09-09 11:34:19 +10:00
Sergei Petrunia
c630e23a18 MDEV-34894: Poor query plan, because range estimates are not reused for ref(const)
(Variant 4, with @@optimizer_adjust_secondary_key_costs, reuse in two
places, and conditions are replaced with equivalent simpler forms in two more)

In best_access_path(), ReuseRangeEstimateForRef-3,  the check
for whether
 "all used key_part_i used key_part_i=const"
was incorrect: it may produced a "NO" answer for cases when we
had:
 key_part1= const // some key parts are usable
 key_part2= value_not_in_join_prefix  //present but unusable
 key_part3= non_const_value // unusable due to gap in key parts.

This caused the optimizer to fail to apply ReuseRangeEstimateForRef
heuristics. The consequence is poor query plan choice when the index
in question has very skewed data distribution.

The fix is enabled if its @@optimizer_adjust_secondary_key_costs flag
is set.
2024-09-08 16:26:13 +03:00
Monty
c41ab95a38 Remove rows and cost from optimizer trace for not usable key parts 2024-09-07 16:51:52 +03:00
Monty
886d740ad7 Optimized max_part_bit in sql_select.cc to use my_find_first_bit. 2024-09-07 16:51:52 +03:00
Monty
f0b2e76577 Removed ctrl-l from the source 2024-09-06 15:30:18 +03:00
Marko Mäkelä
2da4839bb6 Merge 10.6 into 10.11 2024-09-06 14:45:22 +03:00
Yuchen Pei
60b93cdd30 Merge branch '10.5' into 10.6 2024-09-06 13:52:57 +10:00
Yuchen Pei
2c3e07df47 MDEV-34447: Memory leakage is detected on running the test main.ps against the server 11.1
The memory leak happened on second execution of a prepared statement
that runs UPDATE statement with correlated subquery in right hand side of
the SET clause. In this case, invocation of the method
  table->stat_records()
could return the zero value that results in going into the 'if' branch
that handles impossible where condition. The issue is that this condition
branch missed saving of leaf tables that has to be performed as first
condition optimization activity. Later the PS statement memory root
is marked as read only on finishing first time execution of the prepared
statement. Next time the same statement is executed it hits the assertion
on attempt to allocate a memory on the PS memory root marked as read only.
This memory allocation takes place by the sequence of the following
invocations:
 Prepared_statement::execute
  mysql_execute_command
   Sql_cmd_dml::execute
    Sql_cmd_update::execute_inner
     Sql_cmd_update::update_single_table
      st_select_lex::save_leaf_tables
       List<TABLE_LIST>::push_back

To fix the issue, add the flag SELECT_LEX::leaf_tables_saved to control
whether the method SELECT_LEX::save_leaf_tables() has to be called or
it has been already invoked and no more invocation required.

Similar issue could take place on running the DELETE statement with
the LIMIT clause in PS/SP mode. The reason of memory leak is the same as for
UPDATE case and be fixed in the same way.
2024-09-06 11:41:58 +10:00
Marko Mäkelä
a5b80531fb Merge 11.4 into 11.6 2024-09-04 10:38:25 +03:00
Sergei Petrunia
c8d040938a MDEV-34720: Poor plan choice for large JOIN with ORDER BY and small LIMIT
(Variant 2b: call greedy_search() twice, correct handling for limited
 search_depth)

Modify the join optimizer to specifically try to produce join orders that
can short-cut their execution for ORDER BY..LIMIT clause.

The optimization is controlled by @@optimizer_join_limit_pref_ratio.
Default value 0 means don't construct short-cutting join orders.
Other value means construct short-cutting join order, and prefer it only
if it promises speedup of more than #value times.

In Optimizer Trace, look for these names:
* join_limit_shortcut_is_applicable
* join_limit_shortcut_plan_search
* join_limit_shortcut_choice
2024-09-02 16:37:18 +03:00
Sergei Petrunia
819765a47d Code cleanup in test_if_skip_sort_order()
Added comments
Moved a part into find_indexes_matching_order().
2024-09-02 16:37:18 +03:00
Marko Mäkelä
44733aa8cf Merge 11.2 into 11.4 2024-08-29 19:10:38 +03:00
Marko Mäkelä
e91a799458 Merge 10.11 into 11.2 2024-08-29 16:02:57 +03:00
Marko Mäkelä
cfcf27c6fe Merge 10.6 into 10.11 2024-08-29 07:47:29 +03:00
Sergei Petrunia
9020baf126 Trivial fix: Make test_if_cheaper_ordering() use actual_rec_per_key()
Discovered this while working on MDEV-34720: test_if_cheaper_ordering()
uses rec_per_key, while the original estimate for the access method
is produced in best_access_path() by using actual_rec_per_key().

Make test_if_cheaper_ordering() also use actual_rec_per_key().
Also make several getter function "const" to make this compile.
Also adjusted the testcase to handle this (the change backported from
11.0)
2024-08-25 16:05:00 +03:00