1
0
mirror of https://github.com/MariaDB/server.git synced 2025-07-16 00:42:55 +03:00
Commit Graph

858 Commits

Author SHA1 Message Date
c3a5cf2b5b Merge branch '10.5' into 10.6 2023-01-31 09:31:42 +01:00
7fa02f5c0b Merge branch '10.4' into 10.5 2023-01-27 13:54:14 +01:00
dd24fa3063 Merge branch '10.3' into 10.4 2023-01-26 10:34:26 +01:00
2ed598eae8 Added comments re JOIN::all_fields, JOIN::fields_list 2023-01-24 13:30:22 +02:00
a8c5635cf1 Merge 10.5 into 10.6 2023-01-17 20:02:29 +02:00
981a6b7044 MDEV-30395 Wrong result with semijoin and Federated as outer table
The problem was that federated engine does not support comparable rowids
which was not taken into account by semijoin code.

Fixed by checking that we don't use semijoin with tables that does not
support comparable rowids.

Other things:
- Fixed some typos in the code comments
2023-01-13 16:23:21 +02:00
3386b30975 Merge 10.5 into 10.6 2023-01-13 10:45:41 +02:00
73ecab3d26 Merge 10.4 into 10.5 2023-01-13 10:18:30 +02:00
610cea3dda cleanup
Helper class to swicth to relaxed checks during field copy.
Temporarily.
2023-01-09 18:06:06 +01:00
e0dbec1ce3 MDEV-29129: Performance regression starting in 10.6: select order by limit ...
The cause of regression was handling for ROWNUM() function.
For queries like

  SELECT ROWNUM() FROM ... ORDER BY ...

ROWNUM() should be computed before the ORDER BY.
The computation was moved to be before the ORDER BY for any entries in
the select list that had RAND_TABLE_BIT set.

This had a negative impact on queries in form:

  SELECT sp_func() FROM t1 ORDER BY ... LIMIT n

where sp_func() is NOT declared as DETERMINISTIC (and so has
RAND_TABLE_BIT set).

The fix is to require evaluation for sorting only for the ROWNUM()
function. Functions that just have RAND_TABLE_BIT() can be computed
after ORDER BY ... LIMIT is applied.

(think about a possible index that satisfies the ORDER BY clause. In
that case, the the rows would be read in the needed order and we would
stop after reading LIMIT rows, achieving the same effect).
2022-12-03 15:46:00 +03:00
2ac1edb1c3 Merge 10.5 into 10.6 2022-11-08 17:37:22 +02:00
a732d5e2ba Merge 10.4 into 10.5 2022-11-08 17:01:28 +02:00
0d927a57d2 MDEV-29624 MDEV-29655 Fix ASAN errors on pushdown of derived table
Deallocation of TABLE_LIST::dt_handler and TABLE_LIST::pushdown_derived
was performed in multiple places if code. This not only made the code
more difficult to maintain but also led to memory leaks and
ASAN heap-use-after-free errors.
This commit puts deallocation of TABLE_LIST::dt_handler and
TABLE_LIST::pushdown_derived to the single point - JOIN::cleanup()
2022-10-31 19:20:17 +04:00
5027cb2b74 MDEV-29662 Replace same values in 'IN' list with an equality
If all elements in the list of 'IN' or 'NOT IN' clause are equal
and there are no NULLs then clause
-  "a IN (e1,..,en)" can be converted to "a = e1"
-  "a NOT IN (e1,..,en)" can be converted to "a <> e1".
This means an object of Item_func_in can be replaced with an object
of Item_func_eq for IN (e1,..,en) clause and Item_func_ne for
NOT IN (e1,...,en). Such a replacement allows the optimizer to choose
a better execution plan
2022-10-26 11:01:56 +07:00
ee620a7416 Merge branch '10.5' into 10.6 2022-08-04 16:58:42 +02:00
ea12dafe65 Merge branch '10.4' into 10.5 2022-08-04 12:16:35 +02:00
992b510b2f Fix compile errors. 2022-08-04 10:09:57 +02:00
c6406643cd Fix compile errors. 2022-08-04 10:01:24 +02:00
1e71ea806b Merge branch '10.4' into 10.5 2022-08-04 08:30:03 +02:00
e509065247 Merge branch '10.3' into 10.4 2022-08-03 19:51:44 +02:00
2cd98c95de MDEV-23809: Server crash in JOIN_CACHE::free or ...
The problem was caused by use of COLLATION(AVG('x')). This is an
item whose value is a constant.
Name Resolution code called convert_const_to_int() which removed AVG('x').
However, the item representing COLLATION(...) still had with_sum_func=1.

This inconsistent state confused the code that handles grouping and
DISTINCT: JOIN::get_best_combination() decided to use one temporary
table and allocated one JOIN_TAB for it, but then
JOIN::make_aggr_tables_info() attempted to use two and made writes
beyond the end of the JOIN::join_tab array.

The fix:
- Do not replace constant expressions which contain aggregate functions.
- Add JOIN::dbug_join_tab_array_size to catch attempts to use more
  JOIN_TAB objects than we've allocated.
2022-08-03 19:40:02 +03:00
30914389fe Merge 10.5 into 10.6 2022-07-27 17:52:37 +03:00
098c0f2634 Merge 10.4 into 10.5 2022-07-27 17:17:24 +03:00
3bb36e9495 Merge branch '10.3' into 10.4 2022-07-27 11:02:57 +02:00
49e14000ee MDEV-26427 MariaDB Server SEGV on INSERT .. SELECT
1. For INSERT..SELECT statements: don't include table/view the data
   is inserted into in the list of leaf tables
2. Remove duplicated and dead code related to table_count
2022-07-14 11:07:24 +07:00
a9d0bb12e6 Merge 10.4 into 10.5 2022-06-09 12:22:55 +03:00
c89e3b70a7 Merge 10.3 into 10.4 2022-06-09 11:53:46 +03:00
432a4ebe5c Improve table pruning in optimizer with up to date key_dependent map
Part of:
MDEV-28073 Slow query performance in MariaDB when using many tables

s->key_dependent has a list of tables that are compared with key fields
in the current table.  However it does not take into account if a key
field could be resolved by another table.
This is because MariaDB expands 'join_tab->keyuse' to include all generated
comparisons.
For example:
SELECT * from t1,t2,t3 where t1.key=t2.key and t2.key=t3.key
In this case keyuse for t1 includes t2.key and t3.key and key_dependent
contains 't2.map | t3.map'
If we in best_extension_by_limited_search() consider t2,t1 then t1's
key is fully defined, but we cannot do any prune of plans as
s->key_dependent indicates that t3 is still needed.

Fixed by calculating in best_access_patch the current key_dependent map
of tables that is needed to satisfy all keys. This allows us to prune
more bad plans earlier as soon as all keys can be used.

We also set key_dependent to 0 if we found an EQ_REF key, as this an
optimal key for the table and there is no reason to check more keys.
2022-06-07 20:43:11 +03:00
f0ea7f7f33 MDEV-28749: restore_prev_nj_state() doesn't update cur_sj_inner_tables correctly
(Try 2)

The code that updates semi-join optimization state for a join order prefix
had several bugs. The visible effect was bad optimization for FirstMatch or
LooseScan strategies: they either weren't considered when they should have
been, or considered when they shouldn't have been.

In order to hit the bug, the optimizer needs to consider several different
join prefixes in a certain order. Queries with "obvious" query plans which
prune all join orders except one are not affected.

Internally, the bugs in updates of semi-join state were:
1. restore_prev_sj_state() assumed that
  "we assume remaining_tables doesnt contain @tab"
  which wasn't true.
2. Another bug in this function: it did remove bits from
   join->cur_sj_inner_tables but never added them.
3. greedy_search() adds tables into the join prefix but neglects to update
   the semi-join optimization state. (It does update nested outer join
   state, see this call:
     check_interleaving_with_nj(best_table)
   but there's no matching call to update the semi-join state.
   (This wasn't visible because most of the state is in the POSITION
    structure which is updated. But there is also state in JOIN, too)

The patch:
- Fixes all of the above
- Adds JOIN::dbug_verify_sj_inner_tables() which is used to verify the
  state is correct at every step.
- Renames advance_sj_state() to optimize_semi_joins().
  = Introduces update_sj_state() which ideally should have been called
    "advance_sj_state" but I didn't reuse the name to not create confusion.
2022-06-07 20:43:10 +03:00
19c721631e MDEV-28749: restore_prev_nj_state() doesn't update cur_sj_inner_tables correctly
(Try 2) (Cherry-pick back into 10.3)

The code that updates semi-join optimization state for a join order prefix
had several bugs. The visible effect was bad optimization for FirstMatch or
LooseScan strategies: they either weren't considered when they should have
been, or considered when they shouldn't have been.

In order to hit the bug, the optimizer needs to consider several different
join prefixes in a certain order. Queries with "obvious" query plans which
prune all join orders except one are not affected.

Internally, the bugs in updates of semi-join state were:
1. restore_prev_sj_state() assumed that
  "we assume remaining_tables doesnt contain @tab"
  which wasn't true.
2. Another bug in this function: it did remove bits from
   join->cur_sj_inner_tables but never added them.
3. greedy_search() adds tables into the join prefix but neglects to update
   the semi-join optimization state. (It does update nested outer join
   state, see this call:
     check_interleaving_with_nj(best_table)
   but there's no matching call to update the semi-join state.
   (This wasn't visible because most of the state is in the POSITION
    structure which is updated. But there is also state in JOIN, too)

The patch:
- Fixes all of the above
- Adds JOIN::dbug_verify_sj_inner_tables() which is used to verify the
  state is correct at every step.
- Renames advance_sj_state() to optimize_semi_joins().
  = Introduces update_sj_state() which ideally should have been called
    "advance_sj_state" but I didn't reuse the name to not create confusion.
2022-06-07 18:48:44 +03:00
b729896d00 MDEV-28073 Query performance degradation in newer MariaDB versions when using many tables
The issue was that best_extension_by_limited_search() had to go through
too many plans with the same cost as there where many EQ_REF tables.

Fixed by shortcutting EQ_REF (AND REF) when the result only contains one
row. This got the optimization time down from hours to sub seconds.

The only known downside with this patch is that in some cases a table
with ref and 1 record may be used before on EQ_REF table. The faster
optimzation phase should compensate for this.
2022-05-12 10:01:10 +03:00
3bc98a4ec4 Merge branch '10.5' into 10.6 2022-05-10 14:01:23 +02:00
ef781162ff Merge branch '10.4' into 10.5 2022-05-09 22:04:06 +02:00
a70a1cf3f4 Merge branch '10.3' into 10.4 2022-05-08 23:03:08 +02:00
6f741eb6e4 Merge branch '10.2' into 10.3 2022-05-07 11:48:15 +02:00
8dbfaa2aa4 MDEV-28437: Assertion `!eliminated' failed in Item_subselect::exec
(This is the assert that was added in fix for MDEV-26047)

Table elimination may remove an ON expression from an outer join.
However SELECT_LEX::update_used_tables() will still call

  item->walk(&Item::eval_not_null_tables)

for eliminated expressions. If the subquery is constant and cheap
Item_cond_and will attempt to evaluate it, which will trigger an
assert.
The fix is not to call update_used_tables() or eval_not_null_tables()
for ON expressions that were eliminated.
2022-05-05 18:58:25 +03:00
cce994057b Merge 10.5 into 10.6 2022-02-09 15:49:50 +02:00
34c5019698 Merge branch '10.5' into bb-10.5-release 2022-02-09 08:57:41 +01:00
38058c04a4 MDEV-26585 Wrong query results when using index for group-by
The problem was that "group_min_max optimization" does not work if
some aggregate functions, like COUNT(*), is used.
The function get_best_group_min_max() is using the join->sum_funcs
array to check which aggregate functions are used.
The bug was that aggregates in HAVING where not yet added to
join->sum_funcs at the time get_best_group_min_max() was called.

Fixed by populate join->sum_funcs already in prepare, which means that
all sum functions will be in join->sum_funcs in get_best_group_min_max().
A benefit of this approach is that we can remove several calls to
make_sum_func_list() from the code and simplify the function.

I removed some wrong setting of 'sort_and_group'.
This variable is set when alloc_group_fields() is called, as part
of allocating the cache needed by end_send_group() and does not need
to be set by other functions.

One problematic thing was that Spider is using *join->sum_funcs to detect
at which stage the optimizer is and do internal calculations of aggregate
functions. Updating join->sum_funcs early caused Spider to fail when trying
to find min/max values in opt_sum_query().
Fixed by temporarily resetting sum_funcs during opt_sum_query().

Reviewer: Sergei Petrunia
2022-02-08 14:32:29 +02:00
f5c5f8e41e Merge branch '10.5' into 10.6 2022-02-03 17:01:31 +01:00
cf63eecef4 Merge branch '10.4' into 10.5 2022-02-01 20:33:04 +01:00
a576a1cea5 Merge branch '10.3' into 10.4 2022-01-30 09:46:52 +01:00
0041265671 MDEV-27510 Query returns wrong result when using split optimization
This bug may affect the queries that uses a grouping derived table with
grouping list containing references to columns from different tables if
the optimizer decides to employ the split optimization for the derived
table. In some very specific cases it may affect queries with a grouping
derived table that refers only one base table.
This bug was caused by an improper fix for the bug MDEV-25128. The fix
tried to get rid of the equality conditions pushed into the where clause
of the grouping derived table T to which the split optimization had been
applied. The fix erroneously assumed that only those pushed equalities
that were used for ref access of the tables referenced by T were needed.
In fact the function remove_const() that figures out what columns from the
group list can be removed if the split optimization is applied can uses
other pushed equalities as well.
This patch actually provides a proper fix for MDEV-25128. Rather than
trying to remove invalid pushed equalities referencing the fields of SJM
tables with a look-up access the patch attempts not to push such equalities.

Approved by Oleksandr Byelkin <sanja@mariadb.com>
2022-01-25 17:12:37 -08:00
cc125bebfe Fix all warnings given by UBSAN
The 'special' cases where we disable, suppress or circumvent UBSAN are:
- ref10 source (as here we intentionally do some shifts that UBSAN
  complains about.
- x86 version of optimized int#korr() methods. UBSAN do not like unaligned
  memory access of integers.  Fixed by using byte_order_generic.h when
  compiling with UBSAN
- We use smaller thread stack with ASAN and UBSAN, which forced me to
  disable a few tests that prints the thread stack size.
- Verifying class types does not work for shared libraries. I added
  suppression in mysql-test-run.pl for this case.
- Added '#ifdef WITH_UBSAN' when using integer arithmetic where it is
  safe to have overflows (two cases, in item_func.cc).

Things fixed:
- Don't left shift signed values
  (byte_order_generic.h, mysqltest.c, item_sum.cc and many more)
- Don't assign not non existing values to enum variables.
- Ensure that bool and enum values are properly initialized in
  constructors.  This was needed as UBSAN checks that these types has
  correct values when one copies an object.
  (gcalc_tools.h, ha_partition.cc, item_sum.cc, partition_element.h ...)
- Ensure we do not called handler functions on unallocated objects or
  deleted objects.
  (events.cc, sql_acl.cc).
- Fixed bugs in Item_sp::Item_sp() where we did not call constructor
  on Query_arena object.
- Fixed several cast of objects to an incompatible class!
  (Item.cc, Item_buff.cc, item_timefunc.cc, opt_subselect.cc, sql_acl.cc,
   sql_select.cc ...)
- Ensure we do not do integer arithmetic that causes over or underflows.
  This includes also ++ and -- of integers.
  (Item_func.cc, Item_strfunc.cc, item_timefunc.cc, sql_base.cc ...)
- Added JSON_VALUE_UNITIALIZED to json_value_types and ensure that
  value_type is initialized to this instead of to -1, which is not a valid
  enum value for json_value_types.
- Ensure we do not call memcpy() when second argument could be null.

Other things:

- Changed struct st_position to an OBJECT and added an initialization
  function to it to ensure that we do not copy or use uninitialized
  members. The change to a class was also motived that we used "struct
  st_position" and POSITION randomly trough the code which was
  confusing.
- Notably big rewrite in sql_acl.cc to avoid using deleted objects.
- Changed in sql_partition to use '^' instead of '-'. This is safe as
  the operator is either 0 or 0x8000000000000000ULL.
- Added check for select_nr < INT_MAX in JOIN::build_explain() to
  avoid bug when get_select() could return NULL.
- Reordered elements in POSITION for better alignment.
- Changed sql_test.cc::print_plan() to use pointers instead of objects.
- Fixed bug in find_set() where could could execute '1 << -1'.
- Added variable have_sanitizer, used by mtr.  (This variable was before
  only in 10.5 and up).  It can now have one of two values:
  ASAN or UBSAN.
- Moved ~Archive_share() from ha_archive.cc to ha_archive.h and marked
  it virtual. This was an effort to get UBSAN to work with loaded storage
  engines. I kept the change as the new place is better.
- Added in CONNECT engine COLBLK::SetName(), to get around a wrong cast
  in tabutil.cpp.

Changes that should not be needed but had to be done to suppress warnings
from UBSAN:

- Added static_cast<<uint16_t>> around shift to get rid of a LOT of
  compiler warnings when using UBSAN.
- Had to change some '/' of 2 base integers to shift to get rid of
  some compile time warnings.

Fixes:

MDEV-25505 Assertion `old_flags == ((my_flags & 0x10000U) ? 1 : 0)
fixed (was caused by an old version if this commit).

Reviewed by:
- Json changes: Alexey Botchkov
- Charset changes in ctype-uca.c: Alexander Barkov
- InnoDB changes: Marko Mäkelä
- sql_acl.cc changes: Vicențiu Ciorbaru
- build_explain() changes: Sergey Petrunia
Temporary commit to log changes for UBSAN
2021-05-19 22:54:14 +02:00
db9398ba5e Improved code comment and removed nop test 2021-05-19 22:54:12 +02:00
08bc062e3c Remove some usage of Check_level_instant_set and Sql_mode_save
The reason for the removal are:
- Generates more code
  - Storing and retreving THD
  - Causes extra code and daata to be generated to handle possible throw
    exceptions (which never happens in MariaDB code)
- Uses more stack space

Other things:
- Changed convert_const_to_int() to use item->save_in_field_no_warnings(),
  which made the code shorter and simpler.
- Removed not needed code in Sp_handler::sp_create_routine()
- Added thd as argument to store_key.copy() to make function simpler
- Added thd as argument to some subselect* constructor that inherites
  from Item_subselect.
2021-05-19 22:54:12 +02:00
be093c81a7 MDEV-24089 support oracle syntax: rownum
The ROWNUM() function is for SELECT mapped to JOIN->accepted_rows, which is
incremented for each accepted rows.
For Filesort, update, insert, delete and load data, we map ROWNUM() to
internal variables incremented when the table is changed.
The connection between the row counter and Item_func_rownum is done
in sql_select.cc::fix_items_after_optimize() and
sql_insert.cc::fix_rownum_pointers()

When ROWNUM() is used anywhere in query, the optimization to ignore ORDER
BY in sub queries are disabled. This was done to get the following common
Oracle query to work:
select * from (select * from t1 order by a desc) as t where rownum() <= 2;
MDEV-3926 "Wrong result with GROUP BY ... WITH ROLLUP" contains a discussion
about this topic.

LIMIT optimization is enabled when in a top level WHERE clause comparing
ROWNUM() with a numerical constant using any of the following expressions:
- ROWNUM() < #
- ROWNUM() <= #
- ROWNUM() = 1
ROWNUM() can be also be the right argument to the comparison function.

LIMIT optimization is done in two cases:
- For the current sub query when the ROWNUM comparison is done on the top
  level:
  SELECT * from t1 WHERE rownum() <= 2 AND t1.a > 0
- For an inner sub query, when the upper level has only a ROWNUM comparison
  in the WHERE clause:
  SELECT * from (select * from t1) as t WHERE rownum() <= 2

In Oracle mode, one can also use ROWNUM without parentheses.

Other things:
- Fixed bug where the optimizer tries to optimize away sub queries
  with RAND_TABLE_BIT set (non-deterministic queries). Now these
  sub queries will not be converted to joins.  This bug fix was also
  needed to get rownum() working inside subqueries.
- In remove_const() remove setting simple_order to FALSE if ROLLUP is
  USED. This code was disable a long time ago because of wrong assignment
  in the following code.  Instead we set simple_order to false if
  RAND_TABLE_BIT was used in the SELECT list.  This ensures that
  we don't delete ORDER BY if the result set is not deterministic, like
  in 'SELECT RAND() AS 'r' FROM t1 ORDER BY r';
- Updated parameters for Sort_param::init_for_filesort() to be able
  to provide filesort with information where the number of accepted
  rows should be stored
- Reordered fields in class Filesort to optimize storage layout
- Added new error messsage to tell that a function can't be used in HAVING
- Added field 'with_rownum' to THD to mark that ROWNUM() is used in the
  query.

Co-author: Oleksandr Byelkin <sanja@mariadb.com>
           LIMIT optimization for sub query
2021-05-19 22:54:11 +02:00
30f0a246a0 Added override to all releveant methods in Item (and a few other classes)
Other things:
- Remove inline and virtual for methods that are overrides
- Added a 'final' to some Item classes
2021-05-19 22:27:53 +02:00
3105c9e7a5 Change bitfields in Item to an uint16
The reason for the change is that neither clang or gcc can do efficient
code when several bit fields are change at the same time or when copying
one or more bits between identical bit fields.
Updated bits explicitely with & and | is MUCH more efficient than what
current compilers can do.
2021-05-19 22:27:28 +02:00
916b237b3f Merge 10.5 into 10.6 2021-05-07 15:00:27 +03:00