(trivial backport to 10.2)
The optimizer removes redundant GROUP BY operations. If GROUP BY element
is a subselect, it is "eliminated".
However one must not eliminate the item if it is used both in the select
list and in the GROUP BY, like so:
select (select ... ) as SUBQ from ... group by SUBQ
Do not eliminate such items.
The optimizer removes redundant GROUP BY operations. If GROUP BY element
is a subselect, it is "eliminated".
However one must not eliminate the item if it is used both in the select
list and in the GROUP BY, like so:
select (select ... ) as SUBQ from ... group by SUBQ
Do not eliminate such items.
Pushing LIMIT to temp aggregation table is possible, but not when WITH
TIES is used. In a degenerate case with constant ORDER BY, the constant
gets removed and the code assumed the limit is push-able.
Ensure that if WITH TIES is present, that this does not happen.
This commit implements the standard SQL extension
OFFSET start { ROW | ROWS }
[FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } { ONLY | WITH TIES }]
To achieve this a reserved keyword OFFSET is introduced.
The general logic for WITH TIES implies:
1. The number of rows a query returns is no longer known during optimize
phase. Adjust optimizations to no longer consider this.
2. During end_send make use of an "order Cached_item"to compare if the
ORDER BY columns changed. Keep returning rows until there is a
change. This happens only after we reached the row limit.
3. Within end_send_group, the order by clause was eliminated. It is
still possible to keep the optimization of using end_send_group for
producing the final result set.
Replace
* select_lex::offset_limit
* select_lex::select_limit
* select_lex::explicit_limit
with select_lex::Lex_select_limit
The Lex_select_limit already existed with the same elements and was used in
by the yacc parser.
This commit is in preparation for FETCH FIRST implementation, as it
simplifies a lot of the code.
Additionally, the parser is simplified by making use of the stack to
return Lex_select_limit objects.
Cleanup of init_query() too. Removes explicit_limit= 0 as it's done a bit later
in init_select() with limit_params.empty()
(Also fixes MDEV-25254).
Re-work Name Resolution for the argument of JSON_TABLE(json_doc, ....)
function. The json_doc argument can refer to other tables, but it can
only refer to the tables that precede[*] the JSON_TABLE(...) call.
[*] - For queries with RIGHT JOINs, the "preceding" is determined after
the query is normalized by converting RIGHT JOIN into left one.
The implementation is as follows:
- Table function arguments use their own Name_resolution_context.
- The Name_resolution_context now has a bitmap of tables that should be
ignored when searching for a field.
- get_disallowed_table_deps() walks the TABLE_LIST::nested_join tree
and computes a bitmap of tables that do not "precede" the given
JSON_TABLE(...) invocation (according the above definition of
"preceding").
Fix for for the problem with
- Cross-outer-join dependency
- dead-end join prefix
- join order pruning
See the comments in the patch for detailed description
The easiest way to compile and test the server with UBSAN is to run:
./BUILD/compile-pentium64-ubsan
and then run mysql-test-run.
After this commit, one should be able to run this without any UBSAN
warnings. There is still a few compiler warnings that should be fixed
at some point, but these do not expose any real bugs.
The 'special' cases where we disable, suppress or circumvent UBSAN are:
- ref10 source (as here we intentionally do some shifts that UBSAN
complains about.
- x86 version of optimized int#korr() methods. UBSAN do not like unaligned
memory access of integers. Fixed by using byte_order_generic.h when
compiling with UBSAN
- We use smaller thread stack with ASAN and UBSAN, which forced me to
disable a few tests that prints the thread stack size.
- Verifying class types does not work for shared libraries. I added
suppression in mysql-test-run.pl for this case.
- Added '#ifdef WITH_UBSAN' when using integer arithmetic where it is
safe to have overflows (two cases, in item_func.cc).
Things fixed:
- Don't left shift signed values
(byte_order_generic.h, mysqltest.c, item_sum.cc and many more)
- Don't assign not non existing values to enum variables.
- Ensure that bool and enum values are properly initialized in
constructors. This was needed as UBSAN checks that these types has
correct values when one copies an object.
(gcalc_tools.h, ha_partition.cc, item_sum.cc, partition_element.h ...)
- Ensure we do not called handler functions on unallocated objects or
deleted objects.
(events.cc, sql_acl.cc).
- Fixed bugs in Item_sp::Item_sp() where we did not call constructor
on Query_arena object.
- Fixed several cast of objects to an incompatible class!
(Item.cc, Item_buff.cc, item_timefunc.cc, opt_subselect.cc, sql_acl.cc,
sql_select.cc ...)
- Ensure we do not do integer arithmetic that causes over or underflows.
This includes also ++ and -- of integers.
(Item_func.cc, Item_strfunc.cc, item_timefunc.cc, sql_base.cc ...)
- Added JSON_VALUE_UNITIALIZED to json_value_types and ensure that
value_type is initialized to this instead of to -1, which is not a valid
enum value for json_value_types.
- Ensure we do not call memcpy() when second argument could be null.
- Fixed that Item_func_str::make_empty_result() creates an empty string
instead of a null string (safer as it ensures we do not do arithmetic
on null strings).
Other things:
- Changed struct st_position to an OBJECT and added an initialization
function to it to ensure that we do not copy or use uninitialized
members. The change to a class was also motived that we used "struct
st_position" and POSITION randomly trough the code which was
confusing.
- Notably big rewrite in sql_acl.cc to avoid using deleted objects.
- Changed in sql_partition to use '^' instead of '-'. This is safe as
the operator is either 0 or 0x8000000000000000ULL.
- Added check for select_nr < INT_MAX in JOIN::build_explain() to
avoid bug when get_select() could return NULL.
- Reordered elements in POSITION for better alignment.
- Changed sql_test.cc::print_plan() to use pointers instead of objects.
- Fixed bug in find_set() where could could execute '1 << -1'.
- Added variable have_sanitizer, used by mtr. (This variable was before
only in 10.5 and up). It can now have one of two values:
ASAN or UBSAN.
- Moved ~Archive_share() from ha_archive.cc to ha_archive.h and marked
it virtual. This was an effort to get UBSAN to work with loaded storage
engines. I kept the change as the new place is better.
- Added in CONNECT engine COLBLK::SetName(), to get around a wrong cast
in tabutil.cpp.
- Added HAVE_REPLICATION around usage of rgi_slave, to get embedded
server to compile with UBSAN. (Patch from Marko).
- Added #ifdef for powerpc64 to avoid a bug in old gcc versions related
to integer arithmetic.
Changes that should not be needed but had to be done to suppress warnings
from UBSAN:
- Added static_cast<<uint16_t>> around shift to get rid of a LOT of
compiler warnings when using UBSAN.
- Had to change some '/' of 2 base integers to shift to get rid of
some compile time warnings.
Reviewed by:
- Json changes: Alexey Botchkov
- Charset changes in ctype-uca.c: Alexander Barkov
- InnoDB changes & Embedded server: Marko Mäkelä
- sql_acl.cc changes: Vicențiu Ciorbaru
- build_explain() changes: Sergey Petrunia
SELECT_LEX objects that are "fake_select_lex" (i.e read UNION output)
used both INT_MAX and UINT_MAX as select_number.
- mysql_explain_union() assigned UINT_MAX
- st_select_lex_unit::add_fake_select_lex assigned INT_MAX
This didn't matter initially (before EXPLAIN FORMAT=JSON), because the
code had no checks for this value.
EXPLAIN FORMAT=JSON and later other features did introduce checks for
select_number values. The check had to check for two constants and
looked really confusing.
This patch joins the two constants into one - FAKE_SELECT_LEX_ID.
At the second execution of the PS
1. mark_as_dependent() is called with the same parameters as at the first
execution (select#4 and select#3)
2. as outer_select (select#3) has been already merged at the first
execution of PS it cannot be reached using the outer_select() function
anymore (and so can not stop iteration).
3. as a result all selects towards the top level select including the
select for 'ca' are marked as uncacheable.
4. Marked uncacheable it executed incorrectly triggering filling its
temporary table several times and using freed memory at the end.
To avoid the problem we use name resolution context to go "up".
NOTE: problem also exists in 10.2 but has no visible effect on execution.
That is why the problem is fixed in 10.2.
The patch also add debug logging of important procedures and
better specify parameters types of st_select_lex::mark_as_dependent.
Adds an implementation for SELECT ... FOR UPDATE SKIP LOCKED /
SELECT ... LOCK IN SHARED MODE SKIP LOCKED
This is implemented only InnoDB at the moment, not in RockDB yet.
This adds a new hander flag HA_CAN_SKIP_LOCKED than
will be used when the storage engine advertises the flag.
When a storage engine indicates this flag it will get
TL_WRITE_SKIP_LOCKED and TL_READ_SKIP_LOCKED transaction types.
The Lex structure has been updated to store both the FOR UPDATE/LOCK IN
SHARE as well as the SKIP LOCKED so the SHOW CREATE VIEW
implementation is simplier.
"SELECT FOR UPDATE ... SKIP LOCKED" combined with CREATE TABLE AS or
INSERT.. SELECT on the result set is not safe for STATEMENT based
replication. MIXED replication will replicate this as row based events."
Thanks to guidance from Facebook commit
193896c466
This helped verify basic test case, and components that need implementing
(even though every part was implemented differently).
Thanks Marko for guidance on simplier InnoDB implementation.
Reviewers: Marko, Monty
Use < TL_FIRST_WRITE for determining a READ transaction.
Use TL_FIRST_WRITE as the relative operator replacing TL_WRITE_ALLOW_WRITE
as the minimium WRITE lock type.
splittable derived
If one of joined tables of the processed query is a materialized derived
table (or view or CTE) with GROUP BY clause then under some conditions it
can be subject to split optimization. With this optimization new equalities
are injected into the WHERE condition of the SELECT that specifies this
derived table. The injected equalities are generated for all join orders
with which the split optimization can employed. After the best join order
has been chosen only certain of this equalities are really needed. The
others can be safely removed. If it's not done and some of injected
equalities involve expressions over semi-joins with look-up access then
the query may return a wrong result set.
This patch effectively removes equalities injected for split optimization
that are needed only at the optimization stage and not needed for execution.
Approved by serg@mariadb.com
row_number() over () window function can be used without any column in the OVER
clause. Additionally, the item doesn't reference any tables, as it's not
effectively referencing any table. Rather it is specifically built based
on the end temporary table used for window function computation.
This caused remove_const function to wrongly drop it from the ORDER
list. Effectively, we shouldn't be dropping any window function from the
ORDER clause, so adjust remove_const to account for that.
Reviewed by: Sergei Petrunia sergey@mariadb.com