Parser changes made by Alexander Barkov <bar@mariadb.com>.
Part of the tests made by Iqbal Hassan <iqbal@hasprime.com>.
The initial marking with the ORA_JOIN flag was also made by
Iqbal Hassan <iqbal@hasprime.com>.
The main idea is that the parser marks fields followed by (+) with a
flag (ORA_JOIN).
During prepare, the flag is carried over to newly created items if needed.
Later, after the WHERE condition has been prepared (fix_fields()), the
relations between the tables are analyzed and the tables are reordered
so as to build a chain of JOIN/LEFT JOIN operators equivalent to
the query with the Oracle outer join operator (+).
Then the (+) flags are removed.
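The reordering step can be illustrated with a small standalone sketch. It is
not the server code: Predicate and order_tables() are hypothetical names, and
the only rule it assumes is that the table on the (+) side of a predicate is
the inner (optional) table and must come after the table it is joined to. For
example, WHERE t1.a = t2.a(+) corresponds to t1 LEFT JOIN t2 ON t1.a = t2.a.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// One WHERE predicate "left.col = right.col" where either side may carry
// the Oracle (+) marker (the ORA_JOIN flag). Hypothetical type.
struct Predicate {
  std::string left_table;
  std::string right_table;
  bool left_marked;   // true for "left.col(+)"
  bool right_marked;  // true for "right.col(+)"
};

// Returns the tables in an order where every (+)-marked table comes after
// the tables it is outer-joined to. Hypothetical helper, not server code.
std::vector<std::string> order_tables(const std::vector<std::string> &tables,
                                      const std::vector<Predicate> &preds) {
  // deps[t] = tables that must precede t because t is on the (+) side.
  std::map<std::string, std::set<std::string>> deps;
  for (const Predicate &p : preds) {
    if (p.right_marked && !p.left_marked)
      deps[p.right_table].insert(p.left_table);
    if (p.left_marked && !p.right_marked)
      deps[p.left_table].insert(p.right_table);
  }
  std::vector<std::string> result;
  std::set<std::string> placed;
  // Repeatedly place every table whose prerequisites are already placed.
  bool progress = true;
  while (progress && result.size() < tables.size()) {
    progress = false;
    for (const std::string &t : tables) {
      if (placed.count(t)) continue;
      bool ready = true;
      for (const std::string &d : deps[t])
        if (!placed.count(d)) ready = false;
      if (ready) {
        result.push_back(t);
        placed.insert(t);
        progress = true;
      }
    }
  }
  return result;  // shorter than `tables` if the dependencies are cyclic
}
```

A result shorter than the input signals contradictory markers (each table
marked as optional relative to the other); how the server reports that case
is not covered by this sketch.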
main/statistics_json.result is updated for f8ba5ced55 (MDEV-36099)
The test uses 'delete from t1' in many places and then populates
the table again. The natural order of rows in a MyISAM table is well
defined and the test was implicitly relying on that.
Before f8ba5ced55, delete was deleting rows one by one, using
ha_myisam::delete_row(), because the connection was stuck in rbr mode.
This caused rows to be shown in the reverse insertion order (because of
the delete link list).
MDEV-36099 fixes this bug and the server now correctly uses
ha_myisam::delete_all_rows(). This makes rows appear in the
insertion order, as expected.
temporary table
A compressed field cannot be part of a key by its nature: there is no
data to order, only the compressed data.
For an optimizer temporary table we create an uncompressed substitute.
In all other cases (MDEV-16808) we don't use the key: add_keyuse() is
skipped by the !field->compression_method() condition.
Also expand vcol field index coverings to include indexes covering all
the fields in the expression. The reasoning goes as follows: let f(c1,
c2, ..., cn) be a function applied to columns c1, c2, ..., cn. If
f(...) is covered by an index, so should be the vcol vc whose expression
is f(...).
For example, suppose t.vf = t.c1 + t.c2, and t has three indexes: (vf),
(c1, c2) and (c1).
Before this change, vf's index covering is a singleton {(vf)}. Let's call
that the "conventional" index covering.
After this change vf's index covering is now {(vf), (c1, c2)}, since
(c1, c2) covers both c1 and c2. Let's call (c1, c2) in this case the
"extra" covering.
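The covering rule above can be sketched in a few lines. This is an
illustration only: vcol_index_covering() and the string-based column names are
hypothetical and do not reflect how the server stores key maps.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

using Index = std::vector<std::string>;  // an index is a list of column names

// Returns the positions of the indexes that cover the virtual column `vcol`,
// whose expression reads the columns in `base_cols`. Hypothetical helper.
std::set<std::size_t> vcol_index_covering(
    const std::string &vcol, const std::set<std::string> &base_cols,
    const std::vector<Index> &indexes) {
  assert(!vcol.empty());
  std::set<std::size_t> covering;
  for (std::size_t i = 0; i < indexes.size(); ++i) {
    const std::set<std::string> cols(indexes[i].begin(), indexes[i].end());
    // "Conventional" covering: the index contains the vcol itself.
    const bool conventional = cols.count(vcol) > 0;
    // "Extra" covering: the index contains every column of the expression.
    bool extra = !base_cols.empty();
    for (const std::string &c : base_cols)
      if (!cols.count(c)) extra = false;
    if (conventional || extra) covering.insert(i);
  }
  return covering;
}
```

With the indexes (vf), (c1, c2), (c1) and vf = c1 + c2, the function returns
positions 0 and 1, matching the covering {(vf), (c1, c2)} described above.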
With the coverings updated, when an index in the "extra" covering is
chosen for keyread, the vcol also needs to be calculated. In this case
we mark vcol in the table read_set, and ensure it is computed.
With these changes, we see various improvements, including going from
full table scan + filesort to full index scan + filesort when ORDER BY
uses an indexed vcol (here vc = c + 1 is a vcol and both c and vc are
indexed):
explain select c + 1 from t order by vc;
id select_type table type possible_keys key key_len ref rows Extra
-1 SIMPLE t ALL NULL NULL NULL NULL 10000 Using filesort
+1 SIMPLE t index NULL c 5 NULL 10000 Using index; Using filesort
The substitutions are followed by updates to all_fields, which includes a
copy of the ORDER BY/GROUP BY item pointers, as well as corresponding
updates to ref_pointer_array, so that all_fields and
ref_pointer_array remain in sync.
Another, related change is the recomputation of the table index covering
on substitutions. It not only reflects the correct table index
covering after the substitutions, but also improves executions where
the vcol index can be chosen, such as this example (here vc = c + 1
and vc is the only index in the table), which goes from full table scan +
filesort to full index scan:
select vc from t order by c + 1;
We do it in SELECT as well as in single table DELETE/UPDATE.
For all degenerate select queries having sub-queries in them,
the field rows_examined in the slow query log is always being set to 0.
The problem is that, although sub-queries increment the rows_examined field
of the thd object correctly, the degenerate outer select query is resetting
the rows_examined to zero after it has finished execution, by
invoking thd->set_examined_row_count(0).
The solution is to remove the thd->set_examined_row_count(0) in the
degenerate select queries.
The recursive nature of add_table_function_dependencies
resolution meant that, after a stack overrun was detected,
the function would continue to call itself recursively.
It's quite possible that a single user SQL statement could get
multiple ER_STACK_OVERRUN_NEED_MORE errors.
Additionally, the result of the stack overrun check
was incorrectly assigned to a table_map result.
It's only because of the "if error" check after
add_table_function_dependencies is called, which
detected the stack overrun error, that a potentially
corrupted table map was prevented from being processed.
Corrected add_table_function_dependencies to stop and
return on the detection of a stack overrun error.
The add_extra_deps call also was true on a stack overrun.
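The corrected control flow can be sketched as follows. This is a simplified
model, not the server function: Node, kMaxDepth, overrun_reports and
collect_dependencies() are hypothetical, and a depth counter stands in for the
real stack check. The point is that the error status is returned separately
from the table_map and that recursion stops at the first failure.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using table_map = std::uint64_t;

// Hypothetical expression node: its own tables plus child expressions.
struct Node {
  table_map own_tables;
  std::vector<const Node *> args;
};

constexpr int kMaxDepth = 64;  // stand-in for a real stack-size check
int overrun_reports = 0;       // counts how often the error is raised

// Collects the tables used by `node` into *deps.
// Returns true on error, so the error status never mixes with the bitmap.
bool collect_dependencies(const Node *node, int depth, table_map *deps) {
  assert(node != nullptr && deps != nullptr);
  if (depth > kMaxDepth) {
    ++overrun_reports;  // raised once: callers stop as soon as they see true
    return true;
  }
  *deps |= node->own_tables;
  for (const Node *arg : node->args)
    if (collect_dependencies(arg, depth + 1, deps))
      return true;  // propagate immediately, do not visit the siblings
  return false;
}
```

Because every caller returns as soon as a child reports failure, a statement
with several overly deep branches raises the error only once in this model.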
A wrong assertion was added by f1f9284181 (MDEV-34046), because a PS
parameter is applicable not only to DELETE HISTORY.
Keeping the value of select_lex->where for DELETE HISTORY was reworked to
use prep_where, which is read by reinit_stmt_before_use(). For SELECT,
prep_where is set in JOIN::optimize_inner(), which is not called for
DELETE.
Implements and tests the optimizer hints DERIVED_CONDITION_PUSHDOWN
and NO_DERIVED_CONDITION_PUSHDOWN, table-level hints to enable and
disable, respectively, the condition pushdown for derived tables
which is typically controlled by the condition_pushdown_for_derived
optimizer switch.
Implements and tests the optimizer hints MERGE and NO_MERGE, table-level
hints to enable and disable, respectively, the derived_merge optimization
which is typically controlled by the derived_merge optimizer switch.
Sometimes hints need to be fixed before TABLE instances are available, but
after TABLE_LIST instances have been created (as in the cases of MERGE and
NO_MERGE). This commit introduces a new function called
fix_hints_for_derived_table to allow early hint fixing for derived tables,
using only a TABLE_LIST instance (so long as such hints are not index-level).
replication problems
DELETE HISTORY did not process parameterized PS properly, as the
history expression was checked at the prepare stage, when the parameters
were not yet substituted. In that case check_units() succeeded as there
is no invalid type: Item_param has type_handler_null, which is
inherited from the string type, and this is a valid type for a history
expression. The warning was thrown when the expression was evaluated
for comparison on delete execution (when the parameter was already
substituted).
The fix postpones check_units() until the first PS execution. We have
to postpone the processing of WHERE conditions until the first execution and
update select_lex.where on every execution, as it is reset to the state
after prepare.
When processing queries like
INSERT INTO t1 (..) SELECT .. FROM t1, t2 ...,
there is a single query block (i.e., a single SELECT_LEX) for both INSERT and
SELECT parts. During hints resolution, when hints are attached to particular
TABLE_LIST's, the search is performed by table name across the whole
query block.
So, if a table mentioned in an optimizer hint is present in the INSERT part,
the hint is attached to that table. This is obviously wrong, as
optimizer hints are supposed to only affect the SELECT part of
an INSERT..SELECT statement.
This commit disables attaching hints to tables in the INSERT part
and fixes some other bugs related to the processing of INSERT..SELECT
statements.
This commit:
- fixes a couple of bugs in check_join_cache_usage();
- separates a part of opt_hints.test to a new file opt_hints_join_cache.test;
- adds a batch of test cases run against different join_cache_level settings.
This commit implements optimizer hints that affect the order
of joining tables:
- JOIN_FIXED_ORDER similar to existing STRAIGHT_JOIN hint;
- JOIN_ORDER to apply the specified table order;
- JOIN_PREFIX to hint what tables should be first in the join;
- JOIN_SUFFIX to hint what tables should be last in the join.
- remove get_args_printer() from hints printing
- add append_hint_arguments(THD *thd, opt_hints_enum hint, String *str)
- add more comments
- rename st_opt_hint_info::hint_name to hint_type
- add optimizer trace support for hints
- add dbug_print_hints()
- make print_warn() not be a template
- introduce Printable_parser_rule interface, make grammar rules that
emit warnings implement it, and have print_warn invoke its function
- remove Parser::Hint::append_args() as it is not used anywhere
(it used to be necessary to call print_warn(... (Parser::Hint*)NULL))
It places a limit N (a timeout value in milliseconds) on how long
a statement is permitted to execute before the server terminates it.
Syntax:
SELECT /*+ MAX_EXECUTION_TIME(milliseconds) */ ...
Only top-level SELECT statements support the hint.
The BNL() hint effectively increases join_cache_level up to 4 if it is
set to a value less than 4.
This commit also makes the BKA() hint override not only the
`join_cache_bka` optimizer switch but `join_cache_level` as well.
I.e., the BKA() hint enables BKA and BKAH join buffers, both flat and
incremental, regardless of the `join_cache_level` and `join_cache_bka`
settings.
join_cache_level=0 disables join cache buffers, but the
BNL() hint now allows BNL(H) buffers to be employed for particular tables
or query blocks.
This commit also adds a number of test cases, including
OUTER JOINs, to make sure hints do not break the rules of
join buffer application.
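The effect of the two hints on the settings can be modelled roughly as below.
This is a sketch under assumptions, not the logic of check_join_cache_usage():
JoinCacheSettings, Hint, effective_bnl_level() and bka_allowed() are
hypothetical names, and the behaviour without a hint (BKA needs the switch on
and a level of at least 5, within the 0..8 range) is an assumption based on
the usual meaning of join_cache_level.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical snapshot of the two relevant settings.
struct JoinCacheSettings {
  int join_cache_level;  // assumed range 0..8
  bool join_cache_bka;   // optimizer switch
};

enum class Hint { None, BNL, BKA };

// Level used when block nested loop (BNL/BNLH) buffers are considered.
// With a BNL() hint the level is raised to 4 if it was lower.
int effective_bnl_level(const JoinCacheSettings &s, Hint h) {
  assert(s.join_cache_level >= 0 && s.join_cache_level <= 8);
  if (h == Hint::BNL) return std::max(s.join_cache_level, 4);
  return s.join_cache_level;
}

// Whether BKA/BKAH buffers may be used for the hinted table or query block.
bool bka_allowed(const JoinCacheSettings &s, Hint h) {
  assert(s.join_cache_level >= 0 && s.join_cache_level <= 8);
  if (h == Hint::BKA) return true;  // overrides both settings
  // Assumption: without a hint, both the switch and the level must allow BKA.
  return s.join_cache_bka && s.join_cache_level >= 5;
}
```

In this model, join_cache_level=0 with a BNL() hint yields an effective level
of 4, while a level already above 4 is left unchanged.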
This commit introduces:
- the infrastructure for optimizer hints;
- hints for join buffering: BNL(), NO_BNL(), BKA(), NO_BKA();
- NO_ICP() hint for disabling index condition pushdown;
- MRR(), NO_MRR() hints for multi-range read control;
- NO_RANGE_OPTIMIZATION() for disabling range optimization;
- QB_NAME() for assigning names for query blocks.
In non-EXPLAIN queries with subqueries, the trace was flooded
with empty "join_execution":{} nodes. Now, they are gone.
The "Range checked for each record" optimization still prints
content into trace on join execution. Now, we wrap it into
"range-checked-for-each-record" to delimit the invocations.
This new object has the fields "select_id", which corresponds to
the outer query block, and "loop", which corresponds to
the inner query block iteration number. Additionally,
the field "row_estimation", which is itself an object, has
"table" and "range_analysis" fields that were moved
from the steps array of the old "join_execution".
This patch adds support for SYS_REFCURSOR (a weakly typed cursor)
for both sql_mode=ORACLE and sql_mode=DEFAULT.
Works as a regular stored routine variable, parameter and return value:
- can be passed as an IN parameter to stored functions and procedures
- can be passed as an INOUT and OUT parameter to stored procedures
- can be returned from a stored function
Note, strongly typed REF CURSOR will be added separately.
Note, to maintain dependencies easier, some parts of sql_class.h
and item.h were moved to new header files:
- select_results.h:
class select_result_sink
class select_result
class select_result_interceptor
- sp_cursor.h:
class sp_cursor_statistics
class sp_cursor
- sp_rcontext_handler.h
class Sp_rcontext_handler and its descendants
The implementation consists of the following parts:
- A new class sp_cursor_array deriving from Dynamic_array
- A new class Statement_rcontext which contains data shared
between sub-statements of a compound statement.
It has a member m_statement_cursors of the sp_cursor_array data type,
as well as open cursor counter. THD inherits from Statement_rcontext.
- A new data type handler Type_handler_sys_refcursor in plugins/type_cursor/
It is designed to store uint16 references -
positions of the cursor in THD::m_statement_cursors.
- Type_handler_sys_refcursor suppresses some derived numeric features.
When a SYS_REFCURSOR variable is used as an integer an error is raised.
- A new abstract class sp_instr_fetch_cursor. It's needed to share
the common code between "OPEN cur" (for static cursors) and
"OPEN cur FOR stmt" (for SYS_REFCURSORs).
- New sp_instr classes:
* sp_instr_copen_by_ref - OPEN sys_ref_cursor FOR stmt;
* sp_instr_cfetch_by_ref - FETCH sys_ref_cursor INTO targets;
* sp_instr_cclose_by_ref - CLOSE sys_ref_cursor;
* sp_instr_destruct_variable - to destruct SYS_REFCURSOR variables when
the execution goes out of the BEGIN..END block
where SYS_REFCURSOR variables are declared.
- New methods in LEX:
* sp_open_cursor_for_stmt - handles "OPEN sys_ref_cursor FOR stmt".
* sp_add_instr_fetch_cursor - "FETCH cur INTO targets" for both
static cursors and SYS_REFCURSORs.
* sp_close - handles "CLOSE cur" both for static cursors and SYS_REFCURSORs.
- Changes in cursor functions to handle both static cursors and SYS_REFCURSORs:
* Item_func_cursor_isopen
* Item_func_cursor_found
* Item_func_cursor_notfound
* Item_func_cursor_rowcount
- A new system variable @@max_open_cursors - to limit the number
of cursors (static and SYS_REFCURSORs) opened at the same time.
Its allowed range is [0-65536], with 50 by default.
- A new virtual method Type_handler::can_return_bool() telling
if calling item->val_bool() is allowed for Items of this data type,
or if otherwise the "Illegal parameter for operation" error should be raised
at fix_fields() time.
- New methods in Sp_rcontext_handler:
* get_cursor()
* get_cursor_by_ref()
- A new class Sp_rcontext_handler_statement to handle top level statement
wide cursors which are shared by all substatements.
- A new virtual method expr_event_handler() in classes Item and Field.
It's needed to close (and make available for a new OPEN)
unused THD::m_statement_cursors elements which do not have any references
any more. It can happen at various moments in time, e.g.
* after evaluating the parameters of an SQL routine
* after assigning a cursor expression to a SYS_REFCURSOR variable
* when leaving a BEGIN..END block with SYS_REFCURSOR variables
* after setting OUT/INOUT routine actual parameters from formal
parameters.
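The interplay between the cursor array, the uint16 references and the
@@max_open_cursors limit can be sketched as below. This is a toy model and
not sp_cursor_array: CursorArray and its methods are hypothetical, and the
per-slot reference counter is an assumption used here to decide when a slot
has no references any more and can be reused by a new OPEN.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the statement-level cursor array.
class CursorArray {
 public:
  explicit CursorArray(std::size_t max_open) : max_open_(max_open) {}

  // Opens a cursor and stores its position in *pos.
  // Returns false when the open-cursor limit is reached.
  bool open(std::uint16_t *pos) {
    if (open_count_ >= max_open_) return false;
    for (std::size_t i = 0; i < slots_.size(); ++i)  // reuse a free slot first
      if (!slots_[i].open) return take(i, pos);
    if (slots_.size() > UINT16_MAX) return false;  // position must fit uint16
    slots_.push_back(Slot());
    return take(slots_.size() - 1, pos);
  }

  // A new variable or parameter now refers to the cursor at `pos`.
  void add_ref(std::uint16_t pos) { ++slots_.at(pos).refs; }

  // Drops one reference; closes the slot when nothing refers to it any more.
  void release(std::uint16_t pos) {
    Slot &s = slots_.at(pos);
    assert(s.open && s.refs > 0);
    if (--s.refs == 0) {
      s.open = false;
      --open_count_;
    }
  }

  std::size_t open_count() const { return open_count_; }

 private:
  struct Slot {
    bool open = false;
    unsigned refs = 0;
  };

  bool take(std::size_t i, std::uint16_t *pos) {
    slots_[i].open = true;
    slots_[i].refs = 1;
    ++open_count_;
    *pos = static_cast<std::uint16_t>(i);
    return true;
  }

  std::size_t max_open_;
  std::size_t open_count_ = 0;
  std::vector<Slot> slots_;
};
```

In this model, leaving a BEGIN..END block or overwriting a variable would call
release(), and the slot becomes available to the next OPEN once the last
reference is gone.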
This bug is exposed by MDEV-30073. It causes bogus warning messages to
be pushed by find_order_in_list(), but is otherwise benign.
An existing test case in show_explain.test (MDEV-238) can be used together
with an assert to find a query which exposes the issue.
if (resolution == RESOLVED_BEHIND_ALIAS &&
order_item->fix_fields_if_needed_for_order_by(thd, order->item))
return TRUE;
/* Lookup the current GROUP field in the FROM clause. */
order_item_type= order_item->type();
+ DBUG_ASSERT( order_item_type == (*order->item)->type() );
This will fail here
CREATE TABLE t2 ( a INT );
INSERT INTO t2 VALUES (1),(2),(1),(4),(2);
explain SELECT alias.a FROM t2, ( SELECT * FROM t2 ) AS alias
GROUP BY alias.a;
This assert makes little sense after the patch.
DaveGosselin-MariaDB approved these changes Apr 18, 2025
Avoid ASAN failure by collecting statistics from Result objects
before cleaning them up. In related single-table cases, statistics
are maintained directly by the single-table update and delete
functions.
The problem is that a copy function was used in the field list but the
copy was never performed in this execution path.
So the copy should be performed before returning the result.
Protection against uninitialized copy usage was added.
When a subquery with LEFT JOIN is converted into a semi-join, it is possible
to construct cases where the LEFT JOIN's ON expression refers to a table
in the current select but not in the current join nest. For example:
t1 SEMI JOIN (
t2
LEFT JOIN (t3 LEFT JOIN t4 ON t4.col=t1.col) ON expr
)
Here, "ON t4.col=t1.col" has this property. Let's denote it as
ON-EXPR-HAS-REF-OUTSIDE-NEST.
The optimizer handles LEFT JOINs like so:
- Outer join runtime requires that "inner tables follow outer" in
any join order.
- Join optimizer enforces this by constructing join orders that follow
table dependencies as they are specified in TABLE_LIST::dep_tables.
- The dep_tables are set in simplify_joins() according to the contents
of ON expressions and LEFT JOIN structure.
However, the logic in simplify_joins() failed to account for possible
ON-EXPR-HAS-REF-OUTSIDE-NEST. It assumed that references outside of the
current join nest could only be OUTER_REF_TABLE_BIT or RAND_TABLE_BIT.
The fix was to add the missing logic.
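The role of the dependency bitmaps can be illustrated with a small check of
whether a join order honours them. This is a sketch, not the join optimizer:
order_respects_dependencies() is a hypothetical helper, tables are identified
by their bit position, and the concrete dependency bits used in the note after
the code are an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using table_map = std::uint64_t;

// deps[i] is the bitmap of tables that must precede table i in the join
// order (the role dep_tables plays in the text). Hypothetical helper.
bool order_respects_dependencies(const std::vector<std::size_t> &order,
                                 const std::vector<table_map> &deps) {
  table_map placed = 0;
  for (std::size_t t : order) {
    assert(t < deps.size() && t < 64);
    if ((deps[t] & ~placed) != 0) return false;  // a prerequisite is missing
    placed |= table_map{1} << t;
  }
  return true;
}
```

Take t1..t4 as bits 0..3 in the example above. If t4 depends only on t2 and
t3, the order t2, t3, t4, t1 passes the check. Once the reference to t1 in
"ON t4.col=t1.col" is also recorded as a dependency of t4, that order is
rejected and t1 has to come before t4.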
The fix for MDEV-34413 added support for Index Condition Pushdown with reverse
ordered scans. This makes Rowid filtering work with reverse-ordered scans, too,
so enable it. For example, InnoDB can now check the pushed index condition and
then check the rowid filter on success, in the ORDER BY ... DESC case.