Make sure that SELECT_LEX_UNIT::derived, behaves as documented
(points to the "TABLE_LIST representing this union in the
embedding select"). For recursive CTE this was not necessarily
the case, it could've pointed to the TABLE_LIST inside the CTE,
not in the embedding select.
To fix:
* don't update unit->derived in mysql_derived_prepare(), pass derived
as an argument to st_select_lex_unit::prepare()
* prefer to set unit->derived in TABLE_LIST::init_derived()
to the TABLE_LIST in the embedding select, not to the recursive
reference. Fail if there are many TABLE_LISTs in the embedding
select with conflicting FOR SYSTEM_TIME clauses.
cleanup:
* remove redundant THD* argument from st_select_lex_unit::prepare()
This bug happened due to a defect of the implementation of the handler
function ha_delete_all_rows() for the ARIA engine.
The function maria_delete_all_rows() truncated the table, but it didn't
touch the write cache, so the cache's write offset was not reset.
In the scenario like in the function st_select_lex_unit::exec_recursive
when first all records were deleted from the table and then several new
records were added some metadata became inconsistent with the state of
the cache. As a result the table scan function could not read records
at the end of the table.
The same defect could be found in the implementation of ha_delete_all_rows()
for the MYISAM engine mi_delete_all_rows().
Additionally made late instantiation for the temporary table used to store
rows that were used for each new iteration when executing a recursive CTE.
the non-recursive CTE defined with UNION
The problem appears as the columns of the non-recursive CTE weren't renamed.
The renaming procedure was called for recursive CTEs only.
To fix it in the procedure st_select_lex_unit::prepare
With_element::rename_columns_of_derived_unit is called now for both CTEs:
recursive and non-recursive.
is not supported
Allowed to use recursive references in derived tables.
As a result usage of recursive references in operands of
INTERSECT / EXCEPT is now supported.
This was done in, among other things:
- thd->db and thd->db_length
- TABLE_LIST tablename, db, alias and schema_name
- Audit plugin database name
- lex->db
- All db and table names in Alter_table_ctx
- st_select_lex db
Other things:
- Changed a lot of functions to take const LEX_CSTRING* as argument
for db, table_name and alias. See init_one_table() as an example.
- Changed some function arguments from LEX_CSTRING to const LEX_CSTRING
- Changed some lists from LEX_STRING to LEX_CSTRING
- threads_mysql.result changed because process list_db wasn't always
correctly updated
- New append_identifier() function that takes LEX_CSTRING* as arguments
- Added new element tmp_buff to Alter_table_ctx to separate temp name
handling from temporary space
- Ensure we store the length after my_casedn_str() of table/db names
- Removed not used version of rename_table_in_stat_tables()
- Changed Natural_join_column::table_name and db_name() to never return
NULL (used for print)
- thd->get_db() now returns db as a printable string (thd->db.str or "")
TRASH was mapped to TRASH_FREE and was supposed to be used for memory
that should not be accessed anymore, while TRASH_ALLOC() is to be
used for uninitialized but to-be-used memory.
But sometimes TRASH() was used in the latter sense.
Remove TRASH() macro, always use explicit TRASH_ALLOC() or TRASH_FREE().
with recursive reference in subquery
If a recursive CTE uses a subquery with recursive reference then
the virtual function reset() must be called after each iteration
performed at the execution of the CTE.
Most "new" failures fixed in the following files:
- sql_select.cc
- item.cc
- item_func.cc
- opt_subselect.cc
Other things:
- Allocate udf_handler strings in mem_root
- Required changes in sql_string.h
- Add mem_root as argument to some new [] calls
- Mark udf_handler strings as thread specific
- Removed some comment blocks with code
With INTERSECT/EXCEPT fact that subquery item of IN/ALL/ANY was not assigned value does not mean that temporary table used for calculating unit is empty (records could be deleted).
With INTERSECT/EXCEPT fact that subquery item of IN/ALL/ANY was not assigned value does not mean that temporary table used for calculating unit is empty (records could be deleted).
As a result of this merge the code for the following tasks appears in 10.3:
- MDEV-12172 Implement tables specified by table value constructors
- MDEV-12176 Transform [NOT] IN predicate with long list of values INTO
[NOT] IN subquery.
- Added sql/mariadb.h file that should be included first by files in sql
directory, if sql_plugin.h is not used (sql_plugin.h adds SHOW variables
that must be done before my_global.h is included)
- Removed a lot of include my_global.h from include files
- Removed include's of some files that my_global.h automatically includes
- Removed duplicated include's of my_sys.h
- Replaced include my_config.h with my_global.h
"Optimization for equi-joins of derived tables with GROUP BY"
should be considered rather as a 'proof of concept'.
The task itself is targeted at an optimization that employs re-writing
equi-joins with grouping derived tables / views into lateral
derived tables. Here's an example of such transformation:
select t1.a,t.max,t.min
from t1 [left] join
(select a, max(t2.b) max, min(t2.b) min from t2
group by t2.a) as t
on t1.a=t.a;
=>
select t1.a,tl.max,tl.min
from t1 [left] join
lateral (select a, max(t2.b) max, min(t2.b) min from t2
where t1.a=t2.a) as t
on 1=1;
The transformation pushes the equi-join condition t1.a=t.a into the
derived table making it dependent on table t1. It means that for
every row from t1 a new derived table must be filled out. However
the size of any of these derived tables is just a fraction of the
original derived table t. One could say that transformation 'splits'
the rows used for the GROUP BY operation into separate groups
performing aggregation for a group only in the case when there is
a match for the current row of t1.
Apparently the transformation may produce a query with a better
performance only in the case when
- the GROUP BY list refers only to fields returned by the derived table
- there is an index I on one of the tables T used in FROM list of
the specification of the derived table whose prefix covers the
the fields from the proper beginning of the GROUP BY list or
fields that are equal to those fields.
Whether the result of the re-writing can be executed faster depends
on many factors:
- the size of the original derived table
- the size of the table T
- whether the index I is clustering for table T
- whether the index I fully covers the GROUP BY list.
This patch only tries to improve the chosen execution plan using
this transformation. It tries to do it only when the chosen
plan reaches the derived table by a key whose prefix covers
all the fields of the derived table produced by the fields of
the table T from the GROUP BY list.
The code of the patch does not evaluates the cost of the improved
plan. If certain conditions are met the transformation is applied.