This bug could cause a crash when executing queries that used mutually
recursive CTEs with system variable big_tables set to 1. It happened due
to several bugs in the code that handled recursive table references
referred mutually recursive CTEs. For each recursive table reference a
temporary table is created that contains all rows generated for the
corresponding recursive CTE table on the previous step of recursion.
This temporary table should be created in the same way as the temporary
table created for a regular materialized derived table using the
method select_union::create_result_table(). In this case when the
temporary table is created it uses the select_union::TMP_TABLE_PARAM
structure as the parameter for the table construction. However the
code created the temporary table using just the function create_tmp_table()
and passed pointers to certain fields of the TMP_TABLE_PARAM structure
used for accumulation of rows of the recursive CTE table as parameters
for update. This was a mistake because now different temporary tables
cannot share some TMP_TABLE_PARAM fields in a general case. Besides,
depending on how mutually recursive CTE tables were defined and which
of them were referred in the executed query the select_union object
allocated for a recursive table reference could be allocated again after
the the temporary table had been created. In this case the TMP_TABLE_PARAM
object associated with the temporary table created for the recursive
table reference contained unassigned fields needed for execution when
Aria engine is employed as the engine for temporary tables.
This patch ensures that
- select_union object is created only once for any recursive table
reference
- any temporary table created for recursive CTEs uses its own
TMP_TABLE_PARAM structure
The patch also fixes a problem caused by incomplete cleanup of join tables
associated with recursive table references.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
in TABLE_LIST::is_recursive_with_tables
After the patch for MDEV-23619 the code of st_select_lex::cleanup started
using the list st_select_lex::leaf_tables. This list is built for any
query with FROM clause in the function setup_tables(). If such query is
used in a stored procedure it must be ensured that the list is empty
before each new call of the procedure. Otherwise if the first call of
the procedure is successful while the second call reports an error before
the setup_tables() is invoked then list st_select_lex::leaf_tables would
point to a piece of memory that has been already freed.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
Due to a premature cleanup of the unit that specified a recursive CTE
used in the second operand of union the server fell into an infinite
loop in the reported test case. In other cases this premature cleanup
could cause other problems.
The bug is the result of a not quite correct fix for MDEV-17024. The
unit that specifies a recursive CTE has to be cleaned only after the
cleanup of the last external reference to this CTE. It means that
cleanups of the unit triggered not by the cleanup of a external
reference to the CTE must be blocked.
Usage of local table chains in selects to get external references to
recursive CTEs was not correct either because of possible merges of
some selects.
Also fixed a minor bug in st_select_lex::set_explain_type() that caused
typing 'RECURSIVE UNION' instead of 'UNION' in EXPLAIN output for external
references to a recursive CTE.
Removing the ORDER BY clause from the UNION when UNION is inside an IN/ALL/ANY/EXISTS subquery.
The rewrites are done for subqueries but this rewrite is not done for the fake_select of
the UNION.
This bug is the same as the bug MDEV-17024. The crashes caused by these
bugs were due to premature cleanups of the unit specifying recursive CTEs
that happened in some cases when there were several outer references the
same recursive CTE.
The problem of premature cleanups for recursive CTEs could be already
resolved by the correction in TABLE_LIST::set_as_with_table() introduced
in this patch. ALL other changes introduced by the patches for MDEV-17024
and MDEV-22748 guarantee that this clean-ups are performed as soon as
possible: when the select containing the last outer reference to a
recursive CTE is being cleaned up the specification of the recursive CTE
should be cleaned up as well.
MDEV-18957 UPDATE with LIMIT clause is wrong for versioned partitioned tables
UPDATE, DELETE: replace linear search of current/historical records
with vers_setup_conds().
Additional DML cases in view.test
A CTE can be defined as a table values constructor. In this case the CTE is
always materialized in a temporary table.
If the definition of the CTE contains a list of the names of the CTE
columns then the query expression that uses this CTE can refer to the CTE
columns by these names. Otherwise the names of the columns are taken from
the names of the columns in the result set of the query that specifies the
CTE.
Thus if the column names of a CTE are provided in the definition the
columns of result set should be renamed. In a general case renaming of
the columns is done in the select lists of the query specifying the CTE.
If a CTE is specified by a table value constructor then there are no such
select lists and renaming is actually done for the columns of the result
of materialization.
Now if a view is specified by a query expression that uses a CTE specified
by a table value constructor saving the column names of the CTE in the
stored view definition becomes critical: without these names the query
expression is not able to refer to the columns of the CTE.
This patch saves the given column names of CTEs in stored view definitions
that use them.
query with VALUES()
A table value constructor can be used in all contexts where a select
can be used. In particular an ORDER BY clause or a LIMIT clause or both
of them can be attached to a table value constructor to produce a new
query. Unfortunately execution of such queries was not supported.
This patch fixes the problem.
If a derived table has SELECT DISTINCT, provide index statistics for it so that the join optimizer in the
upper select knows that ref access to the table will produce one row.
1. Always drop merged_for_insert flag on cleanup (there could be errors which prevent TABLE to be assigned)
2. Make more precise cleanup of select parts which was touched
When the with clause of a query contains a recursive CTE that is not used
then processing of EXPLAIN for this query does not require optimization
of the unit specifying this CTE. In this case if 'derived' is the
TABLE_LIST object created for this CTE then derived->derived_result is NULL
and any assignment to derived->derived_result->table causes a crash.
After fixing this problem in the code of st_select_lex_unit::prepare()
EXPLAIN for such a query worked without crashes. Yet an execution
plan for the recursive CTE appeared there. The cause of this problem was
an incorrect condition used in JOIN::save_explain_data_intern() that
determined whether CTE was to be optimized or not. A similar condition was
used in select_describe() and this patch has corrected it as well.
The function st_select_lex_unit::exec_recursive() missed resetting of
select_limit_cnt and offset_limit_cnt before execution of union parts.
As a result recursive CTEs specified by UNIONs whose SELECTs contained
LIMIT/OFFSET could return wrong sets of records.
This problem manifested itself when a join query used two or more
materialized CTE such that each of them employed the same recursive CTE.
The bug caused a crash. The crash happened because the cleanup()
function was performed premature for recursive CTE. This clean up was
induced by the cleanup of the first CTE referenced the recusrsive CTE.
This cleanup destroyed the structures that would allow to read from the
temporary table containing the rows of the recursive CTE and an attempt to read
these rows for the second CTE referencing the recursive CTE triggered a
crash.
The clean up for a recursive CTE R should be performed after the cleanup
of the last materialized CTE that uses R.
value constructor shows wrong number of rows
If the specification of a derived table contained a table value constructor
then the optimizer incorrectly estimated the number of rows in the derived
table. This happened because the optimizer did not take into account the
number of rows in the constructor. The wrong estimate could lead to choosing
inefficient execution plans.
This is to mark that a field is indirectly part of a key, which simplifes
checking if we need to have this field up to date to evaluate a key.
For example:
CREATE TABLE t1 (a int, b int as (a) virtual,
c int as (b) virtual, index(c))
would mark a and b with PART_INDIRECT_KEY_FLAG.
c is marked with PART_KEY_FLAG as before.
This bug caused crashes for queries with unreferenced non-recursive
CTEs specified by unions.It happened because the function
st_select_lex_unit::prepare() tried to use the value of the field 'derived'
that could not be set for unferenced CTEs as there was no derived
table associated with an unreferenced CTE.
The current code does not support recursive CTEs whose specifications
contain a mix of ALL UNION and DISTINCT UNION operations.
This patch catches such specifications and reports errors for them.
with recursive subquery
There were two problems:
1. The code did not report that usage of global ORDER BY / LIMIT clauses
was not supported yet.
2. The code just reset fake_select_lex of the the unit specifying
a recursive CTE to NULL and that caused memory leaks in some cases.
In this issue we hit the assert because we are adding addition fields to the field JOIN::all_fields list. This
is done because HEAP tables can't index BIT fields so we need to use an additional hidden field for grouping because later it will be
converted to a LONG field. Original field will remain of the BIT type and will be returned. This happens when we convert DISTINCT to
GROUP BY.
The solution is to take into account the number of such hidden fields that would be added to the field
JOIN::all_fields list while calculating the size of the ref_pointer_array.
Partition engine FT keys are implemented in such a way that
the FT function's cleanup() methods use table's internals.
So calling them after close_thread_tables is unsafe.
Forced columns of recursive CTEs to be nullable. SQL standard
requires this only from recursive columns, but in our code
so far we do not differentiate between recursive and non-recursive
columns when aggregating types of the union that specifies a
recursive CTE.
Make sure that SELECT_LEX_UNIT::derived, behaves as documented
(points to the "TABLE_LIST representing this union in the
embedding select"). For recursive CTE this was not necessarily
the case, it could've pointed to the TABLE_LIST inside the CTE,
not in the embedding select.
To fix:
* don't update unit->derived in mysql_derived_prepare(), pass derived
as an argument to st_select_lex_unit::prepare()
* prefer to set unit->derived in TABLE_LIST::init_derived()
to the TABLE_LIST in the embedding select, not to the recursive
reference. Fail if there are many TABLE_LISTs in the embedding
select with conflicting FOR SYSTEM_TIME clauses.
cleanup:
* remove redundant THD* argument from st_select_lex_unit::prepare()