SplM_opt_info::last_refills is never set, so remove it. It was used in
In JOIN_TAB::choose_best_splitting(), use local variable "refills" there,
instead.
The logic in the function is that we compute
startup_cost= refills * spl_plan->cost;
only when we have a splitting query plan which implies "refills" was set
accordingly.
Also
- Added a comment about what "refills" is.
- Updated derived_cond_pushdown.result: the change makes the
Split-Materialized plans more expensive, so the join order changes from
t3, <split-materialized(outer_ref)> to <split-materialized>, t3
failed in find_field_in_table_ref
The main crash with segfault in find_field_in_tables() was fixed by
6aa47fae30 (MDEV-35276). This fix is for debug assertion.
Since Item_default_value is also Item_field there is nothing to be
done except adding DEFAULT_VALUE_ITEM type check.
(Variant 3) (commit in 11.4)
When a derived table has a GROUP BY clause:
SELECT ...
FROM (SELECT ... GROUP BY col1, col2) AS tbl
The optimizer would use inner join's output cardinality as an estimate
of derived table size, ignoring the fact that GROUP BY operation would
produce much fewer groups.
Add code to produce tighter bounds:
- The GROUP BY list is split into per-table lists. If GROUP BY list has
expressions that refer to multiple tables, we fall back to join output
cardinality.
- For each table, the first cardinality estimate is join_tab->read_records.
- Then, we try to get a tighter bound by using index statistics.
- If indexes do not cover all GROUP BY columns, we try to use per-column
EITS statistics.
MDEV-35958 Cost estimates for materialized derived tables are poor
(Backport 11.8->11.4, the same patch)
Estimate_scan_time() calculates the cost of scanning a derivied table.
The old code did not take into account that the temporary table heap table
may be converted to Aria.
Things fixed:
- Added checking if the temporary tables data will fit in the heap.
If not, then calculate the cost based on the designated internal
temporary table engine (Aria).
- Removed MY_MAX(records, 1000) and instead trust the optimizer's
estimate of records. This reduces the cost of temporary tables a bit
for small tables, which caused a few changes in mtr results.
- Fixed cost calculation for HEAP.
- HEAP costs->row_next_find_cost was not set. This does not affect old
costs calculation as this cost slot was not used anywhere.
Now HEAP cost->row_next_find_cost is set, which allowed me to remove
some duplicated computation in ha_heap::scan_time()
Reviewed by: Sergei Petrunia <sergey@mariadb.com>
MDEV-32329 (patch) pushdown from having into where: Server crashes at sub_select
When generating an Item_equal with a Item_ref that refers to a field
outside of a subselect, remove_item_direct_ref() causes the dependency
(depended_from) on the outer select to be lost, which causes trouble
for code downstream that can no longer determine the scope of the Item.
Not calling remove_item_direct_ref() retains the Item's dependency.
Test cases from MDEV-32395 and MDEV-32329 are included.
Some fixes from other developers:
Monty:
- Fixed wrong code in Item_equal::create_pushable_equalities()
that could cause wrong item to be used if there was no matching items.
Daniel Black:
- Added test cases from MDEV-32329
Igor Babaev:
- Provided fix for removing call to remove_item_direct_ref() in
eliminate_item_equal()
MDEV-32395: update_depend_map_for_order: SEGV at /mariadb-11.3.0/sql/sql_select.cc:16583
Include test cases from MDEV-32329.
Two problem solved:
1) Item_default_value makes a shallow copy so the copy
should not delete field belong to the Item
2) Item_default_value should not inherit
derived_field_transformer_for_having and
derived_field_transformer_for_where (in this variant
pushing DEFAULT(f) is prohibited (return NULL) but
if return "this" it will be allowed (should go with
a lot of tests))
Statements affected by this bug need all the following to be true
1) a derived table table or view whose specification contains a set
operation at the top level.
2) a grouping operator (group by/having) operating on a column alias
other than in the first select of the union/intersect
3) an outer condition that will be pushed into all selects in this
union/intersect, either into the where or having clause
When pushing a condition into all selects of a unit with more than one
select, pushdown_cond_for_derived() renames items so we can re-use the
condition being pushed.
These names need to be saved and reset for correct name resolution on
second execution of prepared statements.
Reviewed by Igor Babaev (igor@mariadb.com)
This commits adds the "materialization" block to the output of
EXPLAIN/ANALYZE FORMAT=JSON when materialized subqueries are involved
into processing. In the case of ANALYZE additional runtime information
is displayed, such as:
- chosen strategy of materialization
- number of partial match/index lookup loops
- sizes of partial match buffers
This bug could affect queries with IN subqueries in WHERE clause and using
derived tables to which split optimization potentially could be applied.
When looking for the best split of a splittable derived table T any key
access from a semi-join materialized table used for lookups S to table T
must be excluded from consideration because in the current implementation
of such tables as S the values from its records cannot be used to access
other tables.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
Author: Sergei Petrunia <sergey@mariadb.com>
Date: Wed Oct 11 19:02:25 2023 +0300
MDEV-32301: Server crashes at Arg_comparator::compare_row
In Item_bool_rowready_func2::build_clone(): if we're setting
clone->cmp.comparators=0
also set
const_item_cache=0
as the Item is currently in a state where one cannot compute it.
In Item_bool_rowready_func2::build_clone(): if we're setting
clone->cmp.comparators=0
also set
const_item_cache=0
as the Item is currently in a state where one cannot compute it.
ANALYZE FORMAT=JSON output now includes table.r_engine_stats which
has the engine statistics. Only non-zero members are printed.
Internally: EXPLAIN data structures Explain_table_acccess and
Explain_update now have handler* handler_for_stats pointer.
It is used to read statistics from handler_for_stats->handler_stats.
The following applies only to 10.9+, backport doesn't use it:
Explain data structures exist after the tables are closed. We avoid
walking invalid pointers using this:
- SQL layer calls Explain_query::notify_tables_are_closed() before
closing tables.
- After that call, printing of JSON output is disabled. Non-JSON output
can be printed but we don't access handler_for_stats when doing that.