If a splittable materialized derived table / view T is used in an inner nest
of an outer join with an impossible ON condition, then T is marked as a
constant table. Yet the execution plan to build T is still searched for,
even though it is not needed. This search should be skipped.
When the chosen execution plan accesses a join table employing a range
rowid filter, a quick select to scan this range has to be built. This
quick select is built by a call to SQL_SELECT::test_quick_select().
For this call the function should be allowed to evaluate only single-index
range scans. To make this possible a new parameter was added
to this function.
This patch implements an engine-independent unique hash index.
Usage: a unique HASH index is created automatically for a blob/varchar/text column whose key
length is > handler->max_key_length(),
or it can be specified explicitly.
Automatic creation:
CREATE TABLE t1 (a BLOB UNIQUE);
Explicit creation:
CREATE TABLE t1 (a INT, UNIQUE(a) USING HASH);
Internal KEY_PART representations:
A long unique key_info has two representations.
(Let's illustrate this with an example: create table t1(a blob, b blob, unique(a, b));)
1. User-given representation: the key_info->key_part array is similar to what the user has defined.
   In the example it has 2 key_parts (a, b).
2. Storage-engine representation: there is only one key_part and it points to the
   HASH_FIELD. This key_part always comes after the user-defined key_parts.
So:  User-given representation      [a] [b] [hash_key_part]
     key_info->key_part ----^
     Storage-engine representation  [a] [b] [hash_key_part]
     key_info->key_part ------------^
table->s->key_info holds the user-given representation, while table->key_info holds the
storage-engine representation. One representation can be turned into the other by calling
the re/setup_keyinfo_hash functions.
How it works:
1. When the user specifies a HASH index or the key length is > handler->max_key_length(),
   mysql_prepare_create_table() adds one extra vfield (for each long unique key) and sets
   key_info->algorithm to HA_KEY_ALG_LONG_HASH.
2. init_from_binary_frm_image() sets the values for the hash key_part (fieldnr, field and flags).
3. parse_vcol_defs() creates HASH_FIELD->vcol_info. Item_func_hash is used with the list of
   Item_fields; when the user gives an explicit key length, Item_left is applied to the
   Item_field values.
4. ha_write_row/ha_update_row call check_duplicate_long_entry_key(), which creates the hash key
   from table->record[0] and then calls ha_index_read_map; if a duplicate hash is found, the
   result is compared field by field.
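A short usage sketch of the duplicate check described above (a minimal, hypothetical session):
CREATE TABLE t1 (a BLOB UNIQUE);
INSERT INTO t1 VALUES (REPEAT('x', 10000));
INSERT INTO t1 VALUES (REPEAT('x', 10000));
The second INSERT computes the same hash, ha_index_read_map finds the existing row, the
field-by-field comparison confirms the duplicate, and the statement is rejected with an
ER_DUP_ENTRY error.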
* inject portion of time updates into mysql_delete main loop
* triggered case emits delete+insert, no updates
* PORTION OF `SYSTEM_TIME` is forbidden
* `DELETE HISTORY .. FOR PORTION OF ...` is forbidden as well
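A minimal sketch of the affected statement form, assuming a table with an application-time
period p (names and dates are illustrative):
CREATE TABLE coverage (id INT, s DATE, e DATE, PERIOD FOR p (s, e));
DELETE FROM coverage FOR PORTION OF p FROM '2019-01-01' TO '2019-06-01' WHERE id = 1;
Rows only partially covered by the deleted portion are split: the main loop emits the delete
plus the portion-of-time updates (or delete+insert when triggers are present).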
Optimized the code that removed multiple equalities pushed from HAVING
into WHERE. Now this removal is postponed until all multiple equalities
are eliminated in substitute_for_best_equal_field().
A condition can be pushed from the HAVING clause into the WHERE clause
if it depends only on fields that are used in the GROUP BY list
or on fields that are equal to grouping fields.
Aggregate functions can't be pushed down.
How the pushdown is performed, by example:
SELECT t1.a,MAX(t1.b)
FROM t1
GROUP BY t1.a
HAVING (t1.a>2) AND (MAX(c)>12);
=>
SELECT t1.a,MAX(t1.b)
FROM t1
WHERE (t1.a>2)
GROUP BY t1.a
HAVING (MAX(c)>12);
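A hypothetical example of the second case, where the pushed condition uses a field that is
equal to a grouping field (column names are illustrative):
SELECT t1.a,MAX(t1.b)
FROM t1
WHERE t1.a=t1.c
GROUP BY t1.a
HAVING (t1.c>2);
=>
SELECT t1.a,MAX(t1.b)
FROM t1
WHERE (t1.a=t1.c) AND (t1.c>2)
GROUP BY t1.a;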
The implementation scheme:
1. Extract the most restrictive condition cond from the HAVING clause of
the select that depends only on the fields that are used in the GROUP BY
list of the select (directly or indirectly through equalities)
2. Save cond as a condition that can be pushed into the WHERE clause
of the select
3. Remove cond from the HAVING clause if it is possible
The optimization is implemented in the function
st_select_lex::pushdown_from_having_into_where().
New test file having_cond_pushdown.test is created.
Due to inconsistent usage of different cost models to calculate
the cost of ref accesses, we have to make the calculation of the
gain promised by usage of a range filter more complex.
This task implements the optimizer trace.
This feature produces a trace for any SELECT/UPDATE/DELETE statement,
containing information about the decisions taken by the optimizer during
the optimization phase (choice of table access method, various costs,
transformations, etc.). This feature helps to tell why some decisions were
taken by the optimizer and why others were rejected.
Trace is session-local, controlled by the @@optimizer_trace variable.
To enable the optimizer trace we need to write:
set @@optimizer_trace='enabled=on';
To display the trace one can run:
SELECT trace FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;
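A short, hypothetical session (the traced statement must run between enabling the trace and
reading it; the query and table are illustrative):
set @@optimizer_trace='enabled=on';
SELECT * FROM t1 WHERE a > 10;
SELECT trace FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;
set @@optimizer_trace='enabled=off';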
This task also involves:
MDEV-18489: Limit the memory used by the optimizer trace.
This introduces a switch, optimizer_trace_max_mem_size, which limits
the memory used by the optimizer trace. It was implemented by
Sergei Petrunia.
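A one-line sketch of using this switch (the value is illustrative):
set @@optimizer_trace_max_mem_size=1048576;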
MDEV-17631 select_handler for a full query pushdown
Interfaces + Proof of Concept for federatedx with test cases.
The interfaces have been developed for the integration of the ColumnStore engine.
ANALYZE and ANALYZE FORMAT=JSON structures are changed so that they
show additional information when a rowid filter is used:
- r_selectivity_pct - the observed filter selectivity
- r_buffer_size - the size of the rowid filter container buffer
- r_filling_time_ms - how long it took to fill the rowid filter container
A new class Rowid_filter_tracker was added. This class is needed to collect data
about how the rowid filter is executed.
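A hypothetical fragment of ANALYZE FORMAT=JSON output for a table accessed through a rowid
filter (the exact nesting may differ; the numbers are illustrative):
  "rowid_filter": {
    "r_selectivity_pct": 10.7,
    "r_buffer_size": 88,
    "r_filling_time_ms": 0.042
  }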
remove TABLE_SHARE::error_table_name() and TABLE_SHARE::orig_table_name
(that was allocated in a wrong memroot in this bug).
instead, simply set TABLE_SHARE::table_name correctly.
This patch contains a full implementation of the optimization
that allows using in-memory rowid / primary key filters built for range
conditions over indexes. In many cases usage of such filters reduces
the number of disk seeks spent on fetching table rows.
In this implementation the choice of which filter to apply
(if any) is made purely on cost-based considerations.
This implementation re-architected the partial implementation of
the feature pushed by Galina Shalygina in the commit
8d5a11122c.
Besides that, this patch contains a better implementation of the generic
handler function handler::multi_range_read_info_const() that
takes into account gaps between ranges when calculating the cost of
range index scans. It also contains some corrections of the
implementation of the handler function records_in_range() for MyISAM.
This patch supports the feature for InnoDB and MyISAM.
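A hypothetical query shape where such a filter may be chosen (table, column and index names
are illustrative): the range condition on order_date builds a rowid filter that is probed
during the ref access on customer_id.
CREATE TABLE orders (
  id INT PRIMARY KEY,
  customer_id INT,
  order_date DATE,
  KEY (customer_id),
  KEY (order_date)
) ENGINE=InnoDB;
SELECT * FROM orders
WHERE customer_id = 10 AND order_date BETWEEN '2019-01-01' AND '2019-03-31';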
Also fixes:
MDEV-17741 Assertion `thd->Item_change_list::is_empty()' failed in mysql_parse after unsuccessful PS
The problem was introduced by:
commit f033fbd9f2
Changed the test case for MDEV-15571
It was later fixed, but in 10.3 only:
commit ce2cf855bf
MDEV-16043 Assertion thd->Item_change_list::is_empty() failed in mysql_parse
upon SELECT from a view reading from a versioned table
This patch is a backport of ce2cf855bf to 10.2
While calculating DISTINCT with the function remove_dup_with_compare(), there were no rnd_end
calls after the scan over the temporary table had been completed.
Added ha_rnd_end calls when we are done with the scan of the table.
When the WITH clause of a query contains a recursive CTE that is not used,
processing of EXPLAIN for this query does not require optimization
of the unit specifying this CTE. In this case, if 'derived' is the
TABLE_LIST object created for this CTE, then derived->derived_result is NULL
and any assignment to derived->derived_result->table causes a crash.
After fixing this problem in the code of st_select_lex_unit::prepare(),
EXPLAIN for such a query worked without crashes. Yet an execution
plan for the recursive CTE still appeared there. The cause of this problem was
an incorrect condition used in JOIN::save_explain_data_intern() that
determined whether the CTE was to be optimized or not. A similar condition was
used in select_describe(), and this patch has corrected it as well.
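A minimal sketch of a query shape that used to hit the problem, assuming the recursive CTE is
declared but never referenced (names are illustrative):
EXPLAIN
WITH RECURSIVE r AS (
  SELECT 1 AS n
  UNION ALL
  SELECT n + 1 FROM r WHERE n < 10
)
SELECT * FROM t1;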
If during the optimization stage of a query we come to know that the result set
contains at most one row, then for such a query we don't need
to compute GROUP BY, ORDER BY and DISTINCT.
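A hypothetical example, assuming a is a primary or unique key so that the result set contains
at most one row; the GROUP BY, ORDER BY and DISTINCT steps can then be skipped:
SELECT DISTINCT b FROM t1 WHERE a = 5 GROUP BY b ORDER BY b;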
Changing the way a cursor is opened to fetch its structure only,
e.g. for a cursor FOR loop record variable.
The old method of setting thd->lex->limit_rows_examined to an Item_uint(0)
was not reliable and could push this message into the diagnostics area:
  The query examined at least 1 rows, which exceeds LIMIT ROWS EXAMINED (0)
The new method should be more reliable, as it completely prevents the call
to do_select() in JOIN::exec_inner() during the cursor structure discovery,
so the execution of the cursor SELECT query returns immediately after the
preparation step (when the result row structure becomes known),
without even entering the code that fetches the result rows.
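A minimal sketch of a cursor FOR loop whose record variable structure is discovered this way
(procedure and table names are illustrative):
CREATE PROCEDURE p1()
BEGIN
  FOR rec IN (SELECT a, b FROM t1) DO
    SELECT rec.a, rec.b;
  END FOR;
END;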