RESULTS ON IN() & NOT IN() COMP #3
This bug causes a wrong result in mysql-trunk when ICP is used
and bad performance in mysql-5.5 and mysql-trunk.
Using the query from bug report to explain what happens and causes
the wrong result from the query when ICP is enabled:
1. The t3 table contains four records. The outer query will read
these and for each of these it will execute the subquery.
2. Before the first execution of the subquery it will be optimized. In
this case the important is what happens to the first table t1:
-make_join_select() will call the range optimizer which decides
that t1 should be accessed using a range scan on the k1 index
It creates a QUICK_RANGE_SELECT object for this.
-As the last part of optimization the ICP code pushes the
condition down to the storage engine for table t1 on the k1 index.
This produces the following information in the explain for this table:
2 DEPENDENT SUBQUERY t1 range k1 k1 5 NULL 3 Using index condition; Using filesort
Note the use of filesort.
3. The first execution of the subquery does (among other things) due
to the need for sorting:
a. Call create_sort_index() which again will call find_all_keys():
b. find_all_keys() will read the required keys for all qualifying
rows from the storage engine. To do this it checks if it has a
quick-select for the table. It will use the quick-select for
reading records. In this case it will read four records from the
storage engine (based on the range criteria). The storage engine
will evaluate the pushed index condition for each record.
c. At the end of create_sort_index() there is code that cleans up a
lot of stuff on the join tab. One of the things that is cleaned
is the select object. The result of this is that the
quick-select object created in make_join_select is deleted.
4. The second execution of the subquery does the same as the first but
the result is different:
a. Call create_sort_index() which again will call find_all_keys()
(same as for the first execution)
b. find_all_keys() will read the keys from the storage engine. To
do this it checks if it has a quick-select for the table. Now
there is NO quick-select object(!) (since it was deleted in
step 3c). So find_all_keys defaults to read the table using a
table scan instead. So instead of reading the four relevant records
in the range it reads the entire table (6 records). It then
evaluates the table's condition (and here it goes wrong). Since
the entire condition has been pushed down to the storage engine
using ICP all 6 records qualify. (Note that the storage engine
will not evaluate the pushed index condition in this case since
it was pushed for the k1 index and now we do a table scan
without any index being used).
The result is that here we return six qualifying key values
instead of four due to not evaluating the table's condition.
c. As above.
5. The two last execution of the subquery will also produce wrong results
for the same reason.
Summary: The problem occurs due to all but the first executions of the
subquery is done as a table scan without evaluating the table's
condition (which is pushed to the storage engine on a different
index). This is caused by the create_sort_index() function deleting
the quick-select object that should have been used for executing the
subquery as a range scan.
Note that this bug in addition to causing wrong results also can
result in bad performance due to executing the subquery using a table
scan instead of a range scan. This is an issue in MySQL 5.5.
The fix for this problem is to avoid that the Quick-select-object that
the optimizer created is deleted when create_sort_index() is doing
clean-up of the join-tab. This will ensure that the quick-select
object and the corresponding pushed index condition will be available
and used by all following executions of the subquery.
sql/sql_select.cc:
Fix for Bug#12667154: Change how create_sort_index() cleans up the
join_tab's select and quick-select objects in order to avoid that a
quick-select object created outside of create_sort_index() is deleted.
The optimizer chose a less efficient execution plan due to the following
defects of the code:
1. the generic handler function handler::keyread_time did not take into account
that in clustered primary keys record data is included into each index entry
2. the function make_join_readinfo erroneously decided that index only scan
could not be used if join cache was empoyed.
Added no additional test case.
Adjusted some of the test results.
The patch backports two patches from mysql 5.6:
- BUG#12640437: USING SQL_BUFFER_RESULT RESULTS IN A DIFFERENT QUERY OUTPUT
- Bug#12578908: SELECT SQL_BUFFER_RESULT OUTPUTS TOO MANY ROWS WHEN GROUP IS OPTIMIZED AWAY
Original comment:
-----------------
3714 Jorgen Loland 2012-03-01
BUG#12640437 - USING SQL_BUFFER_RESULT RESULTS IN A DIFFERENT
QUERY OUTPUT
For all but simple grouped queries, temporary tables are used to
resolve grouping. In these cases, the list of grouping fields is
stored in the temporary table and grouping is resolved
there (e.g. by adding a unique constraint on the involved
fields). Because of this, grouping is already done when the rows
are read from the temporary table.
In the case where a group clause may be optimized away, grouping
does not have to be resolved using a temporary table. However, if
a temporary table is explicitly requested (e.g. because the
SQL_BUFFER_RESULT hint is used, or the statement is
INSERT...SELECT), a temporary table is used anyway. In this case,
the temporary table is created with an empty group list (because
the group clause was optimized away) and it will therefore not
create groups. Since the temporary table does not take care of
grouping, JOIN::group shall not be set to false in
make_simple_join(). This was fixed in bug 12578908.
However, there is an exception where make_simple_join() should
set JOIN::group to false even if the query uses a temporary table
that was explicitly requested but is not strictly needed. That
exception is if the loose index scan access method (explain
says "Using index for group-by") is used to read into the
temporary table. With loose index scan, grouping is resolved
by the access method. This is exactly what happens in this bug.
The problem was in the code (update_const_equal_items()) which marked
index parts constant independently of the place where the equality was used.
In the test suite it marked t2_1.c part constant despite the fact that
it connected by OR with other expression.
Solution is to mark constant only top equalities connected with AND.
mysql-test/r/select.result:
Added test result for Bug#12713907
mysql-test/t/select.test:
Added test case for Bug#12713907
sql/sql_select.cc:
Remove the call to set_keyread as we do it from access
functions 'join_read_first' and 'join_read_last'
ORDER BY COUNT(*) LIMIT.
PROBLEM:
With respect to problem in the bug description, we
exhibit different behaviors for the two tables
presented, because innodb statistics (rec_per_key
in this case) are updated for the first table
and not so for the second one. As a result the
query plan gets changed in test_if_skip_sort_order
to use 'index' scan. Hence the difference in the
explain output. (NOTE: We can reproduce the problem
with first table by reducing the number of tuples
and changing the table structure)
The varied output w.r.t the query on the second table
is because of the result in the query plan change.
When a query plan is changed to use 'index' scan,
after the call to test_if_skip_sort_order, we set
keyread to TRUE immedietly. If for some reason
we drop this index scan for a filesort later on,
we fetch only the keys not the entire tuple.
As a result we would see junk values in the result set.
Following is the code flow:
Call test_if_skip_sort_order
-Choose an index to give sorted output
-If this is a covering index, set_keyread to TRUE
-Set the scan to INDEX scan
Call test_if_skip_sort_order second time
-Index is not chosen (note that we do not pass the
actual limit value second time. Hence we do not choose
index scan second time which in itself is a bug fixed
in 5.6 with WL#5558)
-goto filesort
Call filesort
-Create quick range on a different index
-Since keyread is set to TRUE, we fetch only the columns of
the index
-results in the required columns are not fetched
FIX:
Remove the call to set_keyread(TRUE) from
test_if_skip_sort_order. The access function which is
'join_read_first' or 'join_read_last' calls set_keyread anyways.
mysql-test/r/func_group_innodb.result:
Added test result for Bug#12713907
mysql-test/t/func_group_innodb.test:
Added test case for Bug#12713907
sql/sql_select.cc:
Remove the call to set_keyread as we do it from access
functions 'join_read_first' and 'join_read_last'
The previous patch for the bug (that erroneously identified the bug as
bug 972973 in its comment) was incorrect.
It turned out that the code that triggered the abort complain reported for
the bug was not needed at all.
When the function free_tmp_table deletes the handler object for
a temporary table the field TABLE::file for this table should be
set to NULL. Otherwise an assertion failure may occur.
Bug#13639204 64111: CRASH ON SELECT SUBQUERY WITH NON UNIQUE INDEX
The crash happened due to wrong calculation
of key length during creation of reference for
sort order index. The problem is that
keyuse->used_tables can have OUTER_REF_TABLE_BIT enabled
but used_tables parameter(create_ref_for_key() func) does
not have it. So key parts which have OUTER_REF_TABLE_BIT
are ommited and it could lead to incorrect key length
calculation(zero key length).
mysql-test/r/subselect_innodb.result:
test result
mysql-test/t/subselect_innodb.test:
test case
sql/sql_select.cc:
added OUTER_REF_TABLE_BIT to the used_tables parameter
for create_ref_for_key() function.
storage/innobase/handler/ha_innodb.cc:
added assertion, request from Inno team
storage/innodb_plugin/handler/ha_innodb.cc:
added assertion, request from Inno team
The main problem was a bug in CSV where it provided wrong statistics (it claimed the table was empty when it wasn't)
I also fixed wrong freeing of blob's in the CSV handler. (Any call to handler::read_first_row() on a CSV table with blobs would fail)
mysql-test/r/csv.result:
Added new test case
mysql-test/r/partition_innodb.result:
Updated test results after fixing bug with impossible partitions and const tables
mysql-test/t/csv.test:
Added new test case
sql/sql_select.cc:
Cleaned up code for handling of partitions.
Fixed also a bug where we didn't threat a table with impossible partitions as a const table.
storage/csv/ha_tina.cc:
Allocate blobroot onces.
- When doing join optimization, pre-sort the tables so that they mimic the execution
order we've had with 'semijoin=off'.
- That way, we will not get regressions when there are two query plans (the old and the
new) that have indentical costs but different execution times (because of factors that
the optimizer was not able to take into account).
mysql-test/suite/innodb/t/group_commit_crash.test:
remove autoincrement to avoid rbr being used for insert ... select
mysql-test/suite/innodb/t/group_commit_crash_no_optimize_thread.test:
remove autoincrement to avoid rbr being used for insert ... select
mysys/my_addr_resolve.c:
a pointer to a buffer is returned to the caller -> the buffer cannot be on the stack
mysys/stacktrace.c:
my_vsnprintf() is ok here, in 5.5
- This is a regession introduced by fix for BUG#951937
- The problem was that there were scenarios where check_simple_equality() would create an
Item_equal object but would not call item_equal->set_context_field() on it.
- The fix was to add the missing calls.
include/mysql_com.h:
remove "shutdown levels" that aren't shutdown levels from mysql_enum_shutdown_level
mysys/my_addr_resolve.c:
my_snprintf in 5.5 (but not in 5.3) supports %p
sql/item_func.cc:
use a method (that exists only in 5.5) instead of directly accessing a member
sql/item_subselect.cc:
use a method (that exists only in 5.5) instead of directly accessing a member
sql/opt_subselect.cc:
use a method (that exists only in 5.5) instead of directly accessing a member
sql/sql_select.cc:
use a method (that exists only in 5.5) instead of directly accessing a member
- Fix equality propagation to work with SJM nests and OR clauses (full descirption of problem and
solution in the comment in the patch)
(The second commit with post-review fixes)
- Remove all references of MAX_TABLES from JOIN struct and make these dynamic
- Updated Join_plan_state to allocate just as many elements as it's needed
sql/opt_subselect.cc:
Optimized version of Join_plan_state
sql/sql_select.cc:
Set join->positions and join->best_positions dynamicly
Don't call update_virtual_fields() if table->vfield is not set.
sql/sql_select.h:
Remove all references of MAX_TABLES from JOIN struct and Join_plan_state and make these dynamic
If the first component of a ref key happened to be a constant appeared
after constant row substitution then no store_key element should be
created for such a component. Yet create_ref_for_key() erroneously could
create such an element that caused construction of invalid ref keys and
wrong results for some joins.
- Avoid needless load/stores in my_hash_sort_simple due to possible aliasing
- Avoid expensive Join_plan_state constructor in choose_subquery_plan when no subquery
- Avoid calling update_virtual_fields for every row when no virtual fields.
Do not call, directly or indirectly, SQL_SELECT::test_quick_select()
for derived materialized tables / views when optimizing joins referring
to these tables / views to get cost estimates of materialization.
The current code does not create B-tree indexes for materialized
derived tables / views. So now it's not possible to get any estimates
for ranges conditions over the results of the materialization.
The function mysql_derived_create() must take into account the fact
that array of the KEY structures specifying the keys over a derived
table / view may be moved after the optimization phase if the
derived table / view is materialized.
Added 'from_end' as extra parameter to Field::unpack() to detect wrong from data.
Change ha_archive::unpack_row() to detect wrong field lengths.
Replication code changed to detect wrong field information in events.
mysql-test/r/archive.result:
dded test case for lp:917689
sql/field.cc:
Added 'from_end' as extra parameter to Field::unpack() to detect wrong from data.
Removed not used 'unpack_key' functions.
sql/field.h:
Added 'from_end' as extra parameter to Field::unpack() to detect wrong from data.
Removed not used 'unpack_key' functions.
Removed some not needed unpack() functions.
sql/filesort.cc:
Added buffer end parameter to unpack_addon_fields()
sql/log_event.h:
Added end of buffer argument to unpack_row()
sql/log_event_old.cc:
Added end of buffer argument to unpack_row()
sql/log_event_old.h:
Added end of buffer argument to unpack_row()
sql/records.cc:
Added buffer end parameter to unpack_addon_fields()
sql/rpl_record.cc:
Added end of buffer argument to unpack_row()
Added detection of wrong field information in events
sql/rpl_record.h:
Added end of buffer argument to unpack_row()
sql/rpl_record_old.cc:
Added end of buffer argument to unpack_row()
Added detection of wrong field information in events
sql/rpl_record_old.h:
Added end of buffer argument to unpack_row()
sql/table.h:
Added buffer end parameter to unpack()
storage/archive/ha_archive.cc:
Change ha_archive::unpack_row() to detect wrong field lengths.
This fixes lp:917689
https://mariadb.atlassian.net/browse/MDEV-28
This task implements a new clause LIMIT ROWS EXAMINED <num>
as an extention to the ANSI LIMIT clause. This extension
allows to limit the number of rows and/or keys a query
would access (read and/or write) during query execution.
Fixed memory leaks and compiler warnings in ha_sphinx.cc
Added HA_MUST_USE_TABLE_CONDITION_PUSHDOWN so that an engine can force index condition to be used
mysql-test/suite/sphinx/sphinx.result:
Added testing of pushdown conditions and sphinx status variables.
mysql-test/suite/sphinx/sphinx.test:
Added testing of pushdown conditions and sphinx status variables.
mysql-test/suite/sphinx/suite.pm:
Print version number if sphinx version is too old.
sql/handler.h:
Added HA_MUST_USE_TABLE_CONDITION_PUSHDOWN so that an engine can force index condition to be used
sql/sql_base.cc:
Added 'thd' argument to check_unused() to be able to set 'entry->in_use' if we call handler->extra().
This was needed as sphinx (and possible other storage engines) assumes that 'in_use' is set if handler functions are called.
sql/sql_select.cc:
Test if handler is forcing pushdown condition to be used.
storage/sphinx/ha_sphinx.cc:
Updated to version 2.0.4
Fixed memory leaks and compiler warnings.
storage/sphinx/ha_sphinx.h:
Updated to version 2.0.4
storage/sphinx/snippets_udf.cc:
Updated to version 2.0.4