Analysis:
The cause of the valgrind warning was an attempt to evaluate a Field that was not yet read.
The reason was that on one hand Item_func_isnotnull was marked as constant by
Item_func_isnotnull::update_used_tables, and this allowed eval_const_cond() to be called.
On the other hand Item_func_isnotnull::val_int() evaluated its argument as if it was not
constant.
Solution:
The fix make sure that Item_func_isnotnull::val_int() doesn't evaluate its argument when
it is constant and cannot be NULL, because the result is known in this case.
This patch almost totally revised the patch for bug mdev-4177.
The latter had too many defects. In particular, it did not
propagate multiple equalities formed when merging a degenerate
disjunct into underlying AND formula.
- Modify the way Item_cond::fix_fields() and Item_cond::eval_not_null_tables()
calculate bitmap for Item_cond_or::not_null_tables():
if they see a "... OR inexpensive_const_false_item OR ..." then the item can
be ignored.
- Updated test results. There can be more warnings produced since parts of WHERE
are evaluated more times.
includes:
* remove some remnants of "Bug#14521864: MYSQL 5.1 TO 5.5 BUGS PARTITIONING"
* introduce LOCK_share, now LOCK_ha_data is strictly for engines
* rea_create_table() always creates .par file (even in "frm-only" mode)
* fix a 5.6 bug, temp file leak on dummy ALTER TABLE
Cleanup: remove TIME_FUZZY_DATE.
Introduce TIME_FUZZY_DATES which means "very fuzzy, the resulting
value is only used for comparison. It can be invalid date, fine, as long as it can be
compared".
Updated many tests results (they're better now).
get_datetime_value() should not double-cache its own Item_cache_temporal items,
but it *should* cache other Item_cache items, such as Item_cache_str.
sql/item.h:
shortcut, to avoid going through the switch in Item::cmp_type()
sql/item_cmpfunc.cc:
even if the item is Item_cache_str - it still needs to be converted and cached.
sql/item_timefunc.h:
all descendants of Item_temporal_func always have cmp_type==TIME_RESULT.
Even Item_date_add_interval, that might have field_type == MYSQL_TYPE_STRING.
IN IN-CLAUSE USING MYISAM OR MEMORY ENGINE
Backport from 5.6. Original message:
The coincidences caused a data loss:
* The query has IN subqueries nested twice,
* the WHERE clause of the inner subquery refers to the
outer field, and the whole WHERE clause returns FALSE,
* the inner subquery has a LEFT JOIN that joins a single
row with a row of NULLs; one of that NULL columns
represents the select list of the subquery.
Normally, that inner subquery should return empty record set.
However, in our case:
* the Item_is_not_null_test item goes constant, since
its underlying field is NULL (because of LEFT JOIN ... ON
FALSE of const table row with a row of nulls);
* we evaluate Item_is_not_null_test::val_int() as a part
of fake HAVING expression of the transformed subquery;
* as far as the underlying field is NULL, we optimize
out the whole fake HAVING expression as FALSE as well
as a whole subquery with a zero result:
Impossible HAVING noticed after reading const tables";
* thus, the optimizer ignores the presence of the WHERE
clause (the WHERE expression is FALSE in our case, so
the subquery should return empty set);
* however, during the evaluation of the
Item_is_not_null_test::val_int() in the optimizer,
it marked its "owner" with the "was_null" flag -- that
forced the subquery to return UNKNOWN instead of empty
set.
That caused a wrong result.
The problem is a regression of the small cleanup in
the fix for the bug11827369 (the Item_is_not_null_test part)
that conflicts with optimizations in the fix for the bug11752543.
Before that regression the Item_is_not_null_test items
never were constants.
The fix is the rollback of Item_is_not_null_test parts
of the bug11827369 fix.
IN IN-CLAUSE USING MYISAM OR MEMORY ENGINE
Backport from 5.6. Original message:
The coincidences caused a data loss:
* The query has IN subqueries nested twice,
* the WHERE clause of the inner subquery refers to the
outer field, and the whole WHERE clause returns FALSE,
* the inner subquery has a LEFT JOIN that joins a single
row with a row of NULLs; one of that NULL columns
represents the select list of the subquery.
Normally, that inner subquery should return empty record set.
However, in our case:
* the Item_is_not_null_test item goes constant, since
its underlying field is NULL (because of LEFT JOIN ... ON
FALSE of const table row with a row of nulls);
* we evaluate Item_is_not_null_test::val_int() as a part
of fake HAVING expression of the transformed subquery;
* as far as the underlying field is NULL, we optimize
out the whole fake HAVING expression as FALSE as well
as a whole subquery with a zero result:
Impossible HAVING noticed after reading const tables";
* thus, the optimizer ignores the presence of the WHERE
clause (the WHERE expression is FALSE in our case, so
the subquery should return empty set);
* however, during the evaluation of the
Item_is_not_null_test::val_int() in the optimizer,
it marked its "owner" with the "was_null" flag -- that
forced the subquery to return UNKNOWN instead of empty
set.
That caused a wrong result.
The problem is a regression of the small cleanup in
the fix for the bug11827369 (the Item_is_not_null_test part)
that conflicts with optimizations in the fix for the bug11752543.
Before that regression the Item_is_not_null_test items
never were constants.
The fix is the rollback of Item_is_not_null_test parts
of the bug11827369 fix.
The function remove_eq_cond removes the parts of a disjunction
for which it has been proved that they are always true. In the
result of this removal the disjunction may be converted into a
formula without OR that must be merged into the the AND formula
that contains the disjunction.
The merging of two AND conditions must take into account the
multiple equalities that may be part of each of them.
These multiple equality must be merged and become part of the
and object built as the result of the merge of the AND conditions.
Erroneously the function remove_eq_cond lacked the code that
would merge multiple equalities of the merged AND conditions.
This could lead to confusing situations when at the same AND
level there were two multiple equalities with common members
and the list of equal items contained only some of these
multiple equalities.
This, in its turn, could lead to an incorrect work of the
function substitute_for_best_equal_field when it tried to optimize
ref accesses. This resulted in forming invalid TABLE_REF objects
that were used to build look-up keys when materialized subqueries
were exploited.
This bug in the legacy code could manifest itself in queries with
semi-join materialized subqueries.
When a subquery is materialized all conditions that are imposed
only on the columns belonging to the tables from the subquery
are taken into account.The code responsible for subquery optimizations
that employes subquery materialization makes sure to remove these
conditions from the WHERE conditions of the query obtained after
it has transformed the original query into a query with a semi-join.
If the condition to be removed is an equality condition it could
be added to ON expressions and/or conditions from disjunctive branches
(parts of OR conditions) in an attempt to generate better access keys
to the tables of the query. Such equalities are supposed to be removed
later from all the formulas where they have been added to.
However, erroneously, this was not done in some cases when an ON
expression and/or a disjunctive part of the OR condition could
be converted into one multiple equality. As a result some equality
predicates over columns belonging to the tables of the materialized
subquery remained in the ON condition and/or the a disjunctive
part of the OR condition, and the excuter later, when trying to
evaluate them, returned wrong answers as the values of the fields
from these equalities were not valid.
This happened because any standalone multiple equality (a multiple
equality that are not ANDed with any other predicates) lacked
the information about equality predicates inherited from upper
levels (in particular, inherited from the WHERE condition).
The fix adds a reference to such information to any standalone
multiple equality.