The bug was due to code lost during a refactoring made
on May 30 2005 that modified the function JOIN::reinit.
As a result, for any subquery the value of offset_limit_cnt
was not restored for the following executions, while the first
execution of the subquery set it to 0.
The fix restores this value in the function JOIN::reinit.
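A hedged sketch of the affected pattern, assuming a correlated scalar
subquery whose OFFSET must be re-applied on every execution (table and
column names are illustrative):
  CREATE TABLE t1 (a INT);
  CREATE TABLE t2 (a INT, b INT);
  INSERT INTO t1 VALUES (1), (2);
  INSERT INTO t2 VALUES (1,10), (1,20), (2,30), (2,40);
  -- Before the fix, offset_limit_cnt was zeroed by the first execution of
  -- the subquery, so the OFFSET was silently dropped for the executions
  -- done for the following outer rows.
  SELECT a,
         (SELECT b FROM t2 WHERE t2.a = t1.a ORDER BY b LIMIT 1 OFFSET 1)
  FROM t1;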
DESCRIBE returned the type BIGINT for a column of a view if the column
was specified by an expression over values of the type INT.
E.g. for the view defined as follows:
CREATE VIEW v1 AS SELECT COALESCE(f1,f2) FROM t1
DESCRIBE returned type BIGINT for the only column of the view if f1,f2 are
columns of the INT type.
At the same time DESCRIBE returned type INT for the only column of the table
defined by the statement:
CREATE TABLE t2 SELECT COALESCE(f1,f2) FROM t1.
This inconsistency was removed by the patch.
Now the code chooses between INT/BIGINT depending on the
precision of the aggregated column type.
Thus both DESCRIBE commands above return the type INT for v1 and t2.
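A minimal sketch of the two statements compared above (f1 and f2 are INT
columns, as in the report):
  CREATE TABLE t1 (f1 INT, f2 INT);
  CREATE VIEW v1 AS SELECT COALESCE(f1,f2) FROM t1;
  CREATE TABLE t2 SELECT COALESCE(f1,f2) FROM t1;
  -- With the fix both report the type INT, since the precision of INT is
  -- sufficient for the aggregated column type of COALESCE(f1,f2).
  DESCRIBE v1;
  DESCRIBE t2;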
* don't use join cache when the incoming data set is already ordered
for ORDER BY
This choice must be made because the join cache effectively
reverses the join order, and the results would then be sorted by the index
of the table that uses the join cache.
An ALL/ANY comparison subquery optimized by MIN/MAX may return a wrong
result.
An Item_sum_hybrid object has the was_values flag which indicates whether any
values were added to the sum function. By default it is set to true and reset
to false on any no_rows_in_result() call. This method is called only in
return_zero_rows() function. An ALL/ANY subquery can be optimized by MIN/MAX
optimization. The was_values flag is used to indicate whether the subquery
has returned at least one row. This bug occurs because return_zero_rows() is
called only when we know that the select will return zero rows before
starting any scans but often such information is not known.
In the reported case the return_zero_rows() function is not called, so the
was_values flag is not reset to false even though the subquery returns no
rows. As a result, the Item_func_not_all and Item_func_nop_all functions
return a wrong comparison result.
The end_send_group() function now calls no_rows_in_result() for each item
in the fields_list if no rows were found for the (sub)query.
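A hedged sketch of the affected pattern (table and values are illustrative);
the subquery turns out to match no rows only while scanning, so
return_zero_rows() is never called:
  CREATE TABLE t1 (a INT);
  INSERT INTO t1 VALUES (1), (2);
  -- The subquery returns no rows, so the ALL comparison should be TRUE for
  -- every row of t1; before the fix the stale was_values flag could make
  -- Item_func_not_all/Item_func_nop_all report a wrong comparison result.
  SELECT a FROM t1 WHERE a > ALL (SELECT a FROM t1 WHERE a > 10);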
To make MySQL compatible with some ODBC applications, you can find
the AUTO_INCREMENT value for the last inserted row with the following query:
SELECT * FROM tbl_name WHERE auto_col IS NULL.
This is done with special code that replaces 'auto_col IS NULL' with
'auto_col = LAST_INSERT_ID'.
However, this also reset LAST_INSERT_ID to 0, because it was used as a flag
to ensure that only the first SELECT ... WHERE auto_col IS NULL
after an INSERT got this special behaviour.
In order to avoid resetting LAST_INSERT_ID, a special flag is introduced
in the THD class. This flag, rather than LAST_INSERT_ID, is now used to
suppress the special behaviour for the second and subsequent SELECTs.
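A minimal sketch of the ODBC-compatibility pattern (table and column names
are illustrative):
  CREATE TABLE t1 (id INT AUTO_INCREMENT PRIMARY KEY, v INT);
  INSERT INTO t1 (v) VALUES (42);
  -- Returns the last inserted row via the special IS NULL handling.
  SELECT * FROM t1 WHERE id IS NULL;
  -- With the fix the SELECT above no longer resets the insert id, so this
  -- still returns the AUTO_INCREMENT value of the inserted row.
  SELECT LAST_INSERT_ID();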
The problem was that we restored the SQL_CACHE and SQL_NO_CACHE flags in a
SELECT statement from internal structures based on a value set later at
runtime, not the original value set by the user.
The solution is to remember that original value.
'SELECT DISTINCT a,b FROM t1' should not use a temp table if there is a
unique index (or primary key) on a.
There are a number of other similar cases that can be calculated without the
use of a temp table: multi-part unique indexes, primary keys, or using GROUP
BY instead of DISTINCT.
When a GROUP BY/DISTINCT clause contains all key parts of a unique
index, then it is guaranteed that the fields of the clause will be
unique, therefore we can optimize away GROUP BY/DISTINCT altogether.
This optimization has two effects:
* there is no need to create a temporary table to compute the
GROUP/DISTINCT operation (or the temporary table will be smaller if only GROUP
is removed and DISTINCT stays or if DISTINCT is removed and GROUP BY stays)
* this causes the statement in effect to become updatable in Connector/Java
because the result set columns will be direct references to the primary key of
the table (instead of to the temporary table that they currently reference).
Implemented a check that will optimize away GROUP BY/DISTINCT for queries like
the above.
Currently it will work only for a single non-constant table in the FROM
clause.
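A minimal sketch of the optimization, assuming a primary key on a (names are
illustrative):
  CREATE TABLE t1 (a INT PRIMARY KEY, b INT);
  INSERT INTO t1 VALUES (1,10), (2,20);
  -- All key parts of the unique index appear in the DISTINCT list, so the
  -- rows are already unique; with the check in place EXPLAIN should no
  -- longer show a temporary table being used for this query.
  EXPLAIN SELECT DISTINCT a, b FROM t1;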
Bug#17294 - INSERT DELAYED puting an \n before data
Bug#16611 - INSERT DELAYED corrupts data
Bug#13707 - Server crash with INSERT DELAYED on MyISAM table
Combined as Bug#16218.
INSERT DELAYED crashed in 5.0 on a table with a varchar that
could be NULL and was created pre-5.0 (Bugs 16218 and 13707).
INSERT DELAYED corrupted data in 5.0 on a table with varchar
fields that was created pre-5.0 (Bugs 17294 and 16611).
In the case of INSERT DELAYED the open table is copied from the
delayed insert thread to be able to create a record for the
queue. When copying the fields, a method was used that
converted old varchar fields to new varchar fields but did not set up
some pointers into the record buffer of the table.
The field conversion was responsible for the misinterpretation of
the record contents by the delayed insert thread. The wrong
pointer setup was responsible for the crashes.
For Bug 13707 (Server crash with INSERT DELAYED on MyISAM table)
I fixed the above mentioned method to set up one of the pointers.
For Bug 16218 I set up the other pointers too.
But when looking at the corruptions I became aware that converting
the field type was totally wrong for INSERT DELAYED. The copied
table is used to create a record that is to be sent to the
delayed insert thread. Of course it can interpret the record
correctly only if all field types are the same in both table
objects.
So I revoked the fix for Bug 13707 and changed the new_field()
method so that it can suppress conversions.
No test case as this is a migration problem. One needs to
create a table with 4.x and use it with 5.x. I added two
test scripts to the bug report.
INSERT ... SELECT ... LIMIT was unnecessarily slow on big tables.
Currently in INSERT ... SELECT ... LIMIT ... the compiler uses a
temporary table to store the results of SELECT ... LIMIT .. and then
uses that table as a source for INSERT. The problem is that in some cases
it actually skips the LIMIT clause in doing that and materializes the
whole SELECT result set regardless of the LIMIT.
This fix limits the filling of the temp table to only as many rows as will
actually be used, by propagating the LIMIT value.
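A hedged sketch, assuming the classic case where the SELECT reads from the
table being inserted into, which forces buffering in a temporary table
(names are illustrative):
  CREATE TABLE t1 (a INT);
  INSERT INTO t1 VALUES (1), (2), (3);
  -- The SELECT result is buffered in a temporary table before the INSERT;
  -- with the fix at most 2 rows are materialized instead of the whole
  -- result set.
  INSERT INTO t1 SELECT a FROM t1 LIMIT 2;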
The bug report revealed two problems related to min/max optimization:
1. If the length of a constant key used in a SARGable condition for
the MIN/MAX fields is greater than the length of the field, an
unwanted warning on key truncation is issued;
2. If MIN/MAX optimization is applied to a partial index, like INDEX(b(4)),
then it can lead to returning a wrong result set.
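A hedged sketch of the two problem cases, assuming a prefix index b(4) and a
constant longer than the indexed field (names and values are illustrative):
  CREATE TABLE t1 (a VARCHAR(4), b VARCHAR(10), KEY (a), KEY (b(4)));
  INSERT INTO t1 VALUES ('aaaa','aaaaaaa'), ('bbbb','bbbbbbb');
  -- 1. The constant key is longer than column a; with the fix no spurious
  --    key-truncation warning is issued for the MIN/MAX lookup.
  SELECT MAX(a) FROM t1 WHERE a < 'zzzzzzzz';
  -- 2. MIN/MAX over the partial index on b must not be answered from the
  --    4-character prefix alone, otherwise the result could be wrong.
  SELECT MAX(b) FROM t1;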
The check for view security was lacking several points:
1. Checking with the right set of permissions: each table ref that
participates in a view carried the right credentials to use in its
security_ctx member, but these weren't used when checking privileges.
This made it hard to enforce the SQL SECURITY DEFINER|INVOKER property
consistently.
2. Because of the above, the security checking for views was simply disabled
in explicit ways in several places.
3. Security was checked only for the columns of the tables that are
brought into the query from a view. So if there was no column reference
outside of the view definition, a missing access right to the tables of the
view in SQL SECURITY INVOKER mode went undetected.
The fix addresses the three points above.
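A hedged sketch of the third point, assuming a database test and an existing
unprivileged account (all object names are illustrative):
  CREATE TABLE t1 (a INT);
  CREATE SQL SECURITY INVOKER VIEW v1 AS SELECT a FROM t1;
  GRANT SELECT ON test.v1 TO 'someuser'@'localhost';
  -- Run as 'someuser', who has no privileges on t1: with the fix this fails
  -- with an access error even though no t1 column is referenced outside of
  -- the view definition.
  SELECT 1 FROM v1;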
When a CREATE TABLE command created a table from a materialized
view, the new table did not inherit default values from the underlying table.
Moreover, the temporary table used for the view materialization
did not inherit those default values either.
In the case when the underlying table contained ENUM fields this caused
misleading error messages. In other cases the created table contained
wrong default values.
The code was modified to ensure inheritance of default values for
materialized views.
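A hedged sketch of the scenario; ALGORITHM=TEMPTABLE is used here only to
force materialization of the view (names are illustrative):
  CREATE TABLE t1 (a INT DEFAULT 5, e ENUM('x','y') DEFAULT 'y');
  CREATE ALGORITHM=TEMPTABLE VIEW v1 AS SELECT a, e FROM t1;
  -- With the fix t2 should inherit the default values of the underlying
  -- columns instead of getting wrong defaults or producing misleading
  -- ENUM-related error messages.
  CREATE TABLE t2 SELECT * FROM v1;
  SHOW CREATE TABLE t2;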
Re-work best_access_path() and find_best() to reuse E(#rows(range access)) as
E(#rows(ref[_or_null](const) access)) only when it is appropriate.
[This is the final cumulative patch]
When converting DISTINCT to GROUP BY, if the columns come from a covering
index and appear twice in the SELECT list, the optimizer created an
improper processing sequence. This happened because the columns
of the covering index were not recognized as such and were treated as
non-index columns.
Generally speaking, duplicate columns can safely be removed from the GROUP
BY/DISTINCT list because this will not add or remove rows in the
result set. Duplicates can be removed even if they are not consecutive
(unlike ORDER BY, where the duplicate columns can be removed
only if they are consecutive).
So we can safely transform "SELECT DISTINCT a,a FROM ... ORDER BY a" to
"SELECT a,a FROM ... GROUP BY a ORDER BY a" instead of
"SELECT a,a FROM .. GROUP BY a,a ORDER BY a". We can even transform
"SELECT DISTINCT a,b,a FROM ... ORDER BY a,b" to
"SELECT a,b,a FROM ... GROUP BY a,b ORDER BY a,b".
The fix for this bug consists of checking for duplicate columns in the SELECT
list when constructing the GROUP BY list while transforming DISTINCT to GROUP
BY, and skipping the ones that are already in the list.
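A minimal sketch, assuming a covering index over the selected column (names
are illustrative):
  CREATE TABLE t1 (a INT, b INT, KEY ab (a, b));
  INSERT INTO t1 VALUES (1,1), (1,1), (2,2);
  -- With the fix this is transformed as if it were
  --   SELECT a, a FROM t1 GROUP BY a ORDER BY a
  -- so the duplicated column no longer breaks covering-index processing.
  SELECT DISTINCT a, a FROM t1 ORDER BY a;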
A query with GROUP BY and HAVING clauses could return a wrong
result set if the HAVING condition contained a constant conjunct
that evaluated to FALSE.
It happened because the pushdown condition for the table with
grouping columns lost its constant conjuncts.
Pushdown conditions are always built by the function make_cond_for_table
that ignores constant conjuncts. This is apparently not correct when
constant false conjuncts are present.
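A minimal sketch of the affected query shape (names and values are
illustrative):
  CREATE TABLE t1 (a INT, b INT);
  INSERT INTO t1 VALUES (1,10), (1,20), (2,30);
  -- The HAVING clause contains a constant conjunct that evaluates to FALSE,
  -- so the result must be empty; before the fix the constant conjunct could
  -- be lost from the pushed-down condition and rows were returned.
  SELECT a, MIN(b) FROM t1 GROUP BY a HAVING a > 0 AND 1 = 0;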
The bug was as follows: When merge_key_fields() encounters "t.key=X OR t.key=Y" it will
try to join them into ref_or_null access via "t.key=X OR NULL". In order to make this
inference it checks if Y<=>NULL, ignoring the fact that the value of Y may
not yet be known. The fix is that the check whether Y<=>NULL is made only if
the value of Y is known (i.e. it is a constant).
TODO: When merging to 5.0, replace used_tables() with const_item() everywhere in merge_key_fields().
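A hedged sketch of the condition shape, assuming Y is a column of another
table and therefore not a constant at optimization time (names are
illustrative):
  CREATE TABLE t1 (a INT);
  CREATE TABLE t2 (key_col INT, KEY (key_col));
  -- t2.key_col = t1.a compares the key to a value that is not yet known,
  -- so the Y<=>NULL check must not be applied when deciding whether to use
  -- ref_or_null access on t2.key_col.
  SELECT * FROM t1, t2 WHERE t2.key_col = 10 OR t2.key_col = t1.a;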
This performance degradation was due to the fact that some
cost evaluation code added in 4.1 to the function find_best was
not merged into the code of the function best_access_path, which was added
together with other code for the greedy optimizer.
Added a parameter to the function print_plan. The parameter contains
accumulated cost for a given partial join.
The patch does not include a special test case since this performance
degradation is hard to reproduce with a simple example.
TODO: make the function find_best use the function best_access_path
in order to remove duplication of code which might result in incomplete
merges in the future.
The bug caused wrong result sets for union constructs of the form
(SELECT ... ORDER BY order_list1 [LIMIT n]) ORDER BY order_list2.
For such queries the order lists were concatenated and the LIMIT clause was
completely neglected.
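A minimal sketch of the construct (names are illustrative):
  CREATE TABLE t1 (a INT);
  INSERT INTO t1 VALUES (1), (2), (3);
  -- Should return the two largest values of a sorted ascending (2, 3);
  -- before the fix the inner LIMIT was neglected and the order lists were
  -- concatenated.
  (SELECT a FROM t1 ORDER BY a DESC LIMIT 2) ORDER BY a;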
The SQL standard doesn't allow the HAVING clause to use fields that are
neither present in the GROUP BY clause nor under an aggregate function in the
HAVING clause. However, MySQL allows using such fields. This extension assumes
that the non-grouping fields will have the same group-wise values; otherwise,
the result is unpredictable. Allowing this extension in the strict
MODE_ONLY_FULL_GROUP_BY sql mode leads to misunderstanding of HAVING
capabilities.
The new error message ER_NON_GROUPING_FIELD_USED is added. It says
"non-grouping field '%-.64s' is used in %-.64s clause". This message is
supposed to be used for reporting errors when some field is not found in the
GROUP BY clause but has to be present there. Use cases for this message are
this bug and the case when a field is present in the SELECT item list not
under any aggregate function while a GROUP BY clause is present that doesn't
mention that field. Being more descriptive, it renders the
ER_WRONG_FIELD_WITH_GROUP error message obsolete.
The resolve_ref_in_select_and_group() function now reports the
ER_NON_GROUPING_FIELD_USED error if the strict mode is set and the field for
the HAVING clause is found in the SELECT item list only.
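A minimal sketch of the error case in the strict mode (names are
illustrative):
  SET sql_mode = 'ONLY_FULL_GROUP_BY';
  CREATE TABLE t1 (a INT, b INT);
  -- b is neither in the GROUP BY list nor under an aggregate function in
  -- the HAVING clause, so this now fails with ER_NON_GROUPING_FIELD_USED.
  SELECT a FROM t1 GROUP BY a HAVING b > 0;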
The result type of GROUP_CONCAT() changed when a temporary table was used.
In simple queries the result of the GROUP_CONCAT() function was always of
varchar type.
But if the length of the GROUP_CONCAT() result is greater than 512 chars and a
temporary table is used during the select, the result is converted to blob,
due to the policy of not storing fields longer than 512 chars in a tmp table
as varchar fields.
In order to provide consistent behaviour, the result of GROUP_CONCAT() is now
always converted to blob if it is longer than 512 chars.
Item_func_group_concat::field_type() is modified accordingly.
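A hedged sketch; group_concat_max_len is raised here only to make the maximum
result length exceed the 512-character threshold (names are illustrative):
  SET group_concat_max_len = 1024;
  CREATE TABLE t1 (a INT, b VARCHAR(64));
  -- With the fix the GROUP_CONCAT() result column is reported as a blob
  -- type whenever it can be longer than 512 chars, whether or not a
  -- temporary table is used for the select.
  SELECT a, GROUP_CONCAT(b) FROM t1 GROUP BY a;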
Multiple equalities were not adjusted after reading constant tables.
This resulted in neglecting good index based methods that could be
used to access other tables.
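A hedged sketch of the pattern, assuming t1 is read as a constant table
through its primary key (names are illustrative):
  CREATE TABLE t1 (a INT PRIMARY KEY);
  CREATE TABLE t2 (a INT, KEY (a));
  -- t1 is a constant table; after reading it, the multiple equality
  -- t1.a = t2.a should be adjusted so that t2 can be accessed through its
  -- index on a using the now-known constant value.
  SELECT * FROM t1, t2 WHERE t1.a = 5 AND t2.a = t1.a;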