The columns in HAVING can reference the GROUP BY and
SELECT columns. There can be "table" prefixes when
referencing these columns. And these "table" prefixes
in HAVING use the table alias if available.
This means that table aliases are subject to the same
storage rules as table names and are dependent on
lower_case_table_names in the same way as the table
names are.
Fixed by :
1. Treating table aliases as table names
and make them lowercase when printing out the SQL
statement for view persistence.
2. Using case insensitive comparison for table
aliases when requested by lower_case_table_names
UNIQUE (eq-ref) lookups result in table being considered as a "constant" table.
Queries that consist of only constant tables are processed in do_select() in a
special way that doesn't invoke evaluate_join_record(), and therefore doesn't
increase the counters join->examined_rows and join->thd->row_count.
The patch increases these counters in this special case.
NOTICE:
This behavior seems to contradict what the documentation says in Sect. 5.11.4:
"Queries handled by the query cache are not added to the slow query log, nor
are queries that would not benefit from the presence of an index because the
table has zero rows or one row."
No test case in 5.0 as issue shows only in slow query log, and other counters
can give subtly different values (with regard to counting in create_sort_index(),
synthetic rows in ROLLUP, etc.).
The bug is a regression introduced by the fix for bug30596. The problem
was that in cases when groups in GROUP BY correspond to only one row,
and there is ORDER BY, the GROUP BY was removed and the ORDER BY
rewritten to ORDER BY <group_by_columns> without checking if the
columns in GROUP BY and ORDER BY are compatible. This led to
incorrect ordering of the result set as it was sorted using the
GROUP BY columns. Additionaly, the code discarded ASC/DESC modifiers
from ORDER BY even if its columns were compatible with the GROUP BY
ones.
This patch fixes the regression by checking if ORDER BY columns form a
prefix of the GROUP BY ones, and rewriting ORDER BY only in that case,
preserving the ASC/DESC modifiers. That check is sufficient, since the
GROUP BY columns contain a unique index.
tables or more
The problem was that the optimizer used the join buffer in cases when
the result set is ordered by filesort. This resulted in the ORDER BY
clause being ignored, and the records being returned in the order
determined by the order of matching records in the last table in join.
Fixed by relaxing the condition in make_join_readinfo() to take
filesort-ordered result sets into account, not only index-ordered ones.
The HAVING clause is subject to the same rules as the SELECT list
about using aggregated and non-aggregated columns.
But this was not enforced when processing implicit grouping from
using aggregate functions.
Fixed by performing the same checks for HAVING as for SELECT.
bitmap_is_set(table->write_set, fiel
Problem: creating a temporary table we allocate the group buffer if needed
followed by table bitmaps (see create_tmp_table()). Reserving less memory for
the group buffer than actually needed (used) for values retrieval may lead
to overlapping with followed bitmaps in the memory pool that in turn leads
to unpredictable consequences.
As we use Item->max_length sometimes to calculate group buffer size,
it must be set to proper value. In this particular case
Item_datetime_typecast::max_length is too small.
Another problem is that we use max_length to calculate the group buffer
key length for items represented as DATE/TIME fields which is superfluous.
Fix: set Item_datetime_typecast::max_length properly,
accurately calculate the group buffer key length for items
represented as DATE/TIME fields in the buffer.
The change_to_use_tmp_fields function leaves the orig_table member of an
expression's tmp table field filled for the new Item_field being created.
Later orig_table is used by the Field::make_field function to provide some
info about original table and field name to a user. This is ok for a field
but for an expression it should be empty.
The change_to_use_tmp_fields function now resets orig_table member of
an expression's tmp table field to prevent providing a wrong info to a user.
The Field::make_field function now resets the table_name and the org_col_name
variables when the orig_table is set to 0.
When storing the VIEW the CREATE VIEW command is reconstructed
from the parse tree. While constructing the command string
the index hints specified should also be printed.
Fixed by adding code to print the index hints when printing a
table in the FROM clause.
The optimizer sets index traversal in reverse order only if there are
used key parts that are not compared to a constant.
However using the primary key as an ORDER BY suffix rendered the check
incomplete : going in reverse order must still be used even if
all the parts of the secondary key are compared to a constant.
Fixed by relaxing the check and set reverse traversal even when all
the secondary index keyparts are compared to a const.
Also account for the case when all the primary keys are compared to a
constant.
Currently the Last_query_cost session status variable shows
only the cost of a single flat subselect. For complex queries
(with subselects or unions etc) Last_query_cost is not valid
as it was showing the cost for the last optimized subselect.
Fixed by reseting to zero Last_query_cost when the complete
cost of the query cannot be determined.
Last_query_cost will be non-zero only for single flat queries.
The optimization that uses a unique index to remove GROUP BY, did not
ensure that the index was actually used, thus violating the ORDER BY
that is impled by GROUP BY.
Fixed by replacing GROUP BY with ORDER BY if the GROUP BY clause contains
a unique index. In case GROUP BY ... ORDER BY null is used, GROUP BY is
simply removed.
HEAP tables can't index BIT fields. Due to this when grouping by such fields is
needed they are converted to a fields of the LONG type when temporary table
is being created. But a side effect of this is that a wrong type of BIT
fields is returned to a client.
Now the JOIN::prepare and the create_distinct_group functions are create
additional hidden copy of BIT fields to preserve original fields untouched.
New hidden fields are used for grouping instead.
The bug caused memory corruption for some queries with top OR level
in the WHERE condition if they contained equality predicates and
other sargable predicates in disjunctive parts of the condition.
The corruption happened because the upper bound of the memory
allocated for KEY_FIELD and SARGABLE_PARAM internal structures
containing info about potential lookup keys was calculated incorrectly
in some cases. In particular it was calculated incorrectly when the
WHERE condition was an OR formula with disjuncts being AND formulas
including equalities and other sargable predicates.
- Don't call mysql_select() several times for the select that enumerates
a temporary table with the results of the UNION. Making this call for
every subquery execution caused O(#enumerated-rows-in-the-outer-query)
memory allocations.
- Instead, call join->reinit() and join->exec(), and
= disable constant table detection for such joins,
= provide special handling for table-less constant subqueries.
SELECT statement itself returns empty.
As a result of this bug 'SELECT AGGREGATE_FUNCTION(fld) ... GROUP BY'
can return one row instead of an empty result set.
When GROUP BY only has fields of constant tables
(with a single row), the optimizer deletes the group_list.
After that we lose the information about whether we had an
GROUP BY statement. Though it's important
as SELECT min(x) from empty_table; and
SELECT min(x) from empty_table GROUP BY y; have to return
different results - the first query should return one row,
second - an empty result set.
So here we add the 'group_optimized_away' flag to remember this case
when GROUP BY exists in the query and is removed
by the optimizer, and check this flag in end_send_group()
Use size_t instead of uint when calculating join buffer size, because uint can be overflown on 64-bit platforms and join_buffer_size > 4 GB.
The test case for this bug is a part of the test suite for bug #5731.
- make ha_berkeley::cmp_ref() take into account that auto-generated PKs
are stored in LSB-first order.
- Remove the temporary code that made the bugfix work for innodb only
This bug manifested itself for join queries with GROUP BY and HAVING clauses
whose SELECT lists contained DISTINCT. It occurred when the optimizer could
deduce that the result set would have not more than one row.
The bug could lead to wrong result sets for queries of this type because
HAVING conditions were erroneously ignored in some cases in the function
remove_duplicates.
ORDER BY primary_key on InnoDB table
Queries that use an InnoDB secondary index to retrieve
data don't need to sort in case of ORDER BY primary key
if the secondary index is compared to constant(s).
They can also skip sorting if ORDER BY contains both the
the secondary key parts and the primary key parts (in
that order).
This is because InnoDB returns the rows in order of the
primary key for rows with the same values of the secondary
key columns.
Fixed by preventing temp table sort for the qualifying
queries.
A bug in the restore_prev_nj_state function allowed interleaving
inner tables of outer join operations with outer tables. With the
current implementation of the nested loops algorithm it could lead
to wrong result sets for queries with nested outer joins.
Another bug in this procedure effectively blocked evaluation of some
valid execution plans for queries with nested outer joins.
The need arose when working on Bug 26141, where it became
necessary to replace TABLE_LIST with its forward declaration in a few
headers, and this involved a lot of s/TABLE_LIST/st_table_list/.
Although other workarounds exist, this patch is in line
with our general strategy of moving away from typedef-ed names.
Sometime in future we might also rename TABLE_LIST to follow the
coding style, but this is a huge change.
query / no aggregate of subquery
The optimizer counts the aggregate functions that
appear as top level expressions (in all_fields) in
the current subquery. Later it makes a list of these
that it uses to actually execute the aggregates in
end_send_group().
That count is used in several places as a flag whether
there are aggregates functions.
While collecting the above info it must not consider
aggregates that are not aggregated in the current
context. It must treat them as normal expressions
instead. Not doing that leads to incorrect data about
the query, e.g. running a query that actually has no
aggregate functions as if it has some (and hence is
expected to return only one row).
Fixed by ignoring the aggregates that are not aggregated
in the current context.
One other smaller omission discovered and fixed in the
process : the place of aggregation was not calculated for
user defined functions. Fixed by calling
Item_sum::init_sum_func_check() and
Item_sum::check_sum_func() as it's done for the rest of
the aggregate functions.