fix_fields for the arguments of the NTH_VALUE function was updating the same reference,
so for the second argument (or after the first argument) the items were not resolved
to their corresponding field from the view as they were updating the reference to the
first argument.
Removed an unnecessary ifndef which was printing the window name for a named
window only in the case of debug build. The print() for the window function
should behave in the same way on both release and debug builds.
Allocate space for fields inside the window function (arguments, PARTITION BY and ORDER BY clause)
in the ref pointer array. All fields inside the window function are part of the temporary
table that is required for the window function computation.
Currently when both PARTITION BY and ORDER BY clauses are empty then we create a Item
with the first field in the select list and sort with that field.
It should be created as an Item_temptable_field instead of Item_field because the
print() function continues to work even if the table has been dropped.
A temporary table is needed for window function computation but if only a NAMED WINDOW SPEC
is used and there is no window function, then there is no need to create a temporary
table as there is no stage to compute WINDOW FUNCTION
The issue here is for degenerate joins we should execute the window
function but it is not getting executed in all the cases.
To get the window function values window function needs to be executed
always. This currently does not happen in few cases
where the join would return 0 or 1 row like
1) IMPOSSIBLE WHERE
2) MIN/MAX optimization
3) EMPTY CONST TABLE
The fix is to make sure that window functions get executed
and the temporary table is setup for the execution of window functions
The query requires 2 temporary tables for execution, the window function
is always attached to the last temporary table, but in this case the
result field of the window function points to the first temporary table
rather than the last one.
Fixed this by not changing window function items with temporary table
items of the first temporary table.
The issue here is window function makes the passed string object
to point to an area in a temporary table's record buffer.
Then, the temporary table is freed, together with its record buffer.
Then, Item_cache_str attempts to read this value.
The fix is to call value_buff.copy(). This will make the value_buff to store
its string in a buffer that it owns, which will not disappear unexpectedly.
Adding destructor for Group_bound_tracker to free Cached_item_str.
The Cached_item for window functions are allocated on THD:mem_root
but the Cached_item_str has value of type string which is allocated on
the heap, so we need to call free() for it
Need to call split_sum_func if an aggregate function is part of order by
or partition by clause so that we have the required fields inside the temporary
table, as all the fields inside the partition by and order by clause of the
window function needs to be there in the temp table used for window function
computation.
The issue here is that for a window function in the ORDER BY clause, we were not
creating an extra field in the temporary table for the window function
(which is contained in an expression).
So a call to split_sum_func is added to handle this case
Also we need to update all items that contain a window function
in the temp table during window function computation as filesort would need
these values to be updated to calculate the ORDER BY clause of the select.
For degenerate joins we may have JOIN::table_list as NULL, so instead
of using JOIN::top_join_tab_count use the function JOIN::exec_join_tab_cnt
to get the number of tables joined at the top level.
failed in compare_order_elements function
The issue here is the function compare_order_lists() is called for the order by list of the window functions
so that those window function that can be computed together are adjacent.
So in the function compare_order_list we iterate over all the elements in the order list of the two functions and
compare the items in their order by clause.
The function compare_order_elements() is called for each item in the
order by clause. This function assumes that all the items that are in the order by list would be of the type
Item::FIELD_ITEM.
The case we have is that we have constants in the order by clause. We should ignore the constant and only compare
items of the type Item::FIELD_ITEM in compare_order_elements()
The issue here was when we had a subquery and a window function in an expression in
the select list then subquery was getting computed after window function computation.
This resulted in incorrect results because the subquery was correlated and the fields
in the subquery was pointing to the base table instead of the temporary table.
The approach to fix this was to have an additional field in the temporary table
for the subquery and to execute the subquery before window function execution.
After execution the values for the subquery were stored in the temporary table
and then when we needed to calcuate the expression, all we do is read the values
from the temporary table for the subquery.
While printing a view containing a window function we were printing it as an
Item_field object instead of an Item_window_func object. This is incorrect and this
leads to us throwing an error ER_VIEW_INVALID.
Fixed by adjusting the Item_ref:print function.
Also made UDF function aware if there arguments have window function.
While calculating distinct with the function remove_dup_with_compare, we don't have rnd_end calls
when we have completed the scan over the temporary table.
Added ha_rnd_end calls when we are done with the scan of the table.
For prepare statemtent/stored procedures we rollback the items to original
ones after prepare execution in the function reinit_stmt_before_use.
This rollback is done for group by, order by clauses but is not done for
the window specification containing the order by and partition by clause of the
window function.
During the optimize state of a query, we come know that the result set
would atmost contain one row, then for such a query we don't need
to compute GROUP BY, ORDER BY and DISTINCT.
Users expect window functions to produce a certain ordering of rows in
the final result set. Although the standard does not require this, we
already have the filesort result done for when we computed the window
function. If there is no ORDER BY attached to the query, just keep it
till the SELECT is completely evaluated and use that to print the
result.
Update test cases as many did not take care to guarantee a stable
result.
The ONLY_FULL_GROUP_BY mode states that for SELECT ... GROUP BY queries,
disallow SELECTing columns which are not referred to in the GROUP BY clause,
unless they are passed to an aggregate function like COUNT() or MAX().
This holds only for the GROUP BY clause of the query.
The code also checks this for the partition clause of the window function which is
incorrect.
When we have a query which has implicit_grouping then we are sure that we would end up with only one
row so there is no point to do DISTINCT computation
The issue here is that the window function execution is not called for the correct join tab, when we have GROUP BY
where we create extra temporary tables then we need to call window function execution for the last join tab. For doing
so the current code does not take into account the JOIN::aggr_tables.
Fixed by introducing a new function JOIN::total_join_tab_cnt that takes in account the temporary tables also.
join_tab->distinct=true means "Before doing record read with this
join_tab, call join_tab->remove_duplicates() to eliminate duplicates".
remove_duplicates() assumes that
- there is a temporary table $T with rows that are to be de-duplicated
- there is a previous join_tab (e.g. with join_tab->fields) which was
used to populate the temp.table $T.
When the query has "Impossible WHERE" and window function, then the above
conditions are not met (but we still might need a window function
computation step when the query has implicit grouping).
The fix is to not add remove_duplicates step if the select execution is
degenerate (and we'll have at most one row in the output anyway).
Window is a reserved keyword according to SQL Standard 2016. However, we
can make the grammar slightly flexible by allowing WINDOW keyword everywhere
except table aliases. Change yacc grammar to separate between all keywords
and table_alias keywords.
When creating an Item_direct_view_ref we were setting the
Name_resolution_context based on TABLE_LIST::view member variable.
However, for derived tables this member is NULL. To not set a wrong
context, if TABLE_LIST::view is empty, use THD::lex instead.
Item_cond did not correctly propagate with_window_func flag in the Item
tree. Without it, we would not call Item::split_sum_func correctly and the
window function's result_field would remain NULL.
This is backport of 25ad623d64 for 10.2.
The issue is similar to the one from MDEV-13240. Item::save_in_field()
returns an error during tmp table population in a create table from select query
as we try to save an empty string as a date value when
force_return_blank is set to true for window functions.
MDEV-13240 Wrong warning with MAX(datetime_field) OVER (...)
The problem resided in Item_window_func implementation,
and it was revealed by bb-10.2-ext specific changes:
Item_window_func::save_in_field() works differently in bb-10.2-ext vs 10.2:
- 10.2 goes through val_str()
- bb-10.2-ext goes through get_date(), due to Type_handler related changes.
get_date() tries to convert empty string to DATETIME, hence the warning.
During a discussion with Vicentiu, it was decided to fix
Item_window_func::val_xxx() to return NULL
(instead of an "empty" value, such as 0 for numbers and '' for strings)
when force_return_blank is set.
Make st_select_lex::set_explain_type() take into account that JOIN_TABs
it is traversing may be also post-join aggregation JOIN_TABs (which
have pos_in_table_list=NULL, etc).
Do not run the window function computation step when the select
produces no rows (zero_result_cause!=NULL).
This may cause reads from uninitialized memory.
We still need to run the window function computation step when
the output includes just one row (for example
SELECT MAX(col), RANK() OVER (...) FROM t1 WHERE 1=0).
This fix also resolves an issue with queries with window functions
producing an output row where should be none, like in
SELECT ROW_NUMBER() FROM t1 WHERE 1=0.
Updated a few test results in the existing tests to reflect this.
Interval function makes use of Item_row. Item_row did not correctly mark
with_window_func flag according to its arguments. Fix it by making
Item_row aware of this flag.
The "is null" function performs one operation which no other Item_func
does, which is to update used tables during fix_length_and_dec().
This however can not be performed before window functions have had a
chance to resolve their order by and partition by definitions, which
happens after the initial setup_fields call. Consequently, do not call
Item_func_isnull update_used_tables during fix_length_and_dec().
There was another issue detected once the crash was resolved.
Because window functions did not implement is_null() method, we would
end up returning bad results for "is null" and "is not null" functions.
Implemented is_null() method for Item_windowfunc.
During statement preparation st_order::item gets set to a value in
ref_ptr_array. During statement execution we were overriding that value,
causing subsequent checks for window functions to return true.
Whenever we do any setting from ref_ptr_array, make sure to always
store the value in all_fields as well.
For function items containing window functions, as MDEV-12336 has
discovered, we don't need to create a separate Item_direct_ref or
Item_aggregate_ref as they will be computed directly from the top-level
item once the window function argument columns are computed.
The usage of windows functions when all tables were optimized away
by min/max optimization were not supported. As result a result,
the queries that used window functions with min/max aggregation
over the whole table returned wrong result sets.
The patch fixed this problem.
The function Item::split_sum_func2() incorrectly processed the function
items with window functions that were not window functions themselfes
and were used as arguments of other functions.
The bug was not visible in current HEAD. Introduced test case to catch
regressions. Also improve error messages regarding distinct usage in
window functions.
Window functions need to be computed after applying the HAVING clause.
An optimization that we have for regular, non-window function, cases is
to apply having only during sending of the rows to the client. This
allows rows that should be filtered from the temporary table used to
store aggregation results to be stored there.
This behaviour is undesireable for window functions, as we have to
compute window functions on the result-set after HAVING is applied.
Storing extra rows in the table leads to wrong values as the frame
bounds might capture those -to be filtered afterwards- rows.
These are different bugs, but the fixing code is the same:
if window functions are used over implicit grouping then
now the execution should follow the general path calling
the function set in JOIN::first_select.
Due to this bug many queries that contained a window function
with MIN/MAX aggregation returned wrong results.
Calculation of a MIN/MAX aggregate function uses cache objects
and a comparator object that are created and set up in
Item_sum_hybrid::fix_fields () by a call of Item_sum_hybrid::setup_hybrid().
The latter binds the objects to the first argument of the
MIN/MAX function. Meanwhile window function perform aggregation
over fields of a temporary table. So binding must be done rather to
these fields. The earliest moment when setup the objects used in
MIN/max functions can be done is after all calls of the method
split_sum_func().
This patch introduces this late setup, but only for aggregate
functions used in window functions.
Probably it makes sense to use this late setup for all MIN/MAX
objects.
This patch complements the patch for bug 11138.
Without this patch some table-less queries with window functions
could cause crashes due to a memory overwrite.