The issue here is that for a window function in the ORDER BY clause, we were
not creating an extra field in the temporary table for the window function
(which is contained in an expression). A call to split_sum_func is added to
handle this case.
We also need to update, in the temp table, all items that contain a window
function during window function computation, as filesort needs these values
to be up to date when it evaluates the ORDER BY clause of the select.
failed in compare_order_elements function
The issue here is that compare_order_lists() is called on the ORDER BY lists
of the window functions so that window functions that can be computed together
end up adjacent. In compare_order_lists() we iterate over all the elements of
the two functions' ORDER BY lists and compare their items.
compare_order_elements() is called for each pair of items and assumes that
every item in an ORDER BY list is of type Item::FIELD_ITEM.
In the case at hand, the ORDER BY clause contains constants. We should ignore
the constants and only compare items of type Item::FIELD_ITEM in
compare_order_elements().
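A minimal sketch of the intended comparison, using hypothetical Order_elem /
Item_kind stand-ins rather than the server's ORDER and Item types: constants
are skipped on both sides and only field items take part in the comparison.

  // Minimal sketch, not the server code: when comparing two ORDER BY lists,
  // skip elements whose item is a constant and compare only the field items.
  #include <cstddef>
  #include <iostream>
  #include <string>
  #include <vector>

  enum class Item_kind { FIELD_ITEM, CONST_ITEM };

  struct Order_elem {
    Item_kind kind;
    std::string field_name;  // meaningful only for FIELD_ITEM
  };

  // Two order lists are compatible if their field items match pairwise;
  // constant elements on either side are ignored.
  static bool order_lists_compatible(const std::vector<Order_elem> &a,
                                     const std::vector<Order_elem> &b)
  {
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size())
    {
      if (a[i].kind == Item_kind::CONST_ITEM) { ++i; continue; }  // skip constant
      if (b[j].kind == Item_kind::CONST_ITEM) { ++j; continue; }  // skip constant
      if (a[i].field_name != b[j].field_name)
        return false;                        // differing fields: not compatible
      ++i; ++j;
    }
    // Any leftover elements must all be constants.
    for (; i < a.size(); ++i)
      if (a[i].kind != Item_kind::CONST_ITEM) return false;
    for (; j < b.size(); ++j)
      if (b[j].kind != Item_kind::CONST_ITEM) return false;
    return true;
  }

  int main()
  {
    std::vector<Order_elem> w1 = {{Item_kind::FIELD_ITEM, "a"},
                                  {Item_kind::CONST_ITEM, ""}};
    std::vector<Order_elem> w2 = {{Item_kind::FIELD_ITEM, "a"}};
    std::cout << std::boolalpha << order_lists_compatible(w1, w2) << "\n";  // true
  }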
Users expect window functions to produce a certain ordering of rows in
the final result set. Although the standard does not require this, we
already have the filesort result from computing the window functions.
If there is no ORDER BY attached to the query, just keep that result
until the SELECT is completely evaluated and use it to print the
result.
Update test cases, as many did not take care to guarantee a stable
result.
The issue here is that window function execution is not invoked for the
correct join tab. When we have GROUP BY and create extra temporary tables,
window function execution must run on the last join tab, but the current code
does not take JOIN::aggr_tables into account when locating it.
Fixed by introducing a new function, JOIN::total_join_tab_cnt(), that also
counts the temporary tables.
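A hedged sketch of the counting logic; base_join_tabs is a made-up member and
aggr_tables merely mirrors the JOIN::aggr_tables mentioned above. The point is
that the total count includes the temp-table join tabs, so that "the last join
tab" is the one that runs after aggregation.

  #include <cstdio>

  struct Join_sketch {
    unsigned base_join_tabs;  // join tabs for the base tables
    unsigned aggr_tables;     // temporary tables added for aggregation

    // Count every join tab, including the ones that read the temp tables.
    unsigned total_join_tab_cnt() const { return base_join_tabs + aggr_tables; }
  };

  int main() {
    Join_sketch join{ /*base_join_tabs=*/3, /*aggr_tables=*/1 };
    // Window function computation should be attached to the last of these.
    unsigned last_tab_idx = join.total_join_tab_cnt() - 1;
    std::printf("last join tab index: %u\n", last_tab_idx);
  }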
Window definitions are resolved during fix_fields(). Updating used tables
for window functions must be done after all window functions have had a
chance to be resolved.
There was an additional problem with the implementation: expressions that
contained window functions never updated the expression's used tables.
To fix both these issues, make sure to call "update_used_tables" on all
items that contain window functions after we have passed through all
items.
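A simplified sketch (hypothetical Expr type, not the server's Item classes) of
the extra pass: once every window function has been resolved, walk the list
again and refresh the used-tables information of any expression that contains
a window function.

  #include <cstdint>
  #include <vector>

  struct Expr {
    bool contains_window_func = false;
    std::uint64_t used_tables_map = 0;
    std::uint64_t children_map = 0;  // stands in for the children's merged maps
    // A real Item recomputes its map from its arguments here.
    void update_used_tables() { used_tables_map = children_map; }
  };

  // Extra pass, run only after fix_fields() has finished for every item.
  static void update_items_with_window_funcs(std::vector<Expr> &select_list) {
    for (Expr &e : select_list)
      if (e.contains_window_func)
        e.update_used_tables();
  }

  int main() {
    std::vector<Expr> select_list = {{true, 0, 0x2}, {false, 0, 0x1}};
    update_items_with_window_funcs(select_list);
    return select_list[0].used_tables_map == 0x2 ? 0 : 1;
  }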
The usage of window functions when all tables were optimized away by the
min/max optimization was not supported. As a result, queries that used
window functions with min/max aggregation over the whole table returned
wrong result sets.
The patch fixes this problem.
The bug was not visible in the current HEAD. Introduced a test case to catch
regressions. Also improve the error messages regarding DISTINCT usage in
window functions.
The bug was caused by several issues.
First, two problems in seek_io_cache(). Due to wrong offsets, we would end up
seeking much too far (first change) or past the intended seek point (second
change). Fixing this requires correctly detecting the data available in the
buffer (first change) and not using "IO_SIZE aligned" reads (second change).
The latter is needed because _my_b_cache_read adjusts pos_in_file itself,
based on read_pos and read_end. Pretending the buffer is empty when we want
to force a read alleviates this problem.
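A generic sketch of the buffered-seek logic described above, with assumed
field names rather than the real IO_CACHE members: a seek is served from the
buffer only when the target offset falls inside the bytes the buffer currently
holds; otherwise the buffer is treated as empty so the next read goes to the
file at exactly the requested offset.

  #include <cstddef>
  #include <cstdint>

  struct Buffered_reader {
    std::uint64_t buffer_file_offset = 0;  // file offset of buffer[0]
    std::size_t   buffer_len = 0;          // valid bytes in the buffer
    std::size_t   read_pos = 0;            // current position inside the buffer
    std::uint64_t pos_in_file = 0;         // file offset the next read starts at

    void seek(std::uint64_t target) {
      // Available-data check: is 'target' inside the buffered byte range?
      if (target >= buffer_file_offset &&
          target <  buffer_file_offset + buffer_len) {
        read_pos = static_cast<std::size_t>(target - buffer_file_offset);
        return;                            // serve the seek from the buffer
      }
      // Pretend the buffer is empty: no rounding to an IO_SIZE boundary,
      // the next read must start exactly at 'target'.
      buffer_len = 0;
      read_pos = 0;
      buffer_file_offset = target;
      pos_in_file = target;
    }
  };

  int main() {
    Buffered_reader r;
    r.buffer_file_offset = 100;
    r.buffer_len = 64;        // buffer holds file bytes 100..163
    r.seek(120);              // inside the buffer: only read_pos moves
    r.seek(500);              // outside: buffer dropped, next read hits the file
    return (r.read_pos == 0 && r.pos_in_file == 500) ? 0 : 1;
  }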
Secondly, the big-table cursors did not respect the interface definition of
always returning the row number that Table_read_cursor::fetch() would
activate. At the same time, next(), prev() and move_to() should not perform
any row activation.
Window functions need to be computed after applying the HAVING clause.
An optimization that we have for regular, non-window-function cases is to
apply HAVING only while sending rows to the client. This allows rows that
should be filtered out to still be stored in the temporary table that holds
the aggregation results.
This behaviour is undesirable for window functions, as they must be computed
on the result set after HAVING is applied. Storing the extra rows in the
table leads to wrong values, because the frame bounds might capture those
rows that are only filtered out afterwards.
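A small worked example of why this matters, as generic code rather than server
internals: with SUM(x) OVER (ORDER BY x ROWS BETWEEN 1 PRECEDING AND CURRENT
ROW), a row that HAVING is supposed to remove still influences its neighbour's
frame if it is left in the temporary table.

  #include <cstdio>
  #include <vector>

  // SUM over a "ROWS BETWEEN 1 PRECEDING AND CURRENT ROW" frame.
  static std::vector<int> sliding_sum(const std::vector<int> &rows) {
    std::vector<int> out;
    for (std::size_t i = 0; i < rows.size(); ++i)
      out.push_back(rows[i] + (i > 0 ? rows[i - 1] : 0));
    return out;
  }

  int main() {
    std::vector<int> having_first = {1, 3, 4};     // HAVING removed the row '2'
    std::vector<int> having_later = {1, 2, 3, 4};  // '2' still in the temp table

    for (int s : sliding_sum(having_first)) std::printf("%d ", s);  // 1 4 7
    std::printf("\n");
    // The frame of row '3' now captures '2'; after filtering, the surviving
    // rows carry 1 5 7 instead of the correct 1 4 7.
    for (int s : sliding_sum(having_later)) std::printf("%d ", s);  // 1 3 5 7
    std::printf("\n");
  }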
The same approach is needed for LAST_VALUE, otherwise the LAST_VALUE sum
functions are not cleared correctly. Now LAST_VALUE behaves like NTH_VALUE
with an offset of 0, except that the frame bound it examines is the bottom
bound, not the top bound.
There was no implementation of the virtual method print() for the
Item_window_func class. As a result, for a view containing a window function,
an invalid view definition could be written into the frm file. When a query
referring to this view was executed, a syntax error was reported.
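A hedged illustration of the general pattern, with hypothetical *_sketch
classes rather than the real Item hierarchy: the window-function item
overrides the virtual print() so that the text stored in the view definition
is SQL that can be parsed back when the view is used.

  #include <iostream>
  #include <string>

  struct Item_sketch {
    virtual ~Item_sketch() = default;
    virtual void print(std::string &out) const = 0;
  };

  struct Item_field_sketch : Item_sketch {
    std::string name;
    explicit Item_field_sketch(std::string n) : name(std::move(n)) {}
    void print(std::string &out) const override { out += name; }
  };

  // Without an override like this, a default or missing implementation ends
  // up writing text that cannot be re-parsed when the view is queried.
  struct Item_window_func_sketch : Item_sketch {
    const Item_sketch *arg;
    std::string partition_col;
    Item_window_func_sketch(const Item_sketch *a, std::string p)
        : arg(a), partition_col(std::move(p)) {}
    void print(std::string &out) const override {
      out += "sum(";
      arg->print(out);
      out += ") over (partition by " + partition_col + ")";
    }
  };

  int main() {
    Item_field_sketch col("b");
    Item_window_func_sketch wf(&col, "a");
    std::string text;
    wf.print(text);
    std::cout << text << "\n";  // sum(b) over (partition by a)
  }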
Add support for multiple IO_CACHEs with type=READ_CACHE to share
the file they are reading from.
Each IO_CACHE keeps its own in-memory buffer. When doing a read or seek
operation on the file, it notifies other IO_CACHEs that the file position
has been changed.
Make Rowid_seq_cursor use cloned IO_CACHE when reading filesort result.
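A simplified model of the sharing scheme, with made-up names rather than the
my_sys API; the notification is approximated here by a shared position field
that every reader checks before touching the file. Each reader keeps its own
logical position (and, in the real code, its own in-memory buffer) and
repositions the shared file if another reader has moved it.

  #include <cstddef>
  #include <cstdio>

  struct Shared_file {
    std::FILE *f;
    long pos = 0;           // where the OS-level file position currently is
  };

  struct Reader {
    Shared_file *shared;
    long my_pos = 0;        // this reader's own logical position

    std::size_t read(char *buf, std::size_t len) {
      if (shared->pos != my_pos) {               // another reader moved the file
        std::fseek(shared->f, my_pos, SEEK_SET);
        shared->pos = my_pos;
      }
      std::size_t n = std::fread(buf, 1, len, shared->f);
      my_pos += static_cast<long>(n);
      shared->pos = my_pos;                      // publish the new position
      return n;
    }
  };

  int main() {
    std::FILE *f = std::tmpfile();
    if (!f) return 1;
    std::fputs("abcdefgh", f);
    std::fflush(f);
    std::fseek(f, 0, SEEK_SET);

    Shared_file shared{f};
    Reader r1{&shared}, r2{&shared};
    char a[5] = {0}, b[5] = {0};
    r1.read(a, 4);          // reads "abcd"
    r2.read(b, 4);          // also reads "abcd", not "efgh"
    std::printf("%s %s\n", a, b);
    std::fclose(f);
  }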
Refactor out (into a copy for now) the logic of Item_sum_hybrid to allow
for multiple arguments; the copy does not contain the comparator members.
The result is the class Item_sum_hybrid_simple.
LEAD and LAG make use of this Item to store previous rows in a cache.
It also helps in specifying the field type. Currently LEAD/LAG do not
support default values.
NTH_VALUE behaves identically to LEAD and LAG, except that the starting
position cursor is placed at the top of the frame instead of at the current
row.
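A toy illustration of the offset semantics described above, using plain
vectors instead of cursors and frames: LEAD and LAG offset from the current
row, while NTH_VALUE counts from the top of the frame (taken here to be the
whole partition, for simplicity).

  #include <cstdio>
  #include <optional>
  #include <vector>

  using Row = int;

  // LEAD with a negative offset behaves like LAG.
  static std::optional<Row> lead(const std::vector<Row> &part,
                                 std::size_t cur, long off) {
    long idx = static_cast<long>(cur) + off;
    if (idx < 0 || idx >= static_cast<long>(part.size())) return std::nullopt;
    return part[static_cast<std::size_t>(idx)];
  }

  // NTH_VALUE: n is 1-based, counted from the first row of the frame.
  static std::optional<Row> nth_value(const std::vector<Row> &part,
                                      std::size_t n) {
    if (n == 0 || n > part.size()) return std::nullopt;
    return part[n - 1];
  }

  int main() {
    std::vector<Row> part = {10, 20, 30, 40};
    std::size_t cur = 2;                       // current row holds 30
    auto p = [](std::optional<Row> v) {
      v ? std::printf("%d\n", *v) : std::printf("NULL\n");
    };
    p(lead(part, cur, 1));   // LEAD(x, 1)      -> 40
    p(lead(part, cur, -1));  // LAG(x, 1)       -> 20
    p(nth_value(part, 2));   // NTH_VALUE(x, 2) -> 20, regardless of cur
  }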
Make window functions work with an empty OVER clause by forcing a sort on
the first column of the current join_tab. This is a temporary fix until we
get window functions to work with big tables.
The positional cursor now fetches rows based on its current position and an
offset (if present). It fetches a row, applying the offset, only if the
required position is not out of bounds.
With clever use of partition bounds, we only need to add one row at a time
to the items. This removes the need to "reset" an item and run through the
full partition again.
We can set all values in the record buffer first and perform a single table
write call at the end; there is no need to write to the file every time one
column is updated.
Also, remove an unused method from Table_read_cursor.
This makes such zero-offset frame bounds behave exactly like CURRENT ROW.
The standard specifies an unsigned integer, which includes the value 0.
Expand the win_min_max test to include this kind of frame definition.
Cursors now report their current row number as the boundary of the
partition. This is used by Frame_scan_cursor to compute aggregate
functions that do not support removal.
Currently, cursors automatically add values to the sum functions they
manage. There are use cases where we just want to figure out the frame
boundaries without actually adding/removing values from them.
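A sketch of the resulting scan strategy, as generic code rather than the
server's Frame_scan_cursor: for an aggregate that cannot remove values, the
bound cursors are only asked for the frame boundaries, and the aggregate is
cleared and re-accumulated by scanning the rows between them.

  #include <algorithm>
  #include <cstdio>
  #include <utility>
  #include <vector>

  struct Max_agg {               // no removal support (like MIN/MAX over rows)
    int value = 0;
    bool empty = true;
    void clear() { value = 0; empty = true; }
    void add(int v) { value = empty ? v : std::max(value, v); empty = false; }
  };

  // 'frames' holds the [top, bottom] row numbers the bound cursors reported.
  static std::vector<int>
  max_over_frames(const std::vector<int> &part,
                  const std::vector<std::pair<std::size_t, std::size_t>> &frames) {
    std::vector<int> out;
    Max_agg agg;
    for (auto [top, bottom] : frames) {
      agg.clear();               // cannot remove rows, so start from scratch
      for (std::size_t i = top; i <= bottom && i < part.size(); ++i)
        agg.add(part[i]);
      out.push_back(agg.value);
    }
    return out;
  }

  int main() {
    std::vector<int> part = {5, 1, 7, 3};
    // ROWS BETWEEN 1 PRECEDING AND CURRENT ROW for each of the four rows.
    std::vector<std::pair<std::size_t, std::size_t>> frames =
        {{0, 0}, {0, 1}, {1, 2}, {2, 3}};
    for (int v : max_over_frames(part, frames)) std::printf("%d ", v);  // 5 5 7 7
    std::printf("\n");
  }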
When specifying a RANGE-type frame that exceeds the partition size for both
the top and bottom cursors, we end up removing more rows from the aggregate
function than were added. This happens because the TOP range cursor, which
removes values from the aggregate function, was allowed to breach partition
boundaries, while the BOTTOM range cursor was not.
To prevent this from happening, force the TOP range cursor to move only
within the current partition, as the BOTTOM range cursor already does.
Perform only one table scan for each window function present. We do this by
keeping cursors for each window function frame bound and running them for
each function for every row.
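A condensed sketch of the single-scan technique with assumed names: a
removable aggregate plus two indices standing in for the frame-bound cursors;
for every row, the bottom cursor adds the rows entering the frame and the top
cursor removes the rows leaving it.

  #include <cstdio>
  #include <vector>

  struct Sum_agg {               // removable aggregate: supports add and remove
    long long sum = 0;
    void add(int v)    { sum += v; }
    void remove(int v) { sum -= v; }
  };

  // One pass over the partition computes
  // SUM(x) OVER (ROWS BETWEEN 1 PRECEDING AND CURRENT ROW).
  static std::vector<long long> rows_sum(const std::vector<int> &part) {
    std::vector<long long> out;
    Sum_agg agg;
    std::size_t bottom = 0;      // next row to enter the frame (bottom cursor)
    std::size_t top = 0;         // next row to leave the frame (top cursor)
    for (std::size_t cur = 0; cur < part.size(); ++cur) {
      while (bottom <= cur) agg.add(part[bottom++]);       // frame gains rows
      while (top + 1 < cur) agg.remove(part[top++]);       // frame loses rows
      out.push_back(agg.sum);
    }
    return out;
  }

  int main() {
    for (long long s : rows_sum({1, 2, 3, 4})) std::printf("%lld ", s);  // 1 3 5 7
    std::printf("\n");
  }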
- Avoid some realloc() during startup
- Ensure that file_key_management_plugin frees its memory early, even if
  it is linked statically.
- Fixed compiler warnings from unused variables and missing destructors
- Fixed wrong indentation
Make Frame_range_current_row_bottom take partition bounds into account.
Other frame bounds that could potentially hit the end of the partition are
Frame_range_n_bottom, Frame_n_rows_following and Frame_unbounded_following,
and they all already had end-of-partition protection.
To simplify the code, the end-of-partition checks were factored out into the
class Partition_read_cursor.
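A hedged sketch of the factored-out check, with a simplified, assumed
interface: a cursor that knows which partition it is positioned in and refuses
to advance past its last row, so the individual frame-bound classes no longer
need their own end-of-partition guards.

  #include <cstddef>
  #include <vector>

  struct Rows {                            // row source: rows tagged by partition
    std::vector<int> partition_id;
    std::size_t size() const { return partition_id.size(); }
  };

  class Partition_read_cursor_sketch {
  public:
    explicit Partition_read_cursor_sketch(const Rows *rows) : rows_(rows) {}

    // Position on the first row of the partition starting at 'start'.
    void next_partition(std::size_t start) {
      pos_ = start;
      part_id_ = rows_->partition_id[start];
    }

    // Advance one row; refuse to cross into the next partition.
    bool get_next() {
      std::size_t next = pos_ + 1;
      if (next >= rows_->size() || rows_->partition_id[next] != part_id_)
        return false;                      // end of partition reached
      pos_ = next;
      return true;
    }

    std::size_t position() const { return pos_; }

  private:
    const Rows *rows_;
    std::size_t pos_ = 0;
    int part_id_ = 0;
  };

  int main() {
    Rows rows{{1, 1, 1, 2, 2}};
    Partition_read_cursor_sketch c(&rows);
    c.next_partition(0);
    int n = 1;
    while (c.get_next()) ++n;              // stops at row 2, last of partition 1
    return n == 3 ? 0 : 1;
  }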
This bug revealed a serious problem: if the same partition list
was used in two window specifications then the temporary table created
to calculate window functions contained fields for two identical
partitions. This problem was fixed as well.
n=0 in "ROWS 0 PRECEDING" is valid; add handling for it:
- Adjust the assert
- The bottom bound of 'ROWS 0 PRECEDING' actually looks at the current row,
  that is, it needs to process the partition's first row directly in
  Frame_n_rows_preceding::next_partition().
- Added test cases
" The sort order for the sub-sequence of window functions starting
from the element marked by SORTORDER_CHANGE_FLAG up to the next
element marked by SORTORDER_CHANGE_FLAG must be taken from the
last element of the sub-sequence (not from the first one)."
- Rename Window_funcs_computation to Window_funcs_computation_step
- Introduce Window_func_sort which invokes filesort and then
invokes computation of all window functions that use this ordering.
- Expose Window functions' sort operations in EXPLAIN|ANALYZE FORMAT=JSON
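A sketch of the grouping idea with hypothetical types: walk the ordered list
of window functions, start a new sort step whenever an element carries the
sort-order-change flag, attach the following functions to that step, and take
the step's sort key from the last element of the sub-sequence, as the note
above requires.

  #include <cstdio>
  #include <string>
  #include <vector>

  struct Window_func_desc {
    std::string name;
    std::string sort_key;        // ORDER BY this function needs
    bool sortorder_change;       // stands in for SORTORDER_CHANGE_FLAG
  };

  struct Window_func_sort_step { // one filesort + all functions using its order
    std::string sort_key;
    std::vector<std::string> funcs;
  };

  static std::vector<Window_func_sort_step>
  build_steps(const std::vector<Window_func_desc> &list) {
    std::vector<Window_func_sort_step> steps;
    for (const Window_func_desc &wf : list) {
      if (wf.sortorder_change || steps.empty())
        steps.push_back({});
      steps.back().funcs.push_back(wf.name);
      steps.back().sort_key = wf.sort_key;  // keep the last one of the group
    }
    return steps;
  }

  int main() {
    std::vector<Window_func_desc> list = {
      {"row_number", "a",   true},
      {"sum",        "a,b", false},  // compatible: sorting by "a,b" also sorts by "a"
      {"avg",        "c",   true},
    };
    for (const Window_func_sort_step &s : build_steps(list))
      std::printf("sort by (%s): %zu function(s)\n",
                  s.sort_key.c_str(), s.funcs.size());
  }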