mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-07-29 08:21:15 +03:00

fix(aggregation, disk-based) MCOL-5691 distinct aggregate disk based (#3145)

* fix(aggregation, disk-based): MCOL-5689 this fixes disk-based distinct aggregation functions
Previously, disk-based distinct aggregation functions produced incorrect results because no finalization was applied to previous generations stored on disk.

* fix(aggregation, disk-based): Fix disk-based COUNT(DISTINCT ...) queries. (Case 2). (Distinct & Multi-Distinct, Single- & Multi-Threaded).

* fix(aggregation, disk-based): Fix disk-based DISTINCT & GROUP BY queries. (Case 1). (Distinct & Multi-Distinct, Single- & Multi-Threaded).
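
To illustrate why spilled generations need a finalization (merge) step before results are returned, here is a minimal, self-contained sketch. It is not ColumnStore code: `Generation`, `countDistinctPerGeneration`, `countDistinctMerged` and `runExample` are hypothetical names, and each generation is modelled as a plain set of keys. It shows one way a distinct count can go wrong when every generation is treated as already final, which may or may not match the exact failure mode fixed here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for one spilled "generation": the distinct keys seen
// before the in-memory table was dumped to disk.
using Generation = std::unordered_set<std::int64_t>;

// Incorrect: treat each generation as already final and add up the counts.
// A key present in two generations is counted twice.
std::size_t countDistinctPerGeneration(const std::vector<Generation>& gens)
{
  std::size_t total = 0;
  for (const auto& g : gens)
    total += g.size();
  return total;
}

// Correct: merge all generations first, then count once.
std::size_t countDistinctMerged(const std::vector<Generation>& gens)
{
  Generation merged;
  for (const auto& g : gens)
    merged.insert(g.begin(), g.end());
  return merged.size();
}

// Small worked example: key 2 appears in both generations.
inline void runExample()
{
  const std::vector<Generation> gens = {{1, 2}, {2, 3}};
  assert(countDistinctPerGeneration(gens) == 4);  // overcounts key 2
  assert(countDistinctMerged(gens) == 3);         // true number of distinct keys
}
```

In this toy model the merge plays the role that the commit message assigns to finalization of the on-disk generations: results are only correct once all generations have been combined.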

---------

Co-authored-by: Theresa Hradilak <theresa.hradilak@gmail.com>
Co-authored-by: Roman Nozdrin <rnozdrin@mariadb.com>
drrtuy
2024-03-24 17:04:37 +02:00
committed by Leonid Fedorov
parent 8cb7bc8e54
commit 444cf4c65e
7 changed files with 398 additions and 128 deletions


@@ -679,6 +679,17 @@ class RowAggregationUM : public RowAggregation
*/
bool nextRowGroup();
/** @brief Returns aggregated rows in a RowGroup as long as there are still not returned result RowGroups.
*
* This function should be called repeatedly until false is returned (meaning end of data).
* Returns data from in-memory storage, as well as spilled data from disk. If disk-based aggregation is
* happening, finalAggregation() should be called before returning result RowGroups to finalize the used
* RowAggStorages, merge different spilled generations and obtain correct aggregation results.
*
* @returns True if there are more result RowGroups, else false if all results have been returned.
*/
bool nextOutputRowGroup();
/** @brief Add an aggregator for DISTINCT aggregation
*/
void distinctAggregator(const boost::shared_ptr<RowAggregation>& da)