* move GROUP_CONCAT/JSON_ARRAYAGG storage to the RowGroup from
the RowAggregation*
* internal data structures (de)serialization
* get rid of a specialized classes for processing JSON_ARRAYAGG
* move the memory accounting to disk-based aggregation classes
* allow aggregation generations to be used for queries with
GROUP_CONCAT/JSON_ARRAYAGG
* Remove the thread id from the error message as it interferes with the mtr
This patch introduces an internal aggregate operator SELECT_SOME that
is automatically added to columns that are not in GROUP BY. It
"computes" some plausible value of the column (actually, last one
passed).
Along the way it fixes incorrect handling of HAVING being transferred
into WHERE, window function handling and a bit of other inconsistencies.
The buffer can utilize > 4GB RAM that is necessary for PM side join.
RGData ctor uses uint32_t allocating data buffer.
This fact causes implicit heap overflow.
* MCOL-4234: improve GROUP BY and ORDER BY interaction (#3194)
This patch fixes the problem in MCOL-4234 and also generally improves
behavior of GROUP BY.
It does so by introducing a "dummy" aggregate and by wrapping columns
into it. This allows for columns that are not in GROUP BY to be used
more freely, for example, in SELECT * FROM tbl GROUP BY col - all
columns that are not "col" will be wrapped into an aggregate and query
will proceed to execution.
The dummy aggregate itself does nothing more than remember last value
passed into it.
There also an additional error message that tries to explain what types
of expressions can be wrapped into an aggregate.
* MCOL-5772: incorrect ORDER BY ordering for a columns not in GROUP BY (#3214)
When ORDER BY column is not in GROUP BY, is not an aggregate and there
is a SELECT column that is also not an aggregate, there was a problem:
ordering happened on the SELECTed column, not ORDERed one.
This patch fixes that particular problem and also performs some tidying
around newly added aggregate.
---------
Co-authored-by: Leonid Fedorov <79837786+mariadb-LeonidFedorov@users.noreply.github.com>
The new added invariant checking that RGData knows the number of columns and fixed size columns was failing for disk-based aggregation workloads, leading them to provide a wrong result. (The assertion failure happened in RGData::getRow(uint32_t num, Row* row) which is called in the finalization of sub-aggregation results, necessary for merging part results. As the merging failed, duplicate results were output for disk-based aggregation queries.
The assertion failure was caused by RGData::deserialize(ByteStream& bs,
uint32_t defAmount) not setting rowSize and colCount if necessary (e.g.
when the deserialization happens into a new, default RGData, which
doesn't know anything about its structure yet. This is the case when the
default constructor for RGData() is used, which sets rowSize and
columnCount to 0 each.
There are three code parts that make use of the default RGData() ctor.
The fix is for the use in RowGroupStorage::loadRG(uint64_t rgid,
std::unique_ptr<RGData>& rgdata, bool unlinkDump = false), where the
default RGData object is used to directly deserialize a ByteStream into
it. The deserialize method now checks if both rowSize and columnCount
are 0 and if yes sets the read values from the ByteStream for both.
We should probably check the other two code parts making use of the
default RGData ctor, too. This happens in joinpartition.cpp and
tuplejoiner.cpp.
---------
Co-authored-by: Theresa Hradilak <34538290+phoeinx@users.noreply.github.com>
Adds a special column which helps to differentiate data and rollups of
various depts and a simple logic to row aggregation to add processing of
subtotals.
1. Input and output RowGroup's used in GROUP_CONCAT classes
are currently allocating a raw memory buffer of size equal
to the actual width of the string datatype. As an example,
for the following query:
SELECT col1, GROUP_CONCAT(col2) FROM t GROUP BY col1;
If col2 is a TEXT field with default width, the input
RowGroup containing the target rows to be concatenated will
assign 64kb of memory for every input row in the RowGroup.
This is wasteful as actual field values in real workloads
would be much smaller. We fix this by enabling the
RowGroup to use the StringStore when the RowGroup contains
long strings.
2. RowAggregation::initialize() allocates a memory buffer
for a NULL row. The size of this buffer is equal to the
row size for the output RowGroup. For the above scenario,
using the default group_concat_max_len (which is a server
variable that sets the maximum length of the GROUP_CONCAT string)
value of 1mb, the buffer size would be
(1mb + 64kb + some additional metadata). If the user sets
group_concat_max_len to a higher value, say 3gb, this buffer
size would be ~3gb. Now if the runtime initiates several
instances of RowAggregation, total memory consumption by
PrimProc could exceed the hardware memory limits causing the
OS OOM to kill the process. We fix this problem by again
enabling the StringStore for the NULL row allocation.
3. In the plugin code in buildAggregateColumn(), there is
an integer overflow when the server group_concat_max_len
variable (which is an uint32_t) is set to a value > INT32_MAX
(such as 3gb) and is assigned to
CalpontSystemCatalog::ColType::colWidth (which is an int32_t).
As a short term fix, we saturate the assigned value to colWidth
to INT32_MAX. Proper fix would be to upgrade
CalpontSystemCatalog::ColType::colWidth to an uint32_t.
* Fixes of bugs from ASAN warnings, part one
* MQC as static library, with nifty counter for global map and mutex
* Switch clang to 16
* link messageqcpp to execplan
This patch improves handling of NULLs in textual fields in ColumnStore.
Previously empty strings were considered NULLs and it could be a problem
if data scheme allows for empty strings. It was also one of major
reasons of behavior difference between ColumnStore and other engines in
MariaDB family.
Also, this patch fixes some other bugs and incorrect behavior, for
example, incorrect comparison for "column <= ''" which evaluates to
constant True for all purposes before this patch.
Given that idx is a RH hashmap bucket number and info is intra-bucket idx
the root cause is triggered by the difference of idx/hash pair
calculation for a certain GROUP BY generation and for generation
aggregations merging that takes place in RowAggStorage::finalize.
This patch generalizes rowHashToIdx to leverage it in both cases
mentioned above.
in aggregation code
The patch disables padding that forces hasher to calculate over the whole 2k buffer. This patch also moves hashing code
into the common place where it belongs.