1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-07-29 08:21:15 +03:00

fix(rgdata)!: Fix assertion failure leading to disk-based aggregation failure

The new added invariant checking that RGData knows the number of columns and fixed size columns was failing for disk-based aggregation workloads, leading them to provide a wrong result. (The assertion failure happened in RGData::getRow(uint32_t num, Row* row) which is called in the finalization of sub-aggregation results, necessary for merging part results. As the merging failed, duplicate results were output for disk-based aggregation queries.
The assertion failure was caused by RGData::deserialize(ByteStream& bs,
uint32_t defAmount) not setting rowSize and colCount if necessary (e.g.
when the deserialization happens into a new, default RGData, which
doesn't know anything about its structure yet. This is the case when the
default constructor for RGData() is used, which sets rowSize and
columnCount to 0 each.
There are three code parts that make use of the default RGData() ctor.
The fix is for the use in RowGroupStorage::loadRG(uint64_t rgid,
std::unique_ptr<RGData>& rgdata, bool unlinkDump = false), where the
default RGData object is used to directly deserialize a ByteStream into
it. The deserialize method now checks if both rowSize and columnCount
are 0 and if yes sets the read values from the ByteStream for both.
We should probably check the other two code parts making use of the
default RGData ctor, too. This happens in joinpartition.cpp and
tuplejoiner.cpp.

---------

Co-authored-by: Theresa Hradilak <34538290+phoeinx@users.noreply.github.com>
This commit is contained in:
Leonid Fedorov
2023-09-30 00:02:31 +03:00
committed by GitHub
parent 6f52039658
commit 86c1c5d537

View File

@ -414,9 +414,17 @@ void RGData::deserialize(ByteStream& bs, uint32_t defAmount)
uint32_t rowSizeTemp;
bs >> colCountTemp;
bs >> rowSizeTemp;
if (rowSize != 0)
if (rowSize != 0 || columnCount != 0)
{
idbassert(colCountTemp == columnCount && rowSize == rowSizeTemp);
idbassert(rowSize == rowSizeTemp && colCountTemp == columnCount);
}
else
{
// if deserializing into an empty RGData created by default constructor
// which sets columnCount = 0 and rowSize = 0, set columnCount and rowSize
// from deserialized bytestream
columnCount = colCountTemp;
rowSize = rowSizeTemp;
}
rowData.reset(new uint8_t[std::max(amount, defAmount)]);
buf = bs.buf();