mirror of
https://github.com/mariadb-corporation/mariadb-columnstore-engine.git
synced 2025-11-12 13:21:04 +03:00
The newly added invariant check that RGData knows its number of columns and fixed-size columns was failing for disk-based aggregation workloads, causing them to return wrong results.

The assertion failure happened in RGData::getRow(uint32_t num, Row* row), which is called when finalizing sub-aggregation results, a step required for merging partial results. Because the merge failed, disk-based aggregation queries output duplicate results.

The assertion failure was caused by RGData::deserialize(ByteStream& bs, uint32_t defAmount) not setting rowSize and columnCount when necessary, e.g. when deserializing into a new, default-constructed RGData that does not know anything about its structure yet. The default constructor RGData() sets rowSize and columnCount to 0.

Three code parts use the default RGData() ctor. This fix covers the use in RowGroupStorage::loadRG(uint64_t rgid, std::unique_ptr<RGData>& rgdata, bool unlinkDump = false), where a default RGData object is used to deserialize a ByteStream directly into it. The deserialize method now checks whether both rowSize and columnCount are 0 and, if so, sets both from the values read from the ByteStream.

We should probably also check the other two code parts that use the default RGData ctor. They are in joinpartition.cpp and tuplejoiner.cpp.

---------

Co-authored-by: Theresa Hradilak <34538290+phoeinx@users.noreply.github.com>
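The deserialize change described above can be illustrated with a minimal sketch. ToyByteStream and ToyRGData are hypothetical, heavily simplified stand-ins, not the real ColumnStore ByteStream and RGData classes. The wire layout (rowSize followed by columnCount) and the knowsLayout() helper are assumptions made only for this example. It shows the idea of adopting the layout read from the stream only when the target object is still in its default state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a byte stream; NOT the real ColumnStore ByteStream.
class ToyByteStream
{
 public:
  // Append a 32-bit value to the end of the buffer.
  void put(uint32_t v)
  {
    uint8_t raw[sizeof(v)];
    std::memcpy(raw, &v, sizeof(v));
    buf_.insert(buf_.end(), raw, raw + sizeof(v));
  }

  // Read the next 32-bit value and advance the read position.
  uint32_t get()
  {
    assert(pos_ + sizeof(uint32_t) <= buf_.size());
    uint32_t v = 0;
    std::memcpy(&v, buf_.data() + pos_, sizeof(v));
    pos_ += sizeof(v);
    return v;
  }

 private:
  std::vector<uint8_t> buf_;
  std::size_t pos_ = 0;
};

// Hypothetical stand-in for RGData; only the two layout fields are modeled.
struct ToyRGData
{
  uint32_t rowSize = 0;      // default ctor leaves the layout unknown (0)
  uint32_t columnCount = 0;  // default ctor leaves the layout unknown (0)

  ToyRGData() = default;
  ToyRGData(uint32_t rs, uint32_t cc) : rowSize(rs), columnCount(cc) {}

  // Assumed wire layout for this sketch: rowSize, then columnCount.
  void serialize(ToyByteStream& bs) const
  {
    bs.put(rowSize);
    bs.put(columnCount);
  }

  void deserialize(ToyByteStream& bs)
  {
    const uint32_t readRowSize = bs.get();
    const uint32_t readColumnCount = bs.get();

    // Adopt the stream's layout only if this object does not have one yet,
    // i.e. it was default-constructed.
    if (rowSize == 0 && columnCount == 0)
    {
      rowSize = readRowSize;
      columnCount = readColumnCount;
    }
  }

  // Simplified version of the invariant: the layout must be known.
  bool knowsLayout() const
  {
    return rowSize != 0 && columnCount != 0;
  }
};
```

In this sketch, serializing a ToyRGData(16, 3) and deserializing it into a default-constructed ToyRGData leaves the target with rowSize 16 and columnCount 3, so the simplified invariant holds. An object that already has a layout keeps its own values.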