mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-07-27 21:01:50 +03:00

MCOL-4580 extent elimination for dictionary-based text/varchar types

The idea is relatively simple: encode prefixes of collated strings as
integers and use them to compute extents' min/max ranges. We can then
eliminate extents when filtering on string columns.
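
A minimal sketch of the idea follows. It is not the code from this patch.
The names encodePrefix, PrefixRange and canSkipForEquality are hypothetical,
and the sketch assumes a binary (bytewise) collation, so the raw bytes stand
in for the collation sort key that a real implementation would have to
encode. The first 8 bytes are packed most significant byte first into an
unsigned integer, so integer comparison agrees with string comparison on
that prefix.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: pack the first 8 bytes of a string into a uint64_t,
// most significant byte first, zero-padded on the right.
// ASSUMPTION: binary collation. A collation-aware version would encode the
// collation's sort key instead of the raw bytes.
inline uint64_t encodePrefix(const std::string& s)
{
  uint64_t v = 0;
  for (std::size_t i = 0; i < 8; ++i)
  {
    // Bytes are treated as unsigned so that values >= 0x80 sort last.
    uint64_t b = i < s.size() ? static_cast<unsigned char>(s[i]) : 0;
    v = (v << 8) | b;
  }
  return v;
}

// Hypothetical per-extent range of encoded prefixes.
struct PrefixRange
{
  uint64_t min = UINT64_MAX;  // empty range: min > max
  uint64_t max = 0;

  void add(const std::string& s)
  {
    const uint64_t p = encodePrefix(s);
    min = std::min(min, p);
    max = std::max(max, p);
    assert(min <= max);  // non-empty after the first value
  }
};

// Returns true when no string in the extent can equal `value`,
// so the extent can be skipped for a `col = value` filter.
inline bool canSkipForEquality(const PrefixRange& r, const std::string& value)
{
  const uint64_t p = encodePrefix(value);
  return p < r.min || p > r.max;
}
```

The encoding is monotone but not injective: strings that share their first
8 bytes map to the same integer, so a range check can only rule an extent
out, never confirm a match. Treating bytes as unsigned is what keeps the
integer order consistent with the byte order for values above 0x7F.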

The patch contains all the code but misses one important step: we do not
keep the collation index, we keep the charset index. Because of this,
some of the tests in the bugfix suite fail, and thus the main
functionality is turned off.

The reason this patch is put into a PR at all is that it contains the
changes that make CHAR/VARCHAR columns unsigned. This change is needed
for the vectorization work.
Serguey Zefirov
2022-02-04 11:55:09 +00:00
parent a66a8dfabf
commit 53b9a2a0f9
54 changed files with 698 additions and 227 deletions

@ -2820,16 +2820,24 @@ LBID_t ExtentMap::_createColumnExtent_DBroot(uint32_t size, int OID, uint32_t co
e->HWM = 0;
e->status = EXTENTUNAVAILABLE; // mark extent as in process
// Partition, segment, and blockOffset 0 represents new table or column.
// When DDL creates a table, we can mark the first extent as VALID, since
// the table has no data. Marking as VALID enables cpimport to update
// the CP min/max for the first import.
// If DDL is adding a column to an existing table, setting to VALID won't
// hurt, because DDL resets to INVALID after the extent is created.
if ((e->partitionNum == 0) && (e->segmentNum == 0) && (e->blockOffset == 0))
e->partition.cprange.isValid = CP_VALID;
else
e->partition.cprange.isValid = CP_INVALID;
#if 0 // XXX: sergueyz: I'll leave these under conditional flag for a while because it appears a huge change.
// Partition, segment, and blockOffset 0 represents new table or column.
// When DDL creates a table, we can mark the first extent as VALID, since
// the table has no data. Marking as VALID enables cpimport to update
// the CP min/max for the first import.
// If DDL is adding a column to an existing table, setting to VALID won't
// hurt, because DDL resets to INVALID after the extent is created.
// XXX: the comment above is out of date. bulk set of extents ranges
// works differently right now.
if ((e->partitionNum == 0) &&
(e->segmentNum == 0) &&
(e->blockOffset == 0))
e->partition.cprange.isValid = CP_VALID;
else
e->partition.cprange.isValid = CP_INVALID;
#else
e->partition.cprange.isValid = CP_INVALID;
#endif
partitionNum = e->partitionNum;
segmentNum = e->segmentNum;
@ -3029,16 +3037,22 @@ LBID_t ExtentMap::_createColumnExtentExactFile(uint32_t size, int OID, uint32_t
e->HWM = 0;
}
// Partition, segment, and blockOffset 0 represents new table or column.
// When DDL creates a table, we can mark the first extent as VALID, since
// the table has no data. Marking as VALID enables cpimport to update
// the CP min/max for the first import.
// If DDL is adding a column to an existing table, setting to VALID won't
// hurt, because DDL resets to INVALID after the extent is created.
if ((e->partitionNum == 0) && (e->segmentNum == 0) && (e->blockOffset == 0))
e->partition.cprange.isValid = CP_VALID;
else
e->partition.cprange.isValid = CP_INVALID;
#if 0 // XXX: sergueyz: I'll leave these under conditional flag for a while because it appears a huge change.
// Partition, segment, and blockOffset 0 represents new table or column.
// When DDL creates a table, we can mark the first extent as VALID, since
// the table has no data. Marking as VALID enables cpimport to update
// the CP min/max for the first import.
// If DDL is adding a column to an existing table, setting to VALID won't
// hurt, because DDL resets to INVALID after the extent is created.
if ((e->partitionNum == 0) &&
(e->segmentNum == 0) &&
(e->blockOffset == 0))
e->partition.cprange.isValid = CP_VALID;
else
e->partition.cprange.isValid = CP_INVALID;
#else
e->partition.cprange.isValid = CP_INVALID;
#endif
startBlockOffset = e->blockOffset;