* fix(rowgroup): RGData now uses a uint64_t counter for the fixed-size columns data buffer.
The buffer can now exceed 4GB of RAM, which is necessary for PM side joins.
Previously the RGData ctor used uint32_t when allocating the data buffer,
which caused an implicit heap overflow (see the sketch below).
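A minimal sketch of the failure mode, with made-up numbers (the row count, row size, and names are illustrative, not taken from RGData itself): computing the byte count in 64 bits but storing it in a uint32_t wraps the value, so far too little memory is allocated and later writes run off the end of the heap buffer.

```cpp
#include <cstdint>
#include <iostream>

int main()
{
    // Hypothetical sizes for a large small-side used in a PM join.
    uint64_t rowCount = 70'000'000;  // assumed row count
    uint64_t rowSize  = 64;          // assumed bytes per fixed-size row

    uint64_t correct  = rowCount * rowSize;                        // ~4.48 GB
    uint32_t narrowed = static_cast<uint32_t>(rowCount * rowSize); // wraps past 2^32

    std::cout << "uint64_t size: " << correct  << " bytes\n"; // 4480000000
    std::cout << "uint32_t size: " << narrowed << " bytes\n"; // ~185 MB after wrap-around
}
```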
* feat(bytestream,serdes): BS buffer size type is now uint64_t
This is necessary to handle 64-bit RGData, which comes as
a separate patch. Together, the pair of patches allows
PM joins when the SmallSide size exceeds 4GB.
* feat(bytestream,serdes): Propagate the BS buffer size data type change to avoid implicit data type narrowing
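A minimal sketch of the widened size type, using a simplified stand-in for ByteStream (the class and method names below are illustrative, not the actual serdes API): once both the append length and the reported length are uint64_t end to end, a 64-bit RGData size is carried through without implicit narrowing.

```cpp
#include <cstdint>
#include <vector>

// Illustrative only: if the length parameter were uint32_t, passing a 64-bit
// RGData size here would be silently truncated at the call site.
class ByteStreamSketch
{
 public:
  void append(const uint8_t* data, uint64_t len)  // was uint32_t before the change
  {
    buf_.insert(buf_.end(), data, data + len);
  }
  uint64_t length() const { return buf_.size(); }  // size type widened as well
 private:
  std::vector<uint8_t> buf_;
};
```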
* feat(rowgroup): this restores bits lost during a cherry-pick; the lost bits caused the first RGData::serialize call to crash the process
The large-side read errors mentioned there can be caused by a failure to
close the file stream properly. Some of the data may still reside in the
file stream buffers, and closing must flush it. The flush is an I/O
operation and can fail, leading to a partial write and a subsequent partial
read.
This patch tries to provide better diagnostics.
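An illustrative sketch of the diagnostic idea, assuming a plain std::ofstream rather than the actual file-stream wrapper used by the join code: the final flush happens at close time, so checking the stream state after close() and reporting strerror(errno) turns a silent partial write into a visible error.

```cpp
#include <cerrno>
#include <cstring>
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical helper: write a buffer and verify that the closing flush succeeded.
bool writeChunk(const std::string& path, const char* data, std::size_t len)
{
  std::ofstream out(path, std::ios::binary);
  out.write(data, static_cast<std::streamsize>(len));
  out.close();  // triggers the final flush of buffered data
  if (!out)     // flush or close failed -> the file on disk may be partial
  {
    std::cerr << "write to " << path << " failed: " << std::strerror(errno) << '\n';
    return false;
  }
  return true;
}
```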
This patch:
1. Handles the corner case where a bucket exceeds the memory limit but its data cannot be redistributed into new buckets by the hash algorithm because all of its rows have the same values (see the sketch after this list).
2. Adds a force option for the disk join step.
3. Adds an option to control the depth of the partition tree.
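A hypothetical sketch of the corner case in item 1 (names are illustrative, not the actual disk-join code): if every row in an over-limit bucket carries the same join-key hash, re-hashing puts them all into the same child bucket again, so redistribution cannot reduce the bucket and the code must stop splitting, which is also why bounding the partition-tree depth is useful.

```cpp
#include <cstdint>
#include <vector>

// Returns true only if re-bucketing by hash can actually spread the rows out.
bool canRedistribute(const std::vector<uint64_t>& rowHashes)
{
  for (std::size_t i = 1; i < rowHashes.size(); ++i)
    if (rowHashes[i] != rowHashes[0])
      return true;  // at least two distinct hashes -> splitting helps
  return false;     // identical hashes -> every row lands in the same child bucket
}
```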
* Adds CompressInterfaceLZ4, which uses the LZ4 API for compress/uncompress.
* Adds CMake machinery to search for LZ4 on the build host.
* All methods that use static data and do not modify any internal data become `static`,
so we can use them without creating a specific object. This is possible because
the header specification has not been modified: we still use 2 sections in the header, the first
one with file metadata, the second one with pointers to the compressed chunks.
* The methods `compress`, `uncompress`, `maxCompressedSize`, and `getUncompressedSize` become
pure virtual, so we can override them for other compression algos (see the sketch after this list).
* Adds the method `getChunkMagicNumber`, so we can verify the chunk magic number
for each compression algo.
* Renames IDBCompressInterface to CompressInterface ("s/IDBCompressInterface/CompressInterface/g") according to the requirement.
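A minimal sketch of the resulting interface shape, assuming simplified signatures (the actual ColumnStore declarations may differ); only the LZ4 C API calls are real, and the magic number is a placeholder:

```cpp
#include <lz4.h>
#include <cstddef>
#include <cstdint>

// Assumed shape of the refactored interface: the codec-specific methods are
// pure virtual so new compression algorithms can be plugged in.
class CompressInterface
{
 public:
  virtual ~CompressInterface() = default;
  virtual size_t maxCompressedSize(size_t srcLen) const = 0;
  virtual bool compress(const char* src, size_t srcLen, char* dst, size_t* dstLen) const = 0;
  virtual bool uncompress(const char* src, size_t srcLen, char* dst, size_t* dstLen) const = 0;
  virtual uint8_t getChunkMagicNumber() const = 0;
};

class CompressInterfaceLZ4 : public CompressInterface
{
 public:
  size_t maxCompressedSize(size_t srcLen) const override
  {
    return static_cast<size_t>(LZ4_compressBound(static_cast<int>(srcLen)));
  }
  bool compress(const char* src, size_t srcLen, char* dst, size_t* dstLen) const override
  {
    int n = LZ4_compress_default(src, dst, static_cast<int>(srcLen), static_cast<int>(*dstLen));
    if (n <= 0)
      return false;
    *dstLen = static_cast<size_t>(n);
    return true;
  }
  bool uncompress(const char* src, size_t srcLen, char* dst, size_t* dstLen) const override
  {
    int n = LZ4_decompress_safe(src, dst, static_cast<int>(srcLen), static_cast<int>(*dstLen));
    if (n < 0)
      return false;
    *dstLen = static_cast<size_t>(n);
    return true;
  }
  uint8_t getChunkMagicNumber() const override { return 0xFD; }  // placeholder value
};
```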
* Introduce multigeneration aggregation
* Do not save unused parts of RGDatas to disk
* Add an I/O error explanation (strerror)
* Reduce memory usage while aggregating
* Introduce in-memory generations for better memory utilization
* Try to keep the number of buckets low
* Refactor disk aggregation a bit
* Pass the calculated hash into RowAggregation
* Try to keep some RGDatas with free space in memory
* Do not dump more than half of the rowgroups to disk if generations are
allowed; instead, start a new generation (see the sketch after this list)
* For each thread, shift the first processed bucket at each iteration
so the generations start more evenly
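A rough sketch of the spill policy from the bullets above, with illustrative names (the real RowAggregation logic is more involved): when generations are allowed and a dump would cover more than half of the resident rowgroups, a new generation is started instead of writing to disk.

```cpp
#include <cstddef>

enum class SpillAction { DumpToDisk, StartNewGeneration };

// Hypothetical decision helper: avoid dumping more than half of the
// in-memory rowgroups when generations are available.
SpillAction chooseSpillAction(std::size_t rowGroupsInMemory,
                              std::size_t rowGroupsToDump,
                              bool generationsAllowed)
{
  if (generationsAllowed && rowGroupsToDump * 2 > rowGroupsInMemory)
    return SpillAction::StartNewGeneration;  // keep hot data, open a new generation
  return SpillAction::DumpToDisk;            // spill is small enough, write it out
}
```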
* Unify temp data location
* Explicitly create temp subdirectories,
whether or not disk aggregation/join is enabled
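An illustrative sketch only (the directory names and layout are assumptions, not the actual ones): creating the unified temp subdirectories up front means later code can rely on them existing, regardless of whether disk aggregation/join is enabled.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper: create all temp subdirectories under the unified root.
void createTempSubdirs(const std::string& tmpRoot)
{
  namespace fs = std::filesystem;
  fs::create_directories(fs::path(tmpRoot) / "aggregates");  // disk aggregation spill area
  fs::create_directories(fs::path(tmpRoot) / "joins");       // disk join spill area
}
```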