mariadb-columnstore-engine

mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-08-05 16:15:50 +03:00

Author	SHA1	Message	Date
Alexey Antipovsky	69fd36847d	[MCOL-5213] Fix a rare IO error	2022-09-14 17:09:56 +03:00
Roman Nozdrin	a33597b073	MCOL-5198 This patch enables RowStorage to dump data on disk using startNewGeneration if there is 50 Megs left free	2022-08-23 21:30:55 +00:00
Alexey Antipovsky	1be82f859b	Randomly start a new generation if the free memory is less than 30%	2022-08-23 17:59:52 +00:00
Roman Nozdrin	fd9fe182d5	MCOL-5199 This patch solves the overall performance degradation introduced with a new way of char columns hashing in aggregation code. The patch disables padding that forces the hasher to calculate over the whole 2k buffer. This patch also moves hashing code into the common place where it belongs.	2022-08-22 13:39:45 +00:00
Alexey Antipovsky	30429a7f6c	Fix excessive memory consumption at the last stage of aggregation	2022-08-18 13:58:23 +03:00
Roman Nozdrin	5f485f40ca	MCOL-5153 This patch replaces MDB collation aware hash function with the exact functionality that does not use MDB hash function (#2487). This patch also takes a previously forgotten bit from the Robin Hood hash map implementation that reduces the hash function collision rate.	2022-08-04 16:22:11 -05:00
Roman Nozdrin	eabca67c8d	MCOL-5153 This increases the size of the multiplier in the guarding check in RowAggStorage::increaseSize() so that it doesn't throw w/o a reason	2022-07-07 08:58:05 +00:00
Leonid Fedorov	7c808317dc	clang format apply	2022-02-11 12:24:40 +00:00
David.Hall	509f005be7	Mcol 4841 dev6 Handle large joins without OOM (#2155) * MCOL-4846 dev-6 Handle large join results Use a loop to shrink the number of results reported per message to something manageable. * MCOL-4841 small changes requested by review * Add EXTRA threads to prioritythreadpool prioritythreadpool is configured at startup with a fixed number of threads available. This is to prevent thread thrashing. Since most of the time BPP job steps are short lived, and a rescheduling mechanism exists if no threads are available, this works to keep cpu wastage to a minimum. However, if a query or queries consume all the threads in prioritythreadpool and then block (due to the consumer not consuming fast enough) we can run out of threads and no work will be done until some threads unblock. A new mechanism allows for EXTRA threads to be generated for the duration of the blocking action. These threads can act on new queries. When all blocking is completed, these threads will be released when idle. * MCOL-4841 dev6 Reconcile with changes in develop-6 * MCOL-4841 Some format corrections * MCOL-4841 dev clean up some things based on review * MCOL-4841 dev 6 ExeMgr Crashes after large join This commit fixes up memory accounting issues in ExeMgr * MCOL-4841 remove LDI change Opened MCOL-4968 to address the issue * MCOL-4841 Add fMaxBPPSendQueue to ResourceManager This causes the setting to be loaded at run time (requires restart to accept a change) BPPSendthread gets this in its ctor Also rolled back changes to TupleHashJoinStep::smallRunnerFcn() that used a local variable to count locally allocated memory, then added it into the global counter at function's end. Not counting the memory globally caused conversion to UM only join way later than it should. This resulted in MCOL-4971. * MCOL-4841 make blockedThreads and extraThreads atomic Also restore previous scope of locks in bppsendthread. There is some small chance the new scope could be incorrect, and the performance boost is negligible. Better safe than sorry.	2022-02-09 21:38:32 +03:00
Alexey Antipovsky	2328f4ef2a	[MCOL-4829] More accurate memory counting	2021-09-07 19:48:53 +03:00
Alexey Antipovsky	bf1640be65	[MCOL-4829] Compression for the temp disk-based aggregation files	2021-09-02 19:31:38 +03:00
Alexey Antipovsky	60495564b8	[MCOL-4709] Fix another UB in disk aggregation	2021-06-29 17:47:07 +03:00
Alexey Antipovsky	8a0b68f25e	[MCOL-4709] Fix UB in disk aggregation	2021-06-28 20:07:23 +03:00
Alexey Antipovsky	475104e4d3	[MCOL-4709] Disk-based aggregation * Introduce multigeneration aggregation * Do not save unused part of RGDatas to disk * Add IO error explanation (strerror) * Reduce memory usage while aggregating * introduce in-memory generations for better memory utilization * Try to limit the qty of buckets at a low limit * Refactor disk aggregation a bit * pass calculated hash into RowAggregation * try to keep some RGData with free space in memory * do not dump more than half of rowgroups to disk if generations are allowed, instead start a new generation * for each thread shift the first processed bucket at each iteration, so the generations start more evenly * Unify temp data location * Explicitly create temp subdirectories whether disk aggregation/join are enabled or not	2021-06-06 16:09:15 +03:00

14 Commits