mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-04-23 07:05:36 +03:00

17 Commits

Author SHA1 Message Date
Roman Nozdrin
dfc9e89496 fix(rowstorage): SplitMix64 PRNG implementation to replace stdlib MT PRNG that uses /dev/urandom guarded by spinlock 2023-11-01 18:19:45 +00:00
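
For orientation, here is a minimal sketch of a SplitMix64 generator using the widely published SplitMix64 constants. The class name, the method names, and the `nextBelow` helper are illustrative assumptions and are not taken from the ColumnStore sources. The generator keeps a single 64-bit word of state and needs no locking or system entropy when each thread owns its own instance.

```cpp
#include <cassert>
#include <cstdint>

// Minimal SplitMix64 sketch. Names are hypothetical, not the ColumnStore ones.
class SplitMix64
{
 public:
  explicit SplitMix64(uint64_t seed) : state_(seed) {}

  // Advance the state by a fixed odd constant, then scramble it with
  // two xor-shift-multiply rounds and a final xor-shift.
  uint64_t next()
  {
    uint64_t z = (state_ += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
  }

  // Value in [0, bound). Plain modulo is slightly biased, which is
  // acceptable for heuristics but not for statistical work.
  uint64_t nextBelow(uint64_t bound)
  {
    assert(bound != 0);
    return next() % bound;
  }

 private:
  uint64_t state_;
};
```

Two instances created with the same seed produce the same sequence, so a per-thread instance with a distinct seed is enough to avoid shared state.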
Roman Nozdrin
688b47d4e7 MCOL-5451 This resolves inconsistent results of external GROUP BY
Here idx is a Robin Hood (RH) hash map bucket number and info is the intra-bucket index.
    The root cause is that the idx/hash pair is calculated differently
    for a given GROUP BY generation and for the merging of generation
    aggregations that takes place in RowAggStorage::finalize.
    This patch generalizes rowHashToIdx so that it can be used in both
    cases mentioned above.
2023-03-25 15:04:16 +00:00
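
The idea in the commit above can be sketched as one helper that derives both idx and info from a row hash and takes the table parameters as an argument, so the insert path and the merge path cannot diverge. This is only an illustration: the struct names, field names, bit widths, and the exact bit layout are assumptions modeled on common Robin Hood hash map implementations, not the actual `rowHashToIdx` signature.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical parameters of one hash table generation.
struct TableParams
{
  uint64_t mask;           // bucket count - 1 (bucket count is a power of two)
  uint32_t infoInc;        // info value that encodes "distance 0"
  uint32_t infoHashShift;  // how far the low hash bits are shifted before use
};

struct Slot
{
  size_t idx;     // bucket number
  uint32_t info;  // intra-bucket info value
};

constexpr uint64_t kInfoBits = 5;  // assumed width of the hash part of info
constexpr uint64_t kInfoMask = (1ULL << kInfoBits) - 1;

// Single source of truth for the hash -> (idx, info) mapping. Passing the
// parameters explicitly lets the same code serve the current generation
// and any generation being merged.
inline Slot rowHashToIdx(uint64_t h, const TableParams& p)
{
  Slot s;
  s.info = p.infoInc + static_cast<uint32_t>((h & kInfoMask) >> p.infoHashShift);
  s.idx = static_cast<size_t>((h >> kInfoBits) & p.mask);
  return s;
}
```

Because the mapping depends only on the hash and the supplied parameters, calling it with the parameters of the generation being read gives the same slot that was computed when the row was first inserted.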
david.hall
8642231666 Changes to compile local 2022-11-17 11:29:21 -06:00
Alexey Antipovsky
440101dfff [MCOL-5213] Fix a rare IO error 2022-09-14 17:12:15 +03:00
Roman Nozdrin
72e264e8ef MCOL-5199 This patch fixes the overall performance degradation introduced by the new way of hashing char columns
in aggregation code
The patch disables the padding that forced the hasher to process the whole 2k buffer. This patch also moves the hashing code
into the common place where it belongs.
2022-08-24 19:07:06 +00:00
Roman Nozdrin
20f57b713a MCOL-5198 This patch enables RowStorage to dump data to disk
using startNewGeneration if only 50 MB is left free
2022-08-24 14:00:43 +00:00
Alexey Antipovsky
15ce531270 Randomly start a new generation if the free memory is less than 30% 2022-08-24 14:00:37 +00:00
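
The two commits above describe two triggers for starting a new generation: a fixed floor of 50 MB free, and a randomized start once free memory drops below 30%. A minimal sketch of such a decision function follows. The function name, its parameters, and the `percentChance` probability model are assumptions made for illustration; the commits do not state how the random choice is made.

```cpp
#include <cstdint>

// Hypothetical decision helper. The 50 MB and 30% thresholds come from the
// commit messages; everything else is an assumption.
inline bool shouldStartNewGeneration(uint64_t freeBytes, uint64_t totalBytes,
                                     uint64_t randomValue, uint32_t percentChance)
{
  constexpr uint64_t kHardFloor = 50ULL * 1024 * 1024;  // 50 MB

  // Below the fixed floor: always start a new generation.
  if (freeBytes <= kHardFloor)
    return true;

  // At or above 30% free: keep using the current generation.
  if (freeBytes >= totalBytes / 10 * 3)
    return false;

  // Between the two: start a new generation only some of the time, so that
  // concurrent threads do not all dump their data at the same moment.
  return (randomValue % 100) < percentChance;
}
```

The caller supplies `randomValue` from its own generator, which keeps the function deterministic and easy to test.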
Alexey Antipovsky
dca359c2ab Fix excessive memory consumption at the last stage of aggregation 2022-08-18 14:00:53 +03:00
Roman Nozdrin
dd96e686c0
MCOL-5153 This patch replaces the MDB collation-aware hash function with (#2488)
equivalent functionality that does not use the MDB hash function.
This patch also brings in a previously forgotten bit of the Robin Hood hash map
implementation that reduces the hash function collision rate.
2022-08-07 02:36:03 +03:00
Roman Nozdrin
6b17c358c0
MCOL-5153 This increases the size of the multiplier in the guarding check in RowAggStorage::increaseSize() so that it doesn't throw w/o a reason (#2463) 2022-07-22 10:19:36 -05:00
David Hall
27dea733c5 MCOL4841 dev port run large join without OOM 2022-02-09 17:33:55 -06:00
Leonid Fedorov
04752ec546 clang format apply 2022-01-21 16:43:49 +00:00
Alexey Antipovsky
6a4140394d [MCOL-4829] More accurate memory counting 2021-09-07 19:52:20 +03:00
Alexey Antipovsky
7fea3c988e [MCOL-4829] Compression for the temp disk-based aggregation files 2021-09-02 19:30:25 +03:00
Alexey Antipovsky
60495564b8 [MCOL-4709] Fix another UB in disk aggregation 2021-06-29 17:47:07 +03:00
Alexey Antipovsky
8a0b68f25e [MCOL-4709] Fix UB in disk aggregation 2021-06-28 20:07:23 +03:00
Alexey Antipovsky
475104e4d3 [MCOL-4709] Disk-based aggregation
* Introduce multigeneration aggregation

* Do not save unused part of RGDatas to disk
* Add IO error explanation (strerror)

* Reduce memory usage while aggregating
* introduce in-memory generations for better memory utilization

* Try to keep the number of buckets low

* Refactor disk aggregation a bit
* pass calculated hash into RowAggregation
* try to keep some RGData with free space in memory

* do not dump more than half of rowgroups to disk if generations are
  allowed, instead start a new generation
* for each thread shift the first processed bucket at each iteration,
  so the generations start more evenly

* Unify temp data location

* Explicitly create temp subdirectories
  whether disk aggregation/join are enabled or not
2021-06-06 16:09:15 +03:00
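
One item in the commit above says that each thread shifts the first processed bucket at each iteration so that generations start more evenly. A rotating start offset is one simple way to get that effect; the sketch below assumes this scheme. The function name and the `threadId + iteration` offset are illustrative assumptions, not the code from the commit.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the order in which a thread visits buckets
// on a given iteration. The starting bucket rotates with the thread id and
// the iteration number, so threads do not all begin with bucket 0.
inline std::vector<size_t> bucketOrder(size_t bucketCount, size_t threadId, size_t iteration)
{
  std::vector<size_t> order;
  if (bucketCount == 0)
    return order;

  order.reserve(bucketCount);
  const size_t first = (threadId + iteration) % bucketCount;
  for (size_t i = 0; i < bucketCount; ++i)
    order.push_back((first + i) % bucketCount);
  return order;
}
```

Every bucket is still visited exactly once per iteration; only the starting point changes.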