mariadb-columnstore-engine

mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-08-05 16:15:50 +03:00

Author	SHA1	Message	Date
Leonid Fedorov	aa7e0fb9b4	Deep build refactoring phase 1 (#3562) * configcpp refactored * logging and datatypes refactored * more dataconvert * chore(build): massive removals, auto add files to debian install file * chore(codemanagement): nodeps headers, potential library * chore(build): configure before autobake * chore(build): use custom cmake commands for components, mariadb-plugin-columnstore.install generated * chore(build): install deps as separate step for build-packages * more deps * check debian/mariadb-plugin-columnstore.install automatically * chore(build): add option for multibranch compilation * Fix warning	2025-05-30 14:05:21 +04:00
Leonid Fedorov	6db2dc668f	stubs and cmake formatting	2025-05-20 18:22:59 +04:00
Leonid Fedorov	2036e521c7	named linkage	2025-05-20 18:22:59 +04:00
Leonid Fedorov	a0bee173f6	chore(build): fixes to satisfy clang19 warnings	2025-05-15 19:05:38 +04:00
drrtuy	d4fe2e7a45	chore(): re-enabled memory accounting for RGData generated by PP::execute()	2025-05-02 10:11:40 +01:00
drrtuy	42417764d8	chore(): cleanup.	2025-05-02 10:11:40 +01:00
drrtuy	01cc73d416	fix(perf,allocator): test build with reduced CountingAllocator parameter values.	2025-05-02 10:11:40 +01:00
=	671b7301f3	fix(allocator,perf): performance degradation caused by lack of STLPoolAllocator replaced by CountingAllocator	2025-05-02 10:11:40 +01:00
drrtuy	a16fbd137b	fix(disk-based-join): this fixes multiple SEGV for disk-based join algo	2025-04-15 15:06:49 +01:00
drrtuy	bb94343080	fix(): allocate Pointer vector in both TupleJoiner ctor	2025-03-27 22:12:48 +00:00
drrtuy	4c1d9bceb7	feat(): Replacing STLPoolAllocator with CountingAllocator for in-memory joins	2025-03-27 22:12:48 +00:00
drrtuy	bb4cd40ca4	feat(): aggregating CountingAllocator	2025-03-27 22:12:48 +00:00
drrtuy	cf9b5eb32c	feat(): restore user-space mem allocator	2025-03-27 22:12:48 +00:00
drrtuy	1aa2f3a42b	feat(): TupleHashJoin now handles bad_alloc case switching to disk-based if it is enabled	2025-03-27 22:12:48 +00:00
drrtuy	f594d27685	feat(): accounts hash tables RAM allocations/removes STLPoolAllocator	2025-03-27 22:12:48 +00:00
drrtuy	90b4322470	feat(): propagated changes into STLPoolAllocator and friends	2025-03-27 22:12:48 +00:00
drrtuy	aa4bbc0152	feat(joblist,runtime): this is the first part of the execution model that produces a workload that can be predicted for a given query. * feat(joblist,runtime): this is the first part of the execution model that produces a workload that can be predicted for a given query. - forces to UM join converter to use a value from a configuration - replaces a constant used to control a number of outstanding requests with a value depends on column width - modifies related Columnstore.xml values	2024-12-03 22:18:21 +00:00
Alexey Antipovsky	11136b3545	fix(PrimProc): MCOL-5651 Add a workaround to avoid choosing an incorrect TupleHashJoinStep as a joiner [stable-23.10] (#3331) * fix(PrimProc): MCOL-5651 Add a workaround to avoid choosing an incorrect TupleHashJoinStep as a joiner	2024-11-08 12:51:25 +00:00
drrtuy	6f6e69815d	feat(bytestream,serdes): Distribute BS buf size data type change to avoid implicit data type narrowing	2024-11-08 16:28:51 +04:00
drrtuy	6757535b6e	fix(join, UM, perf): UM join is multi-threaded now (#3286) * chore: UM join is multi-threaded now * fix(UMjoin): replace TR1 maps with stdlib versions	2024-09-04 18:56:35 +04:00
Denis Khalikov	985cd94402	fix(join, disk-based): MCOL-5597: large side read errors (#3117) (#3225) The large side read errors mentioned there can be due to failure to close file stream properly. Some of the data may still reside in the file stream buffers, closing must flush it. The flush is an I/O operation and can fail, leading to partial write and subsequent partial read. This patch tries to provide better diagnostics. Co-authored-by: Sergey Zefirov <72864488+mariadb-SergeyZefirov@users.noreply.github.com>	2024-06-27 17:24:45 +04:00
Denis Khalikov	d6db3552c3	MCOL-5597 Rollback changes introduced for DJS. (#3224) This patch changes: 1. The number of buckets created on each split. 2. The heuristic which calculates the bucket size.	2024-06-27 17:22:11 +04:00
Denis Khalikov	2a66ae2ed1	MCOL-5514 Parallel disk join step.	2023-07-11 14:05:14 +03:00
Denis Khalikov	1f190a6e75	MCOL-5477 Disk join step improvement. This patch: 1. Handles corner case when the bucket exceeded the memory limit, but we cannot redistribute the data in this bucket into new buckets based on a hash algorithm, because the rows have the same values. 2. Adds force option for disk join step. 3. Adds an option to control the depth of the partition tree.	2023-06-23 18:40:15 +03:00
Roman Nozdrin	4fe9cd64a3	Revert "No boost condition (#2822)" (#2828) This reverts commit `f916e64927`.	2023-04-22 15:49:50 +03:00
Leonid Fedorov	f916e64927	No boost condition (#2822) This patch replaces boost primitives with stdlib counterparts.	2023-04-22 00:42:45 +03:00
Leonid Fedorov	c2d0fa24da	replace boost::shared_array<T> to std::shared_ptr<T[]>	2023-04-14 10:33:27 +00:00
Leonid Fedorov	a508b86091	remove boost/shared_array include	2023-04-14 09:42:50 +00:00
Leonid Fedorov	6c32c658d5	MCOL-5385: Delete RowGroup::setData and make Pointer ctor explicit (#2808) * Delete RowGroup::setData and make Pointer ctor explicit * some push_backs replaced with emplace_backs * Fixes of review notes	2023-04-13 03:55:30 +03:00
Leonid Fedorov	56f2346083	Remove windows ifdefs	2023-03-02 15:59:42 +00:00
Denis Khalikov	e09d24cb8d	[MCOL-5265] Change boost::shared_ptr to std::shared_ptr. This is an attempt to make some part of the code more stable. For some reason we can get a spurious nullptr for boost::shared_ptr which causes an assert and abort.	2022-11-14 18:53:53 +03:00
david.hall	3b6449842f	Merge branch 'develop' into MCOL-4841 # Conflicts: # exemgr/main.cpp # oam/etc/Columnstore.xml.singleserver # primitives/primproc/primproc.cpp	2022-06-09 10:07:26 -05:00
Leonid Fedorov	c25ae4f378	Use external boost 1.78	2022-05-02 18:23:37 +00:00
Roman Nozdrin	e174696351	MCOL-5001 This patch merges ExeMgr and PrimProc runtimes. EM and PP are the most resource-hungry runtimes. The merge makes it possible to control their cumulative resource consumption and thread allocation, and enables zero-copy data exchange b/w local EM and PP facilities.	2022-04-04 11:46:33 +00:00
David Hall	c6c36eb622	MCOL-5002 dev use largeRG when indexing by largeKeyColumns[]	2022-03-01 08:44:39 -06:00
David Hall	27dea733c5	MCOL4841 dev port run large join without OOM	2022-02-09 17:33:55 -06:00
Leonid Fedorov	04752ec546	clang format apply	2022-01-21 16:43:49 +00:00
Leonid Fedorov	01f3ceb437	replace header guards with #pragma once	2022-01-21 15:24:58 +00:00
Denis Khalikov	b382f681a1	[MCOL-4849] Parallelize the processing of the bytestream vector. This patch changes the logic of the `receiveMultiPrimitiveMessages` function in the following way: 1. We have only one aggregation thread which reads the data from Queue (which is populated by messages from BPPs). 2. Processing of the received `bytestream vector` could be in parallel depends on the type of `TupleBPS` operation (join, fe2, ...) and actual thread pool workload. The motivation is to eliminate some amount of context switches.	2021-11-04 13:28:22 +03:00
Roman Nozdrin	866dc25729	Merge pull request #1842 from denis0x0D/MCOL-987_LZ MCOL-987 LZ4 compression support.	2021-07-07 13:13:18 +03:00
Alexander Barkov	9794f24369	MCOL-4801 Replace Row methods getStringLength() and getStringPointer() to getConstString()	2021-07-06 21:15:32 +04:00
Denis Khalikov	cc1c3629c5	MCOL-987 Add LZ4 compression. * Adds CompressInterfaceLZ4 which uses LZ4 API for compress/uncompress. * Adds CMake machinery to search LZ4 on running host. * All methods which use static data and do not modify any internal data - become `static`, so we can use them without creation of the specific object. This is possible, because the header specification has not been modified. We still use 2 sections in header, first one with file meta data, the second one with pointers for compressed chunks. * Methods `compress`, `uncompress`, `maxCompressedSize`, `getUncompressedSize` - become pure virtual, so we can override them for the other compression algos. * Adds method `getChunkMagicNumber`, so we can verify chunk magic number for each compression algo. * Renames "s/IDBCompressInterface/CompressInterface/g" according to requirement.	2021-07-06 18:04:37 +03:00
Roman Nozdrin	bed0b7c6bc	MCOL-4173 This patch adds support for wide-DECIMAL INNER, OUTER, SEMI, functional JOINs based on top of TypelessData	2021-06-24 08:07:23 +00:00
Alexander Barkov	b3d6f62964	MCOL-4753 Performance problem in Typeless join	2021-06-10 09:26:26 +00:00
Alexey Antipovsky	475104e4d3	[MCOL-4709] Disk-based aggregation * Introduce multigeneration aggregation * Do not save unused part of RGDatas to disk * Add IO error explanation (strerror) * Reduce memory usage while aggregating * introduce in-memory generations to better memory utilization * Try to limit the qty of buckets at a low limit * Refactor disk aggregation a bit * pass calculated hash into RowAggregation * try to keep some RGData with free space in memory * do not dump more than half of rowgroups to disk if generations are allowed, instead start a new generation * for each thread shift the first processed bucket at each iteration, so the generations start more evenly * Unify temp data location * Explicitly create temp subdirectories whether disk aggregation/join are enabled or not	2021-06-06 16:09:15 +03:00
Alexander Barkov	284fc51bb7	MCOL-4726 Wrong result of WHERE char1_col='A'	2021-05-21 14:40:16 +04:00
Roman Nozdrin	13e160ec2b	MCOL-4470 Fix the crash in collation aware JOIN code	2020-12-24 14:28:09 +00:00
Alexander Barkov	a433c65575	A cleanup for MCOL-4064 Make JOIN collation aware After creating and populating tables with CHAR(5) case insensitive columns, in a set of consequent joins like: select * from t1, t2 where t1.c1=t2.c1; select * from t1, t2 where t1.c1=t2.c2; select * from t1, t2 where t1.c2=t2.c1; select * from t1, t2 where t1.c2=t2.c2; only the first join worked reliably case insensitively. Removing the remaining pieces of the code that used order_swap() to compare short CHAR columns, and using Charset::strnncollsp() instead. This fixes the issue.	2020-12-10 19:19:36 +04:00
Alexander Barkov	c6158eee31	Part#1 MCOL-4064 Make JOIN collation aware Making field1=field2 collation aware for long CHAR/VARCHAR.	2020-12-04 08:41:26 +04:00
Alexander Barkov	129d5b5a0f	MCOL-4174 Review/refactor frontend/connector code	2020-11-18 13:53:15 +00:00


89 Commits