1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-06-09 06:41:19 +03:00

78 Commits

Author SHA1 Message Date
drrtuy
d4fe2e7a45 chore(): re-enabled memory accounting for RGData generated by PP::execute() 2025-05-02 10:11:40 +01:00
drrtuy
f5494ad7a2 chore(): leftovers cleanup. 2025-05-02 10:11:40 +01:00
drrtuy
42417764d8 chore(): cleanup. 2025-05-02 10:11:40 +01:00
drrtuy
01cc73d416 fix(perf,allocator): test build with reduced CountingAllocator parameter values. 2025-05-02 10:11:40 +01:00
=
671b7301f3 fix(allocator,perf): performance degradation caused by lack of STLPoolAllocator replaced by CountingAllocator 2025-05-02 10:11:40 +01:00
drrtuy
3dfc8cd454 feat(): first cleanup 2025-03-27 22:12:48 +00:00
drrtuy
1aa2f3a42b feat(): TupleHashJoin now handles bad_alloc case switching to disk-based if it is enabled 2025-03-27 22:12:48 +00:00
drrtuy
90b4322470 feat(): propagated changes into SLTPoolAllocator and friends 2025-03-27 22:12:48 +00:00
drrtuy
a6de8ec1ac feat(): dangling pointer/ref issue has been solved for both RGData and BS 2025-03-27 22:12:48 +00:00
drrtuy
5f1bd3be12 feat(RGData,StringStore): add counting allocator capabilities to those ctors used in BPP::execute() 2025-03-27 22:12:48 +00:00
drrtuy
02b8ea1331 feat(PP,ByteStream): new counting memory allocator 2025-03-27 22:12:48 +00:00
drrtuy
aa4bbc0152
feat(joblist,runtime): this is the first part of the execution model that produces a workload that can be predicted for a given query.
* feat(joblist,runtime): this is the first part of the execution model that produces a workload that can be predicted for a given query.
  - forces to UM join converter to use a value from a configuration
  - replaces a constant used to control a number of outstanding requests with a value depends on column width
  - modifies related Columnstore.xml values
2024-12-03 22:18:21 +00:00
Alexey Antipovsky
11136b3545
fix(PrimProc): MCOL-5651 Add a workaround to avoid choosing an incorrect TupleHashJoinStep as a joiner [stable-23.10] (#3331)
* fix(PrimProc): MCOL-5651 Add a workaround to avoid choosing an incorrect TupleHashJoinStep as a joiner
2024-11-08 12:51:25 +00:00
Leonid Fedorov
9a9b5f8036
fix(join,threadpool): MCOL-5565: MCOL-5636: MCOL-5645: port from develop-23.02 to [stable-23.10] (#3127)
* fix(threadpool): MCOL-5565 queries stuck in FairThreadScheduler. (#3100)

Meta Primitive Jobs, .e.g ADD_JOINER, LAST_JOINER stuck
	in Fair scheduler without out-of-band scheduler. Add OOB
	scheduler back to remedy the issue.

* fix(messageqcpp): MCOL-5636 same node communication crashes transmiting PP errors to EM b/c error messaging leveraged socket that was a nullptr. (#3106)

* fix(threadpool): MCOL-5645 errenous threadpool Job ctor implictly sets socket shared_ptr to nullptr causing sigabrt when threadpool returns an error (#3125)

---------

Co-authored-by: drrtuy <roman.nozdrin@mariadb.com>
2024-02-14 14:56:07 +03:00
Sergey Zefirov
920607520c
feat(runtime)!: MCOL-678 A "GROUP BY ... WITH ROLLUP" support
Adds a special column which helps to differentiate data and rollups of
various depts and a simple logic to row aggregation to add processing of
subtotals.
2023-09-26 17:01:53 +03:00
Denis Khalikov
896e8dd769
MCOL-5522 Properly process pm join result count. (#2909)
This patch:
1. Properly processes situation when pm join result count is exceeded.
2. Adds session variable 'columnstore_max_pm_join_result_count` to control the limit.
2023-08-04 16:55:45 +03:00
Denis Khalikov
a079834288 MCOL-5522 Temporary disable join match limit on BPP
Current join pipeline is not designed to send a temporal result and
continue execution. RowGroup for large side is the same where we store
the matched rows, we cannot continue to iterate over it, requires a
proper refactoring.
2023-07-09 17:56:44 +03:00
Roman Nozdrin
62dc392476
MCOL-5499 Enable ControlFlow for same node communication processing path to avoid DEC queue overloading (#2848) 2023-06-07 15:41:59 +03:00
Roman Nozdrin
4fe9cd64a3
Revert "No boost condition (#2822)" (#2828)
This reverts commit f916e64927cd81569327014f20c4cc0b8aca40ff.
2023-04-22 15:49:50 +03:00
Leonid Fedorov
f916e64927
No boost condition (#2822)
This patch replaces boost primitives with stdlib counterparts.
2023-04-22 00:42:45 +03:00
Leonid Fedorov
c2d0fa24da replace boost::shared_array<T> to std::shared_ptr<T[]> 2023-04-14 10:33:27 +00:00
Leonid Fedorov
6c32c658d5
MCOL-5385: Delete RowGroup::setData and make Pointer ctor explicit (#2808)
* Delete RowGroup::setData and make Pointer ctor explicit

* some push_backs replaced with emplace_backs

* Fixes of review notes
2023-04-13 03:55:30 +03:00
Leonid Fedorov
2e1394149b
MCOL-5464: Fixes of bugs from ASAN warnings, part one (#2792)
* Fixes of bugs from ASAN warnings, part one

* MQC as static library, with nifty counter for global map and mutex

* Switch clang to 16

* link messageqcpp to execplan
2023-04-04 02:33:23 +03:00
Sergey Zefirov
b53c231ca6 MCOL-271 empty strings should not be NULLs (#2794)
This patch improves handling of NULLs in textual fields in ColumnStore.
Previously empty strings were considered NULLs and it could be a problem
if data scheme allows for empty strings. It was also one of major
reasons of behavior difference between ColumnStore and other engines in
MariaDB family.

Also, this patch fixes some other bugs and incorrect behavior, for
example, incorrect comparison for "column <= ''" which evaluates to
constant True for all purposes before this patch.
2023-03-30 21:18:29 +03:00
Gagan Goel
2280b1dd25 MCOL-5021 Add support for the AUX column in ExeMgr and PrimProc.
In the joblist code, in addition to sending the lbid of the SCAN
column, we also send the corresponding lbid of the AUX column to PrimProc.

In the primitives processor code in PrimProc, we load the AUX column
block (8192 rows since the AUX column is implemented as a 1-byte
UNSIGNED TINYINT) into memory and then pass it down to the low-level
scanning (vectorized scanning as applicable) routine to build a non-Empty
mask for the block being processed to filter out DELETED rows based on
comparison of the AUX block row to the empty magic value for the AUX column.
2022-08-05 14:40:49 -04:00
Roman Nozdrin
a9d8924683 MCOL-5166 This patch adds support for in-memory communication b/w EM to PP via a shared queue in DEC class
JobList low-level code relateod to primitive jobs now uses shared pointers instead of ByteStream refs talking to DEC
b/c same-node EM-PP communication now goes over a queue in DEC instead of a network hop.
PP now has a separate thread that processes the primitive job messages from that DEC queue.
2022-08-04 18:51:31 +03:00
Roman Nozdrin
6cff14997d Revert "This reverts MCOL-5044 AKA FairThreadPool that breaks regr test002"
This reverts commit 61359119ad3e7002c1855fd1bb433f0d2795d1e4.
2022-07-09 12:38:51 +00:00
Roman Nozdrin
1624c347f6 MCOL-5152 This patch enables PP to put ByteStreams into DEC input queue directly for a local PP-EM connection 2022-07-04 09:06:40 +00:00
david.hall
6d47529499 Merge branch 'develop' into MCOL-4841 2022-06-14 14:41:41 -05:00
Roman Nozdrin
61359119ad This reverts MCOL-5044 AKA FairThreadPool that breaks regr test002
This reverts commit e40c16bd561086e1fc085f0365948b6d44cd8aab, reversing
changes made to 18e6b1d77bdd47877fbb798140b739d19f063cb2.
2022-06-10 14:17:59 +00:00
David.Hall
272246e9fa
Merge branch 'develop' into MCOL-4841 2022-06-09 16:58:33 -05:00
david.hall
3b6449842f Merge branch 'develop' into MCOL-4841
# Conflicts:
#	exemgr/main.cpp
#	oam/etc/Columnstore.xml.singleserver
#	primitives/primproc/primproc.cpp
2022-06-09 10:07:26 -05:00
Roman Nozdrin
fd8ba33f21 MCOL-5044 This patch replaces PriorityThreadPool with FairThreadPool that uses a simple
operations + morsel size weight model to equally allocate CPU b/w parallel query morsels.
This patch delivers better parallel query timings distribution(timings graph resembles normal
distribution with a bigger left side thus more queries runs faster comparing with PrioThreadPool-based
single-node installation).
See changes in batchprimitiveprocessor-jl.h and comments in fair_threadpool.h for
important implementation details
2022-06-03 10:08:12 +00:00
Serguey Zefirov
53b9a2a0f9 MCOL-4580 extent elimination for dictionary-based text/varchar types
The idea is relatively simple - encode prefixes of collated strings as
integers and use them to compute extents' ranges. Then we can eliminate
extents with strings.

The actual patch does have all the code there but miss one important
step: we do not keep collation index, we keep charset index. Because of
this, some of the tests in the bugfix suite fail and thus main
functionality is turned off.

The reason of this patch to be put into PR at all is that it contains
changes that made CHAR/VARCHAR columns unsigned. This change is needed in
vectorization work.
2022-03-02 23:53:39 +03:00
Leonid Fedorov
3919c541ac
New warnfixes (#2254)
* Fix clang warnings

* Remove vim tab guides

* initialize variables

* 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length

* Fix ISO C++17 does not allow 'register' storage class specifier for outdated bison

* chars are unsigned on ARM, having  if (ival < 0) always false

* chars are unsigned by default on ARM and comparison with -1 if always true
2022-02-17 13:08:58 +03:00
David Hall
27dea733c5 MCOL4841 dev port run large join without OOM 2022-02-09 17:33:55 -06:00
Leonid Fedorov
04752ec546 clang format apply 2022-01-21 16:43:49 +00:00
Roman Nozdrin
3de038c1da MCOL-4876 This patch enables continues buffer to be used by ColumnCommand and aligns BPP::blockData
that in most cases was unaligned
2021-10-06 09:23:40 +00:00
Roman Nozdrin
fb5ba84212 MCOL-4802 Removed ByteStream methods for bool manipulations and add some logging into I_S.columnstore_files 2021-07-07 07:16:30 +00:00
Roman Nozdrin
bed0b7c6bc MCOL-4173 This patch adds support for wide-DECIMAL INNER, OUTER, SEMI, functional JOINs
based on top of TypelessData
2021-06-24 08:07:23 +00:00
Alexander Barkov
b3d6f62964 MCOL-4753 Performance problem in Typeless join 2021-06-10 09:26:26 +00:00
Roman Nozdrin
757f8d00a5 A plugable PoorManProfiler singleton 2021-04-14 10:54:46 +00:00
Roman Nozdrin
895cbbe2d1 This patch revives PP poorman's profiling using StopWatch class 2021-04-08 12:10:06 +00:00
Gagan Goel
a91fb15b07 Add PrimProc support for selective block loading for 16-byte columns. 2020-12-11 14:23:45 -05:00
Alexander Barkov
c6158eee31 Part#1 MCOL-4064 Make JOIN collation aware
Making field1=field2 collation aware for long CHAR/VARCHAR.
2020-12-04 08:41:26 +04:00
Roman Nozdrin
1588ebe439 MCOL-641 Clean up primitives code
Add int128_t support into ByteStream

Fixed UTs broken after collation patch
2020-11-18 13:52:19 +00:00
Gagan Goel
d3bc68b02f MCOL-641 Refactor initial extent elimination support.
This commit also adds support in TupleHashJoinStep::forwardCPData,
although we currently do not support wide decimals as join keys.

Row estimation to determine large-side of the join is also updated.
2020-11-18 13:52:19 +00:00
Gagan Goel
62d0c82d75 MCOL-641 1. Templatized convertValueNum() function.
2. Allocate int128_t buffers in batchprimitiveprocessor if
a query involves wide decimal columns.
2020-11-18 13:47:44 +00:00
Gagan Goel
55afcd8890 MCOL-641 Basic extent elimination support for Decimal38. 2020-11-18 13:47:01 +00:00
drrtuy
0ff0472842 MCOL-641 sum() now works with DECIMAL(38) columns.
TupleAggregateStep class method and buildAggregateColumn() now properly set result data type.

doSum() now handles DECIMAL(38) in approprate manner.

Low-level null related methods for new binary-based datatypes now handles magic values for
binary-based DT.
2020-11-18 13:47:01 +00:00