1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-08-05 16:15:50 +03:00
Commit Graph

47 Commits

Author SHA1 Message Date
Sergey Zefirov
920607520c feat(runtime)!: MCOL-678 A "GROUP BY ... WITH ROLLUP" support
Adds a special column which helps to differentiate data and rollups of
various depts and a simple logic to row aggregation to add processing of
subtotals.
2023-09-26 17:01:53 +03:00
Denis Khalikov
896e8dd769 MCOL-5522 Properly process pm join result count. (#2909)
This patch:
1. Properly processes situation when pm join result count is exceeded.
2. Adds session variable 'columnstore_max_pm_join_result_count` to control the limit.
2023-08-04 16:55:45 +03:00
Denis Khalikov
024e6bd358 MCOL-5512 Fix for post join filter.
This patch fixes certain situations where post join filter is not applying.
2023-06-09 11:15:05 +03:00
Leonid Fedorov
8f93fc3623 MCOL-5493: First portion of UBSan fixes (#2842)
Multiple UB fixes
2023-06-02 17:02:09 +03:00
Roman Nozdrin
4fe9cd64a3 Revert "No boost condition (#2822)" (#2828)
This reverts commit f916e64927.
2023-04-22 15:49:50 +03:00
Leonid Fedorov
f916e64927 No boost condition (#2822)
This patch replaces boost primitives with stdlib counterparts.
2023-04-22 00:42:45 +03:00
Leonid Fedorov
c2d0fa24da replace boost::shared_array<T> to std::shared_ptr<T[]> 2023-04-14 10:33:27 +00:00
Leonid Fedorov
6c32c658d5 MCOL-5385: Delete RowGroup::setData and make Pointer ctor explicit (#2808)
* Delete RowGroup::setData and make Pointer ctor explicit

* some push_backs replaced with emplace_backs

* Fixes of review notes
2023-04-13 03:55:30 +03:00
Leonid Fedorov
56f2346083 Remove windows ifdefs 2023-03-02 15:59:42 +00:00
Denis Khalikov
e09d24cb8d [MCOL-5265] Change boost:shared_ptr to std::shared_ptr.
This is attempt to make some part of the code more stable.
For some reason we can get a spurious nullptr for boost::shared_ptr
which cause an assert and abort.
2022-11-14 18:53:53 +03:00
Gagan Goel
cbfdae3481 MCOL-5021 Code changes based on review feedback. 2022-08-05 14:40:50 -04:00
Gagan Goel
9b6d3c3870 MCOL-5021 Add support for AUX column in the client code calling
CalpontSystemCatalog::columnRIDs().
2022-08-05 14:40:49 -04:00
Gagan Goel
262cd5c501 MCOL-5021 Remove hard-coded values for data type, column width
and compression type for the AUX column, and replace them with
constants defined in the execplan namespace.
2022-08-05 14:40:49 -04:00
Gagan Goel
2280b1dd25 MCOL-5021 Add support for the AUX column in ExeMgr and PrimProc.
In the joblist code, in addition to sending the lbid of the SCAN
column, we also send the corresponding lbid of the AUX column to PrimProc.

In the primitives processor code in PrimProc, we load the AUX column
block (8192 rows since the AUX column is implemented as a 1-byte
UNSIGNED TINYINT) into memory and then pass it down to the low-level
scanning (vectorized scanning as applicable) routine to build a non-Empty
mask for the block being processed to filter out DELETED rows based on
comparison of the AUX block row to the empty magic value for the AUX column.
2022-08-05 14:40:49 -04:00
Roman Nozdrin
a9d8924683 MCOL-5166 This patch adds support for in-memory communication b/w EM to PP via a shared queue in DEC class
JobList low-level code relateod to primitive jobs now uses shared pointers instead of ByteStream refs talking to DEC
b/c same-node EM-PP communication now goes over a queue in DEC instead of a network hop.
PP now has a separate thread that processes the primitive job messages from that DEC queue.
2022-08-04 18:51:31 +03:00
Roman Nozdrin
1624c347f6 MCOL-5152 This patch enables PP to put ByteStreams into DEC input queue directly for a local PP-EM connection 2022-07-04 09:06:40 +00:00
david.hall
3b6449842f Merge branch 'develop' into MCOL-4841
# Conflicts:
#	exemgr/main.cpp
#	oam/etc/Columnstore.xml.singleserver
#	primitives/primproc/primproc.cpp
2022-06-09 10:07:26 -05:00
Serguey Zefirov
53b9a2a0f9 MCOL-4580 extent elimination for dictionary-based text/varchar types
The idea is relatively simple - encode prefixes of collated strings as
integers and use them to compute extents' ranges. Then we can eliminate
extents with strings.

The actual patch does have all the code there but miss one important
step: we do not keep collation index, we keep charset index. Because of
this, some of the tests in the bugfix suite fail and thus main
functionality is turned off.

The reason of this patch to be put into PR at all is that it contains
changes that made CHAR/VARCHAR columns unsigned. This change is needed in
vectorization work.
2022-03-02 23:53:39 +03:00
Leonid Fedorov
3919c541ac New warnfixes (#2254)
* Fix clang warnings

* Remove vim tab guides

* initialize variables

* 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length

* Fix ISO C++17 does not allow 'register' storage class specifier for outdated bison

* chars are unsigned on ARM, having  if (ival < 0) always false

* chars are unsigned by default on ARM and comparison with -1 if always true
2022-02-17 13:08:58 +03:00
David Hall
27dea733c5 MCOL4841 dev port run large join without OOM 2022-02-09 17:33:55 -06:00
Leonid Fedorov
04752ec546 clang format apply 2022-01-21 16:43:49 +00:00
Denis Khalikov
f8bd566b0f [MCOL-4849] Fix build warnings. 2021-11-17 17:33:22 +03:00
Denis Khalikov
b382f681a1 [MCOL-4849] Parallelize the processing of the bytestream vector.
This patch changes the logic of the `receiveMultiPrimitiveMessages`
function in the following way:

1. We have only one aggregation thread which reads the data from Queue (which is populated
by messages from BPPs).
2. Processing of the received `bytestream vector` could be in parallel depends on the
type of `TupleBPS` operation (join, fe2, ...) and actual thread pool workload.

The motivation is to eliminate some amount of context switches.
2021-11-04 13:28:22 +03:00
Gagan Goel
b3a560300c Revert "Merge pull request #2022 from mariadb-corporation/bar-develop-MCOL-4791"
This reverts commit 4016e25e5b, reversing
changes made to 85435f6b1e.
2021-07-13 11:06:56 +00:00
Gagan Goel
8520f87237 MCOL-641 Cleanup. 2021-07-06 09:01:49 +00:00
Alexander Barkov
e8126bede5 MCOL-4791 Fix ColumnCommand fudged data type format to clearly identify CHAR vs VARCHAR 2021-07-02 12:42:03 +04:00
Alexey Antipovsky
0dedb7e628 Fix compilation warnings 2021-06-09 16:51:00 +03:00
Denis Khalikov
606194e6e4 MCOL-4685: Eliminate some irrelevant settings (uncompressed data and extents per file).
This patch:
1. Removes the option to declare uncompressed columns (set columnstore_compression_type = 0).
2. Ignores [COMMENT '[compression=0] option at table or column level (no error messages, just disregard).
3. Removes the option to set more than 2 extents per file (ExtentsPreSegmentFile).
4. Updates rebuildEM tool to support up to 10 dictionary extent per dictionary segment file.
5. Adds check for `DBRootStorageType` for rebuildEM tool.
6. Renamed rebuildEM to mcsRebuildEM.
2021-06-03 14:44:33 +03:00
Alexander Barkov
a433c65575 A cleanup for MCOL-4064 Make JOIN collation aware
After creating and populating tables with CHAR(5) case insensitive columns,
in a set of consequent joins like:

select * from t1, t2 where t1.c1=t2.c1;
select * from t1, t2 where t1.c1=t2.c2;
select * from t1, t2 where t1.c2=t2.c1;
select * from t1, t2 where t1.c2=t2.c2;

only the first join worked reliably case insensitively.

Removing the remaining pieces of the code that used order_swap() to compare
short CHAR columns, and using Charset::strnncollsp() instead.
This fixes the issue.
2020-12-10 19:19:36 +04:00
Roman Nozdrin
aa44bca473 A pack of fixes for compilation errors and warnings for all platforms
Add libdatatypes.so into debian packaging
2020-11-19 10:21:45 +00:00
Alexander Barkov
d5c6645ba1 Adding mcs_basic_types.h
For now it consists of only:

using int128_t = __int128;
using uint128_t = unsigned __int128;

All new privitive data types should go into this file in the future.
2020-11-18 13:53:15 +00:00
Alexander Barkov
129d5b5a0f MCOL-4174 Review/refactor frontend/connector code 2020-11-18 13:53:15 +00:00
Gagan Goel
62c1c1e0e2 Remove hi_val/lo_val data members from EMCasualPartition_struct
and use the union members instead.
2020-11-18 13:52:19 +00:00
Gagan Goel
d3bc68b02f MCOL-641 Refactor initial extent elimination support.
This commit also adds support in TupleHashJoinStep::forwardCPData,
although we currently do not support wide decimals as join keys.

Row estimation to determine large-side of the join is also updated.
2020-11-18 13:52:19 +00:00
Gagan Goel
55afcd8890 MCOL-641 Basic extent elimination support for Decimal38. 2020-11-18 13:47:01 +00:00
Roman Nozdrin
cd48df99e5 MCOL-4368 Unified exceptions handling code in dbcon/joblist 2020-10-21 18:17:32 +00:00
David Hall
78ac310e42 MCOL-3536 Collation 2020-06-01 15:08:15 -05:00
Sergei Golubchik
586391e1ca compilation failure
error: reference to 'mutex' is ambiguous
note: candidates are: 'class boost::mutex'
note:                 'class std::mutex'
2019-12-19 18:13:39 +01:00
Roman Nozdrin
7b5e5f0eb6 MCOL-894 Upmerged the fist part of the patch into develop.
MCOL-894 Add default values in Compare and CSEP ctors to activate UTF-8 sorting
    properly.

MCOL-894 Unit tests to build a framework for a new parallel sorting.

MCOL-894 Finished with parallel workers invocation.
     The implementation lacks final aggregation step.

MCOL-894 TupleAnnexStep's init and destructor are now parallel execution aware.

    Implemented final merging step for parallel execution finalizeParallelOrderBy().

    Templated unit test to use it with arbitrary number of rows, threads.

Reuse LimitedOrderBy in the final step

MCOL-894 Cleaned up finalizeParallelOrderBy.

MCOL-894 Add and propagate thread variable that controls a number of threads.

    Optimized comparators used for sorting and add corresponding UTs.

    Refactored TupleAnnexStep::finalizeParallelOrderByDistinct.

    Parallel sorting methods now preallocates memory in batches.

MCOL-894 Fixed comparator for StringCompare.
2019-11-05 15:23:43 +03:00
Roman Nozdrin
a0b3424603 MCOL-2244 Columnstore execution threads now have names describe
the threads operation. This should simplify CPU bottlenecks troubleshooting.
2019-03-15 14:34:01 +03:00
Andrew Hutchings
01446d1e22 Reformat all code to coding standard 2017-10-26 17:18:17 +01:00
David Hall
55d006de1a MCOL-513 use thread pool for jobsteps 2017-02-03 15:25:21 -06:00
Andrew Hutchings
ffcfc41563 MCOL-507 Further ExeMgr performance improvements
This does the following:

* Switch resource manager to a singleton which reduces the amount of
times the XML data is scanned and objects allocated.
* Make the I_S tables use the FE implementation of the system catalog
* Make the I_S.columnstore_columns table use the RID list cache
* Make the extentmap pre-allocate a vector instead of many small allocs
2017-01-16 12:33:27 +00:00
David Hall
482047679a MCOL-259 add some retry logic to the OAMCache system. Add that degraded is still valid for a PM. 2016-08-23 16:51:16 -05:00
David Hall
b9bbb67549 MCOL-259. Reload the Columnstore.xml if ERR_DATA_OFFLINE would be thrown. If still broke, throw anyway. 2016-08-11 15:35:19 -05:00
David Hall
b57af447a4 MCOL-5 Correct the issue of double-unlocking the mutex. It was supposed to be a lock, not unlock 2016-08-08 16:36:53 -05:00
david hill
f6afc42dd0 the begginning 2016-01-06 14:08:59 -06:00