mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-07-29 08:21:15 +03:00
Commit Graph

500 Commits

Author SHA1 Message Date
fcd46ab00a fix(DEC): MCOL-5637 Initialize a new bytestream before write to PS (#3118) 2024-02-09 22:27:14 +03:00
74c1a38f2c fix(disk-based-join): MCOL-5626 Fix for race in DJS with outer join. (#3064) 2023-12-15 11:20:27 +03:00
9119f6f7b8 fix(aggregate): MCOL-5467 Add support for duplicate expressions in group by. (#3045)
This patch adds support for duplicate expressions (builtin_functions) with
one argument in select statement and group by statement.
2023-12-05 15:04:53 +03:00
8632c85ecf feat(primproc,aggregation)!: Changes for ROLLUP with single-phase aggregation (#3025)
The fix is simple: enable subtotals in single-phase aggregation and
disable parallel processing when there are subtotals and aggregation is
single-phase.
2023-11-28 17:33:02 +03:00
69b8e1c779 feat(extent-elimination)!: re-enable extent-elimination for dictionary columns scanning
This is the "productization" of old code that enables extent
elimination for dictionary columns.

This particular patch enables it, fixes the performance degradation (the
main problem with the old code), and also fixes incorrect behavior of cpimport.
2023-11-17 17:14:35 +03:00
320df831c6 MCOL-5572 Force the charset on the autoincrement column of (#2976)
calpontsys.syscolumn syscat table to be latin1.

This change is done in one of the ctors of pColStep which is
initiated while building the job list from the execution plan.
2023-09-28 22:03:39 +03:00
920607520c feat(runtime)!: MCOL-678 A "GROUP BY ... WITH ROLLUP" support
Adds a special column that helps to differentiate data from rollups of
various depths, and adds simple logic to row aggregation to process
subtotals.
2023-09-26 17:01:53 +03:00
8171e9da07 Fix rocky-8 vanilla compiler build (#2959)
Co-authored-by: Leonid Fedorov <leonid.fedorov@mariad.com>
2023-09-20 04:04:08 +03:00
add3a57e8d MCOL-5539 Put table on small side if it was involved in prev.join. (#2945) 2023-09-05 12:19:43 +03:00
896e8dd769 MCOL-5522 Properly process pm join result count. (#2909)
This patch:
1. Properly processes the situation when the pm join result count is exceeded.
2. Adds the session variable 'columnstore_max_pm_join_result_count' to control the limit.
2023-08-04 16:55:45 +03:00
2a66ae2ed1 MCOL-5514 Parallel disk join step. 2023-07-11 14:05:14 +03:00
a8be4a3787 compiler warnings
For example:

dbcon/joblist/batchprimitiveprocessor-jl.cpp:893:54: error: pointer used after ‘void operator delete [](void*, std::size_t)’ [-Werror=use-after-free]
  893 |           joinResults.reset(new vector<uint32_t>[8192]);
      |                                                      ^
2023-07-04 12:58:18 -04:00
2aba28d855 Merge pull request #2851 from denis0x0D/MCOL-5477
MCOL-5477 Disk join step improvement.
2023-06-26 11:02:20 +03:00
1f190a6e75 MCOL-5477 Disk join step improvement.
This patch:
1. Handles the corner case where a bucket exceeds the memory limit but its data cannot be redistributed into new buckets by the hash algorithm, because the rows have the same values.
2. Adds a force option for the disk join step.
3. Adds an option to control the depth of the partition tree.
2023-06-23 18:40:15 +03:00
024e6bd358 MCOL-5512 Fix for post join filter.
This patch fixes certain situations where the post join filter was not applied.
2023-06-09 11:15:05 +03:00
62dc392476 MCOL-5499 Enable ControlFlow for same node communication processing path to avoid DEC queue overloading (#2848) 2023-06-07 15:41:59 +03:00
8f93fc3623 MCOL-5493: First portion of UBSan fixes (#2842)
Multiple UB fixes
2023-06-02 17:02:09 +03:00
87eb875379 MCOL-5491 Enable StringStore for long strings in JSON_ARRAYAGG processing.
This patch is the JSON_ARRAYAGG clone of the changes done in MCOL-5429
where we enabled usage of StringStore for long strings in
GROUP_CONCAT() processing to reduce memory footprint of PrimProc and
thus avoiding a potential OS triggered OOM crash.
2023-05-12 19:45:02 +00:00
0be1c3dc8f MCOL-5429 Fix high memory consumption in GROUP_CONCAT() processing.
1. Input and output RowGroups used in GROUP_CONCAT classes
are currently allocating a raw memory buffer of size equal
to the actual width of the string datatype. As an example,
for the following query:
  SELECT col1, GROUP_CONCAT(col2) FROM t GROUP BY col1;
If col2 is a TEXT field with default width, the input
RowGroup containing the target rows to be concatenated will
assign 64kb of memory for every input row in the RowGroup.
This is wasteful as actual field values in real workloads
would be much smaller. We fix this by enabling the
RowGroup to use the StringStore when the RowGroup contains
long strings.

2. RowAggregation::initialize() allocates a memory buffer
for a NULL row. The size of this buffer is equal to the
row size for the output RowGroup. For the above scenario,
using the default group_concat_max_len (which is a server
variable that sets the maximum length of the GROUP_CONCAT string)
value of 1mb, the buffer size would be
(1mb + 64kb + some additional metadata). If the user sets
group_concat_max_len to a higher value, say 3gb, this buffer
size would be ~3gb. Now if the runtime creates several
instances of RowAggregation, total memory consumption by
PrimProc could exceed the hardware memory limits, causing the
OS OOM killer to kill the process. We fix this problem by again
enabling the StringStore for the NULL row allocation.

3. In the plugin code in buildAggregateColumn(), there is
an integer overflow when the server group_concat_max_len
variable (which is a uint32_t) is set to a value > INT32_MAX
(such as 3gb) and is assigned to
CalpontSystemCatalog::ColType::colWidth (which is an int32_t).
As a short-term fix, we saturate the value assigned to colWidth
at INT32_MAX. The proper fix would be to change
CalpontSystemCatalog::ColType::colWidth to a uint32_t.
2023-05-01 13:06:23 -04:00
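The short-term fix in item 3 of the commit above can be illustrated with a minimal sketch. The helper name `saturateToInt32` is hypothetical and does not come from the ColumnStore sources; it only shows how a `uint32_t` value can be clamped before it is stored in an `int32_t` field.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not a ColumnStore function): clamp an unsigned 32-bit
// length, such as group_concat_max_len, into the range of a signed 32-bit field.
constexpr int32_t saturateToInt32(uint32_t value)
{
  constexpr uint32_t kInt32Max = static_cast<uint32_t>(std::numeric_limits<int32_t>::max());
  // Values above INT32_MAX would wrap to a negative number on a plain cast.
  return value > kInt32Max ? std::numeric_limits<int32_t>::max() : static_cast<int32_t>(value);
}

// Usage example mirroring the scenario from the commit message.
inline void checkSaturation()
{
  assert(saturateToInt32(3000000000u) == std::numeric_limits<int32_t>::max());  // ~3gb is clamped
  assert(saturateToInt32(65536u) == 65536);                                     // 64kb passes through
}
```

Saturation keeps the stored width non-negative, at the cost of silently capping lengths above INT32_MAX.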
4fe9cd64a3 Revert "No boost condition (#2822)" (#2828)
This reverts commit f916e64927.
2023-04-22 15:49:50 +03:00
f916e64927 No boost condition (#2822)
This patch replaces boost primitives with stdlib counterparts.
2023-04-22 00:42:45 +03:00
c2d0fa24da replace boost::shared_array<T> with std::shared_ptr<T[]> 2023-04-14 10:33:27 +00:00
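As an illustration of this kind of replacement, the sketch below shows a shared array buffer expressed with `std::shared_ptr<T[]>`, which requires C++17. The helper `makeBuffer` is hypothetical and is not taken from the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical helper: allocate a zero-initialized byte buffer with shared ownership.
inline std::shared_ptr<uint8_t[]> makeBuffer(std::size_t size)
{
  // The array form of std::shared_ptr (C++17) releases the buffer with delete[].
  return std::shared_ptr<uint8_t[]>(new uint8_t[size]());
}

// Usage example: two owners observe the same underlying array.
inline void demoSharedBuffer()
{
  std::shared_ptr<uint8_t[]> first = makeBuffer(16);
  std::shared_ptr<uint8_t[]> second = first;  // shares ownership, no copy of the data
  first[3] = 7;                               // operator[] is available for T[]
  assert(second[3] == 7);
  assert(first.use_count() == 2);
}
```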
a508b86091 remove boost/shared_array include 2023-04-14 09:42:50 +00:00
6c32c658d5 MCOL-5385: Delete RowGroup::setData and make Pointer ctor explicit (#2808)
* Delete RowGroup::setData and make Pointer ctor explicit

* some push_backs replaced with emplace_backs

* Fixes of review notes
2023-04-13 03:55:30 +03:00
2e1394149b MCOL-5464: Fixes of bugs from ASAN warnings, part one (#2792)
* Fixes of bugs from ASAN warnings, part one

* MQC as static library, with nifty counter for global map and mutex

* Switch clang to 16

* link messageqcpp to execplan
2023-04-04 02:33:23 +03:00
b53c231ca6 MCOL-271 empty strings should not be NULLs (#2794)
This patch improves the handling of NULLs in textual fields in ColumnStore.
Previously, empty strings were considered NULLs, which could be a problem
if the data schema allows empty strings. It was also one of the major
reasons for behavioral differences between ColumnStore and other engines in
the MariaDB family.

This patch also fixes some other bugs and incorrect behavior, for
example, the incorrect comparison for "column <= ''", which evaluated to
a constant true before this patch.
2023-03-30 21:18:29 +03:00
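The distinction between an empty string and NULL can be sketched with `std::optional`. This is only an illustration of the semantics described above, not ColumnStore code: the names `TextValue` and `lessOrEqual` are hypothetical, and the comparison is byte-wise, ignoring collations.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical model: std::nullopt stands for SQL NULL, "" is a real empty string.
using TextValue = std::optional<std::string>;

// SQL-style "lhs <= rhs": the result is UNKNOWN (nullopt) if either side is NULL.
inline std::optional<bool> lessOrEqual(const TextValue& lhs, const TextValue& rhs)
{
  if (!lhs || !rhs)
    return std::nullopt;
  return *lhs <= *rhs;
}

// Usage example for the predicate "column <= ''".
inline void demoEmptyVersusNull()
{
  const TextValue empty = std::string();
  assert(lessOrEqual(empty, empty).value());                    // '' <= '' is true
  assert(!lessOrEqual(std::string("abc"), empty).value());      // 'abc' <= '' is false
  assert(!lessOrEqual(std::nullopt, empty).has_value());        // NULL <= '' is unknown
}
```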
786b9da5b0 MCOL-5438 COUNT() in math causes SEGV 2023-03-09 20:35:38 +00:00
56f2346083 Remove windows ifdefs 2023-03-02 15:59:42 +00:00
2f1f9c0ef0 MDEV-25080 Some fixes:
1. In TupleUnion::writeNull(), add the missing switch case for
   wide decimal with 16bytes column width.
2. MCOL-5432 Disable complete/partial pushdown of UNION operation
   if the query involves an ORDER BY or a LIMIT clause, until
   MCOL-5222 is fixed. Also add MTR test cases for this.
2023-02-27 06:38:31 -05:00
86dcf92d56 MCOL-5215 Fix overflow of UNION operation involving DECIMAL datatypes.
When a UNION operation involves DECIMAL datatypes whose combined scale
and digits before the decimal point exceed the currently supported
maximum precision of 38, we throw an error to the user:
"MCS-2060: Union operation exceeds maximum DECIMAL precision of 38".

This limitation holds until MCOL-5417 is implemented, after which
ColumnStore will have full parity with MariaDB server in terms of maximum
supported DECIMAL precision and scale of 65 and 38 digits respectively.
2023-02-27 06:38:31 -05:00
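A precision check of this kind can be sketched as follows. The rule used here (result scale is the larger scale, result integer digits are the larger count of digits before the decimal point) is an assumption for illustration, and `DecimalType` and `unionDecimalType` are hypothetical names, not ColumnStore APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>

// Hypothetical description of a DECIMAL(precision, scale) column type.
struct DecimalType
{
  int precision;  // total number of digits
  int scale;      // digits after the decimal point
};

constexpr int kMaxPrecision = 38;  // maximum precision quoted in the commit message

// Assumed rule: the UNION result must hold the widest integer part and the widest scale.
inline DecimalType unionDecimalType(const DecimalType& a, const DecimalType& b)
{
  assert(a.scale <= a.precision && b.scale <= b.precision);
  const int scale = std::max(a.scale, b.scale);
  const int intDigits = std::max(a.precision - a.scale, b.precision - b.scale);
  if (intDigits + scale > kMaxPrecision)
    throw std::runtime_error("MCS-2060: Union operation exceeds maximum DECIMAL precision of 38");
  return DecimalType{intDigits + scale, scale};
}
```

Under this assumed rule, DECIMAL(38,0) combined with DECIMAL(38,38) needs 76 digits and is rejected, while DECIMAL(10,2) combined with DECIMAL(12,4) yields DECIMAL(12,4).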
d87206c3e4 Fix segfault in getLocalNetIfacesSins (#2713) 2023-01-26 16:21:21 +03:00
c7c182ebd2 Merge pull request #2684 from drrtuy/MCOL-5385
MCOL-5385 This patch reduces RAM consumption and adds GROUP_CONCAT RA…
2023-01-18 11:58:47 +03:00
d42485656c Fix clang 16 warnings for comfort build 2023-01-12 22:11:28 +03:00
d0eea0ffe8 MCOL-5385 This patch reduces RAM consumption and adds GROUP_CONCAT RAM accounting feature 2023-01-11 09:52:10 +00:00
15f65eff15 Merge pull request #2655 from denis0x0D/MCOL-5263_2
MCOL-5263 Add support to ROLLBACK when PP were restarted.
2022-12-13 21:24:01 +03:00
d61780cab1 MCOL-5263 Add support to ROLLBACK when PP were restarted.
DMLProc starts a ROLLBACK when the SELECT part of an UPDATE fails because the EM facility in PP was restarted.
Unfortunately, this ROLLBACK gets stuck if EM/PP are not yet available.
DMLProc must therefore use a timeout and retry the ROLLBACK.
2022-12-13 16:18:53 +03:00
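The timeout-with-retry idea can be sketched generically. This is not the DMLProc implementation: `retryUntil` and its parameters are hypothetical, and the `attempt` callback stands in for whatever operation performs the ROLLBACK.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: call `attempt` until it succeeds or `timeout` has elapsed,
// sleeping for `pause` between attempts. Returns true on success.
inline bool retryUntil(const std::function<bool()>& attempt, std::chrono::milliseconds timeout,
                       std::chrono::milliseconds pause)
{
  assert(timeout.count() >= 0 && pause.count() >= 0);
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  for (;;)
  {
    if (attempt())
      return true;
    // Give up when another pause would carry us past the deadline.
    if (std::chrono::steady_clock::now() + pause > deadline)
      return false;
    std::this_thread::sleep_for(pause);
  }
}
```

A caller would pass a lambda that tries the operation once and reports success, so a temporarily unavailable peer delays the operation instead of blocking it indefinitely.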
635a9fdb56 Merge pull request #2658 from dhall-MariaDB/patch_out_of_band
patch_out_of_band
2022-12-12 17:14:41 -06:00
10e2834033 patch_out_of_band
Some changes made to 10.6-enterprise broke builds that use the out-of-band method of compiling ColumnStore. Out-of-band means that the engine source is not in the storage subdir of the server but in a standalone directory, which developers use to make development easier. In the out-of-band case, INSTALL_LAYOUT is false in CMakeLists.txt.
2022-12-12 14:17:09 -06:00
369aea884e MCOL-5311 Add timezone to jobList in subquerytransformer
TimeZone was uninitialized in this scenario and led to undefined behavior.
2022-12-07 09:52:00 -06:00
a1d89d8f31 Merge pull request #2630 from dhall-MariaDB/sergchanges
Sergchanges
2022-12-02 19:08:08 +03:00
7e3ad24437 Serge changes -- Add static joblist lib 2022-11-30 12:46:26 -06:00
bfbe5bf315 MCOL-5264 This patch replaces boost mutex locks with std analogs
The boost::unique_lock dtor calls fancy unlock logic that can throw in two cases:
first if the mutex is 0, and second if the lock doesn't own the mutex.
The first condition causes an unhandled exception for one of the clients
in DEC::writeToClient(). I was unable to find out why Linux can have a 0
mutex, so I replaced boost::mutex with std::mutex because libstdc++ should
be more stable than boost.
2022-11-24 17:19:24 +00:00
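A minimal sketch of the std replacement pattern follows. The class `ClientQueue` is hypothetical and only illustrates guarding shared state with `std::mutex`, `std::unique_lock`, and `std::lock_guard`; it is not the DEC code.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical container guarded by a standard-library mutex.
class ClientQueue
{
 public:
  void push(int value)
  {
    // std::unique_lock releases the mutex in its destructor only if it owns it.
    std::unique_lock<std::mutex> lock(mutex_);
    items_.push_back(value);
  }

  std::size_t size() const
  {
    // std::lock_guard is enough when the lock is held for the whole scope.
    std::lock_guard<std::mutex> lock(mutex_);
    return items_.size();
  }

 private:
  mutable std::mutex mutex_;
  std::vector<int> items_;
};

// Usage example.
inline void demoClientQueue()
{
  ClientQueue queue;
  queue.push(1);
  queue.push(2);
  assert(queue.size() == 2);
}
```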
b936ed8b2e Fix some GCC-12 Build errors 2022-11-22 03:28:17 +03:00
84bb4e56b8 Merge pull request #2624 from mariadb-corporation/columnstore-22.08.4-1
Columnstore 22.08.4 1
2022-11-18 19:01:33 +03:00
246a4db8de fix C API includes
ColumnStore used to include server's mysql.h
but link all tools with libmariadb.so

There's no guarantee that this would work, even with workarounds
it had in dbcon/mysql/sm.cpp

Fix:
* tools (linked with libmariadb.so) *must* include libmariadb's mysql.h
* as a hack prevent service_thd_timezone.h from being loaded into tools,
  as it conflicts with libmariadb's mysql.h
* server plugin *must* include server's mysql.h
* also don't link every tool with libmariadb.so, link the helper library
  (liblibmysqlclient.so) that actually needs it, tools use this
  helper library, not libmariadb.so directly
2022-11-17 12:02:07 -06:00
21c3bbce16 do *not* link ha_columnstore.so with libmariadb.so
this means some libraries have to be compiled twice -
for tools with libmariadb.so and for plugin, without.
2022-11-17 11:53:42 -06:00
9b84bf57c9 From serg: add dependency for generated header files errorids.h messageids.h 2022-11-17 11:46:10 -06:00
4b15a7e8a9 Merge pull request #2628 from denis0x0D/MCOL-5265
[MCOL-5265] Change boost::shared_ptr to std::shared_ptr.
2022-11-15 16:45:19 +03:00
e09d24cb8d [MCOL-5265] Change boost::shared_ptr to std::shared_ptr.
This is an attempt to make some parts of the code more stable.
For some reason we can get a spurious nullptr for boost::shared_ptr,
which causes an assert and abort.
2022-11-14 18:53:53 +03:00
61d5f80aa0 MCOL-5279 This approach executes same node DEC::writeToClient the last taking multiple ExeMgrs into account (#2623)
Co-authored-by: Roman Nozdrin <rnozdrin@mariadb.com>
2022-11-11 13:17:45 -06:00