1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-04-18 21:44:02 +03:00

311 Commits

Author SHA1 Message Date
Aleksei Antipovskii
5556d818f8 chore(codestyle): mark virtual methods as override 2025-02-21 20:02:38 +04:00
drrtuy
6445f4dff3
feat(joblist,runtime): this is the first part of the execution model that produces a workload that can be predicted for a given query.
* feat(joblist,runtime): this is the first part of the execution model that produces a workload that can be predicted for a given query.
  - forces to UM join converter to use a value from a configuration
  - replaces a constant used to control a number of outstanding requests with a value depends on column width
  - modifies related Columnstore.xml values
2024-12-03 22:17:49 +00:00
drrtuy
8ae5a3da40
Fix/mcol 5787 rgdata buffer max size dev (#3325)
* fix(rowgroup): RGData now uses uint64_t counter for the fixed sizes columns data buf.
	The buffer can utilize > 4GB RAM that is necessary for PM side join.
	RGData ctor uses uint32_t allocating data buffer.
 	This fact causes implicit heap overflow.

* feat(bytestream,serdes): BS buffer size type is uint64_t
	This necessary to handle 64bit RGData, that comes as
	a separate patch. The pair of patches would allow to
	have PM joins when SmallSide size > 4GB.

* feat(bytestream,serdes): Distribute BS buf size data type change to avoid implicit data type narrowing

* feat(rowgroup): this returns bits lost during cherry-pick. The bits lost caused the first RGData::serialize to crash a process
2024-11-09 19:44:02 +00:00
Alexey Antipovsky
842a3c8a40
fix(PrimProc): MCOL-5651 Add a workaround to avoid choosing an incorr… (#3320)
* fix(PrimProc): MCOL-5651 Add a workaround to avoid choosing an incorrect TupleHashJoinStep as a joiner
2024-11-08 17:44:20 +00:00
Alexander Presnyakov
5f9ccfa8a1 fix(mcol-4499): Correct handling of LIKE/NOT LIKE NULL 2024-07-30 23:13:38 +04:00
Leonid Fedorov
83c2408f8d
fix(join, threadpool): MCOL-5565: MCOL-5636: MCOL-5645: port from develop-23.02 to [develop] (#3128)
* fix(threadpool): MCOL-5565 queries stuck in FairThreadScheduler. (#3100)

Meta Primitive Jobs, .e.g ADD_JOINER, LAST_JOINER stuck
	in Fair scheduler without out-of-band scheduler. Add OOB
	scheduler back to remedy the issue.

* fix(messageqcpp): MCOL-5636 same node communication crashes transmiting PP errors to EM b/c error messaging leveraged socket that was a nullptr. (#3106)

* fix(threadpool): MCOL-5645 errenous threadpool Job ctor implictly sets socket shared_ptr to nullptr causing sigabrt when threadpool returns an error (#3125)

---------

Co-authored-by: drrtuy <roman.nozdrin@mariadb.com>
2024-02-13 19:01:16 +03:00
Sergey Zefirov
69b8e1c779
feat(extent-elimination)!: re-enable extent-elimination for dictionary columns scanning
This is "productization" of an old code that would enable extent
elimination for dictionary columns.

This concrete patch enables it, fixes perfomance degradation (main
problem with old code) and also fixes incorrect behavior of cpimport.
2023-11-17 17:14:35 +03:00
Roman Nozdrin
eb744eafed chore(datatypes): this refactors the placement of the main SQL data types enum to enable templates that are parametrized with this enum(see mcs_datatype_basic.h changes for more details). 2023-10-24 18:44:35 +03:00
Sergey Zefirov
920607520c
feat(runtime)!: MCOL-678 A "GROUP BY ... WITH ROLLUP" support
Adds a special column which helps to differentiate data and rollups of
various depts and a simple logic to row aggregation to add processing of
subtotals.
2023-09-26 17:01:53 +03:00
drrtuy
765dd46b61
fix(pp-threadpool): the workaround for a stuck tests001 in CI (#2931)
CI ocassionaly stuck running test001 b/c PP threadpool endlessly reschedules
    meta jobs, e.g. BATCH_PRIMITIVE_CREATE, which ByteStreams were somehow damaged or read out.

Co-authored-by: Leonid Fedorov <leonid.fedorov@mariadb.com>
2023-08-18 00:02:31 +03:00
Denis Khalikov
896e8dd769
MCOL-5522 Properly process pm join result count. (#2909)
This patch:
1. Properly processes situation when pm join result count is exceeded.
2. Adds session variable 'columnstore_max_pm_join_result_count` to control the limit.
2023-08-04 16:55:45 +03:00
drrtuy
d86204b9c8
Merge pull request #2899 from mariadb-corporation/simpler_singletons
MCOL-5532: Simpler singletons
2023-07-17 11:28:15 +01:00
Leonid Fedorov
aead012f9f Simpler OAM 2023-07-12 18:15:51 +03:00
Denis Khalikov
a079834288 MCOL-5522 Temporary disable join match limit on BPP
Current join pipeline is not designed to send a temporal result and
continue execution. RowGroup for large side is the same where we store
the matched rows, we cannot continue to iterate over it, requires a
proper refactoring.
2023-07-09 17:56:44 +03:00
Gagan Goel
ab7dfaa25b Fix resource leak in DDLProc/DMLProc/PrimProc/WriteengineServer processes.
As part of the charset support, a call to MY_INIT() was added at the
initialization of the above processes. This call initializes the MySQL
thread environment required by the charset library. However, the
accompanying my_end() call required to terminate this thread environment
was not added at the termination of these process, hence leaking
resources. As a fix, we move the MY_INIT() calls to the Child()
functions of these services and also add the missing my_end() call.
2023-06-23 20:18:32 +00:00
Roman Nozdrin
62dc392476
MCOL-5499 Enable ControlFlow for same node communication processing path to avoid DEC queue overloading (#2848) 2023-06-07 15:41:59 +03:00
Leonid Fedorov
8f93fc3623
MCOL-5493: First portion of UBSan fixes (#2842)
Multiple UB fixes
2023-06-02 17:02:09 +03:00
Roman Nozdrin
4fe9cd64a3
Revert "No boost condition (#2822)" (#2828)
This reverts commit f916e64927cd81569327014f20c4cc0b8aca40ff.
2023-04-22 15:49:50 +03:00
Leonid Fedorov
f916e64927
No boost condition (#2822)
This patch replaces boost primitives with stdlib counterparts.
2023-04-22 00:42:45 +03:00
Leonid Fedorov
c2d0fa24da replace boost::shared_array<T> to std::shared_ptr<T[]> 2023-04-14 10:33:27 +00:00
Leonid Fedorov
a508b86091 remove boost/shared_array include 2023-04-14 09:42:50 +00:00
Leonid Fedorov
6c32c658d5
MCOL-5385: Delete RowGroup::setData and make Pointer ctor explicit (#2808)
* Delete RowGroup::setData and make Pointer ctor explicit

* some push_backs replaced with emplace_backs

* Fixes of review notes
2023-04-13 03:55:30 +03:00
Leonid Fedorov
2e1394149b
MCOL-5464: Fixes of bugs from ASAN warnings, part one (#2792)
* Fixes of bugs from ASAN warnings, part one

* MQC as static library, with nifty counter for global map and mutex

* Switch clang to 16

* link messageqcpp to execplan
2023-04-04 02:33:23 +03:00
Sergey Zefirov
b53c231ca6 MCOL-271 empty strings should not be NULLs (#2794)
This patch improves handling of NULLs in textual fields in ColumnStore.
Previously empty strings were considered NULLs and it could be a problem
if data scheme allows for empty strings. It was also one of major
reasons of behavior difference between ColumnStore and other engines in
MariaDB family.

Also, this patch fixes some other bugs and incorrect behavior, for
example, incorrect comparison for "column <= ''" which evaluates to
constant True for all purposes before this patch.
2023-03-30 21:18:29 +03:00
Serguey Zefirov
2558c1a911 Define commented out 2023-03-17 17:10:40 +03:00
Sergey Zefirov
a2d7a21196
Fix "column1 LIKE column2" for columns from same table (#2776)
This WHERE part will be performed with FilterCommand, which did not had
LIKE operator implemented until now.
2023-03-17 12:10:53 +02:00
Leonid Fedorov
56f2346083 Remove windows ifdefs 2023-03-02 15:59:42 +00:00
Leonid Fedorov
d42485656c Fix clang 16 warnings for comfort build 2023-01-12 22:11:28 +03:00
David.Hall
53af74b027
MCOL-1170 Fix ANALYZE to not error (#2682)
Analyze needs to be completed differently than a normal query. In server, when an ANALYZE is seen, it calls init_scan() immediatly followed by end_scan(). This leaves the sqlfrontendsession (ExeMgr) in a state where it expects to return rows. This patch fixes end_scan to clean this up via reads and writes to get everything back in synch.

ANALYZE should display the number of rows to be displayed if the query were run normally. We have that information available, but no way to return it. A modification to server side to ask for that in the handler is required.

This patch also includes a beautification of sqlfrontsessionthread.cpp since it looked bad. The important change is at line 774
if (!swallowRows)
which short circuits the actual return of data
2023-01-09 13:59:26 -06:00
Roman Nozdrin
4313288a85
Merge 22.08.7 (#2678)
* fix C API includes

ColumnStore used to include server's mysql.h
but link all tools with libmariadb.so

There's no guarantee that this would work, even with workarounds
it had in dbcon/mysql/sm.cpp

Fix:
* tools (linked with libmariadb.so) *must* include libmariadb's mysql.h
* as a hack prevent service_thd_timezone.h from being loaded into tools,
  as it conflicts with libmariadb's mysql.h
* server plugin *must* include server's mysql.h
* also don't link every tool with libmariadb.so, link the helper library
  (liblibmysqlclient.so) that actually needs it, tools use this
  helper library, not libmariadb.so directly

* do *not* link ha_columnstore.so with libmariadb.so

this means some libraries have to be compiled twice -
for tools with libmariadb.so and for plugin, without.

* use system boost, if possible

boost 1.71.0 is what ubuntu focal has, so let's start with that version.
boost 1.77.0 is the first that supports c++20

* add dependency for generated header files errorids.h messageids.h

see 3edd51610

* bump the version

* MCOL-5322 This patch replaces boost::mutex with std::mutex b/c IMHO std::unique_lock::lock is
less troublesome comparing with the boost alternative

* MCOL-5310 This patch replaces move-assignment with copy-assignment to avoid memory corruption (#2661)

* Bump VERSION to 22.08.7-1

* MCOL-5306 Re-read the config (Columnstore.xml) file if it was updated.

The existing implementation of Config::makeConfig() factory method
was returning a possibly stale config to the caller, without checking
if the config file was updated since the last read. This bug triggered
a scenario as described in MCOL-5306 where after a failover in an MCS
cluster, the controllernode coordinates changed in the config file
after failover and the existing mariadbd process was still using the
old controllernode coordinates. This lead to failed network connection
between mariadbd and the new controllernode.

The change in this fix, however, is more generic and not just limited
to this above scenario.

* MCOL-5264 This patch replaces boost mutex locks with std analogs
boost::uniqie_lock dtor calls a fancy unlock logic that throws twice.
First if the mutex is 0 and second lock doesn't own the mutex.
The first condition failure causes unhandled exception for one of the clients
in DEC::writeToClient(). I was unable to find out why Linux can have a 0
mutex and replaced boost::mutex with std::mutex b/c stdlibc++ should
be more stable comparing with boost.

* MCOL-5311 Add timezone to jobList in subquerytransformer
TimeZone was uninitialized in this scenario and led to undefined behavior.

* patch_out_of_band
Some changes made to 10.6-enterprise make a build using the out-of-band method of compiling columnstore not work. Out-of band means the source for the engine is not in the storage subdir of server, but rather in a stand alone directory. This is used by developers for easier develop work. In the case of out-of-band, INSTALL_LAYOUT is false in CMakeLists.txt

* MCOL-5346 This patch forces TreeNode::getIntValue to use conversion for dict-based CHAR/VARCHAR and TEXT columns (#2657)

Co-authored-by: Roman Nozdrin <rnozdrin@mariadb.com>

* MCOL-5263 Add support to ROLLBACK when PP were restarted.

DMLProc starts ROLLBACK when SELECT part of UPDATE fails b/c EM facility in PP were restarted.
Unfortunately this ROLLBACK stuck if EM/PP are not yet available.
DMLProc must have a t/o with re-try doing ROLLBACK.

* MCOL-3561 This patch updates Connector code after MDEV-29988

* This commit applies the code style format

Co-authored-by: Sergei Golubchik <serg@mariadb.com>
Co-authored-by: Roman Nozdrin <rnozdrin@mariadb.com>
Co-authored-by: David.Hall <david.hall@mariadb.com>
Co-authored-by: Gagan Goel <gagan.nith@gmail.com>
Co-authored-by: Denis Khalikov <dennis.khalikov@gmail.com>
2022-12-28 21:15:39 +03:00
Roman Nozdrin
a1d89d8f31
Merge pull request #2630 from dhall-MariaDB/sergchanges
Sergchanges
2022-12-02 19:08:08 +03:00
Andrey Piskunov
96c40d5081
Add core dumps and stack trace (#2604)
* Add core dumps and stack trace
2022-11-22 03:29:34 +03:00
Sergei Golubchik
246a4db8de fix C API includes
ColumnStore used to include server's mysql.h
but link all tools with libmariadb.so

There's no guarantee that this would work, even with workarounds
it had in dbcon/mysql/sm.cpp

Fix:
* tools (linked with libmariadb.so) *must* include libmariadb's mysql.h
* as a hack prevent service_thd_timezone.h from being loaded into tools,
  as it conflicts with libmariadb's mysql.h
* server plugin *must* include server's mysql.h
* also don't link every tool with libmariadb.so, link the helper library
  (liblibmysqlclient.so) that actually needs it, tools use this
  helper library, not libmariadb.so directly
2022-11-17 12:02:07 -06:00
Leonid Fedorov
37fd915a08 Serg`s patch for develop-6 revised for develop https://github.com/mariadb-corporation/mariadb-columnstore-engine/pull/2614 2022-11-09 22:41:38 +00:00
Roman Nozdrin
d22627af7d
Merge pull request #2566 from denis0x0D/MCOL-5191_1
MCOL-5191 Add MCV statistics.
2022-10-30 15:49:46 +03:00
Roman Nozdrin
878a8ab857 Compilation error fixes for the recent updates in container images 2022-10-14 17:53:48 +03:00
Denis Khalikov
e299a8409d MCOL-5191 Add MCV statistics.
This patch adds:
1. Initial version of random sampling.
2. Initial version of MCV statistics.
2022-10-09 22:26:40 +03:00
NTH19
7d76dc4534 AUX column scan(MCOL-5021) effectively disables vectorized scanning on
ARM platforms. This patch resolves this issue and unifies AUX column
processing at x86 and ARM using tempate class SimdProcessor.
The patch also replaces uint16_t mask previously used in column.cpp and
SimProcessor code with a native masks that platform uses, e.g. __m128i
or __m128 on x86 and variety of masks on ARM.
To unify the processing I introduced a new filtering Compare Operator - COMPARE_NULLEQ.
with a 'c1 IS NULL semantics'.
2022-10-07 10:32:54 +00:00
Roman Nozdrin
eb57103c06 This patch fixes nullptr dereferencing in the same node queue thread 2022-08-31 18:55:48 +00:00
NTH19
df7c967d54 when eq Filtercount <6 ,the speed of for loop is faster than hashmap
add threshold for eqFilter
2022-08-23 18:45:36 +08:00
Gagan Goel
6a6fee5969 MCOL-5021 Followup.
Allow the compiler to inline the call to nextColValue() in column.cpp.
2022-08-18 19:35:35 +00:00
Roman Nozdrin
56bbef62e6
Merge pull request #2406 from tntnatbry/MCOL-5021-dev
MCOL-5021 AUX column implementation to improve DELETE performance.
2022-08-15 19:03:42 +03:00
Andrey Piskunov
c906172bf5
MCOL-5180 Check CPU vector instructions set in installer and PrimProc on start (#2499)
* Add check for simd acrh support

* Updates

* More polite and detailed error messages

* Updates

* Always true to conditional

Co-authored-by: Leonid Fedorov <leonid.fedorov@mariadb.com>
2022-08-11 15:28:22 +03:00
Gagan Goel
cbfdae3481 MCOL-5021 Code changes based on review feedback. 2022-08-05 14:40:50 -04:00
Gagan Goel
11b7ee2f11 MCOL-5021 Disallow the following ALTER TABLE ADD COLUMN statement:
ALTER TABLE calpontsys.systable ADD COLUMN (auxcolumnoid INT NOT NULL DEFAULT 0);
2022-08-05 14:40:50 -04:00
Gagan Goel
1355237ca3 MCOL-5021 Some minor fixes. 2022-08-05 14:40:50 -04:00
Gagan Goel
262cd5c501 MCOL-5021 Remove hard-coded values for data type, column width
and compression type for the AUX column, and replace them with
constants defined in the execplan namespace.
2022-08-05 14:40:49 -04:00
Gagan Goel
c8b6b154bf MCOL-5021 Add an option in Columnstore.xml, fastdelete (disabled
by default), which when enabled, indiscriminately invalidates all
column extents and performs the actual DELETE only on the AUX
column. The trade-off with this approach would now be that the
first SELECT for certain query patterns (those containing a WHERE
predicate) after the DELETE operation will slow down as the
invalidated column extent would need to be scanned again to set
the min/max values.
2022-08-05 14:40:49 -04:00
Gagan Goel
2280b1dd25 MCOL-5021 Add support for the AUX column in ExeMgr and PrimProc.
In the joblist code, in addition to sending the lbid of the SCAN
column, we also send the corresponding lbid of the AUX column to PrimProc.

In the primitives processor code in PrimProc, we load the AUX column
block (8192 rows since the AUX column is implemented as a 1-byte
UNSIGNED TINYINT) into memory and then pass it down to the low-level
scanning (vectorized scanning as applicable) routine to build a non-Empty
mask for the block being processed to filter out DELETED rows based on
comparison of the AUX block row to the empty magic value for the AUX column.
2022-08-05 14:40:49 -04:00
Roman Nozdrin
a9d8924683 MCOL-5166 This patch adds support for in-memory communication b/w EM to PP via a shared queue in DEC class
JobList low-level code relateod to primitive jobs now uses shared pointers instead of ByteStream refs talking to DEC
b/c same-node EM-PP communication now goes over a queue in DEC instead of a network hop.
PP now has a separate thread that processes the primitive job messages from that DEC queue.
2022-08-04 18:51:31 +03:00