mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-04-18 21:44:02 +03:00

1187 Commits

Author SHA1 Message Date
Denis Khalikov
6a6db672db
feat(FDB): [FDB BlobHandler] Add writeOrUpdateBlob, update read and remove. (#3373)
This patch:
1) Adds a `writeOrUpdateBlob` function.
2) Updates `read` and `remove` to take into account the size of the
`keys` and `values` for one FDB transaction.
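
One way to keep a single transaction within a size budget is to group key/value pairs into batches by total byte size. The following is a minimal sketch, not the BlobHandler code: `splitIntoBatches`, `KeyValue`, and the `kMaxBatchBytes` budget are hypothetical names and values chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using KeyValue = std::pair<std::string, std::string>;

// Hypothetical per-transaction byte budget, chosen for illustration only.
constexpr std::size_t kMaxBatchBytes = 1024;

// Splits pairs into batches whose summed key+value sizes stay within maxBytes.
// A single pair larger than the budget still gets a batch of its own.
std::vector<std::vector<KeyValue>> splitIntoBatches(const std::vector<KeyValue>& pairs,
                                                    std::size_t maxBytes = kMaxBatchBytes)
{
  assert(maxBytes > 0);
  std::vector<std::vector<KeyValue>> batches;
  std::size_t currentBytes = 0;
  for (const auto& kv : pairs)
  {
    const std::size_t pairBytes = kv.first.size() + kv.second.size();
    // Start a new batch when there is none yet or the budget would be exceeded.
    if (batches.empty() || currentBytes + pairBytes > maxBytes)
    {
      batches.emplace_back();
      currentBytes = 0;
    }
    batches.back().push_back(kv);
    currentBytes += pairBytes;
  }
  return batches;
}
```

Each resulting batch could then be written, read, or removed in its own transaction.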
2025-03-14 11:39:06 +00:00
Sergey Zefirov
e99db9c212
fix(plugin): MCOL-4942 No-table-SELECT now can return empty set (#3415)
A query like "SELECT 1 WHERE 1=0" returned a row despite the
unsatisfiable condition in WHERE. Now it returns an empty set.
2025-03-05 07:36:05 +00:00
Serguey Zefirov
4b7016e67b fix(MCOL-5842): Fix JSON_OBJECT's handling of empty strings
JSON_OBJECT() (and probably some other JSON functions) now properly
handle empty strings in their arguments - JSON_OBJECT used to return
NULL, now it returns empty string.
2025-03-04 12:56:33 +04:00
Aleksei Antipovskii
5556d818f8 chore(codestyle): mark virtual methods as override 2025-02-21 20:02:38 +04:00
Daniel Black
7dcc6a251a
MCOL-5881 set/getThreadName use FreeBSD API (#3383)
Taken from FreeBSD ports, this uses the FreeBSD
APIs rather than the Linux specific prctl to change
and retrieve the thread names.

Co-authored-by: Bernard Spil <brnrd@FreeBSD.org>
2025-01-15 22:02:20 +00:00
Sergey Zefirov
3bc8bd8cc6
fix(group by, having): MCOL-5776: GROUP BY/HAVING closer to server's (#3257)
This patch introduces an internal aggregate operator SELECT_SOME that
is automatically added to columns that are not in GROUP BY. It
"computes" some plausible value of the column (actually, last one
passed).

Along the way it fixes incorrect handling of HAVING being transferred
into WHERE, window function handling and a few other inconsistencies.
2024-12-20 19:12:32 +00:00
Denis Khalikov
87e2bb4cef
feat(fdb): MCOL-5802 Add support for blob insertion into FDB. (#3351) 2024-12-10 21:07:46 +00:00
drrtuy
6445f4dff3
feat(joblist,runtime): this is the first part of the execution model that produces a workload that can be predicted for a given query.
* feat(joblist,runtime): this is the first part of the execution model that produces a workload that can be predicted for a given query.
  - forces the UM join converter to use a value from the configuration
  - replaces the constant that controlled the number of outstanding requests with a value that depends on column width
  - modifies related Columnstore.xml values
2024-12-03 22:17:49 +00:00
Serguey Zefirov
a2eafa492a Shorter code 2024-12-02 20:18:13 +03:00
Serguey Zefirov
4878caee4e Shorter code 2024-12-02 20:18:13 +03:00
Serguey Zefirov
0bc384d5f0 fix(ubsan): MCOL-5844 - iron out UBSAN reports
The most important fix here is the fix of possible buffer overrun in
DATEFORMAT() function. A "%W" format, repeated enough times, would
overflow the 256-byte result buffer. Now we use ostringstream to
construct result and we are safe.
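
The overrun can be illustrated with a toy formatter. This is a sketch, not the actual DATEFORMAT() code: `formatWeekdays` and its handling of only `%W` are assumptions made for the example. Building the result in a `std::ostringstream` lets the output grow with the format string instead of being capped by a fixed-size buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Expands every "%W" in fmt to the given weekday name and copies all other
// characters through unchanged. The result grows as needed.
std::string formatWeekdays(const std::string& fmt, const std::string& weekdayName)
{
  std::ostringstream out;
  for (std::size_t i = 0; i < fmt.size(); ++i)
  {
    if (fmt[i] == '%' && i + 1 < fmt.size() && fmt[i + 1] == 'W')
    {
      out << weekdayName;
      ++i;  // skip the 'W'
    }
    else
    {
      out << fmt[i];
    }
  }
  return out.str();
}

// Usage: 100 repetitions of "%W" expand to 900 characters for "Wednesday",
// well past 256 bytes, with no fixed buffer to overrun.
void demoFormatWeekdays()
{
  std::string fmt;
  for (int i = 0; i < 100; ++i)
    fmt += "%W";
  assert(formatWeekdays(fmt, "Wednesday").size() == 900);
}
```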

Changes in date/time projection functions made me fix difference between
us and server behavior. The new, better behavior is reflected in changes
in tests' results.

Also, there was incorrect logic in TRUNCATE() and ROUND() functions in
computing the decimal "shift."
2024-12-02 20:18:13 +03:00
Denis Khalikov
a6eb5ca689
MCOL-5719: Move ownership mechanism to KV storage. (#3266)
* MCOL-5719 Move ownership mechanism to FDB
2024-11-09 19:47:04 +00:00
drrtuy
8ae5a3da40
Fix/mcol 5787 rgdata buffer max size dev (#3325)
* fix(rowgroup): RGData now uses uint64_t counter for the fixed-size columns data buffer.
	The buffer can use more than 4GB of RAM, which is necessary for PM-side joins.
	The RGData ctor uses uint32_t when allocating the data buffer,
	which causes an implicit heap overflow.

* feat(bytestream,serdes): BS buffer size type is uint64_t
	This is necessary to handle 64-bit RGData, which comes as
	a separate patch. The pair of patches allows
	PM joins when the SmallSide size is > 4GB.

* feat(bytestream,serdes): Distribute BS buf size data type change to avoid implicit data type narrowing

* feat(rowgroup): this returns bits lost during cherry-pick. The bits lost caused the first RGData::serialize to crash a process
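
The narrowing problem behind the first bullet can be shown with plain integer arithmetic. This is a sketch under assumptions: `bufferSize32` and `bufferSize64` are hypothetical helpers, not RGData members. A size computed in 32 bits wraps once it passes 4 GiB, so an allocation based on it is smaller than the data written into it; a 64-bit counter keeps the real size.

```cpp
#include <cassert>
#include <cstdint>

// Size computed in 32 bits: silently wraps modulo 2^32.
uint32_t bufferSize32(uint32_t rowCount, uint32_t rowSize)
{
  return rowCount * rowSize;
}

// Size computed in 64 bits: the operand is widened before multiplying.
uint64_t bufferSize64(uint32_t rowCount, uint32_t rowSize)
{
  return static_cast<uint64_t>(rowCount) * rowSize;
}

// Usage: 2^20 rows of 8 KiB each need 2^33 bytes, which does not fit in 32 bits.
void demoNarrowing()
{
  assert(bufferSize32(1u << 20, 8192u) == 0u);
  assert(bufferSize64(1u << 20, 8192u) == (uint64_t{1} << 33));
}
```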
2024-11-09 19:44:02 +00:00
Alexey Antipovsky
842a3c8a40
fix(PrimProc): MCOL-5651 Add a workaround to avoid choosing an incorr… (#3320)
* fix(PrimProc): MCOL-5651 Add a workaround to avoid choosing an incorrect TupleHashJoinStep as a joiner
2024-11-08 17:44:20 +00:00
drrtuy
47d01b2d2f fix(join, UM, perf): UM join is multi-threaded now (#3286)
* chore: UM join is multi-threaded now

* fix(UMjoin):  replace TR1 maps with stdlib versions
2024-09-04 18:56:24 +04:00
Denis Khalikov
bb861f8fab FDB
This patch moves FDB to utils dir and adds test on `remove keys range`.
2024-08-28 15:02:08 +04:00
Roman Nozdrin
e2941628d1 fix(): fix the naming 2024-08-21 20:45:16 +04:00
Leonid Fedorov
25c20bae9b MCOL-4696: get rid of boost::iequals 2024-08-21 20:45:16 +04:00
Alexey Antipovsky
c22409760f
feat(SM): MCOL-5785 Add timeout options for S3Storage (#3265)
* Update libmarias3

fix build with the recent libmarias3

* feat(SM): MCOL-5785 Add timeout options for S3Storage

    In some unfortunate situations StorageManager may get stuck on
    network operations. This commit adds the ability to set network
    timeouts which will help to ensure that the system is more
    responsive.

* feat(SM): MCOL-5785 Add smps & smkill tools

    * `smps` shows all active S3 network operations
    * `smkill` terminates S3 network operations

    NB! At the moment smkill is able to terminate operations
    that are stuck on retries, but not hang inside the libcurl
    call. In other words if you want to terminate all operations
    you should configure `connect_timeout` & `timeout`
---------

Co-authored-by: Leonid Fedorov <leonid.fedorov@mariadb.com>
2024-08-21 18:38:49 +03:00
Serguey Zefirov
6e995e2e80 fix: MCOL-5755: incorrect handling of BLOB (and TEXT) in GROUP BY
BLOB fields did not work as grouping keys at all, they were assigned
value NULL for any value, be it NULL or not. The fix is in the
rowaggregation.cpp in the initMapping(), a switch/case branch was added
to handle BLOB field copying there.

Also, TEXT columns did not distinguish between NULL and empty string in
the grouping algorithm, now they do. The fix is in the equals()
function, now we specifically check for isNull() equality between
values.
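
The NULL-versus-empty distinction can be sketched with `std::optional`, where an empty optional stands in for SQL NULL. The `groupKeysEqual` helper and the `TextValue` alias are hypothetical; the real check lives in the `equals()` function mentioned above.

```cpp
#include <cassert>
#include <optional>
#include <string>

using TextValue = std::optional<std::string>;  // std::nullopt models NULL

// Two grouping keys match only if both are NULL or both hold the same string.
bool groupKeysEqual(const TextValue& a, const TextValue& b)
{
  if (a.has_value() != b.has_value())
    return false;  // NULL never matches a non-NULL value, even an empty string
  return !a.has_value() || *a == *b;
}

// Usage: NULL and the empty string land in different groups.
void demoGroupKeys()
{
  assert(!groupKeysEqual(std::nullopt, TextValue{std::string{}}));
  assert(groupKeysEqual(std::nullopt, std::nullopt));
  assert(groupKeysEqual(TextValue{std::string{}}, TextValue{std::string{}}));
}
```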
2024-07-11 11:03:05 +03:00
Alexander Presnyakov
57e2375dbc fix(funcexp): MCOL-4671 Fix behaviour of LEFT/RIGHT functions when negative trim length value is passed 2024-07-04 12:51:01 +04:00
Leonid Fedorov
a1e64d4cb0 bug(priproc) make last_day type a bit more accurate
This fixes a discrepancy with the server, which assigns DATE type to
last_day()'s result.

Now we also assign the DATE result type and use the proper
dataconvert::Day data structure to return the date.

Tests agree with InnoDB.

Also, this patch includes test for MCOL-5669, to show we fixed it.
2024-07-01 16:25:44 +03:00
Denis Khalikov
2444f96b11
Merge pull request #3202 from denis0x0D/MCOL-5708
MCOL-5708 Calculate precision and scale for constant decimal.
2024-06-24 11:16:58 +03:00
Sergey Zefirov
1122b64cb1
MCOL-4234: improve GROUP BY and ORDER BY interaction (#3194)
This patch fixes the problem in MCOL-4234 and also generally improves
behavior of GROUP BY.

It does so by introducing a "dummy" aggregate and by wrapping columns
into it. This allows for columns that are not in GROUP BY to be used
more freely, for example, in SELECT * FROM tbl GROUP BY col - all
columns that are not "col" will be wrapped into an aggregate and query
will proceed to execution.

The dummy aggregate itself does nothing more than remember last value
passed into it.
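
A dummy aggregate of this kind is small enough to sketch in full. The class below is an illustration with an invented name, `LastValueAggregate`, and is not the ColumnStore implementation: it keeps whatever value was passed in most recently and reports that as the group's result.

```cpp
#include <cassert>
#include <optional>
#include <utility>

template <typename T>
class LastValueAggregate
{
 public:
  // Called once per row in the group; overwrites any earlier value.
  void update(T value) { last_ = std::move(value); }

  bool hasValue() const { return last_.has_value(); }

  // Result for the group: the last value seen. Requires at least one update().
  const T& result() const
  {
    assert(last_.has_value());
    return *last_;
  }

 private:
  std::optional<T> last_;
};
```

Feeding the values 1, 5, 7 for one group yields 7, which is why the result depends on the order in which rows arrive.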

There is also an additional error message that tries to explain what types
of expressions can be wrapped into an aggregate.
2024-06-17 20:00:54 +03:00
Denis Khalikov
b1045d27b6
fix(funcexp): MCOL-5237 Proper handle DATETIME column for "ifnull" function. (#3196) 2024-06-17 12:09:14 +01:00
drrtuy
113d9873a3
Containers memory limits for CI (#3108)
Limit test containers by memory, fix cgroup path inside the containers by introducing new ugly setting name 

---------

Co-authored-by: Roman Nozdrin <rnozdrin@mariadb.com>
Co-authored-by: Leonid Fedorov <leonid.fedorov@mariadb.com>
2024-06-16 19:16:23 +04:00
Denis Khalikov
ccb7ba5914 MCOL-5708 Calculate precision and scale for constant decimal.
This patch calculates precision and scale for constant decimal
value for SUM aggregation function.
2024-06-11 15:48:46 +00:00
Leonid Fedorov
a8d3fff79e chore(build) Rocky8 gcc vanilla build fix 2024-04-16 17:08:06 +03:00
Denis Khalikov
1c88a7fcd8 MCOL-5597 Rollback changes introduced for DJS.
This patch changes:
1. The number of buckets created on each split.
2. The heuristic which calculates the bucket size.
2024-04-15 19:37:29 +03:00
Serguey Zefirov
3b7e69135d Fixes MCOL-5700, Oracle mode test results
This changeset contains fixes in Oracle mode tests and for the
implementation of the CONCAT_ORACLE. Also, we harmonise our
translation process with the recent changes in the server.

Due to the changed behavior of the server, some CREATE VIEW/EXPLAIN
statements began to produce unexpected results and needed to be
fixed.

The concatenation operation's name also changed. This led to the
disabled func_concat_oracle test being enabled to test it, and it
turned out that our implementation of this function was broken
and needed to be fixed too.
2024-04-15 19:35:21 +03:00
Leonid Fedorov
af5ae35413
Revert "Fixes MCOL-5700, Oracle mode test results" 2024-03-27 18:52:30 +04:00
mariadb-KirillPerov
56b35d5cf6
Merge pull request #3156 from mariadb-corporation/sz-fix-oracle-mode
Fixes MCOL-5700, Oracle mode test results
2024-03-27 14:45:52 +06:00
Serguey Zefirov
34acd3559b Fixes MCOL-5700, Oracle mode test results
This changeset contains fixes in Oracle mode tests and for the
implementation of the CONCAT_ORACLE. Also, we harmonise our
translation process with the recent changes in the server.

Due to the changed behavior of the server, some CREATE VIEW/EXPLAIN
statements began to produce unexpected results and needed to be
fixed.

The concatenation operation's name also changed. This led to the
disabled func_concat_oracle test being enabled to test it, and it
turned out that our implementation of this function was broken
and needed to be fixed too.
2024-03-27 10:00:39 +03:00
drrtuy
444cf4c65e fix(aggregation, disk-based) MCOL-5691 distinct aggregate disk based (#3145)
* fix(aggregation, disk-based): MCOL-5689 this fixes disk-based distinct aggregation functions
Previously disk-based distinct aggregation functions produced incorrect results b/c there was no finalization applied for previous generations stored on disk.

*  fix(aggregation, disk-based): Fix disk-based COUNT(DISTINCT ...) queries. (Case 2). (Distinct & Multi-Distinct, Single- & Multi-Threaded).

* fix(aggregation, disk-based): Fix disk-based DISTINCT & GROUP BY queries. (Case 1). (Distinct & Multi-Distinct, Single- & Multi-Threaded).

---------

Co-authored-by: Theresa Hradilak <theresa.hradilak@gmail.com>
Co-authored-by: Roman Nozdrin <rnozdrin@mariadb.com>
2024-03-26 00:39:53 +03:00
Leonid Fedorov
5f40fb32d0
MCOL-5328: use PCRE2 and JPCRE wrapper (#3137)
PCRE2 for regexp functions in columnstore
2024-03-14 19:39:29 +04:00
Leonid Fedorov
83c2408f8d
fix(join, threadpool): MCOL-5565: MCOL-5636: MCOL-5645: port from develop-23.02 to [develop] (#3128)
* fix(threadpool): MCOL-5565 queries stuck in FairThreadScheduler. (#3100)

Meta Primitive Jobs, e.g. ADD_JOINER and LAST_JOINER, get stuck
	in the Fair scheduler without an out-of-band scheduler. Add the OOB
	scheduler back to remedy the issue.

* fix(messageqcpp): MCOL-5636 same node communication crashes transmitting PP errors to EM b/c error messaging leveraged socket that was a nullptr. (#3106)

* fix(threadpool): MCOL-5645 erroneous threadpool Job ctor implicitly sets socket shared_ptr to nullptr causing sigabrt when threadpool returns an error (#3125)

---------

Co-authored-by: drrtuy <roman.nozdrin@mariadb.com>
2024-02-13 19:01:16 +03:00
Sergey Zefirov
ebcf43a517
fix(join, disk-based): MCOL-5597: large side read errors (#3117)
The large side read errors mentioned there can be due to failure to
close file stream properly. Some of the data may still reside in the
file stream buffers, closing must flush it. The flush is an I/O
operation and can fail, leading to partial write and subsequent partial
read.

This patch tries to provide better diagnostics.
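
The failure mode described above is easy to miss because closing a stream does not throw by default. The following is a sketch of surfacing it, using a hypothetical `writeAndClose` helper; it is not the diagnostics code from the patch.

```cpp
#include <cassert>
#include <fstream>
#include <ios>
#include <string>

// Writes data to path and reports whether everything, including the final
// flush performed by close(), succeeded.
bool writeAndClose(const std::string& path, const std::string& data)
{
  assert(!path.empty());
  std::ofstream out(path, std::ios::binary | std::ios::trunc);
  if (!out)
    return false;  // could not open the file

  out.write(data.data(), static_cast<std::streamsize>(data.size()));
  out.close();  // flushes buffered bytes; sets failbit if that fails

  return !out.fail();
}
```

A caller that ignores the return value can end up with a partially written file and only notice on the next read.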
2024-02-09 22:25:43 +03:00
Leonid Fedorov
0d1c72a563 compilation fix for gcc12 on known gcc bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105329 2024-01-04 11:43:03 +03:00
Leonid Fedorov
4d7a6a0be5 perf(primproc) MCOL-5601: Initialize two fields once in ctor instead of calling makeConfig
std::string fTmpDir = config::Config::makeConfig()->getTempFileDir(config::Config::TempDirPurpose::Aggregates);
std::string fCompStr = config::Config::makeConfig()->getConfig("RowAggregation", "Compression");
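
The general pattern is to pay for the configuration lookup once, in the constructor, and reuse the stored values afterwards. The following is a sketch with stand-in types: `FakeConfig`, `Aggregator`, and the returned values are hypothetical and only mimic the shape of the two fields above.

```cpp
#include <cassert>
#include <string>

// Stand-in for an expensive configuration lookup; counts how often it is used.
struct FakeConfig
{
  int lookups = 0;
  std::string get(const std::string& key)
  {
    ++lookups;
    return key == "Compression" ? "LZ4" : "/tmp";  // illustrative values
  }
};

class Aggregator
{
 public:
  // Both settings are read once here and reused for the object's lifetime.
  explicit Aggregator(FakeConfig& cfg)
   : tmpDir_(cfg.get("TempFileDir")), compression_(cfg.get("Compression"))
  {
  }

  const std::string& tmpDir() const { return tmpDir_; }
  const std::string& compression() const { return compression_; }

 private:
  std::string tmpDir_;
  std::string compression_;
};

// Usage: repeated reads of the fields trigger no further lookups.
void demoCachedConfig()
{
  FakeConfig cfg;
  Aggregator agg(cfg);
  for (int i = 0; i < 1000; ++i)
    (void)agg.compression();
  assert(cfg.lookups == 2);
}
```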
2023-12-19 15:25:19 +03:00
Leonid Fedorov
9d5ad925eb fix(linkage) link libm to libmarias3 2023-12-18 14:10:14 +03:00
Serguey Zefirov
1f958c9ed2 MCOL-5625: Fixes json_query implementation
Also extends func_json_value.test.
2023-12-12 15:45:03 +03:00
Denis Khalikov
865cca11c9 MCOL-5505 Add TypeHandler functions. 2023-11-30 01:47:13 +04:00
HanpyBin
fe597ec78c MCOL-5505 add parquet support for cpimport and add mcs_parquet_ddl and mcs_parquet_gen tools 2023-11-30 01:47:13 +04:00
Serguey Zefirov
792aea2a7c Fixes MCOL-5599 where LIKE operator never finishes
This is a fix of logging subsystem, nothing else.

The old code expanded an argument into string and advanced too little
and, if expansion contained argument's index, it expanded it again. And
again.
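
The loop can be avoided by never rescanning text that was just inserted. The following is a sketch, assuming `%N%`-style placeholders and a hypothetical `expandArgs` helper rather than the actual logging code: the output is built in a separate string, so a placeholder that appears inside an argument is copied as-is and the scan always moves forward.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Replaces "%N%" (N starting at 1) with args[N-1] in a single left-to-right pass.
std::string expandArgs(const std::string& msg, const std::vector<std::string>& args)
{
  std::string out;
  std::size_t pos = 0;
  while (pos < msg.size())
  {
    bool replaced = false;
    if (msg[pos] == '%')
    {
      const std::size_t end = msg.find('%', pos + 1);
      if (end != std::string::npos && end > pos + 1)
      {
        const std::string digits = msg.substr(pos + 1, end - pos - 1);
        if (digits.size() <= 3 && digits.find_first_not_of("0123456789") == std::string::npos)
        {
          const std::size_t index = std::stoul(digits);
          if (index >= 1 && index <= args.size())
          {
            out += args[index - 1];  // inserted text goes to out, never back into msg
            pos = end + 1;           // continue after the whole placeholder
            replaced = true;
          }
        }
      }
    }
    if (!replaced)
      out += msg[pos++];
  }
  return out;
}

// Usage: an argument that itself contains "%1%" is expanded exactly once.
void demoExpandArgs()
{
  assert(expandArgs("a LIKE %1%", {"%1%x"}) == "a LIKE %1%x");
}
```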
2023-11-29 19:17:16 +04:00
Sergey Zefirov
8632c85ecf
feat(primproc,aggregation)!: Changes for ROLLUP with single-phase aggregation (#3025)
The fix is simple: enable subtotals in single-phase aggregation and
disable parallel processing when there are subtotals and aggregation is
single-phase.
2023-11-28 17:33:02 +03:00
Denis Khalikov
76e4e13b80
fix(rowgroup,stringstore): MCOL-5597 Set length for nullptr string to 0. (#3027) 2023-11-28 17:18:52 +03:00
Sergey Zefirov
5c9770d1e6
fix(funcexp): MCOL-5607: JSON function use crashes query execution (#3028)
JSON functions were implemented in a way that violated the assumption
of their purity: they should not have any state. This patch
fixes the implementation of the JSON_VALUE function.
2023-11-21 23:46:03 +03:00
Sergey Zefirov
69b8e1c779
feat(extent-elimination)!: re-enable extent-elimination for dictionary columns scanning
This is "productization" of old code that would enable extent
elimination for dictionary columns.

This patch enables it, fixes the performance degradation (the main
problem with the old code) and also fixes incorrect behavior of cpimport.
2023-11-17 17:14:35 +03:00
drrtuy
67c842e792
Merge pull request #3017 from drrtuy/fix/MCOL-5472-urandom-mutex
fix(rowstorage): MCOL-5472 SplitMix64 PRNG implementation to replace stdlib MT PRNG that uses /dev/urandom guarded by spinlock
2023-11-02 16:10:57 +02:00
Roman Nozdrin
dfc9e89496 fix(rowstorage): SplitMix64 PRNG implementation to replace stdlib MT PRNG that uses /dev/urandom guarded by spinlock 2023-11-01 18:19:45 +00:00
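
For reference, SplitMix64 itself fits in a few lines. The class below is a sketch of the published algorithm with its standard constants; the class name and interface are chosen for this example, and how the patch seeds or uses the generator is not shown here. All state is a single 64-bit integer owned by the object, so generating a number involves no system call and no shared lock.

```cpp
#include <cassert>
#include <cstdint>

// SplitMix64: a small, fast, non-cryptographic PRNG with 64 bits of state.
class SplitMix64
{
 public:
  explicit SplitMix64(uint64_t seed) : state_(seed) {}

  uint64_t next()
  {
    uint64_t z = (state_ += 0x9E3779B97F4A7C15ULL);  // advance by the golden-ratio increment
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
  }

 private:
  uint64_t state_;
};

// Usage: the same seed always yields the same sequence.
void demoSplitMix64()
{
  SplitMix64 a(42), b(42);
  for (int i = 0; i < 8; ++i)
    assert(a.next() == b.next());
}
```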