1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-08-07 03:22:57 +03:00
Commit Graph

167 Commits

Author SHA1 Message Date
Serguey Zefirov
6e995e2e80 fix: MCOL-5755: incorrect handling of BLOB (and TEXT) in GROUP BY
BLOB fields did not work as grouping keys at all, they were assigned
value NULL for any value, be it NULL or not. The fix is in the
rowaggregation.cpp in the initMapping(), a switch/case branch was added
to handle BLOB field copying there.

Also, TEXT columns did not distinguish between NULL and empty string in
the grouping algorithm, now they do. The fix is in the equals()
function, now we specifically check for isNull() equality between
values.
2024-07-11 11:03:05 +03:00
drrtuy
113d9873a3 Containers memory limits for CI (#3108)
Limit test containers by memory, fix cgroup path inside the containers by introducing new ugly setting name 

---------

Co-authored-by: Roman Nozdrin <rnozdrin@mariadb.com>
Co-authored-by: Leonid Fedorov <leonid.fedorov@mariadb.com>
2024-06-16 19:16:23 +04:00
Leonid Fedorov
5f40fb32d0 MCOL-5328: use PCRE2 and JPCRE wrapper (#3137)
PCRE2 for regexp functions in columnstore
2024-03-14 19:39:29 +04:00
Sergey Zefirov
8632c85ecf feat(primproc,aggregegation)!: Changes for ROLLUP with single-phase aggregation (#3025)
The fix is simple: enable subtotals in single-phase aggregation and
disable parallel processing when there are subtotals and aggregation is
single-phase.
2023-11-28 17:33:02 +03:00
Denis Khalikov
76e4e13b80 fix(rowgroup,stringstore): MCOL-5597 Set length for nullptr string to 0. (#3027) 2023-11-28 17:18:52 +03:00
Sergey Zefirov
69b8e1c779 feat(extent-elimination)!: re-enable extent-elimination for dictionary columns scanning
This is "productization" of an old code that would enable extent
elimination for dictionary columns.

This concrete patch enables it, fixes perfomance degradation (main
problem with old code) and also fixes incorrect behavior of cpimport.
2023-11-17 17:14:35 +03:00
mariadb-AlexeyVorovich
fd94ab5042 chore(logging): move cgroup /cgroup version log from constructor to getTotalMemory to avoid duplicate log as constructor is called per query 2023-09-25 22:17:09 +03:00
Gagan Goel
7f9c624626 MCOL-5573 Fix cpimport truncation of TEXT columns.
1. Restore the utf8_truncate_point() function in utils/common/utils_utf8.h
that I removed as part of the patch for MCOL-4931.

2. As per the definition of TEXT columns, the default column width represents
the maximum number of bytes that can be stored in the TEXT column. So the
effective maximum length is less if the value contains multi-byte characters.
However, if the user explicitly specifies the length of the TEXT column in a
table DDL, such as TEXT(65535), then the DDL logic ensures that enough number
of bytes are allocated (upto a system maximum) to allow upto that many number
of characters (multi-byte characters if the charset for the column is multi-byte,
such as utf8mb3).
2023-09-20 12:23:22 -04:00
Gagan Goel
931f2b36a1 MCOL-4931 Make cpimport charset-aware. (#2938)
1. Extend the following CalpontSystemCatalog member functions to
   set CalpontSystemCatalog::ColType::charsetNumber, after the
   system catalog update to add charset number to calpontsys.syscolumn
   in MCOL-5005:
     CalpontSystemCatalog::lookupOID
     CalpontSystemCatalog::colType
     CalpontSystemCatalog::columnRIDs
     CalpontSystemCatalog::getSchemaInfo

2. Update cpimport to use the CHARSET_INFO object associated with the
   charset number retrieved from the system catalog, for a
   dictionary/non-dictionary CHAR/VARCHAR/TEXT column, to truncate
   long strings that exceed the target column character length.

3. Add MTR test cases.
2023-09-05 17:17:20 +03:00
mariadb-AlexeyVorovich
5b4f06bf0d Logging of memory (#2930)
* -logging of memory WIP

* -better log for cgroup case

* -fix log

* -display in GIB

* add log for freememory for non CGROUP
(to be discussed)

* test repeated log entries

* -added counter for every 1000 call. effectivly 15m

* Name logginng period and inrease it, clear config files from PR, add .gitignore

---------

Co-authored-by: pgmabv99 <alexey.vorovich@gmail.com>
Co-authored-by: Leonid Fedorov <leonid.fedorov@mariadb.com>
2023-09-05 15:46:29 +03:00
Theresa Hradilak
48562e41f9 feat(datatypes): MCOL-4632 and MCOL-4648, fix cast leads to NULL.
Remove redundant cast.

As C-style casts with a type name in parantheses are interpreted as static_casts this literally just changes the interpretation around (and forces an implicit cast to match the return value of the function).

Switch UBIGINTNULL and UBIGINTEMPTYROW constants for consistency.

Make consistent with relation between BIGINTNULL and BIGINTEMPTYROW & make adapted cast behaviour due to NULL markers more intuitive. (After this change we can simply block the highest possible uint64_t value and if a cast results in it, print the next lower value (2^64 - 2). Previously, (2^64 - 1) was able to be printed, but (2^64 - 2) as being blocked by the UBIGINTNULL constant was not, making finding the appropiate replacement value to give out more confusing.

Introduce MAX_MCS_UBIGINT and MIN_MCS_BIGINT and adapt casts.

Adapt casting to BIGINT to remove NULL marker error.

Add bugfix regression test for MCOL 4632

Add regression test for mcol_4648

Revert "Switch UBIGINTNULL and UBIGINTEMPTYROW constants for consistency."

This reverts commit 83eac11b18937ecb0b4c754dd48e4cb47310f620.
Due to backwards compatability issues.

Refactor casting to MCS[U]Int to datatype functions.

Update regression tests to include other affected datatypes.

Apply formatting.

Refactor according to PR review

Remove redundant new constant, switch to using already existing constant.

Adapt nullstring casting to EMPTYROW markers for backwards compatability.

Adapt tests for backward compatability behaviour allowing text datatypes to be casted to EMPTYROW constant.

Adapt mcol641-functions test according to bug fix.

Update tests according to new expected behaviour.

Adapt tests to new understanding of issue.

Update comments/documentation for MCOL_4632 test.

Adapt to new cast limit logic.

Make bracketing consistent.

Adapt previous regression test to new expected behaviour.
2023-08-11 13:00:30 +00:00
drrtuy
6d44d2e850 MCOL-5500 Remove another noisy printout. (#2886)
Co-authored-by: Roman Nozdrin <rnozdrin@mariadb.com>
2023-06-29 15:46:00 +03:00
Roman Nozdrin
79b636d853 MCOL-5500 Remove noisy printout from CGroupConfigurator method 2023-06-19 11:24:41 +00:00
Roman Nozdrin
375d162376 MCOL-5500 This patch adds cgroup v2 support with some sanity checks for (#2849)
values reported by cgroups v1
2023-06-09 17:37:21 +03:00
Leonid Fedorov
c2d0fa24da replace boost::shared_array<T> to std::shared_ptr<T[]> 2023-04-14 10:33:27 +00:00
Leonid Fedorov
a508b86091 remove boost/shared_array include 2023-04-14 09:42:50 +00:00
Roman Nozdrin
c38d98a510 Merge pull request #2762 from mariadb-corporation/MCOL-5191_Dist
MCOL-5191 Refacator statistics.
2023-04-06 21:09:42 +01:00
Leonid Fedorov
2e1394149b MCOL-5464: Fixes of bugs from ASAN warnings, part one (#2792)
* Fixes of bugs from ASAN warnings, part one

* MQC as static library, with nifty counter for global map and mutex

* Switch clang to 16

* link messageqcpp to execplan
2023-04-04 02:33:23 +03:00
Sergey Zefirov
b53c231ca6 MCOL-271 empty strings should not be NULLs (#2794)
This patch improves handling of NULLs in textual fields in ColumnStore.
Previously empty strings were considered NULLs and it could be a problem
if data scheme allows for empty strings. It was also one of major
reasons of behavior difference between ColumnStore and other engines in
MariaDB family.

Also, this patch fixes some other bugs and incorrect behavior, for
example, incorrect comparison for "column <= ''" which evaluates to
constant True for all purposes before this patch.
2023-03-30 21:18:29 +03:00
Denis Khalikov
bca300cd11 MCOL-5191 Refacator statistics.
Move uniform distribution to Statitistics constructor, remove rowcount.
2023-03-06 15:06:40 +03:00
Leonid Fedorov
56f2346083 Remove windows ifdefs 2023-03-02 15:59:42 +00:00
Leonid Fedorov
f7118b53a8 Turn on ASAN for unitests (#2719)
Fix asan error on compression tests
Fix warn of nonreturn function
2023-02-02 15:08:01 +02:00
Leonid Fedorov
d42485656c Fix clang 16 warnings for comfort build 2023-01-12 22:11:28 +03:00
Roman Nozdrin
d22627af7d Merge pull request #2566 from denis0x0D/MCOL-5191_1
MCOL-5191 Add MCV statistics.
2022-10-30 15:49:46 +03:00
Roman Nozdrin
a0086bc561 Adding NULL flag into ConstString class 2022-10-21 18:13:18 +00:00
Denis Khalikov
e299a8409d MCOL-5191 Add MCV statistics.
This patch adds:
1. Initial version of random sampling.
2. Initial version of MCV statistics.
2022-10-09 22:26:40 +03:00
NTH19
7d76dc4534 AUX column scan(MCOL-5021) effectively disables vectorized scanning on
ARM platforms. This patch resolves this issue and unifies AUX column
processing at x86 and ARM using tempate class SimdProcessor.
The patch also replaces uint16_t mask previously used in column.cpp and
SimProcessor code with a native masks that platform uses, e.g. __m128i
or __m128 on x86 and variety of masks on ARM.
To unify the processing I introduced a new filtering Compare Operator - COMPARE_NULLEQ.
with a 'c1 IS NULL semantics'.
2022-10-07 10:32:54 +00:00
Roman Nozdrin
72e264e8ef MCOL-5199 This patch solves the overal performance degradation introduced with a new way of char columns hashing
in aggregation code
The patch disables padding that forces hasher to calculate over the whole 2k buffer. This patch also moves hashing code
into the common place where it belongs.
2022-08-24 19:07:06 +00:00
Sergei Golubchik
a7a9ccf889 Serg dev (#2504)
* more build dependencies

* fix for cmake < 3.11

It cannot do ADD_LIBRARY(... ALIAS ...) on IMPORTED targets

* another fix for cmake 3.10.2

It doesn't know about CMAKE_CXX_STANDARD=20,
let's add the correct flag manually

* gcc 8 on aarch64

utils/common/simd_arm.h:241:16: error: need ‘typename’ before ‘simd::TypeToVecWrapperType<T>::WrapperType’ because ‘simd::TypeToVecWrapperType<T>’ is a dependent scope
2022-08-15 13:35:30 +03:00
David.Hall
2020f35e88 Mcol 5092 MODA uses wrong column width for some types (#2450)
* MCOL-5092 Ensure column width is correct for datatype
                       Change MODA return type to STRING
                       Modify MODA to handle every numeric type
* MCOL-5162 MODA to support char and varchar with collation support

Fixes to the aggregate bit functions
When we fixed the storage sign issue for MCOL-5092, it uncovered a problem in the bit aggregates (bit_and, bit_or and bit_xor). These aggregates should always return UBIGINT, but they relied on the type of the argument column, which gave bad results.
2022-08-11 15:16:11 -05:00
Roman Nozdrin
dd96e686c0 MCOL-5153 This patch replaces MDB collation aware hash function with the (#2488)
exact functionality that does not use MDB hash function.
This patch also takes a bit from Robin Hood hash map implementation forgotten
that reduces hash function collision rate.
2022-08-07 02:36:03 +03:00
Andrey Piskunov
c3a5731890 Rename cmpGt2 2022-08-04 16:16:38 +03:00
Andrey Piskunov
24b2c1c283 Vectorizing min/max for KIND_TEXT 2022-08-04 16:16:38 +03:00
NTH19
c4798ce585 fix 2022-08-04 16:16:38 +03:00
NTH19
19ca844cd1 support_max_min 2022-08-04 16:16:38 +03:00
Andrey Piskunov
589b786fda Don't ignore null or empty in calculation 2022-08-04 16:16:38 +03:00
Andrey Piskunov
20f48fd730 Vectorized update min max 2022-08-04 16:16:38 +03:00
Andrey Piskunov
b8200acd3b Don't ignore null or empty in calculation 2022-08-04 16:16:38 +03:00
Andrey Piskunov
1681edaca0 Tests for simd min/max 2022-08-04 16:16:38 +03:00
Andrey Piskunov
9930d0dedd Vectorized update min max 2022-08-04 16:16:38 +03:00
Roman Nozdrin
df431ebad9 MCOL-5093 This patch raises the hardcoded service start TO up to 2 hours (#2469) 2022-07-22 12:25:24 -05:00
mariadb-RomanNavrotskiy
194f0e9d64 ci: new builds grid, parallel steps 2022-07-08 22:30:02 +02:00
NTH19
d451b5c7c5 fix 2022-06-24 18:06:04 +08:00
NTH19
4c0b8fd829 simd of arm neon
unit testing

pass unit test for simdprocessor

add test cases

implement specific _mm_movemask for different types

float movemask change

rename
2022-06-24 11:24:59 +08:00
david.hall
6d47529499 Merge branch 'develop' into MCOL-4841 2022-06-14 14:41:41 -05:00
David.Hall
272246e9fa Merge branch 'develop' into MCOL-4841 2022-06-09 16:58:33 -05:00
david.hall
3b6449842f Merge branch 'develop' into MCOL-4841
# Conflicts:
#	exemgr/main.cpp
#	oam/etc/Columnstore.xml.singleserver
#	primitives/primproc/primproc.cpp
2022-06-09 10:07:26 -05:00
Roman Nozdrin
7c9da5709d MCOL-5105 This patch raises pipe read operation timeout to 20 minutes
to enable DMLProc to survive rollbacks on startup.
The patch also fixes linter warnings in service.h and pipe.h.
2022-06-09 14:34:50 +00:00
Roman Nozdrin
4e50fca460 Merge pull request #2401 from denis0x0D/statistic_man
StatisticsManager initialize all plugins.
2022-06-03 15:41:28 +05:30
Denis Khalikov
6c0ebd568b StatisticsManager initialize all plugins.
This patch adds support for initializing all plugins in the system.
2022-05-31 12:42:00 +03:00