mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-07-01 06:21:41 +03:00

165 Commits

Author SHA1 Message Date
38c9b51a13 move convert to datatypes::Charset class 2024-03-13 13:24:26 +00:00
8632c85ecf feat(primproc,aggregation)!: Changes for ROLLUP with single-phase aggregation (#3025)
The fix is simple: enable subtotals in single-phase aggregation and
disable parallel processing when there are subtotals and aggregation is
single-phase.
2023-11-28 17:33:02 +03:00
76e4e13b80 fix(rowgroup,stringstore): MCOL-5597 Set length for nullptr string to 0. (#3027) 2023-11-28 17:18:52 +03:00
69b8e1c779 feat(extent-elimination)!: re-enable extent-elimination for dictionary columns scanning
This is a "productization" of old code that would enable extent
elimination for dictionary columns.

This particular patch enables it, fixes the performance degradation (the main
problem with the old code), and also fixes incorrect behavior of cpimport.
2023-11-17 17:14:35 +03:00
fd94ab5042 chore(logging): move the cgroup/cgroup version log from the constructor to getTotalMemory to avoid duplicate log entries, as the constructor is called per query 2023-09-25 22:17:09 +03:00
7f9c624626 MCOL-5573 Fix cpimport truncation of TEXT columns.
1. Restore the utf8_truncate_point() function in utils/common/utils_utf8.h
that I removed as part of the patch for MCOL-4931.

2. As per the definition of TEXT columns, the default column width represents
the maximum number of bytes that can be stored in the TEXT column. So the
effective maximum length is less if the value contains multi-byte characters.
However, if the user explicitly specifies the length of the TEXT column in a
table DDL, such as TEXT(65535), then the DDL logic ensures that enough
bytes are allocated (up to a system maximum) to allow up to that many
characters (multi-byte characters if the charset for the column is multi-byte,
such as utf8mb3).
2023-09-20 12:23:22 -04:00
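The byte-versus-character distinction described in MCOL-5573 above can be illustrated with a minimal sketch. The helper names `safeTruncateLength` and `truncateUtf8` are hypothetical and are not the `utf8_truncate_point()` function from utils/common/utils_utf8.h; the sketch assumes valid UTF-8 input and only shows how a byte limit can be applied without splitting a multi-byte character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, NOT ColumnStore's utf8_truncate_point().
// Returns the largest length <= maxBytes at which `s` can be cut without
// splitting a multi-byte UTF-8 sequence. Assumes `s` is valid UTF-8.
std::size_t safeTruncateLength(const std::string& s, std::size_t maxBytes)
{
  if (s.size() <= maxBytes)
    return s.size();

  std::size_t len = maxBytes;
  // UTF-8 continuation bytes look like 10xxxxxx. If the first dropped byte
  // is a continuation byte, the cut falls inside a character, so back up
  // until the first dropped byte is a lead byte.
  while (len > 0 && (static_cast<unsigned char>(s[len]) & 0xC0) == 0x80)
    --len;
  return len;
}

// Truncates `s` to at most maxBytes bytes on a character boundary.
std::string truncateUtf8(const std::string& s, std::size_t maxBytes)
{
  return s.substr(0, safeTruncateLength(s, maxBytes));
}
```

With a two-byte limit, the three-byte string "aé" is cut to "a" rather than to "a" plus half of "é", which is why the effective character capacity of a byte-limited column shrinks for multi-byte data.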
931f2b36a1 MCOL-4931 Make cpimport charset-aware. (#2938)
1. Extend the following CalpontSystemCatalog member functions to
   set CalpontSystemCatalog::ColType::charsetNumber, after the
   system catalog update to add charset number to calpontsys.syscolumn
   in MCOL-5005:
     CalpontSystemCatalog::lookupOID
     CalpontSystemCatalog::colType
     CalpontSystemCatalog::columnRIDs
     CalpontSystemCatalog::getSchemaInfo

2. Update cpimport to use the CHARSET_INFO object associated with the
   charset number retrieved from the system catalog, for a
   dictionary/non-dictionary CHAR/VARCHAR/TEXT column, to truncate
   long strings that exceed the target column character length.

3. Add MTR test cases.
2023-09-05 17:17:20 +03:00
5b4f06bf0d Logging of memory (#2930)
* -logging of memory WIP

* -better log for cgroup case

* -fix log

* -display in GIB

* add log for freememory for non CGROUP
(to be discussed)

* test repeated log entries

* -added counter for every 1000th call. effectively 15m

* Name logging period and increase it, clear config files from PR, add .gitignore

---------

Co-authored-by: pgmabv99 <alexey.vorovich@gmail.com>
Co-authored-by: Leonid Fedorov <leonid.fedorov@mariadb.com>
2023-09-05 15:46:29 +03:00
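The "counter for every 1000th call" mentioned in the memory-logging commit above can be sketched as a small gate object. `PeriodicLogGate` is a hypothetical name, not a class from the repository, and this version is deliberately not thread-safe.

```cpp
#include <cstdint>

// Hypothetical rate limiter for repeated log entries (not ColumnStore code).
// shouldLog() returns true on the first call and then on every `period`-th
// call after it, so a hot path emits one log line per period.
// Not thread-safe: the counter would need to be atomic for concurrent use.
class PeriodicLogGate
{
 public:
  explicit PeriodicLogGate(std::uint64_t period) : period_(period == 0 ? 1 : period)
  {
  }

  bool shouldLog()
  {
    return (calls_++ % period_) == 0;
  }

 private:
  std::uint64_t period_;
  std::uint64_t calls_ = 0;
};
```

Naming the period (as the last bullet of the commit does) keeps the relationship between call count and wall-clock interval in one place.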
48562e41f9 feat(datatypes): MCOL-4632 and MCOL-4648, fix cast leads to NULL.
Remove redundant cast.

As C-style casts with a type name in parentheses are interpreted as static_casts, this literally just changes the interpretation around (and forces an implicit cast to match the return value of the function).

Switch UBIGINTNULL and UBIGINTEMPTYROW constants for consistency.

Make consistent with the relation between BIGINTNULL and BIGINTEMPTYROW & make adapted cast behaviour due to NULL markers more intuitive. (After this change we can simply block the highest possible uint64_t value and, if a cast results in it, print the next lower value (2^64 - 2). Previously, (2^64 - 1) could be printed, but (2^64 - 2), being blocked by the UBIGINTNULL constant, could not, which made finding the appropriate replacement value to output more confusing.)

Introduce MAX_MCS_UBIGINT and MIN_MCS_BIGINT and adapt casts.

Adapt casting to BIGINT to remove NULL marker error.

Add bugfix regression test for MCOL 4632

Add regression test for mcol_4648

Revert "Switch UBIGINTNULL and UBIGINTEMPTYROW constants for consistency."

This reverts commit 83eac11b18937ecb0b4c754dd48e4cb47310f620.
Due to backwards compatibility issues.

Refactor casting to MCS[U]Int to datatype functions.

Update regression tests to include other affected datatypes.

Apply formatting.

Refactor according to PR review

Remove redundant new constant, switch to using already existing constant.

Adapt nullstring casting to EMPTYROW markers for backwards compatibility.

Adapt tests for backward compatibility behaviour allowing text datatypes to be cast to EMPTYROW constant.

Adapt mcol641-functions test according to bug fix.

Update tests according to new expected behaviour.

Adapt tests to new understanding of issue.

Update comments/documentation for MCOL_4632 test.

Adapt to new cast limit logic.

Make bracketing consistent.

Adapt previous regression test to new expected behaviour.
2023-08-11 13:00:30 +00:00
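The cast-limit idea in the commit above (a cast must never produce a value reserved as a NULL or empty-row marker) can be sketched as a clamp. All constant names and values below are assumptions chosen for illustration; they are not copied from the ColumnStore headers and do not claim to match UBIGINTNULL, UBIGINTEMPTYROW, or MAX_MCS_UBIGINT.

```cpp
#include <cstdint>
#include <limits>

// Assumed layout for illustration only: the two highest uint64_t values are
// reserved as markers, so the largest storable value sits just below them.
constexpr std::uint64_t kEmptyRowMarker = std::numeric_limits<std::uint64_t>::max();
constexpr std::uint64_t kNullMarker = kEmptyRowMarker - 1;
constexpr std::uint64_t kMaxStorable = kNullMarker - 1;

// True if `v` collides with one of the reserved marker values.
constexpr bool isReservedMarker(std::uint64_t v)
{
  return v == kNullMarker || v == kEmptyRowMarker;
}

// Saturating cast result: anything above the storable maximum is replaced by
// the storable maximum, so a cast can never yield a marker by accident.
constexpr std::uint64_t clampToStorable(std::uint64_t v)
{
  return v > kMaxStorable ? kMaxStorable : v;
}
```

Under these assumptions a value that would otherwise read back as NULL is reported as the nearest storable value instead.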
6d44d2e850 MCOL-5500 Remove another noisy printout. (#2886)
Co-authored-by: Roman Nozdrin <rnozdrin@mariadb.com>
2023-06-29 15:46:00 +03:00
79b636d853 MCOL-5500 Remove noisy printout from CGroupConfigurator method 2023-06-19 11:24:41 +00:00
375d162376 MCOL-5500 This patch adds cgroup v2 support with some sanity checks for (#2849)
values reported by cgroups v1
2023-06-09 17:37:21 +03:00
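As a rough illustration of what cgroup v2 support with sanity checks involves: the cgroup v2 `memory.max` file holds either a byte count or the literal string `max`, while cgroup v1 reports a very large number when no limit is set. The sketch below parses such contents and falls back to physical memory when the limit is absent or implausible. The function names are hypothetical and this is not the CGroupConfigurator implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical parser for the contents of a cgroup memory limit file.
// Returns std::nullopt for "max" (no limit), empty input, or unparsable input.
std::optional<std::uint64_t> parseMemoryMax(const std::string& contents)
{
  // Drop the trailing newline/whitespace the kernel appends.
  const std::size_t end = contents.find_last_not_of(" \t\r\n");
  if (end == std::string::npos)
    return std::nullopt;
  const std::string value = contents.substr(0, end + 1);

  if (value == "max")
    return std::nullopt;
  if (value.find_first_not_of("0123456789") != std::string::npos)
    return std::nullopt;

  try
  {
    return static_cast<std::uint64_t>(std::stoull(value));
  }
  catch (const std::exception&)
  {
    return std::nullopt;  // value does not fit into 64 bits
  }
}

// Sanity check: a missing, zero, or larger-than-physical limit is treated as
// "no limit", so the physical memory size is used instead.
std::uint64_t effectiveMemory(std::optional<std::uint64_t> limit, std::uint64_t physicalBytes)
{
  return (limit && *limit > 0 && *limit < physicalBytes) ? *limit : physicalBytes;
}
```

The parser takes the file contents as a string so it can be exercised without a real cgroup hierarchy; reading the file itself is left to the caller.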
c2d0fa24da replace boost::shared_array<T> to std::shared_ptr<T[]> 2023-04-14 10:33:27 +00:00
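For readers unfamiliar with the replacement in the commit above: since C++17, `std::shared_ptr<T[]>` releases its buffer with `delete[]` and provides `operator[]`, which covers the usual uses of `boost::shared_array<T>`. A minimal sketch follows; `makeBuffer` is a hypothetical helper, not a function from the repository.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical helper: allocates a zero-filled, reference-counted byte buffer.
// std::shared_ptr<T[]> (C++17) calls delete[] when the last owner goes away.
std::shared_ptr<std::uint8_t[]> makeBuffer(std::size_t size)
{
  // The trailing () value-initializes the array, i.e. fills it with zeros.
  return std::shared_ptr<std::uint8_t[]>(new std::uint8_t[size]());
}
```

Copies share ownership exactly as with `boost::shared_array`: after `auto b = makeBuffer(4); auto c = b;`, both `b[2]` and `c[2]` refer to the same byte.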
a508b86091 remove boost/shared_array include 2023-04-14 09:42:50 +00:00
c38d98a510 Merge pull request #2762 from mariadb-corporation/MCOL-5191_Dist
MCOL-5191 Refactor statistics.
2023-04-06 21:09:42 +01:00
2e1394149b MCOL-5464: Fixes of bugs from ASAN warnings, part one (#2792)
* Fixes of bugs from ASAN warnings, part one

* MQC as static library, with nifty counter for global map and mutex

* Switch clang to 16

* link messageqcpp to execplan
2023-04-04 02:33:23 +03:00
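The "nifty counter" mentioned in MCOL-5464 above is a general C++ idiom (also called a Schwarz counter) for making sure a global object is constructed before its first use and destroyed after its last use across translation units. The single-file sketch below condenses the idiom; `Registry` and the other names are hypothetical and do not reproduce the messageqcpp code.

```cpp
#include <map>
#include <mutex>
#include <new>
#include <string>

// Hypothetical global state: a map guarded by a mutex.
struct Registry
{
  std::mutex mutex;
  std::map<std::string, int> entries;
};

// --- In a real project this part lives in the .cpp file. ---
static int niftyCounter = 0;  // zero-initialized before any dynamic initialization
alignas(Registry) static unsigned char registryStorage[sizeof(Registry)];
Registry& registry = reinterpret_cast<Registry&>(registryStorage);

// --- In a real project this part lives in the header, so that every
// --- translation unit including it gets its own static initializer.
struct RegistryInitializer
{
  RegistryInitializer()
  {
    // The first initializer to run constructs the object in the raw storage.
    if (niftyCounter++ == 0)
      new (&registry) Registry();
  }
  ~RegistryInitializer()
  {
    // The last initializer to be destroyed tears the object down.
    if (--niftyCounter == 0)
      registry.~Registry();
  }
};
static RegistryInitializer registryInitializer;
```

Because each including translation unit increments the counter before any of its own globals can touch `registry`, initialization order between translation units stops mattering, which is the kind of problem ASAN tends to surface.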
b53c231ca6 MCOL-271 empty strings should not be NULLs (#2794)
This patch improves handling of NULLs in textual fields in ColumnStore.
Previously, empty strings were considered NULLs, which could be a problem
if the data schema allows empty strings. It was also one of the major
reasons for behavior differences between ColumnStore and other engines in
the MariaDB family.

This patch also fixes some other bugs and incorrect behavior, for
example, an incorrect comparison for "column <= ''", which evaluated to a
constant True before this patch.
2023-03-30 21:18:29 +03:00
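The distinction MCOL-271 draws between an empty string and NULL can be modelled with `std::optional<std::string>`. The sketch below is an illustration only: `sqlLessEqual` is a hypothetical function, it compares bytes rather than applying a collation, and it is not the ColumnStore comparison code.

```cpp
#include <optional>
#include <string>

// NULL is "no value"; the empty string is a value of length zero.
using NullableString = std::optional<std::string>;

// SQL-style `lhs <= rhs`: the result is unknown (std::nullopt) when either
// side is NULL, otherwise an ordinary byte-wise comparison.
std::optional<bool> sqlLessEqual(const NullableString& lhs, const NullableString& rhs)
{
  if (!lhs || !rhs)
    return std::nullopt;
  return *lhs <= *rhs;
}
```

Under this model `column <= ''` is true only for rows holding an empty string, false for rows holding any non-empty string, and unknown for NULL rows, instead of being a constant.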
bca300cd11 MCOL-5191 Refactor statistics.
Move uniform distribution to Statistics constructor, remove rowcount.
2023-03-06 15:06:40 +03:00
56f2346083 Remove windows ifdefs 2023-03-02 15:59:42 +00:00
f7118b53a8 Turn on ASAN for unit tests (#2719)
Fix asan error on compression tests
Fix warn of nonreturn function
2023-02-02 15:08:01 +02:00
d42485656c Fix clang 16 warnings for comfort build 2023-01-12 22:11:28 +03:00
d22627af7d Merge pull request #2566 from denis0x0D/MCOL-5191_1
MCOL-5191 Add MCV statistics.
2022-10-30 15:49:46 +03:00
a0086bc561 Adding NULL flag into ConstString class 2022-10-21 18:13:18 +00:00
e299a8409d MCOL-5191 Add MCV statistics.
This patch adds:
1. Initial version of random sampling.
2. Initial version of MCV statistics.
2022-10-09 22:26:40 +03:00
7d76dc4534 AUX column scan (MCOL-5021) effectively disables vectorized scanning on
ARM platforms. This patch resolves this issue and unifies AUX column
processing on x86 and ARM using the template class SimdProcessor.
The patch also replaces the uint16_t mask previously used in column.cpp and
SimdProcessor code with the native masks that each platform uses, e.g. __m128i
or __m128 on x86 and a variety of masks on ARM.
To unify the processing I introduced a new filtering Compare Operator, COMPARE_NULLEQ,
with 'c1 IS NULL' semantics.
2022-10-07 10:32:54 +00:00
72e264e8ef MCOL-5199 This patch solves the overall performance degradation introduced with a new way of hashing char columns
in aggregation code
The patch disables padding that forces the hasher to calculate over the whole 2k buffer. This patch also moves the hashing code
into the common place where it belongs.
2022-08-24 19:07:06 +00:00
a7a9ccf889 Serg dev (#2504)
* more build dependencies

* fix for cmake < 3.11

It cannot do ADD_LIBRARY(... ALIAS ...) on IMPORTED targets

* another fix for cmake 3.10.2

It doesn't know about CMAKE_CXX_STANDARD=20,
let's add the correct flag manually

* gcc 8 on aarch64

utils/common/simd_arm.h:241:16: error: need ‘typename’ before ‘simd::TypeToVecWrapperType<T>::WrapperType’ because ‘simd::TypeToVecWrapperType<T>’ is a dependent scope
2022-08-15 13:35:30 +03:00
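The gcc 8 error quoted in the commit above is the classic dependent-name rule: inside a template, a nested type of a class that depends on a template parameter must be prefixed with `typename`. The sketch below reproduces the shape of the problem with hypothetical names (`TypeToWrapper`, `toWrapper`); it is not the code from utils/common/simd_arm.h.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical trait mapping an integer type to its unsigned counterpart.
template <typename T>
struct TypeToWrapper
{
  using WrapperType = std::make_unsigned_t<T>;
};

// TypeToWrapper<T> depends on T, so the compiler cannot know that
// ::WrapperType names a type until instantiation. `typename` states it.
template <typename T>
typename TypeToWrapper<T>::WrapperType toWrapper(T value)
{
  using Wrapper = typename TypeToWrapper<T>::WrapperType;
  return static_cast<Wrapper>(value);
}
```

Compilers that implement the C++20 relaxation of this rule accept some of these positions without the keyword, which is probably why the omission went unnoticed until an older gcc built the code; writing `typename` explicitly is valid on all of them.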
2020f35e88 Mcol 5092 MODA uses wrong column width for some types (#2450)
* MCOL-5092 Ensure column width is correct for datatype
                       Change MODA return type to STRING
                       Modify MODA to handle every numeric type
* MCOL-5162 MODA to support char and varchar with collation support

Fixes to the aggregate bit functions
When we fixed the storage sign issue for MCOL-5092, it uncovered a problem in the bit aggregates (bit_and, bit_or and bit_xor). These aggregates should always return UBIGINT, but they relied on the type of the argument column, which gave bad results.
2022-08-11 15:16:11 -05:00
dd96e686c0 MCOL-5153 This patch replaces the MDB collation-aware hash function with the (#2488)
exact functionality that does not use the MDB hash function.
This patch also takes a previously forgotten bit from the Robin Hood hash map implementation
that reduces the hash function collision rate.
2022-08-07 02:36:03 +03:00
c3a5731890 Rename cmpGt2 2022-08-04 16:16:38 +03:00
24b2c1c283 Vectorizing min/max for KIND_TEXT 2022-08-04 16:16:38 +03:00
c4798ce585 fix 2022-08-04 16:16:38 +03:00
19ca844cd1 support_max_min 2022-08-04 16:16:38 +03:00
589b786fda Don't ignore null or empty in calculation 2022-08-04 16:16:38 +03:00
20f48fd730 Vectorized update min max 2022-08-04 16:16:38 +03:00
b8200acd3b Don't ignore null or empty in calculation 2022-08-04 16:16:38 +03:00
1681edaca0 Tests for simd min/max 2022-08-04 16:16:38 +03:00
9930d0dedd Vectorized update min max 2022-08-04 16:16:38 +03:00
df431ebad9 MCOL-5093 This patch raises the hardcoded service start timeout to 2 hours (#2469) 2022-07-22 12:25:24 -05:00
194f0e9d64 ci: new builds grid, parallel steps 2022-07-08 22:30:02 +02:00
d451b5c7c5 fix 2022-06-24 18:06:04 +08:00
4c0b8fd829 simd of arm neon
unit testing

pass unit test for simdprocessor

add test cases

implement specific _mm_movemask for different types

float movemask change

rename
2022-06-24 11:24:59 +08:00
6d47529499 Merge branch 'develop' into MCOL-4841 2022-06-14 14:41:41 -05:00
272246e9fa Merge branch 'develop' into MCOL-4841 2022-06-09 16:58:33 -05:00
3b6449842f Merge branch 'develop' into MCOL-4841
# Conflicts:
#	exemgr/main.cpp
#	oam/etc/Columnstore.xml.singleserver
#	primitives/primproc/primproc.cpp
2022-06-09 10:07:26 -05:00
7c9da5709d MCOL-5105 This patch raises pipe read operation timeout to 20 minutes
to enable DMLProc to survive rollbacks on startup.
The patch also fixes linter warnings in service.h and pipe.h.
2022-06-09 14:34:50 +00:00
4e50fca460 Merge pull request #2401 from denis0x0D/statistic_man
StatisticsManager initialize all plugins.
2022-06-03 15:41:28 +05:30
6c0ebd568b StatisticsManager initialize all plugins.
This patch adds support for initializing all plugins in the system.
2022-05-31 12:42:00 +03:00
c25ae4f378 Use external boost 1.78 2022-05-02 18:23:37 +00:00
5820a21e19 Merge pull request #2331 from drrtuy/MCOL-5001-pp-em-combo-merge-1
Mcol 5001 pp em combo merge 1
2022-04-13 15:16:18 +03:00