1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-04-26 11:48:52 +03:00

94 Commits

Author SHA1 Message Date
Denis Khalikov
865cca11c9 MCOL-5505 Add TypeHandler functions. 2023-11-30 01:47:13 +04:00
HanpyBin
fe597ec78c MCOL-5505 add parquet support for cpimport and add mcs_parquet_ddl and mcs_parquet_gen tools 2023-11-30 01:47:13 +04:00
Roman Nozdrin
eb744eafed chore(datatypes): this refactors the placement of the main SQL data types enum to enable templates that are parametrized with this enum(see mcs_datatype_basic.h changes for more details). 2023-10-24 18:44:35 +03:00
Mu He
7e2f83e39d
Merge branch 'mariadb-corporation:develop' into develop 2023-04-05 18:22:52 +02:00
Leonid Fedorov
2e1394149b
MCOL-5464: Fixes of bugs from ASAN warnings, part one (#2792)
* Fixes of bugs from ASAN warnings, part one

* MQC as static library, with nifty counter for global map and mutex

* Switch clang to 16

* link messageqcpp to execplan
2023-04-04 02:33:23 +03:00
MuHe03
d906974abc MCOL-4991 Solving TRUNCATE/ROUND/CEILING functions on TIME/DATETIME/TIMESTAMP
Add getDecimalVal in func_round and func_truncate for getting value while filtering

MCOL-4991 Solving TRUNCATE/ROUND/CEILING functions on TIME/DATETIME/TIMESTAMP

Update func_cast.cpp
2023-03-31 18:39:16 +02:00
Sergey Zefirov
b53c231ca6 MCOL-271 empty strings should not be NULLs (#2794)
This patch improves handling of NULLs in textual fields in ColumnStore.
Previously empty strings were considered NULLs and it could be a problem
if data scheme allows for empty strings. It was also one of major
reasons of behavior difference between ColumnStore and other engines in
MariaDB family.

Also, this patch fixes some other bugs and incorrect behavior, for
example, incorrect comparison for "column <= ''" which evaluates to
constant True for all purposes before this patch.
2023-03-30 21:18:29 +03:00
Roman Nozdrin
786b9da5b0 MCOL-5438 COUNT() in math causes SEGV 2023-03-09 20:35:38 +00:00
Leonid Fedorov
56f2346083 Remove windows ifdefs 2023-03-02 15:59:42 +00:00
Gagan Goel
8bf545bc2e MDEV-25080 Fix a corner case in DataConvert::joinColTypeForUnion(). 2023-02-27 09:01:24 -05:00
Gagan Goel
4e2123ca80 MDEV-25080 Fix some corner cases in DataConvert::joinColTypeForUnion(). 2023-02-27 06:38:31 -05:00
Gagan Goel
86dcf92d56 MCOL-5215 Fix overflow of UNION operation involving DECIMAL datatypes.
When a UNION operation involving DECIMAL datatypes with scale and digits
before the decimal exceeds the currently supported maximum precision
of 38, we throw an error to the user:
"MCS-2060: Union operation exceeds maximum DECIMAL precision of 38".

This is until MCOL-5417 is implemented where ColumnStore will have
full parity with MariaDB server in terms of maximum supported DECIMAL
precision and scale of 65 and 38 digits respectively.
2023-02-27 06:38:31 -05:00
Leonid Fedorov
d42485656c Fix clang 16 warnings for comfort build 2023-01-12 22:11:28 +03:00
Jigao Luo
6c4af1461f
[MCOL-5205] Fix bug from union type in UNION processing.
This patch fixs the reported JIRA issue MCOL 5205, which consists of a wrong union type from two input Int types. The bug results in wrong unioned answers in CS. The fix includes more INT case discussions. Additionaly, this patch provides detailed unit tests for correctness in UNION processing with Int.

Signed-off-by: Jigao Luo <luojigao@outlook.com>
2022-09-09 22:54:35 +02:00
Leonid Fedorov
3919c541ac
New warnfixes (#2254)
* Fix clang warnings

* Remove vim tab guides

* initialize variables

* 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length

* Fix ISO C++17 does not allow 'register' storage class specifier for outdated bison

* chars are unsigned on ARM, having  if (ival < 0) always false

* chars are unsigned by default on ARM and comparison with -1 if always true
2022-02-17 13:08:58 +03:00
Gagan Goel
973e5024d8 MCOL-4957 Fix performance slowdown for processing TIMESTAMP columns.
Part 1:
 As part of MCOL-3776 to address synchronization issue while accessing
 the fTimeZone member of the Func class, mutex locks were added to the
 accessor and mutator methods. However, this slows down processing
 of TIMESTAMP columns in PrimProc significantly as all threads across
 all concurrently running queries would serialize on the mutex. This
 is because PrimProc only has a single global object for the functor
 class (class derived from Func in utils/funcexp/functor.h) for a given
 function name. To fix this problem:

   (1) We remove the fTimeZone as a member of the Func derived classes
   (hence removing the mutexes) and instead use the fOperationType
   member of the FunctionColumn class to propagate the timezone values
   down to the individual functor processing functions such as
   FunctionColumn::getStrVal(), FunctionColumn::getIntVal(), etc.

   (2) To achieve (1), a timezone member is added to the
   execplan::CalpontSystemCatalog::ColType class.

Part 2:
 Several functors in the Funcexp code call dataconvert::gmtSecToMySQLTime()
 and dataconvert::mySQLTimeToGmtSec() functions for conversion between seconds
 since unix epoch and broken-down representation. These functions in turn call
 the C library function localtime_r() which currently has a known bug of holding
 a global lock via a call to __tz_convert. This significantly reduces performance
 in multi-threaded applications where multiple threads concurrently call
 localtime_r(). More details on the bug:
   https://sourceware.org/bugzilla/show_bug.cgi?id=16145

 This bug in localtime_r() caused processing of the Functors in PrimProc to
 slowdown significantly since a query execution causes Functors code to be
 processed in a multi-threaded manner.

 As a fix, we remove the calls to localtime_r() from gmtSecToMySQLTime()
 and mySQLTimeToGmtSec() by performing the timezone-to-offset conversion
 (done in dataconvert::timeZoneToOffset()) during the execution plan
 creation in the plugin. Note that localtime_r() is only called when the
 time_zone system variable is set to "SYSTEM".

 This fix also required changing the timezone type from a std::string to
 a long across the system.
2022-02-14 14:12:27 -05:00
Leonid Fedorov
04752ec546 clang format apply 2022-01-21 16:43:49 +00:00
Leonid Fedorov
5c5f103f98
MCOL-4839: Fix clang build (#2100)
* Fix clang build

* Extern C returned to plugin_instance

Co-authored-by: Leonid Fedorov <l.fedorov@mail.corp.ru>
2021-08-23 10:45:10 -05:00
Gagan Goel
7c8b502dc2 Fix regression in a query involving an aggregate function on a
non-wide decimal column in the HAVING clause.

In buildAggregateColumn(), if an aggregate function (such as avg)
is applied on a non-wide decimal column, we were setting the precision
of the resulting column as -1. This later down in the execution got
converted to 255 as in some cases, precision is stored as uint8_t.
The predicate operations on a DECIMAL column has logic that uses
the wide Decimal::s128value field if precision > 18. This logic incorrectly
used the Decimal::s128value instead of the correct value stored in the
narrow Decimal::value field, since precision of the Decimal column
was 255. The fix is to set the aggregate column precision to
datatypes::INT64MAXPRECISION (18) in buildAggregateColumn() when the
aggregate is applied on a non-wide decimal column.

This commit also partially fixes -Wstrict-aliasing GCC warnings.
2021-06-22 11:11:34 +00:00
Alexander Barkov
67449418ed MCOL-4700 Wrong result of a UNION for INT and INT UNSIGNED 2021-06-11 19:31:51 +04:00
Gagan Goel
4e9307fa6d MCOL-4612 A subquery with a union for DECIMAL and BIGINT returns zeros.
In this patch, we set the unioned type to a wide decimal, if any of the
numeric columns involved in the union operation have a precision > 18
(which is also possible for BIGINT/UBIGINT types) and <= 38.
2021-04-30 12:33:33 +00:00
Gagan Goel
f6b55c1e18 MCOL-4177 Add support for bulk insertion for wide decimals.
1. This patch adds support for wide decimals with/without scale
     to cpimport. In addition, INSERT ... SELECT and LDI are also
     now supported.
  2. Logic to compute the number of bytes to convert a binary
     representation in the buffer to a narrow decimal is also
     simplified.
2020-12-15 22:14:54 +00:00
Alexander Barkov
52c5af054a Part#2 MCOL-495 Make string comparison not case sensitive
Fixing field='str' for short (non-Dict) CHAR and VARCHAR data types.
2020-12-04 08:40:29 +04:00
Roman Nozdrin
58495d0d2f MCOL-4387 Convert dataconvert::decimalToString() into VDecimal and TSInt128 methods 2020-11-18 13:53:16 +00:00
Gagan Goel
6fd7916c56 MCOL-4188 Fix regression in a union subquery involving a numeric field.
Since we now perform type promotion to wide decimals for aggregations
involving numeric fields, we need to check for wide decimal in
in and out ROWs and call the appropriate setter and getter functions.
2020-11-18 13:53:16 +00:00
Alexander Barkov
3d7f5c6fd1 MCOL-4377 Split DataConvert::convertColumnData() 2020-11-18 13:53:16 +00:00
Alexander Barkov
129d5b5a0f MCOL-4174 Review/refactor frontend/connector code 2020-11-18 13:53:15 +00:00
Gagan Goel
cfe35b5c7f MCOL-641 Add support for functions (Part 1). 2020-11-18 13:51:25 +00:00
Gagan Goel
74b64eb4f1 MCOL-641 1. Add support for int128_t in ParsedColumnFilter.
2. Set Decimal precision in SimpleColumn::evaluate().
3. Add support for int128_t in ConstantColumn.
4. Set IDB_Decimal::s128Value in buildDecimalColumn().
5. Use width 16 as first if predicate for branching based on decimal width.
2020-11-18 13:47:45 +00:00
Roman Nozdrin
b09f3088ca MCOL-641 Initial version of Math operations for wide decimal. 2020-11-18 13:47:44 +00:00
Gagan Goel
9b714274db MCOL-641 1. Minor refactoring of decimalToString for int128_t.
2. Update unit tests for decimalToString.
3. Allow support for wide decimal in TupleConstantStep::fillInConstants().
2020-11-18 13:47:44 +00:00
Roman Nozdrin
238386bf63 MCOL-641 Replaced IDB_Decima.__v union with int128_t attribute.
Moved all tests into ./test

Introduced ./datatypes directory
2020-11-18 13:47:44 +00:00
Gagan Goel
824615a55b MCOL-641 Refactor empty value implementation in writeengine. 2020-11-18 13:47:44 +00:00
Roman Nozdrin
97ee1609b2 MCOL-641 Replaced NULL binary constants.
DataConvert::decimalToString, toString, writeIntPart, writeFractionalPart are not templates anymore.
2020-11-18 13:47:44 +00:00
Roman Nozdrin
61647c1f5b MCOL-641 DataConvert::decimalToString() refactoring. 2020-11-18 13:47:02 +00:00
Gagan Goel
8f80c1dee6 MCOL-641 1. Implement int128 version of strtoll.
2. Templatize number_int_value.
3. Add test cases for strtoll128 and number_int_value for Decimal38.
2020-11-18 13:47:02 +00:00
drrtuy
b29d0c9daa MCOL-641 Changed the hint to search for GTest headers.
This commit introduces DataConvert UTs.

DataConvert::decimalToString now can negative values.

Next version for Row::toString(), applyMapping UT checks.

Row:equals() is now wide-DECIMAL aware.
2020-11-18 13:47:02 +00:00
Roman Nozdrin
c23ead2703 MCOL-641 This commit changes NULL and EMPTY values.
It also contains the refactored DataConvert::decimalToString().

Row::toString UT is finished.
2020-11-18 13:47:02 +00:00
Gagan Goel
b07db9a8f4 MCOL-641 Basic support for updates. 2020-11-18 13:47:01 +00:00
Gagan Goel
55afcd8890 MCOL-641 Basic extent elimination support for Decimal38. 2020-11-18 13:47:01 +00:00
drrtuy
84f9821720 MCOL-641 Switched to DataConvert static methods in joblist code.
Replaced BINARYEMPTYROW and BINARYNULL values. We need to have
separate magic values for numeric and non-numeric binary types
b/c numeric cant tolerate losing 0 used for magics previously.

atoi128() now parses minus sign and produces negative values.

RowAggregation::isNull() now uses Row::isNull() for DECIMAL.
2020-11-18 13:47:01 +00:00
drrtuy
98213c0094 MCOL-641 Addition now works for DECIMAL columns with precision > 18. 2020-11-18 13:47:01 +00:00
drrtuy
54c152d6c8 MCOL-641 This commit introduces templates for DataConvert and RowGroup methods. 2020-11-18 13:47:01 +00:00
drrtuy
0c67b6ab50 MCOL-641 atoi128 now correctly processes decimal point and - signs.
There are multiple overloaded version of the low level DML write methods to
push down CSC column type. WE needs the type to convert values correctly.

Replaced WE_INT128 with CSC data type that is more informative.

Removed commented and obsolete code.

Replaced switch-case blocks with oneliners.
2020-11-18 13:47:01 +00:00
Gagan Goel
77e1d6abe3 Basic SELECT support for Decimal38 2020-11-18 13:47:00 +00:00
Roman Nozdrin
63dcaa387f MCOL-641 Simple INSERT with one record works with this commit. 2020-11-18 13:47:00 +00:00
Roman Nozdrin
c9f42fb5cc MCOL-641 PoC version for DECIMAL(38) using BINARY as a basis. 2020-11-18 13:47:00 +00:00
Gagan Goel
32f6167067 MCOL-641 Work of Ivan Zuniga on basic read and write support for Binary16 2020-11-18 13:47:00 +00:00
Roman Nozdrin
7acfddddb7 Refactored MDB relation names decoding in DDL code.
SH now takes all or nothing thus we need to change if conditions that rules our GBH.

Small warning fixes for GCC8.2

Disabled GBH.
2019-12-13 11:38:19 -06:00
jmrojas2332
c29c41e235 MCOL 3474 Fix Timediff results after accounting for new test cases 2019-12-04 22:48:07 +00:00