1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-04-18 21:44:02 +03:00

22 Commits

Author SHA1 Message Date
Sergey Zefirov
b53c231ca6 MCOL-271 empty strings should not be NULLs (#2794)
This patch improves handling of NULLs in textual fields in ColumnStore.
Previously empty strings were considered NULLs and it could be a problem
if data scheme allows for empty strings. It was also one of major
reasons of behavior difference between ColumnStore and other engines in
MariaDB family.

Also, this patch fixes some other bugs and incorrect behavior, for
example, incorrect comparison for "column <= ''" which evaluates to
constant True for all purposes before this patch.
2023-03-30 21:18:29 +03:00
Leonid Fedorov
3919c541ac
New warnfixes (#2254)
* Fix clang warnings

* Remove vim tab guides

* initialize variables

* 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length

* Fix ISO C++17 does not allow 'register' storage class specifier for outdated bison

* chars are unsigned on ARM, having  if (ival < 0) always false

* chars are unsigned by default on ARM and comparison with -1 if always true
2022-02-17 13:08:58 +03:00
Gagan Goel
973e5024d8 MCOL-4957 Fix performance slowdown for processing TIMESTAMP columns.
Part 1:
 As part of MCOL-3776 to address synchronization issue while accessing
 the fTimeZone member of the Func class, mutex locks were added to the
 accessor and mutator methods. However, this slows down processing
 of TIMESTAMP columns in PrimProc significantly as all threads across
 all concurrently running queries would serialize on the mutex. This
 is because PrimProc only has a single global object for the functor
 class (class derived from Func in utils/funcexp/functor.h) for a given
 function name. To fix this problem:

   (1) We remove the fTimeZone as a member of the Func derived classes
   (hence removing the mutexes) and instead use the fOperationType
   member of the FunctionColumn class to propagate the timezone values
   down to the individual functor processing functions such as
   FunctionColumn::getStrVal(), FunctionColumn::getIntVal(), etc.

   (2) To achieve (1), a timezone member is added to the
   execplan::CalpontSystemCatalog::ColType class.

Part 2:
 Several functors in the Funcexp code call dataconvert::gmtSecToMySQLTime()
 and dataconvert::mySQLTimeToGmtSec() functions for conversion between seconds
 since unix epoch and broken-down representation. These functions in turn call
 the C library function localtime_r() which currently has a known bug of holding
 a global lock via a call to __tz_convert. This significantly reduces performance
 in multi-threaded applications where multiple threads concurrently call
 localtime_r(). More details on the bug:
   https://sourceware.org/bugzilla/show_bug.cgi?id=16145

 This bug in localtime_r() caused processing of the Functors in PrimProc to
 slowdown significantly since a query execution causes Functors code to be
 processed in a multi-threaded manner.

 As a fix, we remove the calls to localtime_r() from gmtSecToMySQLTime()
 and mySQLTimeToGmtSec() by performing the timezone-to-offset conversion
 (done in dataconvert::timeZoneToOffset()) during the execution plan
 creation in the plugin. Note that localtime_r() is only called when the
 time_zone system variable is set to "SYSTEM".

 This fix also required changing the timezone type from a std::string to
 a long across the system.
2022-02-14 14:12:27 -05:00
Leonid Fedorov
04752ec546 clang format apply 2022-01-21 16:43:49 +00:00
Alexander Barkov
a6a85d157d MCOL-4666 Empty set when using BIT OR and BIT AND functions in WHERE 2021-04-07 14:37:39 +04:00
Alexander Barkov
1acc631a04 MCOL-4600 CAST(decimal AS SIGNED/UNSIGNED) returns a wrong result
The "SIGNED" part of the problem was previously fixed by MCOL-4640.
Fixing the "UNSIGNED" part.

- Adding TDecimal64::toUInt64Round() and Decimal::decimal64ToUInt64Round()
- Renaming Decimal::narrowRound() to decimal64ToSInt64Round(),
   for a more self-descriptive name, and for symmetry with decimal64ToUInt64Round()
- Reusing TDecimal64::toSInt64Round() inside decimal64ToSInt64Round().
  This change was forgotten in MCOL-4640 :(
- Removing the old code in Func_cast_unsigned::getUintVal with pow().
  It caused precision loss, hence the bug. Adding a call for
  Decimal::decimal64ToUInt64Round() instead.
- Adding tests for both SIGNED and UNSIGNED casts.

Additional change:
- Moving the wide-decimal-to-uint64_t rounding code from Func_cast_unsigned::getUintVal()
  to TDecimal128::toUInt64Round() (with refactoring). Adding TDecimal::toUInt64Round()
  for symmetry with TDecimal::toSInt64Round(). It will be easier to reuse the code
  with way.
2021-03-30 12:46:07 +04:00
Alexander Barkov
69da915160 MCOL-4531 New string-to-decimal conversion implementation
This change fixes:

MCOL-4462 CAST(varchar_expr AS DECIMAL(M,N)) returns a wrong result
MCOL-4500 Bit functions processing throws internally trying to cast char into decimal representation
MCOL-4532 CAST(AS DECIMAL) returns a garbage for large values

Also, this change makes string-to-decimal conversion 5-10 times faster,
depending on exact data.
Performance implemenent is achieved by the fact that (unlike in the old
implementation), the new version does not do any "string" object copying.
2021-02-09 13:02:27 +04:00
Alexander Barkov
b8cfda3bda A cleanup for MCOL-4464 Bitwise operations not like in MariaDB
CI with RelWithDebInfo builds revealed a problem in the main
patch for MCOL-4464, which did not show up with Debug builds.

Methods like:
- getDoubleVal()
- getDateIntVal()
- getDatetimeIntVal()
- getTimestampIntVal()
- getTimeIntVal()
- getUintVal()
- getIntVal()
- getStrVal()
require the caller to initialize the isNull argument to false.
This fact was not taken into account in MCOL-4464.

Adding proper initializations.
2021-01-13 16:51:38 +04:00
Alexander Barkov
4abbe90302 MCOL-4464 Bitwise operations not like in MariaDB 2021-01-11 14:14:34 +04:00
Roman Nozdrin
069b74563f Move Functor::timeZone() call that utilizes a lock into a block where
timezone is really used
2020-12-07 14:35:30 +00:00
Gagan Goel
6aea838360 MCOL-641 Add support for functions (Part 2). 2020-11-18 13:51:55 +00:00
Gagan Goel
cfe35b5c7f MCOL-641 Add support for functions (Part 1). 2020-11-18 13:51:25 +00:00
Patrick LeBlanc
d6ef3cad3d Merge pull request #1049 from pleblanc1976/mcol-3776
Mcol 3776 - shared but unsync'd timezone var
2020-02-28 13:58:51 -05:00
Gagan Goel
2e16fe674c 1. Implement bit_count() distributed function.
2. Handle hexadecimal string literals in buildReturnedColumn().
2019-11-22 18:02:47 -05:00
Andrew Hutchings
70b3aa3159 Merge branch 'develop-1.2' into develop-merge-up-20190924-2 2019-09-24 14:17:57 +01:00
David Hall
81e745256b MCOL-174 Replace custom helpers::power() with standard pow(). helpers::power() breaks with #dec > 9 2019-07-31 14:49:31 -05:00
David Hall
765d1d38d4 MCOL-174 Handle quoted numerics 2019-07-31 13:58:50 -05:00
Gagan Goel
e89d1ac3cf MCOL-265 Add support for TIMESTAMP data type 2019-04-23 00:00:09 -04:00
Andrew Hutchings
3c1ebd8b94 MCOL-392 Add initial TIME datatype support 2018-04-30 09:42:41 +01:00
Andrew Hutchings
01446d1e22 Reformat all code to coding standard 2017-10-26 17:18:17 +01:00
Andrew Hutchings
28fe2c0b70 MCOL-664 Add function support for TEXT
For the initial BLOB/TEXT pull request we put them in the same bucket as
VARBINARY, forcing many functions to be disabled. This patch enables the
same TEXT function support as VARCHAR.
2017-04-15 07:22:05 +02:00
david hill
f6afc42dd0 the begginning 2016-01-06 14:08:59 -06:00