1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-08-07 03:22:57 +03:00
Commit Graph

332 Commits

Author SHA1 Message Date
Leonid Fedorov
8f93fc3623 MCOL-5493: First portion of UBSan fixes (#2842)
Multiple UB fixes
2023-06-02 17:02:09 +03:00
Roman Nozdrin
4fe9cd64a3 Revert "No boost condition (#2822)" (#2828)
This reverts commit f916e64927.
2023-04-22 15:49:50 +03:00
Leonid Fedorov
f916e64927 No boost condition (#2822)
This patch replaces boost primitives with stdlib counterparts.
2023-04-22 00:42:45 +03:00
Leonid Fedorov
c2d0fa24da replace boost::shared_array<T> to std::shared_ptr<T[]> 2023-04-14 10:33:27 +00:00
Leonid Fedorov
2e1394149b MCOL-5464: Fixes of bugs from ASAN warnings, part one (#2792)
* Fixes of bugs from ASAN warnings, part one

* MQC as static library, with nifty counter for global map and mutex

* Switch clang to 16

* link messageqcpp to execplan
2023-04-04 02:33:23 +03:00
Sergey Zefirov
b53c231ca6 MCOL-271 empty strings should not be NULLs (#2794)
This patch improves handling of NULLs in textual fields in ColumnStore.
Previously empty strings were considered NULLs and it could be a problem
if data scheme allows for empty strings. It was also one of major
reasons of behavior difference between ColumnStore and other engines in
MariaDB family.

Also, this patch fixes some other bugs and incorrect behavior, for
example, incorrect comparison for "column <= ''" which evaluates to
constant True for all purposes before this patch.
2023-03-30 21:18:29 +03:00
Andrey Piskunov
256691652d MCOL-4530: toCppCode() method for ParseTree and TreeNode (#2777)
* toCppCode for ParseTree and TreeNode

* generated tree is compiling

* Put tree constructors into tests

* Minor fixes

* Fixed parse + some constructors

* Fixed includes, removed debug and old data

* Hopefully fix clang errors

* Forgot an override

* More overrides
2023-03-22 23:25:06 +03:00
Otto Kekäläinen
70124ecc01 Fix trivial spelling errors
- occured -> occurred
- reponse -> response
- seperated -> separated

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
2023-03-11 11:59:47 -08:00
Roman Nozdrin
786b9da5b0 MCOL-5438 COUNT() in math causes SEGV 2023-03-09 20:35:38 +00:00
Leonid Fedorov
56f2346083 Remove windows ifdefs 2023-03-02 15:59:42 +00:00
Roman Nozdrin
4d4e4ad30d Merge pull request #2741 from mariadb-corporation/MDEV-25080-CS-dev
MDEV-25080 Allow pushdown of queries involving UNIONs in outer select to ColumnStore
2023-02-28 11:23:50 +00:00
Andrey Piskunov
b6808c97f1 MCOL-4530: common conjuction top rewrite (#2673)
Added logical transformation of the execplan::ParseTrees with the taking out the common factor in expression of the form "(A and B) or (A and C)" for the purposes of passing a TPCH 19 query.

Co-authored-by: Leonid Fedorov <leonid.fedorov@mariadb.com>
2023-02-27 19:23:19 +03:00
Gagan Goel
86dcf92d56 MCOL-5215 Fix overflow of UNION operation involving DECIMAL datatypes.
When a UNION operation involving DECIMAL datatypes with scale and digits
before the decimal exceeds the currently supported maximum precision
of 38, we throw an error to the user:
"MCS-2060: Union operation exceeds maximum DECIMAL precision of 38".

This is until MCOL-5417 is implemented where ColumnStore will have
full parity with MariaDB server in terms of maximum supported DECIMAL
precision and scale of 65 and 38 digits respectively.
2023-02-27 06:38:31 -05:00
Roman Nozdrin
ff534dba7f MCOL-5384 This commit replaces shared pointer to CSC with CSC ctor that is cleaned up leaving a scope
CSC default ctor was private b/c it must not allow to use CSC outside thread cache.
  However there are some places in the plugin code that need a standalone syscat that
  is cleaned up leaving the scope. The decision is to make the restriction mentioned
  organizational rather than syntactical.
2023-02-08 14:03:41 +00:00
Roman Nozdrin
d43e418a8c MCOL-5346 This patch forces TreeNode::getIntValue to use conversion for dict-based CHAR/VARCHAR and TEXT columns (#2657)
Co-authored-by: Roman Nozdrin <rnozdrin@mariadb.com>
2022-12-13 18:13:22 +03:00
Leonid Fedorov
b936ed8b2e Fix some GCC-12 Build errors 2022-11-22 03:28:17 +03:00
mariadb-AndreyPiskunov
b57d2c30fe Minor fixes 2022-10-31 14:56:32 +02:00
mariadb-AndreyPiskunov
315e4be2d8 First working attempt for json_arrayagg 2022-10-31 14:56:32 +02:00
mariadb-AndreyPiskunov
1714b75434 Non working attempt to do MCOL-5227 2022-10-31 14:56:32 +02:00
Ziy1-Tan
cdd41f05f3 MCOL-785 Implement DISTRIBUTED JSON functions
The following functions are created:
Create function JSON_VALID and test cases
Create function JSON_DEPTH and test cases
Create function JSON_LENGTH and test cases
Create function JSON_EQUALS and test cases
Create function JSON_NORMALIZE and test cases
Create function JSON_TYPE and test cases
Create function JSON_OBJECT and test cases
Create function JSON_ARRAY and test cases
Create function JSON_KEYS and test cases
Create function JSON_EXISTS and test cases
Create function JSON_QUOTE/JSON_UNQUOTE and test cases
Create function JSON_COMPACT/DETAILED/LOOSE and test cases
Create function JSON_MERGE and test cases
Create function JSON_MERGE_PATCH and test cases
Create function JSON_VALUE and test cases
Create function JSON_QUERY and test cases
Create function JSON_CONTAINS and test cases
Create function JSON_ARRAY_APPEND and test cases
Create function JSON_ARRAY_INSERT and test cases
Create function JSON_INSERT/REPLACE/SET and test cases
Create function JSON_REMOVE and test cases
Create function JSON_CONTAINS_PATH and test cases
Create function JSON_OVERLAPS and test cases
Create function JSON_EXTRACT and test cases
Create function JSON_SEARCH and test cases

Note:
Some functions output differs from MDB because session variables that affects functions output,e.g JSON_QUOTE/JSON_UNQUOTE
This depends on MCOL-5212
2022-08-30 22:22:23 +08:00
Leonid Fedorov
d2432f9bf6 get rid of pointers for 128 fields 2022-08-26 15:12:22 +00:00
Gagan Goel
6a6fee5969 MCOL-5021 Followup.
Allow the compiler to inline the call to nextColValue() in column.cpp.
2022-08-18 19:35:35 +00:00
Gagan Goel
cbfdae3481 MCOL-5021 Code changes based on review feedback. 2022-08-05 14:40:50 -04:00
Gagan Goel
1355237ca3 MCOL-5021 Some minor fixes. 2022-08-05 14:40:50 -04:00
Gagan Goel
94e9f55940 MCOL-5021 Add a new member function to the DBRM class, DBRM::addToLBIDList().
This function iterates over lbidList (populated by an earlier call to
DBRM::getUncommittedExtentLBIDs()) to find those LBIDs which belong to
the AUX column. It then finds the corresponding LBIDs for all other columns
which belong to the same table as the AUX LBID and appends them to lbidList.
The updated lbidList is used by invalidateUncommittedExtentLBIDs() to update
the casual partitioning information.

DBRM::addToLBIDList() only comes into play in case of a transaction ROLLBACK.
2022-08-05 14:40:50 -04:00
Gagan Goel
439db48c5a MCOL-5021 Add support for the AUX column in TRUNCATE table processing. 2022-08-05 14:40:49 -04:00
Gagan Goel
ea1861fdb5 MCOL-5021 Add a new function to CalpontSystemCatalog class,
isAUXColumnOID(), to check if a given OID is an auxilliary
column OID.
2022-08-05 14:40:49 -04:00
Gagan Goel
262cd5c501 MCOL-5021 Remove hard-coded values for data type, column width
and compression type for the AUX column, and replace them with
constants defined in the execplan namespace.
2022-08-05 14:40:49 -04:00
Gagan Goel
86df9a972c MCOL-5021 Add prototype support for the AUX column in CREATE/DROP
DDL commands, single and multi-value INSERTs, cpimport, and
DELETE.
2022-08-05 14:40:49 -04:00
Leonid Fedorov
39c43a0f70 <unnamed>.execplan::CalpontSystemCatalog::TableName::create_date' may be used uninitialized 2022-07-11 22:27:25 +02:00
Denis Khalikov
467fe0b401 [MCOL-5109] Make a singleton from ServicePrimProc.
This patch makes a singleton from ServicePrimProc.
2022-06-07 13:27:45 +03:00
Leonid Fedorov
65252df4f6 C++20 fixes 2022-03-28 12:32:29 +00:00
Serguey Zefirov
53b9a2a0f9 MCOL-4580 extent elimination for dictionary-based text/varchar types
The idea is relatively simple - encode prefixes of collated strings as
integers and use them to compute extents' ranges. Then we can eliminate
extents with strings.

The actual patch does have all the code there but miss one important
step: we do not keep collation index, we keep charset index. Because of
this, some of the tests in the bugfix suite fail and thus main
functionality is turned off.

The reason of this patch to be put into PR at all is that it contains
changes that made CHAR/VARCHAR columns unsigned. This change is needed in
vectorization work.
2022-03-02 23:53:39 +03:00
Leonid Fedorov
3919c541ac New warnfixes (#2254)
* Fix clang warnings

* Remove vim tab guides

* initialize variables

* 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length

* Fix ISO C++17 does not allow 'register' storage class specifier for outdated bison

* chars are unsigned on ARM, having  if (ival < 0) always false

* chars are unsigned by default on ARM and comparison with -1 if always true
2022-02-17 13:08:58 +03:00
Roman Nozdrin
15a87ee510 Merge pull request #2257 from tntnatbry/MCOL-4957
MCOL-4957 Fix performance slowdown for processing TIMESTAMP columns.
2022-02-16 19:46:57 +02:00
Leonid Fedorov
9b686c04e1 wrong return type 2022-02-15 14:23:08 +00:00
Gagan Goel
973e5024d8 MCOL-4957 Fix performance slowdown for processing TIMESTAMP columns.
Part 1:
 As part of MCOL-3776 to address synchronization issue while accessing
 the fTimeZone member of the Func class, mutex locks were added to the
 accessor and mutator methods. However, this slows down processing
 of TIMESTAMP columns in PrimProc significantly as all threads across
 all concurrently running queries would serialize on the mutex. This
 is because PrimProc only has a single global object for the functor
 class (class derived from Func in utils/funcexp/functor.h) for a given
 function name. To fix this problem:

   (1) We remove the fTimeZone as a member of the Func derived classes
   (hence removing the mutexes) and instead use the fOperationType
   member of the FunctionColumn class to propagate the timezone values
   down to the individual functor processing functions such as
   FunctionColumn::getStrVal(), FunctionColumn::getIntVal(), etc.

   (2) To achieve (1), a timezone member is added to the
   execplan::CalpontSystemCatalog::ColType class.

Part 2:
 Several functors in the Funcexp code call dataconvert::gmtSecToMySQLTime()
 and dataconvert::mySQLTimeToGmtSec() functions for conversion between seconds
 since unix epoch and broken-down representation. These functions in turn call
 the C library function localtime_r() which currently has a known bug of holding
 a global lock via a call to __tz_convert. This significantly reduces performance
 in multi-threaded applications where multiple threads concurrently call
 localtime_r(). More details on the bug:
   https://sourceware.org/bugzilla/show_bug.cgi?id=16145

 This bug in localtime_r() caused processing of the Functors in PrimProc to
 slowdown significantly since a query execution causes Functors code to be
 processed in a multi-threaded manner.

 As a fix, we remove the calls to localtime_r() from gmtSecToMySQLTime()
 and mySQLTimeToGmtSec() by performing the timezone-to-offset conversion
 (done in dataconvert::timeZoneToOffset()) during the execution plan
 creation in the plugin. Note that localtime_r() is only called when the
 time_zone system variable is set to "SYSTEM".

 This fix also required changing the timezone type from a std::string to
 a long across the system.
2022-02-14 14:12:27 -05:00
Leonid Fedorov
04752ec546 clang format apply 2022-01-21 16:43:49 +00:00
Leonid Fedorov
01f3ceb437 replace header guards with #pragma once 2022-01-21 15:24:58 +00:00
Roman Nozdrin
05897948e4 MCOL-4899 MCS now applies a correct collation running IN for character data types 2022-01-05 12:00:01 +00:00
Alexander Barkov
fa9f18553a MCOL-4728 Query with unusual use of aggregate functions on ColumnStore table crashes MariaDB Server
After an AggreateColumn corresponding to SUM(1+1) is created,
it is pushed to the list:

    gwi.count_asterisk_list.push_back(ac)

Later, in getSelectPlan(), the expression SUM(1+1) was erroneously
treated as a constant:

  if (!hasNonSupportItem && !nonConstFunc(ifp) && !(parseInfo & AF_BIT) && tmpVec.size() == 0)
  {
     srcp.reset(buildReturnedColumn(item, gwi, gwi.fatalParseError));

This code freed the original AggregateColumn and replaced to a ConstantColumn.

But gwi.count_asterisk_list still pointer to the freed AggregateColumn().

The expression SUM(1+1) was treated as a constant because tmpVec
was empty due to a bug in this code:

                    // special handling for count(*). This should not be treated as constant.
                    if (isp->argument_count() == 1 &&
                            ( sfitempp[0]->type() == Item::CONST_ITEM &&
                                (sfitempp[0]->cmp_type() == INT_RESULT ||
                                 sfitempp[0]->cmp_type() == STRING_RESULT ||
                                 sfitempp[0]->cmp_type() == REAL_RESULT ||
                                 sfitempp[0]->cmp_type() == DECIMAL_RESULT)
                            )
                        )
                    {
                        field_vec.push_back((Item_field*)item); //dummy

Notice, it handles only aggregate functions with explicit literals
passed as an argument, while it does not handle constant expressions
such as 1+1.

Fix:

- Adding new classes ConstantColumnNull, ConstantColumnString,
  ConstantColumnNum, ConstantColumnUInt, ConstantColumnSInt,
  ConstantColumnReal, ValStrStdString, to reuse the code easier.

- Moving a part of the code from the case branch handling CONST_ITEM
  in buildReturnedColumn() into a new function
  newConstantColumnNotNullUsingValNativeNoTz(). This
  makes the code easier to read and to reuse in the future.

- Adding a new function newConstantColumnMaybeNullFromValStrNoTz().
  Removing dulplicate code from !!!four!!! places, using the new
  function instead.

- Adding a function isSupportedAggregateWithOneConstArg() to
  properly catch all constant expressions. Using the new function parse_item()
  in the code commented as "special handling for count(*)".
  Now it pushes all constant expressions to field_vec, not only
  explicit literals.

- Moving a part of the code from buildAggregateColumn()
  to a helper function processAggregateColumnConstArg().
  Using processAggregateColumnConstArg() in the CONST_ITEM
  and NULL_ITEM branches.

- Adding a new branch in buildReturnedColumn() handling FUNC_ITEM.
  If a function has constant arguments, a ConstantColumn() is
  immediately created, without going to
  buildArithmeticColumn()/buildFunctionColumn().

- Reusing isSupportedAggregateWithOneConstArg()
  and processAggregateColumnConstArg() in buildAggregateColumn().
  A new branch catches aggregate function has only one constant argument
  and immediately creates a single ConstantColumn without
  traversing to the argument sub-components.
2021-09-21 14:00:56 +04:00
Leonid Fedorov
5c5f103f98 MCOL-4839: Fix clang build (#2100)
* Fix clang build

* Extern C returned to plugin_instance

Co-authored-by: Leonid Fedorov <l.fedorov@mail.corp.ru>
2021-08-23 10:45:10 -05:00
David Hall
a202bda485 MCOL-4719 iterate into subquery looking for windowfunctions
When an outer query filter accesses an subquery column that contains an aggregate or a window function, certain optimizations can't be performed. We had been looking at the surface of the returned column. We now iterate into any functions or operations looking for aggregates and window functions.
2021-07-22 13:56:21 -05:00
Gagan Goel
b3a560300c Revert "Merge pull request #2022 from mariadb-corporation/bar-develop-MCOL-4791"
This reverts commit 4016e25e5b, reversing
changes made to 85435f6b1e.
2021-07-13 11:06:56 +00:00
Leonid Fedorov
f81f743282 Replace underlying type for avg and sum for int types from long double to wide decimal 2021-07-08 17:04:43 +00:00
Roman Nozdrin
7b4f759592 Merge pull request #2032 from drrtuy/MCOL-4802
MCOL-4802 Removed ByteStream methods for bool and add some logging in…
2021-07-07 13:03:54 +03:00
Roman Nozdrin
fb5ba84212 MCOL-4802 Removed ByteStream methods for bool manipulations and add some logging into I_S.columnstore_files 2021-07-07 07:16:30 +00:00
Alexander Barkov
9794f24369 MCOL-4801 Replace Row methods getStringLength() and getStringPointer() to getConstString() 2021-07-06 21:15:32 +04:00
Gagan Goel
8520f87237 MCOL-641 Cleanup. 2021-07-06 09:01:49 +00:00
Roman Nozdrin
6dc356ed60 Merge pull request #1989 from denis0x0D/MCOL-4713
MCOL-4713 Analyze table implementation.
2021-07-02 16:17:07 +03:00