mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-06-12 05:01:56 +03:00

242 Commits

Author SHA1 Message Date
c7e67aedd9 Renamed variables + removed server tests 2022-06-03 15:30:25 +03:00
66c69c7609 Welford's algorithm STD and VAR on window functions 2022-06-03 15:29:30 +03:00
c5fa27475d Welford algorithm for STD and VAR
The naive algorithm for calculating STD and VAR is subject to catastrophic
cancellation. The well-known Welford's algorithm is used instead.
2022-06-03 15:29:30 +03:00
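The Welford update named in the commit above can be sketched as follows. This is a minimal illustration, not the engine's code: the type `RunningStats` and its members are hypothetical names, and returning 0 for an empty input is a simplification (SQL aggregates would return NULL there).

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical accumulator illustrating Welford's single-pass update.
struct RunningStats
{
  std::uint64_t count = 0;
  long double mean = 0.0L;
  long double m2 = 0.0L;  // running sum of squared deviations from the mean

  void add(long double x)
  {
    ++count;
    const long double delta = x - mean;  // deviation from the old mean
    mean += delta / count;
    m2 += delta * (x - mean);            // second factor uses the new mean
  }

  // Population variance; 0 for an empty set in this sketch.
  long double varPop() const { return count ? m2 / count : 0.0L; }

  // Sample variance; 0 when fewer than two values were seen.
  long double varSamp() const { return count > 1 ? m2 / (count - 1) : 0.0L; }

  long double stdPop() const { return std::sqrt(varPop()); }
};
```

Because the update never subtracts two large, nearly equal sums (as the "sum of squares minus square of sum" formula does), values with a large common offset do not lose their variance to rounding.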
f28e00c206 No repeating code in client_udfs + better test 2022-03-28 21:48:47 +03:00
876a66cbc3 Added crude tests for cal/mcs client UDFs 2022-03-28 21:48:47 +03:00
53b9a2a0f9 MCOL-4580 extent elimination for dictionary-based text/varchar types
The idea is relatively simple - encode prefixes of collated strings as
integers and use them to compute extents' ranges. Then we can eliminate
extents with strings.

The actual patch does have all the code there but misses one important
step: we do not keep the collation index, we keep the charset index. Because
of this, some of the tests in the bugfix suite fail and thus the main
functionality is turned off.

The reason this patch is put into a PR at all is that it contains
changes that make CHAR/VARCHAR columns unsigned. This change is needed for
the vectorization work.
2022-03-02 23:53:39 +03:00
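The prefix-encoding idea from the commit above can be sketched as follows. This is only an illustration under a strong assumption: it treats strings as raw bytes (a binary collation), whereas the commit talks about collated strings, which would need collation sort keys. The names `encodePrefix`, `PrefixRange` and `canSkipExtentForEquality` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Pack the first 8 bytes of a string, zero-padded, into a big-endian
// unsigned integer. For byte-wise ordering, a <= b implies
// encodePrefix(a) <= encodePrefix(b).
inline std::uint64_t encodePrefix(std::string_view s)
{
  std::uint64_t v = 0;
  for (std::size_t i = 0; i < 8; ++i)
  {
    const unsigned char c = i < s.size() ? static_cast<unsigned char>(s[i]) : 0;
    v = (v << 8) | c;
  }
  return v;
}

// Hypothetical per-extent range of encoded prefixes.
struct PrefixRange
{
  std::uint64_t min;
  std::uint64_t max;
};

// True when no string in the extent can equal `needle`,
// judged on the encoded prefixes alone.
inline bool canSkipExtentForEquality(const PrefixRange& r, std::string_view needle)
{
  const std::uint64_t p = encodePrefix(needle);
  return p < r.min || p > r.max;
}
```

The encoded values are compared as unsigned integers on purpose: bytes at or above 0x80 would otherwise sort below ASCII characters.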
4b412d4e09 MCOL-4940: test case for ROUND fix (#2271) 2022-02-21 16:09:31 -06:00
973e5024d8 MCOL-4957 Fix performance slowdown for processing TIMESTAMP columns.
Part 1:
 As part of MCOL-3776 to address a synchronization issue while accessing
 the fTimeZone member of the Func class, mutex locks were added to the
 accessor and mutator methods. However, this slows down processing
 of TIMESTAMP columns in PrimProc significantly as all threads across
 all concurrently running queries would serialize on the mutex. This
 is because PrimProc only has a single global object for the functor
 class (class derived from Func in utils/funcexp/functor.h) for a given
 function name. To fix this problem:

   (1) We remove the fTimeZone as a member of the Func derived classes
   (hence removing the mutexes) and instead use the fOperationType
   member of the FunctionColumn class to propagate the timezone values
   down to the individual functor processing functions such as
   FunctionColumn::getStrVal(), FunctionColumn::getIntVal(), etc.

   (2) To achieve (1), a timezone member is added to the
   execplan::CalpontSystemCatalog::ColType class.

Part 2:
 Several functors in the Funcexp code call dataconvert::gmtSecToMySQLTime()
 and dataconvert::mySQLTimeToGmtSec() functions for conversion between seconds
 since unix epoch and broken-down representation. These functions in turn call
 the C library function localtime_r() which currently has a known bug of holding
 a global lock via a call to __tz_convert. This significantly reduces performance
 in multi-threaded applications where multiple threads concurrently call
 localtime_r(). More details on the bug:
   https://sourceware.org/bugzilla/show_bug.cgi?id=16145

 This bug in localtime_r() caused processing of the Functors in PrimProc to
 slow down significantly, since query execution runs the Functor code in
 multiple threads.

 As a fix, we remove the calls to localtime_r() from gmtSecToMySQLTime()
 and mySQLTimeToGmtSec() by performing the timezone-to-offset conversion
 (done in dataconvert::timeZoneToOffset()) during the execution plan
 creation in the plugin. Note that localtime_r() is only called when the
 time_zone system variable is set to "SYSTEM".

 This fix also required changing the timezone type from a std::string to
 a long across the system.
2022-02-14 14:12:27 -05:00
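Part 2 of the commit above relies on a numeric offset computed once at plan creation, so that per-row conversion needs no C library time call. A minimal sketch of such a conversion, using pure integer arithmetic (the widely used days-to-civil-date algorithm), is shown below. `BrokenDownTime`, `floorDiv` and `gmtSecToBrokenDown` are hypothetical names and do not reproduce the signature of dataconvert::gmtSecToMySQLTime().

```cpp
#include <cstdint>

// Hypothetical broken-down time; not the engine's MySQLTime type.
struct BrokenDownTime
{
  int year, month, day, hour, minute, second;
};

// Division rounding toward negative infinity (b must be positive here).
inline std::int64_t floorDiv(std::int64_t a, std::int64_t b)
{
  std::int64_t q = a / b;
  if (a % b != 0 && a < 0)
    --q;
  return q;
}

// Convert seconds since the Unix epoch plus a fixed offset (in seconds)
// to a proleptic Gregorian date and time, without touching any global state.
inline BrokenDownTime gmtSecToBrokenDown(std::int64_t gmtSeconds, long offsetSeconds)
{
  const std::int64_t local = gmtSeconds + offsetSeconds;
  const std::int64_t days = floorDiv(local, 86400);
  const std::int64_t rem = local - days * 86400;  // 0..86399

  const std::int64_t z = days + 719468;           // shift epoch to 0000-03-01
  const std::int64_t era = floorDiv(z, 146097);   // 400-year cycles
  const std::int64_t doe = z - era * 146097;      // day of era
  const std::int64_t yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
  const std::int64_t doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
  const std::int64_t mp = (5 * doy + 2) / 153;    // month index, March = 0

  BrokenDownTime t;
  t.day = static_cast<int>(doy - (153 * mp + 2) / 5 + 1);
  t.month = static_cast<int>(mp < 10 ? mp + 3 : mp - 9);
  t.year = static_cast<int>(yoe + era * 400 + (t.month <= 2 ? 1 : 0));
  t.hour = static_cast<int>(rem / 3600);
  t.minute = static_cast<int>(rem % 3600 / 60);
  t.second = static_cast<int>(rem % 60);
  return t;
}
```

A fixed offset cannot represent daylight-saving transitions, which is consistent with the commit's note that localtime_r() is still needed when time_zone is "SYSTEM".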
7cfcdf365d Merge pull request #2211 from drrtuy/MCOL-4899-dev
MCOL-4899 MCS now applies a correct collation running IN for characte…
2022-01-06 21:08:11 +03:00
05897948e4 MCOL-4899 MCS now applies a correct collation running IN for character data types 2022-01-05 12:00:01 +00:00
695b437730 The goal is to migrate the last offending regr test001 test case into MTR to make test001 green 2021-12-30 19:12:58 +00:00
8cab54fe31 MCOL-4868 Move test cases for MCOL-4264 to MTR. 2021-12-20 18:34:08 +00:00
a31066ff0e MCOL-4871 This patch adds relevant tests 2021-12-17 17:41:07 +00:00
7f456e58cc MCOL-4868 UPDATE on a ColumnStore table containing an IN-subquery
on a non-ColumnStore table does not work.

As part of MCOL-4617, we moved the in-to-exists predicate creation
and injection from the server into the engine. However, when a query
with an IN subquery contains a non-ColumnStore table, the server
still performs the in-to-exists predicate transformation for the
foreign engine table. This caused ColumnStore's execution plan to
contain incorrect WHERE predicates. As a fix, we call
mutate_optimizer_flags() for the WRITE lock, in addition to the READ
table lock. And in mutate_optimizer_flags(), we change the optimizer
flag from OPTIMIZER_SWITCH_IN_TO_EXISTS to OPTIMIZER_SWITCH_MATERIALIZATION.
2021-12-16 23:11:26 +00:00
340a90fc8d MCOL-4874 Crossengine JOIN involving a ColumnStore table and a
wide decimal column in a non-ColumnStore table throws an exception.

ROW::getSignedNullValue() method does not support wide decimal fields
yet. To fix this exception, we remove the call to this method from
CrossEngineStep::setField().
2021-12-08 22:26:52 +00:00
fa9f18553a MCOL-4728 Query with unusual use of aggregate functions on ColumnStore table crashes MariaDB Server
After an AggregateColumn corresponding to SUM(1+1) is created,
it is pushed to the list:

    gwi.count_asterisk_list.push_back(ac)

Later, in getSelectPlan(), the expression SUM(1+1) was erroneously
treated as a constant:

  if (!hasNonSupportItem && !nonConstFunc(ifp) && !(parseInfo & AF_BIT) && tmpVec.size() == 0)
  {
     srcp.reset(buildReturnedColumn(item, gwi, gwi.fatalParseError));

This code freed the original AggregateColumn and replaced it with a ConstantColumn.

But gwi.count_asterisk_list still pointed to the freed AggregateColumn.

The expression SUM(1+1) was treated as a constant because tmpVec
was empty due to a bug in this code:

                    // special handling for count(*). This should not be treated as constant.
                    if (isp->argument_count() == 1 &&
                            ( sfitempp[0]->type() == Item::CONST_ITEM &&
                                (sfitempp[0]->cmp_type() == INT_RESULT ||
                                 sfitempp[0]->cmp_type() == STRING_RESULT ||
                                 sfitempp[0]->cmp_type() == REAL_RESULT ||
                                 sfitempp[0]->cmp_type() == DECIMAL_RESULT)
                            )
                        )
                    {
                        field_vec.push_back((Item_field*)item); //dummy

Notice, it handles only aggregate functions with explicit literals
passed as an argument, while it does not handle constant expressions
such as 1+1.

Fix:

- Adding new classes ConstantColumnNull, ConstantColumnString,
  ConstantColumnNum, ConstantColumnUInt, ConstantColumnSInt,
  ConstantColumnReal, ValStrStdString, to make the code easier to reuse.

- Moving a part of the code from the case branch handling CONST_ITEM
  in buildReturnedColumn() into a new function
  newConstantColumnNotNullUsingValNativeNoTz(). This
  makes the code easier to read and to reuse in the future.

- Adding a new function newConstantColumnMaybeNullFromValStrNoTz().
  Removing duplicate code from !!!four!!! places, using the new
  function instead.

- Adding a function isSupportedAggregateWithOneConstArg() to
  properly catch all constant expressions. Using the new function parse_item()
  in the code commented as "special handling for count(*)".
  Now it pushes all constant expressions to field_vec, not only
  explicit literals.

- Moving a part of the code from buildAggregateColumn()
  to a helper function processAggregateColumnConstArg().
  Using processAggregateColumnConstArg() in the CONST_ITEM
  and NULL_ITEM branches.

- Adding a new branch in buildReturnedColumn() handling FUNC_ITEM.
  If a function has constant arguments, a ConstantColumn() is
  immediately created, without going to
  buildArithmeticColumn()/buildFunctionColumn().

- Reusing isSupportedAggregateWithOneConstArg()
  and processAggregateColumnConstArg() in buildAggregateColumn().
  A new branch catches an aggregate function that has only one constant argument
  and immediately creates a single ConstantColumn without
  traversing to the argument sub-components.
2021-09-21 14:00:56 +04:00
83c0c84aea This patch adds MTR's --sorted_result option to make the tests outputs deterministic (#2107) 2021-09-01 23:43:29 +03:00
d1ef83c0f4 MCOL-3741 dev Change mtr IDB-xxx error codes to MCS-xxxx 2021-08-10 12:01:36 -05:00
c16b0f6ad7 MCOL-4823 WHERE char_col<varchar_col returns a wrong result of a large table (#2060)
SCommand StrFilterCmd::duplicate() missed these two lines:

    filterCmd->leftColType = leftColType;
    filterCmd->rightColType = rightColType;

which exist in the parent's FilterCommand::duplicate().

Rewriting the code to avoid duplication by using more inherited
methods/constructors. This reduces the probability of similar bugs
in the future.
2021-08-03 11:53:05 +03:00
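The class of bug fixed in the commit above (a derived duplicate() that forgets to copy base-class members) can be avoided by cloning through copy constructors. The sketch below uses simplified stand-ins: the class names echo the commit, but the members, their `int` types and `charLength` are placeholders, not the real definitions.

```cpp
#include <memory>

// Simplified stand-in for the base command; member types are placeholders.
struct FilterCommand
{
  int leftColType = 0;
  int rightColType = 0;

  virtual ~FilterCommand() = default;

  virtual std::unique_ptr<FilterCommand> duplicate() const
  {
    return std::make_unique<FilterCommand>(*this);
  }
};

// Simplified stand-in for the string filter command.
struct StrFilterCmd : FilterCommand
{
  int charLength = 0;  // hypothetical extra member

  // The copy constructor copies the base sub-object as well, so members
  // added to FilterCommand later cannot be forgotten here.
  std::unique_ptr<FilterCommand> duplicate() const override
  {
    return std::make_unique<StrFilterCmd>(*this);
  }
};
```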
a202bda485 MCOL-4719 iterate into subquery looking for windowfunctions
When an outer query filter accesses a subquery column that contains an aggregate or a window function, certain optimizations can't be performed. Previously we looked only at the top level of the returned column. We now iterate into any functions or operations, looking for aggregates and window functions.
2021-07-22 13:56:21 -05:00
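The change described in the commit above amounts to a recursive walk of the expression tree instead of a check of its root. A minimal sketch follows; `NodeKind`, `ExprNode`, `makeNode` and `containsAggregateOrWindow` are hypothetical and far simpler than the engine's execution-plan classes.

```cpp
#include <memory>
#include <utility>
#include <vector>

enum class NodeKind { Column, Constant, Function, Aggregate, WindowFunction };

// Hypothetical expression node: a kind plus child arguments.
struct ExprNode
{
  NodeKind kind = NodeKind::Column;
  std::vector<std::unique_ptr<ExprNode>> args;
};

inline std::unique_ptr<ExprNode> makeNode(NodeKind kind)
{
  auto n = std::make_unique<ExprNode>();
  n->kind = kind;
  return n;
}

// Checking only the root misses f(g(SUM(x))); descend into every argument.
inline bool containsAggregateOrWindow(const ExprNode& n)
{
  if (n.kind == NodeKind::Aggregate || n.kind == NodeKind::WindowFunction)
    return true;
  for (const auto& arg : n.args)
  {
    if (arg && containsAggregateOrWindow(*arg))
      return true;
  }
  return false;
}
```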
6e45c125c7 Merge pull request #2047 from dhall-MariaDB/develop
Disable the non-deterministic mcs211 test
2021-07-12 17:53:33 -05:00
10cc1159a2 Merge pull request #2040 from tntnatbry/MCOL-641-move-mtr-tests
MCOL-641 Move MTR tests from future/ to basic/
2021-07-12 15:47:45 -05:00
2b37c2c7bc Merge pull request #2046 from denis0x0D/MCOL-4786_fix_regression
[MCOL-4786] Fix filter comparison.
2021-07-12 13:51:42 -05:00
497d12e08f Remove the non-deterministic mcs211 test 2021-07-12 12:26:41 -05:00
dc51dbf6cf [MCOL-4786] Fix filter comparison.
Compare ParseTree by dereferencing pointers.
2021-07-12 19:18:02 +03:00
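Comparing trees "by dereferencing pointers", as the commit above puts it, means comparing node contents rather than addresses. A minimal sketch, assuming a hypothetical `TreeNode` type that stands in for the real ParseTree:

```cpp
#include <memory>
#include <string>

// Hypothetical binary tree node; the real ParseTree differs.
struct TreeNode
{
  std::string data;
  std::unique_ptr<TreeNode> left;
  std::unique_ptr<TreeNode> right;
};

// Structural equality: two distinct allocations with the same content
// compare equal, which a raw pointer comparison (a == b) would not give.
inline bool sameTree(const TreeNode* a, const TreeNode* b)
{
  if (a == b)
    return true;  // same node, or both null
  if (!a || !b)
    return false;
  return a->data == b->data && sameTree(a->left.get(), b->left.get()) &&
         sameTree(a->right.get(), b->right.get());
}
```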
78dbf3a7e1 MCOL-641 Move MTR tests from future/ to basic/ 2021-07-12 13:01:45 +00:00
3d557a2f1e Merge pull request #2044 from dhall-MariaDB/MCOL-3738
MCOL-3738 COUNT(DISTINCT) with multiple parms
2021-07-12 07:34:56 -04:00
69eec2cc0f MCOL-2044 Test sources made more resilient 2021-07-09 18:21:08 +03:00
76607be63a MCOL-3738 COUNT(DISTINCT) with multiple parms
Fixed regression
Added a few more mtr tests
2021-07-09 09:07:03 -05:00
4d265472ee Merge pull request #2036 from mariadb-SergeyZefirov/MCOL-4766-UPDATE-INSERT-in-a-transaction-does-not-revert-back-extent-ranges-on-a-rollback
MCOL-4766 ROLLBACK kept ranges changed inside rolled back transaction
2021-07-09 16:33:06 +03:00
adace6e0c7 MCOL-4786 Fix wrong comparison for the filters.
Fix wrong comparison for the filters while creating case function.
2021-07-09 12:18:26 +03:00
9e0851e4cf MCOL-4766 ROLLBACK kept ranges changed inside rolled back transaction
Now ROLLBACK drops ranges to the INVALID state, which makes the engine rescan
blocks and discover the correct ranges.
2021-07-07 18:16:56 +03:00
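The behaviour described in the commit above can be sketched with a small state machine: a range that is not valid must never be used for elimination, and a rescan restores it. All names here (`RangeState`, `ExtentRange`, `onRollback`, `canEliminate`, `rescan`) are hypothetical and do not mirror the extent map's real interface.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

enum class RangeState { Invalid, Valid };

// Hypothetical min/max range kept for one extent.
struct ExtentRange
{
  std::int64_t min = 0;
  std::int64_t max = 0;
  RangeState state = RangeState::Invalid;
};

// A rolled-back transaction may have widened the range; distrust it.
inline void onRollback(ExtentRange& r)
{
  r.state = RangeState::Invalid;
}

// An extent can be skipped only when its range is known to be correct.
inline bool canEliminate(const ExtentRange& r, std::int64_t value)
{
  if (r.state != RangeState::Valid)
    return false;  // must scan the blocks
  return value < r.min || value > r.max;
}

// Recompute the range from the values actually stored in the extent.
inline void rescan(ExtentRange& r, const std::vector<std::int64_t>& values)
{
  if (values.empty())
    return;  // nothing to derive a range from; stay Invalid
  const auto mm = std::minmax_element(values.begin(), values.end());
  r.min = *mm.first;
  r.max = *mm.second;
  r.state = RangeState::Valid;
}
```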
74bdf522d1 Merge pull request #1999 from mariadb-SergeyZefirov/MCOL-4741-in-like-equal-returns-different-result
MCOL-4741 Fix extentmap handling of string prefixes encoded as 64-bit integers
2021-07-05 04:33:32 -04:00
237cad347f MCOL-4758 Limit LONGTEXT and LONGBLOB to 16MB (#1995)
MCOL-4758 Limit LONGTEXT and LONGBLOB to 16MB

Also add the original test case from MCOL-3879.
2021-07-05 02:09:41 -04:00
a576981db0 MCOL-1205 Remove old tests.
This patch removes some tests which check for the `circular join` error.
2021-07-03 15:29:35 +03:00
0553093986 Merge pull request #2023 from dhall-MariaDB/MCOL-4789
MCOL-4789 mcs51_cpimport_select_from is not deterministic
2021-07-02 13:31:54 -05:00
c58136a32d MCOL-4741 in/like/equal(=) operations differ in results
This is due to signedness in the string range comparison in extentmap
and unsignedness everywhere else.
2021-07-02 19:22:46 +03:00
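The signedness mismatch named in the commit above can be reproduced with encoded string prefixes whose first byte is 0x80 or higher. The sketch below is illustrative only: the constants are hand-encoded 8-byte prefixes, and `inRangeUnsigned` / `inRangeSigned` are hypothetical helpers. The signed cast assumes a two's-complement representation (guaranteed since C++20, universal in practice before that).

```cpp
#include <cstdint>

// Zero-padded big-endian prefixes of "a", "m" and UTF-8 "é" (0xC3 0xA9).
constexpr std::uint64_t kPrefixA = 0x6100000000000000ULL;
constexpr std::uint64_t kPrefixM = 0x6D00000000000000ULL;
constexpr std::uint64_t kPrefixEAcute = 0xC3A9000000000000ULL;

// Range check that matches byte-wise string ordering.
inline bool inRangeUnsigned(std::uint64_t v, std::uint64_t lo, std::uint64_t hi)
{
  return lo <= v && v <= hi;
}

// The same check done on signed values: any prefix with the top bit set
// becomes negative and sorts below every ASCII prefix.
inline bool inRangeSigned(std::uint64_t v, std::uint64_t lo, std::uint64_t hi)
{
  const auto sv = static_cast<std::int64_t>(v);
  const auto slo = static_cast<std::int64_t>(lo);
  const auto shi = static_cast<std::int64_t>(hi);
  return slo <= sv && sv <= shi;
}
```

With a range of ["a", "é"], the unsigned check accepts "m" while the signed check rejects it, so two code paths that disagree on signedness return different rows for the same predicate.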
1d5f309b8f MCOL-1205 Support queries with circular joins
This patch adds support for queries with circular joins.
Currently, support is added for inner joins only.
2021-07-02 18:37:07 +03:00
20d90c6293 MCOL-4789 mcs51_cpimport_select_from is not deterministic
Testing script was left in.
2021-07-02 09:21:31 -05:00
6dc356ed60 Merge pull request #1989 from denis0x0D/MCOL-4713
MCOL-4713 Analyze table implementation.
2021-07-02 16:17:07 +03:00
c20015a7b2 MCOL-4713 Analyze table implementation. 2021-07-02 12:37:12 +03:00
6ae64bc750 MCOL-4789 Make mcs51_cpimport_select_from deterministic 2021-07-01 16:29:37 -05:00
325bb6c9e0 Merge pull request #1986 from tntnatbry/MCOL-1482
MCOL-1482 An UPDATE operation on a non-ColumnStore table involving a cross-engine join
2021-07-01 14:25:32 +03:00
132146b9c8 MCOL-3738 Allow COUNT(DISTINCT) to have multiple parms (#2002)
* MCOL-3738 allow COUNT(DISTINCT) multiple parameters
Changes in the way tupleaggregatestep sets up the aggregate arrays.

* MCOL-3738 mtr test
2021-06-28 20:14:44 +03:00
49255f5cbd MCOL-1482 An UPDATE operation on a non-ColumnStore table involving a
cross-engine join with a ColumnStore table errors out.

ColumnStore cannot directly update a foreign table. We detect whether
a multi-table UPDATE operation is performed on a foreign table; if so,
we do not create the select_handler and let the server execute the UPDATE
operation instead.
2021-06-25 15:27:54 +00:00
2de4888899 Merge pull request #1990 from drrtuy/MCOL-4173_9
MCOL-4173 This patch adds support for wide-DECIMAL INNER, OUTER, SEMI…
2021-06-24 16:15:07 +03:00
6620d873fd Merge pull request #1927 from denis0x0D/MCOL-4407
MCOL-4407 AND condition does not work when HWM > columnstore_string_scan_threshold - 1
2021-06-24 15:58:35 +03:00
bed0b7c6bc MCOL-4173 This patch adds support for wide-DECIMAL INNER, OUTER, SEMI, functional JOINs
based on top of TypelessData
2021-06-24 08:07:23 +00:00
8dd2f2937c MCOL-4407 AND condition does not work when HWM > columnstore_string_scan_threshold - 1 2021-06-21 14:04:26 +03:00
96f2a55eea Merge pull request #1970 from tntnatbry/MCOL-4525
MCOL-4525 Implement columnstore_select_handler=AUTO.
2021-06-14 10:43:34 +03:00