1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-08-07 03:22:57 +03:00
Commit Graph

33 Commits

Author SHA1 Message Date
Serguey Zefirov
bd1622f331 feat(MCOL-5886): support InnoDB's table partitions in cross-engine joins
The purpose of this changeset is to obtain list of partitions from
SELECT_LEX structure and pass it down to joblist and then to
CrossEngineStep to pass to InnoDB.
2025-04-23 08:24:10 +03:00
Aleksei Antipovskii
4bea7e59a0 feat(PrimProc): MCOL-5852 disk-based GROUP_CONCAT & JSON_ARRAYAGG
* move GROUP_CONCAT/JSON_ARRAYAGG storage to the RowGroup from
  the RowAggregation*
* internal data structures (de)serialization
* get rid of a specialized classes for processing JSON_ARRAYAGG
* move the memory accounting to disk-based aggregation classes
* allow aggregation generations to be used for queries with
  GROUP_CONCAT/JSON_ARRAYAGG
* Remove the thread id from the error message as it interferes with the mtr
2025-04-11 15:21:07 +02:00
Aleksei Antipovskii
0ab03c7258 chore(codestyle): mark virtual methods as override 2025-02-21 20:01:34 +04:00
Serguey Zefirov
39a976c39a fix(ubsan): MCOL-5844 - iron out UBSAN reports
The most important fix here is the fix of possible buffer overrun in
DATEFORMAT() function. A "%W" format, repeated enough times, would
overflow the 256-bytes buffer for result. Now we use ostringstream to
construct result and we are safe.

Changes in date/time projection functions made me fix difference between
us and server behavior. The new, better behavior is reflected in changes
in tests' results.

Also, there was incorrect logic in TRUNCATE() and ROUND() functions in
computing the decimal "shift."
2024-12-10 20:30:58 +04:00
Denis Khalikov
58e18eeb56 fix(aggregation): MCOL-5467 Add support for duplicate expressions in group by. (#3052)
This patch adds support for duplicate expressions (builtin_functions) with
one argument in select statement and group by statement.
2023-12-05 18:29:44 +03:00
Sergey Zefirov
920607520c feat(runtime)!: MCOL-678 A "GROUP BY ... WITH ROLLUP" support
Adds a special column which helps to differentiate data and rollups of
various depts and a simple logic to row aggregation to add processing of
subtotals.
2023-09-26 17:01:53 +03:00
Denis Khalikov
896e8dd769 MCOL-5522 Properly process pm join result count. (#2909)
This patch:
1. Properly processes situation when pm join result count is exceeded.
2. Adds session variable 'columnstore_max_pm_join_result_count` to control the limit.
2023-08-04 16:55:45 +03:00
Denis Khalikov
1f190a6e75 MCOL-5477 Disk join step improvement.
This patch:
1. Handles corner case when the bucket exceeded the memory limit, but we cannot redistribute the data in this bucket into new buckets based on a hash algorithm, because the rows have the same values.
2. Adds force option for disk join step.
3. Add a option to contol the depth of the partition tree.
2023-06-23 18:40:15 +03:00
david.hall
369aea884e MCOL-5311 Add timezone to jobList in subquerytransformer
TimeZone was uninitialized in this scenario and led to undefined behavior.
2022-12-07 09:52:00 -06:00
mariadb-AndreyPiskunov
1714b75434 Non working attempt to do MCOL-5227 2022-10-31 14:56:32 +02:00
Denis Khalikov
59166608b1 MCOL-4715 Mixed inner and outer joins with "null" filter for the table which is not involved into the outer join produces wrong results. 2022-08-16 17:13:03 +03:00
Denis Khalikov
e519cd7486 [MCOL-5061] Fix wrong join id assignment for the views. (#2474)
This patch fixes a wrong `join id` assignment for `TupleHashJoinStep` in a view.
After MCOL-334 CS assigns a '-1' as `join id` for `TupleHashJoinStep` in a view, and
in this case we cannot apply a filter for specific `Join step`, which is associated with `join id`
for 2 reasons:
1. Filters for all `TupleHashJoinSteps` associated with the same `join id`, which is '-1'.
2. When CS creates a `joinIdIndexMap` it eliminates all `join ids` which a less or equal 0.

This patch also fixes some tests for the view, which were generated wrong results.
2022-07-25 20:02:02 +03:00
Denis Khalikov
636e60b5f9 [MCOL-4699] Add support for circular outer joins. 2022-07-19 21:47:36 +03:00
Denis Khalikov
467fe0b401 [MCOL-5109] Make a singleton from ServicePrimProc.
This patch makes a singleton from ServicePrimProc.
2022-06-07 13:27:45 +03:00
David.Hall
bbb168a846 Mcol 4560 (#2337)
* MCOL-4560 remove unused xml entries and code that references it.
There is reader code and variables for some of these settings, but nobody uses them.
2022-04-18 18:00:17 -04:00
Gagan Goel
973e5024d8 MCOL-4957 Fix performance slowdown for processing TIMESTAMP columns.
Part 1:
 As part of MCOL-3776 to address synchronization issue while accessing
 the fTimeZone member of the Func class, mutex locks were added to the
 accessor and mutator methods. However, this slows down processing
 of TIMESTAMP columns in PrimProc significantly as all threads across
 all concurrently running queries would serialize on the mutex. This
 is because PrimProc only has a single global object for the functor
 class (class derived from Func in utils/funcexp/functor.h) for a given
 function name. To fix this problem:

   (1) We remove the fTimeZone as a member of the Func derived classes
   (hence removing the mutexes) and instead use the fOperationType
   member of the FunctionColumn class to propagate the timezone values
   down to the individual functor processing functions such as
   FunctionColumn::getStrVal(), FunctionColumn::getIntVal(), etc.

   (2) To achieve (1), a timezone member is added to the
   execplan::CalpontSystemCatalog::ColType class.

Part 2:
 Several functors in the Funcexp code call dataconvert::gmtSecToMySQLTime()
 and dataconvert::mySQLTimeToGmtSec() functions for conversion between seconds
 since unix epoch and broken-down representation. These functions in turn call
 the C library function localtime_r() which currently has a known bug of holding
 a global lock via a call to __tz_convert. This significantly reduces performance
 in multi-threaded applications where multiple threads concurrently call
 localtime_r(). More details on the bug:
   https://sourceware.org/bugzilla/show_bug.cgi?id=16145

 This bug in localtime_r() caused processing of the Functors in PrimProc to
 slowdown significantly since a query execution causes Functors code to be
 processed in a multi-threaded manner.

 As a fix, we remove the calls to localtime_r() from gmtSecToMySQLTime()
 and mySQLTimeToGmtSec() by performing the timezone-to-offset conversion
 (done in dataconvert::timeZoneToOffset()) during the execution plan
 creation in the plugin. Note that localtime_r() is only called when the
 time_zone system variable is set to "SYSTEM".

 This fix also required changing the timezone type from a std::string to
 a long across the system.
2022-02-14 14:12:27 -05:00
Leonid Fedorov
04752ec546 clang format apply 2022-01-21 16:43:49 +00:00
Leonid Fedorov
01f3ceb437 replace header guards with #pragma once 2022-01-21 15:24:58 +00:00
Gagan Goel
b48cf64b78 MCOL-4043 Fix memory leaks - 1 (second attempt)
simpleScalarFilterToParseTree() performs a dynamic allocation
of a ParseTree object, but this memory is never freed later.
We now keep track of this allocation and perform the delete
in ~CSEP/CSEP::unserialize() after the query finishes.
2020-06-19 15:39:49 -04:00
Gagan Goel
d333ac9e53 Revert "MCOL-4043 Fix memory leaks - 1" 2020-06-15 11:26:35 -04:00
Gagan Goel
881091c535 MCOL-4043 Fix memory leaks - 1
simpleScalarFilterToParseTree() performs a dynamic allocation
of a ParseTree object, but this memory is never freed later.
We now keep track of this allocation and perform the delete
in the JobList dtor after the query finishes.
2020-06-11 19:01:40 -04:00
David Hall
06e50e0926 MCOL-3536 collation 2020-05-26 12:42:11 -05:00
Roman Nozdrin
7b5e5f0eb6 MCOL-894 Upmerged the fist part of the patch into develop.
MCOL-894 Add default values in Compare and CSEP ctors to activate UTF-8 sorting
    properly.

MCOL-894 Unit tests to build a framework for a new parallel sorting.

MCOL-894 Finished with parallel workers invocation.
     The implementation lacks final aggregation step.

MCOL-894 TupleAnnexStep's init and destructor are now parallel execution aware.

    Implemented final merging step for parallel execution finalizeParallelOrderBy().

    Templated unit test to use it with arbitrary number of rows, threads.

Reuse LimitedOrderBy in the final step

MCOL-894 Cleaned up finalizeParallelOrderBy.

MCOL-894 Add and propagate thread variable that controls a number of threads.

    Optimized comparators used for sorting and add corresponding UTs.

    Refactored TupleAnnexStep::finalizeParallelOrderByDistinct.

    Parallel sorting methods now preallocates memory in batches.

MCOL-894 Fixed comparator for StringCompare.
2019-11-05 15:23:43 +03:00
Roman Nozdrin
6cdca1330b Merge pull request #808 from mariadb-corporation/develop-merge-up-20190729
Merge develop-1.2 into develop
2019-08-13 11:55:22 +03:00
Patrick LeBlanc
a09a9d5d0f Mass substitution 'Corporaton' -> 'Corporation' 2019-08-07 14:43:25 -05:00
Andrew Hutchings
811909aa72 Merge branch 'develop-1.2' into develop-merge-up-20190729 2019-07-29 12:19:26 +01:00
David Hall
4cf2c37c18 MCOL-3343 Handle windowfunction <arithmetic op> simple column 2019-06-20 09:24:58 -05:00
Gagan Goel
e89d1ac3cf MCOL-265 Add support for TIMESTAMP data type 2019-04-23 00:00:09 -04:00
David Hall
3f2c753947 MCOL-1822-c final checkin 2019-03-05 09:33:39 -06:00
David Hall
a2aa4b8479 MCOL-1822 Intermediate checkin. DISTINCT not working. 2019-02-25 14:54:46 -06:00
Andrew Hutchings
01446d1e22 Reformat all code to coding standard 2017-10-26 17:18:17 +01:00
Andrew Hutchings
ffcfc41563 MCOL-507 Further ExeMgr performance improvements
This does the following:

* Switch resource manager to a singleton which reduces the amount of
times the XML data is scanned and objects allocated.
* Make the I_S tables use the FE implementation of the system catalog
* Make the I_S.columnstore_columns table use the RID list cache
* Make the extentmap pre-allocate a vector instead of many small allocs
2017-01-16 12:33:27 +00:00
david hill
f6afc42dd0 the begginning 2016-01-06 14:08:59 -06:00