1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-06-09 06:41:19 +03:00

25 Commits

Author SHA1 Message Date
Aleksei Antipovskii
0ab03c7258 chore(codestyle): mark virtual methods as override 2025-02-21 20:01:34 +04:00
Sergey Zefirov
920607520c
feat(runtime)!: MCOL-678 A "GROUP BY ... WITH ROLLUP" support
Adds a special column which helps to differentiate data and rollups of
various depts and a simple logic to row aggregation to add processing of
subtotals.
2023-09-26 17:01:53 +03:00
Denis Khalikov
896e8dd769
MCOL-5522 Properly process pm join result count. (#2909)
This patch:
1. Properly processes situation when pm join result count is exceeded.
2. Adds session variable 'columnstore_max_pm_join_result_count` to control the limit.
2023-08-04 16:55:45 +03:00
Leonid Fedorov
65cde8c894
feature: pron (#2908)
* feature: Special dictionary, we can pass with session veriable to modify codepaths and behaviour for testing and debugging
2023-07-21 14:02:03 +03:00
Denis Khalikov
1f190a6e75 MCOL-5477 Disk join step improvement.
This patch:
1. Handles corner case when the bucket exceeded the memory limit, but we cannot redistribute the data in this bucket into new buckets based on a hash algorithm, because the rows have the same values.
2. Adds force option for disk join step.
3. Add a option to contol the depth of the partition tree.
2023-06-23 18:40:15 +03:00
Leonid Fedorov
8f93fc3623
MCOL-5493: First portion of UBSan fixes (#2842)
Multiple UB fixes
2023-06-02 17:02:09 +03:00
Leonid Fedorov
3919c541ac
New warnfixes (#2254)
* Fix clang warnings

* Remove vim tab guides

* initialize variables

* 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length

* Fix ISO C++17 does not allow 'register' storage class specifier for outdated bison

* chars are unsigned on ARM, having  if (ival < 0) always false

* chars are unsigned by default on ARM and comparison with -1 if always true
2022-02-17 13:08:58 +03:00
Gagan Goel
973e5024d8 MCOL-4957 Fix performance slowdown for processing TIMESTAMP columns.
Part 1:
 As part of MCOL-3776 to address synchronization issue while accessing
 the fTimeZone member of the Func class, mutex locks were added to the
 accessor and mutator methods. However, this slows down processing
 of TIMESTAMP columns in PrimProc significantly as all threads across
 all concurrently running queries would serialize on the mutex. This
 is because PrimProc only has a single global object for the functor
 class (class derived from Func in utils/funcexp/functor.h) for a given
 function name. To fix this problem:

   (1) We remove the fTimeZone as a member of the Func derived classes
   (hence removing the mutexes) and instead use the fOperationType
   member of the FunctionColumn class to propagate the timezone values
   down to the individual functor processing functions such as
   FunctionColumn::getStrVal(), FunctionColumn::getIntVal(), etc.

   (2) To achieve (1), a timezone member is added to the
   execplan::CalpontSystemCatalog::ColType class.

Part 2:
 Several functors in the Funcexp code call dataconvert::gmtSecToMySQLTime()
 and dataconvert::mySQLTimeToGmtSec() functions for conversion between seconds
 since unix epoch and broken-down representation. These functions in turn call
 the C library function localtime_r() which currently has a known bug of holding
 a global lock via a call to __tz_convert. This significantly reduces performance
 in multi-threaded applications where multiple threads concurrently call
 localtime_r(). More details on the bug:
   https://sourceware.org/bugzilla/show_bug.cgi?id=16145

 This bug in localtime_r() caused processing of the Functors in PrimProc to
 slowdown significantly since a query execution causes Functors code to be
 processed in a multi-threaded manner.

 As a fix, we remove the calls to localtime_r() from gmtSecToMySQLTime()
 and mySQLTimeToGmtSec() by performing the timezone-to-offset conversion
 (done in dataconvert::timeZoneToOffset()) during the execution plan
 creation in the plugin. Note that localtime_r() is only called when the
 time_zone system variable is set to "SYSTEM".

 This fix also required changing the timezone type from a std::string to
 a long across the system.
2022-02-14 14:12:27 -05:00
Leonid Fedorov
04752ec546 clang format apply 2022-01-21 16:43:49 +00:00
Leonid Fedorov
01f3ceb437 replace header guards with #pragma once 2022-01-21 15:24:58 +00:00
Denis Khalikov
c20015a7b2 MCOL-4713 Analyze table implementation. 2021-07-02 12:37:12 +03:00
Gagan Goel
8a03e6c7d1 MCOL-4543 Subquery optimization.
For a query of the form:

SELECT COUNT(c2) FROM (SELECT * FROM t1) q;

where t1 contains 10 columns c1, c2, ... , c10.

We currently create an intermediate RowGroup in ExeMgr with
a row of the form (1, c2_value1, 1, 1, 1, 1, 1, 1, 1, 1), i.e.
for all the columns of the subquery which are not referenced in
the outer query, we substitute a constant value, which is wasteful.

With this optimization, we are trimming the RowGroup to a row
of the form (1, c2_value1). This can have non-trivial query
execution time improvements if the subquery contains large number
of columns (such as a "select *" on a very wide table) and the outer
query is only referencing a subset of these columns with lower
index values from the subquery (as an example, c1 or c2 above).
That is, the current limitation of this optimization is we are not
removing those non-referenced subquery columns (c1 in the query above)
which are to the left of a referenced column.
2021-03-29 11:56:04 +00:00
David Hall
35c4b66a67 MCOL-4144 Enable lower_case_table_names
Create tables and schemas with lower case name only if the flag is set.
During operations, convert to lowercase in plugin. Byt the time a query gets to ExeMgr, DDLProc etc., everything must be lower case if the flag is set, and undisturbed if not.
2020-09-24 15:21:13 -05:00
Patrick LeBlanc
cc7251d9db
Merge pull request #1306 from benthompson15/MCOL-4030
MCOL-4030
2020-06-26 09:49:54 -05:00
benthompson15
eac7dab096 MCOL-4030: first commit of warning removals unneed const and missing virtual dtors. 2020-06-23 13:51:36 -05:00
Gagan Goel
b48cf64b78 MCOL-4043 Fix memory leaks - 1 (second attempt)
simpleScalarFilterToParseTree() performs a dynamic allocation
of a ParseTree object, but this memory is never freed later.
We now keep track of this allocation and perform the delete
in ~CSEP/CSEP::unserialize() after the query finishes.
2020-06-19 15:39:49 -04:00
Andrew Hutchings
49994f7bc3 Fix warnings found in DEBUG combined build
Fixes:
* Irrelevant where conditions
* Irrelevant const
* A potential infinite loop in treenode
* Bad implicit case fallthroughs
* Explicit markings for required case fallthroughs
* Unused variables
* Unused function

Also disabled some warnings for now which we should fix later.
2019-12-10 16:33:08 +00:00
Roman Nozdrin
7b5e5f0eb6 MCOL-894 Upmerged the fist part of the patch into develop.
MCOL-894 Add default values in Compare and CSEP ctors to activate UTF-8 sorting
    properly.

MCOL-894 Unit tests to build a framework for a new parallel sorting.

MCOL-894 Finished with parallel workers invocation.
     The implementation lacks final aggregation step.

MCOL-894 TupleAnnexStep's init and destructor are now parallel execution aware.

    Implemented final merging step for parallel execution finalizeParallelOrderBy().

    Templated unit test to use it with arbitrary number of rows, threads.

Reuse LimitedOrderBy in the final step

MCOL-894 Cleaned up finalizeParallelOrderBy.

MCOL-894 Add and propagate thread variable that controls a number of threads.

    Optimized comparators used for sorting and add corresponding UTs.

    Refactored TupleAnnexStep::finalizeParallelOrderByDistinct.

    Parallel sorting methods now preallocates memory in batches.

MCOL-894 Fixed comparator for StringCompare.
2019-11-05 15:23:43 +03:00
Andrew Hutchings
811909aa72 Merge branch 'develop-1.2' into develop-merge-up-20190729 2019-07-29 12:19:26 +01:00
Andrew Hutchings
cddb776bd4 Merge branch 'develop-1.1' into develop-1.2-merge-up-20190619 2019-06-19 18:34:43 +01:00
Andrew Hutchings
7d22a5945c MCOL-1989 Fix view in view subquery outer join
A view calling a view as part of a subquery outer join was not getting
the view name for the derived table columns. Which caused ColumnStore to
think it was joining outside of the view and triggered a missing column
error.

This fix adds the view name from the subquery if one cannot be obtained
from the field object.
2019-06-14 15:52:30 +01:00
Gagan Goel
e89d1ac3cf MCOL-265 Add support for TIMESTAMP data type 2019-04-23 00:00:09 -04:00
Roman Nozdrin
07561c43d7 MCOL-1052 LIMIT processing refactoring in getGroupPlan(). 2018-08-30 17:03:14 +03:00
Andrew Hutchings
01446d1e22 Reformat all code to coding standard 2017-10-26 17:18:17 +01:00
david hill
f6afc42dd0 the begginning 2016-01-06 14:08:59 -06:00