1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-04-20 09:07:44 +03:00

43 Commits

Author SHA1 Message Date
Aleksei Antipovskii
5556d818f8 chore(codestyle): mark virtual methods as override 2025-02-21 20:02:38 +04:00
drrtuy
8ae5a3da40
Fix/mcol 5787 rgdata buffer max size dev (#3325)
* fix(rowgroup): RGData now uses uint64_t counter for the fixed sizes columns data buf.
	The buffer can utilize > 4GB RAM that is necessary for PM side join.
	RGData ctor uses uint32_t allocating data buffer.
 	This fact causes implicit heap overflow.

* feat(bytestream,serdes): BS buffer size type is uint64_t
	This necessary to handle 64bit RGData, that comes as
	a separate patch. The pair of patches would allow to
	have PM joins when SmallSide size > 4GB.

* feat(bytestream,serdes): Distribute BS buf size data type change to avoid implicit data type narrowing

* feat(rowgroup): this returns bits lost during cherry-pick. The bits lost caused the first RGData::serialize to crash a process
2024-11-09 19:44:02 +00:00
Leonid Fedorov
a508b86091 remove boost/shared_array include 2023-04-14 09:42:50 +00:00
Leonid Fedorov
8ff4192f52 get rid of getbinaryfield in ranking window functions 2022-08-24 14:50:03 +00:00
david.hall
3b6449842f Merge branch 'develop' into MCOL-4841
# Conflicts:
#	exemgr/main.cpp
#	oam/etc/Columnstore.xml.singleserver
#	primitives/primproc/primproc.cpp
2022-06-09 10:07:26 -05:00
Leonid Fedorov
3919c541ac
New warnfixes (#2254)
* Fix clang warnings

* Remove vim tab guides

* initialize variables

* 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length

* Fix ISO C++17 does not allow 'register' storage class specifier for outdated bison

* chars are unsigned on ARM, having  if (ival < 0) always false

* chars are unsigned by default on ARM and comparison with -1 if always true
2022-02-17 13:08:58 +03:00
David Hall
27dea733c5 MCOL4841 dev port run large join without OOM 2022-02-09 17:33:55 -06:00
Leonid Fedorov
04752ec546 clang format apply 2022-01-21 16:43:49 +00:00
Alexander Barkov
9794f24369 MCOL-4801 Replace Row methods getStringLength() and getStringPointer() to getConstString() 2021-07-06 21:15:32 +04:00
Alexander Barkov
9608533d92 MCOL-4734 Compilation failure: MariaDB-10.6 + ColumnStore-develop
mcsconfig.h and my_config.h have the following
pre-processor definitions:

1. Conflicting definitions coming from the standard cmake definitions:
- PACKAGE
- PACKAGE_BUGREPORT
- PACKAGE_NAME
- PACKAGE_STRING
- PACKAGE_TARNAME
- PACKAGE_VERSION
- VERSION

2. Conflicting definitions of other kinds:
- HAVE_STRTOLL - this is a dirt in MariaDB headers.
  Should be fixed in the server code. my_config.h erroneously
  performs "#define HAVE_STRTOLL" instead of "#define HAVE_STRTOLL 1".
  in some cases. The former is not CMake compatible style. The latter is.

3. Non-conflicting definitions:
  Otherwise, mcsconfig.h and my_config.h should be mutually compatible,
  because both are generated by cmake on the same host machine. So
  they should have exactly equal definitions like "HAVE_XXX", "SIZEOF_XXX", etc.

Observations:
- It's OK to include both mcsconfig.h and my_config.h providing that we
  suppress duplicate definition of the above conflicting types #1 and #2.
- There is no a need to suppress duplicate definitions mentioned in #3,
  as they are compatible!
- my_sys.h and m_ctype.h must always follow a CMake configuation header,
  either my_config.h or mcsconfig.h (or both).
  They must never be included without any preceeding configuration header.

This change make sure that we resolve conflicts by:
- either disallowing inclusion of mcsconfig.h and my_config.h
  at the same time
- or by hiding conflicting definitions #1 and #2
  (with their later restoring).
- also, by making sure that my_sys.h and m_ctype.h always follow
  a CMake configuration file.

Details:
- idb_mysql.h can now only be included only after my_config.h
  An attempt to use idb_mysql.h with mcsconfig.h instead of
  my_config.h is caught by the "#error" preprocessor directive.

- mariadb_my_sys.h can now be only included after mcsconfig.h.
  An attempt to use mariadb_my_sys.h without mcscofig.h
  (e.g. with my_config.h) is also caught by "#error".

- collation.h now can now be included in two ways.
  It now has the following effective structure:

    #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H)
    //  Remember current conflicting definitions on the preprocessor stack
    //  Undefine current conflicting definitions
    #endif
    #include "mcsconfig.h"
    #include "m_ctype.h"
    #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H)
    #    Restore conflicting definitions from the preprocessor stack
    #endif

  and can be included as follows:

  a. using only mcsconfig.h as a configuration header:

    // my_config.h must not be included so far
    #include "collation.h"

  b. using my_config.h as the first included configuration file:

    #define PREFER_MY_CONFIG_H // Force conflict resolution
    #include "my_config.h"     // can be included directly or indirectly
    ...
    #include "collation.h"

Other changes:

- Adding helper header files
     utils/common/mcsconfig_conflicting_defs_remember.h
     utils/common/mcsconfig_conflicting_defs_restore.h
     utils/common/mcsconfig_conflicting_defs_undef.h
  to perform conflict resolution easier.

- Removing `#include "collation.h"` from a number of files,
  as it's automatically included from rowgroup.h.

- Removing redundant `#include "utils_utf8.h"`.
  This change is not directly related to the problem being fixed,
  but it's nice to remove redundant directives for both collation.h
  and utils_utf8.h from all the files that do not really need them.
  (this change could probably have gone as a separate commit)

- Changing my_init() to MY_INIT(argv[0]) in the MCS services sources.
  After the fix of the complitation failure it appeared that ColumnStore
  services compiled with the debug build crash due to recent changes in
  safemalloc. The crash happened in strcmp() with `my_progname` as an argument
  (where my_progname is a mysys global variable). This problem should
  probably be fixed on the server side as well to avoid passing NULL.
  But, the majority of MariaDB executable programs also use MY_INIT(argv[0])
  rather than my_init(). So let's make MCS do like the other programs do.
2021-05-25 12:34:36 +04:00
Alexey Antipovsky
5080e1ae53 MCOL-4031 More accurate memory usage counting while sorting 2021-01-29 18:31:20 +03:00
Alexander Barkov
09f937693f MCOL-4454 "ORDER BY BINARY a" is not like in InnoDB 2021-01-12 12:33:44 +04:00
Gagan Goel
785bc43435 MCOL-4188 Regression fixes for MCOL-641.
1. Make PredicateOperator::setOpType() function wide decimal aware.
  2. Added support for wide decimal in jlf_subquery.cpp::getColumnValue()
     used in scalar subqueries.
  3. Fixed the column index used for fetching wide decimal values from a Row
     when a wide decimal field is used in the order by clause.
2020-11-27 13:27:48 -05:00
Roman Nozdrin
3eb26c0d4a MCOL-4313 Introduced TSInt128 that is a storage class for int128
Removed uint128 from joblist/lbidlist.*

Another toString() method for wide-decimal that is EMPTY/NULL aware

Unified decimal processing in WF functions

Fixed a potential issue in EqualCompData::operator() for
    wide-decimal processing

Fixed some signedness warnings
2020-11-18 13:53:15 +00:00
Alexander Barkov
d5c6645ba1 Adding mcs_basic_types.h
For now it consists of only:

using int128_t = __int128;
using uint128_t = unsigned __int128;

All new privitive data types should go into this file in the future.
2020-11-18 13:53:15 +00:00
Roman Nozdrin
1c3a34a3d0 Dataconvert::decimalToString badly fails w/o 20th member of mcs_pow_10 so I returned it
WF::percentile runtime threw an exception b/c of wrong DT deduced from its argument

Replaced literals with constants

Tought WF_sum_avg::checkSumLimit to use refs instead of values
2020-11-18 13:52:20 +00:00
David Hall
638202417f MCOL-4171 2020-11-18 13:52:19 +00:00
Roman Nozdrin
17bad9eb0b MCOL-641 Initial support for ORDER BY on wide DECIMALs. 2020-11-18 13:51:26 +00:00
David Hall
d0818f2b4e MCOl-3536 Collation phase 2 2020-06-18 13:54:16 -05:00
Gagan Goel
6c05717b4c MCOL-4043 Fix memory leaks - 2
Perform deletes of Compare objects in the OrderByData dtor,
which were allocated by CompareRule::compileRules().
2020-06-11 19:21:11 -04:00
David Hall
f9078efbc6 MCOL-3536 Collation 2020-06-08 17:57:37 -05:00
David Hall
2e66b1f1e8 MCOL-3536 Collation 2020-05-28 14:19:17 -05:00
David Hall
1f3d1e6fd6 MCOL-3536 collation 2020-05-14 16:02:49 -05:00
Gagan Goel
ba4053bf2b Use signed int64_t for comparing two TIME values in the order by clause. 2019-12-06 14:31:48 +00:00
Roman Nozdrin
0696696cf6 MCOL-894 Upmerged post review changes.
Raised the default for orderby threads from 4 to 16.
    Removed if LIMIT check block in makeVtableModeSteps().
    Removed a duplicate of TimeCompare class and methods.

MCOL-3536 Upmerged the change.

MCOL-894 Post review changes.
    Uncomment if LIMIT check block in makeVtableModeSteps().

    TupleAnnexStep dtor now uses vector::size() as a boundary.

    Removed useless try-catch blocks.

    executeParallelOrderBy() now calculates rowSize only once.

    Removed forward declaration in the unexisting namespace.

    Trim UTs a bit.
2019-11-05 17:23:49 +03:00
Roman Nozdrin
7b5e5f0eb6 MCOL-894 Upmerged the fist part of the patch into develop.
MCOL-894 Add default values in Compare and CSEP ctors to activate UTF-8 sorting
    properly.

MCOL-894 Unit tests to build a framework for a new parallel sorting.

MCOL-894 Finished with parallel workers invocation.
     The implementation lacks final aggregation step.

MCOL-894 TupleAnnexStep's init and destructor are now parallel execution aware.

    Implemented final merging step for parallel execution finalizeParallelOrderBy().

    Templated unit test to use it with arbitrary number of rows, threads.

Reuse LimitedOrderBy in the final step

MCOL-894 Cleaned up finalizeParallelOrderBy.

MCOL-894 Add and propagate thread variable that controls a number of threads.

    Optimized comparators used for sorting and add corresponding UTs.

    Refactored TupleAnnexStep::finalizeParallelOrderByDistinct.

    Parallel sorting methods now preallocates memory in batches.

MCOL-894 Fixed comparator for StringCompare.
2019-11-05 15:23:43 +03:00
David Hall
0d2b0e0070 MCOL-3536 check for Japanese and use byte compare if found 2019-11-05 15:22:15 +03:00
David Hall
3b01d5ea22 MCOL-3536 use Locale from columnstore.xml when doing ORDER BY 2019-11-05 15:22:15 +03:00
Gagan Goel
6207d6137d Handle comparison of two negative TIME values properly in the ORDER BY clause. 2019-10-08 10:23:56 -04:00
Gagan Goel
c3a5b57d85 Use signed integer comparison for TIME data type in the ORDER BY clause. 2019-10-04 17:46:58 -04:00
Gagan Goel
6ccdb4ac10 1. Fix for group_concat(distinct ... ) to use all unique values in a row.
2. Explicitly set array bounds for Row::hash() and Row::equals().
2019-08-15 01:15:05 -04:00
Gagan Goel
e89d1ac3cf MCOL-265 Add support for TIMESTAMP data type 2019-04-23 00:00:09 -04:00
Andrew Hutchings
6ba299cccd
Merge branch 'develop-1.2' into MCOL-1822-c 2019-03-06 06:47:46 -05:00
David Hall
3f2c753947 MCOL-1822-c final checkin 2019-03-05 09:33:39 -06:00
Roman Nozdrin
947eaaef26 MCOL-901 group_concat() uses only a subset of unique values
b/c an incorrect number of key columns was used whilst
        initializing Eq() struct. The regression caused by
        changes made for MCOL-1829 so I revised the changes
        made. Now LimitedOrderBy::getKeyLength() returns an
        actual key columns count.
2019-03-02 02:01:41 +03:00
David Hall
c5b9ae11e5 MCOL-1822 add LONG DOUBLE support 2019-01-29 09:55:43 -06:00
Andrew Hutchings
95c4f17b0a Merge branch 'develop-1.1' into 1.2-merge-up-20190122 2019-01-22 08:26:26 +00:00
Roman Nozdrin
7b4cec5757 MCOL-1829 Subquery with limited order by could potentially return onordered set.
There were two code mistakes: Eq::operator() always returned true for
    any pair and Hasher::operator() always returned 0 as a key.
2019-01-14 16:55:41 +03:00
Roman Nozdrin
35a17a87c4 MCOL-1829 Subquery with limited order by could potentially return onordered set.
There were two code mistakes: Eq::operator() always returned true for
	any pair and Hasher::operator() always returned 0 as a key.
2018-12-26 10:39:32 +03:00
Andrew Hutchings
3c1ebd8b94 MCOL-392 Add initial TIME datatype support 2018-04-30 09:42:41 +01:00
Andrew Hutchings
01446d1e22 Reformat all code to coding standard 2017-10-26 17:18:17 +01:00
Andrew Hutchings
f369ee6212 MCOL-707 Fix memory accounting for ORDER BY
Missed off window function order by memory accounting in my first commit
for MCOL-707. Fixed in the same way
2017-05-10 22:59:52 +01:00
david hill
f6afc42dd0 the begginning 2016-01-06 14:08:59 -06:00