mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-08-01 06:46:55 +03:00
Commit Graph

1150 Commits

Author SHA1 Message Date
7bda598fbf MCOL-4810 Redundant copying and wasting memory in PrimProc
This patch eliminates the copying of `long string`s into the bytestream.
2021-08-26 12:16:23 +03:00
5c5f103f98 MCOL-4839: Fix clang build (#2100)
* Fix clang build

* Extern C returned to plugin_instance

Co-authored-by: Leonid Fedorov <l.fedorov@mail.corp.ru>
2021-08-23 10:45:10 -05:00
923bbf4033 MCOL-1356: Add convert_tz (#2099) 2021-08-19 17:47:10 -05:00
517e793843 One more bool* to bool cast bug (#2097) 2021-08-18 15:56:57 -05:00
98473a45cc Merge pull request #2079 from dhall-MariaDB/MCOL-3741
Mcol 3741 Change IDB-xxxx error codes to MCS-xxxx
2021-08-18 14:01:04 -04:00
dbb1269d69 GetInterrupted returned bool instead of bool * (#2085) 2021-08-18 11:50:45 -05:00
a2f441cd9c moda returned local object pointer (#2089) 2021-08-18 11:41:07 -05:00
3136e9dbab We forgot to initialize longdoublenull value (#2091) 2021-08-18 11:34:35 -05:00
150770b919 Merge pull request #2059 from mariadb-corporation/unittests-ctest
Add ctest for google unittests
2021-08-11 11:03:05 +03:00
ecde2719b1 MCOL-3741 Change IDB-xxxx error codes to MCS-xxxx 2021-08-09 11:33:09 -05:00
4d4dd22105 MCOL-4771 develop fix crash from rand() 2021-08-06 16:09:28 -05:00
73e710ed52 Add ctest for google unittests 2021-08-02 19:41:04 +03:00
4cdef40a55 Merge pull request #2052 from drrtuy/MCOL-4815
MCOL-4815 ColumnCommand was replaced with a set of derived classes sp…
2021-07-21 17:36:35 +03:00
a292585b8c MCOL-4815 ColumnCommand was replaced with a set of derived classes specified by
column width

RTSCommand was modified to use a factory that produces a CC class based on column width

NB this patch doesn't affect PseudoCC that also leverages ColumnCommand
2021-07-21 12:54:14 +00:00
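The MCOL-4815 entry above describes a factory that selects a derived ColumnCommand class from the column width. A minimal sketch of that pattern follows. The names `ColumnCommandBase`, `ColumnCommandFor` and `makeColumnCommand` are hypothetical stand-ins chosen for illustration and are not taken from the ColumnStore sources; the set of supported widths is likewise an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for the common base class; NOT the real ColumnCommand.
struct ColumnCommandBase
{
  virtual ~ColumnCommandBase() = default;
  virtual std::size_t width() const = 0;
};

// One derived class per column width, generated from the value type.
template <typename T>
struct ColumnCommandFor final : ColumnCommandBase
{
  std::size_t width() const override { return sizeof(T); }
};

// Hypothetical factory: picks the derived class from the column width in bytes.
inline std::unique_ptr<ColumnCommandBase> makeColumnCommand(std::size_t colWidth)
{
  switch (colWidth)
  {
    case 1: return std::make_unique<ColumnCommandFor<std::int8_t>>();
    case 2: return std::make_unique<ColumnCommandFor<std::int16_t>>();
    case 4: return std::make_unique<ColumnCommandFor<std::int32_t>>();
    case 8: return std::make_unique<ColumnCommandFor<std::int64_t>>();
    default: throw std::invalid_argument("unsupported column width");
  }
}

// Small usage example: the caller only ever sees the base-class pointer.
inline void exampleUsage()
{
  auto cmd = makeColumnCommand(4);
  assert(cmd->width() == 4);
}
```

The width is resolved once, when the command object is created, so width-specific code can live in the derived class instead of being re-checked on every call.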
fa8dc815a7 MCOL-4814 Add a cmake build option to enable LZ4 compression.
This patch adds an option for cmake flags to enable lz4 compression.
2021-07-16 17:57:11 +03:00
3d557a2f1e Merge pull request #2044 from dhall-MariaDB/MCOL-3738
MCOL-3738 COUNT(DISTINCT) with multiple parms
2021-07-12 07:34:56 -04:00
51a8ffcb6a Fix sumavgoverflow.sql test 2021-07-09 22:41:28 +00:00
76607be63a MCOL-3738 COUNT(DISTINCT) with multiple parms
Fixed regression
Added a few more mtr tests
2021-07-09 09:07:03 -05:00
f81f743282 Replace underlying type for avg and sum for int types from long double to wide decimal 2021-07-08 17:04:43 +00:00
1113470551 MCOL-4738 AVG gives wrong results with strict_aliasing
A fix that works with strict_aliasing
2021-07-07 13:08:32 -05:00
866dc25729 Merge pull request #1842 from denis0x0D/MCOL-987_LZ
MCOL-987 LZ4 compression support.
2021-07-07 13:13:18 +03:00
7b4f759592 Merge pull request #2032 from drrtuy/MCOL-4802
MCOL-4802 Removed ByteStream methods for bool and add some logging in…
2021-07-07 13:03:54 +03:00
8988253ff4 Merge pull request #2031 from mariadb-corporation/bar-develop-MCOL-4801
MCOL-4801 Replace Row methods getStringLength() and getStringPointer(…
2021-07-07 13:53:19 +04:00
fb5ba84212 MCOL-4802 Removed ByteStream methods for bool manipulations and add some logging into I_S.columnstore_files 2021-07-07 07:16:30 +00:00
8332ab8974 MCOL-4738 AVG() returns a wrong result
On AMD64 machines, the FPU is 80 bits wide. The unused bits must be masked for memcmp to work properly. For other architectures, we don't want to mask those bits.
2021-07-06 19:50:00 -05:00
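The MCOL-4738 entry above turns on the fact that an 80-bit extended value is stored in a larger `long double` object, so the trailing bytes are padding with unspecified contents. A minimal sketch of masking that padding before a bytewise comparison follows. The helper names are hypothetical, and the sketch assumes a little-endian layout in which a 64-bit significand implies the x87 format with 10 value bytes at the start of the object; it is not the code from the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <limits>

// Bytes of a long double that carry value bits.
// Assumption: a 64-bit significand means the x87 80-bit format (10 bytes);
// on any other format every byte is treated as significant and nothing is masked.
constexpr std::size_t significantBytes()
{
  return std::numeric_limits<long double>::digits == 64 ? 10 : sizeof(long double);
}

// Zero the padding bytes that follow the value bytes (no-op when there are none).
inline void maskPadding(long double& v)
{
  unsigned char* p = reinterpret_cast<unsigned char*>(&v);
  std::memset(p + significantBytes(), 0, sizeof(long double) - significantBytes());
}

// Bytewise equality that ignores padding. Operates on by-value copies.
inline bool bytewiseEqual(long double a, long double b)
{
  maskPadding(a);
  maskPadding(b);
  return std::memcmp(&a, &b, sizeof(long double)) == 0;
}

// Small usage example.
inline void exampleUsage()
{
  assert(bytewiseEqual(1.5L, 1.5L));
  assert(!bytewiseEqual(1.5L, 2.5L));
}
```

Note that this compares representations, not values: `0.0L` and `-0.0L` differ bytewise even though they compare equal with `==`.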
9794f24369 MCOL-4801 Replace Row methods getStringLength() and getStringPointer() to getConstString() 2021-07-06 21:15:32 +04:00
cc1c3629c5 MCOL-987 Add LZ4 compression.
* Adds CompressInterfaceLZ4 which uses LZ4 API for compress/uncompress.
* Adds CMake machinery to search LZ4 on running host.
* All methods which use static data and do not modify any internal data - become `static`,
  so we can use them without creation of the specific object. This is possible, because
  the header specification has not been modified. We still use 2 sections in header, first
  one with file meta data, the second one with pointers for compressed chunks.
* Methods `compress`, `uncompress`, `maxCompressedSize`, `getUncompressedSize` - become
  pure virtual, so we can override them for the other compression algos.
* Adds method `getChunkMagicNumber`, so we can verify chunk magic number
  for each compression algo.
* Renames "s/IDBCompressInterface/CompressInterface/g" according to requirement.
2021-07-06 18:04:37 +03:00
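The MCOL-987 entry above makes `compress`, `uncompress`, `maxCompressedSize` and `getUncompressedSize` pure virtual and adds `getChunkMagicNumber`. A minimal sketch of what such an interface could look like follows. Only the method names come from the commit message; every signature, the `StoreCodec` class, its 8-byte length header and its magic number are assumptions made for illustration. `StoreCodec` merely copies bytes and does not use LZ4.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical interface; signatures are assumed, not taken from ColumnStore.
class CompressInterface
{
 public:
  virtual ~CompressInterface() = default;
  // On entry *outLen is the capacity of out; on success it is the bytes written.
  virtual int compress(const char* in, std::size_t inLen, char* out, std::size_t* outLen) const = 0;
  virtual int uncompress(const char* in, std::size_t inLen, char* out, std::size_t* outLen) const = 0;
  virtual std::size_t maxCompressedSize(std::size_t uncompSize) const = 0;
  virtual bool getUncompressedSize(const char* in, std::size_t inLen, std::size_t* outLen) const = 0;
  virtual std::uint8_t getChunkMagicNumber() const = 0;
};

// Toy codec: stores the payload unchanged behind an 8-byte length header.
class StoreCodec final : public CompressInterface
{
  static constexpr std::size_t kHeader = sizeof(std::uint64_t);

 public:
  int compress(const char* in, std::size_t inLen, char* out, std::size_t* outLen) const override
  {
    if (*outLen < maxCompressedSize(inLen)) return -1;  // output buffer too small
    const std::uint64_t len = inLen;
    std::memcpy(out, &len, kHeader);
    std::memcpy(out + kHeader, in, inLen);
    *outLen = kHeader + inLen;
    return 0;
  }
  int uncompress(const char* in, std::size_t inLen, char* out, std::size_t* outLen) const override
  {
    std::size_t rawLen = 0;
    if (!getUncompressedSize(in, inLen, &rawLen) || *outLen < rawLen) return -1;
    std::memcpy(out, in + kHeader, rawLen);
    *outLen = rawLen;
    return 0;
  }
  std::size_t maxCompressedSize(std::size_t uncompSize) const override { return uncompSize + kHeader; }
  bool getUncompressedSize(const char* in, std::size_t inLen, std::size_t* outLen) const override
  {
    if (inLen < kHeader) return false;
    std::uint64_t len = 0;
    std::memcpy(&len, in, kHeader);
    if (len != inLen - kHeader) return false;  // header disagrees with chunk size
    *outLen = static_cast<std::size_t>(len);
    return true;
  }
  std::uint8_t getChunkMagicNumber() const override { return 0x01; }  // made-up value
};

// Small usage example: round-trip through the base-class interface.
inline void exampleUsage()
{
  StoreCodec codec;
  const CompressInterface& c = codec;
  const char msg[] = "chunk";
  char packed[32];
  std::size_t packedLen = sizeof(packed);
  assert(c.compress(msg, sizeof(msg), packed, &packedLen) == 0);
  char raw[32];
  std::size_t rawLen = sizeof(raw);
  assert(c.uncompress(packed, packedLen, raw, &rawLen) == 0);
  assert(rawLen == sizeof(msg) && std::memcmp(raw, msg, rawLen) == 0);
}
```

A per-algorithm magic number lets a reader check that a chunk was written by the codec it is about to decode with, which is the verification role the commit message gives `getChunkMagicNumber`.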
8520f87237 MCOL-641 Cleanup. 2021-07-06 09:01:49 +00:00
1d5f309b8f MCOL-1205 Support queries with circular joins
This patch adds support for queries with circular joins.
Currently support added for inner joins only.
2021-07-02 18:37:07 +03:00
c20015a7b2 MCOL-4713 Analyze table implementation. 2021-07-02 12:37:12 +03:00
60495564b8 [MCOL-4709] Fix another UB in disk aggregation 2021-06-29 17:47:07 +03:00
8a0b68f25e [MCOL-4709] Fix UB in disk aggregation 2021-06-28 20:07:23 +03:00
d8cbc000e2 Merge pull request #2004 from drrtuy/MCOL-4759
MCOL-4759 Upmerge for MCOL-4564 code that implements hash merging fam…
2021-06-28 14:05:16 +03:00
8c360a1a27 MCOL-4759 Upmerge for MCOL-4564 code that implements hash merging family to reduce
performance penalty using MDB hashing functions
2021-06-24 14:48:01 +00:00
2de4888899 Merge pull request #1990 from drrtuy/MCOL-4173_9
MCOL-4173 This patch adds support for wide-DECIMAL INNER, OUTER, SEMI…
2021-06-24 16:15:07 +03:00
bed0b7c6bc MCOL-4173 This patch adds support for wide-DECIMAL INNER, OUTER, SEMI, functional JOINs
based on top of TypelessData
2021-06-24 08:07:23 +00:00
7c8b502dc2 Fix regression in a query involving an aggregate function on a
non-wide decimal column in the HAVING clause.

In buildAggregateColumn(), if an aggregate function (such as avg)
is applied on a non-wide decimal column, we were setting the precision
of the resulting column as -1. Later in the execution this was
converted to 255 because in some cases precision is stored as uint8_t.
The predicate operations on a DECIMAL column have logic that uses
the wide Decimal::s128value field if precision > 18. This logic incorrectly
used the Decimal::s128value instead of the correct value stored in the
narrow Decimal::value field, since precision of the Decimal column
was 255. The fix is to set the aggregate column precision to
datatypes::INT64MAXPRECISION (18) in buildAggregateColumn() when the
aggregate is applied on a non-wide decimal column.

This commit also partially fixes -Wstrict-aliasing GCC warnings.
2021-06-22 11:11:34 +00:00
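The regression described in the entry above comes down to a signed value being narrowed to `uint8_t` and then compared against the wide-decimal threshold. A minimal sketch of that arithmetic follows. The helper names are hypothetical; only the value 18 for `datatypes::INT64MAXPRECISION` and the `precision > 18` test come from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Value of datatypes::INT64MAXPRECISION as stated in the commit message.
constexpr int kInt64MaxPrecision = 18;

// Hypothetical helper: narrows a precision to the uint8_t storage type.
constexpr std::uint8_t storePrecision(int precision)
{
  return static_cast<std::uint8_t>(precision);  // conversion is modulo 256
}

// Hypothetical helper: mirrors the "precision > 18" test that selects the wide field.
constexpr bool usesWideField(std::uint8_t precision)
{
  return precision > kInt64MaxPrecision;
}

// -1 wraps to 255, which wrongly selects the wide (128-bit) field.
static_assert(storePrecision(-1) == 255, "-1 narrows to 255");
static_assert(usesWideField(storePrecision(-1)), "255 is treated as wide");
// Using 18 instead keeps the value on the narrow path.
static_assert(!usesWideField(storePrecision(kInt64MaxPrecision)), "18 is narrow");

// Small usage example with runtime checks of the same facts.
inline void exampleUsage()
{
  assert(storePrecision(-1) == 255);
  assert(!usesWideField(storePrecision(kInt64MaxPrecision)));
}
```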
5155a08a67 Merge pull request #1987 from mariadb-corporation/bar-develop-MCOL-4700
MCOL-4700 Wrong result of a UNION for INT and INT UNSIGNED
2021-06-14 02:36:10 -04:00
67449418ed MCOL-4700 Wrong result of a UNION for INT and INT UNSIGNED 2021-06-11 19:31:51 +04:00
b3d6f62964 MCOL-4753 Performance problem in Typeless join 2021-06-10 09:26:26 +00:00
0dedb7e628 Fix compilation warnings 2021-06-09 16:51:00 +03:00
7a152c6a19 Merge pull request #1944 from mariadb-AlexeyAntipovsky/MCOL-563-dev
[MCOL-4709] Disk-based aggregation
2021-06-08 20:42:58 +03:00
475104e4d3 [MCOL-4709] Disk-based aggregation
* Introduce multigeneration aggregation

* Do not save unused part of RGDatas to disk
* Add IO error explanation (strerror)

* Reduce memory usage while aggregating
* introduce in-memory generations to better memory utilization

* Try to limit the qty of buckets at a low limit

* Refactor disk aggregation a bit
* pass calculated hash into RowAggregation
* try to keep some RGData with free space in memory

* do not dump more than half of rowgroups to disk if generations are
  allowed, instead start a new generation
* for each thread shift the first processed bucket at each iteration,
  so the generations start more evenly

* Unify temp data location

* Explicitly create temp subdirectories
  whether disk aggregation/join are enabled or not
2021-06-06 16:09:15 +03:00
606194e6e4 MCOL-4685: Eliminate some irrelevant settings (uncompressed data and extents per file).
This patch:
1. Removes the option to declare uncompressed columns (set columnstore_compression_type = 0).
2. Ignores the [COMMENT 'compression=0'] option at table or column level (no error message; the option is simply disregarded).
3. Removes the option to set more than 2 extents per file (ExtentsPerSegmentFile).
4. Updates rebuildEM tool to support up to 10 dictionary extents per dictionary segment file.
5. Adds check for `DBRootStorageType` for rebuildEM tool.
6. Renames rebuildEM to mcsRebuildEM.
2021-06-03 14:44:33 +03:00
90397dfed0 MCOL-4675 DMLProc now automatically and gracefully shuts down when a cluster state is set to
SS_SHUTDOWN_PENDING | SS_ROLLBACK
2021-05-27 11:07:32 +00:00
42e710f817 Merge pull request #1942 from mariadb-corporation/bar-develop-compile-10.6
Fixing 10.6 + develop compilation failure
2021-05-25 14:31:37 +03:00
9608533d92 MCOL-4734 Compilation failure: MariaDB-10.6 + ColumnStore-develop
mcsconfig.h and my_config.h have the following
pre-processor definitions:

1. Conflicting definitions coming from the standard cmake definitions:
- PACKAGE
- PACKAGE_BUGREPORT
- PACKAGE_NAME
- PACKAGE_STRING
- PACKAGE_TARNAME
- PACKAGE_VERSION
- VERSION

2. Conflicting definitions of other kinds:
- HAVE_STRTOLL - this is a flaw in the MariaDB headers that
  should be fixed in the server code. In some cases my_config.h erroneously
  performs "#define HAVE_STRTOLL" instead of "#define HAVE_STRTOLL 1".
  The former is not CMake-compatible style. The latter is.

3. Non-conflicting definitions:
  Otherwise, mcsconfig.h and my_config.h should be mutually compatible,
  because both are generated by cmake on the same host machine. So
  they should have exactly equal definitions like "HAVE_XXX", "SIZEOF_XXX", etc.

Observations:
- It's OK to include both mcsconfig.h and my_config.h provided that we
  suppress duplicate definitions of the above conflicting types #1 and #2.
- There is no need to suppress the duplicate definitions mentioned in #3,
  as they are compatible!
- my_sys.h and m_ctype.h must always follow a CMake configuration header,
  either my_config.h or mcsconfig.h (or both).
  They must never be included without a preceding configuration header.

This change makes sure that we resolve conflicts by:
- either disallowing inclusion of mcsconfig.h and my_config.h
  at the same time
- or by hiding conflicting definitions #1 and #2
  (with their later restoring).
- also, by making sure that my_sys.h and m_ctype.h always follow
  a CMake configuration file.

Details:
- idb_mysql.h can now be included only after my_config.h.
  An attempt to use idb_mysql.h with mcsconfig.h instead of
  my_config.h is caught by the "#error" preprocessor directive.

- mariadb_my_sys.h can now be included only after mcsconfig.h.
  An attempt to use mariadb_my_sys.h without mcsconfig.h
  (e.g. with my_config.h) is also caught by "#error".

- collation.h can now be included in two ways.
  It now has the following effective structure:

    #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H)
    //  Remember current conflicting definitions on the preprocessor stack
    //  Undefine current conflicting definitions
    #endif
    #include "mcsconfig.h"
    #include "m_ctype.h"
    #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H)
    //  Restore conflicting definitions from the preprocessor stack
    #endif

  and can be included as follows:

  a. using only mcsconfig.h as a configuration header:

    // my_config.h must not be included so far
    #include "collation.h"

  b. using my_config.h as the first included configuration file:

    #define PREFER_MY_CONFIG_H // Force conflict resolution
    #include "my_config.h"     // can be included directly or indirectly
    ...
    #include "collation.h"

Other changes:

- Adding helper header files
     utils/common/mcsconfig_conflicting_defs_remember.h
     utils/common/mcsconfig_conflicting_defs_restore.h
     utils/common/mcsconfig_conflicting_defs_undef.h
  to perform conflict resolution easier.

- Removing `#include "collation.h"` from a number of files,
  as it's automatically included from rowgroup.h.

- Removing redundant `#include "utils_utf8.h"`.
  This change is not directly related to the problem being fixed,
  but it's nice to remove redundant directives for both collation.h
  and utils_utf8.h from all the files that do not really need them.
  (this change could probably have gone as a separate commit)

- Changing my_init() to MY_INIT(argv[0]) in the MCS services sources.
  After the fix of the compilation failure it appeared that ColumnStore
  services compiled with the debug build crash due to recent changes in
  safemalloc. The crash happened in strcmp() with `my_progname` as an argument
  (where my_progname is a mysys global variable). This problem should
  probably be fixed on the server side as well to avoid passing NULL.
  But, the majority of MariaDB executable programs also use MY_INIT(argv[0])
  rather than my_init(). So let's make MCS do like the other programs do.
2021-05-25 12:34:36 +04:00
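The MCOL-4734 entry above describes remembering, undefining and later restoring conflicting macros "on the preprocessor stack". A minimal sketch of that sequence for a single macro follows, assuming the `#pragma push_macro` / `#pragma pop_macro` extension supported by GCC, Clang and MSVC. Whether the commit's helper headers use exactly these pragmas is an assumption; the two version strings and the function names are made up, and the `#define` lines stand in for the real `my_config.h` and `mcsconfig.h` includes.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for '#include "my_config.h"': the server header defines VERSION first.
#define VERSION "server-version"

// "Remember" step: save the current definition on the preprocessor stack.
#pragma push_macro("VERSION")
// "Undef" step: clear it so the next header can define it without a conflict.
#undef VERSION

// Stand-in for '#include "mcsconfig.h"': the engine header defines VERSION too.
#define VERSION "engine-version"

// Code in this region sees the engine's definition.
inline const char* engineVersion() { return VERSION; }

// "Restore" step: drop the engine's definition and bring back the saved one.
#undef VERSION
#pragma pop_macro("VERSION")

// Code after the restore sees the server's definition again.
inline const char* serverVersion() { return VERSION; }

// Small usage example.
inline void exampleUsage()
{
  assert(std::strcmp(engineVersion(), "engine-version") == 0);
  assert(std::strcmp(serverVersion(), "server-version") == 0);
}
```

Each macro is expanded at the point where the function body is compiled, so the two functions capture different strings even though they use the same macro name.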
284fc51bb7 MCOL-4726 Wrong result of WHERE char1_col='A' 2021-05-21 14:40:16 +04:00
bd4cbb542d MCOL-4721 CHAR(1) is not collation-aware for GROUP/DISTINCT 2021-05-18 16:14:53 +04:00
78cca01dfa Merge pull request #1899 from tntnatbry/MCOL-4612
MCOL-4612 A subquery with a union for DECIMAL and BIGINT returns zeros.
2021-05-03 02:52:23 -04:00