1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-08-07 03:22:57 +03:00
Commit Graph

73 Commits

Author SHA1 Message Date
Leonid Fedorov
8a535e872c Revert "Deep build refactoring phase 2 (#3564)" (#3678)
This reverts commit 449029a827.
2025-08-01 02:56:31 +04:00
Leonid Fedorov
449029a827 Deep build refactoring phase 2 (#3564)
* configcpp refactored

* chore(build): massive removals, auto add files to debian install file

* chore(build): configure before autobake

* chore(build): use custom cmake commands for components, mariadb-plugin-columnstore.install generated

* chore(build): install deps as separate step for build-packages

* more deps

* chore(codemanagement, build): build refactoring stage2

* chore(safety): Locked Map for MessageqCpp with a simpler way

 Please enter the commit message for your changes. Lines starting

* chore(codemanagement, ci): better coredumps handling, deps fixed

* Delete build/bootstrap_mcs.py

* Update charset.cpp (add license)
2025-07-17 16:14:10 +04:00
drrtuy
79008f4f69 feat(CSEP): CSEP printer with indentations to simplify reading + rewriter skeleton + some test binary to describe minimalistic CSEP localy 2025-06-26 18:35:33 +01:00
Leonid Fedorov
a0bee173f6 chore(build): fixes to satisfy clang19 warnings 2025-05-15 19:05:38 +04:00
Leonid Fedorov
8859e3f4df chore(build): satisfy gcc9 for execplan partionions unequivalence 2025-04-25 17:36:43 +04:00
Serguey Zefirov
bd1622f331 feat(MCOL-5886): support InnoDB's table partitions in cross-engine joins
The purpose of this changeset is to obtain list of partitions from
SELECT_LEX structure and pass it down to joblist and then to
CrossEngineStep to pass to InnoDB.
2025-04-23 08:24:10 +03:00
Aleksei Antipovskii
0ab03c7258 chore(codestyle): mark virtual methods as override 2025-02-21 20:01:34 +04:00
Leonid Fedorov
539db054b3 MCOL-5779: use encoding to check alter table alter column statement correctly 2024-08-30 17:54:56 +04:00
Leonid Fedorov
1c9cd9db9f Fix garbage charset using ColType(int32_t colWidth_, int32_t scale_, int32_t precision_, (#2949)
const ConstraintType& constraintType_, const DictOID& ddn_, int32_t colPosition_,
            int32_t compressionType_, OID columnOID_, const ColDataType& colDataType_);
2023-09-06 20:01:31 +03:00
Gagan Goel
d50a0fa2e6 MCOL-5005 Add charset number to system catalog - Part 2.
1. Extend the calpontsys.syscolumn system catalog table
  with a new column, 'charsetnum'.

  'charsetnum' field is set to the 'number' member of the
  'charset_info_st' struct defined in the server in m_ctype.h.

  For CHAR/VARCHAR/TEXT column types, 'charset_info_st' is
  initialized to the charset/collation of the column, which
  is set at the column-level or at the table-level in the DDL.

  For BLOB/VARBINARY binary column types, 'charset_info_st' is
  initialized to my_charset_bin (charsetnum=63).

  For all other column types, charsetnum is set to 0.

  2. Add support for the newly added 'charsetnum' column in the
  automatic system catalog upgrade logic in dbbuilder.

  For existing table definitions, charsetnum for the column is
  defaulted to 0.

  3. Add MTR test case that creates a few table definitions with
  a range of charset/collation combinations and queries the
  calpontsys.syscolumn system catalog table with the charsetnum
  field for the columns in the table DDLs.
2023-08-15 17:21:47 +00:00
Roman Nozdrin
4fe9cd64a3 Revert "No boost condition (#2822)" (#2828)
This reverts commit f916e64927.
2023-04-22 15:49:50 +03:00
Leonid Fedorov
f916e64927 No boost condition (#2822)
This patch replaces boost primitives with stdlib counterparts.
2023-04-22 00:42:45 +03:00
Sergey Zefirov
b53c231ca6 MCOL-271 empty strings should not be NULLs (#2794)
This patch improves handling of NULLs in textual fields in ColumnStore.
Previously empty strings were considered NULLs and it could be a problem
if data scheme allows for empty strings. It was also one of major
reasons of behavior difference between ColumnStore and other engines in
MariaDB family.

Also, this patch fixes some other bugs and incorrect behavior, for
example, incorrect comparison for "column <= ''" which evaluates to
constant True for all purposes before this patch.
2023-03-30 21:18:29 +03:00
Gagan Goel
86dcf92d56 MCOL-5215 Fix overflow of UNION operation involving DECIMAL datatypes.
When a UNION operation involving DECIMAL datatypes with scale and digits
before the decimal exceeds the currently supported maximum precision
of 38, we throw an error to the user:
"MCS-2060: Union operation exceeds maximum DECIMAL precision of 38".

This is until MCOL-5417 is implemented where ColumnStore will have
full parity with MariaDB server in terms of maximum supported DECIMAL
precision and scale of 65 and 38 digits respectively.
2023-02-27 06:38:31 -05:00
Roman Nozdrin
ff534dba7f MCOL-5384 This commit replaces shared pointer to CSC with CSC ctor that is cleaned up leaving a scope
CSC default ctor was private b/c it must not allow to use CSC outside thread cache.
  However there are some places in the plugin code that need a standalone syscat that
  is cleaned up leaving the scope. The decision is to make the restriction mentioned
  organizational rather than syntactical.
2023-02-08 14:03:41 +00:00
Gagan Goel
6a6fee5969 MCOL-5021 Followup.
Allow the compiler to inline the call to nextColValue() in column.cpp.
2022-08-18 19:35:35 +00:00
Gagan Goel
1355237ca3 MCOL-5021 Some minor fixes. 2022-08-05 14:40:50 -04:00
Gagan Goel
94e9f55940 MCOL-5021 Add a new member function to the DBRM class, DBRM::addToLBIDList().
This function iterates over lbidList (populated by an earlier call to
DBRM::getUncommittedExtentLBIDs()) to find those LBIDs which belong to
the AUX column. It then finds the corresponding LBIDs for all other columns
which belong to the same table as the AUX LBID and appends them to lbidList.
The updated lbidList is used by invalidateUncommittedExtentLBIDs() to update
the casual partitioning information.

DBRM::addToLBIDList() only comes into play in case of a transaction ROLLBACK.
2022-08-05 14:40:50 -04:00
Gagan Goel
ea1861fdb5 MCOL-5021 Add a new function to CalpontSystemCatalog class,
isAUXColumnOID(), to check if a given OID is an auxilliary
column OID.
2022-08-05 14:40:49 -04:00
Gagan Goel
262cd5c501 MCOL-5021 Remove hard-coded values for data type, column width
and compression type for the AUX column, and replace them with
constants defined in the execplan namespace.
2022-08-05 14:40:49 -04:00
Gagan Goel
86df9a972c MCOL-5021 Add prototype support for the AUX column in CREATE/DROP
DDL commands, single and multi-value INSERTs, cpimport, and
DELETE.
2022-08-05 14:40:49 -04:00
Leonid Fedorov
39c43a0f70 <unnamed>.execplan::CalpontSystemCatalog::TableName::create_date' may be used uninitialized 2022-07-11 22:27:25 +02:00
Serguey Zefirov
53b9a2a0f9 MCOL-4580 extent elimination for dictionary-based text/varchar types
The idea is relatively simple - encode prefixes of collated strings as
integers and use them to compute extents' ranges. Then we can eliminate
extents with strings.

The actual patch does have all the code there but miss one important
step: we do not keep collation index, we keep charset index. Because of
this, some of the tests in the bugfix suite fail and thus main
functionality is turned off.

The reason of this patch to be put into PR at all is that it contains
changes that made CHAR/VARCHAR columns unsigned. This change is needed in
vectorization work.
2022-03-02 23:53:39 +03:00
Leonid Fedorov
3919c541ac New warnfixes (#2254)
* Fix clang warnings

* Remove vim tab guides

* initialize variables

* 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length

* Fix ISO C++17 does not allow 'register' storage class specifier for outdated bison

* chars are unsigned on ARM, having  if (ival < 0) always false

* chars are unsigned by default on ARM and comparison with -1 if always true
2022-02-17 13:08:58 +03:00
Gagan Goel
973e5024d8 MCOL-4957 Fix performance slowdown for processing TIMESTAMP columns.
Part 1:
 As part of MCOL-3776 to address synchronization issue while accessing
 the fTimeZone member of the Func class, mutex locks were added to the
 accessor and mutator methods. However, this slows down processing
 of TIMESTAMP columns in PrimProc significantly as all threads across
 all concurrently running queries would serialize on the mutex. This
 is because PrimProc only has a single global object for the functor
 class (class derived from Func in utils/funcexp/functor.h) for a given
 function name. To fix this problem:

   (1) We remove the fTimeZone as a member of the Func derived classes
   (hence removing the mutexes) and instead use the fOperationType
   member of the FunctionColumn class to propagate the timezone values
   down to the individual functor processing functions such as
   FunctionColumn::getStrVal(), FunctionColumn::getIntVal(), etc.

   (2) To achieve (1), a timezone member is added to the
   execplan::CalpontSystemCatalog::ColType class.

Part 2:
 Several functors in the Funcexp code call dataconvert::gmtSecToMySQLTime()
 and dataconvert::mySQLTimeToGmtSec() functions for conversion between seconds
 since unix epoch and broken-down representation. These functions in turn call
 the C library function localtime_r() which currently has a known bug of holding
 a global lock via a call to __tz_convert. This significantly reduces performance
 in multi-threaded applications where multiple threads concurrently call
 localtime_r(). More details on the bug:
   https://sourceware.org/bugzilla/show_bug.cgi?id=16145

 This bug in localtime_r() caused processing of the Functors in PrimProc to
 slowdown significantly since a query execution causes Functors code to be
 processed in a multi-threaded manner.

 As a fix, we remove the calls to localtime_r() from gmtSecToMySQLTime()
 and mySQLTimeToGmtSec() by performing the timezone-to-offset conversion
 (done in dataconvert::timeZoneToOffset()) during the execution plan
 creation in the plugin. Note that localtime_r() is only called when the
 time_zone system variable is set to "SYSTEM".

 This fix also required changing the timezone type from a std::string to
 a long across the system.
2022-02-14 14:12:27 -05:00
Leonid Fedorov
04752ec546 clang format apply 2022-01-21 16:43:49 +00:00
Leonid Fedorov
01f3ceb437 replace header guards with #pragma once 2022-01-21 15:24:58 +00:00
Gagan Goel
b3a560300c Revert "Merge pull request #2022 from mariadb-corporation/bar-develop-MCOL-4791"
This reverts commit 4016e25e5b, reversing
changes made to 85435f6b1e.
2021-07-13 11:06:56 +00:00
Roman Nozdrin
fb5ba84212 MCOL-4802 Removed ByteStream methods for bool manipulations and add some logging into I_S.columnstore_files 2021-07-07 07:16:30 +00:00
Alexander Barkov
e8126bede5 MCOL-4791 Fix ColumnCommand fudged data type format to clearly identify CHAR vs VARCHAR 2021-07-02 12:42:03 +04:00
Alexander Barkov
9608533d92 MCOL-4734 Compilation failure: MariaDB-10.6 + ColumnStore-develop
mcsconfig.h and my_config.h have the following
pre-processor definitions:

1. Conflicting definitions coming from the standard cmake definitions:
- PACKAGE
- PACKAGE_BUGREPORT
- PACKAGE_NAME
- PACKAGE_STRING
- PACKAGE_TARNAME
- PACKAGE_VERSION
- VERSION

2. Conflicting definitions of other kinds:
- HAVE_STRTOLL - this is a dirt in MariaDB headers.
  Should be fixed in the server code. my_config.h erroneously
  performs "#define HAVE_STRTOLL" instead of "#define HAVE_STRTOLL 1".
  in some cases. The former is not CMake compatible style. The latter is.

3. Non-conflicting definitions:
  Otherwise, mcsconfig.h and my_config.h should be mutually compatible,
  because both are generated by cmake on the same host machine. So
  they should have exactly equal definitions like "HAVE_XXX", "SIZEOF_XXX", etc.

Observations:
- It's OK to include both mcsconfig.h and my_config.h providing that we
  suppress duplicate definition of the above conflicting types #1 and #2.
- There is no a need to suppress duplicate definitions mentioned in #3,
  as they are compatible!
- my_sys.h and m_ctype.h must always follow a CMake configuation header,
  either my_config.h or mcsconfig.h (or both).
  They must never be included without any preceeding configuration header.

This change make sure that we resolve conflicts by:
- either disallowing inclusion of mcsconfig.h and my_config.h
  at the same time
- or by hiding conflicting definitions #1 and #2
  (with their later restoring).
- also, by making sure that my_sys.h and m_ctype.h always follow
  a CMake configuration file.

Details:
- idb_mysql.h can now only be included only after my_config.h
  An attempt to use idb_mysql.h with mcsconfig.h instead of
  my_config.h is caught by the "#error" preprocessor directive.

- mariadb_my_sys.h can now be only included after mcsconfig.h.
  An attempt to use mariadb_my_sys.h without mcscofig.h
  (e.g. with my_config.h) is also caught by "#error".

- collation.h now can now be included in two ways.
  It now has the following effective structure:

    #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H)
    //  Remember current conflicting definitions on the preprocessor stack
    //  Undefine current conflicting definitions
    #endif
    #include "mcsconfig.h"
    #include "m_ctype.h"
    #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H)
    #    Restore conflicting definitions from the preprocessor stack
    #endif

  and can be included as follows:

  a. using only mcsconfig.h as a configuration header:

    // my_config.h must not be included so far
    #include "collation.h"

  b. using my_config.h as the first included configuration file:

    #define PREFER_MY_CONFIG_H // Force conflict resolution
    #include "my_config.h"     // can be included directly or indirectly
    ...
    #include "collation.h"

Other changes:

- Adding helper header files
     utils/common/mcsconfig_conflicting_defs_remember.h
     utils/common/mcsconfig_conflicting_defs_restore.h
     utils/common/mcsconfig_conflicting_defs_undef.h
  to perform conflict resolution easier.

- Removing `#include "collation.h"` from a number of files,
  as it's automatically included from rowgroup.h.

- Removing redundant `#include "utils_utf8.h"`.
  This change is not directly related to the problem being fixed,
  but it's nice to remove redundant directives for both collation.h
  and utils_utf8.h from all the files that do not really need them.
  (this change could probably have gone as a separate commit)

- Changing my_init() to MY_INIT(argv[0]) in the MCS services sources.
  After the fix of the complitation failure it appeared that ColumnStore
  services compiled with the debug build crash due to recent changes in
  safemalloc. The crash happened in strcmp() with `my_progname` as an argument
  (where my_progname is a mysys global variable). This problem should
  probably be fixed on the server side as well to avoid passing NULL.
  But, the majority of MariaDB executable programs also use MY_INIT(argv[0])
  rather than my_init(). So let's make MCS do like the other programs do.
2021-05-25 12:34:36 +04:00
Alexander Barkov
2ea73846b9 MCOL-4422 Remove mariadb.h and my_sys.h dependency from collation.h 2020-11-30 14:26:35 +04:00
Gagan Goel
995cadef2d MCOL-641 Fix alter table add wide decimal column.
This patch also removes CalpontSystemCatalog::BINARY and
ddlpackage::DDL_BINARY that were added during the initial
stages of the work on MCOL-641.
2020-11-20 19:49:54 -05:00
Alexander Barkov
3d7f5c6fd1 MCOL-4377 Split DataConvert::convertColumnData() 2020-11-18 13:53:16 +00:00
Alexander Barkov
d5c6645ba1 Adding mcs_basic_types.h
For now it consists of only:

using int128_t = __int128;
using uint128_t = unsigned __int128;

All new privitive data types should go into this file in the future.
2020-11-18 13:53:15 +00:00
Alexander Barkov
129d5b5a0f MCOL-4174 Review/refactor frontend/connector code 2020-11-18 13:53:15 +00:00
Roman Nozdrin
1588ebe439 MCOL-641 Clean up primitives code
Add int128_t support into ByteStream

Fixed UTs broken after collation patch
2020-11-18 13:52:19 +00:00
Gagan Goel
62d0c82d75 MCOL-641 1. Templatized convertValueNum() function.
2. Allocate int128_t buffers in batchprimitiveprocessor if
a query involves wide decimal columns.
2020-11-18 13:47:44 +00:00
drrtuy
b29d0c9daa MCOL-641 Changed the hint to search for GTest headers.
This commit introduces DataConvert UTs.

DataConvert::decimalToString now can negative values.

Next version for Row::toString(), applyMapping UT checks.

Row:equals() is now wide-DECIMAL aware.
2020-11-18 13:47:02 +00:00
Gagan Goel
55afcd8890 MCOL-641 Basic extent elimination support for Decimal38. 2020-11-18 13:47:01 +00:00
Gagan Goel
32f6167067 MCOL-641 Work of Ivan Zuniga on basic read and write support for Binary16 2020-11-18 13:47:00 +00:00
Alexey Antipovsky
c896a50a5d Add copy assign operator (-Wdeprecated-copy) 2020-11-17 15:03:10 +03:00
Alexey Antipovsky
ac459cd560 Add missing break (-Wimplicit-fallthrough) 2020-11-17 15:03:10 +03:00
David Hall
35c4b66a67 MCOL-4144 Enable lower_case_table_names
Create tables and schemas with lower case name only if the flag is set.
During operations, convert to lowercase in plugin. Byt the time a query gets to ExeMgr, DDLProc etc., everything must be lower case if the flag is set, and undisturbed if not.
2020-09-24 15:21:13 -05:00
benthompson15
eac7dab096 MCOL-4030: first commit of warning removals unneed const and missing virtual dtors. 2020-06-23 13:51:36 -05:00
David Hall
06e50e0926 MCOL-3536 collation 2020-05-26 12:42:11 -05:00
David Hall
1f3d1e6fd6 MCOL-3536 collation 2020-05-14 16:02:49 -05:00
Andrew Hutchings
dba7220ad3 Fix a few cppcheck issues
Found the following:

* Potential stack explosions with alloca() usage on potentially large
strings
* Memory leaks in WriteEngineServer
* Stack usage out of scope in dataconvert
* A typo in an 'if' statement in dataconvert
2019-11-21 13:52:53 +00:00
Roman Nozdrin
b1bc995420 Merge branch 'develop' into remove-infinidb 2019-08-13 12:32:01 +03:00
Andrew Hutchings
9d83b49fca MCOL-104 First pass of InfiniDB rename in code 2019-08-12 09:41:28 +01:00