1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-07-30 19:23:07 +03:00
Commit Graph

67 Commits

Author SHA1 Message Date
a46232f830 MCOL-5779: use encoding to check alter table alter column statement correctly 2024-08-28 16:27:02 +04:00
eb744eafed chore(datatypes): this refactors the placement of the main SQL data types enum to enable templates that are parametrized with this enum(see mcs_datatype_basic.h changes for more details). 2023-10-24 18:44:35 +03:00
1c9cd9db9f Fix garbage charset using ColType(int32_t colWidth_, int32_t scale_, int32_t precision_, (#2949)
const ConstraintType& constraintType_, const DictOID& ddn_, int32_t colPosition_,
            int32_t compressionType_, OID columnOID_, const ColDataType& colDataType_);
2023-09-06 20:01:31 +03:00
d50a0fa2e6 MCOL-5005 Add charset number to system catalog - Part 2.
1. Extend the calpontsys.syscolumn system catalog table
  with a new column, 'charsetnum'.

  'charsetnum' field is set to the 'number' member of the
  'charset_info_st' struct defined in the server in m_ctype.h.

  For CHAR/VARCHAR/TEXT column types, 'charset_info_st' is
  initialized to the charset/collation of the column, which
  is set at the column-level or at the table-level in the DDL.

  For BLOB/VARBINARY binary column types, 'charset_info_st' is
  initialized to my_charset_bin (charsetnum=63).

  For all other column types, charsetnum is set to 0.

  2. Add support for the newly added 'charsetnum' column in the
  automatic system catalog upgrade logic in dbbuilder.

  For existing table definitions, charsetnum for the column is
  defaulted to 0.

  3. Add MTR test case that creates a few table definitions with
  a range of charset/collation combinations and queries the
  calpontsys.syscolumn system catalog table with the charsetnum
  field for the columns in the table DDLs.
2023-08-15 17:21:47 +00:00
4fe9cd64a3 Revert "No boost condition (#2822)" (#2828)
This reverts commit f916e64927.
2023-04-22 15:49:50 +03:00
f916e64927 No boost condition (#2822)
This patch replaces boost primitives with stdlib counterparts.
2023-04-22 00:42:45 +03:00
b53c231ca6 MCOL-271 empty strings should not be NULLs (#2794)
This patch improves handling of NULLs in textual fields in ColumnStore.
Previously empty strings were considered NULLs and it could be a problem
if data scheme allows for empty strings. It was also one of major
reasons of behavior difference between ColumnStore and other engines in
MariaDB family.

Also, this patch fixes some other bugs and incorrect behavior, for
example, incorrect comparison for "column <= ''" which evaluates to
constant True for all purposes before this patch.
2023-03-30 21:18:29 +03:00
86dcf92d56 MCOL-5215 Fix overflow of UNION operation involving DECIMAL datatypes.
When a UNION operation involving DECIMAL datatypes with scale and digits
before the decimal exceeds the currently supported maximum precision
of 38, we throw an error to the user:
"MCS-2060: Union operation exceeds maximum DECIMAL precision of 38".

This is until MCOL-5417 is implemented where ColumnStore will have
full parity with MariaDB server in terms of maximum supported DECIMAL
precision and scale of 65 and 38 digits respectively.
2023-02-27 06:38:31 -05:00
ff534dba7f MCOL-5384 This commit replaces shared pointer to CSC with CSC ctor that is cleaned up leaving a scope
CSC default ctor was private b/c it must not allow to use CSC outside thread cache.
  However there are some places in the plugin code that need a standalone syscat that
  is cleaned up leaving the scope. The decision is to make the restriction mentioned
  organizational rather than syntactical.
2023-02-08 14:03:41 +00:00
6a6fee5969 MCOL-5021 Followup.
Allow the compiler to inline the call to nextColValue() in column.cpp.
2022-08-18 19:35:35 +00:00
1355237ca3 MCOL-5021 Some minor fixes. 2022-08-05 14:40:50 -04:00
94e9f55940 MCOL-5021 Add a new member function to the DBRM class, DBRM::addToLBIDList().
This function iterates over lbidList (populated by an earlier call to
DBRM::getUncommittedExtentLBIDs()) to find those LBIDs which belong to
the AUX column. It then finds the corresponding LBIDs for all other columns
which belong to the same table as the AUX LBID and appends them to lbidList.
The updated lbidList is used by invalidateUncommittedExtentLBIDs() to update
the casual partitioning information.

DBRM::addToLBIDList() only comes into play in case of a transaction ROLLBACK.
2022-08-05 14:40:50 -04:00
ea1861fdb5 MCOL-5021 Add a new function to CalpontSystemCatalog class,
isAUXColumnOID(), to check if a given OID is an auxilliary
column OID.
2022-08-05 14:40:49 -04:00
262cd5c501 MCOL-5021 Remove hard-coded values for data type, column width
and compression type for the AUX column, and replace them with
constants defined in the execplan namespace.
2022-08-05 14:40:49 -04:00
86df9a972c MCOL-5021 Add prototype support for the AUX column in CREATE/DROP
DDL commands, single and multi-value INSERTs, cpimport, and
DELETE.
2022-08-05 14:40:49 -04:00
39c43a0f70 <unnamed>.execplan::CalpontSystemCatalog::TableName::create_date' may be used uninitialized 2022-07-11 22:27:25 +02:00
53b9a2a0f9 MCOL-4580 extent elimination for dictionary-based text/varchar types
The idea is relatively simple - encode prefixes of collated strings as
integers and use them to compute extents' ranges. Then we can eliminate
extents with strings.

The actual patch does have all the code there but miss one important
step: we do not keep collation index, we keep charset index. Because of
this, some of the tests in the bugfix suite fail and thus main
functionality is turned off.

The reason of this patch to be put into PR at all is that it contains
changes that made CHAR/VARCHAR columns unsigned. This change is needed in
vectorization work.
2022-03-02 23:53:39 +03:00
3919c541ac New warnfixes (#2254)
* Fix clang warnings

* Remove vim tab guides

* initialize variables

* 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length

* Fix ISO C++17 does not allow 'register' storage class specifier for outdated bison

* chars are unsigned on ARM, having  if (ival < 0) always false

* chars are unsigned by default on ARM and comparison with -1 if always true
2022-02-17 13:08:58 +03:00
973e5024d8 MCOL-4957 Fix performance slowdown for processing TIMESTAMP columns.
Part 1:
 As part of MCOL-3776 to address synchronization issue while accessing
 the fTimeZone member of the Func class, mutex locks were added to the
 accessor and mutator methods. However, this slows down processing
 of TIMESTAMP columns in PrimProc significantly as all threads across
 all concurrently running queries would serialize on the mutex. This
 is because PrimProc only has a single global object for the functor
 class (class derived from Func in utils/funcexp/functor.h) for a given
 function name. To fix this problem:

   (1) We remove the fTimeZone as a member of the Func derived classes
   (hence removing the mutexes) and instead use the fOperationType
   member of the FunctionColumn class to propagate the timezone values
   down to the individual functor processing functions such as
   FunctionColumn::getStrVal(), FunctionColumn::getIntVal(), etc.

   (2) To achieve (1), a timezone member is added to the
   execplan::CalpontSystemCatalog::ColType class.

Part 2:
 Several functors in the Funcexp code call dataconvert::gmtSecToMySQLTime()
 and dataconvert::mySQLTimeToGmtSec() functions for conversion between seconds
 since unix epoch and broken-down representation. These functions in turn call
 the C library function localtime_r() which currently has a known bug of holding
 a global lock via a call to __tz_convert. This significantly reduces performance
 in multi-threaded applications where multiple threads concurrently call
 localtime_r(). More details on the bug:
   https://sourceware.org/bugzilla/show_bug.cgi?id=16145

 This bug in localtime_r() caused processing of the Functors in PrimProc to
 slowdown significantly since a query execution causes Functors code to be
 processed in a multi-threaded manner.

 As a fix, we remove the calls to localtime_r() from gmtSecToMySQLTime()
 and mySQLTimeToGmtSec() by performing the timezone-to-offset conversion
 (done in dataconvert::timeZoneToOffset()) during the execution plan
 creation in the plugin. Note that localtime_r() is only called when the
 time_zone system variable is set to "SYSTEM".

 This fix also required changing the timezone type from a std::string to
 a long across the system.
2022-02-14 14:12:27 -05:00
04752ec546 clang format apply 2022-01-21 16:43:49 +00:00
01f3ceb437 replace header guards with #pragma once 2022-01-21 15:24:58 +00:00
b3a560300c Revert "Merge pull request #2022 from mariadb-corporation/bar-develop-MCOL-4791"
This reverts commit 4016e25e5b, reversing
changes made to 85435f6b1e.
2021-07-13 11:06:56 +00:00
fb5ba84212 MCOL-4802 Removed ByteStream methods for bool manipulations and add some logging into I_S.columnstore_files 2021-07-07 07:16:30 +00:00
e8126bede5 MCOL-4791 Fix ColumnCommand fudged data type format to clearly identify CHAR vs VARCHAR 2021-07-02 12:42:03 +04:00
9608533d92 MCOL-4734 Compilation failure: MariaDB-10.6 + ColumnStore-develop
mcsconfig.h and my_config.h have the following
pre-processor definitions:

1. Conflicting definitions coming from the standard cmake definitions:
- PACKAGE
- PACKAGE_BUGREPORT
- PACKAGE_NAME
- PACKAGE_STRING
- PACKAGE_TARNAME
- PACKAGE_VERSION
- VERSION

2. Conflicting definitions of other kinds:
- HAVE_STRTOLL - this is a dirt in MariaDB headers.
  Should be fixed in the server code. my_config.h erroneously
  performs "#define HAVE_STRTOLL" instead of "#define HAVE_STRTOLL 1".
  in some cases. The former is not CMake compatible style. The latter is.

3. Non-conflicting definitions:
  Otherwise, mcsconfig.h and my_config.h should be mutually compatible,
  because both are generated by cmake on the same host machine. So
  they should have exactly equal definitions like "HAVE_XXX", "SIZEOF_XXX", etc.

Observations:
- It's OK to include both mcsconfig.h and my_config.h providing that we
  suppress duplicate definition of the above conflicting types #1 and #2.
- There is no a need to suppress duplicate definitions mentioned in #3,
  as they are compatible!
- my_sys.h and m_ctype.h must always follow a CMake configuation header,
  either my_config.h or mcsconfig.h (or both).
  They must never be included without any preceeding configuration header.

This change make sure that we resolve conflicts by:
- either disallowing inclusion of mcsconfig.h and my_config.h
  at the same time
- or by hiding conflicting definitions #1 and #2
  (with their later restoring).
- also, by making sure that my_sys.h and m_ctype.h always follow
  a CMake configuration file.

Details:
- idb_mysql.h can now only be included only after my_config.h
  An attempt to use idb_mysql.h with mcsconfig.h instead of
  my_config.h is caught by the "#error" preprocessor directive.

- mariadb_my_sys.h can now be only included after mcsconfig.h.
  An attempt to use mariadb_my_sys.h without mcscofig.h
  (e.g. with my_config.h) is also caught by "#error".

- collation.h now can now be included in two ways.
  It now has the following effective structure:

    #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H)
    //  Remember current conflicting definitions on the preprocessor stack
    //  Undefine current conflicting definitions
    #endif
    #include "mcsconfig.h"
    #include "m_ctype.h"
    #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H)
    #    Restore conflicting definitions from the preprocessor stack
    #endif

  and can be included as follows:

  a. using only mcsconfig.h as a configuration header:

    // my_config.h must not be included so far
    #include "collation.h"

  b. using my_config.h as the first included configuration file:

    #define PREFER_MY_CONFIG_H // Force conflict resolution
    #include "my_config.h"     // can be included directly or indirectly
    ...
    #include "collation.h"

Other changes:

- Adding helper header files
     utils/common/mcsconfig_conflicting_defs_remember.h
     utils/common/mcsconfig_conflicting_defs_restore.h
     utils/common/mcsconfig_conflicting_defs_undef.h
  to perform conflict resolution easier.

- Removing `#include "collation.h"` from a number of files,
  as it's automatically included from rowgroup.h.

- Removing redundant `#include "utils_utf8.h"`.
  This change is not directly related to the problem being fixed,
  but it's nice to remove redundant directives for both collation.h
  and utils_utf8.h from all the files that do not really need them.
  (this change could probably have gone as a separate commit)

- Changing my_init() to MY_INIT(argv[0]) in the MCS services sources.
  After the fix of the complitation failure it appeared that ColumnStore
  services compiled with the debug build crash due to recent changes in
  safemalloc. The crash happened in strcmp() with `my_progname` as an argument
  (where my_progname is a mysys global variable). This problem should
  probably be fixed on the server side as well to avoid passing NULL.
  But, the majority of MariaDB executable programs also use MY_INIT(argv[0])
  rather than my_init(). So let's make MCS do like the other programs do.
2021-05-25 12:34:36 +04:00
2ea73846b9 MCOL-4422 Remove mariadb.h and my_sys.h dependency from collation.h 2020-11-30 14:26:35 +04:00
995cadef2d MCOL-641 Fix alter table add wide decimal column.
This patch also removes CalpontSystemCatalog::BINARY and
ddlpackage::DDL_BINARY that were added during the initial
stages of the work on MCOL-641.
2020-11-20 19:49:54 -05:00
3d7f5c6fd1 MCOL-4377 Split DataConvert::convertColumnData() 2020-11-18 13:53:16 +00:00
d5c6645ba1 Adding mcs_basic_types.h
For now it consists of only:

using int128_t = __int128;
using uint128_t = unsigned __int128;

All new privitive data types should go into this file in the future.
2020-11-18 13:53:15 +00:00
129d5b5a0f MCOL-4174 Review/refactor frontend/connector code 2020-11-18 13:53:15 +00:00
1588ebe439 MCOL-641 Clean up primitives code
Add int128_t support into ByteStream

Fixed UTs broken after collation patch
2020-11-18 13:52:19 +00:00
62d0c82d75 MCOL-641 1. Templatized convertValueNum() function.
2. Allocate int128_t buffers in batchprimitiveprocessor if
a query involves wide decimal columns.
2020-11-18 13:47:44 +00:00
b29d0c9daa MCOL-641 Changed the hint to search for GTest headers.
This commit introduces DataConvert UTs.

DataConvert::decimalToString now can negative values.

Next version for Row::toString(), applyMapping UT checks.

Row:equals() is now wide-DECIMAL aware.
2020-11-18 13:47:02 +00:00
55afcd8890 MCOL-641 Basic extent elimination support for Decimal38. 2020-11-18 13:47:01 +00:00
32f6167067 MCOL-641 Work of Ivan Zuniga on basic read and write support for Binary16 2020-11-18 13:47:00 +00:00
c896a50a5d Add copy assign operator (-Wdeprecated-copy) 2020-11-17 15:03:10 +03:00
ac459cd560 Add missing break (-Wimplicit-fallthrough) 2020-11-17 15:03:10 +03:00
35c4b66a67 MCOL-4144 Enable lower_case_table_names
Create tables and schemas with lower case name only if the flag is set.
During operations, convert to lowercase in plugin. Byt the time a query gets to ExeMgr, DDLProc etc., everything must be lower case if the flag is set, and undisturbed if not.
2020-09-24 15:21:13 -05:00
eac7dab096 MCOL-4030: first commit of warning removals unneed const and missing virtual dtors. 2020-06-23 13:51:36 -05:00
06e50e0926 MCOL-3536 collation 2020-05-26 12:42:11 -05:00
1f3d1e6fd6 MCOL-3536 collation 2020-05-14 16:02:49 -05:00
dba7220ad3 Fix a few cppcheck issues
Found the following:

* Potential stack explosions with alloca() usage on potentially large
strings
* Memory leaks in WriteEngineServer
* Stack usage out of scope in dataconvert
* A typo in an 'if' statement in dataconvert
2019-11-21 13:52:53 +00:00
b1bc995420 Merge branch 'develop' into remove-infinidb 2019-08-13 12:32:01 +03:00
9d83b49fca MCOL-104 First pass of InfiniDB rename in code 2019-08-12 09:41:28 +01:00
a09a9d5d0f Mass substitution 'Corporaton' -> 'Corporation' 2019-08-07 14:43:25 -05:00
e89d1ac3cf MCOL-265 Add support for TIMESTAMP data type 2019-04-23 00:00:09 -04:00
064d2ee9e4 Merge branch 'develop-1.2' into develop-merge-up-20190328 2019-03-28 15:09:21 +00:00
3f2c753947 MCOL-1822-c final checkin 2019-03-05 09:33:39 -06:00
c5b9ae11e5 MCOL-1822 add LONG DOUBLE support 2019-01-29 09:55:43 -06:00
d1ada75395 MCOL-270 Add support for MEDIUMINT data type 2018-12-30 19:13:16 -05:00