mariadb-columnstore-engine

mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-07-30 19:23:07 +03:00

Author	SHA1	Message	Date
Leonid Fedorov	a46232f830	MCOL-5779: use encoding to check alter table alter column statement correctly	2024-08-28 16:27:02 +04:00
Roman Nozdrin	eb744eafed	chore(datatypes): this refactors the placement of the main SQL data types enum to enable templates that are parametrized with this enum(see mcs_datatype_basic.h changes for more details).	2023-10-24 18:44:35 +03:00
Leonid Fedorov	1c9cd9db9f	Fix garbage charset using ColType(int32_t colWidth_, int32_t scale_, int32_t precision_, const ConstraintType& constraintType_, const DictOID& ddn_, int32_t colPosition_, int32_t compressionType_, OID columnOID_, const ColDataType& colDataType_); (#2949 )	2023-09-06 20:01:31 +03:00
Gagan Goel	d50a0fa2e6	MCOL-5005 Add charset number to system catalog - Part 2. 1. Extend the calpontsys.syscolumn system catalog table with a new column, 'charsetnum'. 'charsetnum' field is set to the 'number' member of the 'charset_info_st' struct defined in the server in m_ctype.h. For CHAR/VARCHAR/TEXT column types, 'charset_info_st' is initialized to the charset/collation of the column, which is set at the column-level or at the table-level in the DDL. For BLOB/VARBINARY binary column types, 'charset_info_st' is initialized to my_charset_bin (charsetnum=63). For all other column types, charsetnum is set to 0. 2. Add support for the newly added 'charsetnum' column in the automatic system catalog upgrade logic in dbbuilder. For existing table definitions, charsetnum for the column is defaulted to 0. 3. Add MTR test case that creates a few table definitions with a range of charset/collation combinations and queries the calpontsys.syscolumn system catalog table with the charsetnum field for the columns in the table DDLs.	2023-08-15 17:21:47 +00:00
Roman Nozdrin	4fe9cd64a3	Revert "No boost condition (#2822 )" (#2828 ) This reverts commit `f916e64927`.	2023-04-22 15:49:50 +03:00
Leonid Fedorov	f916e64927	No boost condition (#2822 ) This patch replaces boost primitives with stdlib counterparts.	2023-04-22 00:42:45 +03:00
Sergey Zefirov	b53c231ca6	MCOL-271 empty strings should not be NULLs (#2794 ) This patch improves handling of NULLs in textual fields in ColumnStore. Previously empty strings were considered NULLs, which could be a problem if the data schema allows empty strings. It was also one of the major reasons for behavior differences between ColumnStore and other engines in the MariaDB family. Also, this patch fixes some other bugs and incorrect behavior, for example, the incorrect comparison "column <= ''", which evaluated to constant True for all purposes before this patch.	2023-03-30 21:18:29 +03:00
Gagan Goel	86dcf92d56	MCOL-5215 Fix overflow of UNION operation involving DECIMAL datatypes. When a UNION operation involving DECIMAL datatypes with scale and digits before the decimal exceeds the currently supported maximum precision of 38, we throw an error to the user: "MCS-2060: Union operation exceeds maximum DECIMAL precision of 38". This is until MCOL-5417 is implemented where ColumnStore will have full parity with MariaDB server in terms of maximum supported DECIMAL precision and scale of 65 and 38 digits respectively.	2023-02-27 06:38:31 -05:00
Roman Nozdrin	ff534dba7f	MCOL-5384 This commit replaces shared pointer to CSC with CSC ctor that is cleaned up leaving a scope. CSC default ctor was private b/c it must not allow to use CSC outside thread cache. However, there are some places in the plugin code that need a standalone syscat that is cleaned up leaving the scope. The decision is to make the restriction mentioned organizational rather than syntactical.	2023-02-08 14:03:41 +00:00
Gagan Goel	6a6fee5969	MCOL-5021 Followup. Allow the compiler to inline the call to nextColValue() in column.cpp.	2022-08-18 19:35:35 +00:00
Gagan Goel	1355237ca3	MCOL-5021 Some minor fixes.	2022-08-05 14:40:50 -04:00
Gagan Goel	94e9f55940	MCOL-5021 Add a new member function to the DBRM class, DBRM::addToLBIDList(). This function iterates over lbidList (populated by an earlier call to DBRM::getUncommittedExtentLBIDs()) to find those LBIDs which belong to the AUX column. It then finds the corresponding LBIDs for all other columns which belong to the same table as the AUX LBID and appends them to lbidList. The updated lbidList is used by invalidateUncommittedExtentLBIDs() to update the casual partitioning information. DBRM::addToLBIDList() only comes into play in case of a transaction ROLLBACK.	2022-08-05 14:40:50 -04:00
Gagan Goel	ea1861fdb5	MCOL-5021 Add a new function to CalpontSystemCatalog class, isAUXColumnOID(), to check if a given OID is an auxiliary column OID.	2022-08-05 14:40:49 -04:00
Gagan Goel	262cd5c501	MCOL-5021 Remove hard-coded values for data type, column width and compression type for the AUX column, and replace them with constants defined in the execplan namespace.	2022-08-05 14:40:49 -04:00
Gagan Goel	86df9a972c	MCOL-5021 Add prototype support for the AUX column in CREATE/DROP DDL commands, single and multi-value INSERTs, cpimport, and DELETE.	2022-08-05 14:40:49 -04:00
Leonid Fedorov	39c43a0f70	<unnamed>.execplan::CalpontSystemCatalog::TableName::create_date' may be used uninitialized	2022-07-11 22:27:25 +02:00
Serguey Zefirov	53b9a2a0f9	MCOL-4580 extent elimination for dictionary-based text/varchar types The idea is relatively simple - encode prefixes of collated strings as integers and use them to compute extents' ranges. Then we can eliminate extents with strings. The actual patch does have all the code there but misses one important step: we do not keep collation index, we keep charset index. Because of this, some of the tests in the bugfix suite fail and thus main functionality is turned off. The reason this patch is put into a PR at all is that it contains changes that made CHAR/VARCHAR columns unsigned. This change is needed in vectorization work.	2022-03-02 23:53:39 +03:00
Leonid Fedorov	3919c541ac	New warnfixes (#2254 ) * Fix clang warnings * Remove vim tab guides * initialize variables * 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length * Fix ISO C++17 does not allow 'register' storage class specifier for outdated bison * chars are unsigned on ARM, having if (ival < 0) always false * chars are unsigned by default on ARM and comparison with -1 is always true	2022-02-17 13:08:58 +03:00
Gagan Goel	973e5024d8	MCOL-4957 Fix performance slowdown for processing TIMESTAMP columns. Part 1: As part of MCOL-3776 to address synchronization issue while accessing the fTimeZone member of the Func class, mutex locks were added to the accessor and mutator methods. However, this slows down processing of TIMESTAMP columns in PrimProc significantly as all threads across all concurrently running queries would serialize on the mutex. This is because PrimProc only has a single global object for the functor class (class derived from Func in utils/funcexp/functor.h) for a given function name. To fix this problem: (1) We remove the fTimeZone as a member of the Func derived classes (hence removing the mutexes) and instead use the fOperationType member of the FunctionColumn class to propagate the timezone values down to the individual functor processing functions such as FunctionColumn::getStrVal(), FunctionColumn::getIntVal(), etc. (2) To achieve (1), a timezone member is added to the execplan::CalpontSystemCatalog::ColType class. Part 2: Several functors in the Funcexp code call dataconvert::gmtSecToMySQLTime() and dataconvert::mySQLTimeToGmtSec() functions for conversion between seconds since unix epoch and broken-down representation. These functions in turn call the C library function localtime_r() which currently has a known bug of holding a global lock via a call to __tz_convert. This significantly reduces performance in multi-threaded applications where multiple threads concurrently call localtime_r(). More details on the bug: https://sourceware.org/bugzilla/show_bug.cgi?id=16145 This bug in localtime_r() caused processing of the Functors in PrimProc to slow down significantly since a query execution causes Functors code to be processed in a multi-threaded manner. 
As a fix, we remove the calls to localtime_r() from gmtSecToMySQLTime() and mySQLTimeToGmtSec() by performing the timezone-to-offset conversion (done in dataconvert::timeZoneToOffset()) during the execution plan creation in the plugin. Note that localtime_r() is only called when the time_zone system variable is set to "SYSTEM". This fix also required changing the timezone type from a std::string to a long across the system.	2022-02-14 14:12:27 -05:00
Leonid Fedorov	04752ec546	clang format apply	2022-01-21 16:43:49 +00:00
Leonid Fedorov	01f3ceb437	replace header guards with #pragma once	2022-01-21 15:24:58 +00:00
Gagan Goel	b3a560300c	Revert "Merge pull request #2022 from mariadb-corporation/bar-develop-MCOL-4791" This reverts commit `4016e25e5b`, reversing changes made to `85435f6b1e`.	2021-07-13 11:06:56 +00:00
Roman Nozdrin	fb5ba84212	MCOL-4802 Removed ByteStream methods for bool manipulations and add some logging into I_S.columnstore_files	2021-07-07 07:16:30 +00:00
Alexander Barkov	e8126bede5	MCOL-4791 Fix ColumnCommand fudged data type format to clearly identify CHAR vs VARCHAR	2021-07-02 12:42:03 +04:00
Alexander Barkov	9608533d92	MCOL-4734 Compilation failure: MariaDB-10.6 + ColumnStore-develop mcsconfig.h and my_config.h have the following pre-processor definitions: 1. Conflicting definitions coming from the standard cmake definitions: - PACKAGE - PACKAGE_BUGREPORT - PACKAGE_NAME - PACKAGE_STRING - PACKAGE_TARNAME - PACKAGE_VERSION - VERSION 2. Conflicting definitions of other kinds: - HAVE_STRTOLL - this is dirt in the MariaDB headers. Should be fixed in the server code. my_config.h erroneously performs "#define HAVE_STRTOLL" instead of "#define HAVE_STRTOLL 1" in some cases. The former is not CMake compatible style. The latter is. 3. Non-conflicting definitions: Otherwise, mcsconfig.h and my_config.h should be mutually compatible, because both are generated by cmake on the same host machine. So they should have exactly equal definitions like "HAVE_XXX", "SIZEOF_XXX", etc. Observations: - It's OK to include both mcsconfig.h and my_config.h providing that we suppress duplicate definition of the above conflicting types #1 and #2. - There is no need to suppress duplicate definitions mentioned in #3, as they are compatible! - my_sys.h and m_ctype.h must always follow a CMake configuration header, either my_config.h or mcsconfig.h (or both). They must never be included without any preceding configuration header. This change makes sure that we resolve conflicts by: - either disallowing inclusion of mcsconfig.h and my_config.h at the same time - or by hiding conflicting definitions #1 and #2 (with their later restoring). - also, by making sure that my_sys.h and m_ctype.h always follow a CMake configuration file. Details: - idb_mysql.h can now be included only after my_config.h. An attempt to use idb_mysql.h with mcsconfig.h instead of my_config.h is caught by the "#error" preprocessor directive. - mariadb_my_sys.h can now be only included after mcsconfig.h. An attempt to use mariadb_my_sys.h without mcsconfig.h (e.g. with my_config.h) is also caught by "#error". 
- collation.h can now be included in two ways. It now has the following effective structure: #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H) // Remember current conflicting definitions on the preprocessor stack // Undefine current conflicting definitions #endif #include "mcsconfig.h" #include "m_ctype.h" #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H) # Restore conflicting definitions from the preprocessor stack #endif and can be included as follows: a. using only mcsconfig.h as a configuration header: // my_config.h must not be included so far #include "collation.h" b. using my_config.h as the first included configuration file: #define PREFER_MY_CONFIG_H // Force conflict resolution #include "my_config.h" // can be included directly or indirectly ... #include "collation.h" Other changes: - Adding helper header files utils/common/mcsconfig_conflicting_defs_remember.h utils/common/mcsconfig_conflicting_defs_restore.h utils/common/mcsconfig_conflicting_defs_undef.h to perform conflict resolution easier. - Removing `#include "collation.h"` from a number of files, as it's automatically included from rowgroup.h. - Removing redundant `#include "utils_utf8.h"`. This change is not directly related to the problem being fixed, but it's nice to remove redundant directives for both collation.h and utils_utf8.h from all the files that do not really need them. (this change could probably have gone as a separate commit) - Changing my_init() to MY_INIT(argv[0]) in the MCS services sources. After the fix of the compilation failure it appeared that ColumnStore services compiled with the debug build crash due to recent changes in safemalloc. The crash happened in strcmp() with `my_progname` as an argument (where my_progname is a mysys global variable). This problem should probably be fixed on the server side as well to avoid passing NULL. But, the majority of MariaDB executable programs also use MY_INIT(argv[0]) rather than my_init(). 
So let's make MCS do like the other programs do.	2021-05-25 12:34:36 +04:00
Alexander Barkov	2ea73846b9	MCOL-4422 Remove mariadb.h and my_sys.h dependency from collation.h	2020-11-30 14:26:35 +04:00
Gagan Goel	995cadef2d	MCOL-641 Fix alter table add wide decimal column. This patch also removes CalpontSystemCatalog::BINARY and ddlpackage::DDL_BINARY that were added during the initial stages of the work on MCOL-641.	2020-11-20 19:49:54 -05:00
Alexander Barkov	3d7f5c6fd1	MCOL-4377 Split DataConvert::convertColumnData()	2020-11-18 13:53:16 +00:00
Alexander Barkov	d5c6645ba1	Adding mcs_basic_types.h For now it consists of only: using int128_t = __int128; using uint128_t = unsigned __int128; All new primitive data types should go into this file in the future.	2020-11-18 13:53:15 +00:00
Alexander Barkov	129d5b5a0f	MCOL-4174 Review/refactor frontend/connector code	2020-11-18 13:53:15 +00:00
Roman Nozdrin	1588ebe439	MCOL-641 Clean up primitives code Add int128_t support into ByteStream Fixed UTs broken after collation patch	2020-11-18 13:52:19 +00:00
Gagan Goel	62d0c82d75	MCOL-641 1. Templatized convertValueNum() function. 2. Allocate int128_t buffers in batchprimitiveprocessor if a query involves wide decimal columns.	2020-11-18 13:47:44 +00:00
drrtuy	b29d0c9daa	MCOL-641 Changed the hint to search for GTest headers. This commit introduces DataConvert UTs. DataConvert::decimalToString can now handle negative values. Next version for Row::toString(), applyMapping UT checks. Row::equals() is now wide-DECIMAL aware.	2020-11-18 13:47:02 +00:00
Gagan Goel	55afcd8890	MCOL-641 Basic extent elimination support for Decimal38.	2020-11-18 13:47:01 +00:00
Gagan Goel	32f6167067	MCOL-641 Work of Ivan Zuniga on basic read and write support for Binary16	2020-11-18 13:47:00 +00:00
Alexey Antipovsky	c896a50a5d	Add copy assign operator (-Wdeprecated-copy)	2020-11-17 15:03:10 +03:00
Alexey Antipovsky	ac459cd560	Add missing break (-Wimplicit-fallthrough)	2020-11-17 15:03:10 +03:00
David Hall	35c4b66a67	MCOL-4144 Enable lower_case_table_names Create tables and schemas with lower case name only if the flag is set. During operations, convert to lowercase in plugin. By the time a query gets to ExeMgr, DDLProc etc., everything must be lower case if the flag is set, and undisturbed if not.	2020-09-24 15:21:13 -05:00
benthompson15	eac7dab096	MCOL-4030: first commit of warning removals: unneeded const and missing virtual dtors.	2020-06-23 13:51:36 -05:00
David Hall	06e50e0926	MCOL-3536 collation	2020-05-26 12:42:11 -05:00
David Hall	1f3d1e6fd6	MCOL-3536 collation	2020-05-14 16:02:49 -05:00
Andrew Hutchings	dba7220ad3	Fix a few cppcheck issues Found the following: * Potential stack explosions with alloca() usage on potentially large strings * Memory leaks in WriteEngineServer * Stack usage out of scope in dataconvert * A typo in an 'if' statement in dataconvert	2019-11-21 13:52:53 +00:00
Roman Nozdrin	b1bc995420	Merge branch 'develop' into remove-infinidb	2019-08-13 12:32:01 +03:00
Andrew Hutchings	9d83b49fca	MCOL-104 First pass of InfiniDB rename in code	2019-08-12 09:41:28 +01:00
Patrick LeBlanc	a09a9d5d0f	Mass substitution 'Corporaton' -> 'Corporation'	2019-08-07 14:43:25 -05:00
Gagan Goel	e89d1ac3cf	MCOL-265 Add support for TIMESTAMP data type	2019-04-23 00:00:09 -04:00
Andrew Hutchings	064d2ee9e4	Merge branch 'develop-1.2' into develop-merge-up-20190328	2019-03-28 15:09:21 +00:00
David Hall	3f2c753947	MCOL-1822-c final checkin	2019-03-05 09:33:39 -06:00
David Hall	c5b9ae11e5	MCOL-1822 add LONG DOUBLE support	2019-01-29 09:55:43 -06:00
Gagan Goel	d1ada75395	MCOL-270 Add support for MEDIUMINT data type	2018-12-30 19:13:16 -05:00
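
The MCOL-5215 entry in the log describes a precision check on UNION results: when the combined scale and the digits before the decimal point need more than 38 digits, the query fails with MCS-2060. A minimal sketch of such a check follows. The `DecimalType` struct, the `unionDecimalType()` helper, and the widening rule (largest scale plus largest integer-digit count) are assumptions made for illustration and are not taken from the ColumnStore sources.

```cpp
#include <algorithm>

// Hypothetical stand-in for a DECIMAL(precision, scale) column type.
struct DecimalType
{
  int precision;  // total number of digits
  int scale;      // digits after the decimal point
};

// Maximum precision quoted in the MCOL-5215 commit message.
const int kMaxDecimalPrecision = 38;

// Assumed widening rule: keep the larger scale and the larger count of
// digits before the decimal point, so values from both inputs fit.
// Returns false when the widened type would need more than 38 digits,
// which is the case the commit message reports as MCS-2060.
inline bool unionDecimalType(const DecimalType& a, const DecimalType& b, DecimalType& result)
{
  const int scale = std::max(a.scale, b.scale);
  const int intDigits = std::max(a.precision - a.scale, b.precision - b.scale);
  const int precision = intDigits + scale;

  if (precision > kMaxDecimalPrecision)
    return false;

  result.precision = precision;
  result.scale = scale;
  return true;
}
```

Under this assumed rule, DECIMAL(10,2) UNION DECIMAL(12,4) widens to DECIMAL(12,4), while DECIMAL(38,0) UNION DECIMAL(38,10) would need 48 digits and is rejected.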


67 Commits