mariadb-columnstore-engine

mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-10-30 07:25:34 +03:00

Author	SHA1	Message	Date
Serguey Zefirov	bb631dcffb	feat(JSON,data_type): MCOL-6197 - support for JSON type This patch does exactly this, it implements support for JSON in DDL. Right now we use server's check for JSON validity on INSERT. We do not implement JSON validity check during updates, it is postponed for later work.	2025-10-17 11:48:46 +03:00
Leonid Fedorov	f002c5abc1	MCOL-5843: fix(extents): print correct N/A for temporal extents	2025-10-02 22:15:01 +04:00
Leonid Fedorov	878efe55ba	fix(engine): MCOL-5778: if function fixed with handling temporal times and null values	2025-09-12 15:02:51 +04:00
Leonid Fedorov	a7e5115ad6	chore(build): no more need for this oververbose messages on build stage	2025-08-04 19:58:49 +04:00
Leonid Fedorov	8a535e872c	Revert "Deep build refactoring phase 2 (#3564 )" (#3678 ) This reverts commit `449029a827`.	2025-08-01 02:56:31 +04:00
Leonid Fedorov	82421c208f	chore(ci): collect asan ubsan and libc++ build with mtr and regression status ignored (#3672) * MSan added with fixes for libc++ * libc++ separate build * add libc++ to ci * libstdc++ in CI * libcpp and msan to external projects * std::sqrt * awful_hack(ci): install whole llvm instead of libc++ in terrible way for test containers * Adding ddeb packages for teststages and repos * libc++ more for test container * save some money on debug * colored coredumps * revert ci * chore(ci): collect asan ubsan and libc++ build with mtr and regression status ignored	2025-07-31 00:32:32 +04:00
Leonid Fedorov	449029a827	Deep build refactoring phase 2 (#3564 ) * configcpp refactored * chore(build): massive removals, auto add files to debian install file * chore(build): configure before autobake * chore(build): use custom cmake commands for components, mariadb-plugin-columnstore.install generated * chore(build): install deps as separate step for build-packages * more deps * chore(codemanagement, build): build refactoring stage2 * chore(safety): Locked Map for MessageqCpp with a simpler way Please enter the commit message for your changes. Lines starting * chore(codemanagement, ci): better coredumps handling, deps fixed * Delete build/bootstrap_mcs.py * Update charset.cpp (add license)	2025-07-17 16:14:10 +04:00
Leonid Fedorov	aa7e0fb9b4	Deep build refactoring phase 1 (#3562) * configcpp refactored * logging and datatypes refactored * more dataconvert * chore(build): massive removals, auto add files to debian install file * chore(codemanagement): nodeps headers, potential library * chore(build): configure before autobake * chore(build): use custom cmake commands for components, mariadb-plugin-columnstore.install generated * chore(build): install deps as separate step for build-packages * more deps * check debian/mariadb-plugin-columnstore.install automatically * chore(build): add option for multibranch compilation * Fix warning	2025-05-30 14:05:21 +04:00
Leonid Fedorov	6db2dc668f	stubs and cmake formatting	2025-05-20 18:22:59 +04:00
Leonid Fedorov	2036e521c7	named linkage	2025-05-20 18:22:59 +04:00
Leonid Fedorov	16303bef2b	chore(build): clang-20 warnings fixed	2025-05-15 19:05:38 +04:00
Leonid Fedorov	a0bee173f6	chore(build): fixes to satisfy clang19 warnings	2025-05-15 19:05:38 +04:00
Aleksei Antipovskii	0ab03c7258	chore(codestyle): mark virtual methods as override	2025-02-21 20:01:34 +04:00
Denis Khalikov	9f4231f87f	MCOL-5708 Calculate precision and scale for constant decimal. (#3227 ) This patch calculates precision and scale for constant decimal value for SUM aggregation function.	2024-06-28 00:31:03 +04:00
Theresa Hradilak	48562e41f9	feat(datatypes): MCOL-4632 and MCOL-4648, fix cast leads to NULL. Remove redundant cast. As C-style casts with a type name in parentheses are interpreted as static_casts this literally just changes the interpretation around (and forces an implicit cast to match the return value of the function). Switch UBIGINTNULL and UBIGINTEMPTYROW constants for consistency. Make consistent with relation between BIGINTNULL and BIGINTEMPTYROW & make adapted cast behaviour due to NULL markers more intuitive. (After this change we can simply block the highest possible uint64_t value and if a cast results in it, print the next lower value (2^64 - 2). Previously, (2^64 - 1) was able to be printed, but (2^64 - 2) as being blocked by the UBIGINTNULL constant was not, making finding the appropriate replacement value to give out more confusing.) Introduce MAX_MCS_UBIGINT and MIN_MCS_BIGINT and adapt casts. Adapt casting to BIGINT to remove NULL marker error. Add bugfix regression test for MCOL 4632 Add regression test for mcol_4648 Revert "Switch UBIGINTNULL and UBIGINTEMPTYROW constants for consistency." This reverts commit 83eac11b18937ecb0b4c754dd48e4cb47310f620. Due to backwards compatibility issues. Refactor casting to MCS[U]Int to datatype functions. Update regression tests to include other affected datatypes. Apply formatting. Refactor according to PR review Remove redundant new constant, switch to using already existing constant. Adapt nullstring casting to EMPTYROW markers for backwards compatibility. Adapt tests for backward compatibility behaviour allowing text datatypes to be casted to EMPTYROW constant. Adapt mcol641-functions test according to bug fix. Update tests according to new expected behaviour. Adapt tests to new understanding of issue. Update comments/documentation for MCOL_4632 test. Adapt to new cast limit logic. Make bracketing consistent. Adapt previous regression test to new expected behaviour.	2023-08-11 13:00:30 +00:00
Gagan Goel	c598a9bbed	MCOL-5480 LOAD DATA INFILE incorrectly loads values for MEDIUMINT datatype. Internal memory representation of MEDIUMINT datatype uses 24 bits. This is true for both MariaDB server as well as ColumnStore. MCS plugin code uses TypeHandlerSInt24 and TypeHandlerUInt24 classes to respectively convert the binary representation of the signed and unsigned MEDIUMINT values passed by the server to the plugin. The plugin then outputs the text representation of these values into an open file descriptor which is piped to cpimport for the final load into the MCS db files. The TypeHandlerXInt24 classes were earlier incorrectly using WriteBatchField::ColWriteBatchXInt32() functions which operate on a 4 byte buffer. This resulted in incorrect parsing of MEDIUMINT values. As a fix, we implement WriteBatchField::ColWriteBatchXInt24() functions which correctly handle the 24 bit input buffer used for MEDIUMINT datatype.	2023-05-23 16:00:05 -04:00
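The MEDIUMINT fix in c598a9bbed comes down to reading exactly three bytes and sign-extending from bit 23, instead of treating the buffer as a 4-byte integer. A minimal sketch of that idea follows. The helper names `readSInt24` and `readUInt24` are hypothetical and are not the `WriteBatchField::ColWriteBatchXInt24()` functions from the commit; the little-endian byte order is an assumption about the buffer layout.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; not ColumnStore's API.
// Assumption: the 3-byte buffer is little-endian (lowest byte first).

// Read an unsigned 24-bit value from exactly three bytes.
inline uint32_t readUInt24(const unsigned char* buf)
{
  return static_cast<uint32_t>(buf[0]) | (static_cast<uint32_t>(buf[1]) << 8) |
         (static_cast<uint32_t>(buf[2]) << 16);
}

// Read a signed 24-bit value and sign-extend it to 32 bits.
inline int32_t readSInt24(const unsigned char* buf)
{
  uint32_t raw = readUInt24(buf);
  if (raw & 0x800000u)  // bit 23 is the sign bit of a 24-bit integer
    raw |= 0xFF000000u;
  return static_cast<int32_t>(raw);
}

// Small self-check: a 4-byte read here would also consume the byte after the buffer.
inline void checkInt24()
{
  const unsigned char minusOne[3] = {0xFF, 0xFF, 0xFF};
  assert(readSInt24(minusOne) == -1);
  assert(readUInt24(minusOne) == 16777215u);
}
```

Reading the same buffer with a 32-bit accessor would pull in a fourth, unrelated byte, which is the kind of misparse the commit message describes.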
Leonid Fedorov	2e1394149b	MCOL-5464: Fixes of bugs from ASAN warnings, part one (#2792 ) * Fixes of bugs from ASAN warnings, part one * MQC as static library, with nifty counter for global map and mutex * Switch clang to 16 * link messageqcpp to execplan	2023-04-04 02:33:23 +03:00
Sergey Zefirov	b53c231ca6	MCOL-271 empty strings should not be NULLs (#2794 ) This patch improves handling of NULLs in textual fields in ColumnStore. Previously empty strings were considered NULLs and it could be a problem if data scheme allows for empty strings. It was also one of major reasons of behavior difference between ColumnStore and other engines in MariaDB family. Also, this patch fixes some other bugs and incorrect behavior, for example, incorrect comparison for "column <= ''" which evaluates to constant True for all purposes before this patch.	2023-03-30 21:18:29 +03:00
Roman Nozdrin	786b9da5b0	MCOL-5438 COUNT() in math causes SEGV	2023-03-09 20:35:38 +00:00
Leonid Fedorov	56f2346083	Remove windows ifdefs	2023-03-02 15:59:42 +00:00
david.hall	9b84bf57c9	From serg: add dependency for generated header files errorids.h messageids.h	2022-11-17 11:46:10 -06:00
Roman Nozdrin	a0086bc561	Adding NULL flag into ConstString class	2022-10-21 18:13:18 +00:00
Jigao Luo	6c4af1461f	[MCOL-5205] Fix bug from union type in UNION processing. This patch fixes the reported JIRA issue MCOL 5205, which consists of a wrong union type from two input Int types. The bug results in wrong unioned answers in CS. The fix includes more INT case discussions. Additionally, this patch provides detailed unit tests for correctness in UNION processing with Int. Signed-off-by: Jigao Luo <luojigao@outlook.com>	2022-09-09 22:54:35 +02:00
Leonid Fedorov	d2432f9bf6	get rid of pointers for 128 fields	2022-08-26 15:12:22 +00:00
mariadb-AndreyPiskunov	0863ecd279	Replace getBinaryField	2022-08-25 18:21:43 +03:00
Sergei Golubchik	dee50318ad	Serg dev (#2502 ) * build boost during build phase, not during configure * add dependency for generated header files errorids.h messageids.h see `7e868bc588` * set explicit dependencies on external_boost for #include's * clang-14 compatibility fix	2022-08-13 07:18:30 +03:00
David.Hall	2020f35e88	Mcol 5092 MODA uses wrong column width for some types (#2450 ) * MCOL-5092 Ensure column width is correct for datatype Change MODA return type to STRING Modify MODA to handle every numeric type * MCOL-5162 MODA to support char and varchar with collation support Fixes to the aggregate bit functions When we fixed the storage sign issue for MCOL-5092, it uncovered a problem in the bit aggregates (bit_and, bit_or and bit_xor). These aggregates should always return UBIGINT, but they relied on the type of the argument column, which gave bad results.	2022-08-11 15:16:11 -05:00
Leonid Fedorov	fbd043b036	Fixing alignment for clang tests of rowgroup	2022-03-23 14:29:19 +00:00
Serguey Zefirov	53b9a2a0f9	MCOL-4580 extent elimination for dictionary-based text/varchar types The idea is relatively simple - encode prefixes of collated strings as integers and use them to compute extents' ranges. Then we can eliminate extents with strings. The actual patch does have all the code there but misses one important step: we do not keep collation index, we keep charset index. Because of this, some of the tests in the bugfix suite fail and thus main functionality is turned off. The reason this patch is put into a PR at all is that it contains changes that made CHAR/VARCHAR columns unsigned. This change is needed in vectorization work.	2022-03-02 23:53:39 +03:00
Leonid Fedorov	3919c541ac	New warnfixes (#2254) * Fix clang warnings * Remove vim tab guides * initialize variables * 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length * Fix ISO C++17 does not allow 'register' storage class specifier for outdated bison * chars are unsigned on ARM, having if (ival < 0) always false * chars are unsigned by default on ARM and comparison with -1 is always true	2022-02-17 13:08:58 +03:00
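The last two items in 3919c541ac concern the signedness of plain `char`, which is implementation-defined: it is typically signed on x86 and unsigned on ARM Linux. A minimal sketch of a portable check follows; `isNegativeByte` is a hypothetical name and not a function from the repository.

```cpp
#include <cassert>

// Hypothetical helper for illustration only.
// With a plain `char`, `c < 0` is always false where char is unsigned,
// so the test is made explicit by converting to `signed char` first.
inline bool isNegativeByte(char c)
{
  return static_cast<signed char>(c) < 0;
}

inline void checkNegativeByte()
{
  assert(isNegativeByte(static_cast<char>(0x80)));  // high bit set
  assert(!isNegativeByte('A'));                     // plain ASCII
}
```

The explicit `signed char` conversion gives the same answer on either kind of platform, which a bare `ival < 0` on a `char` does not.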
Gagan Goel	973e5024d8	MCOL-4957 Fix performance slowdown for processing TIMESTAMP columns. Part 1: As part of MCOL-3776 to address synchronization issue while accessing the fTimeZone member of the Func class, mutex locks were added to the accessor and mutator methods. However, this slows down processing of TIMESTAMP columns in PrimProc significantly as all threads across all concurrently running queries would serialize on the mutex. This is because PrimProc only has a single global object for the functor class (class derived from Func in utils/funcexp/functor.h) for a given function name. To fix this problem: (1) We remove the fTimeZone as a member of the Func derived classes (hence removing the mutexes) and instead use the fOperationType member of the FunctionColumn class to propagate the timezone values down to the individual functor processing functions such as FunctionColumn::getStrVal(), FunctionColumn::getIntVal(), etc. (2) To achieve (1), a timezone member is added to the execplan::CalpontSystemCatalog::ColType class. Part 2: Several functors in the Funcexp code call dataconvert::gmtSecToMySQLTime() and dataconvert::mySQLTimeToGmtSec() functions for conversion between seconds since unix epoch and broken-down representation. These functions in turn call the C library function localtime_r() which currently has a known bug of holding a global lock via a call to __tz_convert. This significantly reduces performance in multi-threaded applications where multiple threads concurrently call localtime_r(). More details on the bug: https://sourceware.org/bugzilla/show_bug.cgi?id=16145 This bug in localtime_r() caused processing of the Functors in PrimProc to slowdown significantly since a query execution causes Functors code to be processed in a multi-threaded manner. 
As a fix, we remove the calls to localtime_r() from gmtSecToMySQLTime() and mySQLTimeToGmtSec() by performing the timezone-to-offset conversion (done in dataconvert::timeZoneToOffset()) during the execution plan creation in the plugin. Note that localtime_r() is only called when the time_zone system variable is set to "SYSTEM". This fix also required changing the timezone type from a std::string to a long across the system.	2022-02-14 14:12:27 -05:00
Leonid Fedorov	04752ec546	clang format apply	2022-01-21 16:43:49 +00:00
Leonid Fedorov	01f3ceb437	replace header guards with #pragma once	2022-01-21 15:24:58 +00:00
Roman Nozdrin	67c85dae15	MCOL-4809 The patch replaces legacy scanning/filtering code with a number of templates that simplifies control flow removing needless expressions	2021-09-06 17:04:52 +00:00
Leonid Fedorov	5c5f103f98	MCOL-4839: Fix clang build (#2100 ) * Fix clang build * Extern C returned to plugin_instance Co-authored-by: Leonid Fedorov <l.fedorov@mail.corp.ru>	2021-08-23 10:45:10 -05:00
Roman Nozdrin	a292585b8c	MCOL-4815 ColumnCommand was replaced with a set of derived classes specified by column width RTSCommand was modified to use a factory that produces CC class based on column width NB this patch doesn't affect PseudoCC that also leverages ColumnCommand	2021-07-21 12:54:14 +00:00
Leonid Fedorov	f81f743282	Replace underlying type for avg and sum for int types from long double to wide decimal	2021-07-08 17:04:43 +00:00
Alexander Barkov	9794f24369	MCOL-4801 Replace Row methods getStringLength() and getStringPointer() to getConstString()	2021-07-06 21:15:32 +04:00
Gagan Goel	8520f87237	MCOL-641 Cleanup.	2021-07-06 09:01:49 +00:00
Roman Nozdrin	2de4888899	Merge pull request #1990 from drrtuy/MCOL-4173_9 MCOL-4173 This patch adds support for wide-DECIMAL INNER, OUTER, SEMI…	2021-06-24 16:15:07 +03:00
Roman Nozdrin	bed0b7c6bc	MCOL-4173 This patch adds support for wide-DECIMAL INNER, OUTER, SEMI, functional JOINs based on top of TypelessData	2021-06-24 08:07:23 +00:00
Gagan Goel	7c8b502dc2	Fix regression in a query involving an aggregate function on a non-wide decimal column in the HAVING clause. In buildAggregateColumn(), if an aggregate function (such as avg) is applied on a non-wide decimal column, we were setting the precision of the resulting column as -1. This later down in the execution got converted to 255 as in some cases, precision is stored as uint8_t. The predicate operations on a DECIMAL column has logic that uses the wide Decimal::s128value field if precision > 18. This logic incorrectly used the Decimal::s128value instead of the correct value stored in the narrow Decimal::value field, since precision of the Decimal column was 255. The fix is to set the aggregate column precision to datatypes::INT64MAXPRECISION (18) in buildAggregateColumn() when the aggregate is applied on a non-wide decimal column. This commit also partially fixes -Wstrict-aliasing GCC warnings.	2021-06-22 11:11:34 +00:00
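The regression described in 7c8b502dc2 is a narrowing problem: a precision of -1 stored in a `uint8_t` becomes 255, which then passes the `precision > 18` test for the wide-decimal path. A minimal sketch under assumed names follows; `kInt64MaxPrecision` stands in for `datatypes::INT64MAXPRECISION` (18, per the commit message), and `storePrecision` and `usesWideField` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for datatypes::INT64MAXPRECISION, which the commit message gives as 18.
constexpr int kInt64MaxPrecision = 18;

// Hypothetical: models a code path that keeps precision in a uint8_t.
inline uint8_t storePrecision(int precision)
{
  return static_cast<uint8_t>(precision);  // -1 wraps to 255
}

// Hypothetical: models the "use the 128-bit field if precision > 18" decision.
inline bool usesWideField(uint8_t precision)
{
  return precision > kInt64MaxPrecision;
}

inline void checkPrecision()
{
  assert(storePrecision(-1) == 255);
  assert(usesWideField(storePrecision(-1)));                   // the bug
  assert(!usesWideField(storePrecision(kInt64MaxPrecision)));  // the fix
}
```

Setting the aggregate precision to 18 rather than -1 keeps the narrow path selected, which matches the fix the message describes.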
Roman Nozdrin	42e710f817	Merge pull request #1942 from mariadb-corporation/bar-develop-compile-10.6 Fixing 10.6 + develop compilation failure	2021-05-25 14:31:37 +03:00
Alexander Barkov	9608533d92	MCOL-4734 Compilation failure: MariaDB-10.6 + ColumnStore-develop mcsconfig.h and my_config.h have the following pre-processor definitions: 1. Conflicting definitions coming from the standard cmake definitions: - PACKAGE - PACKAGE_BUGREPORT - PACKAGE_NAME - PACKAGE_STRING - PACKAGE_TARNAME - PACKAGE_VERSION - VERSION 2. Conflicting definitions of other kinds: - HAVE_STRTOLL - this is dirt in MariaDB headers. Should be fixed in the server code. my_config.h erroneously performs "#define HAVE_STRTOLL" instead of "#define HAVE_STRTOLL 1" in some cases. The former is not CMake compatible style. The latter is. 3. Non-conflicting definitions: Otherwise, mcsconfig.h and my_config.h should be mutually compatible, because both are generated by cmake on the same host machine. So they should have exactly equal definitions like "HAVE_XXX", "SIZEOF_XXX", etc. Observations: - It's OK to include both mcsconfig.h and my_config.h providing that we suppress duplicate definition of the above conflicting types #1 and #2. - There is no need to suppress duplicate definitions mentioned in #3, as they are compatible! - my_sys.h and m_ctype.h must always follow a CMake configuration header, either my_config.h or mcsconfig.h (or both). They must never be included without any preceding configuration header. This change makes sure that we resolve conflicts by: - either disallowing inclusion of mcsconfig.h and my_config.h at the same time - or by hiding conflicting definitions #1 and #2 (with their later restoring). - also, by making sure that my_sys.h and m_ctype.h always follow a CMake configuration file. Details: - idb_mysql.h can now be included only after my_config.h. An attempt to use idb_mysql.h with mcsconfig.h instead of my_config.h is caught by the "#error" preprocessor directive. - mariadb_my_sys.h can now be only included after mcsconfig.h. An attempt to use mariadb_my_sys.h without mcsconfig.h (e.g. with my_config.h) is also caught by "#error". 
- collation.h can now be included in two ways. It now has the following effective structure: #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H) // Remember current conflicting definitions on the preprocessor stack // Undefine current conflicting definitions #endif #include "mcsconfig.h" #include "m_ctype.h" #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H) // Restore conflicting definitions from the preprocessor stack #endif and can be included as follows: a. using only mcsconfig.h as a configuration header: // my_config.h must not be included so far #include "collation.h" b. using my_config.h as the first included configuration file: #define PREFER_MY_CONFIG_H // Force conflict resolution #include "my_config.h" // can be included directly or indirectly ... #include "collation.h" Other changes: - Adding helper header files utils/common/mcsconfig_conflicting_defs_remember.h utils/common/mcsconfig_conflicting_defs_restore.h utils/common/mcsconfig_conflicting_defs_undef.h to perform conflict resolution easier. - Removing `#include "collation.h"` from a number of files, as it's automatically included from rowgroup.h. - Removing redundant `#include "utils_utf8.h"`. This change is not directly related to the problem being fixed, but it's nice to remove redundant directives for both collation.h and utils_utf8.h from all the files that do not really need them. (this change could probably have gone as a separate commit) - Changing my_init() to MY_INIT(argv[0]) in the MCS services sources. After the fix of the compilation failure it appeared that ColumnStore services compiled with the debug build crash due to recent changes in safemalloc. The crash happened in strcmp() with `my_progname` as an argument (where my_progname is a mysys global variable). This problem should probably be fixed on the server side as well to avoid passing NULL. But, the majority of MariaDB executable programs also use MY_INIT(argv[0]) rather than my_init(). 
So let's make MCS do like the other programs do.	2021-05-25 12:34:36 +04:00
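The remember / undefine / restore sequence that 9608533d92 describes for `collation.h` can be sketched with the `push_macro` and `pop_macro` pragmas, which GCC, Clang and MSVC support. This is an assumption about the mechanism: the commit message speaks only of a "preprocessor stack" and does not show the contents of the helper headers. The macro `DEMO_PACKAGE_NAME` and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Pretend this definition came from the first configuration header.
#define DEMO_PACKAGE_NAME "server"

// Remember the current definition, then undefine it.
#pragma push_macro("DEMO_PACKAGE_NAME")
#undef DEMO_PACKAGE_NAME

// Pretend this definition came from the second configuration header.
#define DEMO_PACKAGE_NAME "columnstore"
inline const char* nameSeenInside()
{
  return DEMO_PACKAGE_NAME;
}

// Restore the remembered definition.
#pragma pop_macro("DEMO_PACKAGE_NAME")
inline const char* nameSeenOutside()
{
  return DEMO_PACKAGE_NAME;
}

inline void checkMacroRestore()
{
  assert(std::strcmp(nameSeenInside(), "columnstore") == 0);
  assert(std::strcmp(nameSeenOutside(), "server") == 0);
}
```

Code between the push and the pop sees the second definition, and code after the pop sees the first one again, with no redefinition warning in either place.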
Alexander Barkov	284fc51bb7	MCOL-4726 Wrong result of WHERE char1_col='A'	2021-05-21 14:40:16 +04:00
Alexander Barkov	bd4cbb542d	MCOL-4721 CHAR(1) is not collation-aware for GROUP/DISTINCT	2021-05-18 16:14:53 +04:00
Alexander Barkov	362bfcd15e	MCOL-4361 Replace pow(10.0, (double)scale) expressions with a static dictionary lookup.	2021-04-09 12:41:04 +04:00
Roman Nozdrin	c1138c4793	Merge pull request #1771 from mariadb-SergeyZefirov/MCOL-2044-update-ranges-during-DML Mcol 2044 update ranges during dml	2021-04-06 12:46:02 +03:00
Gagan Goel	f33b860580	Merge pull request #1841 from mariadb-corporation/bar-develop-MCOL-4614 A joint patch for MCOL-4614 and MCOL-4615 (decimal to string conversion)	2021-04-06 02:13:38 -04:00
Alexander Barkov	69911c2710	A joint patch for MCOL-4614, MCOL-4615, MCOL-4660 (decimal to string conversion) This patch fixes: - MCOL-4614 calShowPartitions() precision loss for huge narrow decimal - MCOL-4615 GROUP_CONCAT() precision loss for huge narrow decimal - MCOL-4660 Narrow decimal to string conversion is inconsistent about zero integral Changes: - Implementing Row::getDecimalField() - Removing double arithmetic from the code printing DECIMAL values in TypeHandlerXDecimal::format64() and GroupConcator::outputRow(). Using Decimal::toString() instead. - Rewriting Decimal::toStringTSInt64(). The old implementation was wrong, too complex and slow (used unnecessary memmove, memcpy). An additional cleanup: - Removing the ENGINE=COLUMNSTORE clause from tests for MCOL-4532 and MCOL-4640 type_decimal.test is combinations-aware. It's run two times with default_storage_engine=MyISAM and default_storage_engine=COLUMNSTORE. So the CREATE TABLE statements should not specify the engine explicitly. - Adding --disable_warnings in the old fixed test. We needed to suppress warnings when the MyISAM combination is being run. Previously the table was erroneously created with ENGINE=COLUMNSTORE even with the MyISAM combination run. So warnings were not generated.	2021-04-05 16:36:19 +04:00

112 Commits