mariadb-columnstore-engine

mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-07-30 19:23:07 +03:00

Author	SHA1	Message	Date
Leonid Fedorov	0d1c72a563	compilation fix for gcc12 on known gcc bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105329	2024-01-04 11:43:03 +03:00
Serguey Zefirov	1f958c9ed2	MCOL-5625: Fixes json_query implementation Also extends func_json_value.test.	2023-12-12 15:45:03 +03:00
Sergey Zefirov	5c9770d1e6	fix(funcexp): MCOL-5607: JSON function use crashes query execution (#3028 ) JSON functions were implemented violating an assumption of their pureness, as they should not have any state. This concrete patch fixes implementation of JSON_VALUE function.	2023-11-21 23:46:03 +03:00
Theresa Hradilak	48562e41f9	feat(datatypes): MCOL-4632 and MCOL-4648, fix cast leads to NULL. Remove redundant cast. As C-style casts with a type name in parentheses are interpreted as static_casts this literally just changes the interpretation around (and forces an implicit cast to match the return value of the function). Switch UBIGINTNULL and UBIGINTEMPTYROW constants for consistency. Make consistent with relation between BIGINTNULL and BIGINTEMPTYROW & make adapted cast behaviour due to NULL markers more intuitive. (After this change we can simply block the highest possible uint64_t value and if a cast results in it, print the next lower value (2^64 - 2). Previously, (2^64 - 1) was able to be printed, but (2^64 - 2) as being blocked by the UBIGINTNULL constant was not, making finding the appropriate replacement value to give out more confusing.) Introduce MAX_MCS_UBIGINT and MIN_MCS_BIGINT and adapt casts. Adapt casting to BIGINT to remove NULL marker error. Add bugfix regression test for MCOL 4632 Add regression test for mcol_4648 Revert "Switch UBIGINTNULL and UBIGINTEMPTYROW constants for consistency." This reverts commit 83eac11b18937ecb0b4c754dd48e4cb47310f620. Due to backwards compatibility issues. Refactor casting to MCS[U]Int to datatype functions. Update regression tests to include other affected datatypes. Apply formatting. Refactor according to PR review Remove redundant new constant, switch to using already existing constant. Adapt nullstring casting to EMPTYROW markers for backwards compatibility. Adapt tests for backward compatibility behaviour allowing text datatypes to be casted to EMPTYROW constant. Adapt mcol641-functions test according to bug fix. Update tests according to new expected behaviour. Adapt tests to new understanding of issue. Update comments/documentation for MCOL_4632 test. Adapt to new cast limit logic. Make bracketing consistent. Adapt previous regression test to new expected behaviour.	2023-08-11 13:00:30 +00:00
Leonid Fedorov	65cde8c894	feature: pron (#2908 ) * feature: Special dictionary, we can pass with session variable to modify codepaths and behaviour for testing and debugging	2023-07-21 14:02:03 +03:00
Sergei Golubchik	ebfb9face2	compiler failures with gcc 12.x a workaround for something that looks like a bug in a compiler. Fixes errors like In file included from /usr/include/c++/12/string:40, from /mnt/server/storage/columnstore/columnstore/utils/funcexp/func_math.cpp:26: In static member function ‘static constexpr std::char_traits<char>::char_type* std::char_traits<char>::copy(char_type, const char_type, std::size_t)’, inlined from ‘static constexpr void std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::_S_copy(_CharT, const _CharT, size_type) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]’ at /usr/include/c++/12/bits/basic_string.h:423:21, inlined from ‘constexpr std::__cxx11::basic_string<_CharT, _Traits, _Allocator>& std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::_M_replace(size_type, size_type, const _CharT, size_type) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]’ at /usr/include/c++/12/bits/basic_string.tcc:532:22, inlined from ‘constexpr std::__cxx11::basic_string<_CharT, _Traits, _Alloc>& std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::replace(size_type, size_type, const _CharT, size_type) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]’ at /usr/include/c++/12/bits/basic_string.h:2171:19, inlined from ‘constexpr std::__cxx11::basic_string<_CharT, _Traits, _Alloc>& std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::insert(size_type, const _CharT) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]’ at /usr/include/c++/12/bits/basic_string.h:1928:22, inlined from ‘virtual std::string funcexp::Func_format::getStrVal(rowgroup::Row&, funcexp::FunctionParm&, bool&, execplan::CalpontSystemCatalog::ColType&)’ at /mnt/server/storage/columnstore/columnstore/utils/funcexp/func_math.cpp:2008:17: /usr/include/c++/12/bits/char_traits.h:431:56: error: ‘void __builtin_memcpy(void, const void, long 
unsigned int)’ accessing 9223372036854775810 or more bytes at offsets 3 and [2, 2147483645] may overlap up to 9223372036854775813 bytes at offset -3 [-Werror=restrict] 431 \| return static_cast<char_type*>(__builtin_memcpy(__s1, __s2, __n ); $ gcc --version gcc (Ubuntu 12.2.0-3ubuntu1) 12.2.0	2023-07-04 12:58:18 -04:00
Leonid Fedorov	77eedd1756	MCOL-5503: Fix broken quarter (#2855 ) * Fix broken quarter function	2023-06-02 18:06:58 +03:00
Leonid Fedorov	8f93fc3623	MCOL-5493: First portion of UBSan fixes (#2842 ) Multiple UB fixes	2023-06-02 17:02:09 +03:00
Sergey Zefirov	0a2e9760ee	Fix for JSON_VALUE function to remove OOB stack access (#2852 ) MCOL-271 introduced a bug in JSON_VALUE that was discovered during implementation of ASAN builds. The changes here restore normal functionality. In short, changes in MCOL-271 introduced a local variable instead of reference to a string in ConstantColumn's fResult.strVal. The handling of ConstantColumn is different because ConstantColumn's value is used to initialize JSON path once. JSON path value holds pointer to data it does not own and if there are two or more rows the data can be corrupted and/or be out of stack bounds. The changes here introduce reference to a NullString that is held in the ConstantColumn's fResult.strVal and uses appropriate functions to obtain data from the NullString. CC's fResult is held by CC and strVal is also neither changing nor moving during operation, which allow JSON path to hold correct pointers during multi-row operation.	2023-05-31 15:30:40 +03:00
Leonid Fedorov	f18c556311	Fix gcc-13 warning and add support for building Fedora (#2845 )	2023-05-26 16:30:53 +03:00
Roman Nozdrin	e6e74c0be7	MCOL-5437 Fixes to follow the charset_info api change introduced by MDEV-30661	2023-05-08 18:57:36 +00:00
Roman Nozdrin	4fe9cd64a3	Revert "No boost condition (#2822 )" (#2828 ) This reverts commit `f916e64927`.	2023-04-22 15:49:50 +03:00
Leonid Fedorov	f916e64927	No boost condition (#2822 ) This patch replaces boost primitives with stdlib counterparts.	2023-04-22 00:42:45 +03:00
Mu He	7e2f83e39d	Merge branch 'mariadb-corporation:develop' into develop	2023-04-05 18:22:52 +02:00
Leonid Fedorov	2e1394149b	MCOL-5464: Fixes of bugs from ASAN warnings, part one (#2792 ) * Fixes of bugs from ASAN warnings, part one * MQC as static library, with nifty counter for global map and mutex * Switch clang to 16 * link messageqcpp to execplan	2023-04-04 02:33:23 +03:00
MuHe03	d906974abc	MCOL-4991 Solving TRUNCATE/ROUND/CEILING functions on TIME/DATETIME/TIMESTAMP Add getDecimalVal in func_round and func_truncate for getting value while filtering MCOL-4991 Solving TRUNCATE/ROUND/CEILING functions on TIME/DATETIME/TIMESTAMP Update func_cast.cpp	2023-03-31 18:39:16 +02:00
Sergey Zefirov	b53c231ca6	MCOL-271 empty strings should not be NULLs (#2794 ) This patch improves handling of NULLs in textual fields in ColumnStore. Previously empty strings were considered NULLs and it could be a problem if data scheme allows for empty strings. It was also one of major reasons of behavior difference between ColumnStore and other engines in MariaDB family. Also, this patch fixes some other bugs and incorrect behavior, for example, incorrect comparison for "column <= ''" which evaluates to constant True for all purposes before this patch.	2023-03-30 21:18:29 +03:00
Roman Nozdrin	786b9da5b0	MCOL-5438 COUNT() in math causes SEGV	2023-03-09 20:35:38 +00:00
Leonid Fedorov	56f2346083	Remove windows ifdefs	2023-03-02 15:59:42 +00:00
David.Hall	ef0a21267e	MCOL-5248 Change func_truncate() to use double for string, rather than attempt to translate to decimal. Currently, the treenode.h conversion functions don't support string to decimal conversion. (#2598 ) This new functionality brings us into alignment with MDB 10.6	2022-11-30 19:58:25 +03:00
mariadb-AndreyPiskunov	c3426dbd69	Explicit cast of INT_MIN/MAX to double	2022-09-30 18:26:37 +03:00
mariadb-AndreyPiskunov	07a7130e2a	Explicit cast to long	2022-09-30 18:26:37 +03:00
mariadb-AndreyPiskunov	5b166a9577	Fix truncate and round for char types	2022-09-30 18:26:37 +03:00
Roman Nozdrin	09223cc2ce	Merge pull request #2425 from Ziy1-Tan/MCOL-785-ziyi MCOL-785 Implement DISTRIBUTED JSON functions	2022-08-31 22:56:43 +03:00
Leonid Fedorov	05d3ac82d9	Querytester (#2539 ) * Build querytester adhoc on Drone * Negative to unsigned cast is 0 on ARM	2022-08-30 17:25:26 +03:00
Ziy1-Tan	cdd41f05f3	MCOL-785 Implement DISTRIBUTED JSON functions The following functions are created: Create function JSON_VALID and test cases Create function JSON_DEPTH and test cases Create function JSON_LENGTH and test cases Create function JSON_EQUALS and test cases Create function JSON_NORMALIZE and test cases Create function JSON_TYPE and test cases Create function JSON_OBJECT and test cases Create function JSON_ARRAY and test cases Create function JSON_KEYS and test cases Create function JSON_EXISTS and test cases Create function JSON_QUOTE/JSON_UNQUOTE and test cases Create function JSON_COMPACT/DETAILED/LOOSE and test cases Create function JSON_MERGE and test cases Create function JSON_MERGE_PATCH and test cases Create function JSON_VALUE and test cases Create function JSON_QUERY and test cases Create function JSON_CONTAINS and test cases Create function JSON_ARRAY_APPEND and test cases Create function JSON_ARRAY_INSERT and test cases Create function JSON_INSERT/REPLACE/SET and test cases Create function JSON_REMOVE and test cases Create function JSON_CONTAINS_PATH and test cases Create function JSON_OVERLAPS and test cases Create function JSON_EXTRACT and test cases Create function JSON_SEARCH and test cases Note: Some functions output differs from MDB because session variables that affects functions output,e.g JSON_QUOTE/JSON_UNQUOTE This depends on MCOL-5212	2022-08-30 22:22:23 +08:00
Sergey Zefirov	50d95bf60a	MCOL-5196 REPLACE function may trigger invalid capacity assertion (#2522 ) When length of string to replace minus length of string to replace to is bigger than input string and processing mode allows for binary (memcmp or std::string::find()) comparison, REPLACE may trigger invalid capacity assertion and query processing will stop. The fix is to properly count the number of occurrences of the string to replace, basically.	2022-08-22 21:34:38 +03:00
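The counting approach described in MCOL-5196 above can be sketched as follows. This is a minimal illustration, not the ColumnStore code: `replaceAll` is a hypothetical helper name, and it only covers the binary (`std::string::find()`) comparison mode. It counts non-overlapping occurrences first, so the reserved capacity is computed from the real number of replacements and can never underflow.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not the ColumnStore implementation): binary REPLACE
// that sizes its output from an exact occurrence count.
inline std::string replaceAll(const std::string& src, const std::string& from, const std::string& to)
{
  if (from.empty())
    return src;  // nothing sensible to search for

  // First pass: count non-overlapping occurrences of `from`.
  std::size_t count = 0;
  for (std::size_t pos = src.find(from); pos != std::string::npos; pos = src.find(from, pos + from.size()))
    ++count;
  if (count == 0)
    return src;

  // count * from.size() <= src.size() holds because occurrences do not
  // overlap, so this subtraction cannot wrap around.
  std::string out;
  out.reserve(src.size() - count * from.size() + count * to.size());

  // Second pass: copy the text between matches and splice in `to`.
  std::size_t start = 0;
  for (std::size_t pos = src.find(from); pos != std::string::npos; pos = src.find(from, start))
  {
    out.append(src, start, pos - start);
    out.append(to);
    start = pos + from.size();
  }
  out.append(src, start, std::string::npos);
  return out;
}
```

The two-pass design trades one extra scan for a single allocation of exactly the right size.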
Leonid Fedorov	65252df4f6	C++20 fixes	2022-03-28 12:32:29 +00:00
Serguey Zefirov	53b9a2a0f9	MCOL-4580 extent elimination for dictionary-based text/varchar types The idea is relatively simple - encode prefixes of collated strings as integers and use them to compute extents' ranges. Then we can eliminate extents with strings. The actual patch does have all the code there but misses one important step: we do not keep collation index, we keep charset index. Because of this, some of the tests in the bugfix suite fail and thus main functionality is turned off. The reason this patch is put into a PR at all is that it contains changes that made CHAR/VARCHAR columns unsigned. This change is needed in vectorization work.	2022-03-02 23:53:39 +03:00
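The prefix-encoding idea in MCOL-4580 above can be illustrated with a small sketch. It assumes plain byte-wise (binary) ordering rather than a real collation, and the names `encodePrefix`, `ExtentRange`, and `mayContain` are hypothetical, not taken from the patch. The first eight bytes of a string are packed big-endian into an unsigned integer, so integer order agrees with byte-wise string order and an extent's min/max can rule out an equality predicate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical sketch: pack the first 8 bytes big-endian, zero-padded.
// Bytes are read as unsigned, which is why unsigned CHAR/VARCHAR matters here.
inline uint64_t encodePrefix(const std::string& s)
{
  uint64_t v = 0;
  for (std::size_t i = 0; i < 8; ++i)
  {
    v <<= 8;
    if (i < s.size())
      v |= static_cast<unsigned char>(s[i]);
  }
  return v;
}

// Hypothetical per-extent range of encoded prefixes.
struct ExtentRange
{
  uint64_t min;
  uint64_t max;
};

// An extent can be skipped for `col = value` when the encoded value lies
// outside the extent's range. The encoding is only weakly monotonic, so a
// `true` result means "may contain", never "does contain".
inline bool mayContain(const ExtentRange& range, const std::string& value)
{
  const uint64_t enc = encodePrefix(value);
  return enc >= range.min && enc <= range.max;
}
```

A collation-aware version would have to encode collation sort keys instead of raw bytes, which is the step the commit message says is still missing.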
Leonid Fedorov	5a577553f7	Hex double conversion	2022-02-24 10:07:22 +03:00
Leonid Fedorov	3919c541ac	New warnfixes (#2254 ) * Fix clang warnings * Remove vim tab guides * initialize variables * 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length * Fix ISO C++17 does not allow 'register' storage class specifier for outdated bison * chars are unsigned on ARM, making if (ival < 0) always false * chars are unsigned by default on ARM and comparison with -1 is always true	2022-02-17 13:08:58 +03:00
Gagan Goel	973e5024d8	MCOL-4957 Fix performance slowdown for processing TIMESTAMP columns. Part 1: As part of MCOL-3776 to address synchronization issue while accessing the fTimeZone member of the Func class, mutex locks were added to the accessor and mutator methods. However, this slows down processing of TIMESTAMP columns in PrimProc significantly as all threads across all concurrently running queries would serialize on the mutex. This is because PrimProc only has a single global object for the functor class (class derived from Func in utils/funcexp/functor.h) for a given function name. To fix this problem: (1) We remove the fTimeZone as a member of the Func derived classes (hence removing the mutexes) and instead use the fOperationType member of the FunctionColumn class to propagate the timezone values down to the individual functor processing functions such as FunctionColumn::getStrVal(), FunctionColumn::getIntVal(), etc. (2) To achieve (1), a timezone member is added to the execplan::CalpontSystemCatalog::ColType class. Part 2: Several functors in the Funcexp code call dataconvert::gmtSecToMySQLTime() and dataconvert::mySQLTimeToGmtSec() functions for conversion between seconds since unix epoch and broken-down representation. These functions in turn call the C library function localtime_r() which currently has a known bug of holding a global lock via a call to __tz_convert. This significantly reduces performance in multi-threaded applications where multiple threads concurrently call localtime_r(). More details on the bug: https://sourceware.org/bugzilla/show_bug.cgi?id=16145 This bug in localtime_r() caused processing of the Functors in PrimProc to slowdown significantly since a query execution causes Functors code to be processed in a multi-threaded manner. 
As a fix, we remove the calls to localtime_r() from gmtSecToMySQLTime() and mySQLTimeToGmtSec() by performing the timezone-to-offset conversion (done in dataconvert::timeZoneToOffset()) during the execution plan creation in the plugin. Note that localtime_r() is only called when the time_zone system variable is set to "SYSTEM". This fix also required changing the timezone type from a std::string to a long across the system.	2022-02-14 14:12:27 -05:00
benthompson15	36775168d3	MCOL-4940: getLongDoubleVal was not handling all colDataType correctly (#2229 )	2022-01-31 13:46:13 -06:00
Leonid Fedorov	04752ec546	clang format apply	2022-01-21 16:43:49 +00:00
Leonid Fedorov	01f3ceb437	replace header guards with #pragma once	2022-01-21 15:24:58 +00:00
Marko Mäkelä	15da99477e	MDEV-27519 CRC32() upon Columnstore table returns a wrong value Func_crc32::getIntVal(): Support the 2-ary CRC32() variant (MDEV-27208). Also, do not assume that the string contains no NUL bytes.	2022-01-21 09:35:19 +00:00
Roman Nozdrin	0324cedc88	This patch puts octet2hex symbol into a correct namespace	2021-10-25 18:08:18 +00:00
Leonid Fedorov	5c5f103f98	MCOL-4839: Fix clang build (#2100 ) * Fix clang build * Extern C returned to plugin_instance Co-authored-by: Leonid Fedorov <l.fedorov@mail.corp.ru>	2021-08-23 10:45:10 -05:00
benthompson15	923bbf4033	MCOL-1356: Add convert_tz (#2099 )	2021-08-19 17:47:10 -05:00
Leonid Fedorov	3136e9dbab	We forgot to initialize longdoublenull value (#2091 )	2021-08-18 11:34:35 -05:00
David Hall	4d4dd22105	MCOL-4771 develop fix crash from rand()	2021-08-06 16:09:28 -05:00
Gagan Goel	8520f87237	MCOL-641 Cleanup.	2021-07-06 09:01:49 +00:00
Alexander Barkov	9608533d92	MCOL-4734 Compilation failure: MariaDB-10.6 + ColumnStore-develop mcsconfig.h and my_config.h have the following pre-processor definitions: 1. Conflicting definitions coming from the standard cmake definitions: - PACKAGE - PACKAGE_BUGREPORT - PACKAGE_NAME - PACKAGE_STRING - PACKAGE_TARNAME - PACKAGE_VERSION - VERSION 2. Conflicting definitions of other kinds: - HAVE_STRTOLL - this is a dirt in MariaDB headers. Should be fixed in the server code. my_config.h erroneously performs "#define HAVE_STRTOLL" instead of "#define HAVE_STRTOLL 1" in some cases. The former is not CMake compatible style. The latter is. 3. Non-conflicting definitions: Otherwise, mcsconfig.h and my_config.h should be mutually compatible, because both are generated by cmake on the same host machine. So they should have exactly equal definitions like "HAVE_XXX", "SIZEOF_XXX", etc. Observations: - It's OK to include both mcsconfig.h and my_config.h providing that we suppress duplicate definition of the above conflicting types #1 and #2. - There is no need to suppress duplicate definitions mentioned in #3, as they are compatible! - my_sys.h and m_ctype.h must always follow a CMake configuration header, either my_config.h or mcsconfig.h (or both). They must never be included without any preceding configuration header. This change makes sure that we resolve conflicts by: - either disallowing inclusion of mcsconfig.h and my_config.h at the same time - or by hiding conflicting definitions #1 and #2 (with their later restoring). - also, by making sure that my_sys.h and m_ctype.h always follow a CMake configuration file. Details: - idb_mysql.h can now be included only after my_config.h. An attempt to use idb_mysql.h with mcsconfig.h instead of my_config.h is caught by the "#error" preprocessor directive. - mariadb_my_sys.h can now be only included after mcsconfig.h. An attempt to use mariadb_my_sys.h without mcsconfig.h (e.g. with my_config.h) is also caught by "#error". 
- collation.h can now be included in two ways. It now has the following effective structure: #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H) // Remember current conflicting definitions on the preprocessor stack // Undefine current conflicting definitions #endif #include "mcsconfig.h" #include "m_ctype.h" #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H) # Restore conflicting definitions from the preprocessor stack #endif and can be included as follows: a. using only mcsconfig.h as a configuration header: // my_config.h must not be included so far #include "collation.h" b. using my_config.h as the first included configuration file: #define PREFER_MY_CONFIG_H // Force conflict resolution #include "my_config.h" // can be included directly or indirectly ... #include "collation.h" Other changes: - Adding helper header files utils/common/mcsconfig_conflicting_defs_remember.h utils/common/mcsconfig_conflicting_defs_restore.h utils/common/mcsconfig_conflicting_defs_undef.h to perform conflict resolution easier. - Removing `#include "collation.h"` from a number of files, as it's automatically included from rowgroup.h. - Removing redundant `#include "utils_utf8.h"`. This change is not directly related to the problem being fixed, but it's nice to remove redundant directives for both collation.h and utils_utf8.h from all the files that do not really need them. (this change could probably have gone as a separate commit) - Changing my_init() to MY_INIT(argv[0]) in the MCS services sources. After the fix of the compilation failure it appeared that ColumnStore services compiled with the debug build crash due to recent changes in safemalloc. The crash happened in strcmp() with `my_progname` as an argument (where my_progname is a mysys global variable). This problem should probably be fixed on the server side as well to avoid passing NULL. But, the majority of MariaDB executable programs also use MY_INIT(argv[0]) rather than my_init(). 
So let's make MCS do like the other programs do.	2021-05-25 12:34:36 +04:00
benthompson15	870d672efb	MCOL-4044: Add oracle mode functions.	2021-04-21 16:07:42 -05:00
Alexander Barkov	362bfcd15e	MCOL-4361 Replace pow(10.0, (double)scale) expressions with a static dictionary lookup.	2021-04-09 12:41:04 +04:00
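A static lookup of the kind MCOL-4361 above describes might look like the following sketch. The name `pow10Lookup` and the table bound of 18 are assumptions made for illustration, not details of the actual patch. Powers of ten up to 10^22 are exactly representable as `double`, so the table returns exact values without calling `pow()`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch: table-driven replacement for pow(10.0, (double)scale).
// The supported range 0..18 is an assumption chosen for this example.
inline double pow10Lookup(uint32_t scale)
{
  static constexpr double kPow10[] = {1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,  1e8,  1e9,
                                      1e10, 1e11, 1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18};
  constexpr std::size_t kSize = sizeof(kPow10) / sizeof(kPow10[0]);
  assert(scale < kSize);  // callers must stay inside the table
  return kPow10[scale];
}
```

Besides being cheaper than a libm call, the lookup yields the same result on every platform.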
Alexander Barkov	912cbe641e	Removing func_bitand.cpp It was dead code. It was not even a part of the sources in CMakeLists.txt. The bit AND operator implementation resides in func_bitwise.cpp together with all other bit operators and functions.	2021-04-08 11:34:22 +04:00
Gagan Goel	47b1ea1cf9	Merge pull request #1849 from mariadb-corporation/bar-develop-MCOL-4666 MCOL-4666 Empty set when using BIT OR and BIT AND functions in WHERE	2021-04-08 03:04:54 -04:00
Alexander Barkov	a6a85d157d	MCOL-4666 Empty set when using BIT OR and BIT AND functions in WHERE	2021-04-07 14:37:39 +04:00
Alexander Barkov	9f41f574da	MCOL-4668 PERIOD_DIFF(dec_or_double1,dec_or_double2) is not as in InnoDB	2021-04-06 08:15:18 +04:00
Alexander Barkov	a86f432f35	Fixing DOUBLE-to-[U]INT conversion (MCOL-4649, MCOL-4631, MCOL-4647) Bugs fixed: - MCOL-4649 CAST(double AS UNSIGNED) returns 0 - MCOL-4631 CAST(double AS SIGNED) returns 0 or NULL - MCOL-4647 SEC_TO_TIME(double_or_float) returns a wrong result Problems: - The code in Func_cast_unsigned::getUintVal() and Func_cast_signed::getIntVal() did not properly check the double value to fit inside a uint64_t/int64_t range. So the corner cases: - numeric_limits<uint64_t>::max()-2 for uint64_t - numeric_limits<int64_t>::max() for int64_t produced unexpected results. The problem was in tests like this: if (value > (double) numeric_limits<int64_t>::max()) A correct test would be: if (value >= (double) numeric_limits<int64_t>::max()) - The code in Func_sec_to_time::getStrVal() searched for the decimal dot character, assuming that the next character after the dot was the leftmost fractional digit. This assumption was wrong because huge double numbers use scientific notation. So for example in "2.5e-40" the digit "5" following the dot is NOT the leftmost fractional digit. Also, the code in Func_sec_to_time::getStrVal() was slow because of using unnecessary to-string and from-string data conversion. Also, the code in Func_sec_to_time::getStrVal() evaluated the argument two times: using getStrVal() then using getIntVal(). Solution: - Adding new classes TDouble and TLongDouble. - Adding a few function templates to reuse the code more easily. - Moving the conversion code inside TDouble and TLongDouble methods toMCSSInt64Round() and toMCSUInt64Round(). - Reusing new classes and their methods in func_cast.cc and func_sec_to_time.cc.	2021-04-05 11:30:52 +04:00
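The `>` versus `>=` point in the commit above can be shown with a short sketch. `numeric_limits<int64_t>::max()` is 2^63 - 1, which is not representable as a `double` and converts to exactly 2^63. A value of 2^63 therefore passes a `>` test and reaches the integer cast, where the conversion is undefined behaviour. The helper below is hypothetical (it is not `toMCSSInt64Round()`, and it ignores ColumnStore's reserved NULL and EMPTYROW markers); the NaN result of 0 is likewise an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: round a double to int64_t, saturating at the bounds.
inline int64_t roundToInt64Saturating(double value)
{
  // (double)INT64_MAX is 2^63, one past the largest int64_t, hence `>=`.
  constexpr double kUpper = static_cast<double>(std::numeric_limits<int64_t>::max());
  // (double)INT64_MIN is exactly -2^63, which is representable.
  constexpr double kLower = static_cast<double>(std::numeric_limits<int64_t>::min());

  if (std::isnan(value))
    return 0;  // assumption: map NaN to 0
  if (value >= kUpper)
    return std::numeric_limits<int64_t>::max();
  if (value <= kLower)
    return std::numeric_limits<int64_t>::min();
  // Every remaining value is strictly inside (-2^63, 2^63), so rounding is safe.
  return static_cast<int64_t>(std::llround(value));
}
```

With a `>` comparison the first saturation branch would miss the input 2^63 itself, which is the corner case the commit describes.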


305 Commits