mariadb-columnstore-engine

mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-08-07 03:22:57 +03:00

Author	SHA1	Message	Date
Denis Khalikov	5f07828619	MCOL-5522 Properly process pm join result count. This patch: 1. Properly processes the situation when the pm join result count is exceeded. 2. Adds the session variable `columnstore_max_pm_join_result_count` to control the limit.	2024-01-10 18:16:39 +04:00
Sergei Golubchik	5258ab03cf	compiler failures with gcc 12.x: a workaround for something that looks like a bug in the compiler. Fixes errors like: In file included from /usr/include/c++/12/string:40, from /mnt/server/storage/columnstore/columnstore/utils/funcexp/func_math.cpp:26: In static member function ‘static constexpr std::char_traits<char>::char_type* std::char_traits<char>::copy(char_type*, const char_type*, std::size_t)’, inlined from ‘static constexpr void std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::_S_copy(_CharT*, const _CharT*, size_type) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]’ at /usr/include/c++/12/bits/basic_string.h:423:21, inlined from ‘constexpr std::__cxx11::basic_string<_CharT, _Traits, _Allocator>& std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::_M_replace(size_type, size_type, const _CharT*, size_type) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]’ at /usr/include/c++/12/bits/basic_string.tcc:532:22, inlined from ‘constexpr std::__cxx11::basic_string<_CharT, _Traits, _Alloc>& std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::replace(size_type, size_type, const _CharT*, size_type) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]’ at /usr/include/c++/12/bits/basic_string.h:2171:19, inlined from ‘constexpr std::__cxx11::basic_string<_CharT, _Traits, _Alloc>& std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::insert(size_type, const _CharT*) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]’ at /usr/include/c++/12/bits/basic_string.h:1928:22, inlined from ‘virtual std::string funcexp::Func_format::getStrVal(rowgroup::Row&, funcexp::FunctionParm&, bool&, execplan::CalpontSystemCatalog::ColType&)’ at /mnt/server/storage/columnstore/columnstore/utils/funcexp/func_math.cpp:2008:17: /usr/include/c++/12/bits/char_traits.h:431:56: error: ‘void* __builtin_memcpy(void*, const void*, long unsigned int)’ accessing 9223372036854775810 or more bytes at offsets 3 and [2, 2147483645] may overlap up to 9223372036854775813 bytes at offset -3 [-Werror=restrict] 431 | return static_cast<char_type*>(__builtin_memcpy(__s1, __s2, __n ); $ gcc --version gcc (Ubuntu 12.2.0-3ubuntu1) 12.2.0	2023-06-29 18:39:35 -04:00
Leonid Fedorov	3bd0ef4e43	Replace the std::set contains method with count to support Rocky/RHEL/Alma 8, where std::set in the stock STL does not have a contains method	2023-06-29 18:39:35 -04:00
Leonid Fedorov	030144127e	Remove boost shared array [develop 23.02] (#2812) * remove boost/shared_array include * replace boost::shared_array<T> with std::shared_ptr<T[]>	2023-04-17 20:56:09 +03:00
Roman Nozdrin	7f3d540841	MCOL-5438 COUNT() in math causes SEGV (#2769) Co-authored-by: Roman Nozdrin <rnozdrin@mariadb.com>	2023-03-10 19:32:17 +03:00
Leonid Fedorov	56f2346083	Remove windows ifdefs	2023-03-02 15:59:42 +00:00
Roman Nozdrin	4d4e4ad30d	Merge pull request #2741 from mariadb-corporation/MDEV-25080-CS-dev MDEV-25080 Allow pushdown of queries involving UNIONs in outer select to ColumnStore	2023-02-28 11:23:50 +00:00
Andrey Piskunov	b6808c97f1	MCOL-4530: common conjunction top rewrite (#2673) Added a logical transformation of execplan::ParseTrees that takes out the common factor in expressions of the form "(A and B) or (A and C)" for the purpose of passing TPCH query 19. Co-authored-by: Leonid Fedorov <leonid.fedorov@mariadb.com>	2023-02-27 19:23:19 +03:00
Gagan Goel	86dcf92d56	MCOL-5215 Fix overflow of UNION operation involving DECIMAL datatypes. When a UNION operation involving DECIMAL datatypes with scale and digits before the decimal exceeds the currently supported maximum precision of 38, we throw an error to the user: "MCS-2060: Union operation exceeds maximum DECIMAL precision of 38". This is until MCOL-5417 is implemented where ColumnStore will have full parity with MariaDB server in terms of maximum supported DECIMAL precision and scale of 65 and 38 digits respectively.	2023-02-27 06:38:31 -05:00
Roman Nozdrin	ff534dba7f	MCOL-5384 This commit replaces the shared pointer to CSC with a CSC ctor whose object is cleaned up on leaving a scope. The CSC default ctor was private b/c it must not allow CSC to be used outside the thread cache. However, there are some places in the plugin code that need a standalone syscat that is cleaned up on leaving the scope. The decision is to make the restriction mentioned organizational rather than syntactical.	2023-02-08 14:03:41 +00:00
Roman Nozdrin	d43e418a8c	MCOL-5346 This patch forces TreeNode::getIntValue to use conversion for dict-based CHAR/VARCHAR and TEXT columns (#2657) Co-authored-by: Roman Nozdrin <rnozdrin@mariadb.com>	2022-12-13 18:13:22 +03:00
Leonid Fedorov	b936ed8b2e	Fix some GCC-12 Build errors	2022-11-22 03:28:17 +03:00
mariadb-AndreyPiskunov	b57d2c30fe	Minor fixes	2022-10-31 14:56:32 +02:00
mariadb-AndreyPiskunov	315e4be2d8	First working attempt for json_arrayagg	2022-10-31 14:56:32 +02:00
mariadb-AndreyPiskunov	1714b75434	Non working attempt to do MCOL-5227	2022-10-31 14:56:32 +02:00
Ziy1-Tan	cdd41f05f3	MCOL-785 Implement DISTRIBUTED JSON functions. The following functions are created: Create function JSON_VALID and test cases; Create function JSON_DEPTH and test cases; Create function JSON_LENGTH and test cases; Create function JSON_EQUALS and test cases; Create function JSON_NORMALIZE and test cases; Create function JSON_TYPE and test cases; Create function JSON_OBJECT and test cases; Create function JSON_ARRAY and test cases; Create function JSON_KEYS and test cases; Create function JSON_EXISTS and test cases; Create function JSON_QUOTE/JSON_UNQUOTE and test cases; Create function JSON_COMPACT/DETAILED/LOOSE and test cases; Create function JSON_MERGE and test cases; Create function JSON_MERGE_PATCH and test cases; Create function JSON_VALUE and test cases; Create function JSON_QUERY and test cases; Create function JSON_CONTAINS and test cases; Create function JSON_ARRAY_APPEND and test cases; Create function JSON_ARRAY_INSERT and test cases; Create function JSON_INSERT/REPLACE/SET and test cases; Create function JSON_REMOVE and test cases; Create function JSON_CONTAINS_PATH and test cases; Create function JSON_OVERLAPS and test cases; Create function JSON_EXTRACT and test cases; Create function JSON_SEARCH and test cases. Note: The output of some functions differs from MDB because session variables affect function output, e.g. JSON_QUOTE/JSON_UNQUOTE. This depends on MCOL-5212	2022-08-30 22:22:23 +08:00
Leonid Fedorov	d2432f9bf6	get rid of pointers for 128 fields	2022-08-26 15:12:22 +00:00
Gagan Goel	6a6fee5969	MCOL-5021 Followup. Allow the compiler to inline the call to nextColValue() in column.cpp.	2022-08-18 19:35:35 +00:00
Gagan Goel	cbfdae3481	MCOL-5021 Code changes based on review feedback.	2022-08-05 14:40:50 -04:00
Gagan Goel	1355237ca3	MCOL-5021 Some minor fixes.	2022-08-05 14:40:50 -04:00
Gagan Goel	94e9f55940	MCOL-5021 Add a new member function to the DBRM class, DBRM::addToLBIDList(). This function iterates over lbidList (populated by an earlier call to DBRM::getUncommittedExtentLBIDs()) to find those LBIDs which belong to the AUX column. It then finds the corresponding LBIDs for all other columns which belong to the same table as the AUX LBID and appends them to lbidList. The updated lbidList is used by invalidateUncommittedExtentLBIDs() to update the casual partitioning information. DBRM::addToLBIDList() only comes into play in case of a transaction ROLLBACK.	2022-08-05 14:40:50 -04:00
Gagan Goel	439db48c5a	MCOL-5021 Add support for the AUX column in TRUNCATE table processing.	2022-08-05 14:40:49 -04:00
Gagan Goel	ea1861fdb5	MCOL-5021 Add a new function to CalpontSystemCatalog class, isAUXColumnOID(), to check if a given OID is an auxiliary column OID.	2022-08-05 14:40:49 -04:00
Gagan Goel	262cd5c501	MCOL-5021 Remove hard-coded values for data type, column width and compression type for the AUX column, and replace them with constants defined in the execplan namespace.	2022-08-05 14:40:49 -04:00
Gagan Goel	86df9a972c	MCOL-5021 Add prototype support for the AUX column in CREATE/DROP DDL commands, single and multi-value INSERTs, cpimport, and DELETE.	2022-08-05 14:40:49 -04:00
Leonid Fedorov	39c43a0f70	'<unnamed>.execplan::CalpontSystemCatalog::TableName::create_date' may be used uninitialized	2022-07-11 22:27:25 +02:00
Denis Khalikov	467fe0b401	[MCOL-5109] Make a singleton from ServicePrimProc. This patch makes a singleton from ServicePrimProc.	2022-06-07 13:27:45 +03:00
Leonid Fedorov	65252df4f6	C++20 fixes	2022-03-28 12:32:29 +00:00
Serguey Zefirov	53b9a2a0f9	MCOL-4580 extent elimination for dictionary-based text/varchar types. The idea is relatively simple: encode prefixes of collated strings as integers and use them to compute extents' ranges. Then we can eliminate extents with strings. The actual patch has all the code there but misses one important step: we do not keep the collation index, we keep the charset index. Because of this, some of the tests in the bugfix suite fail and thus the main functionality is turned off. The reason this patch is put into a PR at all is that it contains changes that make CHAR/VARCHAR columns unsigned. This change is needed in the vectorization work.	2022-03-02 23:53:39 +03:00
Leonid Fedorov	3919c541ac	New warnfixes (#2254) * Fix clang warnings * Remove vim tab guides * initialize variables * 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length * Fix ISO C++17 does not allow 'register' storage class specifier for outdated bison * chars are unsigned on ARM, having if (ival < 0) always false * chars are unsigned by default on ARM and comparison with -1 is always true	2022-02-17 13:08:58 +03:00
Roman Nozdrin	15a87ee510	Merge pull request #2257 from tntnatbry/MCOL-4957 MCOL-4957 Fix performance slowdown for processing TIMESTAMP columns.	2022-02-16 19:46:57 +02:00
Leonid Fedorov	9b686c04e1	wrong return type	2022-02-15 14:23:08 +00:00
Gagan Goel	973e5024d8	MCOL-4957 Fix performance slowdown for processing TIMESTAMP columns. Part 1: As part of MCOL-3776 to address synchronization issue while accessing the fTimeZone member of the Func class, mutex locks were added to the accessor and mutator methods. However, this slows down processing of TIMESTAMP columns in PrimProc significantly as all threads across all concurrently running queries would serialize on the mutex. This is because PrimProc only has a single global object for the functor class (class derived from Func in utils/funcexp/functor.h) for a given function name. To fix this problem: (1) We remove the fTimeZone as a member of the Func derived classes (hence removing the mutexes) and instead use the fOperationType member of the FunctionColumn class to propagate the timezone values down to the individual functor processing functions such as FunctionColumn::getStrVal(), FunctionColumn::getIntVal(), etc. (2) To achieve (1), a timezone member is added to the execplan::CalpontSystemCatalog::ColType class. Part 2: Several functors in the Funcexp code call dataconvert::gmtSecToMySQLTime() and dataconvert::mySQLTimeToGmtSec() functions for conversion between seconds since unix epoch and broken-down representation. These functions in turn call the C library function localtime_r() which currently has a known bug of holding a global lock via a call to __tz_convert. This significantly reduces performance in multi-threaded applications where multiple threads concurrently call localtime_r(). More details on the bug: https://sourceware.org/bugzilla/show_bug.cgi?id=16145 This bug in localtime_r() caused processing of the Functors in PrimProc to slowdown significantly since a query execution causes Functors code to be processed in a multi-threaded manner. 
As a fix, we remove the calls to localtime_r() from gmtSecToMySQLTime() and mySQLTimeToGmtSec() by performing the timezone-to-offset conversion (done in dataconvert::timeZoneToOffset()) during the execution plan creation in the plugin. Note that localtime_r() is only called when the time_zone system variable is set to "SYSTEM". This fix also required changing the timezone type from a std::string to a long across the system.	2022-02-14 14:12:27 -05:00
Leonid Fedorov	04752ec546	clang format apply	2022-01-21 16:43:49 +00:00
Leonid Fedorov	01f3ceb437	replace header guards with #pragma once	2022-01-21 15:24:58 +00:00
Roman Nozdrin	05897948e4	MCOL-4899 MCS now applies a correct collation running IN for character data types	2022-01-05 12:00:01 +00:00
Alexander Barkov	fa9f18553a	MCOL-4728 Query with unusual use of aggregate functions on ColumnStore table crashes MariaDB Server. After an AggregateColumn corresponding to SUM(1+1) is created, it is pushed to the list: gwi.count_asterisk_list.push_back(ac) Later, in getSelectPlan(), the expression SUM(1+1) was erroneously treated as a constant: if (!hasNonSupportItem && !nonConstFunc(ifp) && !(parseInfo & AF_BIT) && tmpVec.size() == 0) { srcp.reset(buildReturnedColumn(item, gwi, gwi.fatalParseError)); This code freed the original AggregateColumn and replaced it with a ConstantColumn. But gwi.count_asterisk_list still pointed to the freed AggregateColumn. The expression SUM(1+1) was treated as a constant because tmpVec was empty due to a bug in this code: // special handling for count(*). This should not be treated as constant. if (isp->argument_count() == 1 && ( sfitempp[0]->type() == Item::CONST_ITEM && (sfitempp[0]->cmp_type() == INT_RESULT || sfitempp[0]->cmp_type() == STRING_RESULT || sfitempp[0]->cmp_type() == REAL_RESULT || sfitempp[0]->cmp_type() == DECIMAL_RESULT) ) ) { field_vec.push_back((Item_field*)item); //dummy Notice, it handles only aggregate functions with explicit literals passed as an argument, while it does not handle constant expressions such as 1+1. Fix: - Adding new classes ConstantColumnNull, ConstantColumnString, ConstantColumnNum, ConstantColumnUInt, ConstantColumnSInt, ConstantColumnReal, ValStrStdString, to reuse the code more easily. - Moving a part of the code from the case branch handling CONST_ITEM in buildReturnedColumn() into a new function newConstantColumnNotNullUsingValNativeNoTz(). This makes the code easier to read and to reuse in the future. - Adding a new function newConstantColumnMaybeNullFromValStrNoTz(). Removing duplicate code from !!!four!!! places, using the new function instead. - Adding a function isSupportedAggregateWithOneConstArg() to properly catch all constant expressions. 
Using the new function parse_item() in the code commented as "special handling for count(*)". Now it pushes all constant expressions to field_vec, not only explicit literals. - Moving a part of the code from buildAggregateColumn() to a helper function processAggregateColumnConstArg(). Using processAggregateColumnConstArg() in the CONST_ITEM and NULL_ITEM branches. - Adding a new branch in buildReturnedColumn() handling FUNC_ITEM. If a function has constant arguments, a ConstantColumn() is immediately created, without going to buildArithmeticColumn()/buildFunctionColumn(). - Reusing isSupportedAggregateWithOneConstArg() and processAggregateColumnConstArg() in buildAggregateColumn(). A new branch catches aggregate function has only one constant argument and immediately creates a single ConstantColumn without traversing to the argument sub-components.	2021-09-21 14:00:56 +04:00
Leonid Fedorov	5c5f103f98	MCOL-4839: Fix clang build (#2100) * Fix clang build * Extern C returned to plugin_instance Co-authored-by: Leonid Fedorov <l.fedorov@mail.corp.ru>	2021-08-23 10:45:10 -05:00
David Hall	a202bda485	MCOL-4719 iterate into subquery looking for window functions. When an outer query filter accesses a subquery column that contains an aggregate or a window function, certain optimizations can't be performed. We had been looking at the surface of the returned column. We now iterate into any functions or operations looking for aggregates and window functions.	2021-07-22 13:56:21 -05:00
Gagan Goel	b3a560300c	Revert "Merge pull request #2022 from mariadb-corporation/bar-develop-MCOL-4791" This reverts commit `4016e25e5b`, reversing changes made to `85435f6b1e`.	2021-07-13 11:06:56 +00:00
Leonid Fedorov	f81f743282	Replace underlying type for avg and sum for int types from long double to wide decimal	2021-07-08 17:04:43 +00:00
Roman Nozdrin	7b4f759592	Merge pull request #2032 from drrtuy/MCOL-4802 MCOL-4802 Removed ByteStream methods for bool and add some logging in…	2021-07-07 13:03:54 +03:00
Roman Nozdrin	fb5ba84212	MCOL-4802 Removed ByteStream methods for bool manipulations and add some logging into I_S.columnstore_files	2021-07-07 07:16:30 +00:00
Alexander Barkov	9794f24369	MCOL-4801 Replace Row methods getStringLength() and getStringPointer() with getConstString()	2021-07-06 21:15:32 +04:00
Gagan Goel	8520f87237	MCOL-641 Cleanup.	2021-07-06 09:01:49 +00:00
Roman Nozdrin	6dc356ed60	Merge pull request #1989 from denis0x0D/MCOL-4713 MCOL-4713 Analyze table implementation.	2021-07-02 16:17:07 +03:00
Denis Khalikov	c20015a7b2	MCOL-4713 Analyze table implementation.	2021-07-02 12:37:12 +03:00
Alexander Barkov	e8126bede5	MCOL-4791 Fix ColumnCommand fudged data type format to clearly identify CHAR vs VARCHAR	2021-07-02 12:42:03 +04:00
Alexander Barkov	d61690748e	MCOL-4743 Regression: TIME_TO_SEC(const_expr) erroneously returns 0	2021-06-03 11:16:53 +04:00
Alexander Barkov	9608533d92	MCOL-4734 Compilation failure: MariaDB-10.6 + ColumnStore-develop. mcsconfig.h and my_config.h have the following pre-processor definitions: 1. Conflicting definitions coming from the standard cmake definitions: - PACKAGE - PACKAGE_BUGREPORT - PACKAGE_NAME - PACKAGE_STRING - PACKAGE_TARNAME - PACKAGE_VERSION - VERSION 2. Conflicting definitions of other kinds: - HAVE_STRTOLL - this is dirt in the MariaDB headers. Should be fixed in the server code. my_config.h erroneously performs "#define HAVE_STRTOLL" instead of "#define HAVE_STRTOLL 1" in some cases. The former is not CMake compatible style. The latter is. 3. Non-conflicting definitions: Otherwise, mcsconfig.h and my_config.h should be mutually compatible, because both are generated by cmake on the same host machine. So they should have exactly equal definitions like "HAVE_XXX", "SIZEOF_XXX", etc. Observations: - It's OK to include both mcsconfig.h and my_config.h providing that we suppress duplicate definition of the above conflicting types #1 and #2. - There is no need to suppress duplicate definitions mentioned in #3, as they are compatible! - my_sys.h and m_ctype.h must always follow a CMake configuration header, either my_config.h or mcsconfig.h (or both). They must never be included without any preceding configuration header. This change makes sure that we resolve conflicts by: - either disallowing inclusion of mcsconfig.h and my_config.h at the same time - or by hiding conflicting definitions #1 and #2 (with their later restoring). - also, by making sure that my_sys.h and m_ctype.h always follow a CMake configuration file. Details: - idb_mysql.h can now be included only after my_config.h. An attempt to use idb_mysql.h with mcsconfig.h instead of my_config.h is caught by the "#error" preprocessor directive. - mariadb_my_sys.h can now be included only after mcsconfig.h. An attempt to use mariadb_my_sys.h without mcsconfig.h (e.g. with my_config.h) is also caught by "#error". 
- collation.h can now be included in two ways. It now has the following effective structure: #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H) // Remember current conflicting definitions on the preprocessor stack // Undefine current conflicting definitions #endif #include "mcsconfig.h" #include "m_ctype.h" #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H) # Restore conflicting definitions from the preprocessor stack #endif and can be included as follows: a. using only mcsconfig.h as a configuration header: // my_config.h must not be included so far #include "collation.h" b. using my_config.h as the first included configuration file: #define PREFER_MY_CONFIG_H // Force conflict resolution #include "my_config.h" // can be included directly or indirectly ... #include "collation.h" Other changes: - Adding helper header files utils/common/mcsconfig_conflicting_defs_remember.h utils/common/mcsconfig_conflicting_defs_restore.h utils/common/mcsconfig_conflicting_defs_undef.h to make conflict resolution easier. - Removing `#include "collation.h"` from a number of files, as it's automatically included from rowgroup.h. - Removing redundant `#include "utils_utf8.h"`. This change is not directly related to the problem being fixed, but it's nice to remove redundant directives for both collation.h and utils_utf8.h from all the files that do not really need them. (this change could probably have gone as a separate commit) - Changing my_init() to MY_INIT(argv[0]) in the MCS services sources. After the fix of the compilation failure it appeared that ColumnStore services compiled with the debug build crash due to recent changes in safemalloc. The crash happened in strcmp() with `my_progname` as an argument (where my_progname is a mysys global variable). This problem should probably be fixed on the server side as well to avoid passing NULL. But, the majority of MariaDB executable programs also use MY_INIT(argv[0]) rather than my_init(). 
So let's make MCS do like the other programs do.	2021-05-25 12:34:36 +04:00
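
Commit 3bd0ef4e43 above swaps std::set's contains method for count so the code builds with the stock STL on Rocky/RHEL/Alma 8. The sketch below illustrates that kind of substitution under stated assumptions: `hasKey` and `demoHasKey` are hypothetical names invented for this example and are not taken from the ColumnStore sources.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical helper (not from the ColumnStore sources): a membership test
// that builds with standard libraries that predate C++20.
template <typename T>
bool hasKey(const std::set<T>& s, const T& key)
{
  // s.contains(key) needs C++20; count() returns 0 or 1 for std::set,
  // so comparing against 0 gives the same answer.
  return s.count(key) != 0;
}

// Small self-check using made-up data.
inline void demoHasKey()
{
  const std::set<std::string> names{"alpha", "beta"};
  assert(hasKey(names, std::string("alpha")));
  assert(!hasKey(names, std::string("gamma")));
}
```

Because std::set holds unique keys, count() never returns more than 1, so the replacement should not change behavior.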


278 Commits
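
Commit b6808c97f1 in the log describes taking out the common factor in expressions of the form "(A and B) or (A and C)". The following is a minimal sketch of that rewrite on a toy expression tree, assuming the common factor is the left operand of both conjunctions. `Node`, `leaf`, `land`, `lor`, `factorCommon`, and `show` are all hypothetical names for illustration; this is not the real execplan::ParseTree API and may differ from the actual patch.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy expression node; NOT the real execplan::ParseTree.
struct Node
{
  enum class Kind { Leaf, And, Or };
  Kind kind;
  std::string name;  // used by leaves only
  std::shared_ptr<Node> left, right;
};
using NodePtr = std::shared_ptr<Node>;

NodePtr leaf(const std::string& n)
{
  return std::make_shared<Node>(Node{Node::Kind::Leaf, n, nullptr, nullptr});
}
NodePtr land(NodePtr l, NodePtr r)
{
  return std::make_shared<Node>(Node{Node::Kind::And, "", std::move(l), std::move(r)});
}
NodePtr lor(NodePtr l, NodePtr r)
{
  return std::make_shared<Node>(Node{Node::Kind::Or, "", std::move(l), std::move(r)});
}

// Two nodes count as the same factor only if both are leaves with equal names.
bool sameLeaf(const NodePtr& a, const NodePtr& b)
{
  return a->kind == Node::Kind::Leaf && b->kind == Node::Kind::Leaf && a->name == b->name;
}

// Rewrites (A and B) or (A and C) into A and (B or C).
// Any other shape is returned unchanged.
NodePtr factorCommon(const NodePtr& n)
{
  if (n->kind == Node::Kind::Or && n->left->kind == Node::Kind::And &&
      n->right->kind == Node::Kind::And && sameLeaf(n->left->left, n->right->left))
  {
    return land(n->left->left, lor(n->left->right, n->right->right));
  }
  return n;
}

// Renders the tree with explicit parentheses.
std::string show(const NodePtr& n)
{
  if (n->kind == Node::Kind::Leaf)
    return n->name;
  const char* op = n->kind == Node::Kind::And ? " and " : " or ";
  return "(" + show(n->left) + op + show(n->right) + ")";
}
```

For example, `show(factorCommon(lor(land(leaf("A"), leaf("B")), land(leaf("A"), leaf("C")))))` yields `(A and (B or C))`. The rewrite is the distributive law applied in reverse, so the filter's truth value is preserved while A appears only once.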