mariadb-columnstore-engine

mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-07-29 08:21:15 +03:00

Author	SHA1	Message	Date
Alexander Barkov	67449418ed	MCOL-4700 Wrong result of a UNION for INT and INT UNSIGNED	2021-06-11 19:31:51 +04:00
Alexander Barkov	b3d6f62964	MCOL-4753 Performance problem in Typeless join	2021-06-10 09:26:26 +00:00
Alexey Antipovsky	0dedb7e628	Fix compilation warnings	2021-06-09 16:51:00 +03:00
Roman Nozdrin	7a152c6a19	Merge pull request #1944 from mariadb-AlexeyAntipovsky/MCOL-563-dev [MCOL-4709] Disk-based aggregation	2021-06-08 20:42:58 +03:00
Alexey Antipovsky	475104e4d3	[MCOL-4709] Disk-based aggregation * Introduce multigeneration aggregation * Do not save unused part of RGDatas to disk * Add IO error explanation (strerror) * Reduce memory usage while aggregating * Introduce in-memory generations for better memory utilization * Try to keep the number of buckets at a low limit * Refactor disk aggregation a bit * Pass calculated hash into RowAggregation * Try to keep some RGData with free space in memory * Do not dump more than half of rowgroups to disk if generations are allowed, instead start a new generation * For each thread shift the first processed bucket at each iteration, so the generations start more evenly * Unify temp data location * Explicitly create temp subdirectories whether disk aggregation/join are enabled or not	2021-06-06 16:09:15 +03:00
Denis Khalikov	606194e6e4	MCOL-4685: Eliminate some irrelevant settings (uncompressed data and extents per file). This patch: 1. Removes the option to declare uncompressed columns (set columnstore_compression_type = 0). 2. Ignores the COMMENT 'compression=0' option at table or column level (no error messages, just disregard). 3. Removes the option to set more than 2 extents per file (ExtentsPerSegmentFile). 4. Updates the rebuildEM tool to support up to 10 dictionary extents per dictionary segment file. 5. Adds a check for `DBRootStorageType` to the rebuildEM tool. 6. Renames rebuildEM to mcsRebuildEM.	2021-06-03 14:44:33 +03:00
Roman Nozdrin	90397dfed0	MCOL-4675 DMLProc now shuts down automatically and gracefully when the cluster state is set to SS_SHUTDOWN_PENDING | SS_ROLLBACK	2021-05-27 11:07:32 +00:00
Roman Nozdrin	42e710f817	Merge pull request #1942 from mariadb-corporation/bar-develop-compile-10.6 Fixing 10.6 + develop compilation failure	2021-05-25 14:31:37 +03:00
Alexander Barkov	9608533d92	MCOL-4734 Compilation failure: MariaDB-10.6 + ColumnStore-develop mcsconfig.h and my_config.h have the following pre-processor definitions: 1. Conflicting definitions coming from the standard cmake definitions: - PACKAGE - PACKAGE_BUGREPORT - PACKAGE_NAME - PACKAGE_STRING - PACKAGE_TARNAME - PACKAGE_VERSION - VERSION 2. Conflicting definitions of other kinds: - HAVE_STRTOLL - this is a flaw in the MariaDB headers and should be fixed in the server code. In some cases my_config.h erroneously performs "#define HAVE_STRTOLL" instead of "#define HAVE_STRTOLL 1". The former is not CMake-compatible style; the latter is. 3. Non-conflicting definitions: Otherwise, mcsconfig.h and my_config.h should be mutually compatible, because both are generated by cmake on the same host machine. So they should have exactly equal definitions like "HAVE_XXX", "SIZEOF_XXX", etc. Observations: - It's OK to include both mcsconfig.h and my_config.h provided that we suppress duplicate definitions of the above conflicting types #1 and #2. - There is no need to suppress the duplicate definitions mentioned in #3, as they are compatible! - my_sys.h and m_ctype.h must always follow a CMake configuration header, either my_config.h or mcsconfig.h (or both). They must never be included without a preceding configuration header. This change makes sure that we resolve conflicts by: - either disallowing inclusion of mcsconfig.h and my_config.h at the same time - or by hiding conflicting definitions #1 and #2 (and restoring them later). - also, by making sure that my_sys.h and m_ctype.h always follow a CMake configuration file. Details: - idb_mysql.h can now be included only after my_config.h. An attempt to use idb_mysql.h with mcsconfig.h instead of my_config.h is caught by the "#error" preprocessor directive. - mariadb_my_sys.h can now be included only after mcsconfig.h. An attempt to use mariadb_my_sys.h without mcsconfig.h (e.g. with my_config.h) is also caught by "#error".
- collation.h can now be included in two ways. It now has the following effective structure: #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H) // Remember current conflicting definitions on the preprocessor stack // Undefine current conflicting definitions #endif #include "mcsconfig.h" #include "m_ctype.h" #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H) // Restore conflicting definitions from the preprocessor stack #endif and can be included as follows: a. using only mcsconfig.h as a configuration header: // my_config.h must not be included so far #include "collation.h" b. using my_config.h as the first included configuration file: #define PREFER_MY_CONFIG_H // Force conflict resolution #include "my_config.h" // can be included directly or indirectly ... #include "collation.h" Other changes: - Adding helper header files utils/common/mcsconfig_conflicting_defs_remember.h utils/common/mcsconfig_conflicting_defs_restore.h utils/common/mcsconfig_conflicting_defs_undef.h to make conflict resolution easier. - Removing `#include "collation.h"` from a number of files, as it's automatically included from rowgroup.h. - Removing redundant `#include "utils_utf8.h"`. This change is not directly related to the problem being fixed, but it's nice to remove redundant directives for both collation.h and utils_utf8.h from all the files that do not really need them. (this change could probably have gone as a separate commit) - Changing my_init() to MY_INIT(argv[0]) in the MCS services sources. After the compilation failure was fixed, it turned out that ColumnStore services compiled with the debug build crash due to recent changes in safemalloc. The crash happened in strcmp() with `my_progname` as an argument (where my_progname is a mysys global variable). This problem should probably be fixed on the server side as well to avoid passing NULL. But the majority of MariaDB executable programs also use MY_INIT(argv[0]) rather than my_init().
So let's make MCS do what the other programs do.	2021-05-25 12:34:36 +04:00
Alexander Barkov	284fc51bb7	MCOL-4726 Wrong result of WHERE char1_col='A'	2021-05-21 14:40:16 +04:00
Alexander Barkov	bd4cbb542d	MCOL-4721 CHAR(1) is not collation-aware for GROUP/DISTINCT	2021-05-18 16:14:53 +04:00
Gagan Goel	78cca01dfa	Merge pull request #1899 from tntnatbry/MCOL-4612 MCOL-4612 A subquery with a union for DECIMAL and BIGINT returns zeros.	2021-05-03 02:52:23 -04:00
benthompson15	71c16fcb56	Merge pull request #1909 from dhall-MariaDB/MCOL-4643 MCOL-4643 reset valOut after processing UDAF	2021-04-30 14:35:07 -05:00
David Hall	f4e6939139	MCOL-4643 dev 5 reset valOut after processing UDAF After a UDAF result has been inserted in the output stream, the valOut object needs to be reset to empty in preparation for the next value. Failing to do so may cause what should be a NULL value to erroneously take the last value inserted.	2021-04-30 10:57:40 -05:00
David.Hall	6d138a4963	Merge pull request #1905 from benthompson15/MCOL-4044 MCOL-4044: Add oracle mode functions.	2021-04-30 10:13:17 -05:00
Gagan Goel	4e9307fa6d	MCOL-4612 A subquery with a union for DECIMAL and BIGINT returns zeros. In this patch, we set the unioned type to a wide decimal, if any of the numeric columns involved in the union operation have a precision > 18 (which is also possible for BIGINT/UBIGINT types) and <= 38.	2021-04-30 12:33:33 +00:00
benthompson15	1eea9f9e47	MCOL-4598: Fix the syslog setup script. Add syslog options for broken/non-syslog setup.	2021-04-29 16:34:53 -05:00
Roman Nozdrin	2aa5380d51	Remove global lock from OAMCache. Config now uses a single atomic variable to speed up its operations. Config has a special method to re-read a config file if it changed on disk.	2021-04-23 15:41:29 +00:00
Alexander Barkov	d6e88cd82e	MCOL-4693 ColumnStore MTR tests: FUNCTION mcs192_db.CORR does not exist	2021-04-22 18:29:08 +04:00
benthompson15	870d672efb	MCOL-4044: Add oracle mode functions.	2021-04-21 16:07:42 -05:00
Roman Nozdrin	c9b353e975	Simplify PMS connection entries configuration	2021-04-16 10:52:13 +00:00
Roman Nozdrin	757f8d00a5	A pluggable PoorManProfiler singleton	2021-04-14 10:54:46 +00:00
Alexander Barkov	362bfcd15e	MCOL-4361 Replace pow(10.0, (double)scale) expressions with a static dictionary lookup.	2021-04-09 12:41:04 +04:00
Alexander Barkov	912cbe641e	Removing func_bitand.cpp. It was dead code. It was not even a part of the sources in CMakeLists.txt. The bit AND operator implementation resides in func_bitwise.cpp together with all other bit operators and functions.	2021-04-08 11:34:22 +04:00
Gagan Goel	47b1ea1cf9	Merge pull request #1849 from mariadb-corporation/bar-develop-MCOL-4666 MCOL-4666 Empty set when using BIT OR and BIT AND functions in WHERE	2021-04-08 03:04:54 -04:00
Alexander Barkov	a6a85d157d	MCOL-4666 Empty set when using BIT OR and BIT AND functions in WHERE	2021-04-07 14:37:39 +04:00
Gagan Goel	1c54df4ab7	Merge pull request #1851 from mariadb-corporation/bar-develop-MCOL-4668 MCOL-4668 PERIOD_DIFF(dec_or_double1,dec_or_double2) is not as in InnoDB	2021-04-07 02:12:50 -04:00
Gagan Goel	f33b860580	Merge pull request #1841 from mariadb-corporation/bar-develop-MCOL-4614 A joint patch for MCOL-4614 and MCOL-4615 (decimal to string conversion)	2021-04-06 02:13:38 -04:00
Alexander Barkov	9f41f574da	MCOL-4668 PERIOD_DIFF(dec_or_double1,dec_or_double2) is not as in InnoDB	2021-04-06 08:15:18 +04:00
Alexander Barkov	69911c2710	A joint patch for MCOL-4614, MCOL-4615, MCOL-4660 (decimal to string conversion) This patch fixes: - MCOL-4614 calShowPartitions() precision loss for huge narrow decimal - MCOL-4615 GROUP_CONCAT() precision loss for huge narrow decimal - MCOL-4660 Narrow decimal to string conversion is inconsistent about zero integral Changes: - Implementing Row::getDecimalField() - Removing double arithmetic from the code printing DECIMAL values in TypeHandlerXDecimal::format64() and GroupConcator::outputRow(). Using Decimal::toString() instead. - Rewriting Decimal::toStringTSInt64(). The old implementation was wrong, too complex and slow (used unnecessary memmove, memcpy). An additional cleanup: - Removing the ENGINE=COLUMNSTORE clause from tests for MCOL-4532 and MCOL-4640. type_decimal.test is combinations-aware. It's run two times, with default_storage_engine=MyISAM and default_storage_engine=COLUMNSTORE. So the CREATE TABLE statements should not specify the engine explicitly. - Adding --disable_warnings in the old fixed test. We needed to suppress warnings when the MyISAM combination is being run. Previously the table was erroneously created with ENGINE=COLUMNSTORE even with the MyISAM combination run. So warnings were not generated.	2021-04-05 16:36:19 +04:00
Gagan Goel	a3db5bde36	Merge pull request #1839 from mariadb-corporation/bar-develop-MCOL-4649 Fixing DOUBLE-to-[U]INT conversion (MCOL-4649, MCOL-4631, MCOL-4647)	2021-04-05 05:55:37 -04:00
Alexander Barkov	a86f432f35	Fixing DOUBLE-to-[U]INT conversion (MCOL-4649, MCOL-4631, MCOL-4647) Bugs fixed: - MCOL-4649 CAST(double AS UNSIGNED) returns 0 - MCOL-4631 CAST(double AS SIGNED) returns 0 or NULL - MCOL-4647 SEC_TO_TIME(double_or_float) returns a wrong result Problems: - The code in Func_cast_unsigned::getUintVal() and Func_cast_signed::getIntVal() did not properly check that the double value fits inside the uint64_t/int64_t range. So the corner cases: - numeric_limits<uint64_t>::max()-2 for uint64_t - numeric_limits<int64_t>::max() for int64_t produced unexpected results. The problem was in tests like this: if (value > (double) numeric_limits<int64_t>::max()) A correct test would be: if (value >= (double) numeric_limits<int64_t>::max()) - The code in Func_sec_to_time::getStrVal() searched for the decimal dot character, assuming that the next character after the dot was the leftmost fractional digit. This assumption was wrong because double numbers of extreme magnitude use scientific notation. So for example in "2.5e-40" the digit "5" following the dot is NOT the leftmost fractional digit. Also, the code in Func_sec_to_time::getStrVal() was slow because of unnecessary to-string and from-string data conversion. Also, the code in Func_sec_to_time::getStrVal() evaluated the argument two times: using getStrVal() then using getIntVal(). Solution: - Adding new classes TDouble and TLongDouble. - Adding a few function templates to make code reuse easier. - Moving the conversion code inside TDouble and TLongDouble methods toMCSSInt64Round() and toMCSUInt64Round(). - Reusing new classes and their methods in func_cast.cc and func_sec_to_time.cc.	2021-04-05 11:30:52 +04:00
Roman Nozdrin	05863a3fb5	Merge pull request #1808 from denis0x0D/MCOL-4566/rebuild_em_compressed MCOL-4566: Add rebuildEM tool support to work with compressed files.	2021-04-02 17:52:34 +03:00
Roman Nozdrin	a0b46425dc	Merge pull request #1787 from mariadb-corporation/bar-develop-like MCOL-4498 LIKE is not collation aware	2021-04-02 11:57:06 +03:00
Denis Khalikov	5d497e8821	MCOL-4566: Add rebuildEM tool support to work with compressed files. * This patch adds rebuildEM tool support to work with compressed files. * This patch increases the version of the file header. Note: the default version of the `rebuildEM` tool used a very old API whose functions no longer exist. So `rebuildEM` will not work with files created without compression, because we cannot deduce some of the information needed to create a column extent.	2021-04-02 10:55:01 +03:00
Gagan Goel	f3766e40e4	Merge pull request #1836 from mariadb-corporation/bar-develop-MCOL-4618 A joint patch fixing MCOL-4618 and MCOL-4653:	2021-04-01 06:01:44 -04:00
Alexander Barkov	e19096a91a	A joint patch fixing MCOL-4618 and MCOL-4653: - MCOL-4618 FLOOR(-9999.0) returns a bad result - MCOL-4653 CEIL(negativeNarrowDecimal) returns a wrong result Main changes: a. Moving ROUND, CEIL, FLOOR related code into a new simple class template DecomposedDecimal, which is reused for 64 and 128 bit decimal. b. Using DecomposedDecimal in TDecimal64 and TDecimal128 to implement ROUND, CEIL, FLOOR related methods. c. Adding corresponding wrapper methods to the class Decimal. d. Using new Decimal methods in Func_ceil and Func_floor. Additional minor changes: - Adding "explicit" to TSInt128 constructors to avoid hidden data type conversion and erroneous choice between 64 vs 128 bit APIs when using Decimal. Now one can call constructors in this self-explanatory way: - Decimal(TSInt128(some_int_value), scale, precision) to create a wide decimal - Decimal(TSInt64(some_int_value), scale, precision) to create a narrow decimal TODO: Consider changing Decimal(int64_t val, int8_t s, uint8_t p, const int128_t &val128 = 0) to Decimal(int64_t val, int8_t s, uint8_t p, const int128_t &val128) (or even removing this constructor) to disallow compilation of: Decimal(some_trivial_type_value, scale, precision)	2021-04-01 09:47:22 +04:00
benthompson15	c98d0c997d	MCOL-4386: Update libmarias3 ref	2021-03-30 16:13:39 -05:00
Alexander Barkov	30fe666a8f	A joint patch for MCOL-4609, MCOL-4610, MCOL-4619, MCOL-4650, MCOL-4651 This patch fixes the following bugs: - MCOL-4609 TreeNode::getIntVal() does not round: implicit DECIMAL->INT cast is not MariaDB compatible - MCOL-4610 TreeNode::getUintVal() loses precision for narrow decimal - MCOL-4619 TreeNode::getUintVal() does not round: Implicit DECIMAL->UINT conversion is not like in InnoDB - MCOL-4650 TreeNode::getIntVal() loses precision for narrow decimal - MCOL-4651 SEC_TO_TIME(hugePositiveDecimal) returns a negative time	2021-03-30 16:37:05 +04:00
Alexander Barkov	1acc631a04	MCOL-4600 CAST(decimal AS SIGNED/UNSIGNED) returns a wrong result The "SIGNED" part of the problem was previously fixed by MCOL-4640. Fixing the "UNSIGNED" part. - Adding TDecimal64::toUInt64Round() and Decimal::decimal64ToUInt64Round() - Renaming Decimal::narrowRound() to decimal64ToSInt64Round(), for a more self-descriptive name, and for symmetry with decimal64ToUInt64Round() - Reusing TDecimal64::toSInt64Round() inside decimal64ToSInt64Round(). This change was forgotten in MCOL-4640 :( - Removing the old code in Func_cast_unsigned::getUintVal with pow(). It caused precision loss, hence the bug. Adding a call for Decimal::decimal64ToUInt64Round() instead. - Adding tests for both SIGNED and UNSIGNED casts. Additional change: - Moving the wide-decimal-to-uint64_t rounding code from Func_cast_unsigned::getUintVal() to TDecimal128::toUInt64Round() (with refactoring). Adding TDecimal::toUInt64Round() for symmetry with TDecimal::toSInt64Round(). It will be easier to reuse the code this way.	2021-03-30 12:46:07 +04:00
David Hall	0eee6cfc62	MCOL-4643 reset valOut after UDAF evaluation	2021-03-26 16:09:15 -05:00
Alexander Barkov	c0b8445225	MCOL-4633 Remove duplicate code for DECIMAL to int64_t rounding conversion Detailed change list: - Splitting out the narrow part of "class Decimal" into a separate class TDecimal64 - Adding a method TDecimal64::toSInt64Round() - Reusing the method TDecimal64::toSInt64Round() in: * Func_cast_signed::getIntVal() * Func_char::getStrVal() * Func_elt::getStrVal() * makedate() * Func_maketime::getStrVal() Note, reusing this method in Func_char::getStrVal() also fixed this bug: MCOL-4634 CHAR(negativeWideDecimal) is not like InnoDB because the old code handled negative wide decimal values in a wrong way. - Adding a new class TDecimal128 for symmetry. Moving a few wide decimal methods and constexpr's from Decimal to TDecimal128. The new class TDecimal128 does not do much at this point yet. Later we should be able to use TDecimal128 vs TDecimal64 in templates.	2021-03-25 17:56:10 +04:00
Roman Nozdrin	65e0a69914	Merge pull request #1815 from benthompson15/MCOL-4554 MCOL-4554: use jemalloc provided by repo.	2021-03-23 16:02:06 +03:00
Gagan Goel	894ebced33	Merge pull request #1817 from mariadb-corporation/bar-develop-MCOL-4629 MCOL-4629 Add a helper method mcsv1_UDAF::toDouble()	2021-03-23 05:46:31 -04:00
Alexander Barkov	ca7a310309	MCOL-4629 Add a helper method mcsv1_UDAF::toDouble()	2021-03-23 13:23:48 +04:00
Alexander Barkov	765858bc5b	MCOL-4498 LIKE is not collation aware	2021-03-22 20:42:01 +04:00
David Hall	13b7a794e4	MCOL-4620 Add charset to various RowGroup initializers Specifically to operator+=	2021-03-19 16:57:54 -05:00
benthompson15	c74beb6178	MCOL-4554: use jemalloc provided by repo.	2021-03-18 12:35:16 -05:00
David Hall	af20387985	MCOL-4516 check for var_pop < 0 In some cases, because of rounding error, var_pop will evaluate to some value just less than 0. We check for this and force to round to 0.	2021-03-09 13:36:10 -06:00
Denis Khalikov	a2efa1efeb	MCOL-4566: Extend CompressedDBFileHeader struct with new fields. * This patch extends the CompressedDBFileHeader struct with new fields, `fColumWidth` and `fColDataType`, which are necessary to rebuild the extent map from the given file. Note: the new fields do not change the memory layout of the struct, because the size is calculated as max(sizeof(CompressedDBFileHeader), HDR_BUF_LEN). * This patch changes the API of some functions by adding a new function argument `colDataType` where needed, to be able to call the `initHdr` function with a colDataType value.	2021-03-05 22:15:34 +03:00
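
Note on a86f432f35 (DOUBLE-to-[U]INT conversion): the `>` versus `>=` point in that message may be easier to see in isolation. Converting `numeric_limits<int64_t>::max()` to double rounds it up to 2^63, which is itself outside the int64_t range, so the upper bound has to be tested with `>=`. The following is only a minimal sketch of that idea. The helper name `saturatingDoubleToInt64`, the saturating behaviour, and the NaN-to-zero choice are assumptions made for illustration and are not taken from the ColumnStore sources.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper, NOT the ColumnStore implementation.
// Rounds a double to the nearest int64_t and saturates at the type limits.
int64_t saturatingDoubleToInt64(double value)
{
    // INT64_MAX (2^63 - 1) is not representable as a double; the conversion
    // rounds it up to 2^63, which no longer fits into int64_t.
    constexpr double upper = static_cast<double>(std::numeric_limits<int64_t>::max());
    // INT64_MIN (-2^63) is exactly representable as a double.
    constexpr double lower = static_cast<double>(std::numeric_limits<int64_t>::min());

    if (std::isnan(value))
        return 0; // assumption: map NaN to 0 for this sketch

    // ">=" is required here: value == 2^63 is not greater than 'upper',
    // so a ">" test would let it through to an out-of-range conversion.
    if (value >= upper)
        return std::numeric_limits<int64_t>::max();
    if (value <= lower)
        return std::numeric_limits<int64_t>::min();

    // In range: round half away from zero.
    return static_cast<int64_t>(std::llround(value));
}
```

With this sketch, passing 9223372036854775808.0 (2^63) yields the int64_t maximum instead of reaching an out-of-range conversion, which is the corner case the commit message describes.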


1162 Commits