mariadb-columnstore-engine

mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-11-14 00:42:34 +03:00

Author	SHA1	Message	Date
Serguey Zefirov	2cd8f716c1	Fix MCOL-5035, a difference in INSERT and UPDATE behavior. The UPDATE statement wrote NULL when the column being set is DATETIME and the value is '0000-00-00 00:00:00'. The problem was in WriteEngine's handling of UPDATE statements, and that is where the heart of the change is. Other changes are related to some obsolete data structures in DML/DDL handling that were just hanging around there, doing nothing.	2024-06-27 13:07:49 +03:00
Sergey Zefirov	b53c231ca6	MCOL-271 empty strings should not be NULLs (#2794). This patch improves handling of NULLs in textual fields in ColumnStore. Previously empty strings were considered NULLs, which could be a problem if the data schema allows empty strings. It was also one of the major reasons for behavior differences between ColumnStore and other engines in the MariaDB family. This patch also fixes some other bugs and incorrect behavior, for example, the incorrect comparison "column <= ''", which evaluated to constant true for all purposes before this patch.	2023-03-30 21:18:29 +03:00
Gagan Goel	006b92bba2	Revert "This commit fixes an incorrect predicate in the if condition (#2608 )" This reverts commit `f4e3022fbd`. The commit apparently caused MCOL-5318 and MCOL-5319 which involve the internal ColumnStore batch insert mechanism passing through the SQL layer. The code block involved in this change is a predicate checking for the HWM extent in WriteEngineServer at the end of the batch insert. This is done in WE_DMLCommandProc::processBatchInsertHwm(). The original predicate check in this function for the HWM extent is restored until further investigation.	2023-02-02 08:07:18 -05:00
Gagan Goel	ad59ed5402	MCOL-5367 Fix a bug introduced in MCOL-5021 (AUX column implementation). In the implementation of MCOL-5021, an assert was added in `WE_DMLCommandProc::processBatchInsertHwm()` that assumed the `WriteEngine::TableMetaData` cache is uniform across the cluster. However, this assumption is incorrect. This bug caused undefined behaviour in ColumnStore resulting in bugs such as MCOL-5367. In MCOL-5367, in a multi-node ColumnStore cluster, an INSERT ... SELECT in a transaction with system variable `columnstore_use_import_for_batchinsert=OFF/ON` did not show inserted records when a SELECT query was issued. Assuming a 3-node cluster setup, DMLProc only sends a given batch of records to be inserted to one of the 3 nodes, and not all nodes. As a result, the `WriteEngine::TableMetaData` cache is only populated for that one node and is not uniform across the cluster, causing the assert to fail. As a fix, we simply remove this assert as it is redundant and should not have been added in the first place.	2023-01-16 05:54:44 -05:00
Denis Khalikov	d61780cab1	MCOL-5263 Add support for ROLLBACK when PP was restarted. DMLProc starts a ROLLBACK when the SELECT part of an UPDATE fails because the EM facility in PP was restarted. Unfortunately this ROLLBACK gets stuck if EM/PP are not yet available. DMLProc must have a timeout and retry the ROLLBACK.	2022-12-13 16:18:53 +03:00
Gagan Goel	f4e3022fbd	This commit fixes an incorrect predicate in the if condition (#2608 ) that checks for HWM extent in WE_DMLCommandProc::processBatchInsertHwm().	2022-11-08 14:51:42 -06:00
Leonid Fedorov	d2432f9bf6	get rid of pointers for 128 fields	2022-08-26 15:12:22 +00:00
mariadb-AndreyPiskunov	0863ecd279	Replace getBinaryField	2022-08-25 18:21:43 +03:00
Gagan Goel	cbfdae3481	MCOL-5021 Code changes based on review feedback.	2022-08-05 14:40:50 -04:00
Gagan Goel	1355237ca3	MCOL-5021 Some minor fixes.	2022-08-05 14:40:50 -04:00
Gagan Goel	9b6d3c3870	MCOL-5021 Add support for AUX column in the client code calling CalpontSystemCatalog::columnRIDs().	2022-08-05 14:40:49 -04:00
Gagan Goel	262cd5c501	MCOL-5021 Remove hard-coded values for data type, column width and compression type for the AUX column, and replace them with constants defined in the execplan namespace.	2022-08-05 14:40:49 -04:00
Gagan Goel	86df9a972c	MCOL-5021 Add prototype support for the AUX column in CREATE/DROP DDL commands, single and multi-value INSERTs, cpimport, and DELETE.	2022-08-05 14:40:49 -04:00
Serguey Zefirov	53b9a2a0f9	MCOL-4580 extent elimination for dictionary-based text/varchar types. The idea is relatively simple: encode prefixes of collated strings as integers and use them to compute extents' ranges. Then we can eliminate extents with strings. The actual patch has all the code in place but misses one important step: we do not keep the collation index, we keep the charset index. Because of this, some of the tests in the bugfix suite fail, and thus the main functionality is turned off. The reason this patch is put into a PR at all is that it contains changes that made CHAR/VARCHAR columns unsigned. This change is needed for the vectorization work.	2022-03-02 23:53:39 +03:00
Gagan Goel	973e5024d8	MCOL-4957 Fix performance slowdown for processing TIMESTAMP columns. Part 1: As part of MCOL-3776, to address a synchronization issue while accessing the fTimeZone member of the Func class, mutex locks were added to the accessor and mutator methods. However, this slows down processing of TIMESTAMP columns in PrimProc significantly, as all threads across all concurrently running queries would serialize on the mutex. This is because PrimProc only has a single global object for the functor class (class derived from Func in utils/funcexp/functor.h) for a given function name. To fix this problem: (1) We remove fTimeZone as a member of the Func derived classes (hence removing the mutexes) and instead use the fOperationType member of the FunctionColumn class to propagate the timezone values down to the individual functor processing functions such as FunctionColumn::getStrVal(), FunctionColumn::getIntVal(), etc. (2) To achieve (1), a timezone member is added to the execplan::CalpontSystemCatalog::ColType class. Part 2: Several functors in the Funcexp code call the dataconvert::gmtSecToMySQLTime() and dataconvert::mySQLTimeToGmtSec() functions for conversion between seconds since the Unix epoch and the broken-down representation. These functions in turn call the C library function localtime_r(), which currently has a known bug of holding a global lock via a call to __tz_convert. This significantly reduces performance in multi-threaded applications where multiple threads concurrently call localtime_r(). More details on the bug: https://sourceware.org/bugzilla/show_bug.cgi?id=16145 This bug in localtime_r() caused processing of the Functors in PrimProc to slow down significantly, since a query execution causes Functors code to be processed in a multi-threaded manner. As a fix, we remove the calls to localtime_r() from gmtSecToMySQLTime() and mySQLTimeToGmtSec() by performing the timezone-to-offset conversion (done in dataconvert::timeZoneToOffset()) during the execution plan creation in the plugin. Note that localtime_r() is only called when the time_zone system variable is set to "SYSTEM". This fix also required changing the timezone type from a std::string to a long across the system.	2022-02-14 14:12:27 -05:00
Leonid Fedorov	04752ec546	clang format apply	2022-01-21 16:43:49 +00:00
Roman Nozdrin	af36f9940f	This patch introduces support for scanning/filtering vectorized execution for numeric-based data types. TEXT, CHAR, VARCHAR, FLOAT and DOUBLE are not yet supported by the vectorized path. This patch also introduces an example for the Google benchmarking suite to measure the perf difference between the legacy scan/filtering code and the templated version.	2021-12-10 10:30:00 +00:00
Leonid Fedorov	5c5f103f98	MCOL-4839: Fix clang build (#2100). Fix clang build; extern C returned to plugin_instance. Co-authored-by: Leonid Fedorov <l.fedorov@mail.corp.ru>	2021-08-23 10:45:10 -05:00
Sergey Zefirov	9e0851e4cf	MCOL-4766 ROLLBACK kept ranges changed inside a rolled-back transaction. Now ROLLBACK drops ranges to the INVALID state, which makes the engine rescan blocks and discover the correct ranges.	2021-07-07 18:16:56 +03:00
Gagan Goel	8520f87237	MCOL-641 Cleanup.	2021-07-06 09:01:49 +00:00
Denis Khalikov	606194e6e4	MCOL-4685: Eliminate some irrelevant settings (uncompressed data and extents per file). This patch: 1. Removes the option to declare uncompressed columns (set columnstore_compression_type = 0). 2. Ignores the COMMENT 'compression=0' option at table or column level (no error messages, just disregard). 3. Removes the option to set more than 2 extents per file (ExtentsPerSegmentFile). 4. Updates the rebuildEM tool to support up to 10 dictionary extents per dictionary segment file. 5. Adds a check for `DBRootStorageType` to the rebuildEM tool. 6. Renames rebuildEM to mcsRebuildEM.	2021-06-03 14:44:33 +03:00
Roman Nozdrin	5fce19df0a	MCOL-4412 Introduce TypeHandler::getEmptyValueForType to return a const ptr for an empty value. WE changes for SQL DML and DDL operations. Changes for bulk operations. Changes for scanning operations. Cleanup.	2021-01-18 12:30:17 +00:00
Sergey Zefirov	2bfe9b6c19	Refactor better extent info bookkeeping structure and handling. Logs for research purposes. Keep progress - may not build. Good interface to collect LBIDs and CPInfo's. Write Engine compiles with new interface. New interface breaks things the least way and allows for new features to be added gradually. Still ironing design - rewriting parts of WE. Keep progress commit. Write Engine compiles, going to test. I could introduce crashes there. Let's see. Disable logging for tests. Fixing build problems - keep progress commit. Changes related to new interface. Add back accidentally removed m_txnLBIDMap.find. Remove printf/cout; up-to-date comment for AddLBIDtoList. Add "auto" type annotation. Work on PR comments. Descriptive vector emptiness check.	2020-12-07 13:12:36 +03:00
Roman Nozdrin	494bde61e1	MCOL-4409 Moved static Decimal conversion methods into VDecimal class. This patch combines VDecimal and Decimal and makes IDB_Decimal an alias for the resulting class. More boilerplate reduction in Func_mod. Removed a couple of TSInt128::toType() methods.	2020-11-30 12:08:52 +00:00
Roman Nozdrin	58495d0d2f	MCOL-4387 Convert dataconvert::decimalToString() into VDecimal and TSInt128 methods	2020-11-18 13:53:16 +00:00
Roman Nozdrin	15b1bfa709	Fix fallthrough compilation warnings	2020-11-18 13:53:15 +00:00
Roman Nozdrin	3eb26c0d4a	MCOL-4313 Introduced TSInt128, a storage class for int128. Removed uint128 from joblist/lbidlist.*. Another toString() method for wide-decimal that is EMPTY/NULL aware. Unified decimal processing in WF functions. Fixed a potential issue in EqualCompData::operator() for wide-decimal processing. Fixed some signedness warnings.	2020-11-18 13:53:15 +00:00
Alexander Barkov	129d5b5a0f	MCOL-4174 Review/refactor frontend/connector code	2020-11-18 13:53:15 +00:00
Roman Nozdrin	1c3a34a3d0	Dataconvert::decimalToString badly fails w/o the 20th member of mcs_pow_10, so I returned it. WF::percentile runtime threw an exception b/c of a wrong DT deduced from its argument. Replaced literals with constants. Taught WF_sum_avg::checkSumLimit to use refs instead of values.	2020-11-18 13:52:20 +00:00
Gagan Goel	74b64eb4f1	MCOL-641 1. Add support for int128_t in ParsedColumnFilter. 2. Set Decimal precision in SimpleColumn::evaluate(). 3. Add support for int128_t in ConstantColumn. 4. Set IDB_Decimal::s128Value in buildDecimalColumn(). 5. Use width 16 as first if predicate for branching based on decimal width.	2020-11-18 13:47:45 +00:00
Gagan Goel	9b714274db	MCOL-641 1. Minor refactoring of decimalToString for int128_t. 2. Update unit tests for decimalToString. 3. Allow support for wide decimal in TupleConstantStep::fillInConstants().	2020-11-18 13:47:44 +00:00
Gagan Goel	824615a55b	MCOL-641 Refactor empty value implementation in writeengine.	2020-11-18 13:47:44 +00:00
Roman Nozdrin	97ee1609b2	MCOL-641 Replaced NULL binary constants. DataConvert::decimalToString, toString, writeIntPart, writeFractionalPart are not templates anymore.	2020-11-18 13:47:44 +00:00
Gagan Goel	b07db9a8f4	MCOL-641 Basic support for updates.	2020-11-18 13:47:01 +00:00
Gagan Goel	93170c3b31	MCOL-641 Basic support for multi-value inserts, and deletes.	2020-11-18 13:47:01 +00:00
drrtuy	0c67b6ab50	MCOL-641 atoi128 now correctly processes decimal point and - signs. There are multiple overloaded versions of the low-level DML write methods to push down the CSC column type. WE needs the type to convert values correctly. Replaced WE_INT128 with the CSC data type, which is more informative. Removed commented-out and obsolete code. Replaced switch-case blocks with one-liners.	2020-11-18 13:47:01 +00:00
Roman Nozdrin	c9f42fb5cc	MCOL-641 PoC version for DECIMAL(38) using BINARY as a basis.	2020-11-18 13:47:00 +00:00
Alexey Antipovsky	ede047f0fa	Fix warnings on CentOS7	2020-11-17 15:03:10 +03:00
Alexey Antipovsky	0e29b0b0f9	Fix -Wtype-limits	2020-11-17 15:03:10 +03:00
David Hall	60dfe38b63	MCOL-4116 Set precision for long double to 19	2020-07-08 10:26:36 -05:00
David Hall	e47eaae033	MCOL-4116 add long LONGDOUBLE support to update	2020-07-06 12:47:10 -05:00
Gagan Goel	2ba9263df4	Silence -Werror=implicit-fallthrough compiler errors - Patch from Monty. The patch also fixes some potential bugs due to missing break statements.	2020-06-26 12:32:57 -04:00
Andrew Hutchings	5e4f1b9933	Merge branch 'develop' into MCOL-265	2019-06-10 13:58:03 +01:00
Andrew Hutchings	020b211bb7	Merge branch 'develop-1.2' into develop-merge-up-20190514	2019-05-14 13:58:33 +01:00
Andrew Hutchings	067b1bd3d0	Merge pull request #752 from mariadb-corporation/MCOL-1495 MCOL-1495 DML operations created two fCatalogMap entries per session.	2019-05-09 09:44:05 +01:00
gscteam	b2bf6dece5	missing semicol	2019-05-01 13:38:02 -05:00
Patrice Linel	9e7d852804	unneeded k for loop	2019-05-01 10:29:57 -05:00
Roman Nozdrin	a8adef8820	MCOL-1495 DML operations created two fCatalogMap entries per session. These entries were never deleted so WE leaks about 7MB per 100 DML sessions. I add purge operations in the very end of four functions involved in DML.	2019-05-01 18:03:32 +03:00
Gagan Goel	e89d1ac3cf	MCOL-265 Add support for TIMESTAMP data type	2019-04-23 00:00:09 -04:00
Andrew Hutchings	a955b56f4d	Merge branch 'develop-1.2' into develop-merge-up-20190218	2019-02-18 15:55:11 +00:00
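
The MCOL-4580 entry above describes encoding string prefixes as integers to compute per-extent ranges. A minimal C++ sketch of that idea follows, assuming plain byte-wise (binary) ordering rather than a real collation. The names `encodePrefix`, `ExtentRange`, `computeRange` and `canSkipForEquality` are hypothetical and used for illustration only; they are not ColumnStore APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: pack the first 8 bytes of a string into a big-endian
// unsigned integer. Bytes are treated as unsigned, so a <= b (byte-wise)
// implies encodePrefix(a) <= encodePrefix(b). Short strings are zero-padded.
inline uint64_t encodePrefix(const std::string& s)
{
  uint64_t v = 0;
  for (std::size_t i = 0; i < 8; ++i)
  {
    v <<= 8;
    if (i < s.size())
      v |= static_cast<unsigned char>(s[i]);
  }
  return v;
}

// Hypothetical per-extent summary: min and max encoded prefix seen.
struct ExtentRange
{
  uint64_t min;
  uint64_t max;
};

// Compute the prefix range over all values stored in one extent.
inline ExtentRange computeRange(const std::vector<std::string>& values)
{
  ExtentRange r{UINT64_MAX, 0};
  for (const auto& s : values)
  {
    const uint64_t p = encodePrefix(s);
    r.min = std::min(r.min, p);
    r.max = std::max(r.max, p);
  }
  return r;
}

// For a predicate `col = key`, the extent cannot contain a match when the
// key's encoded prefix lies outside [min, max], so it can be skipped.
inline bool canSkipForEquality(const ExtentRange& r, const std::string& key)
{
  const uint64_t p = encodePrefix(key);
  return p < r.min || p > r.max;
}
```

Because equal strings always produce equal prefixes, a key outside the range can never match, so skipping is safe in this sketch. A key inside the range may still be absent, in which case the extent has to be scanned. A collation-aware version would need to encode collation sort keys instead of raw bytes, which this sketch does not attempt.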


75 Commits