mariadb-columnstore-engine

mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-12-07 20:42:15 +03:00

Author	SHA1	Message	Date
Daniel Black	c01ab42e87	MCOL-5747 gcc-14.1.1 compile error - calloc - transposed args The arguments of calloc are the number of members and the sizeof the member. Gcc-14.1.1 worked out how to tell the difference. We correct this by transposing to gcc's will.	2024-05-16 16:53:52 +04:00
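The argument order described in this commit can be illustrated with a minimal sketch. The `Entry` type and `allocEntries` helper below are hypothetical and are not taken from the ColumnStore sources; the only fact relied on is that `calloc` takes the number of members first and the size of one member second. The diagnostic involved appears to be GCC 14's `-Wcalloc-transposed-args`, which becomes a compile error when warnings are treated as errors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical record type, used only for illustration.
struct Entry
{
  int id;
  double value;
};

// Allocates `count` zero-initialised entries; the caller releases them with std::free.
Entry* allocEntries(std::size_t count)
{
  // Transposed form that newer GCC flags: std::calloc(sizeof(Entry), count)
  // Expected form: number of members first, size of one member second.
  return static_cast<Entry*>(std::calloc(count, sizeof(Entry)));
}
```

Both orders request the same number of bytes, so the fix changes no behavior; it only matches the documented parameter order.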
Sergey Zefirov	69b8e1c779	feat(extent-elimination)!: re-enable extent-elimination for dictionary columns scanning This is "productization" of old code that would enable extent elimination for dictionary columns. This patch enables it, fixes performance degradation (the main problem with the old code) and also fixes incorrect behavior of cpimport.	2023-11-17 17:14:35 +03:00
Roman Nozdrin	4fe9cd64a3	Revert "No boost condition (#2822)" (#2828) This reverts commit `f916e64927`.	2023-04-22 15:49:50 +03:00
Leonid Fedorov	f916e64927	No boost condition (#2822) This patch replaces boost primitives with stdlib counterparts.	2023-04-22 00:42:45 +03:00
Leonid Fedorov	2e1394149b	MCOL-5464: Fixes of bugs from ASAN warnings, part one (#2792) * Fixes of bugs from ASAN warnings, part one * MQC as static library, with nifty counter for global map and mutex * Switch clang to 16 * link messageqcpp to execplan	2023-04-04 02:33:23 +03:00
Sergey Zefirov	b53c231ca6	MCOL-271 empty strings should not be NULLs (#2794) This patch improves handling of NULLs in textual fields in ColumnStore. Previously empty strings were considered NULLs, which could be a problem if the data schema allows empty strings. It was also one of the major reasons for behavior differences between ColumnStore and other engines in the MariaDB family. This patch also fixes some other bugs and incorrect behavior, for example, the incorrect comparison "column <= ''", which evaluated to constant True for all purposes before this patch.	2023-03-30 21:18:29 +03:00
Leonid Fedorov	56f2346083	Remove windows ifdefs	2023-03-02 15:59:42 +00:00
Gagan Goel	c8b6b154bf	MCOL-5021 Add an option in Columnstore.xml, fastdelete (disabled by default), which when enabled, indiscriminately invalidates all column extents and performs the actual DELETE only on the AUX column. The trade-off with this approach would now be that the first SELECT for certain query patterns (those containing a WHERE predicate) after the DELETE operation will slow down as the invalidated column extent would need to be scanned again to set the min/max values.	2022-08-05 14:40:49 -04:00
Gagan Goel	60eb0f86ec	MCOL-5021 non-AUX column files are opened in read-only mode during the DELETE operation. ColumnOp::readBlock() calls can cause writes to database files when the active chunk list in ChunkManager is full. Since non-AUX columns are read-only for the DELETE operation, we prevent writes of compressed chunks and header for these columns by passing an isReadOnly flag to CompFileData which indicates whether the column is read-only or read-write.	2022-08-05 14:40:49 -04:00
Gagan Goel	35a3a93964	MCOL-5021 For the DELETE operation, empty magic values are only written to database files for AUX column. Perform read-only operation for other columns in the table to update the Casual Partitioning information.	2022-08-05 14:40:49 -04:00
Gagan Goel	86df9a972c	MCOL-5021 Add prototype support for the AUX column in CREATE/DROP DDL commands, single and multi-value INSERTs, cpimport, and DELETE.	2022-08-05 14:40:49 -04:00
Serguey Zefirov	53b9a2a0f9	MCOL-4580 extent elimination for dictionary-based text/varchar types The idea is relatively simple - encode prefixes of collated strings as integers and use them to compute extents' ranges. Then we can eliminate extents with strings. The actual patch does have all the code there but misses one important step: we do not keep the collation index, we keep the charset index. Because of this, some of the tests in the bugfix suite fail and thus the main functionality is turned off. The reason this patch is put into a PR at all is that it contains changes that made CHAR/VARCHAR columns unsigned. This change is needed in the vectorization work.	2022-03-02 23:53:39 +03:00
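The prefix-encoding idea in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch's code: `encodePrefix`, `ExtentRange`, `computeRange` and `canSkipForEquality` are hypothetical names, the prefix length of 8 bytes is an arbitrary choice, and plain byte-wise comparison stands in for a real collation (which, as the message notes, is the part that matters for correctness).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Packs the first 8 bytes of a string into an integer, most significant byte first,
// so that integer order agrees with byte-wise order of the prefixes.
// Assumption: binary comparison; a real collation needs collation-aware weights.
uint64_t encodePrefix(const std::string& s)
{
  uint64_t key = 0;
  for (std::size_t i = 0; i < 8; ++i)
  {
    const uint64_t byte = i < s.size() ? static_cast<unsigned char>(s[i]) : 0;
    key = (key << 8) | byte;
  }
  return key;
}

// Hypothetical per-extent range of encoded prefixes.
struct ExtentRange
{
  uint64_t min;
  uint64_t max;
};

// Computes the encoded min/max over all strings stored in one extent.
ExtentRange computeRange(const std::vector<std::string>& values)
{
  assert(!values.empty());
  ExtentRange r{encodePrefix(values[0]), encodePrefix(values[0])};
  for (const auto& v : values)
  {
    const uint64_t k = encodePrefix(v);
    r.min = std::min(r.min, k);
    r.max = std::max(r.max, k);
  }
  return r;
}

// True when no string in the extent can equal `needle`, so the extent need not be scanned.
bool canSkipForEquality(const ExtentRange& r, const std::string& needle)
{
  const uint64_t k = encodePrefix(needle);
  return k < r.min || k > r.max;
}
```

A needle whose encoded prefix falls inside the range still requires a scan, since the range only proves absence, never presence.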
Leonid Fedorov	3919c541ac	New warnfixes (#2254) * Fix clang warnings * Remove vim tab guides * initialize variables * 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length * Fix ISO C++17 does not allow 'register' storage class specifier for outdated bison * chars are unsigned on ARM, having if (ival < 0) always false * chars are unsigned by default on ARM and comparison with -1 is always true	2022-02-17 13:08:58 +03:00
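The char-signedness items in this list can be illustrated with a small sketch. The helper names are hypothetical; the point is only that plain `char` is unsigned by default on ARM targets, so a check such as `c < 0` is always false there unless the value is first converted to `signed char`.

```cpp
#include <cassert>

// Reads a byte as a signed quantity regardless of whether plain `char`
// is signed (typical on x86) or unsigned (typical on ARM).
int asSignedByte(char c)
{
  return static_cast<signed char>(c);
}

// Portable negative check: comparing a plain `char` with 0 directly
// would never be true where `char` is unsigned.
bool isNegativeByte(char c)
{
  return static_cast<signed char>(c) < 0;
}
```

Declaring the variable as `signed char` or `int` in the first place is usually the cleaner fix; the casts above are only the smallest change that makes an existing check portable.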
Leonid Fedorov	04752ec546	clang format apply	2022-01-21 16:43:49 +00:00
Roman Nozdrin	af36f9940f	This patch introduces support for scanning/filtering vectorized execution for numeric-based data types. TEXT, CHAR, VARCHAR, FLOAT and DOUBLE are not yet supported by the vectorized path. This patch also introduces an example for the Google benchmarking suite to measure the perf difference between the legacy scan/filtering code and the templated version.	2021-12-10 10:30:00 +00:00
Sergey Zefirov	6eaee180f3	MCOL-4779 Keep correct ranges during DML for short char columns	2021-07-12 14:07:39 +03:00
Roman Nozdrin	866dc25729	Merge pull request #1842 from denis0x0D/MCOL-987_LZ MCOL-987 LZ4 compression support.	2021-07-07 13:13:18 +03:00
Denis Khalikov	cc1c3629c5	MCOL-987 Add LZ4 compression. * Adds CompressInterfaceLZ4 which uses LZ4 API for compress/uncompress. * Adds CMake machinery to search LZ4 on running host. * All methods which use static data and do not modify any internal data - become `static`, so we can use them without creation of the specific object. This is possible, because the header specification has not been modified. We still use 2 sections in header, first one with file meta data, the second one with pointers for compressed chunks. * Methods `compress`, `uncompress`, `maxCompressedSize`, `getUncompressedSize` - become pure virtual, so we can override them for the other compression algos. * Adds method `getChunkMagicNumber`, so we can verify chunk magic number for each compression algo. * Renames "s/IDBCompressInterface/CompressInterface/g" according to requirement.	2021-07-06 18:04:37 +03:00
Gagan Goel	8520f87237	MCOL-641 Cleanup.	2021-07-06 09:01:49 +00:00
Sergey Zefirov	1143e89fd0	PR fixes	2021-04-05 20:34:45 +03:00
Sergey Zefirov	4545a86a80	Porting old MCOL-2044-update...: new interface for writeColRecs Progress keep and test commit Progress keep and test commit Progress keep and test commit Again, trying to pinpoint problematic part of a change Revert "Again, trying to pinpoint problematic part of a change" This reverts commit 71874e7c0d7e4eeed0c201b12d306b583c07b9e2. Revert "Progress keep and test commit" This reverts commit 63c7bc67ae55bdb81433ca58bbd239d6171a1031. Revert "Progress keep and test commit" This reverts commit 121c09febd78dacd37158caeab9ac70f65b493df. Small steps - I walk minefield here Propagating changes - now CPInfo in convertValArray Progress keep commit Restoring old functionality Progress keep commit Small steps to avoid/better locate old problem with the write engine. Progress keeping commit Thread the CPInfo up to convertValArray call in writeColumnRec About to test changes - I should get no regression and no updates in ranges either. Testing out why I get a regression Investigating source of regression Debugging prints Fix compile error Debugging print - debug regression I clearly see calls to writeColumnRec and prints there added to discern between these. 
Fix warning error Possible culprit Add forgotten default parameter for convertValArray New logic to test Max/min gets updated during value conversion To test results of updates Debug logs Debug logs An attempt to provide proper sequence index Debug logs An attempt to provide proper sequence index - now magic for resetting Debug logs Debug logs Debug logs Trying to perform correct updates Trying to perform correct updates - seqNum woes fight COMMIT after INSERT performs 'mark extent as invalid' operation - investigating To test: cut setting of CPInfo upon commit from DML processor It may be superfluous as write engine does that too Debug logs Debug logs Better interface for CPMaxMin Old interface forgot to set isBinaryColumn field Possible fix for the problems I forgot to reassign the value in cpinfoList Debug logs Computation of 'binary' column property logs indicated that it was not set in getExtentCPMaxMin, and it was impossible to compute there so I had to move that into writeengine. To test: code to allow cross-extent insertion To test: removed another assertion for probable cause of errors Debug logs Dropped excessive logs Better reset code Again, trying to fix ordering Fixing order of rowids for LBID computation Debug logs Remove update of second LBID in split insert I have to validate incorrect behaviour for this test Restoring the case where everything almost worked Tracking changes in newly created extents Progress keeping commit Fixing build errors with recent server An ability to get old values from blocks we update Progress keeping commit Adding analysis of old values to write engine code. It is needed for updates and deletes. Progress keeping commit Moving max/min range update from convertValArray into separate function with simpler logic. 
To test and debug - logic is there Fix build errors Update logic to debug There is a suspicious write engine method updateColumnRecs which receives a vector of column types but does not iterate over them (otherwise it will be identical to updateColumnRec in logic). Other than that, the updateColumnRec looks like the center of all updates - deleteRow calls it, for example, dml processor also calls it. Debug logs for insert bookkeeping regression Set up operation type in externally-callable interface Internal operations depend on the operation type and consistency is what matters there. Debug logs Fix for extent range update failure during update operation Fix build error Debug logs Fix for update on deletion I am not completely sure in it - to debug. Debug log writeColumnRec cannot set m_opType to UPDATE unconditionally It is called from deleteRow Better diagnostics Debug logs Fixed search condition Debug logs Debugging invalid LBID appearance Debug logs - fixed condition Fix problems with std::vector reallocation during growth Fix growing std::vector data dangling access error Still fixing indexing errors Make in-range update to work Correct sequence numbers Debug logs Debug logs Remove range drop from DML part of write engine A hack to test the culprit of range non-keeping Tests - no results for now MTR-style comments Empty test results To be filled with actual results. Special database and result selects for all tests Pleasing MTR with better folder name Pleasing MTR - testing test result comparison Pleasing MTR by disabling warnings All test results Cleaning up result files Reset ranges before update Remove comments from results - point of failure in MTR Remove empty line from result - another MTR failure point Probably fix for deletes Possible fix for remaining failed delete test Fix a bug in writeRows It should not affect delete-with-range test case, yet it is a bug. 
Debug logs Debug logs Tests reorganization and description Support for unsigned integer for new tests Fix type omission Fix test failure due to warnings on clean installation Support for bigint to test Fix for failed signed bigint test Set proper unsignedness flag Removed that assignment during refactoring. Tests for types with column width 1 and 2 Support for types in new tests Remove trailing empty lines from results Tests had failed because of extra empty lines. Remove debug logs Update README with info about new tests Move tests for easier testing Add task tag to tests Fix invalid unsigned range check Fix for signed types Fix regressions - progress keeping commit Do not set invalid ranges into valid state A possible fix for mcs81_self_join test MCOL 2044 test database cleanup Missing expected results Delete extraneous assignment to m_opType nullptr instead of NULL Refactor extended CPInfo with TypeHandler Better handling of ranges - safer types, less copy-paste Fix logic error related to typo Fix logic error related to typo Trying to figure out why invalid ranges aren't displayed as NULL..NULL Debug logs Debug logs Debug logs Debug logs for worker node Debug logs for worker node in extent map Debugging virtual table fill operation Debugging virtual table fill operation Fix for invalid range computation Remove debug logs Change handling of invalid ranges They are also set, but to invalid state. Complete change Fix typo Remove unused code "Fix" for tests - -1..0 instead of NULL..NULL for invalid unsigned ranges Not a good change, yet I cannot do better for now. 
MTR output requires tabs instead of spaces Debug logs Debug logs Debug logs - fix build Debug logs and logic error fix Fix for clearly incorrect firstLBID in CPInfo being set - to test Fix for system catalog operations support Better interface to fix build errors Delete tests we cannot satisfy due to extent rescan due to WHERE Tests for wide decimals Testing support for wide decimals Fix for wide decimals tests Fix for delete within range Memory leak fix and, possibly, double free fix Dispatch on CalpontSystemCatalog::ColDataType is more robust Add support for forgotten MEDINT type Add forgotten BIGINT empty() instead of size() > 0 Better layout Remove confusing comment Sensible names for special values of seqNum field Tests for wide decimal support Addressing concerns of drrtuy Remove test we cannot satisfy Final touches for PR Remove unused result file	2021-04-05 14:18:22 +03:00
Roman Nozdrin	a9b3957182	MCOL-4519 RefColumn now uses the correct empty value iterating over the block values	2021-01-29 15:29:39 +00:00
Roman Nozdrin	5fce19df0a	MCOL-4412 Introduce TypeHandler::getEmptyValueForType to return const ptr for an empty value WE changes for SQL DML and DDL operations Changes for bulk operations Changes for scanning operations Cleanup	2021-01-18 12:30:17 +00:00
Roman Nozdrin	c399249b1e	MCOL-4468 Add forgotten cast from long saved as boost::any into int64_t for WR_LONGLONG	2020-12-23 13:44:53 +00:00
Sergey Zefirov	2bfe9b6c19	Refactor better extent info bookkeeping structure and handling Logs for research purposes Keep progress - may not build Good interface to collect LBIDs and CPInfo's Write Engine compiles with new interface New interface breaks things the least way and allows for new features to be added gradually. Still ironing design - rewriting parts of WE Keep progress commit Write Engine compiles, going to test I could introduce crashes there. Let's see. Disable logging for tests Fixing build problems - keep progress commit Changes related to new interface Add back accidentally removed m_txnLBIDMap.find Remove printf/cout; up-to-date comment for AddLBIDtoList Add "auto" type annotation Work on PR comments Descriptive vector emptiness check	2020-12-07 13:12:36 +03:00
Jose Rojas	d1908f7a0f	MCOL-2055. Only flush the oids from the primproc cache	2020-12-02 18:58:02 +00:00
Jose Rojas	69550cbe78	MCOL-2055 Fix. Flush PrimProc Cache during batchinserts	2020-12-02 17:10:20 +00:00
Gagan Goel	995cadef2d	MCOL-641 Fix alter table add wide decimal column. This patch also removes CalpontSystemCatalog::BINARY and ddlpackage::DDL_BINARY that were added during the initial stages of the work on MCOL-641.	2020-11-20 19:49:54 -05:00
Roman Nozdrin	58495d0d2f	MCOL-4387 Convert dataconvert::decimalToString() into VDecimal and TSInt128 methods	2020-11-18 13:53:16 +00:00
Alexander Barkov	129d5b5a0f	MCOL-4174 Review/refactor frontend/connector code	2020-11-18 13:53:15 +00:00
Roman Nozdrin	f7002e20b5	::writeRow now treats WR_BINARY as int128 for 16 bytes DT only WF avg uses const & as arguments types Removed BINARY from DDL parser	2020-11-18 13:52:20 +00:00
David Hall	af80081c94	MCOL-4171 Some fixes	2020-11-18 13:52:20 +00:00
David Hall	638202417f	MCOL-4171	2020-11-18 13:52:19 +00:00
Gagan Goel	d3bc68b02f	MCOL-641 Refactor initial extent elimination support. This commit also adds support in TupleHashJoinStep::forwardCPData, although we currently do not support wide decimals as join keys. Row estimation to determine large-side of the join is also updated.	2020-11-18 13:52:19 +00:00
Gagan Goel	74b64eb4f1	MCOL-641 1. Add support for int128_t in ParsedColumnFilter. 2. Set Decimal precision in SimpleColumn::evaluate(). 3. Add support for int128_t in ConstantColumn. 4. Set IDB_Decimal::s128Value in buildDecimalColumn(). 5. Use width 16 as first if predicate for branching based on decimal width.	2020-11-18 13:47:45 +00:00
Gagan Goel	62d0c82d75	MCOL-641 1. Templatized convertValueNum() function. 2. Allocate int128_t buffers in batchprimitiveprocessor if a query involves wide decimal columns.	2020-11-18 13:47:44 +00:00
Gagan Goel	9b714274db	MCOL-641 1. Minor refactoring of decimalToString for int128_t. 2. Update unit tests for decimalToString. 3. Allow support for wide decimal in TupleConstantStep::fillInConstants().	2020-11-18 13:47:44 +00:00
Gagan Goel	824615a55b	MCOL-641 Refactor empty value implementation in writeengine.	2020-11-18 13:47:44 +00:00
Roman Nozdrin	97ee1609b2	MCOL-641 Replaced NULL binary constants. DataConvert::decimalToString, toString, writeIntPart, writeFractionalPart are not templates anymore.	2020-11-18 13:47:44 +00:00
Gagan Goel	b07db9a8f4	MCOL-641 Basic support for updates.	2020-11-18 13:47:01 +00:00
Gagan Goel	93170c3b31	MCOL-641 Basic support for multi-value inserts, and deletes.	2020-11-18 13:47:01 +00:00
Gagan Goel	55afcd8890	MCOL-641 Basic extent elimination support for Decimal38.	2020-11-18 13:47:01 +00:00
drrtuy	0c67b6ab50	MCOL-641 atoi128 now correctly processes decimal point and - signs. There are multiple overloaded versions of the low level DML write methods to push down CSC column type. WE needs the type to convert values correctly. Replaced WE_INT128 with CSC data type that is more informative. Removed commented and obsolete code. Replaced switch-case blocks with oneliners.	2020-11-18 13:47:01 +00:00
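The kind of parsing this commit describes (a 128-bit conversion that handles a sign and a decimal point) can be sketched as below. This is not the repository's `atoi128`; `parseDecimal128` is a hypothetical helper with an assumed signature, it relies on the `__int128` extension available in GCC and Clang, and it performs no overflow or malformed-input checks.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

using int128 = __int128;  // compiler extension (GCC/Clang), assumed available

// Hypothetical helper: parses "[+|-]digits[.digits]" into an unscaled 128-bit
// integer and reports the number of fractional digits in `scale`.
// For example, "-12.34" yields the value -1234 with scale 2.
int128 parseDecimal128(const std::string& text, int& scale)
{
  int128 value = 0;
  bool negative = false;
  bool seenPoint = false;
  scale = 0;

  std::size_t i = 0;
  if (i < text.size() && (text[i] == '-' || text[i] == '+'))
  {
    negative = text[i] == '-';
    ++i;
  }

  for (; i < text.size(); ++i)
  {
    const char c = text[i];
    if (c == '.' && !seenPoint)
    {
      seenPoint = true;  // the point itself contributes no digit
      continue;
    }
    if (c < '0' || c > '9')
      break;  // stop at the first unexpected character
    value = value * 10 + (c - '0');
    if (seenPoint)
      ++scale;
  }
  return negative ? -value : value;
}
```

Returning the unscaled integer together with a scale mirrors how fixed-point decimals are commonly stored; whether the engine represents wide decimals exactly this way is not stated in the commit message.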
Roman Nozdrin	63dcaa387f	MCOL-641 Simple INSERT with one record works with this commit.	2020-11-18 13:47:00 +00:00
Roman Nozdrin	df65543dd4	MCOL-641 This commit contains fixes for the rebase that mostly adds WE_BINARY and WE_INT128 into switch-case blocks.	2020-11-18 13:47:00 +00:00
Roman Nozdrin	c9f42fb5cc	MCOL-641 PoC version for DECIMAL(38) using BINARY as a basis.	2020-11-18 13:47:00 +00:00
Gagan Goel	32f6167067	MCOL-641 Work of Ivan Zuniga on basic read and write support for Binary16	2020-11-18 13:47:00 +00:00
Gagan Goel	d1ada75395	MCOL-270 Add support for MEDIUMINT data type	2018-12-30 19:13:16 -05:00
Andrew Hutchings	e4ee1095de	Merge branch 'develop-1.1' into 1.1-merge-up-2018-12-20	2018-12-20 20:37:24 +00:00
Roman Nozdrin	d807aaee0a	MCOL-1347 ALTER TABLE ADD COLUMN now creates a column with correct width for varchar columns.	2018-12-10 10:11:11 -08:00

78 Commits