1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-04-20 09:07:44 +03:00

75 Commits

Author SHA1 Message Date
Serguey Zefirov
5780bfa15f Fix MCOL-5035, a difference in INSERT and UPDATE behavior
The UPDATE statement wrote NULL when the column set is DATETIME and
value is '0000-00-00 00:00:00'. The problem was inside WriteEngine's
handling of UPDATE statements and this is where heart of change is.

Other changes are related to some obsolete data structures in DML/DDL
handling that just hanging around there, doing nothing.
2024-03-19 12:44:22 +03:00
Sergey Zefirov
b53c231ca6 MCOL-271 empty strings should not be NULLs (#2794)
This patch improves handling of NULLs in textual fields in ColumnStore.
Previously empty strings were considered NULLs and it could be a problem
if data scheme allows for empty strings. It was also one of major
reasons of behavior difference between ColumnStore and other engines in
MariaDB family.

Also, this patch fixes some other bugs and incorrect behavior, for
example, incorrect comparison for "column <= ''" which evaluates to
constant True for all purposes before this patch.
2023-03-30 21:18:29 +03:00
Gagan Goel
006b92bba2 Revert "This commit fixes an incorrect predicate in the if condition (#2608)"
This reverts commit f4e3022fbdecc25a22ae6eaf072e462aa6695f35.

The commit apparently caused MCOL-5318 and MCOL-5319 which involve the
internal ColumnStore batch insert mechanism passing through the SQL
layer. The code block involved in this change is a predicate checking
for the HWM extent in WriteEngineServer at the end of the batch insert.
This is done in WE_DMLCommandProc::processBatchInsertHwm(). The original
predicate check in this function for the HWM extent is restored until
further investigation.
2023-02-02 08:07:18 -05:00
Gagan Goel
ad59ed5402 MCOL-5367 Fix a bug introduced in MCOL-5021 (AUX column implementation).
In the implementation of MCOL-5021, an assert was added in
`WE_DMLCommandProc::processBatchInsertHwm()` that assumed the
`WriteEngine::TableMetaData` cache is uniform across the cluster.
However, this assumption is incorrect.

This bug caused undefined behaviour in ColumnStore resulting in bugs
such as MCOL-5367. In MCOL-5367, in a multi-node ColumnStore cluster,
an INSERT ... SELECT in a transaction with system variable
`columnstore_use_import_for_batchinsert=OFF/ON` did not show inserted
records when a SELECT query was issued. Assuming a 3-node cluster setup,
DMLProc only sends a given batch of records to be inserted to one of the
3 nodes, and not all nodes. As a result, the `WriteEngine::TableMetaData`
cache is only populated for that one node and is not uniform across the
cluster, causing the assert to fail.

As a fix, we simply remove this assert as it is redundant and should not
have been added in the first place.
2023-01-16 05:54:44 -05:00
Denis Khalikov
d61780cab1 MCOL-5263 Add support to ROLLBACK when PP were restarted.
DMLProc starts ROLLBACK when SELECT part of UPDATE fails b/c EM facility in PP were restarted.
Unfortunately this ROLLBACK stuck if EM/PP are not yet available.
DMLProc must have a t/o with re-try doing ROLLBACK.
2022-12-13 16:18:53 +03:00
Gagan Goel
f4e3022fbd
This commit fixes an incorrect predicate in the if condition (#2608)
that checks for HWM extent in WE_DMLCommandProc::processBatchInsertHwm().
2022-11-08 14:51:42 -06:00
Leonid Fedorov
d2432f9bf6 get rid of pointers for 128 fields 2022-08-26 15:12:22 +00:00
mariadb-AndreyPiskunov
0863ecd279 Replace getBinaryField 2022-08-25 18:21:43 +03:00
Gagan Goel
cbfdae3481 MCOL-5021 Code changes based on review feedback. 2022-08-05 14:40:50 -04:00
Gagan Goel
1355237ca3 MCOL-5021 Some minor fixes. 2022-08-05 14:40:50 -04:00
Gagan Goel
9b6d3c3870 MCOL-5021 Add support for AUX column in the client code calling
CalpontSystemCatalog::columnRIDs().
2022-08-05 14:40:49 -04:00
Gagan Goel
262cd5c501 MCOL-5021 Remove hard-coded values for data type, column width
and compression type for the AUX column, and replace them with
constants defined in the execplan namespace.
2022-08-05 14:40:49 -04:00
Gagan Goel
86df9a972c MCOL-5021 Add prototype support for the AUX column in CREATE/DROP
DDL commands, single and multi-value INSERTs, cpimport, and
DELETE.
2022-08-05 14:40:49 -04:00
Serguey Zefirov
53b9a2a0f9 MCOL-4580 extent elimination for dictionary-based text/varchar types
The idea is relatively simple - encode prefixes of collated strings as
integers and use them to compute extents' ranges. Then we can eliminate
extents with strings.

The actual patch does have all the code there but miss one important
step: we do not keep collation index, we keep charset index. Because of
this, some of the tests in the bugfix suite fail and thus main
functionality is turned off.

The reason of this patch to be put into PR at all is that it contains
changes that made CHAR/VARCHAR columns unsigned. This change is needed in
vectorization work.
2022-03-02 23:53:39 +03:00
Gagan Goel
973e5024d8 MCOL-4957 Fix performance slowdown for processing TIMESTAMP columns.
Part 1:
 As part of MCOL-3776 to address synchronization issue while accessing
 the fTimeZone member of the Func class, mutex locks were added to the
 accessor and mutator methods. However, this slows down processing
 of TIMESTAMP columns in PrimProc significantly as all threads across
 all concurrently running queries would serialize on the mutex. This
 is because PrimProc only has a single global object for the functor
 class (class derived from Func in utils/funcexp/functor.h) for a given
 function name. To fix this problem:

   (1) We remove the fTimeZone as a member of the Func derived classes
   (hence removing the mutexes) and instead use the fOperationType
   member of the FunctionColumn class to propagate the timezone values
   down to the individual functor processing functions such as
   FunctionColumn::getStrVal(), FunctionColumn::getIntVal(), etc.

   (2) To achieve (1), a timezone member is added to the
   execplan::CalpontSystemCatalog::ColType class.

Part 2:
 Several functors in the Funcexp code call dataconvert::gmtSecToMySQLTime()
 and dataconvert::mySQLTimeToGmtSec() functions for conversion between seconds
 since unix epoch and broken-down representation. These functions in turn call
 the C library function localtime_r() which currently has a known bug of holding
 a global lock via a call to __tz_convert. This significantly reduces performance
 in multi-threaded applications where multiple threads concurrently call
 localtime_r(). More details on the bug:
   https://sourceware.org/bugzilla/show_bug.cgi?id=16145

 This bug in localtime_r() caused processing of the Functors in PrimProc to
 slowdown significantly since a query execution causes Functors code to be
 processed in a multi-threaded manner.

 As a fix, we remove the calls to localtime_r() from gmtSecToMySQLTime()
 and mySQLTimeToGmtSec() by performing the timezone-to-offset conversion
 (done in dataconvert::timeZoneToOffset()) during the execution plan
 creation in the plugin. Note that localtime_r() is only called when the
 time_zone system variable is set to "SYSTEM".

 This fix also required changing the timezone type from a std::string to
 a long across the system.
2022-02-14 14:12:27 -05:00
Leonid Fedorov
04752ec546 clang format apply 2022-01-21 16:43:49 +00:00
Roman Nozdrin
af36f9940f This patch introduces support for scanning/filtering vectorized execution for numeric-based
data types TEXT, CHAR, VARCHAR, FLOAT and DOUBLE are not yet supported by vectorized path
This patch introduces an example for Google benchmarking suite to measure a perf diff
b/w legacy scan/filtering code and the templated version
2021-12-10 10:30:00 +00:00
Leonid Fedorov
5c5f103f98
MCOL-4839: Fix clang build (#2100)
* Fix clang build

* Extern C returned to plugin_instance

Co-authored-by: Leonid Fedorov <l.fedorov@mail.corp.ru>
2021-08-23 10:45:10 -05:00
Sergey Zefirov
9e0851e4cf MCOL-4766 ROLLBACK kept ranges changed inside rolled back transaction
Now ROLLBACK drops ranges to INVALID state which makes engine to rescan
blocks and discover correct ranges.
2021-07-07 18:16:56 +03:00
Gagan Goel
8520f87237 MCOL-641 Cleanup. 2021-07-06 09:01:49 +00:00
Denis Khalikov
606194e6e4 MCOL-4685: Eliminate some irrelevant settings (uncompressed data and extents per file).
This patch:
1. Removes the option to declare uncompressed columns (set columnstore_compression_type = 0).
2. Ignores [COMMENT '[compression=0] option at table or column level (no error messages, just disregard).
3. Removes the option to set more than 2 extents per file (ExtentsPreSegmentFile).
4. Updates rebuildEM tool to support up to 10 dictionary extent per dictionary segment file.
5. Adds check for `DBRootStorageType` for rebuildEM tool.
6. Renamed rebuildEM to mcsRebuildEM.
2021-06-03 14:44:33 +03:00
Roman Nozdrin
5fce19df0a MCOL-4412 Introduce TypeHandler::getEmptyValueForType to return const ptr for an empty value
WE changes for SQL DML and DDL operations

Changes for bulk operations

Changes for scanning operations

Cleanup
2021-01-18 12:30:17 +00:00
Sergey Zefirov
2bfe9b6c19 Refactor better extent info bookkeeping structure and handling
Logs for research purposes

Keep progress - may not build

Good interface to collect LBIDs and CPInfo's

Write Engine compiles with new interface

New interface breaks things the least way and allows for new features to be added gradually.

Still ironing design - rewriting parts of WE

Keep progress commit

Write Engine compiles, going to test

I could introduce crashes there. Let's see.

Disable logging for tests

Fixing build problems - keep progress commit

Changed related to new interface

Add back accidentally removed m_txnLBIDMap.find

Remove printf/cout; up-to-date comment for AddLBIDtoList

Add "auto" type annotation

Work on PR comments

Descriptive vector emptines check
2020-12-07 13:12:36 +03:00
Roman Nozdrin
494bde61e1 MCOL-4409 Moved static Decimal conversion methods into VDecimal class
MCOL-4409 This patch combines VDecimal and Decimal and makes
IDB_Decimal an alias for the result class

MCOL-4409 More boilerplate reduction in Func_mod

Removed couple TSInt128::toType() methods
2020-11-30 12:08:52 +00:00
Roman Nozdrin
58495d0d2f MCOL-4387 Convert dataconvert::decimalToString() into VDecimal and TSInt128 methods 2020-11-18 13:53:16 +00:00
Roman Nozdrin
15b1bfa709 Fix fallthrough compilation warnings 2020-11-18 13:53:15 +00:00
Roman Nozdrin
3eb26c0d4a MCOL-4313 Introduced TSInt128 that is a storage class for int128
Removed uint128 from joblist/lbidlist.*

Another toString() method for wide-decimal that is EMPTY/NULL aware

Unified decimal processing in WF functions

Fixed a potential issue in EqualCompData::operator() for
    wide-decimal processing

Fixed some signedness warnings
2020-11-18 13:53:15 +00:00
Alexander Barkov
129d5b5a0f MCOL-4174 Review/refactor frontend/connector code 2020-11-18 13:53:15 +00:00
Roman Nozdrin
1c3a34a3d0 Dataconvert::decimalToString badly fails w/o 20th member of mcs_pow_10 so I returned it
WF::percentile runtime threw an exception b/c of wrong DT deduced from its argument

Replaced literals with constants

Tought WF_sum_avg::checkSumLimit to use refs instead of values
2020-11-18 13:52:20 +00:00
Gagan Goel
74b64eb4f1 MCOL-641 1. Add support for int128_t in ParsedColumnFilter.
2. Set Decimal precision in SimpleColumn::evaluate().
3. Add support for int128_t in ConstantColumn.
4. Set IDB_Decimal::s128Value in buildDecimalColumn().
5. Use width 16 as first if predicate for branching based on decimal width.
2020-11-18 13:47:45 +00:00
Gagan Goel
9b714274db MCOL-641 1. Minor refactoring of decimalToString for int128_t.
2. Update unit tests for decimalToString.
3. Allow support for wide decimal in TupleConstantStep::fillInConstants().
2020-11-18 13:47:44 +00:00
Gagan Goel
824615a55b MCOL-641 Refactor empty value implementation in writeengine. 2020-11-18 13:47:44 +00:00
Roman Nozdrin
97ee1609b2 MCOL-641 Replaced NULL binary constants.
DataConvert::decimalToString, toString, writeIntPart, writeFractionalPart are not templates anymore.
2020-11-18 13:47:44 +00:00
Gagan Goel
b07db9a8f4 MCOL-641 Basic support for updates. 2020-11-18 13:47:01 +00:00
Gagan Goel
93170c3b31 MCOL-641 Basic support for multi-value inserts, and deletes. 2020-11-18 13:47:01 +00:00
drrtuy
0c67b6ab50 MCOL-641 atoi128 now correctly processes decimal point and - signs.
There are multiple overloaded version of the low level DML write methods to
push down CSC column type. WE needs the type to convert values correctly.

Replaced WE_INT128 with CSC data type that is more informative.

Removed commented and obsolete code.

Replaced switch-case blocks with oneliners.
2020-11-18 13:47:01 +00:00
Roman Nozdrin
c9f42fb5cc MCOL-641 PoC version for DECIMAL(38) using BINARY as a basis. 2020-11-18 13:47:00 +00:00
Alexey Antipovsky
ede047f0fa Fix warnings on CentOS7 2020-11-17 15:03:10 +03:00
Alexey Antipovsky
0e29b0b0f9 Fix -Wtype-limits 2020-11-17 15:03:10 +03:00
David Hall
60dfe38b63 MCOL-4116 Set precision for long double to 19 2020-07-08 10:26:36 -05:00
David Hall
e47eaae033 MCOL-4116 add long LONGDOUBLE support to update 2020-07-06 12:47:10 -05:00
Gagan Goel
2ba9263df4 Silence -Werror=implicit-fallthrough compiler errors - Patch from Monty.
The patch also fixes some potential bugs due to missing break
statements.
2020-06-26 12:32:57 -04:00
Andrew Hutchings
5e4f1b9933
Merge branch 'develop' into MCOL-265 2019-06-10 13:58:03 +01:00
Andrew Hutchings
020b211bb7 Merge branch 'develop-1.2' into develop-merge-up-20190514 2019-05-14 13:58:33 +01:00
Andrew Hutchings
067b1bd3d0
Merge pull request #752 from mariadb-corporation/MCOL-1495
MCOL-1495 DML operations created two fCatalogMap entries per session.
2019-05-09 09:44:05 +01:00
gscteam
b2bf6dece5
missing semicol 2019-05-01 13:38:02 -05:00
Patrice Linel
9e7d852804 unneeded k for loop 2019-05-01 10:29:57 -05:00
Roman Nozdrin
a8adef8820 MCOL-1495 DML operations created two fCatalogMap entries per session.
These entries were never deleted so WE leaks about 7MB per 100 DML
    sessions. I add purge operations in the very end of four functions
    involved in DML.
2019-05-01 18:03:32 +03:00
Gagan Goel
e89d1ac3cf MCOL-265 Add support for TIMESTAMP data type 2019-04-23 00:00:09 -04:00
Andrew Hutchings
a955b56f4d Merge branch 'develop-1.2' into develop-merge-up-20190218 2019-02-18 15:55:11 +00:00