1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-04-18 21:44:02 +03:00

23 Commits

Author SHA1 Message Date
Sergey Zefirov
b53c231ca6 MCOL-271 empty strings should not be NULLs (#2794)
This patch improves handling of NULLs in textual fields in ColumnStore.
Previously empty strings were considered NULLs and it could be a problem
if data scheme allows for empty strings. It was also one of major
reasons of behavior difference between ColumnStore and other engines in
MariaDB family.

Also, this patch fixes some other bugs and incorrect behavior, for
example, incorrect comparison for "column <= ''" which evaluates to
constant True for all purposes before this patch.
2023-03-30 21:18:29 +03:00
NTH19
df7c967d54 when eq Filtercount <6 ,the speed of for loop is faster than hashmap
add threshold for eqFilter
2022-08-23 18:45:36 +08:00
David.Hall
bbb168a846
Mcol 4560 (#2337)
* MCOL-4560 remove unused xml entries and code that references it.
There is reader code and variables for some of these settings, but nobody uses them.
2022-04-18 18:00:17 -04:00
Serguey Zefirov
53b9a2a0f9 MCOL-4580 extent elimination for dictionary-based text/varchar types
The idea is relatively simple - encode prefixes of collated strings as
integers and use them to compute extents' ranges. Then we can eliminate
extents with strings.

The actual patch does have all the code there but miss one important
step: we do not keep collation index, we keep charset index. Because of
this, some of the tests in the bugfix suite fail and thus main
functionality is turned off.

The reason of this patch to be put into PR at all is that it contains
changes that made CHAR/VARCHAR columns unsigned. This change is needed in
vectorization work.
2022-03-02 23:53:39 +03:00
Leonid Fedorov
04752ec546 clang format apply 2022-01-21 16:43:49 +00:00
Roman Nozdrin
05897948e4 MCOL-4899 MCS now applies a correct collation running IN for character data types 2022-01-05 12:00:01 +00:00
Roman Nozdrin
af36f9940f This patch introduces support for scanning/filtering vectorized execution for numeric-based
data types TEXT, CHAR, VARCHAR, FLOAT and DOUBLE are not yet supported by vectorized path
This patch introduces an example for Google benchmarking suite to measure a perf diff
b/w legacy scan/filtering code and the templated version
2021-12-10 10:30:00 +00:00
Alexander Barkov
c16b0f6ad7
MCOL-4823 WHERE char_col<varchar_col returns a wrong result of a large table (#2060)
SCommand StrFilterCmd::duplicate() missed these two lines:

    filterCmd->leftColType = leftColType;
    filterCmd->rightColType = rightColType;

which exist in the parent's FilterCommand::duplicate().

Rewriting the code to avoid duplication by using more inherited
methods/constructors. This reduces the probability of similar bugs
in the future.
2021-08-03 11:53:05 +03:00
Alexander Barkov
9794f24369 MCOL-4801 Replace Row methods getStringLength() and getStringPointer() to getConstString() 2021-07-06 21:15:32 +04:00
Alexander Barkov
765858bc5b MCOL-4498 LIKE is not collation aware 2021-03-22 20:42:01 +04:00
Alexander Barkov
0ff6a6ec20 Part#1 MCOL-495 Make string comparison not case sensitive
Fixing field='str' for long (Dict) string data types.
2020-12-04 07:49:00 +04:00
David Hall
236b92d706 MCOL-3536 Collation 2020-06-08 09:00:48 -05:00
David Hall
11ba12f6ea MCOL-3536 collation 2020-05-19 16:22:44 -05:00
David Hall
1f3d1e6fd6 MCOL-3536 collation 2020-05-14 16:02:49 -05:00
Andrew Hutchings
9390ee05fb Revert "MCOL-1559 Some string trailing blank stuff"
This reverts commit e5d76e142be7edcbbf0817f5bea594ddffc84973.
2019-05-23 13:49:08 +01:00
David Hall
e5d76e142b MCOL-1559 Some string trailing blank stuff 2019-03-28 15:25:49 -06:00
Andrew Hutchings
01446d1e22 Reformat all code to coding standard 2017-10-26 17:18:17 +01:00
Andrew Hutchings
785e6c91bd MCOL-670 Fix UPDATE with BLOB/TEXT
* Don't cache > 8000 bytes during update
* Fix PrimProc case where token is used more than once
2017-04-19 22:45:23 +01:00
Andrew Hutchings
e9db44424c MCOL-642 Separate TEXT from BLOB
* TEXT and BLOB now have separate identifiers internally
* TEXT columns are identified as such in system catalog
* cpimport only requires hex input for BLOB, not TEXT
2017-03-27 21:36:27 +01:00
Andrew Hutchings
b1d04c04fb MCOL-267 Fix LONGBLOB issues
* Set max column length to a little under 2.1GB in DDL
* Fix token edge case
* Re-write RowGroup string handling to take more than 64KB in one string
2017-03-21 17:22:31 +00:00
Andrew Hutchings
093aa377e5 MCOL-267 multi-block support for PrimProc and bulk
* Adds multi-block bulk write support
* Adds PrimProc multi-block read support
* Allows the functions length() and hex() to work with BLOB columns
2017-03-20 18:32:24 +00:00
Andrew Hutchings
aea729fe7d MCOL-267 DML support
* DML writes for multi-block dictionary (blob) now works
* PrimProc fixed so that the first block in multi-block is read
correctly
* Performance optimisation (removed string copy into stack) for new
dictionary entries
2017-03-18 14:31:29 +00:00
david hill
f6afc42dd0 the begginning 2016-01-06 14:08:59 -06:00