The idea is relatively simple - encode prefixes of collated strings as
integers and use them to compute extents' ranges. Then we can eliminate
extents with strings.
The actual patch does have all the code there but miss one important
step: we do not keep collation index, we keep charset index. Because of
this, some of the tests in the bugfix suite fail and thus main
functionality is turned off.
The reason of this patch to be put into PR at all is that it contains
changes that made CHAR/VARCHAR columns unsigned. This change is needed in
vectorization work.
* Fix clang warnings
* Remove vim tab guides
* initialize variables
* 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length
* Fix ISO C++17 does not allow 'register' storage class specifier for outdated bison
* chars are unsigned on ARM, having if (ival < 0) always false
* chars are unsigned by default on ARM and comparison with -1 if always true
After creating and populating tables with CHAR(5) case insensitive columns,
in a set of consequent joins like:
select * from t1, t2 where t1.c1=t2.c1;
select * from t1, t2 where t1.c1=t2.c2;
select * from t1, t2 where t1.c2=t2.c1;
select * from t1, t2 where t1.c2=t2.c2;
only the first join worked reliably case insensitively.
Removing the remaining pieces of the code that used order_swap() to compare
short CHAR columns, and using Charset::strnncollsp() instead.
This fixes the issue.
Removed uint128 from joblist/lbidlist.*
Another toString() method for wide-decimal that is EMPTY/NULL aware
Unified decimal processing in WF functions
Fixed a potential issue in EqualCompData::operator() for
wide-decimal processing
Fixed some signedness warnings
For now it consists of only:
using int128_t = __int128;
using uint128_t = unsigned __int128;
All new privitive data types should go into this file in the future.
This commit also adds support in TupleHashJoinStep::forwardCPData,
although we currently do not support wide decimals as join keys.
Row estimation to determine large-side of the join is also updated.
For equality string matches other engines ignore trailing whitespace
(this does not apply to LIKE matches). So we should do the same. This
patch trims whitespace for MIN/MAX extent elimination checks, fixed
width columns and dictionary columns during equality matches against
constants (SELECT * FROM t1 WHERE b = 'ABC').
pDictionaryScan won't work for BLOB/TEXT since it requires searching the
data file and rebuilding the token from matches. The tokens can't be
rebuild correctly due the bits in the token used for block counts. This
patch forces the use of pDictionaryStep instead for WHERE conditions.
In addition this patch adds support for TEXT/BLOB in various parts of
the job step processing. This fixes things like error 202 during an
UPDATE with a join condition on TEXT/BLOB columns.