The idea is relatively simple - encode prefixes of collated strings as
integers and use them to compute extents' ranges. Then we can eliminate
extents with strings.
The actual patch does have all the code there but miss one important
step: we do not keep collation index, we keep charset index. Because of
this, some of the tests in the bugfix suite fail and thus main
functionality is turned off.
The reason of this patch to be put into PR at all is that it contains
changes that made CHAR/VARCHAR columns unsigned. This change is needed in
vectorization work.
After creating and populating tables with CHAR(5) case insensitive columns,
in a set of consequent joins like:
select * from t1, t2 where t1.c1=t2.c1;
select * from t1, t2 where t1.c1=t2.c2;
select * from t1, t2 where t1.c2=t2.c1;
select * from t1, t2 where t1.c2=t2.c2;
only the first join worked reliably case insensitively.
Removing the remaining pieces of the code that used order_swap() to compare
short CHAR columns, and using Charset::strnncollsp() instead.
This fixes the issue.
Removed uint128 from joblist/lbidlist.*
Another toString() method for wide-decimal that is EMPTY/NULL aware
Unified decimal processing in WF functions
Fixed a potential issue in EqualCompData::operator() for
wide-decimal processing
Fixed some signedness warnings
For now it consists of only:
using int128_t = __int128;
using uint128_t = unsigned __int128;
All new privitive data types should go into this file in the future.
This commit also adds support in TupleHashJoinStep::forwardCPData,
although we currently do not support wide decimals as join keys.
Row estimation to determine large-side of the join is also updated.