This patch improves handling of NULLs in textual fields in ColumnStore.
Previously, empty strings were treated as NULLs, which was a problem
if the data schema allows empty strings. It was also one of the major
reasons for behavioral differences between ColumnStore and other engines
in the MariaDB family.
This patch also fixes some other bugs and incorrect behavior, for
example the comparison "column <= ''", which evaluated to constant
true before this patch.
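The NULL versus empty-string distinction can be sketched as follows. This is a minimal illustration, not ColumnStore code: the TextValue alias and the lessOrEqual helper are hypothetical names, and SQL three-valued logic is modelled with std::optional<bool>.

```cpp
#include <optional>
#include <string>

// Hypothetical model of a nullable text field: std::nullopt stands for
// SQL NULL, while an engaged empty string is a real, non-NULL value.
using TextValue = std::optional<std::string>;

// SQL-style "lhs <= rhs": the result is unknown (nullopt) if either side
// is NULL, otherwise it is the ordinary byte-wise string comparison.
std::optional<bool> lessOrEqual(const TextValue& lhs, const TextValue& rhs)
{
  if (!lhs || !rhs)
    return std::nullopt;
  return *lhs <= *rhs;
}
```

Under this model, "column <= ''" is true only for rows holding an empty string, false for non-empty strings, and unknown for NULLs, rather than being constant for every row.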
1. In TupleUnion::writeNull(), add the missing switch case for
wide decimal with a 16-byte column width.
2. MCOL-5432 Disable complete/partial pushdown of UNION operation
if the query involves an ORDER BY or a LIMIT clause, until
MCOL-5222 is fixed. Also add MTR test cases for this.
This patch improves the runtime performance of UNION processing in CS,
as reported in JIRA issue MCOL-4590. The idea of the optimization is to
infer the separate normalize functions beforehand and apply them
individually later, instead of running one huge switch body covering all
normalizations. This patch also covers engineering optimizations that
remove hotspots in UNION processing. After applying this patch, the
normalize part takes only about 25% of the whole UNION query in the
average case of our experiments.
Signed-off-by: Jigao Luo <luojigao@outlook.com>
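The "infer the normalize functions beforehand" idea from the commit above can be sketched as below. This is an assumption-laden illustration, not the TupleUnion code: ColType, Cell, NormalizeFn, inferNormalizeFunctions and normalizeRow are all hypothetical names, and only two toy column types are modelled.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified column types; the real engine has many more.
enum class ColType { Int, Scaled2 };

// A cell carried as a raw 64-bit integer plus its normalized value.
struct Cell
{
  int64_t raw;
  double normalized;
};

using NormalizeFn = void (*)(Cell&);

// Plain integer: the normalized value is the raw value.
void normalizeInt(Cell& c) { c.normalized = static_cast<double>(c.raw); }
// Integer with an implied scale of 2: divide by 100.
void normalizeScaled2(Cell& c) { c.normalized = static_cast<double>(c.raw) / 100.0; }

// Resolve the type switch once per column, before any row is processed.
std::vector<NormalizeFn> inferNormalizeFunctions(const std::vector<ColType>& types)
{
  std::vector<NormalizeFn> fns;
  fns.reserve(types.size());
  for (ColType t : types)
  {
    switch (t)
    {
      case ColType::Int: fns.push_back(&normalizeInt); break;
      case ColType::Scaled2: fns.push_back(&normalizeScaled2); break;
    }
  }
  return fns;
}

// Per-row work is now one indirect call per column, with no type switch.
void normalizeRow(std::vector<Cell>& row, const std::vector<NormalizeFn>& fns)
{
  for (std::size_t i = 0; i < row.size(); ++i)
    fns[i](row[i]);
}
```

The switch still exists, but it runs once per column instead of once per cell, which is the kind of hoisting the commit message describes.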
In this patch, we set the unioned type to a wide decimal, if any of the
numeric columns involved in the union operation have a precision > 18
(which is also possible for BIGINT/UBIGINT types) and <= 38.
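The promotion rule above can be sketched as a single predicate over the precisions of the unioned columns. This is a simplified illustration: unionNeedsWideDecimal is a hypothetical name, and the two constants are assumed to be 18 and 38 based on the bounds stated in the commit message.

```cpp
#include <algorithm>
#include <vector>

// Assumed values, taken from the bounds quoted in the commit message.
constexpr int INT64MAXPRECISION = 18;
constexpr int INT128MAXPRECISION = 38;

// True if any numeric column's precision falls in the range (18, 38],
// in which case the unioned type would be set to a wide decimal.
bool unionNeedsWideDecimal(const std::vector<int>& precisions)
{
  return std::any_of(precisions.begin(), precisions.end(), [](int p) {
    return p > INT64MAXPRECISION && p <= INT128MAXPRECISION;
  });
}
```

For example, a union of columns with precisions 10 and 20 would be promoted, while precisions 10 and 18 would not.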
MCOL-4409 This patch combines VDecimal and Decimal and makes
IDB_Decimal an alias for the result class
MCOL-4409 More boilerplate reduction in Func_mod
Removed a couple of TSInt128::toType() methods
1. In TupleAggregateStep::configDeliveredRowGroup(), use
jobInfo.projectionCols instead of jobInfo.nonConstCols
for setting scale and precision if the source column is
wide decimal.
2. Tighten rules for wide decimal processing. Specifically:
a. Replace (precision > INT64MAXPRECISION) checks with
(precision > INT64MAXPRECISION && precision <= INT128MAXPRECISION)
b. At places where (colWidth == MAXDECIMALWIDTH) is not enough to
determine if a column is wide decimal or not, also add a check on
type being DECIMAL/UDECIMAL.
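The two tightened rules can be sketched as a pair of predicates. This is a hedged illustration only: DataType, precisionIsWide and columnIsWideDecimal are hypothetical names, the constant values 18, 38 and 16 are assumptions drawn from the surrounding commit messages, and the actual checks in the code base may be structured differently.

```cpp
// Assumed values, based on the bounds and width quoted in these commits.
constexpr int INT64MAXPRECISION = 18;
constexpr int INT128MAXPRECISION = 38;
constexpr int MAXDECIMALWIDTH = 16;

// Hypothetical subset of column data types.
enum class DataType { BIGINT, DECIMAL, UDECIMAL, VARCHAR };

// Rule (a): the precision must lie in (18, 38], not merely exceed 18.
bool precisionIsWide(int precision)
{
  return precision > INT64MAXPRECISION && precision <= INT128MAXPRECISION;
}

// Rule (b): a 16-byte width alone is ambiguous, so also require the type
// to be DECIMAL or UDECIMAL.
bool columnIsWideDecimal(DataType type, int colWidth)
{
  return colWidth == MAXDECIMALWIDTH &&
         (type == DataType::DECIMAL || type == DataType::UDECIMAL);
}
```

A 16-byte VARCHAR column, for instance, passes the width test but is rejected by the type test.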
Since we now perform type promotion to wide decimals for aggregations
involving numeric fields, we need to check for wide decimal in both the
in and out ROWs and call the appropriate setter and getter functions.
Ubuntu 18.04 uses GCC 7.3 which is a little stricter than before.
Fixes a few errors due to implicit includes that are no longer implicit
and a ton of warnings about the implied alignment of code in
utils/common/any.hpp
pDictionaryScan won't work for BLOB/TEXT since it requires searching the
data file and rebuilding the token from matches. The tokens can't be
rebuilt correctly due to the bits in the token used for block counts. This
patch forces the use of pDictionaryStep instead for WHERE conditions.
In addition this patch adds support for TEXT/BLOB in various parts of
the job step processing. This fixes things like error 202 during an
UPDATE with a join condition on TEXT/BLOB columns.
This does the following:
* Switch resource manager to a singleton which reduces the number of
times the XML data is scanned and objects are allocated.
* Make the I_S tables use the FE implementation of the system catalog
* Make the I_S.columnstore_columns table use the RID list cache
* Make the extentmap pre-allocate a vector instead of many small allocs
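Two of the items above, the singleton and the pre-allocated vector, can be sketched generically. This is not the ColumnStore ResourceManager or extent map: the class below only shares the name for readability, configLoads() and collectExtents() are hypothetical, and the constructor merely stands in for scanning the XML configuration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical singleton: the expensive setup runs once per process.
class ResourceManager
{
 public:
  static ResourceManager& instance()
  {
    static ResourceManager rm;  // constructed on first use, thread-safe since C++11
    return rm;
  }

  // Number of times the (simulated) configuration scan has run.
  int configLoads() const { return configLoads_; }

  ResourceManager(const ResourceManager&) = delete;
  ResourceManager& operator=(const ResourceManager&) = delete;

 private:
  ResourceManager() { ++configLoads_; }  // stands in for scanning the XML data
  int configLoads_ = 0;
};

// Reserve the full capacity up front so that filling the vector performs
// one allocation instead of a series of growing reallocations.
std::vector<int> collectExtents(std::size_t count)
{
  std::vector<int> extents;
  extents.reserve(count);
  for (std::size_t i = 0; i < count; ++i)
    extents.push_back(static_cast<int>(i));
  return extents;
}
```

Every caller of ResourceManager::instance() shares the same object, so the setup cost is paid once no matter how many job steps ask for it.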