* MCOL-5092 Ensure column width is correct for datatype
Change MODA return type to STRING
Modify MODA to handle every numeric type
* MCOL-5162 MODA to support char and varchar with collation support
Fixes to the aggregate bit functions
When we fixed the storage sign issue for MCOL-5092, it uncovered a problem in the bit aggregates (bit_and, bit_or and bit_xor). These aggregates should always return UBIGINT, but they relied on the type of the argument column, which gave bad results.
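A minimal sketch of why the result type should not depend on the argument column: the accumulator is always an unsigned 64-bit value, and each argument is widened to it first. The function names are hypothetical and this is not the ColumnStore aggregate code.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical illustration: the accumulator and the return type are always
// uint64_t (UBIGINT), whatever the width or signedness of the argument type T.
template <typename T>
uint64_t bitAndAggSketch(const std::vector<T>& values)
{
    uint64_t acc = ~uint64_t{0};  // identity for AND: all 64 bits set
    for (T v : values)
        acc &= static_cast<uint64_t>(v);  // sign- or zero-extends to 64 bits
    return acc;
}

template <typename T>
uint64_t bitOrAggSketch(const std::vector<T>& values)
{
    uint64_t acc = 0;  // identity for OR
    for (T v : values)
        acc |= static_cast<uint64_t>(v);
    return acc;
}

template <typename T>
uint64_t bitXorAggSketch(const std::vector<T>& values)
{
    uint64_t acc = 0;  // identity for XOR
    for (T v : values)
        acc ^= static_cast<uint64_t>(v);
    return acc;
}
```

With this shape, a negative TINYINT value such as -1 contributes all 64 bits set instead of being interpreted through the narrow signed column type.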
This patch adds support for an ON clause filter on a table that is not involved in a particular join
by disabling the `merge optimization` for those cases.
The `merge optimization` is an optimization in which CS
tries to create a single BPP join with one `large side` table and multiple `small side` tables. In this
case we cannot apply an FE filter if the filter requires columns from a `small side` table that is not
involved in the particular join.
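The condition described above can be sketched as a simple set check. This is only an illustration under assumed names (`canUseMergeOptimizationSketch` and the table-name sets are hypothetical); the real decision is made inside the ColumnStore join planning code.

```cpp
#include <set>
#include <string>

// Hypothetical check: the merged (single BPP) join is only safe when every
// table referenced by the ON clause filter takes part in this particular join.
bool canUseMergeOptimizationSketch(const std::set<std::string>& tablesInJoin,
                                   const std::set<std::string>& tablesInFilter)
{
    for (const std::string& table : tablesInFilter)
    {
        if (tablesInJoin.count(table) == 0)
            return false;  // filter needs a column from an uninvolved table
    }
    return true;
}
```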
Part 1:
As part of MCOL-3776 to address synchronization issue while accessing
the fTimeZone member of the Func class, mutex locks were added to the
accessor and mutator methods. However, this slows down processing
of TIMESTAMP columns in PrimProc significantly as all threads across
all concurrently running queries would serialize on the mutex. This
is because PrimProc only has a single global object for the functor
class (class derived from Func in utils/funcexp/functor.h) for a given
function name. To fix this problem:
(1) We remove the fTimeZone as a member of the Func derived classes
(hence removing the mutexes) and instead use the fOperationType
member of the FunctionColumn class to propagate the timezone values
down to the individual functor processing functions such as
FunctionColumn::getStrVal(), FunctionColumn::getIntVal(), etc.
(2) To achieve (1), a timezone member is added to the
execplan::CalpontSystemCatalog::ColType class.
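A minimal sketch of the idea in Part 1, using simplified stand-in types (`OperationTypeSketch` and `FuncSketch` are hypothetical, and the string timezone is an assumption for illustration): the shared functor keeps no per-query state, so no mutex is needed, and the timezone travels with the operation type passed into each call.

```cpp
#include <string>

// Hypothetical stand-in for the column type that carries the timezone.
struct OperationTypeSketch
{
    std::string timeZone;  // per-query value, set when the plan is built
};

// Hypothetical stand-in for a functor with one global instance per function
// name. It has no timezone member, so concurrent callers share nothing mutable.
class FuncSketch
{
 public:
    std::string getStrVal(long epochSeconds, const OperationTypeSketch& opType) const
    {
        // The timezone is read from the argument, not from the functor.
        return std::to_string(epochSeconds) + "@" + opType.timeZone;
    }
};
```

Two queries with different time_zone settings can then call the same functor object concurrently, each passing its own operation type.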
Part 2:
Several functors in the Funcexp code call dataconvert::gmtSecToMySQLTime()
and dataconvert::mySQLTimeToGmtSec() functions for conversion between seconds
since unix epoch and broken-down representation. These functions in turn call
the C library function localtime_r() which currently has a known bug of holding
a global lock via a call to __tz_convert. This significantly reduces performance
in multi-threaded applications where multiple threads concurrently call
localtime_r(). More details on the bug:
https://sourceware.org/bugzilla/show_bug.cgi?id=16145
This bug in localtime_r() caused processing of the Functors in PrimProc to
slow down significantly, since query execution causes Functor code to be
processed in a multi-threaded manner.
As a fix, we remove the calls to localtime_r() from gmtSecToMySQLTime()
and mySQLTimeToGmtSec() by performing the timezone-to-offset conversion
(done in dataconvert::timeZoneToOffset()) during the execution plan
creation in the plugin. Note that localtime_r() is only called when the
time_zone system variable is set to "SYSTEM".
This fix also required changing the timezone type from a std::string to
a long across the system.
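A sketch of the lock-free conversion, assuming the timezone has already been resolved to a fixed offset in seconds. This is not the actual dataconvert::gmtSecToMySQLTime() code: the struct and function names are hypothetical, and the day-to-date arithmetic is the well-known civil-from-days algorithm by Howard Hinnant, swapped in for illustration.

```cpp
#include <cstdint>

// Hypothetical broken-down time; the real code fills a MySQL time struct.
struct BrokenDownTimeSketch
{
    int year, month, day, hour, minute, second;
};

// Converts seconds since the Unix epoch plus a precomputed offset into
// calendar fields using integer arithmetic only (no localtime_r(), no lock).
BrokenDownTimeSketch gmtSecToTimeSketch(int64_t seconds, long offsetSeconds)
{
    int64_t t = seconds + offsetSeconds;
    int64_t days = t / 86400;
    int64_t rem = t % 86400;
    if (rem < 0)  // C++ division truncates toward zero; normalise
    {
        rem += 86400;
        --days;
    }

    // civil-from-days: days since 1970-01-01 to year/month/day
    int64_t z = days + 719468;
    int64_t era = (z >= 0 ? z : z - 146096) / 146097;
    int64_t doe = z - era * 146097;                                   // [0, 146096]
    int64_t yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;  // [0, 399]
    int64_t y = yoe + era * 400;
    int64_t doy = doe - (365 * yoe + yoe / 4 - yoe / 100);            // [0, 365]
    int64_t mp = (5 * doy + 2) / 153;                                 // [0, 11]
    int64_t d = doy - (153 * mp + 2) / 5 + 1;                         // [1, 31]
    int64_t m = mp < 10 ? mp + 3 : mp - 9;                            // [1, 12]
    if (m <= 2)
        ++y;

    BrokenDownTimeSketch out;
    out.year = static_cast<int>(y);
    out.month = static_cast<int>(m);
    out.day = static_cast<int>(d);
    out.hour = static_cast<int>(rem / 3600);
    out.minute = static_cast<int>((rem % 3600) / 60);
    out.second = static_cast<int>(rem % 60);
    return out;
}
```

Because the offset is a plain integer computed once during plan creation, every worker thread can run this conversion independently.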
SCommand StrFilterCmd::duplicate() missed these two lines:
filterCmd->leftColType = leftColType;
filterCmd->rightColType = rightColType;
which exist in the parent's FilterCommand::duplicate().
Rewriting the code to avoid duplication by using more inherited
methods/constructors. This reduces the probability of similar bugs
in the future.
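One way to avoid this class of bug, sketched with simplified hypothetical classes (not the actual FilterCommand hierarchy, whose members and return type differ): let duplicate() go through the copy constructor, so members of the parent class are copied automatically and cannot be forgotten in the derived class.

```cpp
#include <memory>

// Hypothetical, simplified parent command.
struct FilterCommandSketch
{
    int leftColType = 0;
    int rightColType = 0;

    virtual ~FilterCommandSketch() = default;

    virtual std::shared_ptr<FilterCommandSketch> duplicate() const
    {
        return std::make_shared<FilterCommandSketch>(*this);
    }
};

// Hypothetical, simplified derived command.
struct StrFilterCmdSketch : FilterCommandSketch
{
    bool caseInsensitive = false;  // illustrative derived-only member

    std::shared_ptr<FilterCommandSketch> duplicate() const override
    {
        // The implicit copy constructor copies the base sub-object too,
        // so leftColType and rightColType never need to be listed here.
        return std::make_shared<StrFilterCmdSketch>(*this);
    }
};
```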
cross-engine join with a ColumnStore table errors out.
ColumnStore cannot directly update a foreign table. We detect whether
a multi-table UPDATE operation is performed on a foreign table; if so,
we do not create the select_handler and let the server execute the UPDATE
operation instead.
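The decision can be sketched as follows. The types and the engine-name comparison are hypothetical simplifications; the plugin inspects the server's own table structures rather than strings.

```cpp
#include <string>
#include <vector>

// Hypothetical description of a table taking part in a multi-table UPDATE.
struct UpdateTableSketch
{
    std::string engine;   // storage engine name, e.g. "Columnstore" or "InnoDB"
    bool isUpdateTarget;  // true if the UPDATE modifies this table
};

// Returns false when any table being updated is foreign to ColumnStore,
// in which case the server should execute the UPDATE itself.
bool shouldCreateSelectHandlerSketch(const std::vector<UpdateTableSketch>& tables)
{
    for (const UpdateTableSketch& t : tables)
    {
        if (t.isUpdateTarget && t.engine != "Columnstore")
            return false;
    }
    return true;
}
```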
This feature allows query execution to fall back to the server
if query execution using the select_handler (SH) fails. On
fallback, a warning message containing the original reason for
the SH failure is generated.
To accomplish this task, SH execution is moved to an earlier step when
we create the SH in create_columnstore_select_handler(), instead of the
previous call to SH execution in ha_columnstore_select_handler::init_scan().
This requires some pre-requisite steps that occur in the server in
JOIN::optimize() and JOIN::exec() to be performed before starting SH execution.
In addition, missing test cases from MCOL-424 are added to the MTR suite,
and the corresponding fix using disable_indices_for_CEJ() is reverted,
since the original fix now appears to be redundant.
This is a subtask of MCOL-4525 Implement select_handler=AUTO.
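The control flow of the fallback can be sketched roughly as below. All names are hypothetical, and the callbacks stand in for SH execution and server execution; the warning text is illustrative, not the message the plugin emits.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical result of attempting execution through the select_handler.
struct ExecResultSketch
{
    bool ok;
    std::string error;  // reason for failure when ok is false
};

// Tries SH execution first. On failure, records a warning with the original
// error and lets the server run the query. Returns true if the SH handled it.
bool executeWithFallbackSketch(const std::function<ExecResultSketch()>& runSelectHandler,
                               const std::function<void()>& runInServer,
                               std::vector<std::string>& warnings)
{
    ExecResultSketch result = runSelectHandler();
    if (result.ok)
        return true;

    warnings.push_back("select_handler execution failed: " + result.error);
    runInServer();
    return false;
}
```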
Server performs outer join to inner join conversion using simplify_joins()
in sql/sql_select.cc, by updating the TABLE_LIST::outer_join variable.
In order to perform this conversion, permanent changes are made in some
cases to the SELECT_LEX::JOIN::conds and/or TABLE_LIST::on_expr.
This is undesirable for MCOL-4525, which will attempt to fall back and execute
the query inside the server if query execution fails in ColumnStore
using the select_handler.
For a query such as:
SELECT * FROM t1 LEFT JOIN t2 ON expr1 LEFT JOIN t3 ON expr2
In some cases, server can update the original SELECT_LEX::JOIN::conds
and/or TABLE_LIST::on_expr and create new Item_cond_and objects
(e.g. with 2 Item's expr1 and expr2 in Item_cond_and::list).
Instead of making changes to the original query structs, we use
gp_walk_info::tableOnExprList and gp_walk_info::condList. Two Items,
expr1 and expr2, in the condList mean Item_cond_and(expr1, expr2), which
avoids permanent transformations to the SELECT_LEX.
We also define a new member variable
ha_columnstore_select_handler::tableOuterJoinMap
which saves the original TABLE_LIST::outer_join values before they are
updated. This member variable will be used later on to restore to the original
state of TABLE_LIST::outer_join in case of a query fallback to server execution.
The original simplify_joins() implementation in the server also performs a
flattening of the JOIN nest; however, we don't perform this operation in
convertOuterJoinToInnerJoin() since it is not required for ColumnStore.
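The save-and-restore role of tableOuterJoinMap can be sketched with simplified stand-in types. `TableListSketch` and `OuterJoinSaverSketch` are hypothetical; the real map lives in ha_columnstore_select_handler and keys on the server's TABLE_LIST objects.

```cpp
#include <map>

// Hypothetical stand-in for the server's TABLE_LIST.
struct TableListSketch
{
    unsigned outer_join = 0;  // non-zero means the table is outer-joined
};

class OuterJoinSaverSketch
{
 public:
    // Records the original value once, then marks the table as inner-joined.
    void convertToInner(TableListSketch* table)
    {
        // emplace() keeps the first saved value if the table is seen again.
        tableOuterJoinMap.emplace(table, table->outer_join);
        table->outer_join = 0;
    }

    // Puts every table back into its original state before server fallback.
    void restore()
    {
        for (const auto& entry : tableOuterJoinMap)
            entry.first->outer_join = entry.second;
        tableOuterJoinMap.clear();
    }

 private:
    std::map<TableListSketch*, unsigned> tableOuterJoinMap;
};
```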