1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-04-20 09:07:44 +03:00

107 Commits

Author SHA1 Message Date
Serguey Zefirov
ef592038cb fix(MCOL-5396): Fix possible infinite loop in plugin--PrimProc communication
If you manage to shut down PrimProc just before plugin is trying to send
Calpont Select Execution Plan to PrimProc, you now get a nice error
message about PrimProc being down instead of endless logs of failed
reconnection attempts.
2025-03-11 17:11:01 +04:00
Serguey Zefirov
38fd96a663 fix(memory leaks): MCOL-5791 - get rid of memory leaks in plugin code
There were numerous memory leaks in plugin's code and associated code.
During typical run of MTR tests it leaked around 65 megabytes of
objects. As a result they may severely affect long-lived connections.

This patch fixes (almost) all leaks found in the plugin. The exceptions
are two leaks associated with SHOW CREATE TABLE columnstore_table and
getting information of columns of columnstore-handled table. These
should be fixed on the server side and work is on the way.
2024-12-04 10:59:12 +03:00
Leonid Fedorov
7a2ca9d6bc
MCOL-4480: TEXT type added (#3142)
* TEXT type added
* tests
2024-03-21 00:26:35 +04:00
Roman Nozdrin
6579180810 fix(plugin): MCOL-4740: This fixes update rows counter for multi-table update
For UPDATEs involving a single table, the server call to handler::direct_update_rows() is used to correctly set the count for the number of updated rows in the UPDATE statement.
However, for UPDATEs involving multi-tables, the server does not call handler::direct_update_rows(). This patch adds support to correctly report the number of updated rows to the client by setting
multi_update::updated and multi_update::found in handler::rnd_end().
2023-11-02 14:18:06 +00:00
Leonid Fedorov
5013717730
fix(plugin): Fix wrong ask for stat call for table mode 2023-09-26 14:43:06 +03:00
Sergey Zefirov
b53c231ca6 MCOL-271 empty strings should not be NULLs (#2794)
This patch improves handling of NULLs in textual fields in ColumnStore.
Previously empty strings were considered NULLs and it could be a problem
if data scheme allows for empty strings. It was also one of major
reasons of behavior difference between ColumnStore and other engines in
MariaDB family.

Also, this patch fixes some other bugs and incorrect behavior, for
example, incorrect comparison for "column <= ''" which evaluates to
constant True for all purposes before this patch.
2023-03-30 21:18:29 +03:00
Leonid Fedorov
56f2346083 Remove windows ifdefs 2023-03-02 15:59:42 +00:00
Gagan Goel
45a779f743 MDEV-25080 Implement ColumnStore-side changes for pushdown of SELECT_LEX_UNITs. 2023-02-27 06:38:31 -05:00
Leonid Fedorov
81f0334698 Connection resource cleaning by Karol Roslaniec 2023-01-13 16:35:12 +03:00
Leonid Fedorov
d42485656c Fix clang 16 warnings for comfort build 2023-01-12 22:11:28 +03:00
David.Hall
53af74b027
MCOL-1170 Fix ANALYZE to not error (#2682)
Analyze needs to be completed differently than a normal query. In server, when an ANALYZE is seen, it calls init_scan() immediatly followed by end_scan(). This leaves the sqlfrontendsession (ExeMgr) in a state where it expects to return rows. This patch fixes end_scan to clean this up via reads and writes to get everything back in synch.

ANALYZE should display the number of rows to be displayed if the query were run normally. We have that information available, but no way to return it. A modification to server side to ask for that in the handler is required.

This patch also includes a beautification of sqlfrontsessionthread.cpp since it looked bad. The important change is at line 774
if (!swallowRows)
which short circuits the actual return of data
2023-01-09 13:59:26 -06:00
Gagan Goel
8a9b6b32e7 MCOL-5000 Disable ALTER TABLE statement execution on replicas.
Exit early from the plugin execution of ALTER TABLE statements
on the replica nodes. This is to prevent re-execution of syscat
table population from the replica nodes which should only be
executed once by the primary node in a CS cluster setup.
2022-10-14 17:20:55 +00:00
Leonid Fedorov
65252df4f6 C++20 fixes 2022-03-28 12:32:29 +00:00
Gagan Goel
973e5024d8 MCOL-4957 Fix performance slowdown for processing TIMESTAMP columns.
Part 1:
 As part of MCOL-3776 to address synchronization issue while accessing
 the fTimeZone member of the Func class, mutex locks were added to the
 accessor and mutator methods. However, this slows down processing
 of TIMESTAMP columns in PrimProc significantly as all threads across
 all concurrently running queries would serialize on the mutex. This
 is because PrimProc only has a single global object for the functor
 class (class derived from Func in utils/funcexp/functor.h) for a given
 function name. To fix this problem:

   (1) We remove the fTimeZone as a member of the Func derived classes
   (hence removing the mutexes) and instead use the fOperationType
   member of the FunctionColumn class to propagate the timezone values
   down to the individual functor processing functions such as
   FunctionColumn::getStrVal(), FunctionColumn::getIntVal(), etc.

   (2) To achieve (1), a timezone member is added to the
   execplan::CalpontSystemCatalog::ColType class.

Part 2:
 Several functors in the Funcexp code call dataconvert::gmtSecToMySQLTime()
 and dataconvert::mySQLTimeToGmtSec() functions for conversion between seconds
 since unix epoch and broken-down representation. These functions in turn call
 the C library function localtime_r() which currently has a known bug of holding
 a global lock via a call to __tz_convert. This significantly reduces performance
 in multi-threaded applications where multiple threads concurrently call
 localtime_r(). More details on the bug:
   https://sourceware.org/bugzilla/show_bug.cgi?id=16145

 This bug in localtime_r() caused processing of the Functors in PrimProc to
 slowdown significantly since a query execution causes Functors code to be
 processed in a multi-threaded manner.

 As a fix, we remove the calls to localtime_r() from gmtSecToMySQLTime()
 and mySQLTimeToGmtSec() by performing the timezone-to-offset conversion
 (done in dataconvert::timeZoneToOffset()) during the execution plan
 creation in the plugin. Note that localtime_r() is only called when the
 time_zone system variable is set to "SYSTEM".

 This fix also required changing the timezone type from a std::string to
 a long across the system.
2022-02-14 14:12:27 -05:00
Leonid Fedorov
04752ec546 clang format apply 2022-01-21 16:43:49 +00:00
Gagan Goel
195425924d MCOL-4936 Disable binlog for DML statements.
DML statements executed on the primary node in a ColumnStore
cluster do not need to be written to the primary's binlog. This
is due to ColumnStore's distributed storage architecture.

With this patch, we disable writing to binlog when a DML statement
(INSERT/DELETE/UPDATE/LDI/INSERT..SELECT) is performed on a ColumnStore
table. HANDLER::external_lock() calls are used to
  1. Turn OFF the OPTION_BIN_LOG flag
  2. Turn ON the OPTION_BIN_TMP_LOG_OFF flag
in THD::variables.option_bits during a WRITE lock call.

THD::variables.option_bits is restored back to the original state
during the UNLOCK call in HANDLER::external_lock().

Further, isDMLStatement() function is added to reduce code verbosity
to check if a given statement is a DML statement.

Note that with this patch, not writing to primary's binlog means
DML replication from a ColumnStore cluster to another ColumnStore
cluster or to another foreign engine will not work.
2022-01-04 17:31:59 +00:00
Gagan Goel
7f456e58cc MCOL-4868 UPDATE on a ColumnStore table containing an IN-subquery
on a non-ColumnStore table does not work.

As part of MCOL-4617, we moved the in-to-exists predicate creation
and injection from the server into the engine. However, when query
with an IN Subquery contains a non-ColumnStore table, the server
still performs the in-to-exists predicate transformation for the
foreign engine table. This caused ColumnStore's execution plan to
contain incorrect WHERE predicates. As a fix, we call
mutate_optimizer_flags() for the WRITE lock, in addition to the READ
table lock. And in mutate_optimizer_flags(), we change the optimizer
flag from OPTIMIZER_SWITCH_IN_TO_EXISTS to OPTIMIZER_SWITCH_MATERIALIZATION.
2021-12-16 23:11:26 +00:00
Leonid Fedorov
5c5f103f98
MCOL-4839: Fix clang build (#2100)
* Fix clang build

* Extern C returned to plugin_instance

Co-authored-by: Leonid Fedorov <l.fedorov@mail.corp.ru>
2021-08-23 10:45:10 -05:00
David Hall
ecde2719b1 MCOL-3741 Change IDB-xxxx error codes to MCS-xxxx 2021-08-09 11:33:09 -05:00
Gagan Goel
649ca10429 MCOL-4805 Follow up. 2021-08-04 23:54:02 +00:00
Gagan Goel
afb638b9bd MCOL-4805 For functions in the plugin code that disable replication on the
slave threads, we now check for this condition early on in the function
block.
2021-08-03 22:49:22 +00:00
Gagan Goel
c5502c02fa
Rename columnstore_use_cpimport_for_cache_inserts system variable to (#2053)
columnstore_cache_use_import.
2021-07-19 12:47:15 -05:00
Gagan Goel
a0bd790005 ColumnStore Cache changes.
1. Add a new system variable, columnstore_use_cpimport_for_cache_inserts,
  that when set to ON, uses cpimport for the cache flush into ColumnStore.
  This variable is set to OFF by default. By default, we perform batch inserts
  for the cache flush.

  2. Disable DMLProc logging of the SQL statement text for the cache
  flush operation in case of batch inserts. Under certain heavy loads
  involving INSERT statements, this logging becomes a bottleneck for
  the cache flush, causing subsequent inserts into the cache table to hang.
2021-07-07 19:02:28 +00:00
Roman Nozdrin
fb5ba84212 MCOL-4802 Removed ByteStream methods for bool manipulations and add some logging into I_S.columnstore_files 2021-07-07 07:16:30 +00:00
Roman Nozdrin
6dc356ed60
Merge pull request #1989 from denis0x0D/MCOL-4713
MCOL-4713 Analyze table implementation.
2021-07-02 16:17:07 +03:00
Denis Khalikov
c20015a7b2 MCOL-4713 Analyze table implementation. 2021-07-02 12:37:12 +03:00
Roman Nozdrin
a465b60bdd MCOL-1482 Future repetition reduction 2021-07-01 12:27:03 +00:00
Gagan Goel
49255f5cbd MCOL-1482 An UPDATE operation on a non-ColumnStore table involving a
cross-engine join with a ColumnStore table errors out.

ColumnStore cannot directly update a foreign table. We detect whether
a multi-table UPDATE operation is performed on a foreign table, if so,
do not create the select_handler and let the server execute the UPDATE
operation instead.
2021-06-25 15:27:54 +00:00
Gagan Goel
e3d8100150 MCOL-4525 Implement columnstore_select_handler=AUTO.
This feature allows a query execution to fallback to the server,
in case query execution using the select_handler (SH) fails. In case
of fallback, a warning message containing the original reason for
query failure using SH is generated.

To accomplish this task, SH execution is moved to an earlier step when
we create the SH in create_columnstore_select_handler(), instead of the
previous call to SH execution in ha_columnstore_select_handler::init_scan().
This requires some pre-requisite steps that occur in the server in
JOIN::optimize() and JOIN::exec() to be performed before starting SH execution.

In addition, missing test cases from MCOL-424 are also added to the MTR suite,
and the corresponding fix using disable_indices_for_CEJ() is reverted back
since the original fix now appears to be redundant.
2021-06-11 11:35:34 +00:00
Gagan Goel
e0d2a21cb9 MCOL-4665 Move outer join to inner join conversion into the engine.
This is a subtask of MCOL-4525 Implement select_handler=AUTO.

Server performs outer join to inner join conversion using simplify_joins()
in sql/sql_select.cc, by updating the TABLE_LIST::outer_join variable.
In order to perform this conversion, permanent changes are made in some
cases to the SELECT_LEX::JOIN::conds and/or TABLE_LIST::on_expr.
This is undesirable for MCOL-4525 which will attemp to fallback and execute
the query inside the server, in case the query execution fails in ColumnStore
using the select_handler.

For a query such as:
  SELECT * FROM t1 LEFT JOIN t2 ON expr1 LEFT JOIN t3 ON expr2
In some cases, server can update the original SELECT_LEX::JOIN::conds
and/or TABLE_LIST::on_expr and create new Item_cond_and objects
(e.g. with 2 Item's expr1 and expr2 in Item_cond_and::list).
Instead of making changes to the original query structs, we use
gp_walk_info::tableOnExprList and gp_walk_info::condList. 2 Item's,
expr1 and expr2, in the condList, mean Item_cond_and(expr1, expr2), and
hence avoid permanent transformations to the SELECT_LEX.

We also define a new member variable
ha_columnstore_select_handler::tableOuterJoinMap
which saves the original TABLE_LIST::outer_join values before they are
updated. This member variable will be used later on to restore to the original
state of TABLE_LIST::outer_join in case of a query fallback to server execution.

The original simplify_joins() implementation in the server also performs a
flattening of the JOIN nest, however we don't perform this operation in
convertOuterJoinToInnerJoin() since it is not required for ColumnStore.
2021-06-03 11:13:19 +00:00
Alexander Barkov
9608533d92 MCOL-4734 Compilation failure: MariaDB-10.6 + ColumnStore-develop
mcsconfig.h and my_config.h have the following
pre-processor definitions:

1. Conflicting definitions coming from the standard cmake definitions:
- PACKAGE
- PACKAGE_BUGREPORT
- PACKAGE_NAME
- PACKAGE_STRING
- PACKAGE_TARNAME
- PACKAGE_VERSION
- VERSION

2. Conflicting definitions of other kinds:
- HAVE_STRTOLL - this is a dirt in MariaDB headers.
  Should be fixed in the server code. my_config.h erroneously
  performs "#define HAVE_STRTOLL" instead of "#define HAVE_STRTOLL 1".
  in some cases. The former is not CMake compatible style. The latter is.

3. Non-conflicting definitions:
  Otherwise, mcsconfig.h and my_config.h should be mutually compatible,
  because both are generated by cmake on the same host machine. So
  they should have exactly equal definitions like "HAVE_XXX", "SIZEOF_XXX", etc.

Observations:
- It's OK to include both mcsconfig.h and my_config.h providing that we
  suppress duplicate definition of the above conflicting types #1 and #2.
- There is no a need to suppress duplicate definitions mentioned in #3,
  as they are compatible!
- my_sys.h and m_ctype.h must always follow a CMake configuation header,
  either my_config.h or mcsconfig.h (or both).
  They must never be included without any preceeding configuration header.

This change make sure that we resolve conflicts by:
- either disallowing inclusion of mcsconfig.h and my_config.h
  at the same time
- or by hiding conflicting definitions #1 and #2
  (with their later restoring).
- also, by making sure that my_sys.h and m_ctype.h always follow
  a CMake configuration file.

Details:
- idb_mysql.h can now only be included only after my_config.h
  An attempt to use idb_mysql.h with mcsconfig.h instead of
  my_config.h is caught by the "#error" preprocessor directive.

- mariadb_my_sys.h can now be only included after mcsconfig.h.
  An attempt to use mariadb_my_sys.h without mcscofig.h
  (e.g. with my_config.h) is also caught by "#error".

- collation.h now can now be included in two ways.
  It now has the following effective structure:

    #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H)
    //  Remember current conflicting definitions on the preprocessor stack
    //  Undefine current conflicting definitions
    #endif
    #include "mcsconfig.h"
    #include "m_ctype.h"
    #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H)
    #    Restore conflicting definitions from the preprocessor stack
    #endif

  and can be included as follows:

  a. using only mcsconfig.h as a configuration header:

    // my_config.h must not be included so far
    #include "collation.h"

  b. using my_config.h as the first included configuration file:

    #define PREFER_MY_CONFIG_H // Force conflict resolution
    #include "my_config.h"     // can be included directly or indirectly
    ...
    #include "collation.h"

Other changes:

- Adding helper header files
     utils/common/mcsconfig_conflicting_defs_remember.h
     utils/common/mcsconfig_conflicting_defs_restore.h
     utils/common/mcsconfig_conflicting_defs_undef.h
  to perform conflict resolution easier.

- Removing `#include "collation.h"` from a number of files,
  as it's automatically included from rowgroup.h.

- Removing redundant `#include "utils_utf8.h"`.
  This change is not directly related to the problem being fixed,
  but it's nice to remove redundant directives for both collation.h
  and utils_utf8.h from all the files that do not really need them.
  (this change could probably have gone as a separate commit)

- Changing my_init() to MY_INIT(argv[0]) in the MCS services sources.
  After the fix of the complitation failure it appeared that ColumnStore
  services compiled with the debug build crash due to recent changes in
  safemalloc. The crash happened in strcmp() with `my_progname` as an argument
  (where my_progname is a mysys global variable). This problem should
  probably be fixed on the server side as well to avoid passing NULL.
  But, the majority of MariaDB executable programs also use MY_INIT(argv[0])
  rather than my_init(). So let's make MCS do like the other programs do.
2021-05-25 12:34:36 +04:00
Alexander Barkov
2697c9bce7 MCOL-4687 Insert from view regression 2021-04-30 14:20:26 +04:00
Gagan Goel
bdd0963ea3 MCOL-4515 Binlog replication to CS slave hangs for INSERT .. SELECT query on master.
An INSERT .. SELECT performed on a master node caused an infinite
loop on the slave node with columnstore_replication_slave=OFF.

The issue was ha_mcs_impl_select_next() was returning 0 if
the executing thread is a slave and returning early to avoid DML
execution on the slave. The calling code in select_handler::execute()
ran into an infinite loop due to this as it checked the return code
of the function as 0, and incorrectly thought there are more rows
to process.

The fix is to return HA_ERR_END_OF_FILE as the return value,
instead of 0. This is for the case of the query running on the
slave node with columnstore_replication_slave=OFF.
2021-02-04 06:35:00 +00:00
Roman Nozdrin
8206fafb7c MCOL-4496 Upmerged the commit 7fa5ca3a6 to generalise CHAR|VARCHAR|TEXT|BLOB
bulk insertion processing and fix the original issue introduced by MCOL-2000
2021-01-19 13:25:06 +00:00
Alexander Barkov
d5c6645ba1 Adding mcs_basic_types.h
For now it consists of only:

using int128_t = __int128;
using uint128_t = unsigned __int128;

All new privitive data types should go into this file in the future.
2020-11-18 13:53:15 +00:00
Alexander Barkov
129d5b5a0f MCOL-4174 Review/refactor frontend/connector code 2020-11-18 13:53:15 +00:00
David Hall
638202417f MCOL-4171 2020-11-18 13:52:19 +00:00
Roman Nozdrin
3d94ec1568 MCOL-641 Followup on functions commit. 2020-11-18 13:51:25 +00:00
Gagan Goel
74b64eb4f1 MCOL-641 1. Add support for int128_t in ParsedColumnFilter.
2. Set Decimal precision in SimpleColumn::evaluate().
3. Add support for int128_t in ConstantColumn.
4. Set IDB_Decimal::s128Value in buildDecimalColumn().
5. Use width 16 as first if predicate for branching based on decimal width.
2020-11-18 13:47:45 +00:00
Roman Nozdrin
0bd172cd6e MCOL-641 The fix to support recent changes in 10.5.1. 2020-11-18 13:47:44 +00:00
Roman Nozdrin
2e8e7d52c3 Renamed datatypes/decimal.* into csdecimal to avoid collision with MDB. 2020-11-18 13:47:44 +00:00
Roman Nozdrin
238386bf63 MCOL-641 Replaced IDB_Decima.__v union with int128_t attribute.
Moved all tests into ./test

Introduced ./datatypes directory
2020-11-18 13:47:44 +00:00
Gagan Goel
824615a55b MCOL-641 Refactor empty value implementation in writeengine. 2020-11-18 13:47:44 +00:00
Roman Nozdrin
97ee1609b2 MCOL-641 Replaced NULL binary constants.
DataConvert::decimalToString, toString, writeIntPart, writeFractionalPart are not templates anymore.
2020-11-18 13:47:44 +00:00
Roman Nozdrin
de85e21c38 MCOL-641 This commit cleans up Row methods and adds couple UT for Row. 2020-11-18 13:47:02 +00:00
Gagan Goel
b07db9a8f4 MCOL-641 Basic support for updates. 2020-11-18 13:47:01 +00:00
drrtuy
54c152d6c8 MCOL-641 This commit introduces templates for DataConvert and RowGroup methods. 2020-11-18 13:47:01 +00:00
Gagan Goel
77e1d6abe3 Basic SELECT support for Decimal38 2020-11-18 13:47:00 +00:00
Roman Nozdrin
c9f42fb5cc MCOL-641 PoC version for DECIMAL(38) using BINARY as a basis. 2020-11-18 13:47:00 +00:00
Gagan Goel
32f6167067 MCOL-641 Work of Ivan Zuniga on basic read and write support for Binary16 2020-11-18 13:47:00 +00:00