1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-04-18 21:44:02 +03:00

87 Commits

Author SHA1 Message Date
Gagan Goel
afb638b9bd MCOL-4805 For functions in the plugin code that disable replication on the
slave threads, we now check for this condition early on in the function
block.
2021-08-03 22:49:22 +00:00
Gagan Goel
c5502c02fa
Rename columnstore_use_cpimport_for_cache_inserts system variable to (#2053)
columnstore_cache_use_import.
2021-07-19 12:47:15 -05:00
Gagan Goel
a0bd790005 ColumnStore Cache changes.
1. Add a new system variable, columnstore_use_cpimport_for_cache_inserts,
  that when set to ON, uses cpimport for the cache flush into ColumnStore.
  This variable is set to OFF by default. By default, we perform batch inserts
  for the cache flush.

  2. Disable DMLProc logging of the SQL statement text for the cache
  flush operation in case of batch inserts. Under certain heavy loads
  involving INSERT statements, this logging becomes a bottleneck for
  the cache flush, causing subsequent inserts into the cache table to hang.
2021-07-07 19:02:28 +00:00
Roman Nozdrin
fb5ba84212 MCOL-4802 Removed ByteStream methods for bool manipulations and add some logging into I_S.columnstore_files 2021-07-07 07:16:30 +00:00
Roman Nozdrin
6dc356ed60
Merge pull request #1989 from denis0x0D/MCOL-4713
MCOL-4713 Analyze table implementation.
2021-07-02 16:17:07 +03:00
Denis Khalikov
c20015a7b2 MCOL-4713 Analyze table implementation. 2021-07-02 12:37:12 +03:00
Roman Nozdrin
a465b60bdd MCOL-1482 Future repetition reduction 2021-07-01 12:27:03 +00:00
Gagan Goel
49255f5cbd MCOL-1482 An UPDATE operation on a non-ColumnStore table involving a
cross-engine join with a ColumnStore table errors out.

ColumnStore cannot directly update a foreign table. We detect whether
a multi-table UPDATE operation is performed on a foreign table, if so,
do not create the select_handler and let the server execute the UPDATE
operation instead.
2021-06-25 15:27:54 +00:00
Gagan Goel
e3d8100150 MCOL-4525 Implement columnstore_select_handler=AUTO.
This feature allows a query execution to fallback to the server,
in case query execution using the select_handler (SH) fails. In case
of fallback, a warning message containing the original reason for
query failure using SH is generated.

To accomplish this task, SH execution is moved to an earlier step when
we create the SH in create_columnstore_select_handler(), instead of the
previous call to SH execution in ha_columnstore_select_handler::init_scan().
This requires some pre-requisite steps that occur in the server in
JOIN::optimize() and JOIN::exec() to be performed before starting SH execution.

In addition, missing test cases from MCOL-424 are also added to the MTR suite,
and the corresponding fix using disable_indices_for_CEJ() is reverted back
since the original fix now appears to be redundant.
2021-06-11 11:35:34 +00:00
Gagan Goel
e0d2a21cb9 MCOL-4665 Move outer join to inner join conversion into the engine.
This is a subtask of MCOL-4525 Implement select_handler=AUTO.

Server performs outer join to inner join conversion using simplify_joins()
in sql/sql_select.cc, by updating the TABLE_LIST::outer_join variable.
In order to perform this conversion, permanent changes are made in some
cases to the SELECT_LEX::JOIN::conds and/or TABLE_LIST::on_expr.
This is undesirable for MCOL-4525 which will attemp to fallback and execute
the query inside the server, in case the query execution fails in ColumnStore
using the select_handler.

For a query such as:
  SELECT * FROM t1 LEFT JOIN t2 ON expr1 LEFT JOIN t3 ON expr2
In some cases, server can update the original SELECT_LEX::JOIN::conds
and/or TABLE_LIST::on_expr and create new Item_cond_and objects
(e.g. with 2 Item's expr1 and expr2 in Item_cond_and::list).
Instead of making changes to the original query structs, we use
gp_walk_info::tableOnExprList and gp_walk_info::condList. 2 Item's,
expr1 and expr2, in the condList, mean Item_cond_and(expr1, expr2), and
hence avoid permanent transformations to the SELECT_LEX.

We also define a new member variable
ha_columnstore_select_handler::tableOuterJoinMap
which saves the original TABLE_LIST::outer_join values before they are
updated. This member variable will be used later on to restore to the original
state of TABLE_LIST::outer_join in case of a query fallback to server execution.

The original simplify_joins() implementation in the server also performs a
flattening of the JOIN nest, however we don't perform this operation in
convertOuterJoinToInnerJoin() since it is not required for ColumnStore.
2021-06-03 11:13:19 +00:00
Alexander Barkov
9608533d92 MCOL-4734 Compilation failure: MariaDB-10.6 + ColumnStore-develop
mcsconfig.h and my_config.h have the following
pre-processor definitions:

1. Conflicting definitions coming from the standard cmake definitions:
- PACKAGE
- PACKAGE_BUGREPORT
- PACKAGE_NAME
- PACKAGE_STRING
- PACKAGE_TARNAME
- PACKAGE_VERSION
- VERSION

2. Conflicting definitions of other kinds:
- HAVE_STRTOLL - this is a dirt in MariaDB headers.
  Should be fixed in the server code. my_config.h erroneously
  performs "#define HAVE_STRTOLL" instead of "#define HAVE_STRTOLL 1".
  in some cases. The former is not CMake compatible style. The latter is.

3. Non-conflicting definitions:
  Otherwise, mcsconfig.h and my_config.h should be mutually compatible,
  because both are generated by cmake on the same host machine. So
  they should have exactly equal definitions like "HAVE_XXX", "SIZEOF_XXX", etc.

Observations:
- It's OK to include both mcsconfig.h and my_config.h providing that we
  suppress duplicate definition of the above conflicting types #1 and #2.
- There is no a need to suppress duplicate definitions mentioned in #3,
  as they are compatible!
- my_sys.h and m_ctype.h must always follow a CMake configuation header,
  either my_config.h or mcsconfig.h (or both).
  They must never be included without any preceeding configuration header.

This change make sure that we resolve conflicts by:
- either disallowing inclusion of mcsconfig.h and my_config.h
  at the same time
- or by hiding conflicting definitions #1 and #2
  (with their later restoring).
- also, by making sure that my_sys.h and m_ctype.h always follow
  a CMake configuration file.

Details:
- idb_mysql.h can now only be included only after my_config.h
  An attempt to use idb_mysql.h with mcsconfig.h instead of
  my_config.h is caught by the "#error" preprocessor directive.

- mariadb_my_sys.h can now be only included after mcsconfig.h.
  An attempt to use mariadb_my_sys.h without mcscofig.h
  (e.g. with my_config.h) is also caught by "#error".

- collation.h now can now be included in two ways.
  It now has the following effective structure:

    #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H)
    //  Remember current conflicting definitions on the preprocessor stack
    //  Undefine current conflicting definitions
    #endif
    #include "mcsconfig.h"
    #include "m_ctype.h"
    #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H)
    #    Restore conflicting definitions from the preprocessor stack
    #endif

  and can be included as follows:

  a. using only mcsconfig.h as a configuration header:

    // my_config.h must not be included so far
    #include "collation.h"

  b. using my_config.h as the first included configuration file:

    #define PREFER_MY_CONFIG_H // Force conflict resolution
    #include "my_config.h"     // can be included directly or indirectly
    ...
    #include "collation.h"

Other changes:

- Adding helper header files
     utils/common/mcsconfig_conflicting_defs_remember.h
     utils/common/mcsconfig_conflicting_defs_restore.h
     utils/common/mcsconfig_conflicting_defs_undef.h
  to perform conflict resolution easier.

- Removing `#include "collation.h"` from a number of files,
  as it's automatically included from rowgroup.h.

- Removing redundant `#include "utils_utf8.h"`.
  This change is not directly related to the problem being fixed,
  but it's nice to remove redundant directives for both collation.h
  and utils_utf8.h from all the files that do not really need them.
  (this change could probably have gone as a separate commit)

- Changing my_init() to MY_INIT(argv[0]) in the MCS services sources.
  After the fix of the complitation failure it appeared that ColumnStore
  services compiled with the debug build crash due to recent changes in
  safemalloc. The crash happened in strcmp() with `my_progname` as an argument
  (where my_progname is a mysys global variable). This problem should
  probably be fixed on the server side as well to avoid passing NULL.
  But, the majority of MariaDB executable programs also use MY_INIT(argv[0])
  rather than my_init(). So let's make MCS do like the other programs do.
2021-05-25 12:34:36 +04:00
Alexander Barkov
2697c9bce7 MCOL-4687 Insert from view regression 2021-04-30 14:20:26 +04:00
Gagan Goel
bdd0963ea3 MCOL-4515 Binlog replication to CS slave hangs for INSERT .. SELECT query on master.
An INSERT .. SELECT performed on a master node caused an infinite
loop on the slave node with columnstore_replication_slave=OFF.

The issue was ha_mcs_impl_select_next() was returning 0 if
the executing thread is a slave and returning early to avoid DML
execution on the slave. The calling code in select_handler::execute()
ran into an infinite loop due to this as it checked the return code
of the function as 0, and incorrectly thought there are more rows
to process.

The fix is to return HA_ERR_END_OF_FILE as the return value,
instead of 0. This is for the case of the query running on the
slave node with columnstore_replication_slave=OFF.
2021-02-04 06:35:00 +00:00
Roman Nozdrin
8206fafb7c MCOL-4496 Upmerged the commit 7fa5ca3a6 to generalise CHAR|VARCHAR|TEXT|BLOB
bulk insertion processing and fix the original issue introduced by MCOL-2000
2021-01-19 13:25:06 +00:00
Alexander Barkov
d5c6645ba1 Adding mcs_basic_types.h
For now it consists of only:

using int128_t = __int128;
using uint128_t = unsigned __int128;

All new privitive data types should go into this file in the future.
2020-11-18 13:53:15 +00:00
Alexander Barkov
129d5b5a0f MCOL-4174 Review/refactor frontend/connector code 2020-11-18 13:53:15 +00:00
David Hall
638202417f MCOL-4171 2020-11-18 13:52:19 +00:00
Roman Nozdrin
3d94ec1568 MCOL-641 Followup on functions commit. 2020-11-18 13:51:25 +00:00
Gagan Goel
74b64eb4f1 MCOL-641 1. Add support for int128_t in ParsedColumnFilter.
2. Set Decimal precision in SimpleColumn::evaluate().
3. Add support for int128_t in ConstantColumn.
4. Set IDB_Decimal::s128Value in buildDecimalColumn().
5. Use width 16 as first if predicate for branching based on decimal width.
2020-11-18 13:47:45 +00:00
Roman Nozdrin
0bd172cd6e MCOL-641 The fix to support recent changes in 10.5.1. 2020-11-18 13:47:44 +00:00
Roman Nozdrin
2e8e7d52c3 Renamed datatypes/decimal.* into csdecimal to avoid collision with MDB. 2020-11-18 13:47:44 +00:00
Roman Nozdrin
238386bf63 MCOL-641 Replaced IDB_Decima.__v union with int128_t attribute.
Moved all tests into ./test

Introduced ./datatypes directory
2020-11-18 13:47:44 +00:00
Gagan Goel
824615a55b MCOL-641 Refactor empty value implementation in writeengine. 2020-11-18 13:47:44 +00:00
Roman Nozdrin
97ee1609b2 MCOL-641 Replaced NULL binary constants.
DataConvert::decimalToString, toString, writeIntPart, writeFractionalPart are not templates anymore.
2020-11-18 13:47:44 +00:00
Roman Nozdrin
de85e21c38 MCOL-641 This commit cleans up Row methods and adds couple UT for Row. 2020-11-18 13:47:02 +00:00
Gagan Goel
b07db9a8f4 MCOL-641 Basic support for updates. 2020-11-18 13:47:01 +00:00
drrtuy
54c152d6c8 MCOL-641 This commit introduces templates for DataConvert and RowGroup methods. 2020-11-18 13:47:01 +00:00
Gagan Goel
77e1d6abe3 Basic SELECT support for Decimal38 2020-11-18 13:47:00 +00:00
Roman Nozdrin
c9f42fb5cc MCOL-641 PoC version for DECIMAL(38) using BINARY as a basis. 2020-11-18 13:47:00 +00:00
Gagan Goel
32f6167067 MCOL-641 Work of Ivan Zuniga on basic read and write support for Binary16 2020-11-18 13:47:00 +00:00
Gagan Goel
13264feb7d MCOL-4320/4364/4370 Fix multibyte processing for LDI/Insert...Select
For CHAR/VARCHAR/TEXT fields, the buffer size of a field represents
the field size in bytes, which can be bigger than the field size in
number of characters, for multi-byte character sets such as utf8,
utf8mb4 etc. The buffer also contains a byte length prefix which can be
up to 65532 bytes for a VARCHAR field, and much higher for a TEXT
field (we process a maximum byte length for a TEXT field which fits in
4 bytes, which is 2^32 - 1 = 4GB!).

There is also special processing for a TEXT field defined with a default
length like so:
  CREATE TABLE cs1 (a TEXT CHARACTER SET utf8)
Here, the byte length is a fixed 65535, irrespective of the character
set used. This is different from a case such as:
  CREATE TABLE cs1 (a TEXT(65535) CHARACTER SET utf8), where the byte length
for the field will be 65535*3.
2020-10-26 17:51:24 +00:00
Gagan Goel
5646164a46
Merge pull request #1459 from dhall-MariaDB/MCOL-4144-dev
MCOL-4144-dev Enable lower_case_table_names
2020-09-24 19:10:22 -04:00
David Hall
35c4b66a67 MCOL-4144 Enable lower_case_table_names
Create tables and schemas with lower case name only if the flag is set.
During operations, convert to lowercase in plugin. Byt the time a query gets to ExeMgr, DDLProc etc., everything must be lower case if the flag is set, and undisturbed if not.
2020-09-24 15:21:13 -05:00
Roman Nozdrin
df0c2b2fbe MCOL-4278 MCS quits early from rnd_end() in the presense of sql_select_limit session variable
Renamed a couple methods to align their names with others
2020-09-24 08:46:00 +00:00
Alexander Barkov
7f6ad16728 MCOL-4303 UPDATE..SET using another table is not updating
The change for MCOL-4264 erroneously added the "lock_type" member
to cal_connection_info, which is shared between multiple tables.
So some tables that were opened for write erroneously identified
themselves as read only.

Moving the member to ha_mcs instead.
2020-09-11 12:26:26 +04:00
Alexander Barkov
f00cc571b5 MCOL-4264 [Cross-Engine] UPDATE to INNODB table with WHERE clause using Columnstore as sub query failing
Problem:

When processing cross-engine queries like:

update cstab1 set a=100 where a not in (select a from innotab1 where a=11);
delete from innotab1  where a not in (select a from cstab1 where a=1);

the ColumnStore plugin erroneously executed the whole query inside
ColumnStore.

Fix:

- Adding a new member cal_connection_info::lock_type and setting it
  inside ha_mcs_impl_external_lock() to the value passed in the parameter
  "lock_type".

- Adding a method cal_connection_info::isReadOnly() to test
  if the last table lock made in ha_mcs_impl_external_lock()
  for done for reading.

- Adding a new condition checking cal_connection_info::isReadOnly() inside
  ha_mcs_impl_rnd_init(). If the current table was locked last time for reading,
  then doUpdateDelete() should not be executed.
2020-09-08 07:06:52 +04:00
Gagan Goel
b44e1e2566
Merge pull request #1372 from dhall-MariaDB/MCOL-4236
Mcol 4236
2020-08-24 19:04:33 -04:00
David Hall
7bd878de0a MCOL-4236 remove type casting *f to something it's not
During data retrieval, we were type casting a field type to what we thought was the correct type. Often it was not. Since we're calling virtual functions on *f, there's no need to type cast in most cases. This was a relic from days gone by.
2020-08-19 14:37:38 -05:00
Gagan Goel
86fb66365c 1. Set 1M as the threshold on the number of records to flush the cache.
2. Set 100k as the batch size when flushing records into ColumnStore, i.e.,
a flush of 1M records will be performed in 10 batches, each being 100k.

3. For INSERT ... SELECT on the cache, use the default insertion method of cpimport.
2020-08-18 18:01:40 -04:00
David Hall
2880d0871f MCOL-4247 add break to case 2020-08-18 11:47:06 -05:00
David Hall
7ba40a5544 MCOL-4247 TYpecast causes wrong virtual function 2020-08-17 17:34:58 -05:00
benthompson15
219f67d162 MCOL-4181: Possible setting of ci->stats.fUser to NULL causing crash. 2020-07-20 15:18:58 -05:00
David Hall
085b06d422 Merge branch 'develop' into MCOL-4126 2020-06-30 11:25:11 -05:00
David Hall
8179ffffdf MCOL-4126 Don't reset ci->tableOid if not autocommit. 2020-06-30 11:21:30 -05:00
Patrick LeBlanc
41745490ed
Revert "MCOL-4126 reset ci->tableOid after INSERT|DELETE" 2020-06-30 10:37:34 -05:00
David Hall
0a2fe7d2fb MCOL-4126 reset ci->tableOid after INSERT|DELETE 2020-06-29 16:20:05 -05:00
Gagan Goel
c30d105c30 Use batch inserts for the cache flush. 2020-06-03 15:20:03 -04:00
Gagan Goel
57f393feaf
Merge pull request #1246 from pleblanc1976/mcol-4023-1.5
Merge MCOL-4023 fix into 1.5
2020-06-02 11:33:10 -04:00
Patrick LeBlanc
4bddc92092 MCOL-4010 - fixes compilation errors on x64 w/-Werror
Merged in Sergei's patch.
2020-06-01 12:52:43 -04:00
Gagan Goel
01ff2652a6 MCOL-4023 Pushdown WHERE conditions for UPDATE/DELETE.
For certain queries, such as:
  update cs1 set i = 41 where i = 42 or (i is null and 42 is null);
the SELECT_LEX.where does not contain the required where conditions.
Server sends the where conditions in the call to cond_push(), so
we are storing them in a handler data member, condStack, and later
push them down to getSelectPlan() for UPDATES/DELETEs.
2020-06-01 11:03:42 -04:00