1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-04-26 11:48:52 +03:00

36 Commits

Author SHA1 Message Date
Andrey Piskunov
d586975da7
Rename a limit var + change error message (#2946)
* Rename a limit var + change error message

* Adjust the test
2023-09-05 12:19:15 +03:00
mariadb-AndreyPiskunov
05547f2342 Add a limit (as runtime value) for long in queries 2023-08-21 10:38:46 +03:00
Denis Khalikov
896e8dd769
MCOL-5522 Properly process pm join result count. (#2909)
This patch:
1. Properly processes situation when pm join result count is exceeded.
2. Adds session variable 'columnstore_max_pm_join_result_count` to control the limit.
2023-08-04 16:55:45 +03:00
Leonid Fedorov
65cde8c894
feature: pron (#2908)
* feature: Special dictionary, we can pass with session veriable to modify codepaths and behaviour for testing and debugging
2023-07-21 14:02:03 +03:00
Denis Khalikov
1f190a6e75 MCOL-5477 Disk join step improvement.
This patch:
1. Handles corner case when the bucket exceeded the memory limit, but we cannot redistribute the data in this bucket into new buckets based on a hash algorithm, because the rows have the same values.
2. Adds force option for disk join step.
3. Add a option to contol the depth of the partition tree.
2023-06-23 18:40:15 +03:00
Roman Nozdrin
1b51d265ed MCOL-5400 Disable group by pushdown 2023-01-26 12:09:00 +00:00
Leonid Fedorov
81f0334698 Connection resource cleaning by Karol Roslaniec 2023-01-13 16:35:12 +03:00
Roman Nozdrin
72e264e8ef MCOL-5199 This patch solves the overal performance degradation introduced with a new way of char columns hashing
in aggregation code
The patch disables padding that forces hasher to calculate over the whole 2k buffer. This patch also moves hashing code
into the common place where it belongs.
2022-08-24 19:07:06 +00:00
Leonid Fedorov
f5b2a6885f MCOL-5013: Load Data from S3 into Columnstore
Introduced UDF and stored prodecure.
usage:

set columnstore_s3_key='<s3_key>';
set columnstore_s3_secret='<s3_secret>';
set columnstore_s3_region='region';

and then use UDF
select columnstore_dataload("<tablename>", "<filename>", "<bucket>", "<db_name>");
for UDF db_name can be ommited, then current connection db will be used

or stored function
call calpontsys.columnstore_load_from_s3("<tablename>", "<filename>", "<bucket>", "<db_name>");
2022-07-04 19:52:37 +03:00
Leonid Fedorov
04752ec546 clang format apply 2022-01-21 16:43:49 +00:00
Gagan Goel
195425924d MCOL-4936 Disable binlog for DML statements.
DML statements executed on the primary node in a ColumnStore
cluster do not need to be written to the primary's binlog. This
is due to ColumnStore's distributed storage architecture.

With this patch, we disable writing to binlog when a DML statement
(INSERT/DELETE/UPDATE/LDI/INSERT..SELECT) is performed on a ColumnStore
table. HANDLER::external_lock() calls are used to
  1. Turn OFF the OPTION_BIN_LOG flag
  2. Turn ON the OPTION_BIN_TMP_LOG_OFF flag
in THD::variables.option_bits during a WRITE lock call.

THD::variables.option_bits is restored back to the original state
during the UNLOCK call in HANDLER::external_lock().

Further, isDMLStatement() function is added to reduce code verbosity
to check if a given statement is a DML statement.

Note that with this patch, not writing to primary's binlog means
DML replication from a ColumnStore cluster to another ColumnStore
cluster or to another foreign engine will not work.
2022-01-04 17:31:59 +00:00
Gagan Goel
c5502c02fa
Rename columnstore_use_cpimport_for_cache_inserts system variable to (#2053)
columnstore_cache_use_import.
2021-07-19 12:47:15 -05:00
Denis Khalikov
fa8dc815a7 MCOL-4814 Add a cmake build option to enable LZ4 compression.
This patch adds an option for cmake flags to enable lz4 compression.
2021-07-16 17:57:11 +03:00
Gagan Goel
a0bd790005 ColumnStore Cache changes.
1. Add a new system variable, columnstore_use_cpimport_for_cache_inserts,
  that when set to ON, uses cpimport for the cache flush into ColumnStore.
  This variable is set to OFF by default. By default, we perform batch inserts
  for the cache flush.

  2. Disable DMLProc logging of the SQL statement text for the cache
  flush operation in case of batch inserts. Under certain heavy loads
  involving INSERT statements, this logging becomes a bottleneck for
  the cache flush, causing subsequent inserts into the cache table to hang.
2021-07-07 19:02:28 +00:00
Denis Khalikov
cc1c3629c5 MCOL-987 Add LZ4 compression.
* Adds CompressInterfaceLZ4 which uses LZ4 API for compress/uncompress.
* Adds CMake machinery to search LZ4 on running host.
* All methods which use static data and do not modify any internal data - become `static`,
  so we can use them without creation of the specific object. This is possible, because
  the header specification has not been modified. We still use 2 sections in header, first
  one with file meta data, the second one with pointers for compressed chunks.
* Methods `compress`, `uncompress`, `maxCompressedSize`, `getUncompressedSize` - become
  pure virtual, so we can override them for the other compression algos.
* Adds method `getChunkMagicNumber`, so we can verify chunk magic number
  for each compression algo.
* Renames "s/IDBCompressInterface/CompressInterface/g" according to requirement.
2021-07-06 18:04:37 +03:00
Roman Nozdrin
e153486361
Merge pull request #1994 from denis0x0D/MCOL-4685_rename
MCOL-4685 Remname UNUSED -> SNAPPY
2021-06-16 11:24:17 +03:00
Denis Khalikov
e2a5956ef8 MCOL-4685 Remname UNUSED -> SNAPPY 2021-06-15 21:19:09 +03:00
Gagan Goel
e3d8100150 MCOL-4525 Implement columnstore_select_handler=AUTO.
This feature allows a query execution to fallback to the server,
in case query execution using the select_handler (SH) fails. In case
of fallback, a warning message containing the original reason for
query failure using SH is generated.

To accomplish this task, SH execution is moved to an earlier step when
we create the SH in create_columnstore_select_handler(), instead of the
previous call to SH execution in ha_columnstore_select_handler::init_scan().
This requires some pre-requisite steps that occur in the server in
JOIN::optimize() and JOIN::exec() to be performed before starting SH execution.

In addition, missing test cases from MCOL-424 are also added to the MTR suite,
and the corresponding fix using disable_indices_for_CEJ() is reverted back
since the original fix now appears to be redundant.
2021-06-11 11:35:34 +00:00
Denis Khalikov
606194e6e4 MCOL-4685: Eliminate some irrelevant settings (uncompressed data and extents per file).
This patch:
1. Removes the option to declare uncompressed columns (set columnstore_compression_type = 0).
2. Ignores [COMMENT '[compression=0] option at table or column level (no error messages, just disregard).
3. Removes the option to set more than 2 extents per file (ExtentsPreSegmentFile).
4. Updates rebuildEM tool to support up to 10 dictionary extent per dictionary segment file.
5. Adds check for `DBRootStorageType` for rebuildEM tool.
6. Renamed rebuildEM to mcsRebuildEM.
2021-06-03 14:44:33 +03:00
Gagan Goel
554c6da8e8 MCOL-641 Implement int128_t versions of arithmetic operations and add unit test cases. 2020-11-18 13:47:45 +00:00
Roman Nozdrin
6ab1b829a0 MCOL-4334 Enable Select Handler for queries run inside Stored Procedures
There is another session variable to enable/disable SH in SP
2020-10-13 13:07:59 +00:00
Gagan Goel
b3ae9cf04e Use a session variable, columnstore_cache_flush_threshold,
to allow the user to set the threshold, instead of using a
hard coded value.
2020-08-18 18:01:40 -04:00
Gagan Goel
4afcba9520 Do not build the cache as a separate user-visible engine.
We are creating a new read-only system variable, columnstore_cache_inserts,
to enable/disable the cache. When this variable is set at server start up,
any table created with engine=columnstore will also create the corresponding
cache table in Aria engine for performing inserts.

It is important to note that a ColumnStore table created with this
option unset should not be queried when the server is restarted with
the option set, as this will most likely result in query failures.
2020-08-18 18:01:40 -04:00
Gagan Goel
816139d06d MCOL-4000 Allow columnstore_use_import_for_batchinsert to use a
new value, ALWAYS, which invokes cpimport for LDI and INSERT..SELECT
from within and outside a transaction.

Default value of the session variable, ON, remains unchanged.
2020-05-12 19:42:15 -04:00
Roman Nozdrin
3fabf01e93 MCOL-3593 Disabled full optimizer run and enabled copy-pasted simplify_joins.
Disabled 4th if block in buildOuterJoin to handle non-optimized MDB query
    structures.

    Broke getSelectPlan into pieces: processFrom, processWhere.

MCOL-3593 UNION processing depends on two flags isUnion that comes as
arg of getSelectPlan and unionSel that is a local variable in
getSelectPlan. Modularization of getSelectPlan broke the mechanizm.
This patch is supposed to partially fix it.

MCOL-3593 Removed unused if condition from buildOuterJoin that allows
unsupported construct subquery in ON expression.
Fixed an improper if condition that ignors tableMap entries w/o condition
in external_lock thus external_lock doesn't clean up when the query
finishes.
Fixed wrong logging for queries processed in tableMode. Now rnd_init
properly sends queryText down to ExeMgr to be logged.

MCOL-3593 Unused attribute FromSubQuery::fFromSub was removed.
 getSelectPlan has been modularized into: setExecutionParams,
 processFrom, processWhere. SELECT, HAVING, GROUP BY, ORDER BY
 still lives in getSelectPlan.
Copied optimization function simplify_joins_ into our pushdown
 code to provide the plugin code with some rewrites from MDB it
 expects.
The columnstore_processing_handlers_fallback session variable
 has been removed thus CS can't fallback from SH to partial
 execution paths, e.g. DH, GBH or plugin API.

MCOL-3602 Moved MDB optimizer rewrites into a separate file.

Add SELECT_LEX::optimize_unflattened_subqueries() call to fix IN
 into EXISTS rewrite for semi-JOINs with subqueries.

disable_indices_for_CEJ() add index related hints to disable
 index access methods in Cross Engine Joins.

create_SH() now flattens JOIN that has both physical tables and
 views. This fixes most of views related tests in the regression.
2019-11-25 10:03:32 -06:00
Roman Nozdrin
7b5e5f0eb6 MCOL-894 Upmerged the fist part of the patch into develop.
MCOL-894 Add default values in Compare and CSEP ctors to activate UTF-8 sorting
    properly.

MCOL-894 Unit tests to build a framework for a new parallel sorting.

MCOL-894 Finished with parallel workers invocation.
     The implementation lacks final aggregation step.

MCOL-894 TupleAnnexStep's init and destructor are now parallel execution aware.

    Implemented final merging step for parallel execution finalizeParallelOrderBy().

    Templated unit test to use it with arbitrary number of rows, threads.

Reuse LimitedOrderBy in the final step

MCOL-894 Cleaned up finalizeParallelOrderBy.

MCOL-894 Add and propagate thread variable that controls a number of threads.

    Optimized comparators used for sorting and add corresponding UTs.

    Refactored TupleAnnexStep::finalizeParallelOrderByDistinct.

    Parallel sorting methods now preallocates memory in batches.

MCOL-894 Fixed comparator for StringCompare.
2019-11-05 15:23:43 +03:00
Andrew Hutchings
20c1949152 Replication improvements
This patch fixes:

MCOL-3557 - Row Based Replication events to ColumnStore tables will no
longer cause MariaDB to crash, it will error instead.

MCOL-3556 - Remove the Columnstore.xml variable to turn on ColumnStore
tables applying replication events and instead make it a system variable
that can be set in my.cnf called "columnstore_replication_slave". This
allows it to be set per-UM.
2019-10-14 11:54:48 +01:00
Gagan Goel
594ee22999 Add status variables to show ColumnStore version and commit hash. 2019-10-09 10:43:35 -04:00
Roman Nozdrin
2c63258537 MCOL-2178 SH now allows to fallback to other pushdown handlers.
SH query execution migrated from SH::init() into create_SH().
There is a session variable columnstore_processing_handlers_fallback
that allows to fallback to DH, GBH if SH fails. DH now uses semantic
tree check for unsupported features to allow to fallback to GBH or
storage API.

Fixes GBH related bug when create_GBH() returns a handler for
queries with impossible WHERE/HAVING.

Fixed bug in FromSubquery::transform() where isUnion is set to true.

Enabled RTTI b/c server team enabled it for MDB.

Removed unused code supposed to be used with vtable.
2019-08-25 04:05:59 +03:00
Roman Nozdrin
b1bc995420
Merge branch 'develop' into remove-infinidb 2019-08-13 12:32:01 +03:00
Andrew Hutchings
9d83b49fca MCOL-104 First pass of InfiniDB rename in code 2019-08-12 09:41:28 +01:00
Patrick LeBlanc
a09a9d5d0f Mass substitution 'Corporaton' -> 'Corporation' 2019-08-07 14:43:25 -05:00
Gagan Goel
1c460f3ba5 MCOL-2178 Cleanup of MIGR:: singleton from the plugin code.
Disable SP execution by the smart handlers for now.

    Add session variables to Enable/Disable select/derived/group_by
    handlers. Defaulted to Enable.
2019-08-04 21:50:50 -04:00
Roman Nozdrin
d62b66ecf7 parse_item() in execplan code now always get an actual GWI structure
to avoid accedental crashes.

Add check for Conversion of Big IN Predicates Into Subqueries optimization
conditions.

Enabled derivedTableOptimization() for group by and derived handlers.

Disabled Conversion of Big IN Predicates Into Subqueries optimization.

Disabled most of optimizer_flags for now.

RowGroup + operator now correctly sets useStringTable flag that
instructs code to check StringStore instead of plain data buffer.
2019-08-01 14:29:55 -04:00
Roman Nozdrin
6fd5b2f22d MCOL-2178 Merging with 10.4
SELECT_LEX had been moved in THD so changed all references.
        Avoid writing CS decimal scales into MDB decimal fields
            d-only dec attribute. WIP
        Replaced infinidb_vtable with a singleton MIGR.
        Merged with MCOL-2121.
        Added new wsrep include paths needed by UDaF code.
        Removed .vcxproj from Connector code.
2019-08-01 12:54:17 -04:00
Roman Nozdrin
06696f596a MCOL-1101 Add plugin variables to replace the legacy system vars.
Legacy system vars with names infinidb_* was preserved for
    backward compatibility and they will be used if
    columnstore_use_legacy_vars variable is set.

    Remove unused structure and plugin variable.
2019-02-18 16:13:50 +03:00