This patch:
1. Properly processes situation when pm join result count is exceeded.
2. Adds session variable 'columnstore_max_pm_join_result_count` to control the limit.
This patch:
1. Handles corner case when the bucket exceeded the memory limit, but we cannot redistribute the data in this bucket into new buckets based on a hash algorithm, because the rows have the same values.
2. Adds force option for disk join step.
3. Add a option to contol the depth of the partition tree.
in aggregation code
The patch disables padding that forces hasher to calculate over the whole 2k buffer. This patch also moves hashing code
into the common place where it belongs.
Introduced UDF and stored prodecure.
usage:
set columnstore_s3_key='<s3_key>';
set columnstore_s3_secret='<s3_secret>';
set columnstore_s3_region='region';
and then use UDF
select columnstore_dataload("<tablename>", "<filename>", "<bucket>", "<db_name>");
for UDF db_name can be ommited, then current connection db will be used
or stored function
call calpontsys.columnstore_load_from_s3("<tablename>", "<filename>", "<bucket>", "<db_name>");
DML statements executed on the primary node in a ColumnStore
cluster do not need to be written to the primary's binlog. This
is due to ColumnStore's distributed storage architecture.
With this patch, we disable writing to binlog when a DML statement
(INSERT/DELETE/UPDATE/LDI/INSERT..SELECT) is performed on a ColumnStore
table. HANDLER::external_lock() calls are used to
1. Turn OFF the OPTION_BIN_LOG flag
2. Turn ON the OPTION_BIN_TMP_LOG_OFF flag
in THD::variables.option_bits during a WRITE lock call.
THD::variables.option_bits is restored back to the original state
during the UNLOCK call in HANDLER::external_lock().
Further, isDMLStatement() function is added to reduce code verbosity
to check if a given statement is a DML statement.
Note that with this patch, not writing to primary's binlog means
DML replication from a ColumnStore cluster to another ColumnStore
cluster or to another foreign engine will not work.
1. Add a new system variable, columnstore_use_cpimport_for_cache_inserts,
that when set to ON, uses cpimport for the cache flush into ColumnStore.
This variable is set to OFF by default. By default, we perform batch inserts
for the cache flush.
2. Disable DMLProc logging of the SQL statement text for the cache
flush operation in case of batch inserts. Under certain heavy loads
involving INSERT statements, this logging becomes a bottleneck for
the cache flush, causing subsequent inserts into the cache table to hang.
* Adds CompressInterfaceLZ4 which uses LZ4 API for compress/uncompress.
* Adds CMake machinery to search LZ4 on running host.
* All methods which use static data and do not modify any internal data - become `static`,
so we can use them without creation of the specific object. This is possible, because
the header specification has not been modified. We still use 2 sections in header, first
one with file meta data, the second one with pointers for compressed chunks.
* Methods `compress`, `uncompress`, `maxCompressedSize`, `getUncompressedSize` - become
pure virtual, so we can override them for the other compression algos.
* Adds method `getChunkMagicNumber`, so we can verify chunk magic number
for each compression algo.
* Renames "s/IDBCompressInterface/CompressInterface/g" according to requirement.
This feature allows a query execution to fallback to the server,
in case query execution using the select_handler (SH) fails. In case
of fallback, a warning message containing the original reason for
query failure using SH is generated.
To accomplish this task, SH execution is moved to an earlier step when
we create the SH in create_columnstore_select_handler(), instead of the
previous call to SH execution in ha_columnstore_select_handler::init_scan().
This requires some pre-requisite steps that occur in the server in
JOIN::optimize() and JOIN::exec() to be performed before starting SH execution.
In addition, missing test cases from MCOL-424 are also added to the MTR suite,
and the corresponding fix using disable_indices_for_CEJ() is reverted back
since the original fix now appears to be redundant.
This patch:
1. Removes the option to declare uncompressed columns (set columnstore_compression_type = 0).
2. Ignores [COMMENT '[compression=0] option at table or column level (no error messages, just disregard).
3. Removes the option to set more than 2 extents per file (ExtentsPreSegmentFile).
4. Updates rebuildEM tool to support up to 10 dictionary extent per dictionary segment file.
5. Adds check for `DBRootStorageType` for rebuildEM tool.
6. Renamed rebuildEM to mcsRebuildEM.
We are creating a new read-only system variable, columnstore_cache_inserts,
to enable/disable the cache. When this variable is set at server start up,
any table created with engine=columnstore will also create the corresponding
cache table in Aria engine for performing inserts.
It is important to note that a ColumnStore table created with this
option unset should not be queried when the server is restarted with
the option set, as this will most likely result in query failures.
new value, ALWAYS, which invokes cpimport for LDI and INSERT..SELECT
from within and outside a transaction.
Default value of the session variable, ON, remains unchanged.
Disabled 4th if block in buildOuterJoin to handle non-optimized MDB query
structures.
Broke getSelectPlan into pieces: processFrom, processWhere.
MCOL-3593 UNION processing depends on two flags isUnion that comes as
arg of getSelectPlan and unionSel that is a local variable in
getSelectPlan. Modularization of getSelectPlan broke the mechanizm.
This patch is supposed to partially fix it.
MCOL-3593 Removed unused if condition from buildOuterJoin that allows
unsupported construct subquery in ON expression.
Fixed an improper if condition that ignors tableMap entries w/o condition
in external_lock thus external_lock doesn't clean up when the query
finishes.
Fixed wrong logging for queries processed in tableMode. Now rnd_init
properly sends queryText down to ExeMgr to be logged.
MCOL-3593 Unused attribute FromSubQuery::fFromSub was removed.
getSelectPlan has been modularized into: setExecutionParams,
processFrom, processWhere. SELECT, HAVING, GROUP BY, ORDER BY
still lives in getSelectPlan.
Copied optimization function simplify_joins_ into our pushdown
code to provide the plugin code with some rewrites from MDB it
expects.
The columnstore_processing_handlers_fallback session variable
has been removed thus CS can't fallback from SH to partial
execution paths, e.g. DH, GBH or plugin API.
MCOL-3602 Moved MDB optimizer rewrites into a separate file.
Add SELECT_LEX::optimize_unflattened_subqueries() call to fix IN
into EXISTS rewrite for semi-JOINs with subqueries.
disable_indices_for_CEJ() add index related hints to disable
index access methods in Cross Engine Joins.
create_SH() now flattens JOIN that has both physical tables and
views. This fixes most of views related tests in the regression.
MCOL-894 Add default values in Compare and CSEP ctors to activate UTF-8 sorting
properly.
MCOL-894 Unit tests to build a framework for a new parallel sorting.
MCOL-894 Finished with parallel workers invocation.
The implementation lacks final aggregation step.
MCOL-894 TupleAnnexStep's init and destructor are now parallel execution aware.
Implemented final merging step for parallel execution finalizeParallelOrderBy().
Templated unit test to use it with arbitrary number of rows, threads.
Reuse LimitedOrderBy in the final step
MCOL-894 Cleaned up finalizeParallelOrderBy.
MCOL-894 Add and propagate thread variable that controls a number of threads.
Optimized comparators used for sorting and add corresponding UTs.
Refactored TupleAnnexStep::finalizeParallelOrderByDistinct.
Parallel sorting methods now preallocates memory in batches.
MCOL-894 Fixed comparator for StringCompare.
This patch fixes:
MCOL-3557 - Row Based Replication events to ColumnStore tables will no
longer cause MariaDB to crash, it will error instead.
MCOL-3556 - Remove the Columnstore.xml variable to turn on ColumnStore
tables applying replication events and instead make it a system variable
that can be set in my.cnf called "columnstore_replication_slave". This
allows it to be set per-UM.
SH query execution migrated from SH::init() into create_SH().
There is a session variable columnstore_processing_handlers_fallback
that allows to fallback to DH, GBH if SH fails. DH now uses semantic
tree check for unsupported features to allow to fallback to GBH or
storage API.
Fixes GBH related bug when create_GBH() returns a handler for
queries with impossible WHERE/HAVING.
Fixed bug in FromSubquery::transform() where isUnion is set to true.
Enabled RTTI b/c server team enabled it for MDB.
Removed unused code supposed to be used with vtable.
to avoid accedental crashes.
Add check for Conversion of Big IN Predicates Into Subqueries optimization
conditions.
Enabled derivedTableOptimization() for group by and derived handlers.
Disabled Conversion of Big IN Predicates Into Subqueries optimization.
Disabled most of optimizer_flags for now.
RowGroup + operator now correctly sets useStringTable flag that
instructs code to check StringStore instead of plain data buffer.
SELECT_LEX had been moved in THD so changed all references.
Avoid writing CS decimal scales into MDB decimal fields
d-only dec attribute. WIP
Replaced infinidb_vtable with a singleton MIGR.
Merged with MCOL-2121.
Added new wsrep include paths needed by UDaF code.
Removed .vcxproj from Connector code.
Legacy system vars with names infinidb_* was preserved for
backward compatibility and they will be used if
columnstore_use_legacy_vars variable is set.
Remove unused structure and plugin variable.