1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-04-20 09:07:44 +03:00

143 Commits

Author SHA1 Message Date
benthompson15
923bbf4033
MCOL-1356: Add convert_tz (#2099) 2021-08-19 17:47:10 -05:00
benthompson15
91945fe271 Fix warnings for vla, unused variables. 2021-07-14 20:08:46 -05:00
Denis Khalikov
dc51dbf6cf [MCOL-4786] Fix filter comparison.
Compare ParseTree by dereferencing pointers.
2021-07-12 19:18:02 +03:00
Denis Khalikov
adace6e0c7 MCOL-4786 Fix wrong comparison for the filters.
Fix wrong comparison for the filters while creating case function.
2021-07-09 12:18:26 +03:00
Leonid Fedorov
f81f743282 Replace underlying type for avg and sum for int types from long double to wide decimal 2021-07-08 17:04:43 +00:00
Gagan Goel
8520f87237 MCOL-641 Cleanup. 2021-07-06 09:01:49 +00:00
David.Hall
237cad347f
MCOL-4758 Limit LONGTEXT and LONGBLOB to 16MB (#1995)
MCOL-4758 Limit LONGTEXT and LONGBLOB to 16MB

Also add the original test case from MCOL-3879.
2021-07-05 02:09:41 -04:00
Roman Nozdrin
a4aecc120e
Merge pull request #2006 from tntnatbry/fix-const-scalar-subselect
Fixes for queries containing constant scalar subselects in the WHERE clause.
2021-07-02 20:15:09 +03:00
Gagan Goel
8d0ca55495 Fixes for queries containing constant scalar subselects in the WHERE clause.
For queries of the form:
SELECT col1 FROM t1 WHERE col2 = (SELECT 2);

We fix the execution plan which earlier had an empty filters
expression. For this query, we now build a SimpleFilter with a
SimpleColumn and a ConstantColumn as the LHS and the RHS operands
respectively.

For queries of the form:
SELECT ... WHERE col1 NOT IN (SELECT <const_item>);

The execution plan earlier built a SimpleFilter with an "=" as
the predicate operator of the filter. We fix this by assigning
the correct "<>" operator instead.
2021-07-02 16:40:30 +00:00
Roman Nozdrin
325bb6c9e0
Merge pull request #1986 from tntnatbry/MCOL-1482
MCOL-1482 An UPDATE operation on a non-ColumnStore table involving a cross-engine join
2021-07-01 14:25:32 +03:00
David.Hall
132146b9c8
Mcol 3738 Allow COUNT(DISTINCT to have multiple parms) (#2002)
* MCOL-3738 allow COUNT(DISTINCT) multiple parameters
Changes in the way tupleaggregatestep sets up the aggregate arrays.

* MCOL-3738 mtr test
2021-06-28 20:14:44 +03:00
Gagan Goel
49255f5cbd MCOL-1482 An UPDATE operation on a non-ColumnStore table involving a
cross-engine join with a ColumnStore table errors out.

ColumnStore cannot directly update a foreign table. We detect whether
a multi-table UPDATE operation is performed on a foreign table, if so,
do not create the select_handler and let the server execute the UPDATE
operation instead.
2021-06-25 15:27:54 +00:00
Gagan Goel
7c8b502dc2 Fix regression in a query involving an aggregate function on a
non-wide decimal column in the HAVING clause.

In buildAggregateColumn(), if an aggregate function (such as avg)
is applied on a non-wide decimal column, we were setting the precision
of the resulting column as -1. This later down in the execution got
converted to 255 as in some cases, precision is stored as uint8_t.
The predicate operations on a DECIMAL column has logic that uses
the wide Decimal::s128value field if precision > 18. This logic incorrectly
used the Decimal::s128value instead of the correct value stored in the
narrow Decimal::value field, since precision of the Decimal column
was 255. The fix is to set the aggregate column precision to
datatypes::INT64MAXPRECISION (18) in buildAggregateColumn() when the
aggregate is applied on a non-wide decimal column.

This commit also partially fixes -Wstrict-aliasing GCC warnings.
2021-06-22 11:11:34 +00:00
Gagan Goel
e0d2a21cb9 MCOL-4665 Move outer join to inner join conversion into the engine.
This is a subtask of MCOL-4525 Implement select_handler=AUTO.

Server performs outer join to inner join conversion using simplify_joins()
in sql/sql_select.cc, by updating the TABLE_LIST::outer_join variable.
In order to perform this conversion, permanent changes are made in some
cases to the SELECT_LEX::JOIN::conds and/or TABLE_LIST::on_expr.
This is undesirable for MCOL-4525 which will attemp to fallback and execute
the query inside the server, in case the query execution fails in ColumnStore
using the select_handler.

For a query such as:
  SELECT * FROM t1 LEFT JOIN t2 ON expr1 LEFT JOIN t3 ON expr2
In some cases, server can update the original SELECT_LEX::JOIN::conds
and/or TABLE_LIST::on_expr and create new Item_cond_and objects
(e.g. with 2 Item's expr1 and expr2 in Item_cond_and::list).
Instead of making changes to the original query structs, we use
gp_walk_info::tableOnExprList and gp_walk_info::condList. 2 Item's,
expr1 and expr2, in the condList, mean Item_cond_and(expr1, expr2), and
hence avoid permanent transformations to the SELECT_LEX.

We also define a new member variable
ha_columnstore_select_handler::tableOuterJoinMap
which saves the original TABLE_LIST::outer_join values before they are
updated. This member variable will be used later on to restore to the original
state of TABLE_LIST::outer_join in case of a query fallback to server execution.

The original simplify_joins() implementation in the server also performs a
flattening of the JOIN nest, however we don't perform this operation in
convertOuterJoinToInnerJoin() since it is not required for ColumnStore.
2021-06-03 11:13:19 +00:00
Alexander Barkov
9608533d92 MCOL-4734 Compilation failure: MariaDB-10.6 + ColumnStore-develop
mcsconfig.h and my_config.h have the following
pre-processor definitions:

1. Conflicting definitions coming from the standard cmake definitions:
- PACKAGE
- PACKAGE_BUGREPORT
- PACKAGE_NAME
- PACKAGE_STRING
- PACKAGE_TARNAME
- PACKAGE_VERSION
- VERSION

2. Conflicting definitions of other kinds:
- HAVE_STRTOLL - this is a dirt in MariaDB headers.
  Should be fixed in the server code. my_config.h erroneously
  performs "#define HAVE_STRTOLL" instead of "#define HAVE_STRTOLL 1".
  in some cases. The former is not CMake compatible style. The latter is.

3. Non-conflicting definitions:
  Otherwise, mcsconfig.h and my_config.h should be mutually compatible,
  because both are generated by cmake on the same host machine. So
  they should have exactly equal definitions like "HAVE_XXX", "SIZEOF_XXX", etc.

Observations:
- It's OK to include both mcsconfig.h and my_config.h providing that we
  suppress duplicate definition of the above conflicting types #1 and #2.
- There is no a need to suppress duplicate definitions mentioned in #3,
  as they are compatible!
- my_sys.h and m_ctype.h must always follow a CMake configuation header,
  either my_config.h or mcsconfig.h (or both).
  They must never be included without any preceeding configuration header.

This change make sure that we resolve conflicts by:
- either disallowing inclusion of mcsconfig.h and my_config.h
  at the same time
- or by hiding conflicting definitions #1 and #2
  (with their later restoring).
- also, by making sure that my_sys.h and m_ctype.h always follow
  a CMake configuration file.

Details:
- idb_mysql.h can now only be included only after my_config.h
  An attempt to use idb_mysql.h with mcsconfig.h instead of
  my_config.h is caught by the "#error" preprocessor directive.

- mariadb_my_sys.h can now be only included after mcsconfig.h.
  An attempt to use mariadb_my_sys.h without mcscofig.h
  (e.g. with my_config.h) is also caught by "#error".

- collation.h now can now be included in two ways.
  It now has the following effective structure:

    #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H)
    //  Remember current conflicting definitions on the preprocessor stack
    //  Undefine current conflicting definitions
    #endif
    #include "mcsconfig.h"
    #include "m_ctype.h"
    #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H)
    #    Restore conflicting definitions from the preprocessor stack
    #endif

  and can be included as follows:

  a. using only mcsconfig.h as a configuration header:

    // my_config.h must not be included so far
    #include "collation.h"

  b. using my_config.h as the first included configuration file:

    #define PREFER_MY_CONFIG_H // Force conflict resolution
    #include "my_config.h"     // can be included directly or indirectly
    ...
    #include "collation.h"

Other changes:

- Adding helper header files
     utils/common/mcsconfig_conflicting_defs_remember.h
     utils/common/mcsconfig_conflicting_defs_restore.h
     utils/common/mcsconfig_conflicting_defs_undef.h
  to perform conflict resolution easier.

- Removing `#include "collation.h"` from a number of files,
  as it's automatically included from rowgroup.h.

- Removing redundant `#include "utils_utf8.h"`.
  This change is not directly related to the problem being fixed,
  but it's nice to remove redundant directives for both collation.h
  and utils_utf8.h from all the files that do not really need them.
  (this change could probably have gone as a separate commit)

- Changing my_init() to MY_INIT(argv[0]) in the MCS services sources.
  After the fix of the complitation failure it appeared that ColumnStore
  services compiled with the debug build crash due to recent changes in
  safemalloc. The crash happened in strcmp() with `my_progname` as an argument
  (where my_progname is a mysys global variable). This problem should
  probably be fixed on the server side as well to avoid passing NULL.
  But, the majority of MariaDB executable programs also use MY_INIT(argv[0])
  rather than my_init(). So let's make MCS do like the other programs do.
2021-05-25 12:34:36 +04:00
Roman Nozdrin
c6db1f9191
Merge pull request #1825 from tntnatbry/MCOL-4617
MCOL-4617 Move in-to-exists predicate creation and injection into the engine.
2021-05-07 13:33:02 +03:00
Gagan Goel
22c7fb7c01 MCOL-4680 FROM subquery containing nested joins returns an error.
Main theme of the patch is to fix joins processing in the plugin
code. We now use SELECT_LEX::top_join_list and process the nested
joins recursively, instead of SELECT_LEX::table_list struct which
we earlier used to build the join filters. The earlier approach
did not process certain nested join ON expressions, causing certain
queries to incorrectly error out such as that described in MCOL-4680.

In addition, some legacy code is also removed.
2021-05-03 06:28:27 +00:00
Gagan Goel
f167a6e505 MCOL-4617 Move in-to-exists predicate creation and injection into the engine.
We earlier leveraged the server functionality provided by

Item_in_subselect::create_in_to_exists_cond and
Item_in_subselect::inject_in_to_exists_cond

to create and inject the in-to-exists predicate into an IN
subquery's JOIN struct. With this patch, we leave the IN subquery's
JOIN unaltered and instead directly perform this predicate creation
and injection into ColumnStore's select execution plan.
2021-04-30 07:57:00 +00:00
Alexander Barkov
c67f70f385 MCOL-4689 [135B blob data] in PrimPrim jounralctl records 2021-04-22 10:46:26 +04:00
Roman Nozdrin
1f46baa980
Merge pull request #1827 from cvicentiu/fetch-first-refactor
MCOL-4645 Update columnstore usage of select_lex
2021-04-21 14:24:43 +03:00
Roman Nozdrin
f4b02a7aca
Merge pull request #1828 from tntnatbry/MCOL-4543-4589
MCOL -4543/MCOL-4589 Subquery optimization
2021-04-14 13:50:46 +03:00
Alexander Barkov
362bfcd15e MCOL-4361 Replace pow(10.0, (double)scale) expressions with a static dictionary lookup. 2021-04-09 12:41:04 +04:00
Gagan Goel
8a03e6c7d1 MCOL-4543 Subquery optimization.
For a query of the form:

SELECT COUNT(c2) FROM (SELECT * FROM t1) q;

where t1 contains 10 columns c1, c2, ... , c10.

We currently create an intermediate RowGroup in ExeMgr with
a row of the form (1, c2_value1, 1, 1, 1, 1, 1, 1, 1, 1), i.e.
for all the columns of the subquery which are not referenced in
the outer query, we substitute a constant value, which is wasteful.

With this optimization, we are trimming the RowGroup to a row
of the form (1, c2_value1). This can have non-trivial query
execution time improvements if the subquery contains large number
of columns (such as a "select *" on a very wide table) and the outer
query is only referencing a subset of these columns with lower
index values from the subquery (as an example, c1 or c2 above).
That is, the current limitation of this optimization is we are not
removing those non-referenced subquery columns (c1 in the query above)
which are to the left of a referenced column.
2021-03-29 11:56:04 +00:00
Vicențiu Ciorbaru
0643125426 Update columnstore usage of select_lex
After the cleanup work done for FETCH FIRST ... WITH TIES
SELECT_LEX members select_limit, explicit_limit and offset_limit are now moved
to SELECT_LEX::limit_params.
2021-03-28 16:07:32 +03:00
Gagan Goel
abf45bf46c MCOL-4493 Add ON expressions for WHERE processing when the JOIN
type is not LEFT/RIGHT.

In buildOuterJoin(), do not add ON expressions for WHERE
processing when the JOIN type is not LEFT/RIGHT.

Test cases to check correct processing of INNER JOIN ON expressions
with possible/impossible WHERE conditions are added for
  1. One side of the LEFT JOIN being INNER JOIN.
  2. One side of the LEFT JOIN being an INNER JOIN inside an INNER JOIN.
  3. Both sides of the LEFT JOIN being an INNER JOIN.
2021-02-05 09:10:38 +00:00
Roman Nozdrin
b0f97611fc
Merge pull request #1691 from drrtuy/MCOL-4465_MCOL-4466_MCOL-4452
Mcol 4465 mcol 4466 mcol 4452
2020-12-23 16:28:40 +03:00
Roman Nozdrin
994d9a5125 MCOL-4465 Use a proper ColType for UDAF in a projection RowGroup 2020-12-23 10:32:11 +00:00
David Hall
d35002fb65 MCOL-4263 return int for func_floor on datetime
For TIMESTAMP, it should do similar. However, it didn't work. For some reason, MDB has the function set as DATETIME, which for cs, isn't the same thing. Added a kludge to ha_mcs_execplan.cpp to handle it.
2020-12-16 16:23:21 -06:00
Gagan Goel
a159f8a0b6 MCOL-4188 Regression fixes for MCOL-641.
1. Add wide decimal support to AggregateColumn::evaluate
and TreeNode::getDecimalVal().
2. Use the pm aggregate attributes to determine um aggregate
attributes in TupleAggregateStep::prep2PhasesAggregate.
2020-11-30 13:49:05 -05:00
Alexander Barkov
129d5b5a0f MCOL-4174 Review/refactor frontend/connector code 2020-11-18 13:53:15 +00:00
Gagan Goel
1f4a781704 MCOL-641 Fixes for arithmetic operations.
1. Perform type promotion to wide decimal if the result
   of an arithmetic operation has a precision > 18.
2. Only set the decimal width of an arithmetic operation to wide
   if both the LHS and RHS of the operation are decimal types.
2020-11-18 13:52:20 +00:00
Roman Nozdrin
8de9764f84 MCOL-4172 Add support for wide-DECIMAL into statistical aggregate and regr_* UDAF functions
The patch fixes wrong results returned when multiple UDAF exist in projection

aggregate over wide decimal literals now works
2020-11-18 13:52:20 +00:00
Gagan Goel
6aea838360 MCOL-641 Add support for functions (Part 2). 2020-11-18 13:51:55 +00:00
Roman Nozdrin
e88cbe9bc1 MCOL-641 Simple aggregates support: min, max, sum, avg for wide-DECIMALs. 2020-11-18 13:51:25 +00:00
Gagan Goel
cfe35b5c7f MCOL-641 Add support for functions (Part 1). 2020-11-18 13:51:25 +00:00
Gagan Goel
554c6da8e8 MCOL-641 Implement int128_t versions of arithmetic operations and add unit test cases. 2020-11-18 13:47:45 +00:00
Roman Nozdrin
b5534eb847 MCOL-641 Refactored MultiplicationOverflowCheck but it still has flaws.
Introduced fDecimalOverflowCheck to enable/disable overflow check.

Add support into a FunctionColumn.

Low level scanning crashes on medium sized data sets.
2020-11-18 13:47:45 +00:00
Gagan Goel
74b64eb4f1 MCOL-641 1. Add support for int128_t in ParsedColumnFilter.
2. Set Decimal precision in SimpleColumn::evaluate().
3. Add support for int128_t in ConstantColumn.
4. Set IDB_Decimal::s128Value in buildDecimalColumn().
5. Use width 16 as first if predicate for branching based on decimal width.
2020-11-18 13:47:45 +00:00
drrtuy
0ff0472842 MCOL-641 sum() now works with DECIMAL(38) columns.
TupleAggregateStep class method and buildAggregateColumn() now properly set result data type.

doSum() now handles DECIMAL(38) in approprate manner.

Low-level null related methods for new binary-based datatypes now handles magic values for
binary-based DT.
2020-11-18 13:47:01 +00:00
drrtuy
54c152d6c8 MCOL-641 This commit introduces templates for DataConvert and RowGroup methods. 2020-11-18 13:47:01 +00:00
Alexey Antipovsky
b25fee320a Remove variable-length arrays (-Wvla) 2020-11-17 15:03:10 +03:00
David Hall
35c4b66a67 MCOL-4144 Enable lower_case_table_names
Create tables and schemas with lower case name only if the flag is set.
During operations, convert to lowercase in plugin. Byt the time a query gets to ExeMgr, DDLProc etc., everything must be lower case if the flag is set, and undisturbed if not.
2020-09-24 15:21:13 -05:00
Gagan Goel
a117786027 MCOL-4282 Enable Select Handler for Prepared Statements
This patch enables select handler for executing prepared
statements. Most importantly, we are now activating a
persistent arena which will allocate any new items in a
permanent MEMROOT for prepared statements and stored procedures.
Refer to JOIN::optimize_inner() for details.

In processWhere(), we now use SELECT_LEX::prep_where in case
we are executing a prepared statement, as this is where the saved
WHERE clause is stored for prepared statement processing.

In addition, we also disable derived handler for prepared
statements.
2020-09-11 16:35:51 -04:00
Gagan Goel
a35be51cc6
Merge pull request #1433 from dhall-MariaDB/MCOL-3464
MCOL-3464 don't dereference a NULL String.
2020-09-09 12:25:20 -04:00
David Hall
890846fa8a MCOL-3464 don't dereference a NULL String. 2020-09-04 16:31:20 -05:00
David Hall
a7fef967c4 MCOL-4108 For functions not found, send not supported error 2020-09-04 15:27:47 -05:00
Gagan Goel
03c50eabee
Revert "MCOL-3827 Optimize out sort on SubQuery in Select" 2020-08-19 19:23:55 -04:00
Gagan Goel
f1759e6560
Merge pull request #1367 from dhall-MariaDB/MCOL-3827
MCOL-3827 Optimize out sort on SubQuery in Select
2020-08-18 14:54:36 -04:00
David Hall
e7b8abfdb9 MCOL-3827 Optimize out sort on SubQuery in Select 2020-08-17 14:40:22 -05:00
David Hall
478426c8bf MCOL-3814 add back lower to viewName assignment 2020-08-12 10:00:04 -05:00