Server expands ut8_XXX aliases to utf8mb3_XXX or utf8mb4_XXX depending
on the UTF8_IS_UTF8MB3 setting in the OLD_MODE environment variable.
Server already has the necessary code implemented in the get_utf8_flag()
method of class THD. There are several uses of this flag and all we have
to do to be in line with server is to use it.
This patch does that for DDL as work on MCOL-5705 uncovered some
problems in that area.
The fix is simple: enable subtotals in single-phase aggregation and
disable parallel processing when there are subtotals and aggregation is
single-phase.
The purpose of this changeset is to obtain list of partitions from
SELECT_LEX structure and pass it down to joblist and then to
CrossEngineStep to pass to InnoDB.
This patch introduces an internal aggregate operator SELECT_SOME that
is automatically added to columns that are not in GROUP BY. It
"computes" some plausible value of the column (actually, last one
passed).
Along the way it fixes incorrect handling of HAVING being transferred
into WHERE, window function handling and a bit of other inconsistencies.
The most important fix here is the fix of possible buffer overrun in
DATEFORMAT() function. A "%W" format, repeated enough times, would
overflow the 256-bytes buffer for result. Now we use ostringstream to
construct result and we are safe.
Changes in date/time projection functions made me fix difference between
us and server behavior. The new, better behavior is reflected in changes
in tests' results.
Also, there was incorrect logic in TRUNCATE() and ROUND() functions in
computing the decimal "shift."
* MCOL-4234: improve GROUP BY and ORDER BY interaction (#3194)
This patch fixes the problem in MCOL-4234 and also generally improves
behavior of GROUP BY.
It does so by introducing a "dummy" aggregate and by wrapping columns
into it. This allows for columns that are not in GROUP BY to be used
more freely, for example, in SELECT * FROM tbl GROUP BY col - all
columns that are not "col" will be wrapped into an aggregate and query
will proceed to execution.
The dummy aggregate itself does nothing more than remember last value
passed into it.
There also an additional error message that tries to explain what types
of expressions can be wrapped into an aggregate.
* MCOL-5772: incorrect ORDER BY ordering for a columns not in GROUP BY (#3214)
When ORDER BY column is not in GROUP BY, is not an aggregate and there
is a SELECT column that is also not an aggregate, there was a problem:
ordering happened on the SELECTed column, not ORDERed one.
This patch fixes that particular problem and also performs some tidying
around newly added aggregate.
---------
Co-authored-by: Leonid Fedorov <79837786+mariadb-LeonidFedorov@users.noreply.github.com>
* MCOL-5328: PCRE based regexp regexp_substr regexp_instr regexp_replace
* Add qa test for MCOL-5328
---------
Co-authored-by: Susil Behera <susil.behera@mariadb.com>
The UPDATE statement wrote NULL when the column set is DATETIME and
value is '0000-00-00 00:00:00'. The problem was inside WriteEngine's
handling of UPDATE statements and this is where heart of change is.
Other changes are related to some obsolete data structures in DML/DDL
handling that just hanging around there, doing nothing.
This is a fix of logging subsystem, nothing else.
The old code expanded an argument into string and advanced too little
and, if expansion contained argument's index, it expanded it again. And
again.
Fixes MCOL-5643.
The problem was that different views with same column names in GROUP BY
and on the SELECT clause produced an error about "projection column is
not an aggergate neither in GROUP BY list."
This was due to incorrect search in expressions's list that lead to
duplicate columns in GROUP BY list.
JSON functions were implemented violating an assumption of their
pureness, as they should not have any state. This concrete patch
fixes implementation of JSON_VALUE function.
We add intermediate calculations in int128_t when target is UBIGINT and
check for overflow before converting into the UBIGINT. This is so
because we can overflow on addition and multiplication, with (some)
signed operands or both unsigned.
Adds a special column which helps to differentiate data and rollups of
various depts and a simple logic to row aggregation to add processing of
subtotals.
1. Extend the calpontsys.syscolumn system catalog table
with a new column, 'charsetnum'.
'charsetnum' field is set to the 'number' member of the
'charset_info_st' struct defined in the server in m_ctype.h.
For CHAR/VARCHAR/TEXT column types, 'charset_info_st' is
initialized to the charset/collation of the column, which
is set at the column-level or at the table-level in the DDL.
For BLOB/VARBINARY binary column types, 'charset_info_st' is
initialized to my_charset_bin (charsetnum=63).
For all other column types, charsetnum is set to 0.
2. Add support for the newly added 'charsetnum' column in the
automatic system catalog upgrade logic in dbbuilder.
For existing table definitions, charsetnum for the column is
defaulted to 0.
3. Add MTR test case that creates a few table definitions with
a range of charset/collation combinations and queries the
calpontsys.syscolumn system catalog table with the charsetnum
field for the columns in the table DDLs.
feat(charset)!: utf8 is a new charset default and utf8_general_ci is a new collation default in the engine configuration file shipped
---------
Co-authored-by: Leonid Fedorov <leonid.fedorov@mariadb.com>
Co-authored-by: mariadb-DanielLee <daniel.lee@mariadb.com>
This patch improves handling of NULLs in textual fields in ColumnStore.
Previously empty strings were considered NULLs and it could be a problem
if data scheme allows for empty strings. It was also one of major
reasons of behavior difference between ColumnStore and other engines in
MariaDB family.
Also, this patch fixes some other bugs and incorrect behavior, for
example, incorrect comparison for "column <= ''" which evaluates to
constant True for all purposes before this patch.
Added logical transformation of the execplan::ParseTrees with the taking out the common factor in expression of the form "(A and B) or (A and C)" for the purposes of passing a TPCH 19 query.
Co-authored-by: Leonid Fedorov <leonid.fedorov@mariadb.com>
1. In TupleUnion::writeNull(), add the missing switch case for
wide decimal with 16bytes column width.
2. MCOL-5432 Disable complete/partial pushdown of UNION operation
if the query involves an ORDER BY or a LIMIT clause, until
MCOL-5222 is fixed. Also add MTR test cases for this.
When a UNION operation involving DECIMAL datatypes with scale and digits
before the decimal exceeds the currently supported maximum precision
of 38, we throw an error to the user:
"MCS-2060: Union operation exceeds maximum DECIMAL precision of 38".
This is until MCOL-5417 is implemented where ColumnStore will have
full parity with MariaDB server in terms of maximum supported DECIMAL
precision and scale of 65 and 38 digits respectively.
Disable check for correlated subqueries, basically those types of queries transforms
to join (aggr(table2), table1), table2) and post join scalar filter.
* Sort test result so the test case would pass
* Server message has been changes
* Added schema name in query for rows in test case only. Also use lower case schema name
* Changed database to lower case
* Run test case in its own database to avoid table already exists error
Co-authored-by: root <root@rocky8.localdomain>
This patch fixs the reported JIRA issue MCOL 5205, which consists of a wrong union type from two input Int types. The bug results in wrong unioned answers in CS. The fix includes more INT case discussions. Additionaly, this patch provides detailed unit tests for correctness in UNION processing with Int.
Signed-off-by: Jigao Luo <luojigao@outlook.com>