- Checking that the key expression is compatible with the INDEX BY data type
for assignment in expressions:
assoc_array_variable(key_expr)
assoc_array_variable(key_expr).field
in all contexts: SELECT, assignment target, INTO target.
Raising an error in case it's not compatible.
- Disallowing non-constant expressions as a key,
as the key is evaluated during the fix_fields() time.
- Disallowing stored functions as a key:
assoc_array(stored_function())
assoc_array(stored_function()).field
The underlying MariaDB code is not ready to call a stored function
during the fix_fields() time. This will be fixed in a separate MDEV.
- Removing Assoc_array_data's move constructor.
Using the usual constructor instead.
- Setting m_key.thread_specific and m_value.thread_specific to true
in the Assoc_array_data constructor. This is needed to get assoc array
element data counted by the @@session.memory_used status variable.
Adding DBUG_ASSERTs to make sure the thread_specific flag never
disappears in Assoc_array_data members.
- Removing my_free(item) from Field_assoc_array::element_by_key.
It was a remnant of an earlier patch version.
In the current patch version all Items behind an assoc array are
created on a mem_root. It's wrong to use my_free() with them.
- Adding a helper method Field_assoc_array::assoc_tree_search()
- Fixing assoc_array_var.delete() to work as a procedure
rather than a function. It does not need SELECT/DO any more.
- Fixing the crash in a few ctype_xxx tests, caused by the grammar change.
- Fixing compilation failure on Windows
- Adding a new method LEX::set_field_type_udt_or_typedef()
and removing duplicate code from sql_yacc.yy
- Renaming the grammar rule field_type_all_with_composites to
field_type_all_with_typedefs
- Removing the grammar rule assoc_array_index_types.
Changing the grammar to "INDEX_SYM BY field_type".
Removing the grammar rule field_type_all_with_record.
Allow field_type_all_with_typedefs as an assoc array element.
Catching wrong index and element data types has been moved to
Type_handler_assoc_array::Column_definition_set_attributes().
It raises an SQL error on things like:
* assoc array of assoc arrays in TABLE OF
* an unsupported index data type in INDEX BY
- Removing four methods:
* sp_type_def_list::type_defs_add_record()
* sp_type_def_list::type_defs_add_composite2()
* sp_pcontext::type_defs_declare_record()
* sp_type_def_list::type_defs_declare_composite2()
Adding two methods instead:
* sp_type_def_list::type_defs_add()
* sp_pcontext::type_defs_add()
This makes it possible to get rid of the duplicate code detecting data type
declarations with the same name in the same sp_pcontext frame.
- Adding new methods:
* LEX::declare_type_assoc_array()
* LEX::declare_type_record()
They create a type specific sp_type_def_xxx and then call the generic
sp_pcontext::type_defs_add().
- m_key_def.sp_prepare_create_field() inside
Field_assoc_array::create_fields() is now called for all key data types
(not only for integers)
- Removing the assignment of key_def->charset in
Type_handler_assoc_array::sp_variable_declarations_finalize().
The charset is now evaluated in m_key_def.sp_prepare_create_field().
- Fixing Item_assoc_array::get_key() to set the character set of the "key"
to utf8mb3 instead of binary
- Fixing Field_assoc_array::copy_and_convert_key() to set the key length
limit in terms of the character length as specified in
INDEX BY VARCHAR(N), instead of octet length. This is needed to make
keys with multi-byte characters work correctly.
Also it now raises different errors depending on the reason for the
key conversion failure:
* ER_INVALID_CHARACTER_STRING
* ER_CANNOT_CONVERT_CHARACTER
- Changing the prototype for Type_handler_composite::key_to_lex_cstring() to
virtual LEX_CSTRING key_to_lex_cstring(THD *thd,
const sp_rcontext_addr &var,
Item **key,
String *buffer) const;
* Now it returns a LEX_CSTRING, instead of getting it as an out parameter.
* Gets an sp_rcontext_addr instead of "name" and "def"
* Gets a String buffer which can be used to be passed to val_str(),
or for character set conversion purposes.
- Removing Field_assoc_array::m_key_def, as all required information
is available from Field_assoc_array::m_key_field.
In Field_assoc_array::create_fields turning m_key_def to a local variable
key_def.
- Fixing Field_assoc_array::copy_and_convert_key() to follow MariaDB coding
style: only constants can be passed by reference, non-constants should
be passed by pointer.
- Adding DBUG_ASSERTs into Type_handler_assoc_array::get_item()
and Type_handler_assoc_array::get_or_create_item() that the passed
key in "name" is well formed according to the charset of INDEX BY.
- Changing the error ER_TOO_LONG_KEY to ER_WRONG_STRING_LENGTH.
The former prints length limit in bytes, which is not applicable
for INDEX BY values, because its limit is in characters.
Also, the latter is more verbose.
- Fixing the problem that these wrong uses of an assoc array variable:
BEGIN
assoc_var;
assoc_var(1);
END;
raised a weird error message:
ERROR 1054 (42S22): Unknown column 'assoc_var' in '(null)'
Now a more readable parse error is raised.
- Adding a "Duplicate key" warning for the cases when assigning
between two assoc arrays rejects some records due to different
collations in their INDEX BY key definitions.
- Disallowing INDEX BY propagation from VARCHAR to TEXT.
The underlying code cannot handle TEXT.
Adding tests.
- Adding a helper class StringBufferKey to pass to val_str() when
a key value is evaluated.
Fixing all val_str() calls to val_str(&buffer), as the former is
not desirable.
- Fixing a wrong use of args[0]->null_value in
Item_func_assoc_array_exists::val_bool()
- Fixing a problem that using TABLE OF TEXT crashed the server.
Thanks to Iqbal Hassan for the proposed patch.
- Changes in Qualified_ident:
* Fixing the Qualified_ident constructors to get all parts as
Lex_ident_cli_st, rather than the first part as Lex_ident_cli_st
with the following parts Lex_ident_sys.
This makes the code more symmetric.
* Fixing the grammar in sql_yacc.yy accordingly.
* Fixing the data type storing the position in the client query
from "const char *" to Lex_ident_cli.
* Adding a new method Qualified_ident::is_sane().
It makes it possible to reduce the code size in sql_yacc.yy.
Thanks to Iqbal Hassan for the idea.
- Replacing qs_append() with append_ulonglong() in:
* Item_method_func::print()
* Item_splocal_assoc_array_element::print()
* Item_splocal_assoc_array_element_field::print()
These methods do not use reserve()/alloc(), so calling qs_append()
was wrong and caused a crash.
- Changing the output formats of these methods:
* Item_splocal_assoc_array_element::print()
* Item_splocal_assoc_array_element_field::print()
not to print the key two times.
Also moving the `@123` part (the variable offset) immediately
after the variable name and before the `[key]` part.
- Fixing a memory leak that happened when trying to insert a duplicate
key into an assoc array. Also adding a new "THD *" parameter to
Field_assoc_array::insert_element(). Thanks to Iqbal Hassan for the fix.
Adding a test into sp-assoc-array-ctype.test.
- In Field_assoc_array::create_fields: m_element_field->field_name is now
set for all element data types (not only for records).
This fixed a wrong variable name in warnings. Adding tests.
- Adding tests:
* Adding tests for assoc array elements in UNIONs.
* Copying from an assoc array with a varchar key
to an assoc array with a shorter varchar key.
* A relatively big associative array.
* Memory usage for x86_64.
* Package variables as assoc array keys.
* Character set conversion
* TABLE OF TEXT
* TABLE OF VARCHAR(>64k bytes) propagation to TABLE OF TEXT.
* TEXT element fields in an array of records.
* VARCHAR->TEXT propagation in elements in an array of records.
* Some more tests
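The copy_and_convert_key() item above distinguishes the character length declared in INDEX BY VARCHAR(N) from the octet length of the key. A minimal sketch of that distinction for UTF-8 keys might look like this. The names utf8_char_count() and key_fits() are hypothetical and are not part of the server code, and the sketch assumes well-formed UTF-8 input:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: count characters in a well-formed UTF-8 string
// by counting every byte that is not a continuation byte (10xxxxxx).
std::size_t utf8_char_count(const std::string &s)
{
  std::size_t n= 0;
  for (unsigned char c : s)
  {
    if ((c & 0xC0) != 0x80)
      n++;
  }
  return n;
}

// Hypothetical check: does the key fit into INDEX BY VARCHAR(max_chars)?
// The limit is compared against characters, not octets.
bool key_fits(const std::string &key, std::size_t max_chars)
{
  return utf8_char_count(key) <= max_chars;
}
```

A key made of three two-byte characters occupies six octets but still fits into VARCHAR(3) under this rule. A limit expressed in octets would have rejected it.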
This patch adds support for associative arrays in stored procedures
for sql_mode=ORACLE.
The syntax follows Oracle's PL/SQL syntax for associative arrays -
TYPE assoc_array_t IS TABLE OF VARCHAR2(100) INDEX BY INTEGER;
or
TYPE assoc_array_t IS TABLE OF record_t INDEX BY VARCHAR2(100);
where record_t is a record type.
The following functions were added for associative arrays:
- COUNT - Retrieve the number of elements within the array
- EXISTS - Check whether given key exists in the array
- FIRST - Retrieve the first key in the array
- LAST - Retrieve the last key in the array
- PRIOR - Retrieve the key before the given key
- NEXT - Retrieve the key after the given key
- DELETE - Remove the element with the given key or remove all elements
if no key is given
The arrays/elements can be initialized with the following methods:
- Constructor
e.g. array:= assoc_array_t('key1'=>1, 'key2'=>2, 'key3'=>3)
- Assignment
e.g. array(key):= record_t(1, 2)
- SELECT INTO
e.g. SELECT x INTO array(key)
TODOs:
- Nested tables are not supported yet.
e.g. TYPE assoc_array_t IS TABLE OF other_assoc_array_t INDEX BY INTEGER;
- Associative array comparisons are not supported yet.
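The semantics of the functions listed above can be modelled with an ordered map. The following is only a toy sketch: the class AssocArraySketch and its method names are hypothetical, keys are compared byte-wise rather than by the INDEX BY collation, and a "no such key" result is reported through a boolean return value (false) rather than SQL NULL:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>

// Toy model of "TABLE OF int INDEX BY VARCHAR": keys are kept ordered.
class AssocArraySketch
{
  std::map<std::string, int> m_data;
public:
  void set(const std::string &key, int value) { m_data[key]= value; }
  std::size_t count() const { return m_data.size(); }
  bool exists(const std::string &key) const { return m_data.count(key) != 0; }
  // FIRST/LAST: return false if the array is empty
  bool first(std::string *key) const
  {
    if (m_data.empty()) return false;
    *key= m_data.begin()->first;
    return true;
  }
  bool last(std::string *key) const
  {
    if (m_data.empty()) return false;
    *key= m_data.rbegin()->first;
    return true;
  }
  // PRIOR: the greatest key strictly less than "from", if any
  bool prior(const std::string &from, std::string *key) const
  {
    std::map<std::string, int>::const_iterator it= m_data.lower_bound(from);
    if (it == m_data.begin()) return false;
    *key= std::prev(it)->first;
    return true;
  }
  // NEXT: the smallest key strictly greater than "from", if any
  bool next(const std::string &from, std::string *key) const
  {
    std::map<std::string, int>::const_iterator it= m_data.upper_bound(from);
    if (it == m_data.end()) return false;
    *key= it->first;
    return true;
  }
  // DELETE with a key, and DELETE without a key
  void erase(const std::string &key) { m_data.erase(key); }
  void erase_all() { m_data.clear(); }
};
```

Iterating with first() followed by repeated next() calls visits the keys in order, which mirrors the usual FIRST/NEXT loop idiom in PL/SQL.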
- Moving the definition of "class Type_handler_row" into a new file
sql_type_row.h. Also moving *some* of its methods into sql_type_row.cc.
The rest of the methods will be moved in the patch for MDEV-34319.
Moving the definition of my_var_sp_row_field into sql_type_row.cc.
- Fixing the grammar for function_call_generic to get the first
production as "ident_cli_func" rather than "ident_func".
The upcoming patch needs to know the position of the function name
within the client query.
- Adding new data types to store data types defined by "TYPE" declarations:
* sp_type_def
* sp_type_def_list
sp_pcontext now derives from sp_type_def_list
- A new virtual method in Field:
virtual Item_field *make_item_field_spvar(THD *thd,
const Spvar_definition &def);
Using it in sp_rcontext::init_var_items().
- Fixing my_var_sp to get sp_rcontext_addr in the parameter
instead of two separate parameters (rcontext_handler + offset).
- Adding new virtual methods in my_var:
virtual bool set_row(THD *thd, List<Item> &select_list);
It's used when a select_list record is assigned to a
single composite variable, such as ROW, specified in the INTO clause.
Using it in select_dumpvar::send_data().
virtual bool check_assignability(THD *thd,
const List<Item> &select_list,
bool *assign_as_row) const;
It's used to check if the select_list is compatible with
a single INTO variable, in select_dumpvar::prepare().
- Fixing LEX methods create_outvar() to get identifiers
as Lex_ident_sys_st values instead of generic LEX_CSTRING values.
- Adding virtual methods in Type_handler:
// Used in Item_func_null_predicate::check_arguments()
virtual bool has_null_predicate() const;
// Used in LEX::sp_variable_declarations_finalize()
virtual bool sp_variable_declarations_finalize(THD *thd,
LEX *lex, int nvars,
const Column_definition &def)
const;
// Handle SELECT 1 INTO spvar;
virtual my_var *make_outvar(THD *thd,
const Lex_ident_sys_st &name,
const sp_rcontext_addr &addr,
sp_head *sphead,
bool validate_only) const;
// Handle SELECT 1 INTO spvar.field;
virtual my_var *make_outvar_field(THD *thd,
const Lex_ident_sys_st &name,
const sp_rcontext_addr &addr,
const Lex_ident_sys_st &field,
sp_head *sphead,
bool validate_only) const;
// create the value in: DECLARE var rec_t DEFAULT rec_t(1,'c');
virtual Item *make_typedef_constructor_item(THD *thd,
const sp_type_def &def,
List<Item> *arg_list) const;
- A new helper method:
Row_definition_list *Row_definition_list::deep_copy(THD *thd) const;
The patch for SYS_REFCURSOR (MDEV-20034) overrode these methods:
- Item_func_case_searched::check_arguments()
- Item_func_if::check_arguments()
to validate WHEN-style arguments (e.g. args[0] in case of IF) for being
able to return a boolean result.
However, this unintentionally removed the test for the THEN-style arguments
that they are not expressions of the ROW data type.
This led to a crash inside Type_handler_hybrid_field_type::aggregate_for_result
on a DBUG_ASSERT that arguments are not of the ROW data type.
Fix:
The fix restores blocking ROW expressions in the unsupported cases,
to avoid the DBUG_ASSERT and to raise an SQL error instead.
Blocking ROW_RESULT expressions is done per Item_func_case_expression
descendant individually, instead of blocking any ROW_RESULT arguments
at the Item_func_case_expression level.
The fix is done taking into account the upcoming patch for associative arrays
(MDEV-34319). It should be possible to pass associative array expressions into
some hybrid type functions, where ROW type expressions are not possible.
As a side effect, some legacy ER_OPERAND_COLUMNS changed to
a newer ER_ILLEGAL_PARAMETER_DATA_TYPE_FOR_OPERATION
Changes in the top affected class Item_func_case_expression:
- item_func.h:
Overriding Item_func_case_expression::check_arguments() to return false,
without checking any arguments. Descendants validate arguments
in various different ways. There is no need to block all non-scalar data
types at this level, which would also disallow associative arrays.
Changes in descendants:
- item_cmpfunc.cc:
Adding a test in Item_func_case_simple::aggregate_switch_and_when_arguments()
preventing passing ROW_RESULT expression in predicant and WHEN in a
simple CASE:
CASE predicant WHEN when1 THEN .. WHEN when2 THEN .. END;
This is not supported yet. Should preferably be fixed before MDEV-34319.
- item_cmpfunc.cc:
Calling args[0]->type_handler()->Item_hybrid_func_fix_attributes()
from Item_func_nullif::fix_length_and_dec().
This prevents a ROW expression from being passed to args[0] of NULLIF(),
but will allow passing associative arrays.
args[1] is still only checked to be comparable with args[0].
There is no need to add additional tests for it.
- item_cmpfunc.h:
Adding a call for Item_hybrid_func_fix_attributes() in
Item_func_case_abbreviation2::cache_type_info().
This prevents calling the descendant functions with
a ROW expression in combination with an explicit NULL
in the THEN-style arguments (but will allow to pass associative arrays):
IFNULL(row_expression, NULL)
IFNULL(NULL, row_expression)
IF(switch, row_expression, NULL)
IF(switch, NULL, row_expression)
NVL2(switch, row_expression, NULL)
NVL2(switch, NULL, row_expression)
Adding a THD* argument into involved methods.
- item_cmpfunc.h:
Overriding Item_func_case_abbreviation2_switch::check_arguments() to
check that the first argument in IF() and NVL2() can return bool.
Removing Item_func_if::check_arguments(), as it became redundant.
- sql_type.cc:
Fixing sql_type.cc not to disallow items[0] with ROW_RESULT.
This makes it call Item_hybrid_func_fix_attributes() at the end,
which blocks ROW arguments in THEN-style arguments of hybrid functions.
But this will allow to pass Type_handler_assoc_array expressions.
- sql_type.cc:
Changing Type_handler_row::Item_hybrid_func_fix_attributes to raise the
ER_ILLEGAL_PARAMETER_DATA_TYPE_FOR_OPERATION error instead of the DBUG_ASSERT.
This patch adds support for SYS_REFCURSOR (a weakly typed cursor)
for both sql_mode=ORACLE and sql_mode=DEFAULT.
Works as a regular stored routine variable, parameter and return value:
- can be passed as an IN parameter to stored functions and procedures
- can be passed as an INOUT and OUT parameter to stored procedures
- can be returned from a stored function
Note, strongly typed REF CURSOR will be added separately.
Note, to maintain dependencies easier, some parts of sql_class.h
and item.h were moved to new header files:
- select_results.h:
class select_result_sink
class select_result
class select_result_interceptor
- sp_cursor.h:
class sp_cursor_statistics
class sp_cursor
- sp_rcontext_handler.h
class Sp_rcontext_handler and its descendants
The implementation consists of the following parts:
- A new class sp_cursor_array deriving from Dynamic_array
- A new class Statement_rcontext which contains data shared
between sub-statements of a compound statement.
It has a member m_statement_cursors of the sp_cursor_array data type,
as well as an open cursor counter. THD inherits from Statement_rcontext.
- A new data type handler Type_handler_sys_refcursor in plugins/type_cursor/
It is designed to store uint16 references -
positions of the cursor in THD::m_statement_cursors.
- Type_handler_sys_refcursor suppresses some derived numeric features.
When a SYS_REFCURSOR variable is used as an integer an error is raised.
- A new abstract class sp_instr_fetch_cursor. It's needed to share
the common code between "OPEN cur" (for static cursors) and
"OPEN cur FOR stmt" (for SYS_REFCURSORs).
- New sp_instr classes:
* sp_instr_copen_by_ref - OPEN sys_ref_cursor FOR stmt;
* sp_instr_cfetch_by_ref - FETCH sys_ref_cursor INTO targets;
* sp_instr_cclose_by_ref - CLOSE sys_ref_cursor;
* sp_instr_destruct_variable - to destruct SYS_REFCURSOR variables when
the execution goes out of the BEGIN..END block
where SYS_REFCURSOR variables are declared.
- New methods in LEX:
* sp_open_cursor_for_stmt - handles "OPEN sys_ref_cursor FOR stmt".
* sp_add_instr_fetch_cursor - "FETCH cur INTO targets" for both
static cursors and SYS_REFCURSORs.
* sp_close - handles "CLOSE cur" both for static cursors and SYS_REFCURSORs.
- Changes in cursor functions to handle both static cursors and SYS_REFCURSORs:
* Item_func_cursor_isopen
* Item_func_cursor_found
* Item_func_cursor_notfound
* Item_func_cursor_rowcount
- A new system variable @@max_open_cursors - to limit the number
of cursors (static and SYS_REFCURSORs) opened at the same time.
Its allowed range is [0-65536], with 50 by default.
- A new virtual method Type_handler::can_return_bool() telling
if calling item->val_bool() is allowed for Items of this data type,
or if otherwise the "Illegal parameter for operation" error should be raised
at fix_fields() time.
- New methods in Sp_rcontext_handler:
* get_cursor()
* get_cursor_by_ref()
- A new class Sp_rcontext_handler_statement to handle top level statement
wide cursors which are shared by all substatements.
- A new virtual method expr_event_handler() in classes Item and Field.
It's needed to close (and make available for a new OPEN)
unused THD::m_statement_cursors elements which do not have any references
any more. It can happen in various moments in time, e.g.
* after evaluation parameters of an SQL routine
* after assigning a cursor expression into a SYS_REFCURSOR variable
* when leaving a BEGIN..END block with SYS_REFCURSOR variables
* after setting OUT/INOUT routine actual parameters from formal
parameters.
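The interplay between uint16 cursor references, the open cursor limit and closing of unreferenced cursors described above can be sketched as follows. This is an illustration under assumptions and not the server implementation: the class CursorSlotsSketch, its explicit reference counter and its method names are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

class CursorSlotsSketch
{
  struct Slot
  {
    bool is_open;
    unsigned ref_count;
  };
  std::vector<Slot> m_slots;
  std::size_t m_max_open;    // plays the role of @@max_open_cursors
  std::size_t m_open_count;
public:
  explicit CursorSlotsSketch(std::size_t max_open)
   :m_max_open(max_open), m_open_count(0)
  { }
  std::size_t open_count() const { return m_open_count; }
  // Open a cursor in a free slot and store its position in *pos.
  // Returns true on error (the limit is reached), false on success.
  bool open(std::uint16_t *pos)
  {
    if (m_open_count >= m_max_open)
      return true;
    std::size_t i= 0;
    while (i < m_slots.size() && m_slots[i].is_open)
      i++;                            // search for a reusable slot
    if (i == m_slots.size())
      m_slots.push_back(Slot{false, 0});
    m_slots[i]= Slot{true, 1};
    m_open_count++;
    *pos= (std::uint16_t) i;
    return false;
  }
  // One more variable references the cursor at this position
  void add_ref(std::uint16_t pos)
  {
    assert(pos < m_slots.size() && m_slots[pos].is_open);
    m_slots[pos].ref_count++;
  }
  // A reference went away; close the cursor when it was the last one,
  // which makes the slot available for a new OPEN.
  void release(std::uint16_t pos)
  {
    assert(pos < m_slots.size() && m_slots[pos].is_open);
    if (--m_slots[pos].ref_count == 0)
    {
      m_slots[pos].is_open= false;
      m_open_count--;
    }
  }
};
```

In this model release() would be called from the places listed above: after routine parameters are evaluated, after an assignment, and when leaving a BEGIN..END block.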
normalize_cond() translated `WHERE col` into `WHERE col<>0`
But the operator "not equal to 0" does not necessarily exist
for all data types.
For example, the query:
SELECT * FROM t1 WHERE inet6col;
was translated to:
SELECT * FROM t1 WHERE inet6col<>0;
which further failed with this error:
ERROR : Illegal parameter data types inet6 and bigint for operation '<>'
This patch changes the translation from `col<>0` to `col IS TRUE`.
So now
SELECT * FROM t1 WHERE inet6col;
gets translated to:
SELECT * FROM t1 WHERE inet6col IS TRUE;
Details:
1. Implementing methods:
- Field_longstr::val_bool()
- Field_string::val_bool()
- Item::val_int_from_val_str()
If the input contains bad data,
these methods raise a better error message:
Truncated incorrect BOOLEAN value
Before the change, the error was:
Truncated incorrect DOUBLE value
2. Fixing normalize_cond() to generate Item_func_istrue/Item_func_isfalse
instances instead of Item_func_ne/Item_func_eq
3. Making Item_func_truth sargable, so it uses the range optimizer.
Implementing the following methods:
- get_mm_tree(), get_mm_leaf(), add_key_fields() in Item_func_truth.
- get_func_mm_tree(), for all Item_func_truth descendants.
4. Implementing the method negated_item() for all Item_func_truth
descendants, so the negated item has a chance to be sargable:
For example,
WHERE NOT col IS NOT FALSE -- this notation is not sargable
is now translated to:
WHERE col IS FALSE -- this notation is sargable
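The rewrite in item 4 relies on the fact that every truth test has an exact negation among the other truth tests, even when the tested value is NULL. A small self-contained model of that property is shown below. The enum and function names are hypothetical and only imitate the idea behind negated_item():

```cpp
#include <cassert>

enum class TruthTest { IsTrue, IsNotTrue, IsFalse, IsNotFalse };
enum class Tri { False, True, Null };

// Evaluate "value <test>". The result of a truth test is never NULL.
bool eval_truth_test(TruthTest test, Tri value)
{
  switch (test) {
  case TruthTest::IsTrue:     return value == Tri::True;
  case TruthTest::IsNotTrue:  return value != Tri::True;
  case TruthTest::IsFalse:    return value == Tri::False;
  case TruthTest::IsNotFalse: return value != Tri::False;
  }
  return false; // not reached
}

// Return the test that is equivalent to "NOT (value <test>)"
TruthTest negated_test(TruthTest test)
{
  switch (test) {
  case TruthTest::IsTrue:     return TruthTest::IsNotTrue;
  case TruthTest::IsNotTrue:  return TruthTest::IsTrue;
  case TruthTest::IsFalse:    return TruthTest::IsNotFalse;
  case TruthTest::IsNotFalse: return TruthTest::IsFalse;
  }
  return test; // not reached
}
```

Because a truth test yields only TRUE or FALSE, replacing NOT (col IS NOT FALSE) with col IS FALSE does not change the result for any of the three possible input values.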
MDEV-33407 Parser support for vector indexes
The syntax is
create table t1 (... vector index (v) ...);
limitation:
* v is a binary string and NOT NULL
* only one vector index per table
* temporary tables are not supported
MDEV-33404 Engine-independent indexes: subtable method
added support for so-called "high level indexes", they are not visible
to the storage engine, implemented on the sql level. For every such
an index in a table, say, t1, the server implicitly creates a second
table named, like, t1#i#05 (where "05" is the index number in t1).
This table has a fixed structure, no frm, not accessible directly,
doesn't go into the table cache, needs no MDLs.
MDEV-33406 basic optimizer support for k-NN searches
for a query like SELECT ... ORDER BY func() optimizer will use
item_func->part_of_sortkey() to decide what keys can be used
to resolve ORDER BY.
The task "MDEV-25829 Change default Unicode collation to uca1400_ai_ci"
previously changed collation derivation for string user variables
from DERIVATION_EXPLICIT to DERIVATION_COERCIBLE, to resolve illegal
collation mix conflicts between table columns and user variables
when they have different collations.
However, DERIVATION_COERCIBLE was a wrong choice because it caused
conflicts between string literals and user variables when they have
different collations.
Adding a new collation derivation level DERIVATION_USERVAR.
This makes the collation of a user variable:
- weaker than a table column (like it was intended by MDEV-25829)
- but stronger than a literal (like it was in pre-MDEV-25829)
Cleanup in sql_type.h:
Removing the line "- BINARY(expr)" from the before-DERIVATION_CAST
comment, as it was in the wrong place. It's also listed in the correct
place before DERIVATION_IMPLICIT.
Changing the return type of the following functions:
- CURRENT_TIMESTAMP, CURRENT_TIMESTAMP(), NOW()
- SYSDATE()
- FROM_UNIXTIME()
from DATETIME to TIMESTAMP.
Note, the old function NOW() returning DATETIME is still available
as LOCALTIMESTAMP or LOCALTIMESTAMP(), e.g.:
SELECT
LOCALTIMESTAMP, -- DATETIME
CURRENT_TIMESTAMP; -- TIMESTAMP
The change in the functions' return data type fixes some problems
that occurred near a DST change:
- Problem #1
INSERT INTO t1 (timestamp_field) VALUES (CURRENT_TIMESTAMP);
INSERT INTO t1 (timestamp_field) VALUES (COALESCE(CURRENT_TIMESTAMP));
could result in two different values being inserted.
- Problem #2
INSERT INTO t1 (timestamp_field) VALUES (FROM_UNIXTIME(1288477526));
INSERT INTO t1 (timestamp_field) VALUES (FROM_UNIXTIME(1288477526+3600));
could result in two equal TIMESTAMP values near a DST change.
Additional changes:
- FROM_UNIXTIME(0) now returns SQL NULL instead of '1970-01-01 00:00:00'
(assuming time_zone='+00:00')
- UNIX_TIMESTAMP('1970-01-01 00:00:00') now returns SQL NULL instead of 0
(assuming time_zone='+00:00')
These additional changes are needed for consistency with TIMESTAMP fields,
which cannot store '1970-01-01 00:00:00 +00:00'
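Problem #2 can be reproduced with plain arithmetic. The sketch below assumes a zone whose UTC offset drops from +4h to +3h at 2010-10-30 23:00:00 UTC, modelled on Europe/Moscow in autumn 2010. The commit message does not name a zone, so the transition instant, the offsets and the function name are assumptions:

```cpp
#include <cassert>
#include <cstdint>

// Assumed DST transition instant: 2010-10-30 23:00:00 UTC
const std::int64_t kTransitionUtc= 1288479600;

// Convert a Unix timestamp to local wall-clock seconds in the assumed zone.
// A DATETIME value carries only this wall-clock reading, not the offset.
std::int64_t to_local_wall_clock(std::int64_t unix_ts)
{
  const std::int64_t offset= unix_ts < kTransitionUtc ? 4 * 3600 : 3 * 3600;
  return unix_ts + offset;
}
```

The two distinct instants 1288477526 and 1288477526+3600 produce the same wall-clock reading in this zone. Once they pass through a DATETIME value they can no longer be told apart, while a TIMESTAMP result keeps them distinct.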
Search conditions were evaluated using val_int(), which was wrong.
Fixing the code to use val_bool() instead.
Details:
- Adding a new item_base_t::IS_COND flag which marks Items used
as <search condition> in WHERE, HAVING, JOIN ON, CASE WHEN clauses.
The flag is set at parse time.
These expressions must be evaluated using val_bool() rather than val_int().
Note, the optimizer creates more Items which are used as search conditions.
Most of these items are not marked with IS_COND yet. This is OK for now,
but eventually these Items can also be fixed to have the flag.
- Adding a method Item::is_cond() which tests if the Item has the IS_COND flag.
- Implementing Item_cache_bool. It evaluates the cached expression using
val_bool() rather than val_int().
Overriding Type_handler_bool::Item_get_cache() to create Item_cache_bool.
- Implementing Item::save_bool_in_field(). It uses val_bool() rather than
val_int() to evaluate the expression.
- Implementing Type_handler_bool::Item_save_in_field()
using Item::save_bool_in_field().
- Fixing all Item_bool_func descendants to implement a virtual val_bool()
rather than a virtual val_int().
- To find places where val_int() should be fixed to val_bool(), a few
DBUG_ASSERT(!is_cond()) were added into val_int() implementations
of selected (most frequent) classes:
Item_field
Item_str_func
Item_datefunc
Item_timefunc
Item_datetimefunc
Item_cache_bool
Item_bool_func
Item_func_hybrid_field_type
Item_basic_constant descendants
- Fixing all places where DBUG_ASSERT() happened during an "mtr" run
to use val_bool() instead of val_int().
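One general reason why an integer conversion is not a safe substitute for a boolean test is that the conversion can lose a non-zero value. The toy class below illustrates only that general point. It is hypothetical, it truncates in val_int(), and the real server's numeric conversion rules may differ:

```cpp
#include <cassert>

// Toy item holding a floating point value (not a server class)
struct DoubleItemSketch
{
  double value;
  // Integer conversion: the fractional part is lost
  long long val_int() const { return (long long) value; }
  // Boolean test: any non-zero value is true
  bool val_bool() const { return value != 0.0; }
};
```

For the value 0.25 the integer conversion yields 0, so a condition evaluated through val_int() would be treated as false, while val_bool() reports true.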
A mixture of a multi-byte *TEXT column and a short binary column
produced a too large column.
For example, COALESCE(tinytext_utf8mb4, short_varbinary)
produced a BLOB column instead of an expected TINYBLOB.
- Adding a virtual method Type_all_attributes::character_octet_length(),
returning max_length by default.
- Overriding Item_field::character_octet_length() to extract
the octet length from the underlying Field.
- Overriding Item_ref::character_octet_length() to extract
the octet length from the referenced Item (e.g. as VIEW fields).
- Fixing Type_numeric_attributes::find_max_octet_length() to
take the octet length using the new method character_octet_length()
instead of accessing max_length directly.
Applying the COLLATE clause to a parameter caused an error:
COLLATION '...' is not valid for CHARACTER SET 'binary'
Fix:
- Changing the collation derivation for a non-prepared Item_param
to DERIVATION_IGNORABLE.
- Allowing to apply any COLLATE clause to expressions with DERIVATION_IGNORABLE.
This includes:
1. A non-prepared Item_param
2. An explicit NULL
3. Expressions derived from #1 and #2
For example:
SELECT ? COLLATE utf8mb_unicode_ci;
SELECT NULL COLLATE utf8mb_unicode_ci;
SELECT CONCAT(?) COLLATE utf8mb_unicode_ci;
SELECT CONCAT(NULL) COLLATE utf8mb_unicode_ci;
- Additional change: preserving the collation of an expression when
the expression gets assigned to a PS parameter and evaluates to SQL NULL.
Before this change, the collation of the parameter was erroneously set
to &my_charset_binary.
- Additional change: removing the multiplication by mbmaxlen from the
fix_char_length_ulonglong() argument, because the multiplication already
happens inside fix_char_length_ulonglong().
This fixes a too large column size created for a COLLATE clause.
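The effect of the redundant multiplication by mbmaxlen can be shown with a toy model. The types CharsetSketch and AttrSketch and the method fix_char_length() are hypothetical stand-ins. The real fix_char_length_ulonglong() also caps the result, which is omitted here:

```cpp
#include <cassert>
#include <cstdint>

struct CharsetSketch
{
  unsigned mbmaxlen;          // maximum octets per character
};

struct AttrSketch
{
  std::uint64_t max_length;   // in octets
  // Takes a length in characters and converts it to octets internally
  void fix_char_length(std::uint64_t char_length, const CharsetSketch &cs)
  {
    max_length= char_length * cs.mbmaxlen;
  }
};
```

With a 4-byte character set, passing 10 characters gives 40 octets. Passing 10*mbmaxlen, as the old code effectively did, gives 160 octets, which is four times too large.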
MDEV-32188 make TIMESTAMP use whole 32-bit unsigned range
- Changed usage of timeval to my_timeval as the timeval parts on Windows
are 32-bit long, which causes some compiler issues on Windows.
Step#2 - Adding a new collation derivation level for CAST and CONVERT.
Now character string cast functions:
- CAST(string_expr AS CHAR)
- CONVERT(expr USING charset_name)
have a new collation derivation level between:
- string literals
- utf8 metadata functions, e.g. user() and database()
Before the change these cast functions had collation derivation equal
to table columns, which caused more illegal mix of collation conflicts.
Note, binary string cast functions:
- BINARY(expr)
- CAST(string_expr AS BINARY)
- CONVERT(expr USING binary)
did not change their collation derivation, to preserve the behaviour of
queries like these:
SELECT database()=BINARY'test';
SELECT user()=CAST('root' AS BINARY);
SELECT current_role()=CONVERT('role' USING binary);
Derivation levels after the change look as follows:
DERIVATION_IGNORABLE= 7, // Explicit NULL
DERIVATION_NUMERIC= 6, // Numbers in string context,
// Numeric user variables
// CAST(numeric_expr AS CHAR)
DERIVATION_COERCIBLE= 5, // Literals, string user variables
DERIVATION_CAST= 4, // CAST(string_expr AS CHAR),
// CONVERT(string_expr USING cs)
DERIVATION_SYSCONST= 3, // utf8 metadata functions, e.g. user(), database()
DERIVATION_IMPLICIT= 2, // Table columns, SP variables, BINARY(expr)
DERIVATION_NONE= 1, // A mix (e.g. CONCAT) of two different collations
DERIVATION_EXPLICIT= 0 // An explicit COLLATE clause
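The numeric values above define an ordering in which a smaller number means a stronger derivation. A minimal sketch of how such an ordering can be compared is given below. The helper is_stronger() is hypothetical, and the real collation aggregation code applies additional rules that are not modelled here:

```cpp
#include <cassert>

// Values copied from the list above
enum Derivation
{
  DERIVATION_IGNORABLE= 7,
  DERIVATION_NUMERIC= 6,
  DERIVATION_COERCIBLE= 5,
  DERIVATION_CAST= 4,
  DERIVATION_SYSCONST= 3,
  DERIVATION_IMPLICIT= 2,
  DERIVATION_NONE= 1,
  DERIVATION_EXPLICIT= 0
};

// Hypothetical helper: true if the collation of an operand with
// derivation "a" takes precedence over an operand with derivation "b".
bool is_stronger(Derivation a, Derivation b)
{
  return a < b;
}
```

Under this ordering a CAST(string_expr AS CHAR) result wins over a string literal but yields to user(), database() and table columns.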
Fixing the problem that an operation involving a mix of
two or more GEOMETRY operands did not preserve their SRIDs.
Now SRIDs are preserved by hybrid functions, subqueries, TVCs, UNIONs, VIEWs.
Incompatible change:
An attempt to mix two different SRIDs now raises an error.
Details:
- Adding a new class Type_extra_attributes. It's a generic
container which can store very specific data type attributes.
For now it can store one uint32 and one const pointer attribute
(for GEOMETRY's SRID and for ENUM/SET TYPELIB respectively).
In the future it can grow as needed.
Type_extra_attributes will also be reused soon to store "const Type_zone*"
pointers for the TIMESTAMP's "WITH TIME ZONE 'tz'" attribute
(a timestamp data type with a fixed time zone independent from @@time_zone).
The time zone attribute will be stored in exactly the same way as
a TYPELIB pointer is stored by ENUM/SET.
- Removing Column_definition_attributes members "interval" and "srid".
Deriving Column_definition_attributes from the generic attribute container
Type_extra_attributes instead.
- Adding a new class Type_typelib_attributes, to store
the TYPELIB of the ENUM and SET data types. Deriving Field_enum from it.
Removing the member Field_enum::typelib.
- Adding a new class Type_geom_attributes, to store
the GEOMETRY related attributes. Deriving Field_geom from it.
Removing the member Field_geom::srid.
- Removing virtual methods:
Field::get_typelib()
Type_all_attributes::get_typelib() and
Type_all_attributes::set_typelib()
They were very specific to TYPELIB.
Adding more generic virtual methods instead:
* Field::type_extra_attributes() - to get extra attributes
* Type_all_attributes::type_extra_attributes() - to get extra attributes
* Type_all_attributes::type_extra_attributes_addr() - to set extra attributes
- Removing Item_type_holder::enum_set_typelib. Deriving Item_type_holder
from the generic attribute container Type_extra_attributes instead.
This makes it possible for UNION to preserve SRID
(in addition to preserving TYPELIB).
- Deriving Item_hybrid_func from Type_extra_attributes.
This makes it possible for hybrid functions (e.g. CASE, COALESCE,
LEAST, GREATEST etc) to preserve SRID.
- Deriving Item_singlerow_subselect from Type_extra_attributes and
overriding methods:
* Item_cache::type_extra_attributes()
* subselect_single_select_engine::fix_length_and_dec()
* Item_singlerow_subselect::type_extra_attributes()
* Item_singlerow_subselect::type_extra_attributes_addr()
This is needed to preserve SRID in subqueries and TVCs
- Cleanup: fixing the data type of members
* Binlog_type_info::m_enum_typelib
* Binlog_type_info::m_set_typelib
from "TYPELIB *" to "const TYPELIB *"
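The idea of a generic container holding one uint32 and one const pointer, together with the new "mixed SRIDs raise an error" rule, can be sketched as follows. The class ExtraAttributesSketch, its accessors and the function aggregate_srid() are hypothetical and do not reproduce the real Type_extra_attributes interface:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

class ExtraAttributesSketch
{
  std::uint32_t m_attr_uint32;     // e.g. the SRID of a GEOMETRY type
  const void *m_attr_const_ptr;    // e.g. the TYPELIB of ENUM/SET
public:
  ExtraAttributesSketch()
   :m_attr_uint32(0), m_attr_const_ptr(nullptr)
  { }
  std::uint32_t get_uint32() const { return m_attr_uint32; }
  void set_uint32(std::uint32_t value) { m_attr_uint32= value; }
  const void *get_const_ptr() const { return m_attr_const_ptr; }
  void set_const_ptr(const void *ptr) { m_attr_const_ptr= ptr; }
};

// Hypothetical aggregation for GEOMETRY operands of a hybrid function.
// Returns true on error (the operands have different SRIDs),
// false on success with the common SRID stored in *result.
bool aggregate_srid(const ExtraAttributesSketch *args, std::size_t count,
                    ExtraAttributesSketch *result)
{
  assert(count > 0);
  for (std::size_t i= 1; i < count; i++)
  {
    if (args[i].get_uint32() != args[0].get_uint32())
      return true;
  }
  result->set_uint32(args[0].get_uint32());
  return false;
}
```

Since the container is not tied to one data type, the same base class can carry the SRID for GEOMETRY and the TYPELIB pointer for ENUM/SET without type-specific virtual methods.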
Some fixes related to commit f838b2d799 and
Rows_log_event::do_apply_event() and Update_rows_log_event::do_exec_row()
for system-versioned tables were provided by Nikita Malyavin.
This was required by test versioning.rpl,trx_id,row.
Problem:
REPAIR TABLE executed for a pre-MDEV-29959 table (with the old UUID format)
updated the server version in the FRM file without rewriting the data,
so it created a new FRM for old UUIDs. After that MariaDB could not
read UUIDs correctly.
Fix:
- Adding a new virtual method in class Type_handler:
virtual const Type_handler *type_handler_for_implicit_upgrade() const;
* For the up-to-date data types it returns "this".
* For the data types which need to be implicitly upgraded
during REPAIR TABLE or ALTER TABLE, it returns a pointer
to a new replacement data type handler.
Old VARCHAR and old UUID type handlers override this method.
See more comments below.
- Changing the semantics of the method
Type_handler::Column_definition_implicit_upgrade(Column_definition *c)
to the opposite, so now:
* c->type_handler() references the old data type (to upgrade from)
* "this" references the new data type (to upgrade to).
Before this change Column_definition_implicit_upgrade() was supposed
to be called with the old data type handler (to upgrade from).
Renaming the method to Column_definition_implicit_upgrade_to_this(),
to avoid automatic merges in this method.
Reflecting this change in Create_field::upgrade_data_types().
- Replacing the hard-coded data type tests inside handler::check_old_types()
to a call for the new virtual method
Type_handler::type_handler_for_implicit_upgrade()
- Overriding Type_handler_fbt::type_handler_for_implicit_upgrade()
to call a new method FbtImpl::type_handler_for_implicit_upgrade().
Reasoning:
Type_handler_fbt is a template, so it has access only to "this".
So in case of UUID data types, the type handler for old UUID
knows nothing about the type handler of new UUID inside sql_type_fixedbin.h.
So let's have Type_handler_fbt delegate type_handler_for_implicit_upgrade()
to its Type_collection, which knows both new UUID and old UUID.
- Adding Type_collection_uuid::type_handler_for_implicit_upgrade().
It returns a pointer to the new UUID type handler.
- Overriding Type_handler_var_string::type_handler_for_implicit_upgrade()
to return a pointer to type_handler_varchar (true VARCHAR).
- Cleanup: these two methods:
handler::check_old_types()
handler::ha_check_for_upgrade()
were always called consecutively.
So moving the call for check_old_types() inside ha_check_for_upgrade(),
and making check_old_types() private.
- Cleanup: removing the "bool varchar" parameter from fill_alter_inplace_info(),
as it's not used any more.
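The delegation described above can be illustrated with a minimal standalone sketch. It is not the server code: the class names Type_handler_sketch, Handler_uuid_new, Handler_uuid_old and the helper needs_implicit_upgrade() are hypothetical stand-ins, and only the method name type_handler_for_implicit_upgrade() is taken from this patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical, heavily simplified stand-in for class Type_handler.
struct Type_handler_sketch
{
  virtual ~Type_handler_sketch() = default;
  virtual std::string name() const = 0;
  // Up-to-date data types return "this": there is nothing to upgrade to.
  virtual const Type_handler_sketch *type_handler_for_implicit_upgrade() const
  { return this; }
};

// Stand-in for the handler of the new UUID format.
struct Handler_uuid_new : Type_handler_sketch
{
  std::string name() const override { return "uuid"; }
};

static const Handler_uuid_new handler_uuid_new{};

// Stand-in for the handler of the old UUID format:
// it points to its replacement.
struct Handler_uuid_old : Type_handler_sketch
{
  std::string name() const override { return "uuid (old format)"; }
  const Type_handler_sketch *type_handler_for_implicit_upgrade() const override
  { return &handler_uuid_new; }
};

static const Handler_uuid_old handler_uuid_old{};

// Assumed shape of the test that replaces hard-coded type checks:
// a column needs an upgrade when its handler does not return itself.
bool needs_implicit_upgrade(const Type_handler_sketch &th)
{
  return th.type_handler_for_implicit_upgrade() != &th;
}
```

In this sketch needs_implicit_upgrade(handler_uuid_old) is true and needs_implicit_upgrade(handler_uuid_new) is false, so a caller never has to name a concrete data type.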
Functions extracting non-negative datetime components:
- YEAR(dt), EXTRACT(YEAR FROM dt)
- QUARTER(dt), EXTRACT(QUARTER FROM dt)
- MONTH(dt), EXTRACT(MONTH FROM dt)
- WEEK(dt), EXTRACT(WEEK FROM dt)
- HOUR(dt)
- MINUTE(dt)
- SECOND(dt)
- MICROSECOND(dt)
- DAYOFYEAR(dt)
- EXTRACT(YEAR_MONTH FROM dt)
did not set their max_length properly, so in the DECIMAL
context they created a too small DECIMAL column, which
led to the 'Out of range value' error.
The problem is that most of these functions historically
returned the signed INT data type.
There were two simple ways to fix these functions:
1. Add +1 to max_length.
But this would also change their size in the string context
and create too long VARCHAR columns, with +1 excessive size.
2. Preserve max_length, but change the data type from INT to INT UNSIGNED.
But this would break backward compatibility.
Also, using UNSIGNED is generally not desirable,
it's better to stay with signed when possible.
This fix implements another solution, which makes all these functions
work well in all contexts: int, decimal, string.
Fix details:
- Adding a new special class Type_handler_long_ge0 - the data type
handler for expressions which:
* should look like normal signed INT
* but which are known not to return negative values
Expressions handled by Type_handler_long_ge0 store in Item::max_length
only the number of digits, without adding +1 for the sign.
- Fixing Item_extract to use Type_handler_long_ge0
for non-negative datetime components:
YEAR, YEAR_MONTH, QUARTER, MONTH, WEEK
- Adding a new abstract class Item_long_ge0_func, for functions
returning non-negative datetime components.
Item_long_ge0_func uses Type_handler_long_ge0 as the type handler.
The class hierarchy now looks as follows:
Item_long_ge0_func
Item_long_func_date_field
Item_func_to_days
Item_func_dayofmonth
Item_func_dayofyear
Item_func_quarter
Item_func_year
Item_long_func_time_field
Item_func_hour
Item_func_minute
Item_func_second
Item_func_microsecond
- Cleanup: EXTRACT(QUARTER FROM dt) created an excessive VARCHAR column
in string context. Changing its length from 2 to 1.
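The effect of the sign position on the DECIMAL precision can be modelled with a small sketch. This is a simplified model of the problem described above, not the server code: all function names are hypothetical, and it assumes that a signed INT expression reserves one character of max_length for the sign while a Type_handler_long_ge0 expression does not.

```cpp
#include <cassert>

// Number of decimal digits needed to print a non-negative value.
unsigned count_digits(unsigned long long value)
{
  unsigned n= 1;
  while (value >= 10)
  {
    value/= 10;
    n++;
  }
  return n;
}

// Assumed model: for a plain signed INT, one character of max_length
// is reserved for the sign, so only max_length-1 digits are available.
unsigned decimal_digits_signed(unsigned max_length)
{
  return max_length > 0 ? max_length - 1 : 0;
}

// Assumed model: for a "long_ge0" expression max_length counts digits only.
unsigned decimal_digits_ge0(unsigned max_length)
{
  return max_length;
}

// True if the value fits into a DECIMAL column with the given digit count.
bool fits_in_decimal(unsigned long long value, unsigned digits)
{
  return count_digits(value) <= digits;
}
```

For example, a YEAR_MONTH value such as 202401 has 6 digits. With max_length=6 the signed model leaves only 5 digits and the value does not fit, while the "ge0" model keeps all 6 digits without enlarging the column in string context.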
This original query:
(1) SELECT ts0 FROM t1
WHERE DATE(ts0) <= '2024-01-23';
was rewritten (by MDEV-8320) to:
(2) SELECT ts0 FROM t1
WHERE ts0 <= '2024-01-23 23:59:59.999999';
-- DATETIME comparison, Item_datetime on the right side
which was further optimized (by MDEV-32148) to:
(3) SELECT ts0 FROM t1
WHERE ts0 <= TIMESTAMP/* WITH LOCAL TIME ZONE*/ '2024-01-23 23:59:59.999999';
-- TIMESTAMP comparison, Item_timestamp_literal on the right side
The origin of the problem was in (2) - in the MDEV-8320 related code.
The recent new code for MDEV-32148 revealed this problem.
Item_datetime on step (2) was always created in an inconsistent way:
- with Item::decimals==0
- with ltime.second_part==999999,
without taking into account the precision of the left side
(e.g. ts0 in the above example)
On step (3), Item_timestamp_literal was created in an inconsistent way too,
because it copied the inconsistent data from step (2):
- with Item::decimals==0 (copied from Item_datetime::decimals)
- with m_value.tv_usec==999999 (copied from ltime.second_part of Item_datetime)
Later, the Item_timestamp_literal performed save_in_field()
and crashed in my_timestamp_to_binary() on a DBUG_ASSERT checking
consistency between the fractional precision and the fractional seconds value.
Fix:
On step (2), create Item_datetime with the maximum possible second_part
value of 999999 truncated according to the fractional second precision
of the left side. For example, second_part is set as follows:
- 000000 for TIMESTAMP(0)
- 999000 for TIMESTAMP(3)
- 999999 for TIMESTAMP(6)
This automatically makes the code create a consistent Item_timestamp_literal
on step (3).
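The truncation rule can be sketched as a standalone function. The name max_second_part() is hypothetical and this is not the code from the patch; it only reproduces the arithmetic behind the three values listed above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns the largest microsecond value that is
// representable with "dec" fractional digits (0..6).
uint32_t max_second_part(unsigned dec)
{
  assert(dec <= 6);
  uint32_t step= 1;
  // step = 10^(6-dec): the weight of the last kept fractional digit
  for (unsigned i= dec; i < 6; i++)
    step*= 10;
  // Drop the digits beyond the requested precision
  return 999999 - 999999 % step;
}
```

Under this sketch max_second_part(0) is 0, max_second_part(3) is 999000 and max_second_part(6) is 999999, which keeps the fractional value consistent with Item::decimals.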
This also makes TIMESTAMP comparison work faster, because now
Item_timestamp_literal is created with Item::decimals value equal
to the Item_field (which is on the other side of the comparison),
so the low level function Type_handler_timestamp_common::cmp_native()
takes the fastest execution path, optimized for the case when both sides
have equal fractional precision.
Adding a helper class TimeOfDay to reuse the code when populating:
- the last datetime point for YEAR()
- the last datetime point for DATE()
with a given fractional precision.
This class also helped to unify the identical code in create_start_bound()
and create_end_bound() into a single method create_bound().