The task "MDEV-25829 Change default Unicode collation to uca1400_ai_ci"
previously changed collation derivation for string user variables
from DERIVATION_EXPLICIT to DERIVATION_COERCIBLE, to resolve illegal
collation mix conflicts between table columns and user variables
when they have different collations.
However, DERIVATION_COERCIBLE was a wrong choice because it caused
conflicts between string literals and user variables when they have
different collations.
Adding a new collation derivation level DERIVATION_USERVAR.
This makes the collation of a user variable:
- weaker than a table column (like it was intended by MDEV-25829)
- but stronger than a literal (like it was in pre-MDEV-25829)
Cleanup in sql_type.h:
Removing the line "- BINARY(expr)" from the before-DERIVATION_CAST
comment, as it was on a wrong place. It's also listed on the correct
place before DERIVATION_IMPLICIT.
RAND() and UUID() are treated differently with respect to subquery
materialization both should be marked as uncacheable, forcing materialization.
Altered Create_func_uuid(_short)::create_builder().
Added comment in header about UNCACHEABLE_RAND meaning also unmergeable.
Step#2 - Adding a new collation derivation level for CAST and CONVERT.
Now character string cast functions:
- CAST(string_expr AS CHAR)
- CONVERT(expr USING charset_name)
have a new collation derivation level between:
- string literals
- utf8 metadata functions, e.g. user() and database()
Before the change these cast functions had collation derivation equal
to table columns, which caused more illegal mix of collation conflicts.
Note, binary string cast functions:
- BINARY(expr)
- CAST(string_expr AS BINARY)
- CONVERT(expr USING binary)
did not change their collation derivation, to preserve the behaviour of
queries like these:
SELECT database()=BINARY'test';
SELECT user()=CAST('root' AS BINARY);
SELECT current_role()=CONVERT('role' USING binary);
Derivation levels after the change look as follows:
DERIVATION_IGNORABLE= 7, // Explicit NULL
DERIVATION_NUMERIC= 6, // Numbers in string context,
// Numeric user variables
// CAST(numeric_expr AS CHAR)
DERIVATION_COERCIBLE= 5, // Literals, string user variables
DERIVATION_CAST= 4, // CAST(string_expr AS CHAR),
// CONVERT(string_expr USING cs)
DERIVATION_SYSCONST= 3, // utf8 metadata functions, e.g. user(), database()
DERIVATION_IMPLICIT= 2, // Table columns, SP variables, BINARY(expr)
DERIVATION_NONE= 1, // A mix (e.g. CONCAT) of two differrent collations
DERIVATION_EXPLICIT= 0 // An explicit COLLATE clause
It was wrong to derive Item_func_uuid from Item_func_sys_guid,
because the former is a function returning the UUID data type,
while the latter is a string function returning VARCHAR.
As a result of the wrong hierarchy, Item_func_uuid erroneously derived
Item_str_func::fix_fields(), which contains this code:
/*
In Item_str_func::check_well_formed_result() we may set null_value
flag on the same condition as in test() below.
*/
if (thd->is_strict_mode())
set_maybe_null();
This code is not relevant to UUID() at all.
A simple fix would be to set_maybe_null(false) in
Item_func_uuid::fix_length_and_dec(). However,
it'd fix only exactly this single consequence of the wrong
class hierarchy, and similar bugs could appear again in
the future. Moreover, we're going to add functions UUIDv4()
and UUIDv7() soon (in 11.6). So it's better to fix the class hierarchy
in the right way before adding these new functions.
Fix:
- Adding a new abstract class Item_fbt_func in the template
in sql_type_fixedbin.h
- Deriving Item_typecast_fbt from Item_fbt_func
- Deriving Item_func_uuid from Item_fbt_func
- Adding a new helper class UUIDv1. It derives from UUID, and additionally
initializes the value to "UUID version 1" right in the constructor.
Note, the new coming soon SQL functions UUIDv4() and UUIDv7()
will also have corresponding classes UUIDv4 and UUIDv7.
So now UUID() is a pure "returning UUID" function,
like CAST(expr AS UUID) used to be, without any unintentional
artifacts of functions returning VARCHAR/TEXT.
Cleanup:
- Removing the member Item_func_sys_guid::with_dashes,
as it's not needed any more:
* Item_func_sys_guid now does not have any descendants any more
* Item_func_sys_guid::val_str() itself always displays without dashes
In main.index_merge_myisam we remove the test that was added in
commit a2d24def8c because
it duplicates the test case that was added in
commit 5af12e4635.
- Moving the implementations of class Inet4 and class Inet6 into separate
files sql_type_inet.h and sql_type_inet.cc, in order to reuse them
for the INET6 data type and inet function collection.
- Adding a warning in the case when IS_IPV4_MAPPED() and IS_IPV4_COMPAT()
erroneously gets an IPv4 address instead of the expected IPv6 address.
Detailed: changes:
1. Moving Field specific code into new methods on Field:
- Field *Field::create_tmp_field(...)
- virtual void init_for_tmp_table(...)
2. Removing virtual Item::create_tmp_field().
Adding instead a new virtual method Item::create_tmp_field_ex().
Note, a virtual create_tmp_field() still exists, but only for Item_sum.
This resembles 10.0 code structure. Perhaps create_tmp_field() should
be removed from Item_sum and Item_sum descendants should override
create_tmp_field_ex() directly. This can be done in a separate commit.
3. Adding helper classes Tmp_field_src and Tmp_field_param,
to make the API for Item::create_tmp_field_ex() smaller
and easier to extend in the future.
4. Decomposing the public function create_tmp_field() into
virtual implementations for Item and a number of its descendants:
- Item_basic_value
- Item_sp_variable
- Item_name_const
- Item_result_field
- Item_field
- Item_ref
- Item_type_holder
- Item_row
- Item_func_sp
- Item_func_user_var
- Item_sum
- Item_sum_field
- Item_proc
5. Adding DBUG_ASSERT-only virtual implementations for
Item types that should not appear in create_tmp_table_ex(),
for easier debugging:
- Item_nodeset_func
- Item_nodeset_to_const_comparator
- Item_null_result
- Item_copy
- Item_ident_for_show
- Item_user_var_as_out_param
6. Moving public function create_tmp_field_from_field()
as a method to Item_field.
7. Removing Item::set_result_field(). It's not needed any more.
8. Cleanup: Removing the enum value "EXPR_CACHE_ITEM",
as it's not used for a very long time.