- Implementing the task according to the MDEV description.
- Adding a helper class Sec6_add to share the code in type-specific
branches in Item_func_add_time::get_date().
Adding new methods:
- virtual void Type_handler::Column_definition_reuse_fix_attributes()
according to the MDEV description
- virtual uint32 Field::character_octet_length()
To simplify handling of Column_definition::length for
TEXT and VARCHAR columns (with and without compression).
MDEV-16426 Optimizer erroneously treats equal constants of different formats as same
A cleanup for MDEV-14630: fixing a crash in Item_decimal::eq().
Problems:
- old implementations of Item_decimal::eq() and
Item_temporal_literal::eq() were not symmetric
with Item_param::eq(), this caused MDEV-11361.
- old implementations for DECIMAL and temporal data types
did not take into account that when eq() is called
with binary_cmp==true, eq() should check not only equality
of the two values, but also equality of their decimal precision.
This caused MDEV-16426.
- Item_decimal::eq() crashes with "item" pointing
to a non-DECIMAL value. Before MDEV-14630
non-DECIMAL values were filtered out by the test:
type() == item->type()
as literals of different types had different type().
After MDEV-14630 type() for literals of all data types returns CONST_ITEM.
This caused failures in tests:
./mtr engines/iuds.insert_number
./mtr --ps --embedded main.explain_slowquerylog
(revealed by buildbot)
The essence of the fix:
Making literals and Item_param reuse the same code to avoid
asymmetries between Item_param::eq(Item_literal) and
Item_literal::eq(Item_param), now and in the future, and to
avoid code duplication between Item_literal and Item_param.
Adding tests for "decimals" for DECIMAL and temporal data types,
to treat constants of different scale as not equal when "binary_cmp"
is "true".
Details:
1. Adding a helper class Item_const to extract constant values from Items easier
2. Deriving Item_basic_value from Item_const
3. Joining Type_handler::Item_basic_value_eq() and Item_basic_value_bin_eq()
into a single method with an extra "binary_cmp" argument
(it looks simple this way) and renaming the new method to Item_const_eq().
Modifying its implementations to operate with
Item_const instead of Item_basic_value.
4. Adding a new class Type_handler_hex_hybrid,
to handle hex constants like 0x616263.
5. Removing Item::VARBIN_ITEM and fixing Item_hex_constant to
use type_handler_hex_hybrid instead of type_handler_varchar.
Item_hex_hybrid::type() now returns CONST_ITEM, like all
other literals do.
6. Moving virtual methods Item::type_handler_for_system_time() and
Item::cast_to_int_type_handler() from Item to Type_handler.
7. Removing Item_decimal::eq() and Item_temporal_literal::eq().
These classes are now handled by the generic Item_basic_value::eq().
8. Implementing Type_handler_temporal_result::Item_const_eq()
and Type_handler_decimal_result::Item_const_eq(),
this fixes MDEV-11361.
9. Adding tests for "decimals" into
Type_handler_decimal_result::Item_const_eq() and
Type_handler_temporal_result::Item_const_eq()
in case if "binary_cmp" is true.
This fixes MDEV-16426.
10. Moving Item_cache out of Item_basic_value.
They share nothing. It simplifies implementation
of Item_basic_value::eq(). Deriving Item_cache
directly from Item.
11. Adding class DbugStringItemTypeValue, which
uses Item::print() internally, and using
it instead of the old debug printing code.
This gives nicer output in func_debug.result.
Changes N5 and N6 do not directly relate to the bugs fixed,
but make the code fully symmetric across all literal types.
Without a new handler Type_handler_hex_hybrid we'd have
to keep two code branches (for regular literals and for
hex hybrid literals).
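The "decimals" test described above can be illustrated with a minimal
standalone sketch. DecimalConst and decimal_const_eq() are hypothetical
stand-ins, not the server's Item_const or Type_handler classes. The sketch
only shows the idea that 1.0 and 1.00 are equal by value but must not be
treated as the same constant when binary_cmp is true.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a DECIMAL constant: value = unscaled / 10^scale.
struct DecimalConst
{
  int64_t unscaled;  // digits without the decimal point, e.g. 100 for 1.00
  unsigned scale;    // number of digits after the decimal point
};

// Multiply v by 10^n. No overflow handling: this is only a sketch.
static int64_t scale_up(int64_t v, unsigned n)
{
  assert(n <= 18);
  while (n--)
    v*= 10;
  return v;
}

// Hypothetical analogue of an Item_const_eq() implementation for DECIMAL:
// with binary_cmp the scales must match in addition to the values.
static bool decimal_const_eq(const DecimalConst &a, const DecimalConst &b,
                             bool binary_cmp)
{
  if (binary_cmp && a.scale != b.scale)
    return false;
  unsigned common= a.scale > b.scale ? a.scale : b.scale;
  return scale_up(a.unscaled, common - a.scale) ==
         scale_up(b.unscaled, common - b.scale);
}
```

Because one function serves both call directions, the result cannot depend
on which operand is the literal and which is the parameter.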
Now the boolean data type is preserved in hybrid functions and MIN/MAX,
so COALESCE(bool_expr,bool_expr) and MAX(bool_expr) are correctly
detected by JSON_OBJECT() as being boolean rather than numeric expressions.
- Adding Type_handler::traditional_merge_field_type()
- Removing real_type_to_type(), field_merge_type()
- Making Type_handler_var_string to merge as VARCHAR
- Additionally, fixing Field_string::print() to add the "/*old*/"
comment into the data type for the old VARCHAR.
This is similar to what MDEV-8267 earlier did for old DECIMAL.
- Adding tests
1. Adding new methods:
- Item::is_order_clause_position()
- Item_splocal::is_valid_limit_clause_variable_with_error()
- Type_handler::is_order_clause_position_type()
- is_limit_clause_valid_type()
and changing all tests related to the ORDER and LIMIT clauses
like "item->type()==INT_ITEM" to these new methods.
2. Adding a helper function prepare_param() in sql_analyse.cc
and replacing three pieces of duplicate code with prepare_param() calls.
Replacing the test "item->type()!=Item::INT_ITEM" with an equivalent
condition using item->basic_const_item() and type_handler()->result_type().
MDEV-16100 FOR SYSTEM_TIME erroneously resolves string user variables as transaction IDs
Problem:
Vers_history_point::resolve_unit() tested item->result_type() before
item->fix_fields() was called.
- Item_func_get_user_var::result_type() returned REAL_RESULT by default.
This caused MDEV-16100.
- Item_func_sp::result_type() crashed on assert.
This caused MDEV-16094
Changes:
1. Adding item->fix_fields() into Vers_history_point::resolve_unit()
before using data type specific properties of the history point
expression.
2. Adding a new virtual method Type_handler::Vers_history_point_resolve_unit()
3. Implementing type-specific
Type_handler_xxx::Vers_history_point_resolve_unit()
so as to:
a. resolve temporal and general purpose string types to TIMESTAMP
b. resolve BIT and general purpose INT types to TRANSACTION
c. disallow use of non-relevant data type expressions in FOR SYSTEM_TIME
Note, DOUBLE and DECIMAL data types are disallowed intentionally.
- DOUBLE does not have enough precision to hold huge BIGINT UNSIGNED values
- DECIMAL rounds on conversion to INT
Both lack of precision and rounding might potentially lead to
very unpredictable results when a wrong transaction ID would be chosen.
If one really wants dangerous use of DOUBLE and DECIMAL, explicit CAST
can be used:
FOR SYSTEM_TIME AS OF CAST(double_or_decimal AS UNSIGNED)
QQ: perhaps DECIMAL(N,0) could still be allowed.
4. Adding a new virtual method Item::type_handler_for_system_time(),
to make HEX hybrids and bit literals work as TRANSACTION rather
than TIMESTAMP.
5. sql_yacc.yy: replacing the rule temporal_literal to "TIMESTAMP TEXT_STRING".
Other temporal literals now resolve to TIMESTAMP through the new
Type_handler methods. No special grammar needed. This removed
a few shift/reduce conflicts.
(TIMESTAMP related conflicts in "history_point:" will be removed separately)
6. Removing the "timestamp_only" parameter from
vers_select_conds_t::resolve_units() and Vers_history_point::resolve_unit().
It was a hint telling that a table did not have any TRANSACTION-aware
system time columns, so it's OK to resolve to TIMESTAMP in case of uncertainty.
In the new version it works as follows:
- the decision between TIMESTAMP and TRANSACTION is first made
based only on the expression data type
- then, if the expression resolved to TRANSACTION, the table
is checked for whether TRANSACTION-aware columns really exist.
This way is safer against possible ALTER TABLE statements changing
ROW START and ROW END columns from "BIGINT UNSIGNED" to "TIMESTAMP(x)"
or the other way around.
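The two-step resolution described above can be sketched as follows. The
enums and function names are hypothetical simplifications, not the server's
Type_handler hierarchy. Accepting a TIMESTAMP history point regardless of
the table's columns is an assumption of this sketch. The last helper
illustrates why DOUBLE is rejected: not every BIGINT UNSIGNED value
survives a conversion to double.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified type categories (the server has many handlers).
enum class ExprType { Temporal, String, Int, Bit, HexHybrid, Double, Decimal };
enum class VersUnit { Timestamp, Transaction, NotAllowed };

// Step 1: decide from the expression data type alone.
static VersUnit resolve_unit_by_type(ExprType t)
{
  switch (t) {
  case ExprType::Temporal:
  case ExprType::String:
    return VersUnit::Timestamp;
  case ExprType::Int:
  case ExprType::Bit:
  case ExprType::HexHybrid:
    return VersUnit::Transaction;
  case ExprType::Double:
  case ExprType::Decimal:
    return VersUnit::NotAllowed;
  }
  return VersUnit::NotAllowed;
}

// Step 2: a TRANSACTION result is usable only if the table really has
// TRANSACTION-aware ROW START / ROW END columns.
static bool history_point_is_valid(ExprType t, bool table_has_trx_id_columns)
{
  VersUnit unit= resolve_unit_by_type(t);
  if (unit == VersUnit::NotAllowed)
    return false;
  if (unit == VersUnit::Transaction)
    return table_has_trx_id_columns;
  return true;  // assumption: TIMESTAMP needs no further table check here
}

// True if v converts to double and back without changing.
static bool fits_double_exactly(uint64_t v)
{
  double d= static_cast<double>(v);
  // 2^64 itself is out of range for uint64_t, so check before casting back.
  return d < 18446744073709551616.0 && static_cast<uint64_t>(d) == v;
}
```

Since step 2 looks at the table only after the type-based decision, a
later ALTER TABLE that changes the ROW START and ROW END column types
cannot silently change how the expression itself is interpreted.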
This patch does the following:
1. Makes Field_vers_trx_id::type_handler() return
&type_handler_vers_trx_id rather than &type_handler_longlong.
Fixes Item_func::convert_const_compared_to_int_field() to
test field_item->type_handler() against &type_handler_vers_trx_id,
instead of testing field_item->vers_trx_id().
2. Removes VERS_TRX_ID related code from
Type_handler_hybrid_field_type::aggregate_for_comparison(),
because "BIGINT UNSIGNED GENERATED ALWAYS AS ROW {START|END}"
columns behave just like a BIGINT in a regular comparison,
i.e. when not inside AS OF.
3. Removes
- Type_handler_hybrid_field_type::m_vers_trx_id;
- Type_handler_hybrid_field_type::m_flags;
because a "BIGINT UNSIGNED GENERATED ALWAYS AS ROW {START|END}"
behaves like a regular BIGINT column when in UNION.
4. Removes Field::vers_trx_id(), Item::vers_trx_id(), Item::field_flags()
They are not needed anymore. See N1.
Problems:
1. Unlike Item_field::fix_fields(),
Item_sum_sp::fix_length_and_dec() and Item_func_sp::fix_length_and_dec()
did not run the code which resided in adjust_max_effective_column_length(),
therefore they did not extend max_length for the integer return data types
from the user-specified length to the maximum length according to
the data type capacity.
2. The code in adjust_max_effective_column_length() was not correct
for TEXT data, because Field_blob::max_display_length()
multiplies by mbmaxlen. So TEXT variants were unintentionally
promoted to the next longer data type for multi-byte character
sets: TINYTEXT->TEXT, TEXT->MEDIUMTEXT, MEDIUMTEXT->LONGTEXT.
3. Item_sum_sp::create_table_field_from_handler()
Item_func_sp::create_table_field_from_handler()
erroneously called tmp_table_field_from_field_type(),
which converted VARCHAR(>512) to TEXT variants.
So "CREATE..SELECT spfunc()" erroneously converted
VARCHAR to TEXT. This was wrong, because stored
functions have explicitly declared data types,
which should be preserved.
Solution:
- Removing Type_std_attributes(const Field *)
and using instead Type_std_attributes::set() in combination
with field->type_std_attributes() all around the code, e.g.:
Type_std_attributes::set(field->type_std_attributes())
These two ways of copying attributes from a Field
to an Item duplicated each other, and were slightly
different in how to mix max_length and mbmaxlen.
- Removing adjust_max_effective_column_length() and
fixing Field::type_std_attributes() to do all necessary
type-specific calculations, so no further adjustments
are needed.
Field::type_std_attributes() is now called from all affected methods:
Item_field::fix_fields()
Item_sum_sp::fix_length_and_dec()
Item_func_sp::fix_length_and_dec()
This fixes the problem N1.
- Making Field::type_std_attributes() virtual, to make
sure that type-specific adjustments are properly done
by individual Field_xxx classes. Implementing
Field_blob::type_std_attributes() in the way that
no TEXT promotion is done.
This fixes the problem N2.
- Fixing Item_sum_sp::create_table_field_from_handler()
Item_func_sp::create_table_field_from_handler() to
call create_table_field_from_handler() instead of
tmp_table_field_from_field_type() to avoid
VARCHAR->TEXT conversion on "CREATE..SELECT spfunc()".
- Recording mysql-test/suite/compat/oracle/r/sp-param.result
as "CREATE..SELECT spfunc()" now correctly
preserves the data type as specified in the RETURNS clause.
- Adding new tests
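The unintended TEXT promotion from problem N2 is easy to reproduce with
plain arithmetic. The sketch below is an illustration only: TextType and
text_type_for_octets() are hypothetical names, and the byte capacities are
the usual 1-, 2- and 3-byte length limits. A TEXT column already holds up
to 65535 bytes, so multiplying that capacity by mbmaxlen once more pushes
it into the next variant.

```cpp
#include <cassert>
#include <cstdint>

enum class TextType { TinyText, Text, MediumText, LongText };

// Pick the smallest TEXT variant whose byte capacity holds octet_length.
static TextType text_type_for_octets(uint64_t octet_length)
{
  if (octet_length <= 255)
    return TextType::TinyText;
  if (octet_length <= 65535)
    return TextType::Text;
  if (octet_length <= 16777215)
    return TextType::MediumText;
  return TextType::LongText;
}

// The faulty calculation: a byte capacity multiplied by mbmaxlen again.
static uint64_t wrongly_scaled_length(uint64_t byte_capacity, unsigned mbmaxlen)
{
  return byte_capacity * mbmaxlen;
}
```

With mbmaxlen=3, wrongly_scaled_length(65535, 3) gives 196605, which
text_type_for_octets() maps to MEDIUMTEXT instead of TEXT.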
Problem:
The logic in store_column_type() with a switch on field type was
hard to follow. The part for MEDIUMINT (MYSQL_TYPE_INT24) was not correct.
It erroneously calculated the precision of MEDIUMINT UNSIGNED
as 7 instead of 8.
A similar hard-to-follow switch doing some type specific calculations
resided in adjust_max_effective_column_length(). It was also wrong for
MEDIUMINT (reported as a separate issue in MDEV-15946).
Solution:
1. Introducing a new class Information_schema_numeric_attributes
2. Adding a new virtual method Field::information_schema_numeric_attributes()
3. Splitting the logic in store_column_type() into virtual
implementations of information_schema_numeric_attributes().
4. In order to avoid adding duplicate code for the integer data types,
adding a new virtual method Field_int::numeric_precision(),
which returns the number of digits.
Additional changes:
1. Adding the "const" qualifier to Field::max_display_length()
2. Moving the code from adjust_max_effective_column_length()
directly to Field::max_display_length().
There was no point in having two implementations:
- a set of wrong virtual implementations for Field_xxx::max_display_length()
- additional code in adjust_max_effective_column_length() fixing
bad results of Field_xxx::max_display_length()
This change is safe:
- The code using Field::max_display_length()
in field.cc, sql_show.cc, sql_type.cc is not affected.
- The code in rpl_utility.cc is also not affected.
See a new DBUG_ASSERT and new comments explaining why.
In the new version, Field_xxx::max_display_length() returns
correct results for all integer types (except MEDIUMINT, see below).
Putting implementations of numeric_precision() and max_display_length()
near each other in field.h made the logic much clearer and thus
helped to reveal bad results for Field_medium::max_display_length(),
which returns 9 instead of 8 for signed MEDIUMINT fields.
This problem will be addressed separately (MDEV-15946).
Note, this change is also useful for pluggable data types (see MDEV-4912),
as now a user defined Field_xxx has a way to control what's returned
in INFORMATION_SCHEMA.COLUMNS.NUMERIC_PRECISION and
INFORMATION_SCHEMA.COLUMNS.NUMERIC_SCALE by implementing
a desired behavior in Field_xxx::information_schema_numeric_attributes().
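The MEDIUMINT figures mentioned above can be checked with a small
arithmetic sketch. The functions are hypothetical and are not the server's
Field_int::numeric_precision() or Field_xxx::max_display_length(). They
derive the number of digits from the maximum value of an integer of a
given byte width, and, for signed types, add one character for the minus
sign. The display lengths the server actually reports may follow other
conventions.

```cpp
#include <cassert>
#include <cstdint>

// Number of decimal digits needed to print v (1 for zero).
static unsigned decimal_digits(uint64_t v)
{
  unsigned n= 1;
  while (v >= 10)
  {
    v/= 10;
    n++;
  }
  return n;
}

// Digits in the largest value of an integer type that is `bytes` wide.
static unsigned numeric_precision(unsigned bytes, bool is_unsigned)
{
  assert(bytes >= 1 && bytes <= 8);
  unsigned bits= bytes * 8;
  uint64_t max_unsigned= bits == 64 ? UINT64_MAX : (uint64_t(1) << bits) - 1;
  uint64_t max_value= is_unsigned ? max_unsigned : max_unsigned >> 1;
  return decimal_digits(max_value);
}

// Characters needed to print any value: digits plus a sign for signed types.
static unsigned display_length(unsigned bytes, bool is_unsigned)
{
  return numeric_precision(bytes, is_unsigned) + (is_unsigned ? 0 : 1);
}
```

For a 3-byte MEDIUMINT this gives a precision of 8 for the unsigned type
(maximum 16777215) and a display length of 8 for the signed type
(minimum -8388608).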
The code in Type_handler_blob****::make_conversion_table_field()
erroneously assumed that row format replication uses
MYSQL_TYPE_TINY_BLOB, MYSQL_TYPE_BLOB, MYSQL_TYPE_MEDIUM_BLOB,
MYSQL_TYPE_LONG_BLOB type codes to transfer BLOB variations.
In fact, all BLOB variations use MYSQL_TYPE_BLOB as the type
code, while the BLOB packlength (1, 2, 3 or 4) is transferred
in metadata.
The bug was introduced by aee068085ddb13f700780eeb61fce29c1e37df63
(MDEV-9238 Wrap create_virtual_tmp_table() into a class, split into different steps)
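A minimal sketch of the corrected lookup, assuming that the receiver has
already seen the single BLOB type code and only needs the packlength from
the metadata to tell the variants apart. BlobVariant and
blob_variant_from_packlength() are hypothetical names used for
illustration, not functions from the server.

```cpp
#include <cassert>

enum class BlobVariant { TinyBlob, Blob, MediumBlob, LongBlob, Invalid };

// Choose the BLOB variant from the packlength carried in metadata,
// i.e. the number of bytes used to store the value length.
static BlobVariant blob_variant_from_packlength(unsigned packlength)
{
  switch (packlength) {
  case 1: return BlobVariant::TinyBlob;
  case 2: return BlobVariant::Blob;
  case 3: return BlobVariant::MediumBlob;
  case 4: return BlobVariant::LongBlob;
  default: return BlobVariant::Invalid;  // corrupt or unknown metadata
  }
}
```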
The change N7 in MDEV-15340 (see the commit message) introduced
a regression in how CAST(AS TIME), HOUR(), TIME_TO_SEC() treat datetimes
'0000-00-DD hh:mm:ss' (i.e. with zero YYYYMM part and a non-zero day).
These functions historically do not mix days to hours on datetime-to-time
conversion. Implementations of the underlying methods used get_arg0_time()
to fetch MYSQL_TIME. After MDEV-15340, get_arg0_time() went through the
Time() constructor, which always adds '0000-00-DD' to hours automatically
(as in all other places in the code we do mix days to hours).
Changes:
1. Extending Time() to make it possible to choose a desired way of treating
'0000-00-DD' (ignore or mix to hours) on datetime-to-time conversion.
Adding a helper class Time::Options for this, which now describes two aspects
of Time() creation:
1. Flags for get_date()
2. Days/hours mixing behavior.
2. Removing Item_func::get_arg0_time(). Using Time() directly
in all affected classes. Forcing Time() to ignore (rather than mix)
'0000-00-DD' in these affected classes by passing a suitable Options value.
3. Adding Time::to_seconds(), to reuse the code between
Item_func_time_to_sec::decimal_op() and Item_func_time_to_sec::int_op().
4. Item_func::get_arg0_date() now returns only a datetime value,
with automatic time-to-datetime conversion if needed. An assert was
added to catch attempts to pass TIME_TIME_ONLY to get_arg0_date().
All callers were checked not to pass TIME_TIME_ONLY, this revealed
a bug MDEV-15363.
5. Changing Item_func_last_day::get_date() to remove the TIME_TIME_ONLY flag
before calling get_arg0_date(). This fixes MDEV-15363.
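The difference between the two ways of treating '0000-00-DD' can be shown
with a tiny standalone sketch. DatetimeParts and DayHandling are
hypothetical and much simpler than the real Time and Time::Options
classes. They only model the choice between ignoring the day part and
mixing it into hours.

```cpp
#include <cassert>

// Hypothetical reduced datetime with a zero YYYYMM part.
struct DatetimeParts
{
  unsigned day, hour, minute, second;
};

enum class DayHandling { Ignore, MixToHours };

// Hour part of the TIME value obtained from the datetime.
static unsigned long to_time_hours(const DatetimeParts &dt, DayHandling mode)
{
  return mode == DayHandling::MixToHours ? dt.day * 24UL + dt.hour : dt.hour;
}

// Seconds in the resulting TIME value, in the spirit of TIME_TO_SEC().
static unsigned long to_seconds(const DatetimeParts &dt, DayHandling mode)
{
  return (to_time_hours(dt, mode) * 60 + dt.minute) * 60 + dt.second;
}
```

For '0000-00-01 10:20:30' the Ignore mode yields 10 hours and 37230
seconds, while MixToHours yields 34 hours and 123630 seconds.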
The problem was that Item_func_hybrid_field_type::get_date() did not
convert the result to the correct data type, so MYSQL_TIME::time_type
of the get_date() result could be not in sync with field_type().
Changes:
1. Adding two new classes Datetime and Date to store MYSQL_TIMESTAMP_DATETIME
and MYSQL_TIMESTAMP_DATE values respectively
(in addition to earlier added class Time, for MYSQL_TIMESTAMP_TIME values).
2. Adding Item_func_hybrid_field_type::time_op().
It performs the operation using TIME representation,
and always returns a MYSQL_TIME value with time_type=MYSQL_TIMESTAMP_TIME.
Implementing time_op() for all affected children classes.
3. Fixing all implementations of date_op() to perform the operation
using strictly DATETIME representation. Now they always return a MYSQL_TIME
value with time_type=MYSQL_TIMESTAMP_{DATE|DATETIME},
according to the result data type.
4. Removing assignment of ltime.time_type to mysql_timestamp_type()
from all val_xxx_from_date_op(), because now date_op() makes sure
to return a proper MYSQL_TIME value with a good time_type (and other member)
5. Adding Item_func_hybrid_field_type::val_xxx_from_time_op().
6. Overriding Type_handler_time_common::Item_func_hybrid_field_type_val_xxx()
to call val_xxx_from_time_op() instead of val_xxx_from_date_op().
7. Modified Item_func::get_arg0_date() to return strictly a TIME value
if TIME_TIME_ONLY is passed, or return strictly a DATETIME value otherwise.
If args[0] returned a value of a different temporal type,
(for example a TIME value when TIME_TIME_ONLY was not passed,
or a DATETIME value when TIME_TIME_ONLY was passed), the conversion
is automatically applied.
Earlier, get_arg0_date() did not guarantee a result in
accordance to TIME_TIME_ONLY flag.
There were two problems related to the bug report:
1. Item_datetime::get_date() was not implemented.
So execution went through val_int() followed
by int-to-datetime or int-to-time conversion.
This was the reason why the optimizer did not
work well on data with fractional seconds.
2. Item_datetime::set() did not have a TIME specific code
to mix months and days to hours after unpack_time().
This is why the optimizer did not work well with negative
TIME values, as well as huge time values.
Changes:
1. Overriding Item_datetime::get_date(), to return ltime.
This fixes the problem N1.
2. Cleanup: Moving pack_time() and unpack_time() from
sql-common/my_time.c and include/my_time.h to
sql/sql_time.cc and sql/sql_time.h, as they are not needed
on the client side.
3. Adding a new "enum_mysql_timestamp_type ts_type" parameter
to unpack_time() and moving the TIME specific code to mix
months and days with hours inside unpack_time().
Adding a new "ts_type" parameter to Item_datetime::set(),
to pass it from the caller down to unpack_time().
So now the TIME specific code is automatically called
from Item_datetime::set(). This fixes the problem N2.
This change also helped to get rid of duplicate TIME specific code
from other three places, where mixing month/days to hours
was done immediately after unpack_time().
Moving the DATE specific code to zero hhmmssff
from Item_func_min_max::get_date_native to inside unpack_time(),
for symmetry.
4. Removing the virtual method in_vector::result_type(),
adding in_vector::type_handler() instead.
This helps to get result_type(), field_type(),
mysql_timestamp_type() of an in_vector easier.
Passing type_handler()->mysql_timestamp_type() as
a new parameter to Item_datetime::set() inside
in_temporal::value_to_item().
5. Cleanup: Removing separate implementations of in_datetime::get_value()
and in_time::get_value(). Adding a single implementation
in_temporal::get_value() instead.
Passing type_handler()->field_type() to get_value_internal().
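Why unpack_time() needs the ts_type parameter can be seen from a sketch of
a packed temporal value. The layout used below (13 slots for months, 32
for days, 24 for hours, and so on) is an assumption made for this
illustration, and Parts, pack() and unpack() are hypothetical names. The
sign of negative TIME values is left out. With such a layout a TIME value
of 838 hours spills into the day and month slots on unpacking, so the
TIME branch has to fold them back into hours.

```cpp
#include <cassert>
#include <cstdint>

enum class TsType { Date, Datetime, Time };

struct Parts
{
  unsigned year, month, day, hour, minute, second;
  uint32_t micro;
};

// Pack all parts into one integer (assumed layout, sign not handled).
static uint64_t pack(const Parts &t)
{
  uint64_t v= static_cast<uint64_t>(t.year) * 13 + t.month;
  v= v * 32 + t.day;
  v= v * 24 + t.hour;
  v= v * 60 + t.minute;
  v= v * 60 + t.second;
  return v * 1000000 + t.micro;
}

// Unpack and apply the type-specific fix-up in one place.
static Parts unpack(uint64_t packed, TsType ts_type)
{
  Parts t;
  t.micro= static_cast<uint32_t>(packed % 1000000); packed/= 1000000;
  t.second= static_cast<unsigned>(packed % 60);     packed/= 60;
  t.minute= static_cast<unsigned>(packed % 60);     packed/= 60;
  t.hour= static_cast<unsigned>(packed % 24);       packed/= 24;
  t.day= static_cast<unsigned>(packed % 32);        packed/= 32;
  t.month= static_cast<unsigned>(packed % 13);      packed/= 13;
  t.year= static_cast<unsigned>(packed);
  switch (ts_type) {
  case TsType::Time:       // mix months and days back into hours
    t.hour+= (t.month * 32 + t.day) * 24;
    t.month= t.day= 0;
    break;
  case TsType::Date:       // a DATE has no time part
    t.hour= t.minute= t.second= 0;
    t.micro= 0;
    break;
  case TsType::Datetime:
    break;
  }
  return t;
}
```

Keeping the fix-up inside unpack() means callers cannot forget it, which
is the point of moving the duplicated TIME-specific code into one place.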
1. Removing data type specific constants from enum_item_param_state,
adding SHORT_DATA_VALUE instead.
2. Replacing tests for Item_param::state for the removed constants to
tests for Type_handler::cmp_type() against {INT|REAL|TIME|DECIMAL}_RESULT.
Deriving Item_param::PValue from Type_handler_hybrid_field_type,
to store the data type handler of the current value of the parameter.
3. Moving Item_param::decimal_value and Item_param::str_value_ptr
to Item_param::PValue. Adding Item_param::PValue::m_string
and changing Item_param to use it to store string values,
instead of Item::str_value. The intent is to replace Item_param::value
to a st_value based implementation in the future, to avoid duplicate code.
Adding a sub-class Item::PValue_simple, to implement
Item_param::PValue::swap() easier.
Renaming Item_basic_value::fix_charset_and_length_from_str_value()
to fix_charset_and_length() and adding the "CHARSET_INFO" pointer
parameter, instead of getting it directly from item->str_value.charset().
Changing Item_param to pass value.m_string.charset() instead
of str_value.charset().
Adding a String argument to the overloaded
fix_charset_and_length_from_str_value() and changing Item_param
to pass value.m_string instead of str_value.
4. Replacing the case in Item_param::save_in_field() to a call
for Type_handler::Item_save_in_field().
5. Adding new methods into Item_param::PValue:
val_real(), val_int(), val_decimal(), val_str().
Changing the corresponding Item_param methods
to use these new Item_param::PValue methods
internally. Adding a helper method
Item_param::can_return_value() and removing
duplicate code in Item_param::val_xxx().
6. Removing value.set_handler() from Item_param::set_conversion()
and Type_handler_xxx::Item_param_set_from_value().
It's now done inside Item_param::set_param_func(),
Item_param::set_value() and Item_param::set_limit_clause_param().
7. Changing Type_handler_int_result::Item_param_set_from_value()
to set max_length using attr->max_length instead of
MY_INT64_NUM_DECIMAL_DIGITS, to preserve the data type
of the assigned expression more precisely.
8. Adding Type_handler_hybrid_field_type::swap(),
using it in Item_param::PValue::swap().
9. Moving the data-type specific code from
Item_param::query_val_str(), Item_param::eq(),
Item_param::clone_item() to
Item_param::value_query_type_str(),
Item_param::value_eq(), Item_param::value_clone_item(),
to split the "state" dependent code and
the data type dependent code.
Later we'll split the data type related code further
and add new methods in Type_handler. This will be done
after we replace Item_param::PValue to st_value.
10. Adding asserts into set_int(), set_double(), set_decimal(),
set_time(), set_str(), set_longdata() to make sure that
the value set to Item_param corresponds to the previously
set data type handler.
11. Adding tests into t/ps.test and suite/binlog/t/binlog_stm_ps.test,
to cover Item_param::print() and Item_param::append_for_log()
for LIMIT clause parameters.
Note, the patch does not change the behavior covered by the new
tests. Adding for better code coverage.
12. Adding tests for more precise integer data type in queries like this:
EXECUTE IMMEDIATE
'CREATE OR REPLACE TABLE t1 AS SELECT 999999999 AS a,? AS b'
USING 999999999;
The explicit integer literal and the same integer literal
passed as a PS parameter now produce columns of the same data type.
Re-recording old results in ps.result, gis.result, func_hybrid_type.result
accordingly.
- sql_prepare.cc: Moving functions set_param_xxx() as
methods to Item_param
- Replacing a pointer to a function Item_param::set_param_func
to Type_handler based implementation:
Item_param::value now derives from Type_handler_hybrid_field_type.
Adding new virtual methods Type_handler::Item_param_setup_conversion()
and Type_handler::Item_param_set_param_func()
- Moving declaration of some Item_param members from "public:" to "private:"
(CONVERSION_INFO, value, decimal_value)
- Adding a new method Item_param::set_limit_clause_param(),
to share duplicate code, as well as to encapsulate
Item_param::value.
- Adding Item_param::setup_conversion_string() and
Item_param::setup_conversion_blob() to share
the code for binding from a client value
(mysql_stmt_bind_param), and for binding from
an expression (Item).
- Removing two different functions set_param_str_or_null()
and set_param_str(). Adding a common method Item_param::set_param_str().
Adding Item_param::m_empty_string_is_null, used by Item_param::set_param_str().
- Removing the call for setup_one_conversion_function() from
insert_params_from_actual_params_with_log(). It's not needed,
because the call for ps_param->save_in_param() makes sure
to initialize all data type dependent members properly,
by calling setup_conversion_string() from
Type_handler_string_result::Item_param_set_from_value()
and by calling setup_conversion_blob() from
Type_handler_blob_common::Item_param_set_from_value()
- Cleanup: removing multiplication by MY_CHARSET_BIN_MB_MAXLEN
in a few places. It's 1 anyway, and will never change.
Side effect: the second debug Note in cache_temporal_4265.result disappeared.
Before this change:
- During JOIN::cache_const_exprs(),
Item::get_cache() for Item_date_add_interval() was called.
The data type for date_add('2001-01-01',interval 5 day) is VARCHAR,
because the first argument is VARCHAR (not temporal).
Item_get_cache() created Item_cache_str('2001-01-06').
- During evaluate_join_record(), get_datetime_value() was called,
which called Item::get_date() for Item_cache_str('2001-01-06').
This gave the second Note. Then, get_datetime_value() created
a new cache, now Item_cache_temporal for '2001-01-06', so no
further str_to_datetime() happened.
After this change:
- During Item_bool_rowready_func2::fix_length_and_dec(),
Arg_comparator::set_cmp_func_datetime() is called,
which immediately creates an instance of Item_cache_date for
the result of date_add('2001-01-01',interval 5 day).
So later no str_to_datetime happens any more,
neither during JOIN::cache_const_exprs(),
nor during evaluate_join_record().
- Implementing stricter data type control for Item_long_func descendants
- Cleanup: renaming Type_handler::can_return_str_ascii() to can_return_text()
(a better name).