To fix the crash there we need to make sure that the
server while storing the statistical values in statistical tables should do it
in a multi-byte safe way.
Also there is no need to throw warnings if there is truncation while storing
values from statistical fields.
This patch implements engine independent unique hash index.
Usage:- Unique HASH index can be created automatically for blob/varchar/test column whose key
length > handler->max_key_length()
or it can be explicitly specified.
Automatic Creation:-
Create TABLE t1 (a blob unique);
Explicit Creation:-
Create TABLE t1 (a int , unique(a) using HASH);
Internal KEY_PART Representations:-
Long unique key_info will have 2 representations.
(lets understand this with an example create table t1(a blob, b blob , unique(a, b)); )
1. User Given Representation:- key_info->key_part array will be similar to what user has defined.
So in case of example it will have 2 key_parts (a, b)
2. Storage Engine Representation:- In this case there will be only one key_part and it will point to
HASH_FIELD. This key_part will be always after user defined key_parts.
So:- User Given Representation [a] [b] [hash_key_part]
key_info->key_part ----^
Storage Engine Representation [a] [b] [hash_key_part]
key_info->key_part ------------^
Table->s->key_info will have User Given Representation, While table->key_info will have Storage Engine
Representation.Representation can be changed into each other by calling re/setup_keyinfo_hash function.
Working:-
1. So when user specifies HASH_INDEX or key_length is > handler->max_key_length(), In mysql_prepare_create_table
One extra vfield is added (for each long unique key). And key_info->algorithm is set to HA_KEY_ALG_LONG_HASH.
2. In init_from_binary_frm_image values for hash_keypart is set (like fieldnr , field and flags)
3. In parse_vcol_defs, HASH_FIELD->vcol_info is created. Item_func_hash is used with list of Item_fields,
When Explicit length is given by user then Item_left is used to concatenate Item_field values.
4. In ha_write_row/ha_update_row check_duplicate_long_entry_key is called which will create the hash key from
table->record[0] and then call ha_index_read_map , if we found duplicated hash , we will compare the result
field by field.
If we instantly change the size of a fixed-length field
and treat it as kind-of variable-length, then we will need
conversions between old column values and new ones.
I tried adding such a conversion to row_build(), but then I
noticed that more conversions would be needed, because
old values still appeared in a freshly rebuilt secondary index,
causing a mismatch when trying to search with the correct
longer value that was converted in my provisional fix to row_build().
So, we will revert the essential part of
MDEV-15563: Instant ROW_FORMAT=REDUNDANT column extension
(commit 22feb179ae166500ec91feec6246c8154e33f9a2), but not
remove any tests.
Field_str::is_equal(): Do not allow instant conversions between
BIT (which is stored big-endian) and integer types (which can
be stored big-endian or little-endian, depending on storage engine).
row_sel_field_store_in_mysql_format_func(): Properly extend
narrower integer and DATA_FIXBINARY values to the current format.
DATA_FIXBINARY was incorrectly padded with 0x20 instead of 0.
1. Renaming Type_handler_json to Type_handler_json_longtext
There will be other JSON handlers soon, e.g. Type_handler_json_varchar.
2. Making the code more symmetric for data types:
- Adding a new virtual method
Type_handler::Column_definition_validate_check_constraint()
- Moving JSON-specific code from sql_yacc.yy to
Type_handler_json_longtext::Column_definition_validate_check_constraint()
3. Adding new files sql_type_json.cc and sql_type_json.h
and moving Type_handler+JSON related code into these files.
Allow ALGORITHM=INSTANT (or avoid touching any data)
when changing the collation, or in some cases, the character set,
of a non-indexed CHAR or VARCHAR column. There is no penalty
for subsequent DDL or DML operations, and compatibility with
older MariaDB versions will be unaffected.
Character sets may be changed when the old encoding is compatible
with the new one. For example, changing from ASCII to anything
ASCII-based, or from 3-byte to 4-byte UTF-8 can sometimes be
performed instantly.
This is joint work with Eugene Kosov.
The test cases as well as ALTER_CONVERT_TO, charsets_are_compatible(),
Type_handler::Charsets_are_compatible() are his work.
The Field_str::is_equal(), Field_varstring::is_equal() and
the InnoDB changes were mostly rewritten by me due to conflicts
with MDEV-15563.
Limitations:
Changes of indexed columns will still require
ALGORITHM=COPY. We should allow ALGORITHM=NOCOPY and allow
the indexes to be rebuilt inside the storage engine,
without copying the entire table.
Instant column size changes (in bytes) are not supported by
all storage engines.
Instant CHAR column changes are only allowed for InnoDB
ROW_FORMAT=REDUNDANT. We could allow this for InnoDB
when the CHAR internally uses a variable-length encoding,
say, when converting from 3-byte UTF-8 to 4-byte UTF-8.
Instant VARCHAR column changes are allowed for InnoDB
ROW_FORMAT=REDUNDANT, and for others only if the size
in bytes does not change from 128..255 bytes to more
than 256 bytes.
Inside InnoDB, this slightly changes the way how MDEV-15563
works and fixes the result of the innodb.instant_alter_extend test.
We change the way how ALTER_COLUMN_EQUAL_PACK_LENGTH_EXT
is handled. All column extension, type changes and renaming
now go through a common route, except when ctx->is_instant()
is in effect, for example, instant ADD or DROP COLUMN has
been initiated. Only in that case we will go through
innobase_instant_try() and rewrite all column metadata.
get_type(field, prtype, mtype, len): Convert a SQL data type into
InnoDB column metadata.
innobase_rename_column_try(): Remove the update of SYS_COLUMNS.
innobase_rename_or_enlarge_column_try(): New function,
replacing part of innobase_rename_column_try() and all of
innobase_enlarge_column_try(). Also changes column types.
innobase_rename_or_enlarge_columns_cache(): Also change
the column type.
When creating a field of type JSON, it will be automatically
converted to TEXT with CHECK (json_valid(`a`)), if there wasn't any
previous check for the column.
Additional things:
- Added two bug fixes that was found while testing JSON. These bug
fixes has also been pushed to 10.3 (with a test case), but as they
where minimal and needed to get this task done and tested, the fixes
are repeated here.
- CREATE TABLE ... SELECT drops constraints for columns that
are both in the create and select part.
- If one has both a default expression and check constraint for a
column, one can get the error "Expression for field `a` is refering
to uninitialized field `a`.
- Removed some duplicate MYSQL_PLUGIN_IMPORT symbols
- CREATE TABLE ... SELECT drops constraints for columns that
are both in the create and select part.
- Fixed by copying the constraint in
Column_definiton::redefine_stage1_common()
- If one has both a default expression and check constraint for a
column, one can get the error "Expression for field `a` is refering
to uninitialized field `a`.
- Fixed by ignoring default expressions for current column when checking
for CHECK constraint
This was developed by Aleksey Midenkov based on my design.
In the original InnoDB storage format (that was retroactively named
ROW_FORMAT=REDUNDANT in MySQL 5.0.3), the length of each index field
is stored explicitly.
Because of this, we can and now will allow instant conversion from
VARCHAR to CHAR or VARBINARY to BINARY of equal or greater size,
as well as instant conversion of TINYINT to SMALLINT to MEDIUMINT
to INT to BIGINT (while not changing between signed and unsigned).
Theoretically, we could allow changing from an unsigned integer to
a bigger unsigned integer, as well as changing CHAR to VARCHAR, but
that would require additional metadata and conversions whenever
reading old records.
Field_str::is_equal(), Field_varstring::is_equal(), Field_num::is_equal():
Return the new result IS_EQUAL_PACK_LENGTH_EXT if the table advertises
HA_EXTENDED_TYPES_CONVERSION capability and we are considering the
above-mentioned conversions.
ALTER_COLUMN_EQUAL_PACK_LENGTH_EXT: A new ALTER TABLE flag, similar
to ALTER_COLUMN_EQUAL_PACK_LENGTH but requiring conversions when
reading the data. The Field::is_equal() result IS_EQUAL_PACK_LENGTH_EXT
will map to this flag.
dtype_get_fixed_size_low(): For BINARY, CHAR and integer columns
in ROW_FORMAT=REDUNDANT, return 0 (variable length) from now on.
dtype_get_sql_null_size(): Keep returning the current size for
BINARY, CHAR and integer columns, so that in ROW_FORMAT=REDUNDANT
it will remain possible to update in place between NULL and NOT NULL
values.
btr_index_rec_validate(): Relax a CHECK TABLE length check for
ROW_FORMAT=REDUNDANT tables.
btr_cur_instant_init_low(): No longer trust fixed_len
for ROW_FORMAT=REDUNDANT tables.
We cannot rely on fixed_len anymore because the record can have shorter
length from before instant extension. Note that importing such tablespace
into earlier MariaDB versions produces ER_TABLE_SCHEMA_MISMATCH when
using a .cfg file.
In the original InnoDB storage format (which was retroactively named
ROW_FORMAT=REDUNDANT in MySQL 5.0.3), the length of each index field
is stored explicitly. Thus, we can and from now on will allow arbitrary
extension of VARBINARY and VARCHAR columns when the table is in
ROW_FORMAT=REDUNDANT.
ha_innobase::open(): Advertise a new HA_EXTENDED_TYPES_CONVERSION
capability for ROW_FORMAT=REDUNDANT tables.
Field_varstring::is_equal(): If the HA_EXTENDED_TYPES_CONVERSION
capability is advertised for the table, return IS_EQUAL_PACK_LENGTH
for any length extension.
For up to 127 bytes length, InnoDB would use 1 byte for length, and
that byte would always be less than 128. If the maximum length is
longer than 255 bytes, InnoDB would use a variable-length encoding
for the length, using 1 byte for lengths up to 127 bytes, and
2 bytes for longer lengths.
Thus, 1-byte lengths are always compatible when the maximum size
changes from less than 128 bytes to anything longer.
Field_varstring::is_equal(): Return IS_EQUAL_PACK_LENGTH also when
converting from VARCHAR less than 128 bytes to any longer VARCHAR.
remove TABLE_SHARE::error_table_name() and TABLE_SHARE::orig_table_name
(that was allocated in a wrong memroot in this bug).
instead, simply set TABLE_SHARE::table_name correctly.
The error message modified.
Then the TABLE_SHARE::error_table_name() implementation taken from 10.3,
to be used as a name of the table in this message.
- clean up DEFAULT() to work only with default value and correctly print
itself.
- fix of DBUG_ASSERT about fields read/write
- fix of field marking for write based really on the thd->mark_used_columns flag
MDEV-17625 Different warnings when comparing a garbage to DATETIME vs TIME
- Splitting processes of data type conversion (to TIME/DATE,DATETIME)
and warning generation.
Warning are now only get collected during conversion (in an "int" variable),
and are pushed in the very end of conversion (not in parallel).
Warnings generated by the low level routines str_to_xxx() and number_to_xxx()
can now be changed at the end, when TIME_FUZZY_DATES is applied,
from "Invalid value" to "Truncated invalid value".
Now "Illegal value" is issued only when the low level routine returned
an error and TIME_FUZZY_DATES was not set. Otherwise, if the low level
routine returned "false" (success), or if NULL was converted to a zero
datetime by TIME_FUZZY_DATES, then "Truncated illegal value"
is issued. This gives better warnings.
- Methods Type_handler::Item_get_date() and
Type_handler::Item_func_hybrid_field_type_get_date() now only
convert and collect warning information, but do not push warnings.
- Changing the return data type for Type_handler::Item_get_date()
and Type_handler::Item_func_hybrid_field_type_get_date() from
"bool" to "void". The conversion result (success vs error) can be
checked by testing ltime->time_type. MYSQL_TIME_{NONE|ERROR}
mean mean error, other values mean success.
- Adding new wrapper methods Type_handler::Item_get_date_with_warn() and
Type_handler::Item_func_hybrid_field_type_get_date_with_warn()
to do conversion followed by raising warnings, and changing
the code to call new Type_handler::***_with_warn() methods.
- Adding a helper class Temporal::Status, a wrapper
for MYSQL_TIME_STATUS with automatic initialization.
- Adding a helper class Temporal::Warn, to collect warnings
but without actually raising them. Moving a part of ErrConv
into a separate class ErrBuff, and deriving both Temporal::Warn
and ErrConv from ErrBuff. The ErrBuff part of Temporal::Warn
is used to collect textual representation of the input data.
- Adding a helper class Temporal::Warn_push. It's used
to collect warning information during conversion, and
automatically pushes warnings to the diagnostics area
on its destructor time (in case of non-zero warning).
- Moving more code from various functions inside class Temporal.
- Adding more Temporal_hybrid constructors and
protected Temporal methods make_from_xxx(),
which convert and only collect warning information, but do not
actually raise warnings.
- Now the low level functions str_to_datetime() and str_to_time()
always set status->warning if the return value is "true" (error).
- Now the low level functions number_to_time() and number_to_datetime()
set the "*was_cut" argument if the return value is "true" (error).
- Adding a few DBUG_ASSERTs to make sure that str_to_xxx() and
number_to_xxx() always set warnings on error.
- Adding new warning flags MYSQL_TIME_WARN_EDOM and MYSQL_TIME_WARN_ZERO_DATE
for the code symmetry. Before this change there was a special
code path for (rc==true && was_cut==0) which was treated by
Field_temporal::store_invalid_with_warning as "zero date violation".
Now was_cut==0 always means that there are no any error/warnings/notes
to be raised, not matter what rc is.
- Using new Temporal_hybrid constructors in combination with
Temporal::Warn_push inside str_to_datetime_with_warn(),
double_to_datetime_with_warn(), int_to_datetime_with_warn(),
Field::get_date(), Item::get_date_from_string(), and a few other places.
- Removing methods Dec_ptr::to_datetime_with_warn(),
Year::to_time_with_warn(), my_decimal::to_datetime_with_warn(),
Dec_ptr::to_datetime_with_warn().
Fixing Sec6::to_time() and Sec6::to_datetime() to
convert and only collect warnings, without raising warnings.
Now warning raising functionality resides in Temporal::Warn_push.
- Adding classes Longlong_hybrid_null and Double_null, to
return both value and the "IS NULL" flag. Adding methods
Item::to_double_null(), to_longlong_hybrid_null(),
Item_func_hybrid_field_type::to_longlong_hybrid_null_op(),
Item_func_hybrid_field_type::to_double_null_op().
Removing separate classes VInt and VInt_op, as they
have been replaced by a single class Longlong_hybrid_null.
- Adding a helper method Temporal::type_name_by_timestamp_type(),
moving a part of make_truncated_value_warning() into it,
and reusing in Temporal::Warn::push_conversion_warnings().
- Removing Item::make_zero_date() and
Item_func_hybrid_field_type::make_zero_mysql_time().
They provided duplicate functionality.
Now this code resides in Temporal::make_fuzzy_date().
The latter is now called for all Item types when data type
conversion (to DATE/TIME/DATETIME) is involved, including
Item_field and Item_direct_view_ref.
This fixes MDEV-17563: Item_direct_view_ref now correctly converts
NULL to a zero date when TIME_FUZZY_DATES says so.
The problem happened because {{Field_xxx::store(longlong nr, bool unsigned_val)}} erroneously passed {{unsigned_flag}} to the {{usec}} parameter of this constructor:
{code:cpp}
Datetime(int *warn, longlong sec, ulong usec, date_conv_mode_t flags)
{code}
1. Changing Time and Datetime constructors to accept data as Sec6 rather than as
longlong/double/my_decimal, so it's not possible to do such mistakes
in the future. Additional good effect of these changes:
- This reduced some amount of similar code (minus ~35 lines).
- The code now does not rely on the fact that "unsigned_flag" is
not important inside Datetime().
The constructor always gets all three parts: sign, integer part,
fractional part. The simple the better.
2. Fixing Field_xxx::store() to use the new Datetime constructor format.
This change actually fixes the problem.
3. Adding "explicit" keyword to all Sec6 constructors,
to avoid automatic hidden conversion from double/my_decimal to Sec6,
as well as from longlong/ulonglong through double to Sec6.
4. Change#1 caused (as a dependency) changes in a few places
with code like this:
bool neg= nr < 0 && !unsigned_val;
ulonglong value= m_neg ? (ulonglong) -nr : (ulonglong) nr;
These fragments relied on a non-standard behavior with
the operator "minus" applied to the lowest possible negative
signed long long value. This can lead to different results
depending on the platform and compilation flags.
We have fixed such bugs a few times already.
So instead of modifying the old wrong code to a new wrong code,
replacing all such fragments to use Longlong_hybrid,
which correctly handles this special case with -LONGLONG_MIN
in its method abs().
This also reduced the amount of similar code
(1 or 2 new lines instead 3 old lines in all 6 such fragments).
5. Removing ErrConvInteger(longlong nr, bool unsigned_flag= false)
and adding ErrConvInteger(Longlong_hybrid) instead, to encourage
use of safe Longlong_hybrid instead of unsafe pairs nr+neg.
6. Removing unused ErrConvInteger from Item_cache_temporal::get_date()
This reverts commit 87609324e0ee
RocksDB was making invalid assumption about Field_blob::make_sort_key,
and the commit 87609324e0ee changed Field_blob::make_sort_key to match
RocksDB assumptions.
It also unintentionaly broke sys_vars.max_sort_length_func