The MDEV-17262 commit 26432e49d3
was skipped. In Galera 4, the implementation would seem to require
changes to the streaming replication.
In the tests archive.rnd_pos main.profiling, disable_ps_protocol
for SHOW STATUS and SHOW PROFILE commands until MDEV-18974
has been fixed.
The macro innobase_is_v_fld() turns out to be equivalent with
the opposite of Field::stored_in_db(). Remove the macro and
invoke the member function directly.
innodb_base_col_setup_for_stored(): Simplify a condition to only
check Field::vcol_info.
innobase_create_index_def(): Replace some redundant code with
DBUG_ASSERT().
This is a follow-up task to MDEV-12026, which introduced
innodb_checksum_algorithm=full_crc32 and a simpler page format.
MDEV-12026 did not enable full_crc32 for page_compressed tables,
which we will be doing now.
This is joint work with Thirunarayanan Balathandayuthapani.
For innodb_checksum_algorithm=full_crc32 we change the
page_compressed format as follows:
FIL_PAGE_TYPE: The most significant bit will be set to indicate
page_compressed format. The least significant bits will contain
the compressed page size, rounded up to a multiple of 256 bytes.
The checksum will be stored in the last 4 bytes of the page
(whether it is the full page or a page_compressed page whose
size is determined by FIL_PAGE_TYPE), covering all preceding
bytes of the page. If encryption is used, then the page will
be encrypted between compression and computing the checksum.
For page_compressed, FIL_PAGE_LSN will not be repeated at
the end of the page.
FSP_SPACE_FLAGS (already implemented as part of MDEV-12026):
We will store the innodb_compression_algorithm that may be used
to compress pages. Previously, the choice of algorithm was written
to each compressed data page separately, and one would be unable
to know in advance which compression algorithm(s) are used.
fil_space_t::full_crc32_page_compressed_len(): Determine if the
page_compressed algorithm of the tablespace needs to know the
exact length of the compressed data. If yes, we will reserve and
write an extra byte for this right before the checksum.
buf_page_is_compressed(): Determine if a page uses page_compressed
(in any innodb_checksum_algorithm).
fil_page_decompress(): Pass also fil_space_t::flags so that the
format can be determined.
buf_page_is_zeroes(): Check if a page is full of zero bytes.
buf_page_full_crc32_is_corrupted(): Renamed from
buf_encrypted_full_crc32_page_is_corrupted(). For full_crc32,
we always simply validate the checksum to the page contents,
while the physical page size is explicitly specified by an
unencrypted part of the page header.
buf_page_full_crc32_size(): Determine the size of a full_crc32 page.
buf_dblwr_check_page_lsn(): Make this a debug-only function, because
it involves potentially costly lookups of fil_space_t.
create_table_info_t::check_table_options(),
ha_innobase::check_if_supported_inplace_alter(): Do allow the creation
of SPATIAL INDEX with full_crc32 also when page_compressed is used.
commit_cache_norebuild(): Preserve the compression algorithm when
updating the page_compression_level.
dict_tf_to_fsp_flags(): Set the flags for page compression algorithm.
FIXME: Maybe there should be a table option page_compression_algorithm
and a session variable to back it?
row_drop_tables_for_mysql_in_background(): Copy the table name
before closing the table handle, to avoid heap-use-after-free if
another thread succeeds in dropping the table before
row_drop_table_for_mysql_in_background() completes the table name lookup.
dict_mem_create_temporary_tablename(): With innodb_safe_truncate=ON
(the default), generate a simple, unique, collision-free table name
using only the id, no pseudorandom component. This is safe, because
on startup, we will drop any #sql tables that might exist in InnoDB.
This is a backport from 10.3. It should have been backported already
as part of backporting MDEV-14717,MDEV-14585 which were prerequisites
for the MDEV-13564 backup-friendly TRUNCATE TABLE.
This seems to reduce the chance of table creation failures in
ha_innobase::truncate().
ha_innobase::truncate(): Do not invoke close(), but instead
mimic it, so that we can restore to the original table handle
in case opening the truncated copy of the table failed.
In commit d30f17af49 the change of
the loop iteration broke another error handling path that did
"goto error_handling_drop_uncached". Cover this code path with
fault injection, and revert to the correct iteration.
There are two fault injection labels innodb_OOM_prepare_inplace_alter.
Their order was swapped in MDEV-11369, so that the label that used
to be covered in an ADD INDEX code path would become unreachable
because the label that is executed for any ALTER TABLE was executed
first. Let us introduce the label innodb_OOM_prepare_add_index
for the more specific case.
if create_index_dict() fails, we need to free ctx->add_index[a] too
This fixes innodb.alter_crash and innodb.instant_alter_debug
failures in ASAN_OPTIONS="abort_on_error=1" runs
row_merge_create_index_graph(): Relay the internal state
from dict_create_index_step(). Our caller should free the index
only if it was not copied, added to the cache, and freed.
row_merge_create_index(): Free the index template if it was
not added to the cache. This is a safer variant of the logic
that was introduced in 65070beffd in 10.2.
prepare_inplace_alter_table_dict(): Add additional fault injection
to exercise a code path where we have already added an index
to the cache.
Only starting with MariaDB 10.3.8 (MDEV-16365), InnoDB can actually
handle ALTER IGNORE TABLE correctly when introducing a NOT NULL
attribute to a column that contains a NULL value. Between
MariaDB Server 10.0 and 10.2, we would incorrectly return an error
for ALTER IGNORE TABLE when the column contains a NULL value.
On an error (such as when an index cannot be dropped due to
FOREIGN KEY constraints), the field dict_index_t::to_be_dropped
was only being cleared in debug builds, even though the field
is available and being used also in non-debug builds.
This was a regression that was introduced by myself originally
in MySQL 5.7.6 and later merged to MariaDB 10.2.2, in
d39898de8e
An error manifested itself in the MariaDB Server 10.4 non-debug build,
involving instant ADD or DROP column. Because an earlier failed
ALTER TABLE operation incorrectly left the dict_index_t::to_be_dropped
flag set, the column pointers of the index fields would fail to be
adjusted for instant ADD or DROP column (MDEV-15562). The instant
ADD COLUMN in MariaDB Server 10.3 is unlikely to be affected by a
similar scenario, because dict_table_t::instant_add_column() in 10.3
is applying the transformations to all indexes, not skipping
to-be-dropped ones.
The prtype & DATA_LONG_TRUE_VARCHAR flag only plays a role when
converting between InnoDB internal format and the MariaDB SQL layer
row format. Ideally this flag would never have been persisted in the
InnoDB data dictionary.
There were bogus assertion failures when an instant ADD, DROP, or
column reordering was combined with a change of extending a VARCHAR
from less than 256 bytes to more than 255 bytes. Such changes are
allowed starting with MDEV-15563 in MariaDB 10.4.3.
dict_table_t::instant_column(), dict_col_t::same_format(): Ignore
the DATA_LONG_TRUE_VARCHAR flag, because it does not affect the
persistent storage format.
MariaDB data-at-rest encryption (innodb_encrypt_tables)
had repurposed the same unused data field that was repurposed
in MySQL 5.7 (and MariaDB 10.2) for the Split Sequence Number (SSN)
field of SPATIAL INDEX. Because of this, MariaDB was unable to
support encryption on SPATIAL INDEX pages.
Furthermore, InnoDB page checksums skipped some bytes, and there
are multiple variations and checksum algorithms. By default,
InnoDB accepts all variations of all algorithms that ever existed.
This unnecessarily weakens the page checksums.
We hereby introduce two more innodb_checksum_algorithm variants
(full_crc32, strict_full_crc32) that are special in a way:
When either setting is active, newly created data files will
carry a flag (fil_space_t::full_crc32()) that indicates that
all pages of the file will use a full CRC-32C checksum over the
entire page contents (excluding the bytes where the checksum
is stored, at the very end of the page). Such files will always
use that checksum, no matter what the parameter
innodb_checksum_algorithm is assigned to.
For old files, the old checksum algorithms will continue to be
used. The value strict_full_crc32 will be equivalent to strict_crc32
and the value full_crc32 will be equivalent to crc32.
ROW_FORMAT=COMPRESSED tables will only use the old format.
These tables do not support new features, such as larger
innodb_page_size or instant ADD/DROP COLUMN. They may be
deprecated in the future. We do not want an unnecessary
file format change for them.
The new full_crc32() format also cleans up the MariaDB tablespace
flags. We will reserve flags to store the page_compressed
compression algorithm, and to store the compressed payload length,
so that checksum can be computed over the compressed (and
possibly encrypted) stream and can be validated without
decrypting or decompressing the page.
In the full_crc32 format, there no longer are separate before-encryption
and after-encryption checksums for pages. The single checksum is
computed on the page contents that is written to the file.
We do not make the new algorithm the default for two reasons.
First, MariaDB 10.4.2 was a beta release, and the default values
of parameters should not change after beta. Second, we did not
yet implement the full_crc32 format for page_compressed pages.
This will be fixed in MDEV-18644.
This is joint work with Marko Mäkelä.
If we instantly change the size of a fixed-length field
and treat it as kind-of variable-length, then we will need
conversions between old column values and new ones.
I tried adding such a conversion to row_build(), but then I
noticed that more conversions would be needed, because
old values still appeared in a freshly rebuilt secondary index,
causing a mismatch when trying to search with the correct
longer value that was converted in my provisional fix to row_build().
So, we will revert the essential part of
MDEV-15563: Instant ROW_FORMAT=REDUNDANT column extension
(commit 22feb179ae), but not
remove any tests.
innobase_build_col_map_add(): Do not assume that old_field->pack_length()
equals to field->pack_length(). Fix submitted by Aleksey Midenkov.
innobase_instant_try(): Assert that the column length of fixed-length
NOT NULL columns is only changing for ROW_FORMAT=REDUNDANT.
The Create_field::charset can contain garbage for columns
that the SQL layer does not consider as being string columns.
InnoDB considers BIT a string column for historical reasons
(and backward compatibility with old persistent InnoDB metadata),
and therefore it checked the charset.
The Field::charset() consistently is my_charset_bin for BIT,
so we can trust that one.
instant_alter_column_possible(): Add the other MDEV-17459 work-around
condition. The existence of fulltext indexes only prevents instant
DROP COLUMN or changing the order of columns. Other forms of instant
ALTER TABLE are no problem.
Before commit 4e7ee166a9 that merged
the MDEV-18295 fix from 10.3, the work-around of MDEV-17459 in
instant_alter_column_possible() was categorically refusing any
ALGORITHM=INSTANT if any FULLTEXT INDEX was present. After that commit,
a related condition was only present in prepare_inplace_alter_table_dict()
but not in the other callers of instant_alter_column_possible().
Allow ALGORITHM=INSTANT (or avoid touching any data)
when changing the collation, or in some cases, the character set,
of a non-indexed CHAR or VARCHAR column. There is no penalty
for subsequent DDL or DML operations, and compatibility with
older MariaDB versions will be unaffected.
Character sets may be changed when the old encoding is compatible
with the new one. For example, changing from ASCII to anything
ASCII-based, or from 3-byte to 4-byte UTF-8 can sometimes be
performed instantly.
This is joint work with Eugene Kosov.
The test cases as well as ALTER_CONVERT_TO, charsets_are_compatible(),
Type_handler::Charsets_are_compatible() are his work.
The Field_str::is_equal(), Field_varstring::is_equal() and
the InnoDB changes were mostly rewritten by me due to conflicts
with MDEV-15563.
Limitations:
Changes of indexed columns will still require
ALGORITHM=COPY. We should allow ALGORITHM=NOCOPY and allow
the indexes to be rebuilt inside the storage engine,
without copying the entire table.
Instant column size changes (in bytes) are not supported by
all storage engines.
Instant CHAR column changes are only allowed for InnoDB
ROW_FORMAT=REDUNDANT. We could allow this for InnoDB
when the CHAR internally uses a variable-length encoding,
say, when converting from 3-byte UTF-8 to 4-byte UTF-8.
Instant VARCHAR column changes are allowed for InnoDB
ROW_FORMAT=REDUNDANT, and for others only if the size
in bytes does not change from 128..255 bytes to more
than 256 bytes.
Inside InnoDB, this slightly changes the way how MDEV-15563
works and fixes the result of the innodb.instant_alter_extend test.
We change the way how ALTER_COLUMN_EQUAL_PACK_LENGTH_EXT
is handled. All column extension, type changes and renaming
now go through a common route, except when ctx->is_instant()
is in effect, for example, instant ADD or DROP COLUMN has
been initiated. Only in that case we will go through
innobase_instant_try() and rewrite all column metadata.
get_type(field, prtype, mtype, len): Convert a SQL data type into
InnoDB column metadata.
innobase_rename_column_try(): Remove the update of SYS_COLUMNS.
innobase_rename_or_enlarge_column_try(): New function,
replacing part of innobase_rename_column_try() and all of
innobase_enlarge_column_try(). Also changes column types.
innobase_rename_or_enlarge_columns_cache(): Also change
the column type.
This was developed by Aleksey Midenkov based on my design.
In the original InnoDB storage format (that was retroactively named
ROW_FORMAT=REDUNDANT in MySQL 5.0.3), the length of each index field
is stored explicitly.
Because of this, we can and now will allow instant conversion from
VARCHAR to CHAR or VARBINARY to BINARY of equal or greater size,
as well as instant conversion of TINYINT to SMALLINT to MEDIUMINT
to INT to BIGINT (while not changing between signed and unsigned).
Theoretically, we could allow changing from an unsigned integer to
a bigger unsigned integer, as well as changing CHAR to VARCHAR, but
that would require additional metadata and conversions whenever
reading old records.
Field_str::is_equal(), Field_varstring::is_equal(), Field_num::is_equal():
Return the new result IS_EQUAL_PACK_LENGTH_EXT if the table advertises
HA_EXTENDED_TYPES_CONVERSION capability and we are considering the
above-mentioned conversions.
ALTER_COLUMN_EQUAL_PACK_LENGTH_EXT: A new ALTER TABLE flag, similar
to ALTER_COLUMN_EQUAL_PACK_LENGTH but requiring conversions when
reading the data. The Field::is_equal() result IS_EQUAL_PACK_LENGTH_EXT
will map to this flag.
dtype_get_fixed_size_low(): For BINARY, CHAR and integer columns
in ROW_FORMAT=REDUNDANT, return 0 (variable length) from now on.
dtype_get_sql_null_size(): Keep returning the current size for
BINARY, CHAR and integer columns, so that in ROW_FORMAT=REDUNDANT
it will remain possible to update in place between NULL and NOT NULL
values.
btr_index_rec_validate(): Relax a CHECK TABLE length check for
ROW_FORMAT=REDUNDANT tables.
btr_cur_instant_init_low(): No longer trust fixed_len
for ROW_FORMAT=REDUNDANT tables.
We cannot rely on fixed_len anymore because the record can have shorter
length from before instant extension. Note that importing such tablespace
into earlier MariaDB versions produces ER_TABLE_SCHEMA_MISMATCH when
using a .cfg file.
The code path where the table was not being rebuilt during ALTER TABLE
was not covered by the test. Add coverage, and remove the debug assertion
that could fail in this case.
MySQL 5.7 introduced the class page_size_t and increased the size of
buffer pool page descriptors by introducing this object to them.
Maybe the intention of this exercise was to prepare for a future
where the buffer pool could accommodate multiple page sizes.
But that future never arrived, not even in MySQL 8.0. It is much
easier to manage a pool of a single page size, and typically all
storage devices of an InnoDB instance benefit from using the same
page size.
Let us remove page_size_t from MariaDB Server. This will make it
easier to remove support for ROW_FORMAT=COMPRESSED (or make it a
compile-time option) in the future, just by removing various
occurrences of zip_size.
Analysis:
========
Increasing the length of the indexed varchar column is not an instant operation for
innodb.
Fix:
===
- Introduce the new handler flag 'Alter_inplace_info::ALTER_COLUMN_INDEX_LENGTH' to
indicate the index length differs due to change of column length changes.
- InnoDB makes the ALTER_COLUMN_INDEX_LENGTH flag as instant operation.
This is a port of Mysql fix.
commit 913071c0b16cc03e703308250d795bc381627e37
Author: Nisha Gopalakrishnan <nisha.gopalakrishnan@oracle.com>
Date: Wed May 30 14:54:46 2018 +0530
BUG#26848813: INDEXED COLUMN CAN'T BE CHANGED FROM VARCHAR(15)
TO VARCHAR(40) INSTANTANEOUSLY
ha_innobase::commit_inplace_alter_table(): Do not crash if
innobase_update_foreign_cache() returns an error. It can return
an error on ALTER TABLE if an inconsistent FOREIGN KEY constraint
was created earlier when SET foreign_key_checks=0 was in effect.
Instead, report a warning to the client that constraints cannot
be loaded.