Per fsp0types.h, SDI is on tablespace flags position 14 where MariaDB
stores its pagesize. Flag at position 13, also in MariaDB pagesize
flags, is a MySQL encryption flag.
These are checked only if fsp_flags_is_valid fails, so valid MariaDB
pages sizes don't become errors.
The error message "Cannot reset LSNs in table" was rather specific and
not always true to replaced with more generic error.
ALTER TABLE tbl IMPORT TABLESPACE now reports Unsupported on MySQL
tablespace (rather than index corrupted) along with a server error
message.
MySQL innodb Errors are with with UNSUPPORTED rather than CORRUPTED
to avoid user anxiety.
Reviewer: Marko Mäkelä
The issue is that record_should_be_deleted() returns true in
mysql_delete() even if sub-select with join gets error from storage
engine when DELETE FROM ... WHERE ... IN (SELECT ...) statement is
executed.
The same is true for mysql_update() where select->skip_record() returns
true even if sub-select with join gets error from storage engine.
In the test case if sub-select is chosen as deadlock victim the whole
transaction is rolled back during sub-select execution, but
mysql_delete()/mysql_update() continues transaction execution and invokes
table->delete_row() as record_should_be_deleted() wrongly returns true
in mysql_delete() and table->update_row() as select->skip_record(thd)
wrongly returns 1 for mysql_update().
record_should_be_deleted() wrogly returns true because thd->is_error()
returns false SQL_SELECT::skip_record() invoked from
record_should_be_deleted().
It's supposed that THD error should be set in rr_handle_error() called
from rr_sequential() during sub-select JOIN::exec_inner() execution.
But rr_handle_error() does not set THD error because
READ_RECORD::print_error is not set in JOIN_TAB::read_record.
READ_RECORD::print_error should be initialized in
init_read_record()/init_read_record_idx(). But make_join_readinfo() does
not invoke init_read_record()/init_read_record_idx() for
JOIN_TAB::read_record.
The fix is to set JOIN_TAB::read_record.print_error in
make_join_readinfo(), i.e. in the same place where
JOIN_TAB::read_record.table is set.
Reviewed by Sergey Petrunya.
Let us disable Valgrind on tests that would fail because a
server shutdown or a STOP SLAVE command would take longer,
causing the test harness to forcibly and silently kill the server
due to an exceeded timeout.
row_purge_get_partial(): Replaces trx_undo_rec_get_partial_row().
Also copy the purge_node_t::ref to the purge_node_t::row.
In this way, the clustered index key fields will always be
available, even if thanks to
commit d384ead0f0 (MDEV-14799)
they would no longer be repeated in the remaining part of the
undo log record.
row_log_table_apply_update(): Free the pcur.old_rec_buf before returning.
It may be allocated by btr_pcur_store_position() inside
btr_blob_log_check_t::check() and btr_store_big_rec_extern_fields().
This memory leak was introduced in
commit 2e814d4702 (MariaDB Server 10.2.2)
via mysql/mysql-server@ce0a1e85e2
(MySQL 5.7.5).
btr_lift_page_up(): If the leaf page only contains a hidden metadata
record for MDEV-11369 instant ADD COLUMN, convert the table to the
canonical format like we are supposed to do whenever the table
becomes empty.
Use suspend thread syncpoint instead of include/wait_condition.inc to
make sure DELETE created waiting lock before the next UPDATE begins
locking.
This is backport of commit 0fa4dd0747
from 10.6.
dict_table_rename_in_cache(), dict_table_get_highest_foreign_id():
Reserve sufficient space for the fkid[] buffer, and ensure that the
fkid[] will be NUL-terminated.
The fkid[] must accommodate both the database name (which is already
encoded in my_charset_filename) and the constraint name
(which must be converted to my_charset_filename) so that we can check
if it is in the format databasename/tablename_ibfk_1 (all encoded in
my_charset_filename).
trx_undo_page_report_rename(): Use the correct maximum length of
a table name. Both the database name and the table name can be up to
NAME_CHAR_LEN (64 characters) times 5 bytes per character in the
my_charset_filename encoding. They are not encoded in UTF-8!
fil_op_write_log(): Reserve the correct amount of log buffer for
a rename operation. The file name will be appended by
mlog_catenate_string().
rename_file_ext(): Reserve a large enough buffer for the file names.
- While creating a new InnoDB segment, allocates the extent
before allocating the inode or page allocation even though
the pages are present in fragment segment. This patch does
reserve the extent when InnoDB ran out of fragment pages
in the tablespace.
Let simplify the test.
The update_time is stored in the table metadata (dict_table_t);
it has nothing to do with buffer pool page eviction or replacement.
dict_load_foreigns(): Use a correctly sized buffer for the maximum-length
SYS_FOREIGN.ID. In case of overflow, do not crash the server but instead
return DB_CORRUPTION.
prepare_inplace_alter_table_dict(): If the table will not be rebuilt,
preserve all of the original ROW_FORMAT, including the compressed
page size flags related to ROW_FORMAT=COMPRESSED.
We do not really care about the exact result; we only care that the
statistics will be accessed. The result could change depending on
when some statistics were updated in the background or when some
committed delete-marked rows were purged from other tables on
which persistent statistics are enabled.
The counters were added in commit 5e55d1ced5
and any code to update them was
inadvertently removed in commit 2e814d4702
when applying InnoDB changes from MySQL 5.7.
Let us remove these counters that never reported anything useful. If such
statistics are really needed in a special case, they can be obtained by
instrumenting the code by some means, such as eBPF or a source code patch.
btr_insert_into_right_sibling(): Inherit any gap lock from the
left sibling to the right sibling before inserting the record
to the right sibling and updating the node pointer(s).
lock_update_node_pointer(): Update locks in case a node pointer
will move.
Based on mysql/mysql-server@c7d93c274f
The following condition has to added:
1) InnoDB fails to include the offset of the node pointer field
in non-leaf record for redundant row format.
2) If the Fixed length field does have only prefix length then
calculate the field maximum size as prefix length.
- Added the test case to test (2) and to check maximum number of
fields can exist in the index.
Starting with 10.3, an assertion would fail on the rollback of
a recovered incomplete transaction if a table definition violates
a FOREIGN KEY constraint.
DICT_ERR_IGNORE_RECOVER_LOCK: Include also DICT_ERR_IGNORE_FK_NOKEY
so that trx_resurrect_table_locks() will be able to load
table definitions and resurrect IX locks. Previously, if the
FOREIGN KEY constraints of a table were incomplete, the table
would fail to load until rollback, and in 10.3 or later an assertion
would fail that the rollback was not protected by a table IX lock.
Thanks to commit 9de2e60d74 there
will be no problems to enforce subsequent FOREIGN KEY operations
even though a table with invalid REFERENCES clause was loaded.
Let us explicitly wait for purge before invoking a slow shutdown,
so that instrumented builds (such as ASAN or UBSAN) will not
exceed the 60-second timeout during shutdown.
MDEV-27025 allows to insert records before the record on which DELETE is
locked, as a result the DELETE misses those records, what causes serious ACID
violation.
Revert MDEV-27025, MDEV-27550. The test which shows the scenario of ACID
violation is added.
btr_cur_optimistic_insert(): Disregard DEBUG_DBUG injection to
invoke btr_page_reorganize() if the page (and the table) is empty.
Otherwise, an assertion would fail in btr_page_reorganize_low()
because PAGE_MAX_TRX_ID is 0 in an empty secondary index leaf page.
Backported from 10.5 20e9e804c1 and
5948d7602e.
sel_restore_position_for_mysql() moves forward persistent cursor
position after btr_pcur_restore_position() call if cursor relative position
is BTR_PCUR_ON and the cursor points to the record with NOT the same field
values as in a stored record(and some other not important for this case
conditions).
It was done because btr_pcur_restore_position() sets
page_cur_mode_t mode to PAGE_CUR_LE for cursor->rel_pos == BTR_PCUR_ON
before opening cursor. So we are searching for the record less or equal
to stored one. And if the found record is not equal to stored one, then
it is less and we need to move cursor forward.
But there can be a situation when the stored record was purged, but the
new one with the same key but different value was inserted while
row_search_mvcc() was suspended. In this case, when the thread is
awaken, it will invoke sel_restore_position_for_mysql(), which, in turns,
invoke btr_pcur_restore_position(), which will return false because found
record don't match stored record, and
sel_restore_position_for_mysql() will move forward cursor position.
The above can lead to the case when awaken row_search_mvcc() do not see
records inserted by other transactions while it slept. The mtr test case
shows the example how it can be.
The fix is to return special value from persistent cursor restoring
function which would notify its caller that uniq fields of restored
record and stored record are the same, and in this case
sel_restore_position_for_mysql() don't move cursor forward.
Delete-marked records are correctly processed in row_search_mvcc().
Non-unique secondary indexes are "uniquified" by adding the PK, the
index->n_uniq should then be index->n_fields. So there is no need in
additional checks in the fix.
If transaction's readview can't see the changes made in secondary index
record, it requests clustered index record in row_search_mvcc() to check
its transaction id and get the correspondent record version. After this
row_search_mvcc() commits mtr to preserve clustered index latching
order, and starts mtr. Between those mtr commit and start secondary
index pages are unlatched, and purge has the ability to remove stored in
the cursor record, what causes rows duplication in result set for
non-locking reads, as cursor position is restored to the previously
visited record.
To solve this the changes are just switched off for non-locking reads,
it's quite simple solution, besides the changes don't make sense for
non-locking reads.
The more complex and effective from performance perspective solution is
to create mtr savepoint before clustered record requesting and rolling
back to that savepoint after that. See MDEV-27557.
One more solution is to have per-record transaction id for secondary
indexes. See MDEV-17598.
If any of those is implemented, just remove select_lock_type argument in
sel_restore_position_for_mysql().
convert_error_code_to_mysql(): Use the correct limit FK_MAX_CASCADE_DEL
in the error message. The DICT_FK_MAX_RECURSIVE_LOAD applies to
the number of foreign key constraints in table definitions,
not to the number of rows that are visited while processing
a foreign key constraint.
Some GNU/Linux distributions ship a zlib that is modified to use
the s390x DFLTCC instruction. That modification would essentially
redefine compressBound(sourceLen) as (sourceLen * 16 + 2308) / 8 + 6.
Let us relax the tests for InnoDB ROW_FORMAT=COMPRESSED to cope with
such a weaker compression guarantee.
create_table_info_t::row_size_is_acceptable(): Remove a bogus debug-only
assertion that would fail to hold for the test innodb_zip.bug36169.
The function page_zip_empty_size() may indeed return 0.
row_sel_sec_rec_is_for_clust_rec() treats empty BLOB prefix field in
secondary index as a field equal to any external BLOB field in clustered
index. Row_sel_get_clust_rec_for_mysql::operator() doesn't zerro out
clustered record pointer in row_search_mvcc(), and row_search_mvcc()
thinks that delete-marked secondary index record has visible for
"CHECK TABLE"'s read view old-versioned clustered index record, and
row_scan_index_for_mysql() counts it as a row.
The fix is to execute row_sel_sec_rec_is_for_blob() in
row_sel_sec_rec_is_for_clust_rec() if clustered field contains BLOB's
reference.
Removing DEFAULT from INFORMATION_SCHEMA columns.
DEFAULT in read-only tables is rather meaningless.
Upgrade should go smoothly.
Also fixes:
MDEV-20254 Problems with EMPTY_STRING_IS_NULL and I_S tables
It's misleading to compare and write to user number of columns and fields.
Thus, it would be better to remove that check and let use see a subsequent
error message about missing or mispaced column.
row_import::match_schema(): remove misleading check
The InnoDB DATA DIRECTORY attribute is not implemented via
symbolic links but something similar, *.isl files that contain
the names of data files.
InnoDB failed to ignore the DATA DIRECTORY attribute even though
the server was started with --skip-symbolic-links.
Native ALTER TABLE in InnoDB will retain the DATA DIRECTORY attribute
of the table, no matter if the table will be rebuilt or not.
Generic ALTER TABLE (with ALGORITHM=COPY) as well as TRUNCATE TABLE
will discard the DATA DIRECTORY attribute.
All tests have been run with and without the ./mtr option
--mysqld=--skip-symbolic-links
and some tests that use the InnoDB DATA DIRECTORY attribute
have been adjusted for this.
.. to be the same as startup.
In resolving MDEV-27461, BUF_LRU_MIN_LEN (256) is the minimum number of
pages for the innodb buffer pool size. Obviously we need more than just
flushing pages. Taking the 16k page size and its default minimum, an
extra 25% is needed on top of the flushing pages to make a workable buffer
pool.
The minimum innodb_buffer_pool_chunk_size (1M) restricts the minimum
otherwise we'd have a pool made up of different chunk sizes.
The resulting minimum innodb buffer pool sizes are:
Page Size, Previously minimum (startup), with change.
4k 5M 2M
8k 5M 3M
16k 5M 5M
32k 24M 10M
64k 24M 20M
With this patch, SET GLOBAL innodb_buffer_pool_size minimums are
enforced.
The evident minimum system variable size for innodb_buffer_pool_size
is 2M, however this is only setable if using 4k page size. As
the order of the page_size and buffer_pool_size aren't fixed, we can't
hide this change.
Subsequent changes:
* innodb_buffer_pool_resize_with_chunks.test - raised of pool resize due to new
minimums. Chunk size also needed increase as the test was for
pool_size < chunk_size to generate a warning.
* Removed srv_buf_pool_min_size and replaced use with MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* Removed srv_buf_pool_def_size and replaced constant defination in
MYSQL_SYSVAR_LONGLONG(buffer_pool_size)
* Reordered ha_innodb to allow for direct use of MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* Moved buf_pool_size_align into ha_innodb to access to MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* loose-innodb_disable_resize_buffer_pool_debug is needed in the
innodb.restart.opt test so that under debug mode, resizing of the
innodb buffer pool can occur.
The column INFORMATION_SCHEMA.INNODB_LOCKS.LOCK_DATA
would report NULL when the page that contains the locked
record does not reside in the buffer pool.
Pages may be evicted from the buffer pool due to some background
activity, such as the purge of transaction history loading
undo log pages to the buffer pool. The regression tests intentionally
run with a small buffer pool size setting.
To prevent the intermittent test failures, we will filter out the
contents of the LOCK_DATA column from the output.