1
0
mirror of https://github.com/MariaDB/server.git synced 2025-08-08 11:22:35 +03:00
Commit Graph

3769 Commits

Author SHA1 Message Date
Marko Mäkelä
8aa37c264f MDEV-28980 InnoDB: Failing assertion: len <= MAX_TABLE_NAME_LEN
dict_load_foreigns(): Use a correctly sized buffer for the maximum-length
SYS_FOREIGN.ID. In case of overflow, do not crash the server but instead
return DB_CORRUPTION.
2022-07-25 10:28:45 +03:00
Marko Mäkelä
f8240a2723 MDEV-26294 Duplicate entries in unique index not detected when changing collation
Problem:
=======
ALTER TABLE in InnoDB fails to detect duplicate entries
for the unique index when the character set or collation of
an indexed column is changed in such a way that the character
encoding is compatible with the old table definition.
In this case, any secondary indexes on the changed columns
would be rebuilt (DROP INDEX, ADD INDEX).

Solution:
========
During ALTER TABLE, InnoDB keeps track of columns whose collation
changed, and will fill in the correct metadata when sorting the
index records, or applying changes from concurrent DML.
This metadata will be allocated in the dict_index_t::heap of
the being-created secondary indexes.

The fix was developed by Thirunarayanan Balathandayuthapani
and simplified by me.
2022-07-04 16:13:04 +03:00
Marko Mäkelä
f43145a0e3 Merge 10.5 into 10.6 2022-07-04 16:12:44 +03:00
Marko Mäkelä
33f0270ebb Merge 10.4 into 10.5 2022-07-04 10:19:28 +03:00
Marko Mäkelä
9a0cbd31ce MDEV-26294 Duplicate entries in unique index not detected when changing collation
ha_innobase::check_if_supported_inplace_alter(): Refuse to change the
collation of a column that would become or remain indexed as part of
the ALTER TABLE operation.

In MariaDB Server 10.6, we will allow this type of operation;
that fix depends on MDEV-15250.
2022-07-04 08:04:44 +03:00
Marko Mäkelä
3c2a5ad3e8 Merge 10.7 into 10.8 2022-07-01 17:53:06 +03:00
Marko Mäkelä
3dff84cd15 Merge 10.6 into 10.7 2022-07-01 17:45:29 +03:00
Marko Mäkelä
62a20f8047 Merge 10.5 into 10.6 2022-07-01 15:24:50 +03:00
Marko Mäkelä
b546913ba2 Valgrind: Disable tests that would often time out
Starting with 10.5, InnoDB crash recovery tests seem to time out
more easily under Valgrind, which emulates multiple threads by
interleaving them in a single operating system thread.

These tests will still be covered by
AddressSanitizer and MemorySanitizer.
2022-07-01 14:42:13 +03:00
Marko Mäkelä
f09687094c Merge 10.4 into 10.5 2022-07-01 14:42:02 +03:00
Thirunarayanan Balathandayuthapani
99de8cc028 MDEV-28919 Assertion `(((core_null) + 7) >> 3) == oindex.n_core_null_bytes || !not_redundant()' failed
- In case of discarded tablespace, InnoDB can't read the root page to
assign the n_core_null_bytes. Consecutive instant DDL fails because
of non-matching n_core_null_bytes.
2022-06-30 17:30:13 +05:30
Marko Mäkelä
63961a08a6 Merge 10.9 into 10.10 2022-06-28 12:33:39 +03:00
Marko Mäkelä
404d4820af Merge 10.8 into 10.9 2022-06-28 10:59:01 +03:00
Marko Mäkelä
9523986299 Merge 10.7 into 10.8 2022-06-28 10:06:00 +03:00
Marko Mäkelä
ac0af4ec4a Merge 10.6 into 10.7 2022-06-28 08:34:12 +03:00
Marko Mäkelä
20cf63fe8b Merge 10.5 into 10.6 2022-06-27 16:51:27 +03:00
Marko Mäkelä
773f1dad94 Merge 10.4 into 10.5 2022-06-27 16:17:02 +03:00
Marko Mäkelä
b922ae5fc9 Merge 10.3 into 10.4 2022-06-27 16:16:20 +03:00
Marko Mäkelä
f339ef3f97 MDEV-26577 InnoDB: Failing assertion: dict_tf2_is_valid(flags, flags2) during ADD COLUMN
prepare_inplace_alter_table_dict(): If the table will not be rebuilt,
preserve all of the original ROW_FORMAT, including the compressed
page size flags related to ROW_FORMAT=COMPRESSED.
2022-06-27 16:00:34 +03:00
Marko Mäkelä
7d92c9d212 Suppress a message that may be emitted on slow systems
On FreeBSD, tests run on persistent storage, and no asynchronous I/O
has been implemented. Warnings about 205-second waits on dict_sys.latch
may occur.
2022-06-27 11:03:52 +03:00
Marko Mäkelä
87bd79b1e7 Merge 10.5 into 10.6 2022-06-27 10:59:31 +03:00
Marko Mäkelä
ea847cbeaf Merge 10.4 into 10.5 2022-06-27 10:51:20 +03:00
Marko Mäkelä
01d757036f Merge 10.3 into 10.4 2022-06-27 10:14:37 +03:00
Marko Mäkelä
9810a4ecdf Merge 10.9 into 10.10 2022-06-21 18:27:54 +03:00
Marko Mäkelä
707f2aa214 Merge 10.8 into 10.9 2022-06-21 18:21:07 +03:00
Marko Mäkelä
54ac356dea Merge 10.7 into 10.8 2022-06-21 18:19:24 +03:00
Marko Mäkelä
6680fd8d4b Merge 10.6 into 10.7 2022-06-21 18:02:41 +03:00
Marko Mäkelä
2e43af69e3 MDEV-28870 InnoDB: Missing FILE_CREATE, FILE_DELETE or FILE_MODIFY before FILE_CHECKPOINT
There was a race condition between log_checkpoint_low() and
deleting or renaming data files. The scenario is as follows:

1. The buffer pool does not contain dirty pages.
2. A FILE_DELETE or FILE_RENAME record is written.
3. The checkpoint LSN will be moved ahead of the write of the record.
4. The server is killed before the file is actually renamed or deleted.

We will prevent this race condition by ensuring that a log checkpoint
cannot occur between the durable write and the file system operation:

1. Durably write the FILE_DELETE or FILE_RENAME record.
2. Perform the file system operation.
3. Allow any log checkpoint to proceed.

mtr_t::commit_file(): Implement the DELETE or RENAME logic.

fil_delete_tablespace(): Delegate some of the logic to
mtr_t::commit_file().

fil_space_t::rename(): Delegate some logic to mtr_t::commit_file().
Remove the debug injection point fil_rename_tablespace_failure_2
because we do test RENAME failures without any debug injection.

fil_name_write_rename_low(), fil_name_write_rename(): Remove.

Tested by Matthias Leich
2022-06-21 16:59:21 +03:00
Marko Mäkelä
be99d0ddb6 Fix intermittent failures of innodb.stats_persistent
We do not really care about the exact result; we only care that the
statistics will be accessed. The result could change depending on
when some statistics were updated in the background or when some
committed delete-marked rows were purged from other tables on
which persistent statistics are enabled.
2022-06-17 08:40:51 +03:00
Sergei Golubchik
5ad9a41301 re-enable innodb.innodb_page_compressed tests 2022-06-16 21:56:34 +02:00
Marko Mäkelä
51a4fcd565 Merge 10.9 into 10.10 2022-06-15 10:07:31 +03:00
Marko Mäkelä
9fe784ff7e Merge 10.8 into 10.9 2022-06-15 10:01:51 +03:00
Marko Mäkelä
813986a647 Merge 10.7 into 10.8 2022-06-14 16:19:29 +03:00
Marko Mäkelä
ddf511c44d Merge 10.6 into 10.7 2022-06-14 10:17:36 +03:00
Marko Mäkelä
1f3f457193 MDEV-28802 DROP DATABASE in InnoDB still is case-insensitive
innodb_drop_database(): Use explicit TO_BINARY casts on
SYS_TABLES.NAME, which for historical reasons uses the wrong collation
latin1_swedish_ci instead of BINARY.
2022-06-13 17:05:31 +03:00
Marko Mäkelä
32edabd1f2 Merge 10.9 into 10.10 2022-06-09 15:26:09 +03:00
Marko Mäkelä
6dea701e0f Merge 10.8 into 10.9 2022-06-09 14:53:34 +03:00
Marko Mäkelä
0af9346079 Merge 10.7 into 10.8 2022-06-09 14:37:53 +03:00
Marko Mäkelä
fe75e5e5b1 Merge 10.6 into 10.7 2022-06-09 14:11:43 +03:00
Marko Mäkelä
e11b82f8f5 Merge 10.5 into 10.6 2022-06-09 13:34:52 +03:00
Marko Mäkelä
a9d0bb12e6 Merge 10.4 into 10.5 2022-06-09 12:22:55 +03:00
Marko Mäkelä
9e6fd2995b MDEV-25506 fixup: Wait for TRUNCATE recovery 2022-06-07 10:53:33 +03:00
Marko Mäkelä
5a33a37682 Merge 10.8 into 10.9 2022-06-07 09:20:07 +03:00
Marko Mäkelä
57d4a242da Merge 10.7 into 10.8 2022-06-06 16:22:09 +03:00
Marko Mäkelä
7e39470e33 Merge 10.6 into 10.7 2022-06-06 14:56:20 +03:00
Marko Mäkelä
0b47c126e3 MDEV-13542: Crashing on corrupted page is unhelpful
The approach to handling corruption that was chosen by Oracle in
commit 177d8b0c12
is not really useful. Not only did it actually fail to prevent InnoDB
from crashing, but it is making things worse by blocking attempts to
rescue data from or rebuild a partially readable table.

We will try to prevent crashes in a different way: by propagating
errors up the call stack. We will never mark the clustered index
persistently corrupted, so that data recovery may be attempted by
reading from the table, or by rebuilding the table.

This should also fix MDEV-13680 (crash on btr_page_alloc() failure);
it was extensively tested with innodb_file_per_table=0 and a
non-autoextend system tablespace.

We should now avoid crashes in many cases, such as when a page
cannot be read or allocated, or an inconsistency is detected when
attempting to update multiple pages. We will not crash on double-free,
such as on the recovery of DDL in system tablespace in case something
was corrupted.

Crashes on corrupted data are still possible. The fault injection mechanism
that is introduced in the subsequent commit may help catch more of them.

buf_page_import_corrupt_failure: Remove the fault injection, and instead
corrupt some pages using Perl code in the tests.

btr_cur_pessimistic_insert(): Always reserve extents (except for the
change buffer), in order to prevent a subsequent allocation failure.

btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages().

btr_assert_not_corrupted(), btr_corruption_report(): Remove.
Similar checks are already part of btr_block_get().

FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE.

dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(),
trx_undo_page_get_s_latched(): Replaced with error-checking calls.

trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get().

trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed.

trx_sys_create_sys_pages(): Merged with trx_sysf_create().

dict_check_tablespaces_and_store_max_id(): Do not access
DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot().
Merge dict_check_sys_tables() with this function.

dir_pathname(): Replaces os_file_make_new_pathname().

row_undo_ins_remove_sec(): Do not modify the undo page by adding
a terminating NUL byte to the record.

btr_decryption_failed(): Report decryption failures

dict_set_corrupted_by_space(), dict_set_encrypted_by_space(),
dict_set_corrupted_index_cache_only(): Remove.

dict_set_corrupted(): Remove the constant parameter dict_locked=false.
Never flag the clustered index corrupted in SYS_INDEXES, because
that would deny further access to the table. It might be possible to
repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case
no B-tree leaf page is corrupted.

dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(),
row_purge_skip_uncommitted_virtual_index(): Remove, and refactor
the callers to read dict_index_t::type only once.

dict_table_is_corrupted(): Remove.

dict_index_t::is_btree(): Determine if the index is a valid B-tree.

BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove.

UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger
assertion failures, but error codes being returned.

buf_corrupt_page_release(): Replaced with a direct call to
buf_pool.corrupted_evict().

fil_invalid_page_access_msg(): Never crash on an invalid read;
let the caller of buf_page_get_gen() decide.

btr_pcur_t::restore_position(): Propagate failure status to the caller
by returning CORRUPTED.

opt_search_plan_for_table(): Simplify the code.

row_purge_del_mark(), row_purge_upd_exist_or_extern_func(),
row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(),
row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free()
when no secondary indexes exist.

row_undo_mod_upd_exist_sec(): Simplify the code.

row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT
if the clustered index (and therefore the table) is corrupted, similar
to what we do in row_insert_for_mysql().

fut_get_ptr(): Replace with buf_page_get_gen() calls.

buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION
if the page is marked as freed. For other modes than
BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will
trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED,
we will return nullptr for freed pages, so that the callers
can be simplified. The purge of transaction history will be
a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on
corrupted data.

buf_page_get_low(): Never crash on a corrupted page, but simply
return nullptr.

fseg_page_is_allocated(): Replaces fseg_page_is_free().

fts_drop_common_tables(): Return an error if the transaction
was rolled back.

fil_space_t::set_corrupted(): Report a tablespace as corrupted if
it was not reported already.

fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report
out-of-bounds page access or other errors.

Clean up mtr_t::page_lock()

buf_page_get_low(): Validate the page identifier (to check for
recently read corrupted pages) after acquiring the page latch.

buf_page_t::read_complete(): Flag uninitialized (all-zero) pages
with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch.

mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi().

recv_sys_t::free_corrupted_page(): Only set_corrupt_fs()
if any log records exist for the page. We do not mind if read-ahead
produces corrupted (or all-zero) pages that were not actually needed
during recovery.

recv_recover_page(): Return whether the operation succeeded.

recv_sys_t::recover_low(): Simplify the logic. Check for recovery error.

Thanks to Matthias Leich for testing this extensively and to the
authors of https://rr-project.org for making it easy to diagnose
and fix any failures that were found during the testing.
2022-06-06 14:03:22 +03:00
Marko Mäkelä
0dab74ff3f MDEV-28539 Some InnoDB counters are duplicating generic SHOW STATUS
The InnoDB srv_stats counters
n_rows_updated, n_rows_deleted, n_rows_inserted, and n_rows_read
are duplicating
Handler_update, Handler_delete, Handler_write, and Handler_read_ counters.

Updating those counters is not free, especially because some counters
are furthermore split to distinguish a rare case of modifying tables
in the system schema.
2022-06-03 12:20:20 +03:00
Marko Mäkelä
12aeb9fa15 MDEV-28542 Useless output in SHOW ENGINE INNODB STATUS
srv_printf_innodb_monitor(): Only display an ADAPTIVE HASH INDEX
section if the adaptive hash index is enabled.

ibuf_print(): Only display an INSERT BUFFER section if the
change buffer is not empty.
2022-06-03 12:20:19 +03:00
Haidong Ji
41068a890e MDEV-27314 Condense innodb buffer pool resize message
InnoDB buffer pool resize messages are more succinct from this change:

Before:
```
2022-05-07 17:10:33 0 [Note] InnoDB: Completed resizing buffer pool from 14745600 to 19660800 bytes.
2022-05-07 17:10:33 0 [Note] InnoDB: Completed resizing buffer pool.
2022-05-07 17:10:33 8 [Note] InnoDB: Completed resizing buffer pool. (New size: 19660800 bytes).
```
After:
```
2022-05-07 17:10:33 0 [Note] InnoDB: Completed resizing buffer pool from 14745600 to 19660800 bytes.
```

Additionally, the INNODB_BUFFER_POOL_RESIZE_STATUS has more complete
info: it contains both the old and new buffer pool size values.
2022-05-26 12:10:29 +10:00
Sergei Golubchik
bf2bdd1a1a Merge branch '10.8' into 10.9 2022-05-19 14:07:55 +02:00