The buf_page_free() call that was introduced in MDEV-15528 was
performed too early in fseg_free_page(), tripping a debug check
in ibuf_remove_free_page(). In all other callers, we can (and will)
invoke buf_page_free() right after fseg_free_page(), but in
ibuf_remove_free_page() we will defer that call to the end of the
mini-transaction. (That call was already present.)
btr_pcur_store_position(): Replace a too strict debug assertion.
It is possible to have a clustered index B-tree for a logically
empty table, which will consist of a node pointer from the root
page to a leaf page that contains the metadata record.
The too strict debug assertion was added in
commit 0e5a4ac253 (MDEV-15562).
commit f74023b955 (MDEV-15090)
inadvertently removed a mtr_t::commit() call from
trx_undo_report_rename(), causing an InnoDB hang if
we failed to log a RENAME operation.
It is unclear whether this condition is possible in practice.
The test case involved SET GLOBAL innodb_trx_rseg_n_slots_debug=1
and a failed CREATE TABLE...SELECT, whose error handling would
internally invoke RENAME in InnoDB.
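As a self-contained sketch of the bug pattern (toy names; not the actual
trx_undo_report_rename() code), the point is that a started mini-transaction
must be committed on every return path, including the error path:

    #include <cassert>

    // Toy stand-in for InnoDB's mtr_t: start() and commit() must always pair.
    struct toy_mtr
    {
      bool active= false;
      void start() { active= true; }
      void commit() { active= false; }
      ~toy_mtr() { assert(!active); }  // a dangling mini-transaction blocks others
    };

    // Hypothetical stand-in for the error handling in trx_undo_report_rename().
    bool report_rename(bool log_write_fails)
    {
      toy_mtr mtr;
      mtr.start();
      if (log_write_fails)
      {
        mtr.commit();  // the call that the MDEV-15090 change inadvertently dropped
        return false;
      }
      // ... write the undo log record for the RENAME here ...
      mtr.commit();
      return true;
    }

    int main()
    {
      assert(report_rename(false));
      assert(!report_rename(true));
    }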
log_buf_pool_get_oldest_modification(): Acquire
log_sys_t::flush_order_mutex in order to prevent a race condition
that was introduced in
commit 1a6f708ec5 (MDEV-15058).
Before that change, log_buf_pool_get_oldest_modification()
was protected by both log_sys.mutex and log_sys.flush_order_mutex
like it was supposed to be ever since
commit a52c4820a3 (MySQL 5.5.10).
buf_pool_t::get_oldest_modification(): Replaces
buf_pool_get_oldest_modification(), to emphasize that
log_sys.flush_order_mutex must be acquired by the caller if needed.
log_close(): Invoke log_buf_pool_get_oldest_modification()
in order to ensure a clean shutdown.
The scenario of the race condition is as follows:
1. The buffer pool is clean (no writes are pending).
2. mtr_add_dirtied_pages_to_flush_list() releases log_sys.mutex.
3. log_buf_pool_get_oldest_modification() observes that the
buffer pool is clean and returns log_sys.lsn.
4. log_checkpoint() completes, writing a wrong checkpoint header
according to which everything up to log_sys.lsn was clean.
5. mtr_add_dirtied_pages_to_flush_list() completes the execution
of mtr_memo_note_modifications(), releases the page latches and
the flush_order_mutex.
6. On a subsequent log_checkpoint(), the assertion could fail
if the page modifications had not been flushed yet.
The failing assertion (which is valid) was added in MySQL 5.7
mysql/mysql-server@5c6c6ec693
and merged to MariaDB Server 10.2.2 in
commit fec844aca8.
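The pattern behind this race can be sketched outside InnoDB with ordinary
mutexes (the names below are illustrative stand-ins, not the real code): the
reader must take the flush-order mutex, or it can miss a page that has been
modified but not yet appended to the flush list:

    #include <cstdint>
    #include <iostream>
    #include <list>
    #include <mutex>
    #include <thread>

    std::mutex log_mutex;            // stands in for log_sys.mutex
    std::mutex flush_order_mutex;    // stands in for log_sys.flush_order_mutex
    uint64_t log_lsn= 0;             // stands in for log_sys.lsn
    std::list<uint64_t> flush_list;  // oldest_modification LSNs, oldest last

    // Writer: reserve an LSN under log_mutex, but append the dirtied page to
    // the flush list only under flush_order_mutex, after log_mutex is released.
    void mtr_commit_sketch()
    {
      log_mutex.lock();
      flush_order_mutex.lock();      // acquired before log_mutex is released
      uint64_t lsn= ++log_lsn;
      log_mutex.unlock();            // window: the page is dirty, but not yet
                                     // visible on the flush list
      flush_list.push_front(lsn);
      flush_order_mutex.unlock();
    }

    // Reader (checkpoint): log_mutex alone is not enough; without also
    // acquiring flush_order_mutex it could observe an empty flush list inside
    // the window above and declare everything up to log_lsn clean.
    uint64_t get_oldest_modification_sketch()
    {
      std::lock_guard<std::mutex> lg(log_mutex);
      std::lock_guard<std::mutex> fg(flush_order_mutex);  // the acquisition at issue
      return flush_list.empty() ? log_lsn : flush_list.back();
    }

    int main()
    {
      std::thread writer(mtr_commit_sketch);
      std::thread reader([] { std::cout << get_oldest_modification_sketch() << '\n'; });
      writer.join();
      reader.join();
    }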
Problem:
=======
While evicting the uncompressed page from the buffer pool, InnoDB writes
the checksum for the compressed page in buf_LRU_free_page().
So when the compressed page is later flushed, checksum validation fails
if the innodb_checksum_algorithm variable has been changed to strict_none.
Solution:
========
- Calculate the checksum only while flushing the page. Removed the
checksum write from buf_LRU_free_page().
The existing implementation used my_checksum() (from mysys)
for calculating the table checksum and the binlog checksum.
That implementation was optimized for powerpc only; it lacked
SIMD implementations for x86 (using clmul) and ARM (using ACLE)
and instead used zlib-crc32 there.
mariabackup had its own copy of the crc32 implementation, with a
hardware-optimized implementation only for x86, and it lacked
hardware-based implementations for powerpc and ARM.
The patch unifies all such calls behind a single interface,
my_checksum().
This unification also enables hardware-optimized calls on all
architectures, viz. x86, ARM, and POWERPC.
The default always falls back to zlib crc32.
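For reference, the zlib fallback can be exercised directly; a self-contained
sketch using zlib's incremental crc32() interface (the my_checksum() wrapper
itself is not reproduced here):

    #include <cstdio>
    #include <cstring>
    #include <zlib.h>   // link with -lz

    int main()
    {
      const unsigned char buf[]= "hello, world";
      uLong crc= crc32(0L, Z_NULL, 0);   // initial value
      crc= crc32(crc, buf, uInt(std::strlen(reinterpret_cast<const char*>(buf))));
      std::printf("crc32 = %08lx\n", crc);  // incremental calls over chunks compose
      return 0;
    }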
Thanks to Daniel Black for reviewing, fixing and testing
PowerPC changes. Thanks to Marko and Daniel for early code feedback.
This change also affects information_schema.tables.
The create table option "transactional=0 | 1" is now always shown for
storage engines that support both transactional/crash-safe tables and
non-transactional tables.
Before this patch, the transactional=... option was shown only if the user
had specified transactional=... in the CREATE TABLE or ALTER TABLE statement.
The reason for the change is to make it easy to tell whether an Aria
table is transactional.
A crash was observed where dict_acquire_mdl_shared<trylock=false>
would invoke memcpy() with an apparently uninitialized tbl_len.
dict_table_t::parse_name(): Remove an unnecessary tbl_len--
operation. (This should be mostly non-functional cleanup.)
dict_acquire_mdl_shared(): If the second dict_table_t::parse_name()
returns false, terminate the loop just like we would do on the
first invocation.
commit d09aec7a15 (MDEV-19940)
caused a regression. We made wait_lock_get_heap_no() return
uint16_t instead of ulint, and we mostly replaced the previous
magic value ULINT_UNDEFINED with 0. But, we failed to adjust
some assertions. Furthermore, 0 is a valid although rare value
for record locks. (Record locks can be temporarily stored on
page infimum in some operations that involve multiple leaf pages.)
Let us use 0xFFFF as the magic value. Valid heap numbers
are limited to less than 9362 = innodb_page_size/(5+1+1)
when using a minimal 1-byte PRIMARY KEY and a
secondary index on a NULL or '' column.
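A self-contained check of the arithmetic behind that bound (assuming the
maximum innodb_page_size of 64KiB; the constant name below is illustrative
only, not the identifier used in the code):

    #include <cstdint>

    // At least 5 bytes of record header, 1 byte for the minimal PRIMARY KEY
    // and 1 byte for the NULL or '' secondary index column per record.
    static_assert(65536 / (5 + 1 + 1) == 9362,
                  "upper bound on valid heap numbers at 64KiB page size");

    constexpr uint16_t HEAP_NO_UNDEFINED= 0xFFFF;  // illustrative name for the magic value
    static_assert(HEAP_NO_UNDEFINED > 9362,
                  "the magic value cannot collide with a valid heap number");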
In the merge 9e6e43551f we replaced
direct use of std::atomic with a wrapper class, so that
dict_index_t::lock will support the default assignment operator.
As part of that change, one occurrence of std::memory_order_release
was accidentally replaced with std::memory_order_relaxed.
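A self-contained illustration of why the two orderings differ (generic names,
not the dict_index_t::lock code): with a release store paired with an acquire
load, a reader that observes the flag is guaranteed to also observe the data
written before it; with relaxed ordering it is not:

    #include <atomic>
    #include <cassert>
    #include <thread>

    int data= 0;
    std::atomic<bool> ready{false};

    void writer()
    {
      data= 42;                                      // plain store
      ready.store(true, std::memory_order_release);  // publishes "data";
                                                     // memory_order_relaxed here
                                                     // would not order the stores
    }

    void reader()
    {
      while (!ready.load(std::memory_order_acquire)) {}
      assert(data == 42);  // guaranteed only by the release/acquire pairing
    }

    int main()
    {
      std::thread t1(writer), t2(reader);
      t1.join();
      t2.join();
    }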
Thanks to Sergey Vojtovich for noticing this.
Respect system fields in NO_ZERO_DATE mode.
This is the subject of refactoring in MDEV-19597
Conflict resolution from 7d5223310789f967106d86ce193ef31b315ecff0
commit 3a37644a29 added a non-POD
member buf_page_info_t::id, and thus GCC 7 or later would complain
about a memset() call. Let my_malloc fill the memory for us.
The InnoDB mutex monitor is accessing mutexes of poisoned (cached) trx
objects. Unpoison ReadView::m_mutex similarly to trx_t::mutex.
This is a regression after MDEV-22593.
This reverts commit 6f1f911497,
because it doesn't do anything now (the server doesn't check
my_disable_leak_check) and it never did anything before:
without `extern`, it simply created a local instance of
my_disable_leak_check and did not affect the server's my_disable_leak_check.
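A minimal single-file illustration of the language rule involved (hypothetical
names; the original code may have differed in detail): without `extern`, the
inner declaration introduces a new, unrelated variable:

    #include <iostream>

    bool my_flag= false;      // stands in for the server's my_disable_leak_check

    void set_flag_wrong()
    {
      bool my_flag= true;     // no extern: a brand-new block-scope variable;
      (void) my_flag;         // the file-scope my_flag above is not affected
    }

    void set_flag_right()
    {
      extern bool my_flag;    // refers to the object defined at file scope
      my_flag= true;
    }

    int main()
    {
      set_flag_wrong();
      std::cout << "after wrong: " << my_flag << '\n';  // prints 0
      set_flag_right();
      std::cout << "after right: " << my_flag << '\n';  // prints 1
    }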
data_file_length == 0 in mi_repair() is normal for REPAIR ... USE_FRM.
But in-file links (for blocks and deleted chain) must be compared with
the real file length to avoid spurious "link points outside datafile"
warnings and arbitrary block skipping.
- Remove extra ',' and quotes
- Remove extra newline and remove double newlines
- Added options --lsn-redo-end and --lsn-undo-end to aria_read_log
- Allow one to give the aria_read_log lsn arguments as number,0xhexnumber,
the same way as LSNs are written by aria_read_log
- Don't write full pages to the redo log with EXTRA_DEBUG, as this takes up
a lot of disk space and there has not been a need for this extra logging for
a long time. One should use EXTRA_ARIA_DEBUG instead.
commit 8ccb3caafb micro-optimized
page_id_t as a wrapper of uint64_t.
buf_dump_t: Remove, and replace with page_id_t, which uses
exactly the same encoding.
buf_page_info_t: Replace space_id,page_num with page_id_t id.
i_s_innodb_set_page_type(): Remove unnecessary code.
The buf_page_info_t::id was already assigned at the start
of the only caller, i_s_innodb_buffer_page_get_info().
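The encoding in question packs a 32-bit tablespace id and a 32-bit page
number into a single 64-bit integer; a sketch of the idea (illustrative only,
not the actual page_id_t definition):

    #include <cstdint>

    // High half = tablespace id, low half = page number within the tablespace.
    class page_id_sketch
    {
      uint64_t m_id;
    public:
      constexpr page_id_sketch(uint32_t space, uint32_t page_no)
        : m_id((uint64_t{space} << 32) | page_no) {}
      constexpr uint32_t space() const { return uint32_t(m_id >> 32); }
      constexpr uint32_t page_no() const { return uint32_t(m_id); }
      // A single 64-bit comparison replaces two 32-bit comparisons.
      constexpr bool operator==(const page_id_sketch &o) const { return m_id == o.m_id; }
      constexpr bool operator<(const page_id_sketch &o) const { return m_id < o.m_id; }
    };

    static_assert(page_id_sketch(3, 7).space() == 3, "high half is the space id");
    static_assert(page_id_sketch(3, 7).page_no() == 7, "low half is the page number");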
namespace intrusive: removed
Split the class into two: ilist<T> and sized_ilist<T>, which has a size field.
ilist<T> no longer NULLifies pointers, for slightly better performance.
As a consequence, fil_space_t::is_in_unflushed_spaces and
fil_space_t::is_in_rotation_list boolean members are needed now.
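For readers unfamiliar with the pattern, a minimal sketch of an intrusive
doubly-linked list (illustrative only; the real ilist<T> differs in detail):
the listed object embeds the node hooks, so insertion and removal need no
allocation, and not resetting the hook pointers on removal is why a separate
membership flag becomes necessary:

    #include <cassert>

    // The hook that each listed object embeds; no separate node allocation.
    struct ilist_node
    {
      ilist_node *next= nullptr;
      ilist_node *prev= nullptr;
    };

    // A minimal circular intrusive list over the embedded hook.
    class ilist_sketch
    {
      ilist_node head;             // sentinel; head.next == &head means empty
    public:
      ilist_sketch() { head.next= head.prev= &head; }
      bool empty() const { return head.next == &head; }
      void push_back(ilist_node &n)
      {
        n.prev= head.prev;
        n.next= &head;
        head.prev->next= &n;
        head.prev= &n;
      }
      static void remove(ilist_node &n)  // O(1), no lookup, no allocation
      {
        n.prev->next= n.next;
        n.next->prev= n.prev;
        // The pointers in n are intentionally not reset to NULL here, so the
        // object needs a separate flag to know whether it is currently listed.
      }
    };

    struct space_sketch { ilist_node unflushed_hook; };  // the object embeds the hook

    int main()
    {
      ilist_sketch list;
      space_sketch s;
      assert(list.empty());
      list.push_back(s.unflushed_hook);
      assert(!list.empty());
      ilist_sketch::remove(s.unflushed_hook);
      assert(list.empty());
    }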
MDEV-20578 Got error 126 when executing undo undo_key_delete
upon Aria crash recovery
The crash happens in this scenario:
- Table with unique keys and non-unique keys
- Batch insert (LOAD DATA or INSERT ... SELECT) with REPLACE
- Some insert succeeds followed by duplicate key error
In the above scenario the table gets corrupted.
The bug was that we did not generate any undo entry for the
failed insert, as the whole insert can be ignored by undo.
The code did, however, not take into account that when bulk
insert is used, we write cached keys to the file on
failure, and undo would wrongly ignore these.
Fixed by moving the writing of the cached keys to after we write
the aborted-insert event to the log.