mirror of https://github.com/MariaDB/server.git synced 2025-11-30 05:23:50 +03:00
Commit Graph

3074 Commits

Author SHA1 Message Date
Marko Mäkelä
19052b6deb Merge 10.2 into 10.3 2021-03-18 12:34:48 +02:00
Marko Mäkelä
f87a944c79 Merge 10.5 into 10.6 2021-03-16 15:51:26 +02:00
Marko Mäkelä
8ea923f55b MDEV-24818: Optimize multi-statement INSERT into an empty table
If the user "opts in" (as in the parent
commit 92b2a911e5),
we can optimize multiple INSERT statements to use table-level locking
and undo logging.

There will be a change of behavior:

    CREATE TABLE t(a PRIMARY KEY) ENGINE=InnoDB;
    SET foreign_key_checks=0, unique_checks=0;
    BEGIN; INSERT INTO t SET a=1; INSERT INTO t SET a=1; COMMIT;

will end up with an empty table, because in case of an error,
the entire transaction will be rolled back, instead of rolling
back the failing statement. Previously, the second INSERT statement
would have been logged row by row, and only that second statement
would have been rolled back, leaving the first INSERT intact.
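
The difference between the two rollback scopes can be illustrated with a toy model. This is only a sketch under stated assumptions: the table is modelled as a set of primary key values, each key is one INSERT statement, and the hypothetical function run_transaction_sketch() stops at the first duplicate key error. None of these names exist in InnoDB.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy model (hypothetical, not InnoDB code): run one INSERT per key inside a
// single transaction and return the table contents after COMMIT.
// The model stops at the first duplicate-key error.
inline std::set<int> run_transaction_sketch(const std::vector<int>& keys,
                                            bool bulk_mode)
{
  std::set<int> table;
  for (int key : keys) {
    if (table.insert(key).second)
      continue;                 // no duplicate: the statement succeeded
    if (bulk_mode)
      return std::set<int>();   // table-level undo: whole transaction undone
    return table;               // row-level undo: only this statement undone
  }
  return table;
}
```

With the keys {1, 1} from the SQL example above, the model returns an empty set when bulk_mode is true and the set {1} when it is false.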

lock_table_x_unlock(), trx_mod_table_time_t::WAS_BULK: Remove.
Because we cannot really support statement rollback in this
optimized mode, we will not optimize the locking. The exclusive
table lock will be held until the end of the transaction.
2021-03-16 15:21:34 +02:00
Marko Mäkelä
92b2a911e5 MDEV-24818 Concurrent use of InnoDB table is impossible until the first transaction is finished
In MDEV-515, we enabled an optimization where an insert into an
empty table will use table-level locking and undo logging.
This may break applications that expect row-level locking.

The SQL statements created by the mysqldump utility will include the
following:

    SET unique_checks=0, foreign_key_checks=0;

We will use these flags to enable the table-level locked and logged
insert. Unless the parameters are set, INSERT will be executed in
the old way, with row-level undo logging and implicit record locks.
2021-03-16 15:20:26 +02:00
Vlad Lesin
8cbada87f0 MDEV-24184 InnoDB RENAME TABLE recovery failure if names are reused
fil_op_replay_rename(): Remove.

fil_rename_tablespace_check(): Remove a parameter is_discarded=false.

recv_sys_t::parse(): Instead of applying FILE_RENAME operations,
buffer the operations in renamed_spaces.

recv_sys_t::apply(): In the last_batch, apply renamed_spaces.
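
The idea of buffering renames during parsing and applying them only in the last batch can be sketched as follows. This is an illustration, not the actual recv_sys_t code: the struct RecoverySketch, its member names (other than renamed_spaces, which the message mentions), and the assumption that only the latest name per tablespace ID is kept are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical model of deferred FILE_RENAME handling during recovery.
struct RecoverySketch {
  // Current file name of each tablespace, keyed by tablespace ID.
  std::map<std::uint32_t, std::string> names;
  // Renames seen while parsing the log; assumed to keep the latest name only.
  std::map<std::uint32_t, std::string> renamed_spaces;

  // Parsing a FILE_RENAME record only buffers it; no file is touched yet.
  void parse_file_rename(std::uint32_t space_id, const std::string& new_name)
  {
    renamed_spaces[space_id] = new_name;
  }

  // Buffered renames take effect only in the last batch.
  void apply(bool last_batch)
  {
    if (!last_batch)
      return;
    for (const auto& r : renamed_spaces)
      names[r.first] = r.second;
    renamed_spaces.clear();
  }
};
```

In this model, two tablespaces can swap names through a temporary name without any intermediate state ever being applied to the files.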
2021-03-15 16:11:23 +03:00
Thirunarayanan Balathandayuthapani
eb7c5530ec MDEV-24730 Insert log operation fails after purge resets n_core_fields
Online logging of an insert into a ROW_FORMAT=REDUNDANT table fails with
an index->is_instant() assertion. Purge can reset n_core_fields while
ALTER TABLE is waiting to upgrade the MDL for the commit phase of the DDL.
In the meantime, any INSERT that tries to log the operation fails because
the index is no longer instant.

row_log_get_n_core_fields(): Get the n_core_fields of online log
for the given index.

rec_get_converted_size_comp_prefix_low(): Use n_core_fields of online
log when InnoDB calculates the size of data tuple during redundant
row format table rebuild.

rec_convert_dtuple_to_rec_comp(): Use n_core_fields of online log
when InnoDB does the conversion of data tuple to record during
redundant row format table rebuild.

- Adding the test case which has more than 129 instant columns.
2021-03-12 16:56:47 +05:30
Marko Mäkelä
a43ff483fa Merge 10.5 into 10.6 2021-03-11 20:20:07 +02:00
Marko Mäkelä
a4b7232b2c Merge 10.4 into 10.5 2021-03-11 20:09:34 +02:00
Marko Mäkelä
7a4fbb55b0 MDEV-25105 Remove innodb_checksum_algorithm values none,innodb,...
Historically, InnoDB supported a buggy page checksum algorithm that did not
compute a checksum over the full page. Later, well before MySQL 4.1
introduced .ibd files and the innodb_file_per_table option, the algorithm
was corrected and the first 4 bytes of each page were redefined to be
a checksum.

The original checksum was so slow that an option to disable page checksum
was introduced for benchmarketing purposes.

The Intel Nehalem microarchitecture introduced the SSE4.2 instruction set
extension, which includes instructions for faster computation of CRC-32C.
In MySQL 5.6 (and MariaDB 10.0), innodb_checksum_algorithm=crc32 was
implemented to make use of that. As that option was changed to be the default
in MySQL 5.7, a bug was found on big-endian platforms and some work-around
code was added to weaken that checksum further. MariaDB disables that
work-around by default since MDEV-17958.

Later, SIMD-accelerated CRC-32C has been implemented in MariaDB for POWER
and ARM and also for IA-32/AMD64, making use of carry-less multiplication
where available.
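
For reference, CRC-32C (the Castagnoli polynomial) can be computed in portable software one bit at a time. The following is a minimal sketch and not the InnoDB implementation, which uses hardware instructions where available; the function name crc32c_sketch() is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bitwise CRC-32C using the reflected Castagnoli polynomial 0x82F63B78.
// Hypothetical helper for illustration only; far slower than SSE4.2 or
// carry-less multiplication based implementations.
inline std::uint32_t crc32c_sketch(const void* data, std::size_t len,
                                   std::uint32_t crc = 0)
{
  const unsigned char* p = static_cast<const unsigned char*>(data);
  crc = ~crc;                        // start from all-ones when crc == 0
  while (len--) {
    crc ^= *p++;
    for (int bit = 0; bit < 8; bit++) {
      // Shift right; XOR the polynomial in if the shifted-out bit was set.
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
  }
  return ~crc;                       // final inversion
}
```

The standard check value for this checksum is 0xE3069283 for the 9-byte ASCII input "123456789". Passing a previous result as the third argument continues the checksum over a further buffer.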

Long story short, innodb_checksum_algorithm=crc32 is faster and more secure
than the pre-MySQL 5.6 checksum, called innodb_checksum_algorithm=innodb.
It should have removed any need to use innodb_checksum_algorithm=none.

The setting innodb_checksum_algorithm=crc32 is the default in
MySQL 5.7 and MariaDB Server 10.2, 10.3, 10.4. In MariaDB 10.5,
MDEV-19534 made innodb_checksum_algorithm=full_crc32 the default.
It is even faster and more secure.

The default settings in MariaDB do allow old data files to be read,
no matter if a worse checksum algorithm had been used.
(Unfortunately, before innodb_checksum_algorithm=full_crc32,
the data files did not identify which checksum algorithm is being used.)

The non-default settings innodb_checksum_algorithm=strict_crc32 or
innodb_checksum_algorithm=strict_full_crc32 would only allow CRC-32C
checksums. The incompatibility with old data files is why they are
not the default.

The newest servers not to support innodb_checksum_algorithm=crc32
were MySQL 5.5 and MariaDB 5.5. Both have reached their end of life.
A valid reason for using innodb_checksum_algorithm=innodb could have
been the ability to downgrade. If it is really needed, data files
can be converted with an older version of the innochecksum utility.

Because there is no good reason to allow data files to be written
with insecure checksums, we will reject those option values:

    innodb_checksum_algorithm=none
    innodb_checksum_algorithm=innodb
    innodb_checksum_algorithm=strict_none
    innodb_checksum_algorithm=strict_innodb

Furthermore, the following innochecksum options will be removed,
because only strict crc32 will be supported:

    innochecksum --strict-check=crc32
    innochecksum -C crc32
    innochecksum --write=crc32
    innochecksum -w crc32

If a user wishes to convert a data file to use a different checksum
(so that it might be used with the no-longer-supported
MySQL 5.5 or MariaDB 5.5, which do not support IMPORT TABLESPACE
nor system tablespace format changes that were made in MariaDB 10.3),
then the innochecksum tool from MariaDB 10.2, 10.3, 10.4, 10.5 or
MySQL 5.7 can be used.

Reviewed by: Thirunarayanan Balathandayuthapani
2021-03-11 12:46:18 +02:00
Thirunarayanan Balathandayuthapani
8f4a3bf07c MDEV-25057 Assertion `n_fields < dtuple_get_n_fields(entry)'
failed in dtuple_convert_big_rec

In dtuple_convert_big_rec(), InnoDB fails to consider the
instant metadata blob while choosing the variable length
field.
2021-03-09 19:37:27 +05:30
Marko Mäkelä
78284a4c11 MDEV-25085: Simplify instrumentation for LRU eviction
Let us add the status variable innodb_buffer_pool_pages_LRU_freed
to monitor the number of pages that were freed by a buffer pool LRU
eviction scan, without flushing.

Also, let us simplify the monitor interface:
MONITOR_LRU_BATCH_FLUSH_COUNT, MONITOR_LRU_BATCH_FLUSH_PAGES,
MONITOR_LRU_BATCH_EVICT_COUNT, MONITOR_LRU_BATCH_EVICT_PAGES:
Remove.

MONITOR_LRU_BATCH_FLUSH_TOTAL_PAGE: Track buf_lru_flush_page_count
(innodb_buffer_pool_pages_LRU_flushed).

MONITOR_LRU_BATCH_EVICT_TOTAL_PAGE: Track buf_lru_freed_page_count
(innodb_buffer_pool_pages_LRU_freed).

Reviewed by: Vladislav Vaintroub
2021-03-09 09:05:26 +02:00
Julius Goryavsky
7345d37141 MDEV-24853: Duplicate key generated during cluster configuration change
Incorrect processing of an auto-increment field in the
WSREP-related code while applying transactions results in
a duplicate key being created. At the beginning of the
write_row() and update_row() functions, the auto-increment
parameters are read from the current thread, but further
along the code other values are used, which are read from global
variables (when applying a transaction). This can happen when
the cluster configuration changes while a transaction is being applied
(for example in the high_priority_service mode for Galera 4).
Later, during IST processing, a duplicate key is detected, and
processing of the DB_DUPLICATE_KEY return code (inside InnoDB,
in the write_row() handler) results in a call to the
wsrep_thd_self_abort() function.
2021-03-08 11:15:08 +01:00
Marko Mäkelä
d346763479 Merge 10.5 into 10.6 2021-03-08 10:51:31 +02:00
Marko Mäkelä
a5d3c1c819 Merge 10.4 into 10.5 2021-03-08 10:16:20 +02:00
Marko Mäkelä
a26e7a3726 Merge 10.3 into 10.4 2021-03-08 09:39:54 +02:00
Marko Mäkelä
03ff588d15 Merge 10.5 into 10.6 2021-03-05 16:05:47 +02:00
Marko Mäkelä
10d544aa7b Merge 10.4 into 10.5 2021-03-05 12:54:43 +02:00
Marko Mäkelä
8bab5bb332 Merge 10.3 into 10.4 2021-03-05 10:36:51 +02:00
Varun Gupta
f691d9865b MDEV-7317: Make an index ignorable to the optimizer
This feature makes it possible to mark an index as ignorable.
Indexes are not ignored by default.

To control index ignorability explicitly for a new index,
use IGNORE or NOT IGNORE as part of the index definition for
CREATE TABLE, CREATE INDEX, or ALTER TABLE.

Primary keys (explicit or implicit) cannot be made ignorable.

The table INFORMATION_SCHEMA.STATISTICS gets a new column named IGNORED that
stores whether an index is to be ignored or not.
2021-03-04 22:50:00 +05:30
Vicențiu Ciorbaru
e9b8b76f47 Merge branch '10.2' into 10.3 2021-03-04 16:04:30 +02:00
Thirunarayanan Balathandayuthapani
b044898b97 MDEV-24748 extern column check missing in btr_index_rec_validate()
In btr_index_rec_validate(), externally stored column
check is missing while matching the length of the field
with the length of the field data stored in record.
Fetch the length of the externally stored part and compare it
with the fixed field length.
2021-03-03 17:20:43 +05:30
Marko Mäkelä
ddbc612692 Merge 10.2 into 10.3 2021-03-03 09:41:50 +02:00
Monty
676987c4a1 MDEV-24532 Table corruption ER_NO_SUCH_TABLE_IN_ENGINE .. on table with foreign key
When doing a TRUNCATE on an InnoDB table under LOCK TABLES, InnoDB would
rename the old table to #sql-... and create a new 't1' table. The table lock
would still be on the #sql table.

When doing ALTER TABLE, InnoDB would do the changes on the #sql table
(which would disappear on close).
When the SQL layer, as part of inline ALTER TABLE, would close the
original t1 table (#sql in InnoDB) and then reopen the t1 table, InnoDB
would notice that this does not match its own (old) t1 table and
generate an error.

Fixed by adding code in TRUNCATE TABLE so that if we are under LOCK TABLES
and truncating an InnoDB table, we close, reopen and lock the
table after the truncate. This removes the #sql table and ensures that
LOCK TABLES is using the new empty table.

Reviewer: Marko Mäkelä
2021-03-02 15:23:56 +02:00
Marko Mäkelä
7cf4419fc4 MDEV-24789: Reduce lock_sys.wait_mutex contention
A performance regression was introduced by
commit e71e613353 (MDEV-24671)
and mostly addressed by
commit 455514c800.

The regression is likely caused by increased contention on
lock_sys.latch (former lock_sys.mutex), possibly indirectly
caused by contention on lock_sys.wait_mutex. This change aims to
reduce both, but further improvements will be needed.

lock_wait(): Minimize the lock_sys.wait_mutex hold time.

lock_sys_t::deadlock_check(): Add a parameter for indicating
whether lock_sys.latch is exclusively locked.

trx_t::was_chosen_as_deadlock_victim: Always use atomics.

lock_wait_wsrep(): Assume that no mutex is being held.

Deadlock::report(): Always kill the victim transaction.

lock_sys_t::timeout: New counter to back MONITOR_TIMEOUT.
2021-02-26 14:58:48 +02:00
Marko Mäkelä
5c9229b96f MDEV-24951 Assertion m.first->second.valid(trx->undo_no) failed
trx_t::commit_in_memory(): Invoke mod_tables.clear().

trx_free_at_shutdown(): Invoke mod_tables.clear() for transactions
that are discarded on shutdown.

Everywhere else, assert mod_tables.empty() on freed transaction objects.
2021-02-24 15:49:58 +02:00
Marko Mäkelä
7953bae22a Merge 10.5 into 10.6 2021-02-24 09:30:17 +02:00
Sergei Golubchik
f33e57a9e6 Merge branch '10.4' into 10.5 2021-02-23 13:06:22 +01:00
Sergei Golubchik
e841957416 Merge branch '10.3' into 10.4 2021-02-23 09:25:57 +01:00
Sergei Golubchik
0ab1e3914c Merge branch '10.2' into 10.3 2021-02-22 22:42:27 +01:00
Marko Mäkelä
93522bc9a9 MDEV-24917 Page cleaner wrongly remains idle
commit a993310593 (MDEV-24537)
introduced the regression that the page cleaner will keep sleeping
even if there is work to do.

innodb_max_dirty_pages_pct_update(): Always wake up the page cleaner
on any SET GLOBAL innodb_max_dirty_pages_pct= assignment.

buf_flush_page_cleaner(): If innodb_max_dirty_pages_pct is nonzero,
consult only that parameter when determining whether there is work
to do. Else, consult innodb_max_dirty_pages.
2021-02-18 18:20:50 +02:00
Marko Mäkelä
94b4578704 Merge 10.5 into 10.6 2021-02-17 19:39:05 +02:00
Marko Mäkelä
c68007d958 MDEV-24738 Improve the InnoDB deadlock checker
A new configuration parameter innodb_deadlock_report is introduced:
* innodb_deadlock_report=off: Do not report any details of deadlocks.
* innodb_deadlock_report=basic: Report transactions and waiting locks.
* innodb_deadlock_report=full (default): Report also the blocking locks.

The improved deadlock checker will consider all involved transactions
in one loop, even if the deadlock loop includes several transactions.
The theoretical maximum number of transactions that can be involved in
a deadlock is `innodb_page_size` * 8, limited by the persistent data
structures.

Note: Similar to
mysql/mysql-server@3859219875
our deadlock checker will consider at most one blocking transaction
for each waiting transaction. The new field trx->lock.wait_trx will be
nullptr if and only if trx->lock.wait_lock is nullptr. Note that
trx->lock.wait_lock->trx == trx (the waiting transaction), while
trx->lock.wait_trx points to one of the transactions whose lock is
conflicting with trx->lock.wait_lock.

Considering only one blocking transaction will greatly simplify
our deadlock checker, but it may also make the deadlock checker
blind to some deadlocks where the deadlock cycle is 'hidden' by
the fact that the registered trx->lock.wait_trx is not actually
waiting for any InnoDB lock, but something else. So, instead of
deadlocks, sometimes lock wait timeout may be reported.

To improve on this, whenever trx->lock.wait_trx is changed, we
will register further 'candidate' transactions in Deadlock::to_check(),
and check for 'revealed' deadlocks as soon as possible, in lock_release()
and innobase_kill_query().

The old DeadlockChecker was holding lock_sys.latch, even though using
lock_sys.wait_mutex should be less contended (and thus preferred)
in the likely case that no deadlock is present.

lock_wait(): Defer the deadlock check to this function, instead of
executing it in lock_rec_enqueue_waiting(), lock_table_enqueue_waiting().

DeadlockChecker: Complete rewrite:
(1) Explicitly keep track of transactions that are being waited for,
in trx->lock.wait_trx, protected by lock_sys.wait_mutex. Previously,
we were painstakingly traversing the lock heaps while blocking
concurrent registration or removal of any locks (even uncontended ones).
(2) Use Brent's cycle-detection algorithm for deadlock detection,
traversing each trx->lock.wait_trx edge at most 2 times.
(3) If a deadlock is detected, release lock_sys.wait_mutex,
acquire LockMutexGuard, re-acquire lock_sys.wait_mutex and re-invoke
find_cycle() to find out whether the deadlock is still present.
(4) Display information on all transactions that are involved in the
deadlock, and choose a victim to be rolled back.

lock_sys.deadlocks: Replaces lock_deadlock_found. Protected by wait_mutex.

Deadlock::find_cycle(): Quickly find a cycle of trx->lock.wait_trx...
using Brent's cycle detection algorithm.
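
Brent's algorithm on a waits-for chain with at most one outgoing edge per transaction can be sketched as follows. This is a simplified illustration, not the actual Deadlock::find_cycle(): the struct TrxSketch and the function find_cycle_sketch() are hypothetical, and all locking (lock_sys.wait_mutex) is omitted.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a transaction: at most one blocking transaction.
struct TrxSketch {
  const TrxSketch* wait_trx = nullptr;
};

// Follow wait_trx edges from start using Brent's cycle detection.
// Returns nullptr if the chain ends (no deadlock), or a transaction that
// lies on the cycle. The start transaction itself need not be on the cycle.
inline const TrxSketch* find_cycle_sketch(const TrxSketch* start)
{
  assert(start != nullptr);
  const TrxSketch* tortoise = start;
  const TrxSketch* hare = start->wait_trx;
  std::size_t power = 1;   // current search window, doubled when exhausted
  std::size_t steps = 1;   // steps taken by the hare since the last teleport
  while (hare != nullptr && hare != tortoise) {
    if (steps == power) {  // window exhausted: move the tortoise to the hare
      tortoise = hare;
      power *= 2;
      steps = 0;
    }
    hare = hare->wait_trx;
    ++steps;
  }
  return hare;
}
```

Unlike Floyd's two-pointer method, only the hare advances along the edges; the tortoise is repositioned at power-of-two intervals, which keeps the number of edge traversals low.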

Deadlock::report(): Report a deadlock cycle that was found by
Deadlock::find_cycle(), and choose a victim with the least weight.
Altogether, we may traverse each trx->lock.wait_trx edge up to 5
times (2*find_cycle()+1 time for reporting and choosing the victim).

Deadlock::check_and_resolve(): Find and resolve a deadlock.

lock_wait_rpl_report(): Report the waits-for information to
replication. This used to be executed as part of DeadlockChecker.
Replication must know the waits-for relations even if no deadlocks
are present in InnoDB.

Reviewed by: Vladislav Vaintroub
2021-02-17 12:44:08 +02:00
Marko Mäkelä
3ddb4fddf1 MDEV-24738: Extend the test innodb.deadlock_detect 2021-02-17 12:34:24 +02:00
Marko Mäkelä
067465cd2f MDEV-15641 fixup: Make the test faster
Let us avoid the excessive allocation of explicit record locks
(a work-around of MDEV-24813) so that the test will execute
much faster under AddressSanitizer, MemorySanitizer, Valgrind.
2021-02-16 12:07:48 +02:00
Marko Mäkelä
e926964cb8 Remove useless test innodb.innodb_bug60049
The test innodb.innodb_bug60049 used to check that the record
(ID,NAME)=(12,'SYS_FOREIGN_COLS') is the last record in the
secondary index of the system table SYS_TABLES.
But, ever since commit 2336558423
or mysql/mysql-server@082d59670f
that record no longer is the last one in the table!

The more recent test innodb.purge_secondary covers the purge
functionality much better.
2021-02-15 18:12:31 +02:00
Sergei Golubchik
25d9d2e37f Merge branch 'bb-10.4-release' into bb-10.5-release 2021-02-15 16:43:15 +01:00
Marko Mäkelä
2e84846ec0 MDEV-24861 Assertion `trx->rsegs.m_redo.rseg' failed in innodb_prepare_commit_versioned
trx_t::commit_tables(): Ensure that mod_tables will be empty.
This was broken in commit b08448de64
where the query cache invalidation was moved from lock_release().
2021-02-15 10:19:57 +02:00
Sergei Golubchik
00a313ecf3 Merge branch 'bb-10.3-release' into bb-10.4-release
Note, the fix for "MDEV-23328 Server hang due to Galera lock conflict resolution"
was null-merged. 10.4 version of the fix is coming up separately
2021-02-12 17:44:22 +01:00
Marko Mäkelä
da3211e487 MDEV-24763 fixup: Use deterministic ORDER BY 2021-02-12 14:03:25 +02:00
Marko Mäkelä
6f3f191cfa MDEV-24763 ALTER TABLE fails to rename a column in SYS_FIELDS
innobase_rename_column_try(): When renaming SYS_FIELDS records
for secondary indexes, try to use both formats of SYS_FIELDS.POS
as keys, in case the PRIMARY KEY includes a column prefix.

Without this fix, an ALTER TABLE that renames a column followed
by a server restart (or LRU eviction of the table definition
from dict_sys) would make the table inaccessible.
2021-02-12 09:48:36 +02:00
Marko Mäkelä
028ba10d0b MDEV-18468 fixup: Make test case robust w.r.t. deferred DROP TABLE 2021-02-12 09:41:15 +02:00
Thirunarayanan Balathandayuthapani
a2fbbba2e3 MDEV-24832 Root page AHI removal fails during rollback of bulk insert
This failure is caused by commit 43ca6059ca
(MDEV-24720). InnoDB fails to remove the AHI entries
during rollback of a bulk insert operation. InnoDB should
remove the AHI entries of the root page before reinitialising it.

Reviewed-by: Marko Mäkelä
2021-02-10 15:27:25 +05:30
Marko Mäkelä
c42ee8a7cf MDEV-24781 fixup: Adjust innodb.innodb-index-debug
Now that an INSERT into an empty table is replicated more efficiently
during online ALTER, an old test case started to fail. Let us disable
the MDEV-515 logic for the critical INSERT statement.
2021-02-05 08:32:57 +02:00
Thirunarayanan Balathandayuthapani
597510adfc MDEV-24781 Assertion `mode == 16 || mode == 12 || fix_block->page.status != buf_page_t::FREED' failed in buf_page_get_low
This is caused by commit 3cef4f8f0f
(MDEV-515). dict_table_t::clear() frees all the blobs during
rollback of bulk insert. But the online log tries to read the
freed blobs while applying the log. This can be fixed by
truncating the online log during rollback of the bulk insert operation.
2021-02-05 10:32:36 +05:30
Marko Mäkelä
5f46385764 MDEV-24731 Excessive mutex contention in DeadlockChecker::check_and_resolve()
The DeadlockChecker expects to be able to freeze the waits-for graph.
Hence, it is best executed somewhere where we are not holding any
additional mutexes.

lock_wait(): Defer the deadlock check to this function, instead
of executing it in lock_rec_enqueue_waiting(), lock_table_enqueue_waiting().

DeadlockChecker::trx_rollback(): Merge with the only caller,
check_and_resolve().

LockMutexGuard: RAII accessor for lock_sys.mutex.

lock_sys.deadlocks: Replaces lock_deadlock_found.

trx_t: Clean up some comments.
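
The RAII pattern behind an accessor such as LockMutexGuard can be sketched as follows. This is a generic illustration built on std::mutex, not the InnoDB class: LockSysSketch, MutexGuardSketch and note_deadlock() are hypothetical names, and treating deadlocks as a simple counter is an assumption.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for lock_sys: a mutex and a counter it protects.
struct LockSysSketch {
  std::mutex mutex;
  unsigned long deadlocks = 0;
};

// RAII guard: acquires the mutex on construction, releases on destruction,
// so every return path (including exceptions) releases the mutex.
class MutexGuardSketch {
public:
  explicit MutexGuardSketch(LockSysSketch& sys) : sys_(sys)
  {
    sys_.mutex.lock();
  }
  ~MutexGuardSketch() { sys_.mutex.unlock(); }
  MutexGuardSketch(const MutexGuardSketch&) = delete;
  MutexGuardSketch& operator=(const MutexGuardSketch&) = delete;

private:
  LockSysSketch& sys_;
};

// Increment the counter while holding the mutex; returns the new value.
inline unsigned long note_deadlock(LockSysSketch& sys)
{
  MutexGuardSketch guard(sys);
  return ++sys.deadlocks;
}
```

Calling note_deadlock() twice in a row works only because the guard's destructor released the mutex after the first call.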
2021-02-04 16:38:07 +02:00
Thirunarayanan Balathandayuthapani
43ca6059ca MDEV-24720 AHI removal during rollback of bulk insert
InnoDB fails to remove the AHI entries during rollback
of a bulk insert operation. InnoDB reports an error when it
validates the AHI hash tables. InnoDB should remove
the AHI entries while freeing the segment only during
bulk index rollback operation.

Reviewed-by: Marko Mäkelä
2021-02-02 19:24:05 +05:30
Marko Mäkelä
1110beccd4 Merge 10.5 into 10.6 2021-02-02 15:15:53 +02:00
Marko Mäkelä
324e5f02a9 MDEV-24754 Crash in ha_partition_inplace_ctx::~ha_partition_inplace_ctx()
ha_innobase::commit_inplace_alter_table(): Fix a regression that was
introduced in 6d1f1b61b5 (MDEV-24564).
2021-02-01 18:45:35 +02:00
Sergei Golubchik
60ea09eae6 Merge branch '10.2' into 10.3 2021-02-01 13:49:33 +01:00
Marko Mäkelä
a70a47f2f3 MDEV-24661: Remove the test innodb.innodb_wl6326_big
The purpose of the test was to ensure that the SX (update) mode of
index tree and buffer page latches is being used.

The test has become unstable, possibly due to changes related to
buf_pool.mutex and buf_pool.page_hash, or to the use of MDL in the
purge of transaction history.

In 10.6, the test depends on instrumentation that was refactored
or removed in MDEV-24142.

The use of different latching modes can better be indirectly observed
through high-concurrency benchmarks. For MDEV-14637, a performance test
was conducted where the finer-grained latching and
BTR_CUR_FINE_HISTORY_LENGTH were removed. It caused a 20% performance
regression for UPDATE and a somewhat smaller one for INSERT.

Any new problem with latching granularity should be easily caught by
performance testing, or by stress tests with Random Query Generator.
2021-01-29 18:03:20 +02:00