1
0
mirror of https://github.com/MariaDB/server.git synced 2025-08-27 13:04:36 +03:00
Commit Graph

2630 Commits

Author SHA1 Message Date
Vlad Lesin
202316a38f Merge 10.5 into 10.6 2022-03-07 18:42:47 +03:00
Vlad Lesin
0b92c7b0e0 Merge 10.4 into 10.5 2022-03-07 17:16:11 +03:00
Vlad Lesin
1ec3205703 Merge 10.3 into 10.4 2022-03-07 16:46:00 +03:00
Vlad Lesin
86c1bf118a MDEV-27992 DELETE fails to delete record after blocking is released
MDEV-27025 allows to insert records before the record on which DELETE is
locked, as a result the DELETE misses those records, what causes serious ACID
violation.

Revert MDEV-27025, MDEV-27550. The test which shows the scenario of ACID
violation is added.
2022-03-07 16:42:05 +03:00
Marko Mäkelä
2dce3bad9c Merge 10.4 into 10.5 2022-03-07 09:26:50 +02:00
Marko Mäkelä
7b97020d40 Merge 10.3 into 10.4 2022-03-07 09:05:36 +02:00
Daniel Black
b6a2472489 MDEV-27891: SIGSEGV in InnoDB buffer pool resize
During an increase in resize, the new curr_size got a value
less than old_size.

As n_chunks_new and n_chunks have a strong correlation to the
resizing operation in progress, we can use them and remove the
need for old_size.

For convienece the n_chunks_new < n_chunks is now the is_shrinking
function.

The volatile compiler optimization on n_chunks{,_new} is removed
as real mutex uses are needed.

Other n_chunks_new/n_chunks methods:

n_chunks_new and n_chunks almost always read and altered under
the pool mutex.

Exceptions are:
* i_s_innodb_buffer_page_fill,
* buf_pool_t::is_uncompressed (via is_blocked_field)
These need reexamining for the need of a mutex, however comments
indicates this already.

get_n_pages has uses in buffer pool load, recover log memory
exhaustion estimates and innodb status so take the minimum number
of chunks for safety.

The buf_pool_t::running_out function also uses curr_size/old_size.
We replace this hot function calculation with just n_chunks_new.
This is the new size of the chunks before the resizing occurs.

If we are resizing down, we've already got the case we had previously
(as the minimum). If we are resizing upwards, we are taking an
optimistic view that there will be buffer chunks available for locks.
As this memory allocation is occurring immediately next the resizing
function it seems likely.

Compiler hint UNIV_UNLIKELY removed to leave it to the branch
predictor to make an informed decision.

Added test case of a smaller size than the Marko/Roel original
in JIRA reducing the size to 256M. SEGV hits roughly 1/10 times
but its better than a 21G memory size.

Reviewer: Marko
2022-03-07 13:36:18 +11:00
Marko Mäkelä
02da00a98c Merge 10.2 into 10.3 2022-03-04 14:29:36 +02:00
Marko Mäkelä
a92f07f4bd MDEV-27993 Assertion failed in btr_page_reorganize_low()
btr_cur_optimistic_insert(): Disregard DEBUG_DBUG injection to
invoke btr_page_reorganize() if the page (and the table) is empty.
Otherwise, an assertion would fail in btr_page_reorganize_low()
because PAGE_MAX_TRX_ID is 0 in an empty secondary index leaf page.
2022-03-03 11:51:25 +02:00
Marko Mäkelä
2db80c3773 MDEV-27973 SIGSEGV in ha_innobase::reset() after TRUNCATE of TEMPORARY TABLE
create_table_info_t::innobase_table_flags(): Ignore page_compressed
and page_compression_level on TEMPORARY tables.

ha_innobase::truncate(): Add a debug assertion that create() must succeed
on temporary tables.
2022-03-01 17:52:45 +02:00
Thirunarayanan Balathandayuthapani
446ec64651 MDEV-27962 Instant DDL downgrades the MDL when table is empty
- Server incorrectly downgrading the MDL after prepare phase when
table is empty. mdl_exclusive_after_prepare is being set in
prepare phase only. But mdl_exclusive_after_prepare condition was
misplaced and checked before prepare phase by
commit d270525dfd and it is now
changed to check after prepare phase.

 - main.innodb_mysql_sync test case was changed to avoid locking
optimization when table is empty.
2022-03-01 13:01:48 +05:30
Marko Mäkelä
6daf8f8a0d Merge 10.5 into 10.6 2022-02-25 13:48:47 +02:00
Marko Mäkelä
b791b942e1 Merge 10.4 into 10.5 2022-02-25 13:27:41 +02:00
Marko Mäkelä
f5ff7d09c7 Merge 10.3 into 10.4 2022-02-25 13:00:48 +02:00
Marko Mäkelä
00b70bbb51 Merge 10.2 into 10.3 2022-02-25 10:43:38 +02:00
Vlad Lesin
f6f055a191 Merge 10.3 into 10.4 2022-02-21 14:10:27 +03:00
Vlad Lesin
a6f258e47f MDEV-20605 Awaken transaction can miss inserted by other transaction records due to wrong persistent cursor restoration
Backported from 10.5 20e9e804c1 and
5948d7602e.

sel_restore_position_for_mysql() moves forward persistent cursor
position after btr_pcur_restore_position() call if cursor relative position
is BTR_PCUR_ON and the cursor points to the record with NOT the same field
values as in a stored record(and some other not important for this case
conditions).

It was done because btr_pcur_restore_position() sets
page_cur_mode_t mode  to PAGE_CUR_LE for cursor->rel_pos ==  BTR_PCUR_ON
before opening cursor. So we are searching for the record less or equal
to stored one. And if the found record is not equal to stored one, then
it is less and we need to move cursor forward.

But there can be a situation when the stored record was purged, but the
new one with the same key but different value was inserted while
row_search_mvcc() was suspended. In this case, when the thread is
awaken, it will invoke sel_restore_position_for_mysql(), which, in turns,
invoke btr_pcur_restore_position(), which will return false because found
record don't match stored record, and
sel_restore_position_for_mysql() will move forward cursor position.

The above can lead to the case when awaken row_search_mvcc() do not see
records inserted by other transactions while it slept. The mtr test case
shows the example how it can be.

The fix is to return special value from persistent cursor restoring
function which would notify its caller that uniq fields of restored
record and stored record are the same, and in this case
sel_restore_position_for_mysql() don't move cursor forward.

Delete-marked records are correctly processed in row_search_mvcc().
Non-unique secondary indexes are "uniquified" by adding the PK, the
index->n_uniq should then be index->n_fields. So there is no need in
additional checks in the fix.

If transaction's readview can't see the changes made in secondary index
record, it requests clustered index record in row_search_mvcc() to check
its transaction id and get the correspondent record version. After this
row_search_mvcc() commits mtr to preserve clustered index latching
order, and starts mtr. Between those mtr commit and start secondary
index pages are unlatched, and purge has the ability to remove stored in
the cursor record, what causes rows duplication in result set for
non-locking reads, as cursor position is restored to the previously
visited record.

To solve this the changes are just switched off for non-locking reads,
it's quite simple solution, besides the changes don't make sense for
non-locking reads.

The more complex and effective from performance perspective solution is
to create mtr savepoint before clustered record requesting and rolling
back to that savepoint after that. See MDEV-27557.

One more solution is to have per-record transaction id for secondary
indexes. See MDEV-17598.

If any of those is implemented, just remove select_lock_type argument in
sel_restore_position_for_mysql().
2022-02-21 12:49:54 +03:00
Vlad Lesin
5f001bd7b8 MDEV-27025 insert-intention lock conflicts with waiting ORDINARY lock
The code was backported from 10.5 be8113861c
commit. See that commit message for details.
2022-02-21 12:49:54 +03:00
Vladislav Vaintroub
4e667b9638 MDEV-27877 considerably speed up innodb_force_recovery test
decrease innodb_lock_wait_timeout for the current session.
2022-02-17 18:25:32 +01:00
Marko Mäkelä
f04b459fb7 Merge 10.5 into 10.6 2022-02-17 14:37:17 +02:00
Marko Mäkelä
cac995ec6f Merge 10.4 into 10.5 2022-02-17 11:58:25 +02:00
Marko Mäkelä
6f4740fde7 MDEV-27723 innodb.instant_alter,8k.rdiff does not apply on FreeBSD
The .rdiff files were not correctly adjusted in
commit 8f4a3bf07c (MDEV-25057).
2022-02-17 11:49:13 +02:00
Marko Mäkelä
f921db7aa5 Merge 10.3 into 10.4 2022-02-17 11:33:08 +02:00
Marko Mäkelä
5b237e5965 Merge 10.2 into 10.3 2022-02-17 10:53:58 +02:00
Marko Mäkelä
73c391afc5 MDEV-27583 InnoDB uses different constants for FK cascade error message in SQL vs error log
convert_error_code_to_mysql(): Use the correct limit FK_MAX_CASCADE_DEL
in the error message. The DICT_FK_MAX_RECURSIVE_LOAD applies to
the number of foreign key constraints in table definitions,
not to the number of rows that are visited while processing
a foreign key constraint.
2022-02-17 10:48:24 +02:00
Marko Mäkelä
cf574cf53b MDEV-27634 innodb_zip tests failing on s390x
Some GNU/Linux distributions ship a zlib that is modified to use
the s390x DFLTCC instruction. That modification would essentially
redefine compressBound(sourceLen) as (sourceLen * 16 + 2308) / 8 + 6.

Let us relax the tests for InnoDB ROW_FORMAT=COMPRESSED to cope with
such a weaker compression guarantee.

create_table_info_t::row_size_is_acceptable(): Remove a bogus debug-only
assertion that would fail to hold for the test innodb_zip.bug36169.
The function page_zip_empty_size() may indeed return 0.
2022-02-16 17:03:02 +02:00
Vlad Lesin
497809d26d Merge 10.5 into 10.6 2022-02-15 11:32:15 +03:00
Vlad Lesin
5948d7602e MDEV-20605 Awaken transaction can miss inserted by other transaction records due to wrong persistent cursor restoration
Post-push fix: remove unstable test.

The test was developed to find the reason of duplicated rows caused by
MDEV-20605 fix. The test is not necessary as the reason was found and
the bug was fixed.
2022-02-15 10:04:05 +03:00
Vlad Lesin
f2f22c382b Merge 10.5 into 10.6 2022-02-14 18:30:51 +03:00
Vlad Lesin
20e9e804c1 MDEV-20605 Awaken transaction can miss inserted by other transaction records due to wrong persistent cursor restoration
sel_restore_position_for_mysql() moves forward persistent cursor
position after btr_pcur_restore_position() call if cursor relative position
is BTR_PCUR_ON and the cursor points to the record with NOT the same field
values as in a stored record(and some other not important for this case
conditions).

It was done because btr_pcur_restore_position() sets
page_cur_mode_t mode  to PAGE_CUR_LE for cursor->rel_pos ==  BTR_PCUR_ON
before opening cursor. So we are searching for the record less or equal
to stored one. And if the found record is not equal to stored one, then
it is less and we need to move cursor forward.

But there can be a situation when the stored record was purged, but the
new one with the same key but different value was inserted while
row_search_mvcc() was suspended. In this case, when the thread is
awaken, it will invoke sel_restore_position_for_mysql(), which, in turns,
invoke btr_pcur_restore_position(), which will return false because found
record don't match stored record, and
sel_restore_position_for_mysql() will move forward cursor position.

The above can lead to the case when awaken row_search_mvcc() do not see
records inserted by other transactions while it slept. The mtr test case
shows the example how it can be.

The fix is to return special value from persistent cursor restoring
function which would notify its caller that uniq fields of restored
record and stored record are the same, and in this case
sel_restore_position_for_mysql() don't move cursor forward.

Delete-marked records are correctly processed in row_search_mvcc().
Non-unique secondary indexes are "uniquified" by adding the PK, the
index->n_uniq should then be index->n_fields. So there is no need in
additional checks in the fix.

If transaction's readview can't see the changes made in secondary index
record, it requests clustered index record in row_search_mvcc() to check
its transaction id and get the correspondent record version. After this
row_search_mvcc() commits mtr to preserve clustered index latching
order, and starts mtr. Between those mtr commit and start secondary
index pages are unlatched, and purge has the ability to remove stored in
the cursor record, what causes rows duplication in result set for
non-locking reads, as cursor position is restored to the previously
visited record.

To solve this the changes are just switched off for non-locking reads,
it's quite simple solution, besides the changes don't make sense for
non-locking reads.

The more complex and effective from performance perspective solution is
to create mtr savepoint before clustered record requesting and rolling
back to that savepoint after that. See MDEV-27557.

One more solution is to have per-record transaction id for secondary
indexes. See MDEV-17598.

If any of those is implemented, just remove select_lock_type argument in
sel_restore_position_for_mysql().
2022-02-14 17:35:04 +03:00
Marko Mäkelä
feb8004b58 Merge 10.5 into 10.6 2022-02-14 09:16:41 +02:00
Marko Mäkelä
52b32c60c2 Merge 10.4 into 10.5 2022-02-14 08:59:33 +02:00
Marko Mäkelä
c9bc10e6e8 Merge 10.3 into 10.4 2022-02-14 08:56:50 +02:00
Marko Mäkelä
e928fdbff1 Merge 10.2 into 10.3 2022-02-14 08:49:11 +02:00
Vlad Lesin
3b10e8f80c MDEV-27746 Wrong comparision of BLOB's empty preffix with non-preffixed BLOB causes rows count mismatch for clustered and secondary indexes during non-locking read
row_sel_sec_rec_is_for_clust_rec() treats empty BLOB prefix field in
secondary index as a field equal to any external BLOB field in clustered
index. Row_sel_get_clust_rec_for_mysql::operator() doesn't zerro out
clustered record pointer in row_search_mvcc(), and row_search_mvcc()
thinks that delete-marked secondary index record has visible for
"CHECK TABLE"'s read view old-versioned clustered index record, and
row_scan_index_for_mysql() counts it as a row.

The fix is to execute row_sel_sec_rec_is_for_blob() in
row_sel_sec_rec_is_for_clust_rec() if clustered field contains BLOB's
reference.
2022-02-11 12:26:27 +03:00
Marko Mäkelä
cce994057b Merge 10.5 into 10.6 2022-02-09 15:49:50 +02:00
Oleksandr Byelkin
34c5019698 Merge branch '10.5' into bb-10.5-release 2022-02-09 08:57:41 +01:00
Marko Mäkelä
5c46751f23 MDEV-27734 Set innodb_change_buffering=none by default
The aim of the InnoDB change buffer is to avoid delays when a leaf page
of a secondary index is not present in the buffer pool, and a record needs
to be inserted, delete-marked, or purged. Instead of reading the page into
the buffer pool for making such a modification, we may insert a record to
the change buffer (a special index tree in the InnoDB system tablespace).
The buffered changes are guaranteed to be merged if the index page
actually needs to be read later.

The change buffer could be useful when the database is stored on a
rotational medium (hard disk) where random seeks are slower than
sequential reads or writes.

Obviously, the change buffer will cause write amplification, due to
potentially large amount of metadata that is being written to the
change buffer. We will have to write redo log records for modifying
the change buffer tree as well as the user tablespace. Furthermore,
in the user tablespace, we must maintain a change buffer bitmap page
that uses 2 bits for estimating the amount of free space in pages,
and 1 bit to specify whether buffered changes exist. This bitmap needs
to be updated on every operation, which could reduce performance.

Even if the change buffer were free of bugs such as MDEV-24449
(potentially causing the corruption of any page in the system tablespace)
or MDEV-26977 (corruption of secondary indexes due to a currently
unknown reason), it will make diagnosis of other data corruption harder.

Because of all this, it is best to disable the change buffer by default.
2022-02-09 08:36:41 +02:00
Oleksandr Byelkin
f5c5f8e41e Merge branch '10.5' into 10.6 2022-02-03 17:01:31 +01:00
Oleksandr Byelkin
cf63eecef4 Merge branch '10.4' into 10.5 2022-02-01 20:33:04 +01:00
Oleksandr Byelkin
a576a1cea5 Merge branch '10.3' into 10.4 2022-01-30 09:46:52 +01:00
Oleksandr Byelkin
41a163ac5c Merge branch '10.2' into 10.3 2022-01-29 15:41:05 +01:00
Marko Mäkelä
e9aac09153 MDEV-25440: Indexed CHAR columns are broken with NO_PAD collations
cmp_data(): Compare different-length CHAR fields with
the new strnncollsp_nchars function that will pad spaces if needed.

Any InnoDB ROW_FORMAT except the original one that was named
ROW_FORMAT=REDUNDANT in MySQL 5.0.3 will internally store
CHAR(n) columns as variable-length if the character encoding is
variable length. Spaces may be trimmed from the end.
For NOT NULL values, the minimum length is always n*mbminlen.
In cmp_data() we only know the lengths in bytes and we cannot
easily know the ROW_FORMAT.

is_strnncoll_compatible(): Refactored from innobase_mysql_cmp().

innobase_mysql_cmp(): Merged to cmp_whole_field().

cmp_whole_field(): Invoke strnncollsp_nchars for the DATA_MYSQL
(the CHAR type with any other collation than latin1_swedish_ci).

Reviewed by: Alexander Barkov
Tested by: Roel Roel Van de Paar
2022-01-26 12:42:17 +02:00
Alexander Barkov
62e320c86d MDEV-18918 SQL mode EMPTY_STRING_IS_NULL breaks RBR upon CREATE TABLE .. SELECT
The 10.5 version of the patch.

Removing DEFAULT from INFORMATION_SCHEMA columns.
DEFAULT in read-only tables is rather meaningless.
Upgrade should go smoothly.

Also fixes:
 MDEV-20254 Problems with EMPTY_STRING_IS_NULL and I_S tables
2022-01-25 10:31:55 +04:00
Alexander Barkov
da37bfd8d6 MDEV-18918 SQL mode EMPTY_STRING_IS_NULL breaks RBR upon CREATE TABLE .. SELECT
Removing DEFAULT from INFORMATION_SCHEMA columns.
DEFAULT in read-only tables is rather meaningless.
Upgrade should go smoothly.

Also fixes:
 MDEV-20254 Problems with EMPTY_STRING_IS_NULL and I_S tables
2022-01-25 10:31:03 +04:00
Eugene Kosov
faaecc8fcf MDEV-27273 Confusing column count in IMPORT TABLESPACE error message
It's misleading to compare and write to user number of columns and fields.
Thus, it would be better to remove that check and let use see a subsequent
error message about missing or mispaced column.

row_import::match_schema(): remove misleading check
2022-01-21 20:25:56 +03:00
Marko Mäkelä
c1d7b4575e MDEV-26870 --skip-symbolic-links does not disallow .isl file creation
The InnoDB DATA DIRECTORY attribute is not implemented via
symbolic links but something similar, *.isl files that contain
the names of data files.

InnoDB failed to ignore the DATA DIRECTORY attribute even though
the server was started with --skip-symbolic-links.

Native ALTER TABLE in InnoDB will retain the DATA DIRECTORY attribute
of the table, no matter if the table will be rebuilt or not.

Generic ALTER TABLE (with ALGORITHM=COPY) as well as TRUNCATE TABLE
will discard the DATA DIRECTORY attribute.

All tests have been run with and without the ./mtr option
--mysqld=--skip-symbolic-links
and some tests that use the InnoDB DATA DIRECTORY attribute
have been adjusted for this.
2022-01-21 14:43:59 +02:00
Thirunarayanan Balathandayuthapani
28e166d643 MDEV-26784 [Warning] InnoDB: Difficult to find free blocks in the buffer pool
Problem:
=======
  InnoDB ran out of memory during recovery and it fails to
flush the dirty LRU blocks. The reason is that buffer pool
can ran out before the LRU list length reaches
BUF_LRU_OLD_MIN_LEN(256) threshold.

Fix:
====
During recovery, InnoDB should write out and evict all
dirty blocks.
2022-01-21 14:15:18 +05:30
Daniel Black
410c4edef3 MDEV-27467: innodb to enforce the minimum innodb_buffer_pool_size in SET GLOBAL
.. to be the same as startup.

In resolving MDEV-27461, BUF_LRU_MIN_LEN (256) is the minimum number of
pages for the innodb buffer pool size. Obviously we need more than just
flushing pages. Taking the 16k page size and its default minimum, an
extra 25% is needed on top of the flushing pages to make a workable buffer
pool.

The minimum innodb_buffer_pool_chunk_size (1M) restricts the minimum
otherwise we'd have a pool made up of different chunk sizes.

The resulting minimum innodb buffer pool sizes are:

Page Size, Previously minimum (startup), with change.
        4k                            5M           2M
        8k                            5M           3M
       16k                            5M           5M
       32k                           24M          10M
       64k                           24M          20M

With this patch, SET GLOBAL innodb_buffer_pool_size minimums are
enforced.

The evident minimum system variable size for innodb_buffer_pool_size
is 2M, however this is only setable if using 4k page size. As
the order of the page_size and buffer_pool_size aren't fixed, we can't
hide this change.

Subsequent changes:
* innodb_buffer_pool_resize_with_chunks.test - raised of pool resize due to new
  minimums. Chunk size also needed increase as the test was for
  pool_size < chunk_size to generate a warning.
* Removed srv_buf_pool_min_size and replaced use with MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* Removed srv_buf_pool_def_size and replaced constant defination in
  MYSQL_SYSVAR_LONGLONG(buffer_pool_size)
* Reordered ha_innodb to allow for direct use of MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* Moved buf_pool_size_align into ha_innodb to access to MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* loose-innodb_disable_resize_buffer_pool_debug is needed in the
  innodb.restart.opt test so that under debug mode, resizing of the
  innodb buffer pool can occur.
2022-01-19 11:10:45 +11:00
Eugene Kosov
e128d852e8 MDEV-27272 Crash on EXPORT/IMPORT tablespace with column added in the middle
dict_index_t::reconstruct_fields(): add input validation by replacing some
assertions

handle_instant_metadata(): fix nullptr dereference
2022-01-19 00:04:57 +03:00