1
0
mirror of https://github.com/MariaDB/server.git synced 2025-08-07 00:04:31 +03:00
Commit Graph

291 Commits

Author SHA1 Message Date
Sergei Golubchik
7d657fda64 Merge branch '10.11 into 11.4 2025-01-30 12:01:11 +01:00
Sergei Golubchik
e69f8cae1a Merge branch '10.6' into 10.11 2025-01-30 11:55:13 +01:00
Sergei Golubchik
066e8d6aea Merge branch '10.5' into 10.6 2025-01-29 11:17:38 +01:00
Nikita Malyavin
ecaedbe299 MDEV-33658 1/2 Refactoring: extract Key length initialization
mysql_prepare_create_table: Extract a Key initialization part that
relates to length calculation and long unique index designation.

append_system_key_parts call also moves there.

Move this initialization before the duplicate elimination.

Extract WITHOUT OVERPLAPS check into a separate function. It had to be moved
earlier in the code to preserve the order of the error checks, as in the tests.
2025-01-26 16:15:46 +01:00
Marko Mäkelä
17f01186f5 Merge 10.11 into 11.4 2025-01-09 07:58:08 +02:00
Marko Mäkelä
a54d151fc1 Merge 10.6 into 10.11 2024-12-19 15:38:53 +02:00
Marko Mäkelä
ddd7d5d8e3 MDEV-24035 Failing assertion: UT_LIST_GET_LEN(lock.trx_locks) == 0 causing disruption and replication failure
Under unknown circumstances, the SQL layer may wrongly disregard an
invocation of thd_mark_transaction_to_rollback() when an InnoDB
transaction had been aborted (rolled back) due to one of the following errors:
* HA_ERR_LOCK_DEADLOCK
* HA_ERR_RECORD_CHANGED (if innodb_snapshot_isolation=ON)
* HA_ERR_LOCK_WAIT_TIMEOUT (if innodb_rollback_on_timeout=ON)

Such an error used to cause a crash of InnoDB during transaction commit.
These changes aim to catch and report the error earlier, so that not only
this crash can be avoided but also the original root cause be found and
fixed more easily later.

The idea of this fix is from Michael 'Monty' Widenius.

HA_ERR_ROLLBACK: A new error code that will be translated into
ER_ROLLBACK_ONLY, signalling that the current transaction
has been aborted and the only allowed action is ROLLBACK.

trx_t::state: Add TRX_STATE_ABORTED that is like
TRX_STATE_NOT_STARTED, but noting that the transaction had been
rolled back and aborted.

trx_t::is_started(): Replaces trx_is_started().

ha_innobase: Check the transaction state in various places.
Simplify the logic around SAVEPOINT.

ha_innobase::is_valid_trx(): Replaces ha_innobase::is_read_only().

The InnoDB logic around transaction savepoints, commit, and rollback
was unnecessarily complex and might have contributed to this
inconsistency. So, we are simplifying that logic as well.

trx_savept_t: Replace with const undo_no_t*. When we rollback to
a savepoint, all we need to know is the number of undo log records
that must survive.

trx_named_savept_t, DB_NO_SAVEPOINT: Remove. We can store undo_no_t
directly in the space allocated at innobase_hton->savepoint_offset.

fts_trx_create(): Do not copy previous savepoints.

fts_savepoint_rollback(): If a savepoint was not found, roll back
everything after the default savepoint of fts_trx_create().
The test innodb_fts.savepoint is extended to cover this code.

Reviewed by: Vladislav Lesin
Tested by: Matthias Leich
2024-12-12 18:02:00 +02:00
Marko Mäkelä
2719cc4925 Merge 10.11 into 11.4 2024-12-02 11:35:34 +02:00
Marko Mäkelä
3d23adb766 Merge 10.6 into 10.11 2024-11-29 13:43:17 +02:00
Thirunarayanan Balathandayuthapani
9ba18d1aa0 MDEV-35394 Innochecksum misinterprets freed pages
- Innochecksum misinterprets the freed pages as active one.
This leads the user to think there are too many valid
pages exist.

- To avoid this confusion, innochecksum introduced one
more option --skip-freed-pages and -r to avoid the freed
pages while dumping or printing the summary of the tablespace.

- Innochecksum can safely assume the page is freed if
the respective extent doesn't belong to a segment and marked as
freed in XDES_BITMAP in extent descriptor page.

- Innochecksum shouldn't assume that zero-filled page as extent
descriptor page.

Reviewed-by: Marko Mäkelä
2024-11-27 13:00:51 +05:30
Marko Mäkelä
ebefef658e Merge 10.11 into 11.2 2024-10-18 11:32:22 +03:00
Marko Mäkelä
eca552a1a4 MDEV-34830: LSN in the future is not being treated as serious corruption
The invariant of write-ahead logging is that before any change to a
page is written to the data file, the corresponding log record must
must first have been durably written.

In crash recovery, there were some sloppy checks for this. Let us
implement accurate checks and flag an inconsistency as a hard error,
so that we can avoid further corruption of a corrupted database.
For data extraction from the corrupted database, innodb_force_recovery
can be used.

Before recovery is reading any data pages or invoking
buf_dblwr_t::recover() to recover torn pages from the
doublewrite buffer, InnoDB will have parsed the log until the
final LSN and updated log_sys.lsn to that. So, we can rely on
log_sys.lsn at all times. The doublewrite buffer recovery has been
refactored in such a way that the recv_sys.dblwr.pages may be consulted
while discovering files and their page sizes, but nothing will be
written back to data files before buf_dblwr_t::recover() is invoked.

recv_max_page_lsn, recv_lsn_checks_on: Remove.

recv_sys_t::validate_checkpoint(): Validate the write-ahead-logging
condition at the end of the recovery.

recv_dblwr_t::validate_page(): Keep track of the maximum LSN
(if we are checking a non-doublewrite copy of a page) but
do not complain LSN being in the future. The doublewrite buffer
is a special case, because it will be read early during recovery.
Besides, starting with commit 762bcb81b5
the dblwr=true copies of pages may legitimately be "too new".

recv_dblwr_t::find_page(): Find a valid page with the smallest
FIL_PAGE_LSN that is in the valid range for recovery.

recv_dblwr_t::restore_first_page(): Replaced by find_page().
Only buf_dblwr_t::recover() will write to data files.

buf_dblwr_t::recover(): Simplify the message output. Do attempt
doublewrite recovery on user page read error. Ignore doublewrite
pages whose FIL_PAGE_LSN is outside the usable bounds. Previously,
we could wrongly recover a too new page from the doublewrite buffer.
It is unlikely that this could have lead to an actual error.
Write back all recovered pages from the doublewrite buffer here,
including for the first page of any tablespace.

buf_page_is_corrupted(): Distinguish the return values
CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER.

buf_page_check_corrupt(): Return the error code DB_CORRUPTION
in case the LSN is in the future.

Datafile::read_first_page_flags(): Split from read_first_page().
Take a copy of the first page as a parameter.

recv_sys_t::free_corrupted_page(): Take the file as a parameter
and return whether a message was displayed. This avoids some duplicated
and incomplete error messages.

buf_page_t::read_complete(): Remove some redundant output and always
display the name of the corrupted file. Never return DB_FAIL;
use it only in internal error handling.

IORequest::read_complete(): Assume that buf_page_t::read_complete()
will have reported any error.

fil_space_t::set_corrupted(): Return whether this is the first time
the tablespace had been flagged as corrupted.

Datafile::validate_first_page(), fil_node_open_file_low(),
fil_node_open_file(), fil_space_t::read_page0(),
fil_node_t::read_page0(): Add a parameter for a copy of the
first page, and a parameter to indicate whether the FIL_PAGE_LSN
check should be suppressed. Before buf_dblwr_t::recover() is
invoked, we cannot validate the FIL_PAGE_LSN, but we can trust the
FSP_SPACE_FLAGS and the tablespace ID that may be present in a
potentially too new copy of a page.

Reviewed by: Debarun Banerjee
2024-10-18 10:12:47 +03:00
Marko Mäkelä
bb47e575de MDEV-34830: LSN in the future is not being treated as serious corruption
The invariant of write-ahead logging is that before any change to a
page is written to the data file, the corresponding log record must
must first have been durably written.

On crash recovery, there were some sloppy checks for this. Let us
implement accurate checks and flag an inconsistency as a hard error,
so that we can avoid further corruption of a corrupted database.
For data extraction from the corrupted database, innodb_force_recovery
can be used.

Before recovery is reading any data pages or invoking
buf_dblwr_t::recover() to recover torn pages from the
doublewrite buffer, InnoDB will have parsed the log until the
final LSN and updated log_sys.lsn to that. So, we can rely on
log_sys.lsn at all times. The doublewrite buffer recovery has been
refactored in such a way that the recv_sys.dblwr.pages may be consulted
while discovering files and their page sizes, but nothing will be
written back to data files before buf_dblwr_t::recover() is invoked.

A section of the test mariabackup.innodb_redo_overwrite
that is parsing some mariadb-backup --backup output has
been removed, because that output "redo log block is overwritten"
would often be missing in a Microsoft Windows environment
as a result of these changes.

recv_max_page_lsn, recv_lsn_checks_on: Remove.

recv_sys_t::validate_checkpoint(): Validate the write-ahead-logging
condition at the end of the recovery.

recv_dblwr_t::validate_page(): Keep track of the maximum LSN
(if we are checking a non-doublewrite copy of a page) but
do not complain LSN being in the future. The doublewrite buffer
is a special case, because it will be read early during recovery.
Besides, starting with commit 762bcb81b5
the dblwr=true copies of pages may legitimately be "too new".

recv_dblwr_t::find_page(): Find a valid page with the smallest
FIL_PAGE_LSN that is in the valid range for recovery.

recv_dblwr_t::restore_first_page(): Replaced by find_page().
Only buf_dblwr_t::recover() will write to data files.

buf_dblwr_t::recover(): Simplify the message output. Do attempt
doublewrite recovery on user page read error. Ignore doublewrite
pages whose FIL_PAGE_LSN is outside the usable bounds. Previously,
we could wrongly recover a too new page from the doublewrite buffer.
It is unlikely that this could have lead to an actual error.
Write back all recovered pages from the doublewrite buffer here,
including for the first page of any tablespace.

buf_page_is_corrupted(): Distinguish the return values
CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER.

buf_page_check_corrupt(): Return the error code DB_CORRUPTION
in case the LSN is in the future.

Datafile::read_first_page(): Handle FSP_SPACE_FLAGS=0xffffffff
in the same way on both 32-bit and 64-bit architectures.

Datafile::read_first_page_flags(): Split from read_first_page().
Take a copy of the first page as a parameter.

recv_sys_t::free_corrupted_page(): Take the file as a parameter
and return whether a message was displayed. This avoids some duplicated
and incomplete error messages.

buf_page_t::read_complete(): Remove some redundant output and always
display the name of the corrupted file. Never return DB_FAIL;
use it only in internal error handling.

IORequest::read_complete(): Assume that buf_page_t::read_complete()
will have reported any error.

fil_space_t::set_corrupted(): Return whether this is the first time
the tablespace had been flagged as corrupted.

Datafile::validate_first_page(), fil_node_open_file_low(),
fil_node_open_file(), fil_space_t::read_page0(),
fil_node_t::read_page0(): Add a parameter for a copy of the
first page, and a parameter to indicate whether the FIL_PAGE_LSN
check should be suppressed. Before buf_dblwr_t::recover() is
invoked, we cannot validate the FIL_PAGE_LSN, but we can trust the
FSP_SPACE_FLAGS and the tablespace ID that may be present in a
potentially too new copy of a page.

Reviewed by: Debarun Banerjee
2024-10-17 17:24:20 +03:00
Sergei Golubchik
f0a5412037 Merge branch '11.0' into 11.1 2024-05-13 09:52:30 +02:00
Sergei Golubchik
f9807aadef Merge branch '10.11' into 11.0 2024-05-12 12:18:28 +02:00
Sergei Golubchik
018d537ec1 Merge branch '10.6' into 10.11 2024-04-22 15:23:10 +02:00
Marko Mäkelä
bb2e125d07 Merge 10.5 into 10.6
This excludes commit 040069f4ba
because it is specific to innodb_sync_debug, which had been removed
in commit ff5d306e29.
2024-04-18 07:14:56 +03:00
Vladislav Vaintroub
061adae9a2 MDEV-16944 Fix file sharing issues on Windows in mysqltest
On Windows systems, occurrences of ERROR_SHARING_VIOLATION due to
conflicting share modes between processes accessing the same file can
result in CreateFile failures.

mysys' my_open() already incorporates a workaround by implementing
wait/retry logic on Windows.

But this does not help if files are opened using shell redirection like
mysqltest traditionally did it, i.e via

--echo exec "some text" > output_file

In such cases, it is cmd.exe, that opens the output_file, and it
won't do any sharing-violation retries.

This commit addresses the issue by introducing a new built-in command,
'write_line', in mysqltest. This new command serves as a brief alternative
to 'write_file', with a single line output, that also resolves variables
like "exec" would.

Internally, this command will use my_open(), and therefore retry-on-error
logic.

Hopefully this will eliminate the very sporadic "can't open file because
it is used by another process" error on CI.
2024-04-17 16:52:37 +02:00
Marko Mäkelä
683fbced6b Merge 11.0 into 11.1 2024-03-28 12:15:36 +02:00
Marko Mäkelä
fec2fd6add Merge 10.11 into 11.0 2024-03-28 10:51:36 +02:00
Marko Mäkelä
788953463d Merge 10.6 into 10.11
Some fixes related to commit f838b2d799 and
Rows_log_event::do_apply_event() and Update_rows_log_event::do_exec_row()
for system-versioned tables were provided by Nikita Malyavin.
This was required by test versioning.rpl,trx_id,row.
2024-03-28 09:16:57 +02:00
Marko Mäkelä
fa8a46eb68 MDEV-33613 InnoDB may still hang when temporarily running out of buffer pool
By design, InnoDB has always hung when permanently running out of
buffer pool, for example when several threads are waiting to allocate
a block, and all of the buffer pool is buffer-fixed by the active threads.

The hang that we are fixing here occurs when the buffer pool is only
temporarily running out and the situation could be rescued by writing out
some dirty pages or evicting some clean pages.

buf_LRU_get_free_block(): Simplify the way how we wait for
the buf_flush_page_cleaner thread. This fixes occasional hangs
of the test encryption.innochecksum that were introduced by
commit a55b951e60 (MDEV-26827).
To play it safe, we use a timed wait when waiting for the
buf_flush_page_cleaner() thread to perform its job. Should that
thread get stuck, we will invoke buf_pool.LRU_warn() in order to
display a message that pages could not be freed, and keep trying
to wake up the buf_flush_page_cleaner() thread.

The INFORMATION_SCHEMA.INNODB_METRICS counters
buffer_LRU_single_flush_failure_count and
buffer_LRU_get_free_waits will be removed.
The latter is represented by buffer_pool_wait_free.

Also removed will be the message
"InnoDB: Difficult to find free blocks in the buffer pool"
because in d34479dc66 we
introduced a more precise message
"InnoDB: Could not free any blocks in the buffer pool"
in the buf_flush_page_cleaner thread.

buf_pool_t::LRU_warn(): Issue the warning message that we could
not free any blocks in the buffer pool. This may also be invoked
by buf_LRU_get_free_block() if buf_flush_page_cleaner() appears
to be stuck.

buf_pool_t::n_flush_dec(): Remove.

buf_pool_t::n_flush_dec_holding_mutex(): Rename to n_flush_dec().

buf_flush_LRU_list_batch(): Increment the eviction counter for blocks
of temporary, discarded or dropped tablespaces.

buf_flush_LRU(): Make static, and remove the constant parameter
evict=false. The only caller will be the buf_flush_page_cleaner()
thread.

IORequest::is_LRU(): Remove. The only case of evicting pages on
write completion will be when we are writing out pages of the
temporary tablespace. Those pages are not in buf_pool.flush_list,
only in buf_pool.LRU.

buf_page_t::flush(): Remove the parameter evict.

buf_page_t::write_complete(): Change the parameter "bool temporary"
to "bool persistent" and add a parameter for an already read state().

Reviewed by: Debarun Banerjee
2024-03-22 14:17:39 +02:00
Sergei Golubchik
b6680e0101 Merge branch '11.0' into 11.1 2024-02-02 11:30:47 +01:00
Sergei Golubchik
87e13722a9 Merge branch '10.6' into 10.11 2024-02-01 18:36:14 +01:00
Sergei Golubchik
3f6038bc51 Merge branch '10.5' into 10.6 2024-01-31 18:04:03 +01:00
Sergei Golubchik
01f6abd1d4 Merge branch '10.4' into 10.5 2024-01-31 17:32:53 +01:00
Oleksandr Byelkin
fe490f85bb Merge branch '10.11' into 11.0 2024-01-30 08:54:10 +01:00
Oleksandr Byelkin
14d930db5d Merge branch '10.6' into 10.11 2024-01-30 08:17:58 +01:00
Oleksandr Byelkin
25c0806867 Merge branch '10.5' into 10.6 2024-01-30 07:43:15 +01:00
Oleksandr Byelkin
50107c4b22 Merge branch '10.4' into 10.5 2024-01-30 07:26:17 +01:00
Thirunarayanan Balathandayuthapani
653cb195d3 MDEV-26740 Inplace alter rebuild increases file size
PageBulk::init(): Unnecessary reserves the extent before
allocating a page for bulk insert. btr_page_alloc()
capable of handing the extending of tablespace.
2024-01-15 13:04:10 +05:30
Sergei Golubchik
7a5448f8da Merge branch '11.0' into 11.1 2023-12-19 20:11:54 +01:00
Sergei Golubchik
8c8bce05d2 Merge branch '10.11' into 11.0 2023-12-19 15:53:18 +01:00
Sergei Golubchik
fd0b47f9d6 Merge branch '10.6' into 10.11 2023-12-18 11:19:04 +01:00
Sergei Golubchik
e95bba9c58 Merge branch '10.5' into 10.6 2023-12-17 11:20:43 +01:00
Sergei Golubchik
98a39b0c91 Merge branch '10.4' into 10.5 2023-12-02 01:02:50 +01:00
Marko Mäkelä
edc478847b Merge 11.0 into 11.1 2023-11-24 15:58:35 +02:00
Marko Mäkelä
5b6134b040 Merge 10.11 into 11.0 2023-11-24 11:20:56 +02:00
Marko Mäkelä
f87c7d1772 Merge 10.6 into 10.11 2023-11-21 12:47:51 +02:00
Marko Mäkelä
4c16ec3e77 MDEV-32050 fixup: Stabilize tests
In any test that uses wait_all_purged.inc, ensure that InnoDB tables
will be created without persistent statistics.

This is a follow-up to commit cd04673a17
after a similar failure was observed in the innodb_zip.blob test.
2023-11-21 12:42:00 +02:00
Marko Mäkelä
90d968dab9 Merge 10.6 into 10.11 2023-11-20 10:08:19 +02:00
Marko Mäkelä
eb1f8b2919 MDEV-32027 Opening all .ibd files on InnoDB startup can be slow
dict_find_max_space_id(): Return SELECT MAX(SPACE) FROM SYS_TABLES.

dict_check_tablespaces_and_store_max_id(): In the normal case
(no encryption plugin has been loaded and the change buffer is empty),
invoke dict_find_max_space_id() and do not open any .ibd files.
If a std::set<uint32_t> has been specified, open the files whose
tablespace ID is mentioned. Else, open all data files that are identified
by SYS_TABLES records.

fil_ibd_open(): Remove a call to os_file_get_last_error() that can
report a misleading error, such as EINVAL inside my_realpath() that is
not an actual error. This could be invoked when a data file is found
but the FSP_SPACE_FLAGS are incorrect, such as is the case for
table test.td in
./mtr --mysqld=--innodb-buffer-pool-dump-at-shutdown=0 innodb.table_flags

buf_load(): If any tablespaces could not be found, invoke
dict_check_tablespaces_and_store_max_id() on the missing tablespaces.

dict_load_tablespace(): Try to load the tablespace unless it was found
to be futile. This fixes failures related to FTS_*.ibd files for
FULLTEXT INDEX.

btr_cur_t::search_leaf(): Prevent a crash when the tablespace
does not exist. This was caught by the test innodb_fts.fts_concurrent_insert
when the change to dict_load_tablespaces() was not present.

We modify a few tests to ensure that tables will not be loaded at startup.
For some fault injection tests this means that the corrupted tables
will not be loaded, because dict_load_tablespace() would perform stricter
checks than dict_check_tablespaces_and_store_max_id().

Tested by: Matthias Leich
Reviewed by: Thirunarayanan Balathandayuthapani
2023-11-17 15:07:51 +02:00
Oleksandr Byelkin
0f5613a25f Merge branch '11.0' into 11.1 2023-11-08 18:03:08 +01:00
Oleksandr Byelkin
48af85db21 Merge branch '10.11' into 11.0 2023-11-08 17:09:44 +01:00
Oleksandr Byelkin
04d9a46c41 Merge branch '10.6' into 10.10 2023-11-08 16:23:30 +01:00
Marko Mäkelä
f77a3868a7 MDEV-11816 fixup: Remove an orphan test file 2023-11-06 10:20:51 +02:00
Marko Mäkelä
14685b10df MDEV-32050: Deprecate&ignore innodb_purge_rseg_truncate_frequency
The motivation of introducing the parameter
innodb_purge_rseg_truncate_frequency in
mysql/mysql-server@28bbd66ea5 and
mysql/mysql-server@8fc2120fed
seems to have been to avoid stalls due to freeing undo log pages
or truncating undo log tablespaces. In MariaDB Server,
innodb_undo_log_truncate=ON should be a much lighter operation
than in MySQL, because it will not involve any log checkpoint.

Another source of performance stalls should be
trx_purge_truncate_rseg_history(), which is shrinking the history list
by freeing the undo log pages whose undo records have been purged.
To alleviate that, we will introduce a purge_truncation_task that will
offload this from the purge_coordinator_task. In that way, the next
innodb_purge_batch_size pages may be parsed and purged while the pages
from the previous batch are being freed and the history list being shrunk.

The processing of innodb_undo_log_truncate=ON will still remain the
responsibility of the purge_coordinator_task.

purge_coordinator_state::count: Remove. We will ignore
innodb_purge_rseg_truncate_frequency, and act as if it had been
set to 1 (the maximum shrinking frequency).

purge_coordinator_state::do_purge(): Invoke an asynchronous task
purge_truncation_callback() to free the undo log pages.

purge_sys_t::iterator::free_history(): Free those undo log pages
that have been processed. This used to be a part of
trx_purge_truncate_history().

purge_sys_t::clone_end_view(): Take a new value of purge_sys.head
as a parameter, so that it will be updated while holding exclusive
purge_sys.latch. This is needed for race-free access to the field
in purge_truncation_callback().

Reviewed by: Vladislav Lesin
2023-10-25 09:11:58 +03:00
Marko Mäkelä
9b2a65e41a Merge 11.0 into 11.1 2023-10-19 08:26:16 +03:00
Marko Mäkelä
be24e75229 Merge 10.11 into 11.0 2023-10-19 08:12:16 +03:00
Marko Mäkelä
d5e15424d8 Merge 10.6 into 10.10
The MDEV-29693 conflict resolution is from Monty, as well as is
a bug fix where ANALYZE TABLE wrongly built histograms for
single-column PRIMARY KEY.
Also includes a fix for safe_malloc error reporting.

Other things:
- Copied main.log_slow from 10.4 to avoid mtr issue

Disabled test:
- spider/bugfix.mdev_27239 because we started to get
  +Error	1429 Unable to connect to foreign data source: localhost
  -Error	1158 Got an error reading communication packets
- main.delayed
  - Bug#54332 Deadlock with two connections doing LOCK TABLE+INSERT DELAYED
    This part is disabled for now as it fails randomly with different
    warnings/errors (no corruption).
2023-10-14 13:36:11 +03:00