1
0
mirror of https://github.com/MariaDB/server.git synced 2025-11-09 11:41:36 +03:00
Commit Graph

574 Commits

Author SHA1 Message Date
Thirunarayanan Balathandayuthapani
a390aaaf23 MDEV-36180 Doublewrite recovery of innodb_checksum_algorithm=full_crc32 page_compressed pages does not work
- InnoDB fails to recover the full crc32 page_compressed page
from doublewrite buffer. The reason is that buf_dblwr_t::recover()
fails to identify the space id from the page because the page
has compressed from FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION bytes.

Fix:
===
recv_dblwr_t::find_deferred_page(): Find the page which
has the same page number and try to decompress/decrypt the page
based on the tablespace metadata. After the decompression/decryption,
compare the space id and write the recovered page back to the file.

buf_page_t::read_complete(): Page read from disk is corrupted then
try to read the page from deferred pages in doublewrite buffer.
2025-03-26 12:03:44 +01:00
Sergei Golubchik
066e8d6aea Merge branch '10.5' into 10.6 2025-01-29 11:17:38 +01:00
Igor Babaev
77ea99a4b5 MDEV-35869 Wrong result using degenerated subquery with window function
This bug affected queries containing degenerated single-value subqueries
with window functions. The bug led mostly to wrong results for such
queries.

A subquery is called degenerated if it is of the form (SELECT <expr>...).

For degenerated subqueries of the form (SELECT <expr>) the transformation
   (SELECT <expr>) => <expr>
usually is applied. However if <expr> contains set functions or window
functions such rewriting is not valid for an obvious reason. The code
before this patch erroneously applied the transformation when <expr>
contained window functions and did not contain set functions.

Approved by Rex Johnston <rex.johnston@mariadb.com>
2025-01-23 13:50:29 -08:00
Thirunarayanan Balathandayuthapani
a6ab0e6c0b MDEV-34898 Doublewrite recovery of innodb_checksum_algorithm=full_crc32 encrypted pages does not work
- Use file_key_management encryption plugin instead of
example_key_management_plugin for the encryption.doublewrite_debug
test case
2025-01-16 18:14:49 +05:30
Thirunarayanan Balathandayuthapani
f8cf493290 MDEV-34898 Doublewrite recovery of innodb_checksum_algorithm=full_crc32 encrypted pages does not work
- InnoDB fails to recover the full crc32 encrypted page from
doublewrite buffer. The reason is that buf_dblwr_t::recover()
fails to identify the space id from the page because the page has
been encrypted from FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION bytes.

Fix:
===
buf_dblwr_t::recover(): preserve any pages whose space_id
does not match a known tablespace. These could be encrypted pages
of tablespaces that had been created with
innodb_checksum_algorithm=full_crc32.

buf_page_t::read_complete(): If the page looks corrupted and the
tablespace is encrypted and in full_crc32 format, try to
restore the page from doublewrite buffer.

recv_dblwr_t::recover_encrypted_page(): Find the page which
has the same page number and try to decrypt the page using
space->crypt_data. After decryption, compare the space id.
Write the recovered page back to the file.
2025-01-07 19:33:56 +05:30
Monty
88d9348dfc Remove dates from all rdiff files 2025-01-05 16:40:11 +02:00
Marko Mäkelä
bb47e575de MDEV-34830: LSN in the future is not being treated as serious corruption
The invariant of write-ahead logging is that before any change to a
page is written to the data file, the corresponding log record must
must first have been durably written.

On crash recovery, there were some sloppy checks for this. Let us
implement accurate checks and flag an inconsistency as a hard error,
so that we can avoid further corruption of a corrupted database.
For data extraction from the corrupted database, innodb_force_recovery
can be used.

Before recovery is reading any data pages or invoking
buf_dblwr_t::recover() to recover torn pages from the
doublewrite buffer, InnoDB will have parsed the log until the
final LSN and updated log_sys.lsn to that. So, we can rely on
log_sys.lsn at all times. The doublewrite buffer recovery has been
refactored in such a way that the recv_sys.dblwr.pages may be consulted
while discovering files and their page sizes, but nothing will be
written back to data files before buf_dblwr_t::recover() is invoked.

A section of the test mariabackup.innodb_redo_overwrite
that is parsing some mariadb-backup --backup output has
been removed, because that output "redo log block is overwritten"
would often be missing in a Microsoft Windows environment
as a result of these changes.

recv_max_page_lsn, recv_lsn_checks_on: Remove.

recv_sys_t::validate_checkpoint(): Validate the write-ahead-logging
condition at the end of the recovery.

recv_dblwr_t::validate_page(): Keep track of the maximum LSN
(if we are checking a non-doublewrite copy of a page) but
do not complain LSN being in the future. The doublewrite buffer
is a special case, because it will be read early during recovery.
Besides, starting with commit 762bcb81b5
the dblwr=true copies of pages may legitimately be "too new".

recv_dblwr_t::find_page(): Find a valid page with the smallest
FIL_PAGE_LSN that is in the valid range for recovery.

recv_dblwr_t::restore_first_page(): Replaced by find_page().
Only buf_dblwr_t::recover() will write to data files.

buf_dblwr_t::recover(): Simplify the message output. Do attempt
doublewrite recovery on user page read error. Ignore doublewrite
pages whose FIL_PAGE_LSN is outside the usable bounds. Previously,
we could wrongly recover a too new page from the doublewrite buffer.
It is unlikely that this could have lead to an actual error.
Write back all recovered pages from the doublewrite buffer here,
including for the first page of any tablespace.

buf_page_is_corrupted(): Distinguish the return values
CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER.

buf_page_check_corrupt(): Return the error code DB_CORRUPTION
in case the LSN is in the future.

Datafile::read_first_page(): Handle FSP_SPACE_FLAGS=0xffffffff
in the same way on both 32-bit and 64-bit architectures.

Datafile::read_first_page_flags(): Split from read_first_page().
Take a copy of the first page as a parameter.

recv_sys_t::free_corrupted_page(): Take the file as a parameter
and return whether a message was displayed. This avoids some duplicated
and incomplete error messages.

buf_page_t::read_complete(): Remove some redundant output and always
display the name of the corrupted file. Never return DB_FAIL;
use it only in internal error handling.

IORequest::read_complete(): Assume that buf_page_t::read_complete()
will have reported any error.

fil_space_t::set_corrupted(): Return whether this is the first time
the tablespace had been flagged as corrupted.

Datafile::validate_first_page(), fil_node_open_file_low(),
fil_node_open_file(), fil_space_t::read_page0(),
fil_node_t::read_page0(): Add a parameter for a copy of the
first page, and a parameter to indicate whether the FIL_PAGE_LSN
check should be suppressed. Before buf_dblwr_t::recover() is
invoked, we cannot validate the FIL_PAGE_LSN, but we can trust the
FSP_SPACE_FLAGS and the tablespace ID that may be present in a
potentially too new copy of a page.

Reviewed by: Debarun Banerjee
2024-10-17 17:24:20 +03:00
Marko Mäkelä
7e0afb1c73 Merge 10.5 into 10.6 2024-10-03 09:31:39 +03:00
Lena Startseva
0a5e4a0191 MDEV-31005: Make working cursor-protocol
Updated tests: cases with bugs or which cannot be run
with the cursor-protocol were excluded with
"--disable_cursor_protocol"/"--enable_cursor_protocol"

Fix for v.10.5
2024-09-18 18:39:26 +07:00
Marko Mäkelä
a74bea7ba9 MDEV-34879 InnoDB fails to merge the change buffer to ROW_FORMAT=COMPRESSED tables
buf_page_t::read_complete(): Fix an incorrect condition that had been
added in commit aaef2e1d8c (MDEV-27058).
Also for compressed-only pages we must remember that buffered changes
may exist.

buf_read_page(): Correct the function comment; this is for a synchronous
and not asynchronous read. Pass the parameter unzip=true to
buf_read_page_low(), because each of our callers will be interested in
the uncompressed page frame. This will cause the test
encryption.innodb-compressed-blob to emit more errors when the
correct keys for decrypting the clustered index root page are unavailable.

Reviewed by: Debarun Banerjee
2024-09-12 10:52:55 +03:00
Oleksandr Byelkin
8f020508c8 Merge branch '10.5' into 10.6 2024-08-03 09:04:24 +02:00
Thirunarayanan Balathandayuthapani
533e6d5d13 MDEV-34670 IMPORT TABLESPACE unnecessary traverses tablespace list
Problem:
========
- After the commit ada1074bb1 (MDEV-14398)
fil_crypt_set_encrypt_tables() iterates through all tablespaces to
fill the default_encrypt tables list. This was a trigger to
encrypt or decrypt when key rotation age is set to 0. But import
tablespace does call fil_crypt_set_encrypt_tables() unnecessarily.
The motivation for the call is to signal the encryption threads.

Fix:
====
ha_innobase::discard_or_import_tablespace: Remove the
fil_crypt_set_encrypt_tables() and add the import tablespace
to the default encrypt list if necessary
2024-07-31 14:13:38 +05:30
Daniel Black
0939bfc093 MDEV-19052 main.win postfix --view-protocol compat
Correct compatibility with view-protocol.

Thanks Lena Startseva
2024-07-27 14:11:03 +10:00
Daniel Black
7788593547 MDEV-19052 Range-type window frame supports only numeric datatype
When there is no bounds on the upper or lower part of the window,
it doesn't matter if the type is numeric.

It also doesn't matter how many ORDER BY items there are in the
query.

Reviewers: Sergei Petrunia and Oleg Smirnov
2024-07-25 19:16:37 +10:00
Oleksandr Byelkin
dcd8a64892 Merge branch '10.5' into 10.6 2024-07-03 13:27:23 +02:00
Lena Startseva
9e74a7f4f3 Removing MDEV-27871 from tastcases because it is not a bug 2024-06-28 16:45:50 +07:00
Marko Mäkelä
a687cf8661 Merge 10.5 into 10.6 2024-06-07 10:03:51 +03:00
Igor Babaev
4d38267fc7 MDEV-29307 Wrong result when joining two derived tables over the same view
This bug could affect queries containing a join of derived tables over
grouping views such that one of the derived tables contains a window
function while another uses view V with dependent subquery DSQ containing
a set function aggregated outside of the subquery in the view V. The
subquery also refers to the fields from the group clause of the view.Due to
this bug execution of such queries could produce wrong result sets.

When the fix_fields() method performs context analysis of a set function AF
first, at the very beginning the function Item_sum::init_sum_func_check()
is called. The function copies the pointer to the embedding set function,
if any, stored in THD::LEX::in_sum_func into the corresponding field of the
set function AF simultaneously changing the value of THD::LEX::in_sum_func
to point to AF. When at the very end of the fix_fields() method the function
Item_sum::check_sum_func() is called it is supposed to restore the value
of THD::LEX::in_sum_func to point to the embedding set function. And in
fact Item_sum::check_sum_func() did it, but only for regular set functions,
not for those used in window functions. As a result after the context
analysis of AF had finished THD::LEX::in_sum_func still pointed to AF.
It confused the further context analysis. In particular it led to wrong
resolution of Item_outer_ref objects in the fix_inner_refs() function.
This wrong resolution forced reading the values of grouping fields referred
in DSQ not from the temporary table used for aggregation from which they
were supposed to be read, but from the table used as the source table for
aggregation.

This patch guarantees that the value of THD::LEX::in_sum_func is properly
restored after the call of fix_fields() for any set function.
2024-06-04 17:54:01 -07:00
Marko Mäkelä
bb2e125d07 Merge 10.5 into 10.6
This excludes commit 040069f4ba
because it is specific to innodb_sync_debug, which had been removed
in commit ff5d306e29.
2024-04-18 07:14:56 +03:00
Vladislav Vaintroub
061adae9a2 MDEV-16944 Fix file sharing issues on Windows in mysqltest
On Windows systems, occurrences of ERROR_SHARING_VIOLATION due to
conflicting share modes between processes accessing the same file can
result in CreateFile failures.

mysys' my_open() already incorporates a workaround by implementing
wait/retry logic on Windows.

But this does not help if files are opened using shell redirection like
mysqltest traditionally did it, i.e via

--echo exec "some text" > output_file

In such cases, it is cmd.exe, that opens the output_file, and it
won't do any sharing-violation retries.

This commit addresses the issue by introducing a new built-in command,
'write_line', in mysqltest. This new command serves as a brief alternative
to 'write_file', with a single line output, that also resolves variables
like "exec" would.

Internally, this command will use my_open(), and therefore retry-on-error
logic.

Hopefully this will eliminate the very sporadic "can't open file because
it is used by another process" error on CI.
2024-04-17 16:52:37 +02:00
Sergei Golubchik
f71d7f2f0f Merge branch '10.5' into 10.6 2024-03-13 21:02:34 +01:00
Marko Mäkelä
f703e72bd8 Merge 10.4 into 10.5 2024-03-11 10:08:20 +02:00
Thirunarayanan Balathandayuthapani
8532dd82f1 MDEV-13765 encryption.encrypt_and_grep failed in buildbot with wrong result
- Adjust the test case to check whether all tablespaces
are encrypted by comparing it with existing table count.
2024-03-06 11:57:09 +05:30
Marko Mäkelä
466069b184 Merge 10.5 into 10.6 2024-02-08 10:38:53 +02:00
mariadb-DebarunBanerjee
5e7047067e MDEV-33274 The test encryption.innodb-redo-nokeys often fails
If we fail to open a tablespace while looking for FILE_CHECKPOINT, we
set the corruption flag. Specifically, if encryption key is missing, we
would not be able to open an encrypted tablespace and the flag could be
set. We miss checking for this flag and report "Missing FILE_CHECKPOINT"

Address review comment to improve the test. Flush pages before starting
no-checkpoint block. It should improve the number of cases where the
test is skipped because some intermediate checkpoint is triggered.
2024-02-08 08:13:16 +05:30
Marko Mäkelä
2b01e5103d Merge 10.5 into 10.6 2023-12-19 18:41:42 +02:00
Marko Mäkelä
4ae105a37d Merge 10.4 into 10.5 2023-12-18 08:59:07 +02:00
Marko Mäkelä
7e34bb5ce1 MDEV-11905: Simplify encryption.innodb_encrypt_discard_import
The test was populating unnecessarily large tables and
restarting the server several times for no real reason.

Let us hope that a smaller version of the test will produce more
stable results. Occasionally, some unencrypted contents in the table t2
was revealed in the old test.
2023-12-11 10:31:49 +02:00
Dmitry Shulga
47f2b16a8c MDEV-31296: Crash in Item_func::fix_fields when prepared statement with subqueries and window function is executed with sql_mode = ONLY_FULL_GROUP_BY
Crash was caused by referencing a null pointer on getting
the number of the nesting levels of the set function for the current
select_lex at the method Item_field::fix_fields.

The current select for processing is taken from Name_resolution_context
that filled in at the function set_new_item_local_context() and
where initialization of the data member Name_resolution_context
was mistakenly removed by the commit
  d6ee351bbb
   (Revert "MDEV-24454 Crash at change_item_tree")

To fix the issue, correct initialization of data member
  Name_resolution_context::select_lex
that was removed by the commit d6ee351bbb
is restored.
2023-12-11 14:47:02 +07:00
Marko Mäkelä
5775df0127 MDEV-20142 encryption.innodb_encrypt_temporary_tables fails
The data type of the column INFORMATION_SCHEMA.GLOBAL_STATUS.VARIABLE_VALUE
is a character string. Therefore, if we want to compare some values as
integers, we must explicitly cast them to integer type, to avoid an
awkward comparison where '10'<'9' because the first digit is smaller.
2023-12-10 13:19:21 +02:00
Marko Mäkelä
1ac03fd914 Fix occasional failure of encryption.corrupted_during_recovery 2023-12-04 11:21:58 +02:00
Marko Mäkelä
14685b10df MDEV-32050: Deprecate&ignore innodb_purge_rseg_truncate_frequency
The motivation of introducing the parameter
innodb_purge_rseg_truncate_frequency in
mysql/mysql-server@28bbd66ea5 and
mysql/mysql-server@8fc2120fed
seems to have been to avoid stalls due to freeing undo log pages
or truncating undo log tablespaces. In MariaDB Server,
innodb_undo_log_truncate=ON should be a much lighter operation
than in MySQL, because it will not involve any log checkpoint.

Another source of performance stalls should be
trx_purge_truncate_rseg_history(), which is shrinking the history list
by freeing the undo log pages whose undo records have been purged.
To alleviate that, we will introduce a purge_truncation_task that will
offload this from the purge_coordinator_task. In that way, the next
innodb_purge_batch_size pages may be parsed and purged while the pages
from the previous batch are being freed and the history list being shrunk.

The processing of innodb_undo_log_truncate=ON will still remain the
responsibility of the purge_coordinator_task.

purge_coordinator_state::count: Remove. We will ignore
innodb_purge_rseg_truncate_frequency, and act as if it had been
set to 1 (the maximum shrinking frequency).

purge_coordinator_state::do_purge(): Invoke an asynchronous task
purge_truncation_callback() to free the undo log pages.

purge_sys_t::iterator::free_history(): Free those undo log pages
that have been processed. This used to be a part of
trx_purge_truncate_history().

purge_sys_t::clone_end_view(): Take a new value of purge_sys.head
as a parameter, so that it will be updated while holding exclusive
purge_sys.latch. This is needed for race-free access to the field
in purge_truncation_callback().

Reviewed by: Vladislav Lesin
2023-10-25 09:11:58 +03:00
Thirunarayanan Balathandayuthapani
cbad0bcd41 MDEV-31098 InnoDB Recovery doesn't display encryption message when no encryption configuration passed
- InnoDB fails to report the error when encryption configuration
wasn't passed. This patch addresses the issue by adding
the error while loading the tablespace and deferring the
tablespace creation.
2023-10-13 17:27:27 +05:30
Marko Mäkelä
0dd25f28f7 Merge 10.5 into 10.6 2023-09-11 14:46:39 +03:00
Marko Mäkelä
f8f7d9de2c Merge 10.4 into 10.5 2023-09-11 11:29:31 +03:00
Marko Mäkelä
5299f0c45e MDEV-21664 Add opt files for have_innodb_Xk.inc
Currently include/have_innodb_4k.inc etc. files only check that the
server is running with the corresponding page size. I think it would
be more convenient if they actually enforced the setting.
2023-09-11 09:09:02 +03:00
Sergei Petrunia
6e484c3bd9 MDEV-31577: Make ANALYZE FORMAT=JSON print innodb stats
ANALYZE FORMAT=JSON output now includes table.r_engine_stats which
has the engine statistics. Only non-zero members are printed.

Internally: EXPLAIN data structures Explain_table_acccess and
Explain_update now have handler* handler_for_stats pointer.
It is used to read statistics from handler_for_stats->handler_stats.

The following applies only to 10.9+, backport doesn't use it:

Explain data structures exist after the tables are closed. We avoid
walking invalid pointers using this:
- SQL layer calls Explain_query::notify_tables_are_closed() before
  closing tables.
- After that call, printing of JSON output is disabled. Non-JSON output
  can be printed but we don't access handler_for_stats when doing that.
2023-07-21 16:50:11 +03:00
Marko Mäkelä
5bada1246d Merge 10.5 into 10.6 2023-04-11 16:15:19 +03:00
Oleksandr Byelkin
ac5a534a4c Merge remote-tracking branch '10.4' into 10.5 2023-03-31 21:32:41 +02:00
Marko Mäkelä
c73a65f55b MDEV-29692 Assertion `(writeptr + (i * size)) != local_frame' failed upon IMPORT TABLESPACE
fil_iterate(): Allocation bitmap pages are never encrypted.

Reviewed by: Thirunarayanan Balathandayuthapani
2023-03-21 14:33:54 +02:00
Marko Mäkelä
85cbfaefee Merge 10.5 into 10.6 2023-03-16 15:48:08 +02:00
Oleksandr Byelkin
c3a5cf2b5b Merge branch '10.5' into 10.6 2023-01-31 09:31:42 +01:00
Oleksandr Byelkin
a977054ee0 Merge branch '10.3' into 10.4 2023-01-28 18:22:55 +01:00
Oleksandr Byelkin
7fa02f5c0b Merge branch '10.4' into 10.5 2023-01-27 13:54:14 +01:00
Oleksandr Byelkin
dd24fa3063 Merge branch '10.3' into 10.4 2023-01-26 10:34:26 +01:00
Marko Mäkelä
82b18a8361 MDEV-29374 fixup: Suppress an error in a test 2023-01-25 10:56:07 +02:00
Sergei Petrunia
f18c2b6c8a MDEV-15178: Filesort::make_sortorder: Assertion `pos->field != __null |
(Initial patch by Varun Gupta. Amended and added comments).

When the query has both
1. Aggregate functions that require sorting data by group, and
2. Window functions

we need to use two temporary tables. The first temp.table will hold the
join output.  Then it is passed to filesort(). Reading it in sorted
order allows to compute the aggregate functions.

Then, we need to write their values into the second temp. table. Then,
Window Function computation step can pass that to filesort() and read
them in the order it needs.

Failure to create the second temp. table would cause an assertion
failure: window function could would not find where to get the values
of the aggregate functions.
2023-01-23 18:22:21 +02:00
Marko Mäkelä
a8a5c8a1b8 Merge 10.5 into 10.6 2022-12-13 16:58:58 +02:00
Marko Mäkelä
1dc2f35598 Merge 10.4 into 10.5 2022-12-13 14:39:18 +02:00
Marko Mäkelä
fdf43b5c78 Merge 10.3 into 10.4 2022-12-13 11:37:33 +02:00