1
0
mirror of https://github.com/MariaDB/server.git synced 2025-11-09 11:41:36 +03:00
Commit Graph

472 Commits

Author SHA1 Message Date
Thirunarayanan Balathandayuthapani
a390aaaf23 MDEV-36180 Doublewrite recovery of innodb_checksum_algorithm=full_crc32 page_compressed pages does not work
- InnoDB fails to recover the full crc32 page_compressed page
from doublewrite buffer. The reason is that buf_dblwr_t::recover()
fails to identify the space id from the page because the page
has compressed from FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION bytes.

Fix:
===
recv_dblwr_t::find_deferred_page(): Find the page which
has the same page number and try to decompress/decrypt the page
based on the tablespace metadata. After the decompression/decryption,
compare the space id and write the recovered page back to the file.

buf_page_t::read_complete(): Page read from disk is corrupted then
try to read the page from deferred pages in doublewrite buffer.
2025-03-26 12:03:44 +01:00
Sergei Golubchik
066e8d6aea Merge branch '10.5' into 10.6 2025-01-29 11:17:38 +01:00
Igor Babaev
77ea99a4b5 MDEV-35869 Wrong result using degenerated subquery with window function
This bug affected queries containing degenerated single-value subqueries
with window functions. The bug led mostly to wrong results for such
queries.

A subquery is called degenerated if it is of the form (SELECT <expr>...).

For degenerated subqueries of the form (SELECT <expr>) the transformation
   (SELECT <expr>) => <expr>
usually is applied. However if <expr> contains set functions or window
functions such rewriting is not valid for an obvious reason. The code
before this patch erroneously applied the transformation when <expr>
contained window functions and did not contain set functions.

Approved by Rex Johnston <rex.johnston@mariadb.com>
2025-01-23 13:50:29 -08:00
Thirunarayanan Balathandayuthapani
f8cf493290 MDEV-34898 Doublewrite recovery of innodb_checksum_algorithm=full_crc32 encrypted pages does not work
- InnoDB fails to recover the full crc32 encrypted page from
doublewrite buffer. The reason is that buf_dblwr_t::recover()
fails to identify the space id from the page because the page has
been encrypted from FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION bytes.

Fix:
===
buf_dblwr_t::recover(): preserve any pages whose space_id
does not match a known tablespace. These could be encrypted pages
of tablespaces that had been created with
innodb_checksum_algorithm=full_crc32.

buf_page_t::read_complete(): If the page looks corrupted and the
tablespace is encrypted and in full_crc32 format, try to
restore the page from doublewrite buffer.

recv_dblwr_t::recover_encrypted_page(): Find the page which
has the same page number and try to decrypt the page using
space->crypt_data. After decryption, compare the space id.
Write the recovered page back to the file.
2025-01-07 19:33:56 +05:30
Monty
88d9348dfc Remove dates from all rdiff files 2025-01-05 16:40:11 +02:00
Marko Mäkelä
bb47e575de MDEV-34830: LSN in the future is not being treated as serious corruption
The invariant of write-ahead logging is that before any change to a
page is written to the data file, the corresponding log record must
must first have been durably written.

On crash recovery, there were some sloppy checks for this. Let us
implement accurate checks and flag an inconsistency as a hard error,
so that we can avoid further corruption of a corrupted database.
For data extraction from the corrupted database, innodb_force_recovery
can be used.

Before recovery is reading any data pages or invoking
buf_dblwr_t::recover() to recover torn pages from the
doublewrite buffer, InnoDB will have parsed the log until the
final LSN and updated log_sys.lsn to that. So, we can rely on
log_sys.lsn at all times. The doublewrite buffer recovery has been
refactored in such a way that the recv_sys.dblwr.pages may be consulted
while discovering files and their page sizes, but nothing will be
written back to data files before buf_dblwr_t::recover() is invoked.

A section of the test mariabackup.innodb_redo_overwrite
that is parsing some mariadb-backup --backup output has
been removed, because that output "redo log block is overwritten"
would often be missing in a Microsoft Windows environment
as a result of these changes.

recv_max_page_lsn, recv_lsn_checks_on: Remove.

recv_sys_t::validate_checkpoint(): Validate the write-ahead-logging
condition at the end of the recovery.

recv_dblwr_t::validate_page(): Keep track of the maximum LSN
(if we are checking a non-doublewrite copy of a page) but
do not complain LSN being in the future. The doublewrite buffer
is a special case, because it will be read early during recovery.
Besides, starting with commit 762bcb81b5
the dblwr=true copies of pages may legitimately be "too new".

recv_dblwr_t::find_page(): Find a valid page with the smallest
FIL_PAGE_LSN that is in the valid range for recovery.

recv_dblwr_t::restore_first_page(): Replaced by find_page().
Only buf_dblwr_t::recover() will write to data files.

buf_dblwr_t::recover(): Simplify the message output. Do attempt
doublewrite recovery on user page read error. Ignore doublewrite
pages whose FIL_PAGE_LSN is outside the usable bounds. Previously,
we could wrongly recover a too new page from the doublewrite buffer.
It is unlikely that this could have lead to an actual error.
Write back all recovered pages from the doublewrite buffer here,
including for the first page of any tablespace.

buf_page_is_corrupted(): Distinguish the return values
CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER.

buf_page_check_corrupt(): Return the error code DB_CORRUPTION
in case the LSN is in the future.

Datafile::read_first_page(): Handle FSP_SPACE_FLAGS=0xffffffff
in the same way on both 32-bit and 64-bit architectures.

Datafile::read_first_page_flags(): Split from read_first_page().
Take a copy of the first page as a parameter.

recv_sys_t::free_corrupted_page(): Take the file as a parameter
and return whether a message was displayed. This avoids some duplicated
and incomplete error messages.

buf_page_t::read_complete(): Remove some redundant output and always
display the name of the corrupted file. Never return DB_FAIL;
use it only in internal error handling.

IORequest::read_complete(): Assume that buf_page_t::read_complete()
will have reported any error.

fil_space_t::set_corrupted(): Return whether this is the first time
the tablespace had been flagged as corrupted.

Datafile::validate_first_page(), fil_node_open_file_low(),
fil_node_open_file(), fil_space_t::read_page0(),
fil_node_t::read_page0(): Add a parameter for a copy of the
first page, and a parameter to indicate whether the FIL_PAGE_LSN
check should be suppressed. Before buf_dblwr_t::recover() is
invoked, we cannot validate the FIL_PAGE_LSN, but we can trust the
FSP_SPACE_FLAGS and the tablespace ID that may be present in a
potentially too new copy of a page.

Reviewed by: Debarun Banerjee
2024-10-17 17:24:20 +03:00
Marko Mäkelä
a74bea7ba9 MDEV-34879 InnoDB fails to merge the change buffer to ROW_FORMAT=COMPRESSED tables
buf_page_t::read_complete(): Fix an incorrect condition that had been
added in commit aaef2e1d8c (MDEV-27058).
Also for compressed-only pages we must remember that buffered changes
may exist.

buf_read_page(): Correct the function comment; this is for a synchronous
and not asynchronous read. Pass the parameter unzip=true to
buf_read_page_low(), because each of our callers will be interested in
the uncompressed page frame. This will cause the test
encryption.innodb-compressed-blob to emit more errors when the
correct keys for decrypting the clustered index root page are unavailable.

Reviewed by: Debarun Banerjee
2024-09-12 10:52:55 +03:00
Oleksandr Byelkin
8f020508c8 Merge branch '10.5' into 10.6 2024-08-03 09:04:24 +02:00
Thirunarayanan Balathandayuthapani
533e6d5d13 MDEV-34670 IMPORT TABLESPACE unnecessary traverses tablespace list
Problem:
========
- After the commit ada1074bb1 (MDEV-14398)
fil_crypt_set_encrypt_tables() iterates through all tablespaces to
fill the default_encrypt tables list. This was a trigger to
encrypt or decrypt when key rotation age is set to 0. But import
tablespace does call fil_crypt_set_encrypt_tables() unnecessarily.
The motivation for the call is to signal the encryption threads.

Fix:
====
ha_innobase::discard_or_import_tablespace: Remove the
fil_crypt_set_encrypt_tables() and add the import tablespace
to the default encrypt list if necessary
2024-07-31 14:13:38 +05:30
Daniel Black
0939bfc093 MDEV-19052 main.win postfix --view-protocol compat
Correct compatibility with view-protocol.

Thanks Lena Startseva
2024-07-27 14:11:03 +10:00
Daniel Black
7788593547 MDEV-19052 Range-type window frame supports only numeric datatype
When there is no bounds on the upper or lower part of the window,
it doesn't matter if the type is numeric.

It also doesn't matter how many ORDER BY items there are in the
query.

Reviewers: Sergei Petrunia and Oleg Smirnov
2024-07-25 19:16:37 +10:00
Oleksandr Byelkin
dcd8a64892 Merge branch '10.5' into 10.6 2024-07-03 13:27:23 +02:00
Lena Startseva
9e74a7f4f3 Removing MDEV-27871 from tastcases because it is not a bug 2024-06-28 16:45:50 +07:00
Marko Mäkelä
a687cf8661 Merge 10.5 into 10.6 2024-06-07 10:03:51 +03:00
Igor Babaev
4d38267fc7 MDEV-29307 Wrong result when joining two derived tables over the same view
This bug could affect queries containing a join of derived tables over
grouping views such that one of the derived tables contains a window
function while another uses view V with dependent subquery DSQ containing
a set function aggregated outside of the subquery in the view V. The
subquery also refers to the fields from the group clause of the view.Due to
this bug execution of such queries could produce wrong result sets.

When the fix_fields() method performs context analysis of a set function AF
first, at the very beginning the function Item_sum::init_sum_func_check()
is called. The function copies the pointer to the embedding set function,
if any, stored in THD::LEX::in_sum_func into the corresponding field of the
set function AF simultaneously changing the value of THD::LEX::in_sum_func
to point to AF. When at the very end of the fix_fields() method the function
Item_sum::check_sum_func() is called it is supposed to restore the value
of THD::LEX::in_sum_func to point to the embedding set function. And in
fact Item_sum::check_sum_func() did it, but only for regular set functions,
not for those used in window functions. As a result after the context
analysis of AF had finished THD::LEX::in_sum_func still pointed to AF.
It confused the further context analysis. In particular it led to wrong
resolution of Item_outer_ref objects in the fix_inner_refs() function.
This wrong resolution forced reading the values of grouping fields referred
in DSQ not from the temporary table used for aggregation from which they
were supposed to be read, but from the table used as the source table for
aggregation.

This patch guarantees that the value of THD::LEX::in_sum_func is properly
restored after the call of fix_fields() for any set function.
2024-06-04 17:54:01 -07:00
Marko Mäkelä
466069b184 Merge 10.5 into 10.6 2024-02-08 10:38:53 +02:00
mariadb-DebarunBanerjee
5e7047067e MDEV-33274 The test encryption.innodb-redo-nokeys often fails
If we fail to open a tablespace while looking for FILE_CHECKPOINT, we
set the corruption flag. Specifically, if encryption key is missing, we
would not be able to open an encrypted tablespace and the flag could be
set. We miss checking for this flag and report "Missing FILE_CHECKPOINT"

Address review comment to improve the test. Flush pages before starting
no-checkpoint block. It should improve the number of cases where the
test is skipped because some intermediate checkpoint is triggered.
2024-02-08 08:13:16 +05:30
Marko Mäkelä
2b01e5103d Merge 10.5 into 10.6 2023-12-19 18:41:42 +02:00
Marko Mäkelä
4ae105a37d Merge 10.4 into 10.5 2023-12-18 08:59:07 +02:00
Marko Mäkelä
7e34bb5ce1 MDEV-11905: Simplify encryption.innodb_encrypt_discard_import
The test was populating unnecessarily large tables and
restarting the server several times for no real reason.

Let us hope that a smaller version of the test will produce more
stable results. Occasionally, some unencrypted contents in the table t2
was revealed in the old test.
2023-12-11 10:31:49 +02:00
Dmitry Shulga
47f2b16a8c MDEV-31296: Crash in Item_func::fix_fields when prepared statement with subqueries and window function is executed with sql_mode = ONLY_FULL_GROUP_BY
Crash was caused by referencing a null pointer on getting
the number of the nesting levels of the set function for the current
select_lex at the method Item_field::fix_fields.

The current select for processing is taken from Name_resolution_context
that filled in at the function set_new_item_local_context() and
where initialization of the data member Name_resolution_context
was mistakenly removed by the commit
  d6ee351bbb
   (Revert "MDEV-24454 Crash at change_item_tree")

To fix the issue, correct initialization of data member
  Name_resolution_context::select_lex
that was removed by the commit d6ee351bbb
is restored.
2023-12-11 14:47:02 +07:00
Marko Mäkelä
5775df0127 MDEV-20142 encryption.innodb_encrypt_temporary_tables fails
The data type of the column INFORMATION_SCHEMA.GLOBAL_STATUS.VARIABLE_VALUE
is a character string. Therefore, if we want to compare some values as
integers, we must explicitly cast them to integer type, to avoid an
awkward comparison where '10'<'9' because the first digit is smaller.
2023-12-10 13:19:21 +02:00
Marko Mäkelä
1ac03fd914 Fix occasional failure of encryption.corrupted_during_recovery 2023-12-04 11:21:58 +02:00
Thirunarayanan Balathandayuthapani
cbad0bcd41 MDEV-31098 InnoDB Recovery doesn't display encryption message when no encryption configuration passed
- InnoDB fails to report the error when encryption configuration
wasn't passed. This patch addresses the issue by adding
the error while loading the tablespace and deferring the
tablespace creation.
2023-10-13 17:27:27 +05:30
Marko Mäkelä
0dd25f28f7 Merge 10.5 into 10.6 2023-09-11 14:46:39 +03:00
Marko Mäkelä
f8f7d9de2c Merge 10.4 into 10.5 2023-09-11 11:29:31 +03:00
Marko Mäkelä
5299f0c45e MDEV-21664 Add opt files for have_innodb_Xk.inc
Currently include/have_innodb_4k.inc etc. files only check that the
server is running with the corresponding page size. I think it would
be more convenient if they actually enforced the setting.
2023-09-11 09:09:02 +03:00
Sergei Petrunia
6e484c3bd9 MDEV-31577: Make ANALYZE FORMAT=JSON print innodb stats
ANALYZE FORMAT=JSON output now includes table.r_engine_stats which
has the engine statistics. Only non-zero members are printed.

Internally: EXPLAIN data structures Explain_table_acccess and
Explain_update now have handler* handler_for_stats pointer.
It is used to read statistics from handler_for_stats->handler_stats.

The following applies only to 10.9+, backport doesn't use it:

Explain data structures exist after the tables are closed. We avoid
walking invalid pointers using this:
- SQL layer calls Explain_query::notify_tables_are_closed() before
  closing tables.
- After that call, printing of JSON output is disabled. Non-JSON output
  can be printed but we don't access handler_for_stats when doing that.
2023-07-21 16:50:11 +03:00
Marko Mäkelä
5bada1246d Merge 10.5 into 10.6 2023-04-11 16:15:19 +03:00
Oleksandr Byelkin
ac5a534a4c Merge remote-tracking branch '10.4' into 10.5 2023-03-31 21:32:41 +02:00
Marko Mäkelä
c73a65f55b MDEV-29692 Assertion `(writeptr + (i * size)) != local_frame' failed upon IMPORT TABLESPACE
fil_iterate(): Allocation bitmap pages are never encrypted.

Reviewed by: Thirunarayanan Balathandayuthapani
2023-03-21 14:33:54 +02:00
Oleksandr Byelkin
c3a5cf2b5b Merge branch '10.5' into 10.6 2023-01-31 09:31:42 +01:00
Oleksandr Byelkin
a977054ee0 Merge branch '10.3' into 10.4 2023-01-28 18:22:55 +01:00
Oleksandr Byelkin
7fa02f5c0b Merge branch '10.4' into 10.5 2023-01-27 13:54:14 +01:00
Oleksandr Byelkin
dd24fa3063 Merge branch '10.3' into 10.4 2023-01-26 10:34:26 +01:00
Marko Mäkelä
82b18a8361 MDEV-29374 fixup: Suppress an error in a test 2023-01-25 10:56:07 +02:00
Sergei Petrunia
f18c2b6c8a MDEV-15178: Filesort::make_sortorder: Assertion `pos->field != __null |
(Initial patch by Varun Gupta. Amended and added comments).

When the query has both
1. Aggregate functions that require sorting data by group, and
2. Window functions

we need to use two temporary tables. The first temp.table will hold the
join output.  Then it is passed to filesort(). Reading it in sorted
order allows to compute the aggregate functions.

Then, we need to write their values into the second temp. table. Then,
Window Function computation step can pass that to filesort() and read
them in the order it needs.

Failure to create the second temp. table would cause an assertion
failure: window function could would not find where to get the values
of the aggregate functions.
2023-01-23 18:22:21 +02:00
Marko Mäkelä
a8a5c8a1b8 Merge 10.5 into 10.6 2022-12-13 16:58:58 +02:00
Marko Mäkelä
1dc2f35598 Merge 10.4 into 10.5 2022-12-13 14:39:18 +02:00
Marko Mäkelä
fdf43b5c78 Merge 10.3 into 10.4 2022-12-13 11:37:33 +02:00
Marko Mäkelä
782b2a7500 MDEV-29144 ER_TABLE_SCHEMA_MISMATCH or crash on DISCARD/IMPORT
mysql_discard_or_import_tablespace(): On successful
ALTER TABLE...DISCARD TABLESPACE, evict the table handle from the
table definition cache, so that ha_innobase::close() will be invoked,
like InnoDB expects to be the case. This will avoid an assertion failure
ut_a(table->get_ref_count() == 0) during IMPORT TABLESPACE.

ha_innobase::open(): Do not issue any ER_TABLESPACE_DISCARDED warning.
Member functions for DML will do that.

ha_innobase::truncate(), ha_innobase::check_if_supported_inplace_alter():
Issue ER_TABLESPACE_DISCARDED warnings, to compensate for the removal of
the warning in ha_innobase::open().

row_quiesce_write_indexes(): Only write information about committed
indexes. The ALTER TABLE t NOWAIT ADD INDEX(c) in the nondeterministic
test case will most of the time fail due to a metadata lock (MDL) timeout
and leave behind an uncommitted index.

Reviewed by: Sergei Golubchik
2022-12-09 10:42:19 +02:00
Daniel Black
072b3668ca MDEV-28206: SIGSEGV in Item_field::fix_fields when using LEAD...OVER
thd->lex->in_sum_func->max_arg_level cannot be set to a
bigger value of select->nest_level if select is null.
2022-12-02 17:22:04 +11:00
Marko Mäkelä
c59985fcf5 Merge 10.5 into 10.6 2022-11-30 07:06:41 +02:00
Marko Mäkelä
846112ce36 MDEV-24412: Create a separate test
Some builders in our CI, most notably FreeBSD and IBM AIX, do not support
sparse files. Also, Microsoft Windows requires special means for creating
sparse files. Since these platforms do not run ./mtr --big-test, we will
for now simply move the test to a separate file that requires that option.
2022-11-30 06:57:32 +02:00
Thirunarayanan Balathandayuthapani
bb29712b45 MDEV-30119 INFORMATION_SCHEMA.INNODB_TABLESPACES_ENCRYPTION.NAME is NULL for undo tablespaces
- Information_schema.innodb_tablespaces_encryption should print
undo tablespace name as innodb_undo001, innodb_undo002 and soon.

- Encryption test should include undo tablespaces count when
the tests are waiting for the condition to check whether all
tables are encrypted or decrypted.
2022-11-29 19:49:53 +05:30
Marko Mäkelä
fdc582fd98 Merge 10.5 into 10.6 2022-11-28 12:20:17 +02:00
Marko Mäkelä
bd694bb7b2 MDEV-24412 InnoDB: Upgrade after a crash is not supported
recv_log_recover_10_4(): Widen the operand of bitwise and to 64 bits,
so that the upgrade check will work when the redo log record is located
more than 4 gigabytes from the start of the first file.
2022-11-28 11:56:09 +02:00
Marko Mäkelä
6d40274f65 Merge 10.5 into 10.6 2022-11-23 18:13:28 +02:00
Marko Mäkelä
cff9939d09 MDEV-30068 Confusing error message when encryption is not available on recovery
fil_name_process(): If fil_ibd_load() returns FIL_LOAD_INVALID,
display the file name and the tablespace identifier.
2022-11-22 15:31:12 +02:00
Marko Mäkelä
9aea7d83c8 Merge 10.5 into 10.6 2022-11-17 08:37:35 +02:00