mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-08-07 00:04:31 +03:00

Author	SHA1	Message	Date
Thirunarayanan Balathandayuthapani	4f7faa4bc8	MDEV-36650 Unexpected checkpoint in the test innodb.doublewrite innodb.doublewrite: Skip the test case if we get an unexpected checkpoint. This could happen because page cleaner thread could be active after reading the initial checkpoint information.	2025-06-18 18:36:45 +05:30
Marko Mäkelä	900bbbe4a8	MDEV-33295 innodb.doublewrite occasionally fails When the first attempt of XA ROLLBACK is expected to fail, some recovered changes could be written back through the doublewrite buffer. Should that happen, the next recovery attempt (after mangling the data file t1.ibd further) could fail because no copy of the affected pages would be available in the doublewrite buffer. To prevent this from happening, ensure that the doublewrite buffer will not be used and no log checkpoint occurs during the previous failed recovery attempt. Also, let a successful XA ROLLBACK serve the additional purpose of freeing a BLOB page and therefore rewriting page 0, which we must then be able to recover despite induced corruption. In the last restart step, we will tolerate an unexpected checkpoint, because one is frequently occurring on FreeBSD and AIX, despite our efforts to force a buffer pool flush before each "no checkpoint" section.	2025-02-03 08:11:43 +02:00
Marko Mäkelä	bb47e575de	MDEV-34830: LSN in the future is not being treated as serious corruption The invariant of write-ahead logging is that before any change to a page is written to the data file, the corresponding log record must first have been durably written. On crash recovery, there were some sloppy checks for this. Let us implement accurate checks and flag an inconsistency as a hard error, so that we can avoid further corruption of a corrupted database. For data extraction from the corrupted database, innodb_force_recovery can be used. Before recovery is reading any data pages or invoking buf_dblwr_t::recover() to recover torn pages from the doublewrite buffer, InnoDB will have parsed the log until the final LSN and updated log_sys.lsn to that. So, we can rely on log_sys.lsn at all times. The doublewrite buffer recovery has been refactored in such a way that the recv_sys.dblwr.pages may be consulted while discovering files and their page sizes, but nothing will be written back to data files before buf_dblwr_t::recover() is invoked. A section of the test mariabackup.innodb_redo_overwrite that is parsing some mariadb-backup --backup output has been removed, because that output "redo log block is overwritten" would often be missing in a Microsoft Windows environment as a result of these changes. recv_max_page_lsn, recv_lsn_checks_on: Remove. recv_sys_t::validate_checkpoint(): Validate the write-ahead-logging condition at the end of the recovery. recv_dblwr_t::validate_page(): Keep track of the maximum LSN (if we are checking a non-doublewrite copy of a page) but do not complain about the LSN being in the future. The doublewrite buffer is a special case, because it will be read early during recovery. Besides, starting with commit `762bcb81b5` the dblwr=true copies of pages may legitimately be "too new". recv_dblwr_t::find_page(): Find a valid page with the smallest FIL_PAGE_LSN that is in the valid range for recovery.
recv_dblwr_t::restore_first_page(): Replaced by find_page(). Only buf_dblwr_t::recover() will write to data files. buf_dblwr_t::recover(): Simplify the message output. Do attempt doublewrite recovery on user page read error. Ignore doublewrite pages whose FIL_PAGE_LSN is outside the usable bounds. Previously, we could wrongly recover a too new page from the doublewrite buffer. It is unlikely that this could have led to an actual error. Write back all recovered pages from the doublewrite buffer here, including for the first page of any tablespace. buf_page_is_corrupted(): Distinguish the return values CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER. buf_page_check_corrupt(): Return the error code DB_CORRUPTION in case the LSN is in the future. Datafile::read_first_page(): Handle FSP_SPACE_FLAGS=0xffffffff in the same way on both 32-bit and 64-bit architectures. Datafile::read_first_page_flags(): Split from read_first_page(). Take a copy of the first page as a parameter. recv_sys_t::free_corrupted_page(): Take the file as a parameter and return whether a message was displayed. This avoids some duplicated and incomplete error messages. buf_page_t::read_complete(): Remove some redundant output and always display the name of the corrupted file. Never return DB_FAIL; use it only in internal error handling. IORequest::read_complete(): Assume that buf_page_t::read_complete() will have reported any error. fil_space_t::set_corrupted(): Return whether this is the first time the tablespace had been flagged as corrupted. Datafile::validate_first_page(), fil_node_open_file_low(), fil_node_open_file(), fil_space_t::read_page0(), fil_node_t::read_page0(): Add a parameter for a copy of the first page, and a parameter to indicate whether the FIL_PAGE_LSN check should be suppressed. Before buf_dblwr_t::recover() is invoked, we cannot validate the FIL_PAGE_LSN, but we can trust the FSP_SPACE_FLAGS and the tablespace ID that may be present in a potentially too new copy of a page.
Reviewed by: Debarun Banerjee	2024-10-17 17:24:20 +03:00
Thirunarayanan Balathandayuthapani	7573fe8b07	MDEV-32968 InnoDB fails to restore tablespace first page from doublewrite buffer when page is empty recv_dblwr_t::find_first_page(): Free the memory that was allocated to read the first 3 pages from the tablespace. innodb.doublewrite: Added a sleep to ensure that the page cleaner thread wakes up from my_cond_wait	2024-01-19 17:01:36 +05:30
Thirunarayanan Balathandayuthapani	caad34df54	MDEV-32968 InnoDB fails to restore tablespace first page from doublewrite buffer when page is empty - InnoDB fails to find the space id from the page0 of the tablespace. In that case, InnoDB can use the doublewrite buffer to recover the page0 and write it into the file. - buf_dblwr_t::init_or_load_pages(): Load only the pages which are valid (page lsn >= checkpoint). To do that, InnoDB has to open the redo log before the system tablespace and read the latest checkpoint information. recv_dblwr_t::find_first_page(): 1) Iterate the doublewrite buffer pages and find the 0th page 2) Read the tablespace flags, space id from the 0th page. 3) Read the 1st, 2nd and 3rd page from the tablespace file and compare the space id with the space id which is stored in the doublewrite buffer. 4) If it matches then we can write into the file. 5) Return the space which matches the pages from the file. SysTablespace::read_lsn_and_check_flags(): Remove the retry logic for validating the first page. After restoring the first page from the doublewrite buffer, assign tablespace flags by reading the first page. recv_recovery_read_max_checkpoint(): Read the maximum checkpoint information from the log file recv_recovery_from_checkpoint_start(): Avoid reading the checkpoint header information from the log file Datafile::validate_first_page(): Throw an error if the first page validation fails.	2024-01-15 14:08:27 +05:30
Thirunarayanan Balathandayuthapani	d5a6ea36f3	MDEV-32242 innodb.doublewrite test case always gets skipped - Split the doublewrite test into two tests (doublewrite, doublewrite_debug) to reduce the execution time of the test - Removed the big_test tag for the newly added test case - Made the doublewrite test a non-debug test - Added a search pattern to make sure that InnoDB uses the doublewrite buffer - Replaced all kill_mysqld.inc with shutdown_mysqld.inc and zero shutdown timeout - Removed the case where fsp_flags got corrupted, because from commit `3da5d047b8` (MDEV-31851) onwards, the doublewrite buffer removes the conversion of the fsp flags from the buggy 10.1 format Thanks to Marko Mäkelä for providing the non-debug test	2023-12-07 18:44:28 +05:30
Marko Mäkelä	d44a10f4dd	MDEV-23855 follow-up: Make innodb.doublewrite more stable The test innodb.doublewrite could occasionally fail with 64KiB page size because the page 0 would no longer be in the doublewrite buffer. Let us stop purge before the server is killed, and ensure that the entire buffer pool will be flushed before we initiate an extra write of page 0.	2021-05-05 12:51:44 +03:00
Eugene Kosov	9ef2d29ff4	MDEV-14425 deprecate and ignore innodb_log_files_in_group Now there can be only one log file instead of several which logically work as a single file. Possible names of redo log files: ib_logfile0, ib_logfile101 (for just created one) innodb_log_files_in_group: value of this variable is not used by InnoDB. Possible values are still 1..100, to not break upgrade LOG_FILE_NAME: add constant of value "ib_logfile0" LOG_FILE_NAME_PREFIX: add constant of value "ib_logfile" get_log_file_path(): convenience function that returns full path of a redo log file SRV_N_LOG_FILES_MAX: removed srv_n_log_files: we can't remove this for compatibility reasons, but now server doesn't use this variable log_sys_t::file::fd: now just one, not std::vector log_sys_t::log_capacity: removed word 'group' find_and_check_log_file(): part of logic from huge srv_start() moved here recv_sys_t::files: file descriptors of redo log files. There can be several of those in case we're upgrading from older MariaDB version. recv_sys_t::remove_extra_log_files: whether to remove ib_logfile{1,2,3...} after successful upgrade. recv_sys_t::read(): open if needed and read from one of several log files recv_sys_t::files_size(): open if needed and return files count redo_file_sizes_are_correct(): check that redo log files sizes are equal. Just to log an error for a user. Corresponding check was moved from srv0start.cc namespace deprecated: put all deprecated variables here to prevent usage of it by us, developers	2020-02-19 12:21:59 +03:00
Marko Mäkelä	504202bd7f	MDEV-21216: Remove fsp_header_get_space_id() The function fsp_header_get_space_id() returns ulint instead of uint32_t, only to be able to complain that the two adjacent tablespace ID fields in the page differ. Remove the function, and merge the check to the callers. Also, make some more use of aligned_malloc().	2019-12-04 20:01:04 +02:00
Marko Mäkelä	613e9e7d4d	MDEV-20907 Set innodb_log_files_in_group=1 by default Historically, InnoDB split the redo log into at least 2 files. MDEV-12061 allowed the minimum to be innodb_log_files_in_group=1, but it kept the default at innodb_log_files_in_group=2. Because performance seems to be slightly better with only one log file, and because implementing an append-only variant of the log would require a single file, let us define the default to be 1, and have innodb_log_file_size=96M, to retain the same default total size.	2019-10-28 17:11:10 +02:00
Marko Mäkelä	f98bb23168	Merge 10.3 into 10.4	2019-05-29 22:17:00 +03:00
Marko Mäkelä	90a9193685	Merge 10.2 into 10.3	2019-05-29 11:32:46 +03:00
Marko Mäkelä	eeee1832d7	Speed up buildbot by requiring --big-test for some slow tests	2019-05-29 08:28:15 +03:00
Marko Mäkelä	514b305dfb	Merge 10.3 into 10.4 The MDEV-17262 commit `26432e49d3` was skipped. In Galera 4, the implementation would seem to require changes to the streaming replication. In the tests archive.rnd_pos main.profiling, disable_ps_protocol for SHOW STATUS and SHOW PROFILE commands until MDEV-18974 has been fixed.	2019-03-20 10:41:32 +02:00
Sergei Golubchik	b64fde8f38	Merge branch '10.2' into 10.3	2019-03-17 13:06:41 +01:00
Marko Mäkelä	d3afdb1e8f	Datafile::validate_first_page(): Change some ERROR to Note On startup, if the InnoDB doublewrite buffer can be used to recover a corrupted page, raising an ERROR about a recoverable error seems inappropriate. Issue Note instead, and adjust tests accordingly. Also, correctly validate the tablespace ID in the files.	2019-03-14 10:15:50 +02:00
Thirunarayanan Balathandayuthapani	c0f47a4a58	MDEV-12026: Implement innodb_checksum_algorithm=full_crc32 MariaDB data-at-rest encryption (innodb_encrypt_tables) had repurposed the same unused data field that was repurposed in MySQL 5.7 (and MariaDB 10.2) for the Split Sequence Number (SSN) field of SPATIAL INDEX. Because of this, MariaDB was unable to support encryption on SPATIAL INDEX pages. Furthermore, InnoDB page checksums skipped some bytes, and there are multiple variations and checksum algorithms. By default, InnoDB accepts all variations of all algorithms that ever existed. This unnecessarily weakens the page checksums. We hereby introduce two more innodb_checksum_algorithm variants (full_crc32, strict_full_crc32) that are special in a way: When either setting is active, newly created data files will carry a flag (fil_space_t::full_crc32()) that indicates that all pages of the file will use a full CRC-32C checksum over the entire page contents (excluding the bytes where the checksum is stored, at the very end of the page). Such files will always use that checksum, no matter what the parameter innodb_checksum_algorithm is assigned to. For old files, the old checksum algorithms will continue to be used. The value strict_full_crc32 will be equivalent to strict_crc32 and the value full_crc32 will be equivalent to crc32. ROW_FORMAT=COMPRESSED tables will only use the old format. These tables do not support new features, such as larger innodb_page_size or instant ADD/DROP COLUMN. They may be deprecated in the future. We do not want an unnecessary file format change for them. The new full_crc32() format also cleans up the MariaDB tablespace flags. We will reserve flags to store the page_compressed compression algorithm, and to store the compressed payload length, so that checksum can be computed over the compressed (and possibly encrypted) stream and can be validated without decrypting or decompressing the page. 
In the full_crc32 format, there no longer are separate before-encryption and after-encryption checksums for pages. The single checksum is computed on the page contents that is written to the file. We do not make the new algorithm the default for two reasons. First, MariaDB 10.4.2 was a beta release, and the default values of parameters should not change after beta. Second, we did not yet implement the full_crc32 format for page_compressed pages. This will be fixed in MDEV-18644. This is joint work with Marko Mäkelä.	2019-02-19 18:50:19 +02:00
Marko Mäkelä	df51dc28f5	Fix tests for innodb_checksum_algorithm=strict_crc32 In tests that directly write InnoDB data file pages, compute the innodb_checksum_algorithm=crc32 checksums, instead of writing the 0xdeadbeef value used by innodb_checksum_algorithm=none. In this way, these tests will not cause failures when executing ./mtr --mysqld=--loose-innodb-checksum-algorithm=strict_crc32	2019-02-16 12:06:52 +02:00
Marko Mäkelä	2d8fdfbde5	Merge 10.1 into 10.2 Replace have_innodb_zip.inc with innodb_page_size_small.inc.	2017-06-08 12:45:08 +03:00
Marko Mäkelä	fbeb9489cd	Cleanup of MDEV-12600: crash during install_db with innodb_page_size=32K and ibdata1=3M The doublewrite buffer pages must fit in the first InnoDB system tablespace data file. The checks that were added in the initial patch (commit `112b21da37`) were at too high level and did not cover all cases. innodb.log_data_file_size: Test all innodb_page_size combinations. fsp_header_init(): Never return an error. Move the change buffer creation to the only caller that needs to do it. btr_create(): Clean up the logic. Remove the error log messages. buf_dblwr_create(): Try to return an error on non-fatal failure. Check that the first data file is big enough for creating the doublewrite buffers. buf_dblwr_process(): Check if the doublewrite buffer is available. Display the message only if it is available. recv_recovery_from_checkpoint_start_func(): Remove a redundant message about FIL_PAGE_FILE_FLUSH_LSN mismatch when crash recovery has already been initiated. fil_report_invalid_page_access(): Simplify the message. fseg_create_general(): Do not emit messages to the error log. innobase_init(): Revert the changes. trx_rseg_create(): Refactor (no functional change).	2017-06-08 11:55:47 +03:00
Marko Mäkelä	0f34160d1d	Clean up a few tests that kill the server. As noted in MDEV-8841, any test that kills the server must issue FLUSH TABLES, so that tables of crash-unsafe storage engines will not be corrupted. Consistently issue this statement after any mtr.add_suppression() calls. Also, do not invoke shutdown_server directly, but use helpers instead.	2017-01-27 17:07:45 +02:00
Marko Mäkelä	1ebfeceeb2	Merge 10.0 into 10.1 (test-only changes) Adjust the 10.1 tests innodb.doublewrite and innodb.101_compatibility in the same way.	2017-01-27 16:34:09 +02:00
Marko Mäkelä	b05bf8ff0f	Merge 10.1 to 10.2. Most notably, this includes MDEV-11623, which includes a fix and an upgrade procedure for the InnoDB file format incompatibility that is present in MariaDB Server 10.1.0 through 10.1.20. In other words, this merge should address MDEV-11202 InnoDB 10.1 -> 10.2 migration does not work	2017-01-19 12:06:13 +02:00
Marko Mäkelä	7e3f3deb41	MDEV-11623 follow-up: Adjust tests. innodb.doublewrite: Similar to what was done to innodb.101_compatibility, add an explicit $_ parameter to the Perl unpack function. Also, fix some diagnostic messages in the Perl code. innodb.innodb-wl5522-debug: Adjust for the changed error codes and messages on fault injection.	2017-01-16 11:23:12 +02:00
Marko Mäkelä	ab1e6fefd8	MDEV-11623 MariaDB 10.1 fails to start datadir created with MariaDB 10.0/MySQL 5.6 using innodb-page-size!=16K The storage format of FSP_SPACE_FLAGS was accidentally broken already in MariaDB 10.1.0. This fix is bringing the format in line with other MySQL and MariaDB release series. Please refer to the comments that were added to fsp0fsp.h for details. This is an INCOMPATIBLE CHANGE that affects users of page_compression and non-default innodb_page_size. Upgrading to this release will correct the flags in the data files. If you want to downgrade to earlier MariaDB 10.1.x, please refer to the test innodb.101_compatibility how to reset the FSP_SPACE_FLAGS in the files. NOTE: MariaDB 10.1.0 to 10.1.20 can misinterpret uncompressed data files with innodb_page_size=4k or 64k as compressed innodb_page_size=16k files, and then probably fail when trying to access the pages. See the comments in the function fsp_flags_convert_from_101() for detailed analysis. Move PAGE_COMPRESSION to FSP_SPACE_FLAGS bit position 16. In this way, compressed innodb_page_size=16k tablespaces will not be mistaken for uncompressed ones by MariaDB 10.1.0 to 10.1.20. Derive PAGE_COMPRESSION_LEVEL, ATOMIC_WRITES and DATA_DIR from the dict_table_t::flags when the table is available, in fil_space_for_table_exists_in_mem() or fil_open_single_table_tablespace(). During crash recovery, fil_load_single_table_tablespace() will use innodb_compression_level for the PAGE_COMPRESSION_LEVEL. FSP_FLAGS_MEM_MASK: A bitmap of the memory-only fil_space_t::flags that are not to be written to FSP_SPACE_FLAGS. Currently, these will include PAGE_COMPRESSION_LEVEL, ATOMIC_WRITES and DATA_DIR. Introduce the macro FSP_FLAGS_PAGE_SSIZE(). We only support one innodb_page_size for the whole instance. When creating a dummy tablespace for the redo log, use fil_space_t::flags=0. The flags are never written to the redo log files. Remove many FSP_FLAGS_SET_ macros. dict_tf_verify_flags(): Remove. 
This is basically only duplicating the logic of dict_tf_to_fsp_flags(), used in a debug assertion. fil_space_t::mark: Remove. This flag was not used for anything. fil_space_for_table_exists_in_mem(): Remove the unnecessary parameter mark_space, and add a parameter for table flags. Check that fil_space_t::flags match the table flags, and adjust the (memory-only) flags based on the table flags. fil_node_open_file(): Remove some redundant or unreachable conditions, do not use stderr for output, and avoid unnecessary server aborts. fil_user_tablespace_restore_page(): Convert the flags, so that the correct page_size will be used when restoring a page from the doublewrite buffer. fil_space_get_page_compressed(), fsp_flags_is_page_compressed(): Remove. It suffices to have fil_space_is_page_compressed(). FSP_FLAGS_WIDTH_DATA_DIR, FSP_FLAGS_WIDTH_PAGE_COMPRESSION_LEVEL, FSP_FLAGS_WIDTH_ATOMIC_WRITES: Remove, because these flags do not exist in the FSP_SPACE_FLAGS but only in memory. fsp_flags_try_adjust(): New function, to adjust the FSP_SPACE_FLAGS in page 0. Called by fil_open_single_table_tablespace(), fil_space_for_table_exists_in_mem(), innobase_start_or_create_for_mysql() except if --innodb-read-only is active. fsp_flags_is_valid(ulint): Reimplement from scratch, with accurate comments. Do not display any details of detected inconsistencies, because the output could be confusing when dealing with MariaDB 10.1.x data files. fsp_flags_convert_from_101(ulint): Convert flags from buggy MariaDB 10.1.x format, or return ULINT_UNDEFINED if the flags cannot be in MariaDB 10.1.x format. fsp_flags_match(): Check the flags when probing files. Implemented based on fsp_flags_is_valid() and fsp_flags_convert_from_101(). dict_check_tablespaces_and_store_max_id(): Do not access the page after committing the mini-transaction. IMPORT TABLESPACE fixes: AbstractCallback::init(): Convert the flags. FetchIndexRootPages::operator(): Check that the tablespace flags match the table flags.
Do not attempt to convert tablespace flags to table flags, because the conversion would necessarily be lossy. PageConverter::update_header(): Write back the correct flags. This takes care of the flags in IMPORT TABLESPACE.	2017-01-15 19:05:50 +02:00
Marko Mäkelä	f493e395b0	Make the test work with any innodb_page_size.	2016-12-30 09:51:11 +02:00
Marko Mäkelä	341c375d4b	Merge 10.1 into 10.2	2016-12-30 08:53:54 +02:00
Marko Mäkelä	195241e125	Port the test innodb.doublewrite from MySQL 5.7.	2016-12-20 15:03:56 +02:00
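
The page selection that the MDEV-34830 entry describes for recv_dblwr_t::find_page() (pick a valid copy with the smallest FIL_PAGE_LSN inside the usable range) can be illustrated with a minimal Python sketch. This is not InnoDB code: the PageCopy class, its checksum_ok field, and the use of a checkpoint LSN and an end-of-log LSN as the two bounds are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class PageCopy:
    """Hypothetical stand-in for one page copy held in the doublewrite buffer."""
    page_no: int
    lsn: int           # plays the role of FIL_PAGE_LSN
    checksum_ok: bool  # outcome of an assumed page validation step


def find_page(copies: Iterable[PageCopy], page_no: int,
              min_lsn: int, max_lsn: int) -> Optional[PageCopy]:
    """Return the valid copy of page_no with the smallest LSN in
    [min_lsn, max_lsn], or None if no copy qualifies."""
    usable = [c for c in copies
              if c.page_no == page_no
              and c.checksum_ok
              and min_lsn <= c.lsn <= max_lsn]
    # Copies that are too old, too new, or fail validation were filtered out above.
    return min(usable, key=lambda c: c.lsn, default=None)
```

Under these assumptions, a copy whose LSN lies beyond max_lsn is simply ignored rather than restored, which mirrors the "outside the usable bounds" rule stated in the commit message.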

28 Commits
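
The numbered steps listed for recv_dblwr_t::find_first_page() in the MDEV-32968 entry can be sketched roughly as follows. This is an illustrative Python model, not the server's implementation: the (page_no, space_id) tuple layout and the file_space_ids argument are hypothetical, and the sketch assumes that a match against any one of pages 1 to 3 is sufficient, which the commit message does not spell out.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical layout: (page_no, space_id) for each page in the doublewrite buffer.
DblwrPage = Tuple[int, int]


def find_first_page(dblwr_pages: Sequence[DblwrPage],
                    file_space_ids: Sequence[Optional[int]]) -> Optional[int]:
    """file_space_ids holds the space IDs read from pages 1, 2 and 3 of the
    data file (None where a page could not be read). Return the space ID of a
    doublewrite copy of page 0 that matches the file, else None."""
    for page_no, space_id in dblwr_pages:
        if page_no != 0:
            continue  # step 1: only copies of page 0 are of interest
        # steps 2-3: compare the ID stored in the copy with the IDs in the file
        if any(sid == space_id for sid in file_space_ids):
            return space_id  # steps 4-5: this copy may be written back
    return None
```

For example, find_first_page([(0, 5), (0, 9)], [None, 9, 9]) would select the copy that belongs to space 9 and skip the copy for space 5.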