In commit 1c55b845e0 (MDEV-32932) the
test mariabackup.innodb_ddl_on_intermediate_table was introduced but
disabled.
xb_load_single_table_tablespace(): Properly handle missing FTS_ tables.
backup_file_op_fail(): Properly handle FILE_DELETE records.
The invariant of write-ahead logging is that before any change to a
page is written to the data file, the corresponding log record must
must first have been durably written.
In crash recovery, there were some sloppy checks for this. Let us
implement accurate checks and flag an inconsistency as a hard error,
so that we can avoid further corruption of a corrupted database.
For data extraction from the corrupted database, innodb_force_recovery
can be used.
Before recovery is reading any data pages or invoking
buf_dblwr_t::recover() to recover torn pages from the
doublewrite buffer, InnoDB will have parsed the log until the
final LSN and updated log_sys.lsn to that. So, we can rely on
log_sys.lsn at all times. The doublewrite buffer recovery has been
refactored in such a way that the recv_sys.dblwr.pages may be consulted
while discovering files and their page sizes, but nothing will be
written back to data files before buf_dblwr_t::recover() is invoked.
recv_max_page_lsn, recv_lsn_checks_on: Remove.
recv_sys_t::validate_checkpoint(): Validate the write-ahead-logging
condition at the end of the recovery.
recv_dblwr_t::validate_page(): Keep track of the maximum LSN
(if we are checking a non-doublewrite copy of a page) but
do not complain LSN being in the future. The doublewrite buffer
is a special case, because it will be read early during recovery.
Besides, starting with commit 762bcb81b5
the dblwr=true copies of pages may legitimately be "too new".
recv_dblwr_t::find_page(): Find a valid page with the smallest
FIL_PAGE_LSN that is in the valid range for recovery.
recv_dblwr_t::restore_first_page(): Replaced by find_page().
Only buf_dblwr_t::recover() will write to data files.
buf_dblwr_t::recover(): Simplify the message output. Do attempt
doublewrite recovery on user page read error. Ignore doublewrite
pages whose FIL_PAGE_LSN is outside the usable bounds. Previously,
we could wrongly recover a too new page from the doublewrite buffer.
It is unlikely that this could have lead to an actual error.
Write back all recovered pages from the doublewrite buffer here,
including for the first page of any tablespace.
buf_page_is_corrupted(): Distinguish the return values
CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER.
buf_page_check_corrupt(): Return the error code DB_CORRUPTION
in case the LSN is in the future.
Datafile::read_first_page_flags(): Split from read_first_page().
Take a copy of the first page as a parameter.
recv_sys_t::free_corrupted_page(): Take the file as a parameter
and return whether a message was displayed. This avoids some duplicated
and incomplete error messages.
buf_page_t::read_complete(): Remove some redundant output and always
display the name of the corrupted file. Never return DB_FAIL;
use it only in internal error handling.
IORequest::read_complete(): Assume that buf_page_t::read_complete()
will have reported any error.
fil_space_t::set_corrupted(): Return whether this is the first time
the tablespace had been flagged as corrupted.
Datafile::validate_first_page(), fil_node_open_file_low(),
fil_node_open_file(), fil_space_t::read_page0(),
fil_node_t::read_page0(): Add a parameter for a copy of the
first page, and a parameter to indicate whether the FIL_PAGE_LSN
check should be suppressed. Before buf_dblwr_t::recover() is
invoked, we cannot validate the FIL_PAGE_LSN, but we can trust the
FSP_SPACE_FLAGS and the tablespace ID that may be present in a
potentially too new copy of a page.
Reviewed by: Debarun Banerjee
In mariadb-backup --backup there are multiple mechanisms for ensuring that
a sufficient amount of the InnoDB write-ahead log (ib_logfile0) is being
copied at the end of the backup. The backup needs to include the latest
committed transaction. While further transaction commits are blocked by
BACKUP STAGE BLOCK_COMMIT, ongoing transactions may modify the database
contents and write log records. We were unnecessarily copying such log,
which would also cause further effort of rolling back incomplete
transactions after the backup is restored.
backup_wait_for_lsn(): Declare as static, and refactor some code
to separate functions backup_wait_for_lsn_low() and
backup_wait_timeout().
backup_wait_for_commit_lsn(): A new function to determine the current
LSN (within BACKUP STAGE BLOCK_COMMIT) and to wait for the log to be
copied until that. Invoked by BackupStages::stage_block_commit().
xtrabackup_backup_func(): Remove a condition that had already been
checked by a caller of backup_wait_timeout().
server_lsn_after_lock: Declare as a local variable in
BackupStages::stage_block_ddl().
log_copying_thread(), io_watching_thread(): Use metadata_last_lsn
instead of metadata_to_lsn as the stop condition.
BackupStages::stage_block_commit(): Ensure that the log tables
(in particular, mysql.general_log) will have been copied before
the BACKUP STAGE BLOCK_COMMIT is being followed by any further
SQL statements.
Reviewed by: Debarun Banerjee
Tested by: Matthias Leich
The test requires a larger innodb log file size; this was lost as a
side-effect of d7699c51eb.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
As part of commit 685d958e38 (MDEV-14425)
the parameter innodb_log_write_ahead_size was removed, because it was
thought that determining the physical block size would be a sufficient
replacement.
However, we can only determine the physical block size on Linux or
Microsoft Windows. On some file systems, the physical block size
is not relevant. For example, XFS uses a block size of 4096 bytes
even if the underlying block size may be smaller.
On Linux, we failed to determine the physical block size if
innodb_log_file_buffered=OFF was not requested or possible.
This will be fixed.
log_sys.write_size: The value of the reintroduced parameter
innodb_log_write_ahead_size. To keep it simple, this is read-only
and a power of two between 512 and 4096 bytes, so that the previous
alignment guarantees are fulfilled. This will replace the previous
log_sys.get_block_size().
log_sys.block_size, log_t::get_block_size(): Remove.
log_t::set_block_size(): Ensure that write_size will not be less
than the physical block size. There is no point to invoke this
function with 512 or less, because that is the minimum value of
write_size.
innodb_params_adjust(): Add some disabled code for adjusting
the minimum value and default value of innodb_log_write_ahead_size
to reflect the log_sys.write_size.
log_t::set_recovered(): Mark the recovery completed. This is the
place to adjust some things if we want to allow write_size>4096.
log_t::resize_write_buf(): Refer to write_size.
log_t::resize_start(): Refer to write_size instead of get_block_size().
log_write_buf(): Simplify some arithmetics and remove a goto.
log_t::write_buf(): Refer to write_size. If we are writing less than
that, do not switch buffers, but keep writing to the same buffer.
Move some code to improve the locality of reference.
recv_scan_log(): Refer to write_size instead of get_block_size().
os_file_create_func(): For type==OS_LOG_FILE on Linux, always invoke
os_file_log_maybe_unbuffered(), so that log_sys.set_block_size() will
be invoked even if we are not attempting to use O_DIRECT.
recv_sys_t::find_checkpoint(): Read the entire log header
in a single 12 KiB request into log_sys.buf.
Tested with:
./mtr --loose-innodb-log-write-ahead-size=4096
./mtr --loose-innodb-log-write-ahead-size=2048
MariaDB-backup needs to check for SLAVE MONITOR as that is
what is returned by SHOW GRANTS.
Update test to ensure that warnings about missing privileges
do not occur when the backup is successful.
Reviewer: Andrew Hutchings
Thanks Eugene for reporting the issue.