Apparently, invoking fcntl(fd, F_SETFL, O_DIRECT) will lead to
unexpected behaviour on Linux bcachefs and possibly other file systems,
depending on the operating system version. So, let us avoid doing that,
and instead just attempt to pass the O_DIRECT flag to open(). This should
make us compatible with NetBSD, IBM AIX, as well as Solaris and its
derivatives.
This fix does not change the fact that we had only implemented
innodb_log_file_buffering=OFF on systems where we can determine the
physical block size (typically 512 or 4096 bytes).
Currently, those operating systems are Linux and Microsoft Windows.
HAVE_FCNTL_DIRECT, os_file_set_nocache(): Remove.
OS_FILE_OVERWRITE, OS_FILE_CREATE_PATH: Remove (never used parameters).
os_file_log_buffered(), os_file_log_maybe_unbuffered(): Helper functions.
os_file_create_simple_func(): When applicable, initially attempt to
open files in O_DIRECT mode.
os_file_create_func(): When applicable, initially attempt to
open files in O_DIRECT mode.
For type==OS_LOG_FILE && create_mode != OS_FILE_CREATE
we will first invoke stat(2) on the file name to find out if the size
is compatible with O_DIRECT. If create_mode == OS_FILE_CREATE, we will
invoke fstat(2) on the created log file afterwards, and may close and
reopen the file in O_DIRECT mode if applicable.
create_temp_file(): Support O_DIRECT. This is only used if O_TMPFILE is
available and innodb_disable_sort_file_cache=ON (non-default value).
Notably, that setting never worked on Microsoft Windows.
row_merge_file_create_mode(): Split from row_merge_file_create_low().
Create a temporary file in the specified mode.
Reviewed by: Vladislav Vaintroub
The `unused-but-set-variable` warning is raised on MacOS from the
`posix_fadvise` standin macro, since offset is often otherwise unused. Add a
cast to absorb this warning.
Signed-off-by: Trevor Gross <tmgross@umich.edu>
This commit adds correct handling of binlogs for SST using rsync
or mariabackup. Before this fix, binlogs were handled incorrectly -
- only one (last) binary log file was transferred during SST, which
then led to various failures (for example, when trying to list all
events from the binary log). These bugs were long masked by flaws
in the primitive binlogs handling code in the SST scripts, which
causing binary logs files to be erased after transfer or not added
to the binlog index on the joiner node. Now the correct transfer
of all binary logs (not just the last of the binary log files) has
been implemented both for the rsync (at the script level) and for
the mariabackup (at the level of the main utility code).
This commit also adds a new sst_max_binlogs=<n> parameter, which
can be located in the [sst] section or in the [xtrabackup] section
(historically, supported for mariabackup only, not for rsync), or
in one of the server sections. This parameter specifies the number
of binary log files to be sent to the joiner node during SST. This
option is added for compatibility with old SST scripting behavior,
which can be emulated by setting the sst_max_binlogs=1 (although
in general this can cause problems for the reasons described above).
In addition, setting the sst_max_binlogs=0 can be used to suppress
the transmission of binary logs to the joiner nodes during SST
(although sometimes a single file with the current binary log can
still be transmitted to the joiner, even with sst_max_binlogs=0,
because this sometimes necessary in modes that involve the use of
GTIDs with Galera).
Also, this commit ensures correct handling of paths to various
innodb files and directories in the SST scripts, and fixes some
problems with this that existed in mariabackup utility (which
were associated with incorrect handling of the innodb_data_dir
parameter in some scenarios).
In addition, this commit contains the following enhancements:
1) Added tests for mtr, which check the correct work with binlogs
after SST (using rsync and mariabackup);
2) Added correct handling of slashes at the end of all paths that
the SST script receives as parameters;
3) Improved parsing code for --mysqld-args parameters. Now it
correctly processes the sequence "--" after the name of the
one-letter option;
4) Checking the secret signature during joiner authentication
is made independent of presence of bash (as a unix shell)
in the system and diff utility no longer needed to check
certificates compliance;
5) All directories that are necessary for the correct placement
of various logs are automatically created by SST scripts in
advance (before running mariabackup on the joiner node);
6) Removal of old binary logs on joiner is done using the binlog
index (if it exists) (not only by fixed pattern that based
on the current binlog name, as before);
7) Paths for placing binary logs are correctly processed if they
are set as relative paths (to the datadir);
8) SST scripts are made even more resistant to spaces in filenames
(now for binlogs);
9) In case of failure, SST scripts now always end with an exit
code other than zero;
10) SST script for rsync now correctly create a tar file with
the binlogs, even if the paths to them (in the binlog index
file) are specified as a mix of absolute and relative paths,
and even if they do not match with the datadir path specified
in the current configuration settings.
In the InnoDB data files, we allocate 32 bits for tablespace identifiers
and page numbers as well as tablespace flags. But, in main memory
data structures we allocate 32 or 64 bits, depending on the register
width of the processor. Let us always use 32-bit fields to eliminate
a mismatch and reduce the memory footprint on 64-bit systems.