Seen on Windows using newest pcre2-10.45. Tests that use regular expressions,
e.g. mariabackup.partial, fail very often.
To fix, serialize regexec calls, which are documented as non-thread-safe
in https://www.pcre.org/current/doc/html/pcre2posix.html.
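A minimal sketch of the serialization idea, assuming a single process-wide
mutex is acceptable. The names regexec_mutex and serialized_match() are
hypothetical and are not taken from the actual patch; the sketch uses the
POSIX <regex.h> interface, which pcre2posix mirrors.

```cpp
#include <regex.h>
#include <cassert>
#include <mutex>

// Hypothetical: one process-wide mutex that serializes every regexec() call.
static std::mutex regexec_mutex;

// Returns true if `subject` matches the already compiled pattern `re`.
// Only one thread at a time may execute regexec() through this wrapper.
static bool serialized_match(const regex_t *re, const char *subject)
{
  assert(re && subject);
  std::lock_guard<std::mutex> guard(regexec_mutex);
  return regexec(re, subject, 0, nullptr, 0) == 0;
}
```

Callers would compile the pattern once with regcomp() and then route all
matching through the wrapper instead of calling regexec() directly.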
The mariadb-backup --use-memory parameter will be trimmed to a multiple
of 8 MiB, or 2 MiB on 32-bit systems.
Also, let us remove a bogus message in mariadb-backup:
Warning: option 'use-memory': signed value -1 adjusted to 8388608
We deprecate and ignore the parameter innodb_buffer_pool_chunk_size
and let the buffer pool size be changed in arbitrary 1-megabyte
increments.
innodb_buffer_pool_size_max: A new read-only startup parameter
that specifies the maximum innodb_buffer_pool_size. If 0 or
unspecified, it will default to the specified innodb_buffer_pool_size
rounded up to the allocation unit (2 MiB or 8 MiB). The maximum value
is 4GiB-2MiB on 32-bit systems and 16EiB-8MiB on 64-bit systems.
This maximum is very likely to be limited further by the operating system.
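The trimming and rounding described above can be sketched as follows.
This is only an illustration of the arithmetic; the names extent_size,
trim_to_extent() and round_up_to_extent() are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Allocation unit from the text: 8 MiB on 64-bit, 2 MiB on 32-bit systems.
constexpr size_t extent_size =
  sizeof(void*) == 8 ? size_t{8} << 20 : size_t{2} << 20;
static_assert((extent_size & (extent_size - 1)) == 0, "must be a power of 2");

// Trim a size down to a multiple of the unit (as for --use-memory).
constexpr size_t trim_to_extent(size_t size)
{
  return size & ~(extent_size - 1);
}

// Round a size up to the unit (as for the default innodb_buffer_pool_size_max).
inline size_t round_up_to_extent(size_t size)
{
  // Guard against wrap-around for sizes above the largest multiple.
  assert(size <= trim_to_extent(SIZE_MAX));
  return (size + extent_size - 1) & ~(extent_size - 1);
}
```

Note that trim_to_extent(SIZE_MAX) yields the documented upper bound:
the largest representable size minus one allocation unit, plus one byte.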
The status variable Innodb_buffer_pool_resize_status will reflect
the status of shrinking the buffer pool. When no shrinking is in
progress, the string will be empty.
Unlike before, the execution of SET GLOBAL innodb_buffer_pool_size
will block until the requested buffer pool size change has been
implemented, or the execution is interrupted by a KILL statement,
a client disconnect, or server shutdown. If the
buf_flush_page_cleaner() thread notices that we are running out of
memory, the operation may fail with ER_WRONG_USAGE.
SET GLOBAL innodb_buffer_pool_size will be refused
if the server was started with --large-pages (even if
no HugeTLB pages were successfully allocated). This functionality
is somewhat exercised by the test main.large_pages, which now runs
also on Microsoft Windows. On Linux, explicit HugeTLB mappings are
apparently excluded from the reported Resident Set Size (RSS), and
apparently unshrinkable between mmap(2) and munmap(2).
The buffer pool will be mapped to a contiguous virtual memory area
that will be aligned and partitioned into extents of 8 MiB on
64-bit systems and 2 MiB on 32-bit systems.
Within an extent, the first few innodb_page_size blocks contain
buf_block_t objects that will cover the page frames in the rest
of the extent. The number of such frames is precomputed in the
array first_page_in_extent[] for each innodb_page_size.
In this way, there is a trivial mapping between
page frames and block descriptors and we do not need any
lookup tables like buf_pool.zip_hash or buf_pool_t::chunk_t::map.
We will always allocate the same number of block descriptors for
an extent, even if we do not need all the buf_block_t in the last
extent in case the innodb_buffer_pool_size is not an integer multiple
of the extent size.
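To illustrate why no lookup table is needed, here is a rough sketch of
the address arithmetic for one extent. The descriptor size desc_size is
an assumed stand-in for sizeof(buf_block_t), and all function names are
hypothetical; the real code precomputes the result of first_frame_page()
in first_page_in_extent[].

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr size_t extent_size = size_t{8} << 20; // 64-bit extent size
constexpr size_t desc_size = 128; // ASSUMPTION: stand-in for sizeof(buf_block_t)

// Smallest number of leading pages that can hold one descriptor
// for every remaining page frame of the extent.
constexpr size_t first_frame_page(size_t page_size)
{
  const size_t pages = extent_size / page_size;
  size_t d = 0;
  while (d * page_size < (pages - d) * desc_size)
    d++;
  return d;
}

// Page frame address -> address of its descriptor, by arithmetic only.
inline uintptr_t frame_to_descriptor(uintptr_t frame, size_t page_size)
{
  const uintptr_t extent = frame & ~static_cast<uintptr_t>(extent_size - 1);
  const size_t page_no = (frame - extent) / page_size;
  const size_t first = first_frame_page(page_size);
  assert(page_no >= first); // the leading pages hold descriptors, not frames
  return extent + (page_no - first) * desc_size;
}

// Descriptor address -> address of the page frame that it covers.
inline uintptr_t descriptor_to_frame(uintptr_t desc, size_t page_size)
{
  const uintptr_t extent = desc & ~static_cast<uintptr_t>(extent_size - 1);
  const size_t n = (desc - extent) / desc_size;
  return extent + (first_frame_page(page_size) + n) * page_size;
}
```

With these assumed numbers, a 16 KiB page size yields 4 descriptor pages
and 508 page frames per extent; the actual counts depend on the real
descriptor size.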
The minimum innodb_buffer_pool_size is 256*5/4 pages. At the default
innodb_page_size=16k this corresponds to 5 MiB. However, now that the
innodb_buffer_pool_size includes the memory allocated for the block
descriptors, the minimum would be innodb_buffer_pool_size=6m.
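The arithmetic behind these numbers can be checked with a small sketch.
The helper names are hypothetical, and the descriptor overhead is passed
in as a parameter because its exact size is not stated here; the sketch
only assumes that sizes are rounded up to 1-megabyte increments.

```cpp
#include <cassert>
#include <cstddef>

constexpr size_t MiB = size_t{1} << 20;
constexpr size_t min_pages = 256 * 5 / 4; // 320 pages

// Bytes occupied by the minimum number of page frames alone.
constexpr size_t min_frame_bytes(size_t page_size)
{
  return min_pages * page_size;
}
static_assert(min_frame_bytes(16384) == 5 * MiB, "320 * 16 KiB = 5 MiB");

// Minimum pool size in MiB once descriptor_bytes of block descriptors
// are counted as part of the pool (rounded up to whole megabytes).
inline size_t min_pool_mib(size_t page_size, size_t descriptor_bytes)
{
  assert(page_size >= 4096);
  return (min_frame_bytes(page_size) + descriptor_bytes + MiB - 1) / MiB;
}
```

Any nonzero descriptor overhead below 1 MiB pushes the 16k minimum from
5 MiB to 6 MiB, which matches innodb_buffer_pool_size=6m.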
my_large_virtual_alloc(): A new function, similar to my_large_malloc().
my_virtual_mem_reserve(), my_virtual_mem_commit(),
my_virtual_mem_decommit(), my_virtual_mem_release():
New interface mostly by Vladislav Vaintroub, to separately
reserve and release virtual address space, as well as to
commit and decommit memory within it.
After my_virtual_mem_decommit(), the virtual memory range will be
read-only or inaccessible, depending on whether the build option
cmake -DHAVE_UNACCESSIBLE_AFTER_MEM_DECOMMIT=1
has been specified. This option is hard-coded on Microsoft Windows,
where VirtualFree(MEM_DECOMMIT) will make the memory inaccessible.
On IBM AIX, Linux, Illumos and possibly Apple macOS, the virtual memory
will be zeroed out immediately. On other POSIX-like systems,
madvise(MADV_FREE) will be used if available, to give the operating
system kernel permission to zero out the virtual memory range.
We prefer immediate freeing so that the reported
resident set size (RSS) of the process will reflect the current
innodb_buffer_pool_size. Shrinking the buffer pool is a rarely
executed, resource-intensive operation, and the immediate configuration
of the MMU mappings should not incur significant additional penalty.
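The reserve/commit/decommit/release life cycle could look roughly like
this on Linux. This is not the my_virtual_mem_*() implementation; the
vm_*() names are hypothetical, and the sketch chooses to make decommitted
memory inaccessible (PROT_NONE) and to free it immediately with
madvise(MADV_DONTNEED).

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>

// Reserve address space only: nothing is readable or writable yet.
static void *vm_reserve(size_t size)
{
  assert(size > 0);
  void *p = mmap(nullptr, size, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
  return p == MAP_FAILED ? nullptr : p;
}

// Make a part of the reserved range usable.
static bool vm_commit(void *p, size_t size)
{
  return mprotect(p, size, PROT_READ | PROT_WRITE) == 0;
}

// Give the pages back to the kernel at once and forbid further access.
// On Linux, MADV_DONTNEED on a private anonymous mapping discards the
// contents, so the range reads as zero if it is committed again.
static bool vm_decommit(void *p, size_t size)
{
  return madvise(p, size, MADV_DONTNEED) == 0 &&
         mprotect(p, size, PROT_NONE) == 0;
}

// Return the whole address range to the operating system.
static void vm_release(void *p, size_t size)
{
  munmap(p, size);
}
```

A buffer pool built on such primitives would reserve the maximum size
once, commit up to the current size, and decommit the tail on shrinking.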
opt_super_large_pages: Declare only on Solaris. Actually, this is
specific to the SPARC implementation of Solaris, but because we
lack access to a Solaris development environment, we will not revise
this for other MMU and ISA.
buf_pool_t::chunk_t::create(): Remove.
buf_pool_t::create(): Initialize all n_blocks of the buf_pool.free list.
buf_pool_t::allocate(): Renamed from buf_LRU_get_free_only().
buf_pool_t::LRU_warned: Changed to Atomic_relaxed<bool>,
only to be modified by the buf_flush_page_cleaner() thread.
buf_pool_t::shrink(): Attempt to shrink the buffer pool.
There are 3 possible outcomes: SHRINK_DONE (success),
SHRINK_IN_PROGRESS (the caller may keep trying),
and SHRINK_ABORT (we seem to be running out of buffer pool).
While traversing buf_pool.LRU, release the contended
buf_pool.mutex once in every 32 iterations in order to
reduce starvation. Use lru_scan_itr for efficient traversal,
similar to buf_LRU_free_from_common_LRU_list().
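The periodic release of a contended mutex can be sketched as below. The
function shrink_scan() and its vector argument are hypothetical stand-ins
for the traversal of buf_pool.LRU; SHRINK_ABORT is declared but not
modeled, and the real code relies on lru_scan_itr to stay valid while the
mutex is released.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

enum shrink_status { SHRINK_DONE, SHRINK_IN_PROGRESS, SHRINK_ABORT };

// withdrawable[i] tells whether block i could be withdrawn right now.
// to_withdraw is the number of blocks that still need to be withdrawn.
static shrink_status shrink_scan(std::mutex &pool_mutex,
                                 const std::vector<bool> &withdrawable,
                                 size_t &to_withdraw)
{
  std::unique_lock<std::mutex> lock(pool_mutex);
  size_t iterations = 0;
  for (bool ok : withdrawable)
  {
    if (!to_withdraw)
      break;
    if (ok)
      to_withdraw--;
    if (++iterations % 32 == 0)
    {
      // Briefly release the mutex so that waiting threads can proceed.
      lock.unlock();
      lock.lock();
    }
  }
  assert(lock.owns_lock());
  return to_withdraw ? SHRINK_IN_PROGRESS : SHRINK_DONE;
}
```

A caller would retry on SHRINK_IN_PROGRESS, which mirrors the "caller may
keep trying" contract described above.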
buf_pool_t::shrunk(): Update the reduced size of the buffer pool
in a way that is compatible with buf_pool_t::page_guess(),
and invoke my_virtual_mem_decommit().
buf_pool_t::resize(): Before invoking shrink(), run one batch of
buf_flush_page_cleaner() in order to prevent LRU_warn().
Abort if shrink() recommends it, or no blocks were withdrawn in
the past 15 seconds, or the execution of the statement
SET GLOBAL innodb_buffer_pool_size was interrupted.
buf_pool_t::first_to_withdraw: The first block descriptor that is
out of the bounds of the shrunk buffer pool.
buf_pool_t::withdrawn: The list of withdrawn blocks.
If buf_pool_t::resize() is aborted before shrink() completes,
we must be able to resurrect the withdrawn blocks in the free list.
buf_pool_t::contains_zip(): Added a parameter for the
number of least significant pointer bits to disregard,
so that we can find any pointers to within a block
that is supposed to be free.
buf_pool_t::is_shrinking(): Return the total number of blocks that
were withdrawn or are to be withdrawn.
buf_pool_t::to_withdraw(): Return the number of blocks that will need to
be withdrawn.
buf_pool_t::usable_size(): Number of usable pages, considering possible
in-progress attempt at shrinking the buffer pool.
buf_pool_t::page_guess(): Try to buffer-fix a guessed block pointer.
If HAVE_UNACCESSIBLE_AFTER_MEM_DECOMMIT is set, the pointer will
be validated before being dereferenced.
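Validating a guessed pointer before dereferencing it could be sketched
like this. It is a simplification that assumes one flat array of
descriptors, whereas the layout described above keeps descriptors at the
start of each extent; descriptor_range and is_valid_guess() are
hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct descriptor_range
{
  uintptr_t begin;  // address of the first descriptor
  size_t count;     // number of descriptors that are currently committed
  size_t desc_size; // size of one descriptor in bytes
};

// True only if `guess` points exactly at the start of a descriptor inside
// the committed range, so that dereferencing it cannot touch memory that
// became inaccessible when the buffer pool was shrunk.
inline bool is_valid_guess(const descriptor_range &r, uintptr_t guess)
{
  assert(r.desc_size > 0);
  if (guess < r.begin)
    return false;
  const uintptr_t offset = guess - r.begin;
  return offset % r.desc_size == 0 && offset / r.desc_size < r.count;
}
```

Only after such a check succeeds would it be safe to read the descriptor
and attempt to buffer-fix the block.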
buf_pool_t::get_info(): Replaces buf_stats_get_pool_info().
innodb_init_param(): Refactored. We must first compute
srv_page_size_shift and then determine the valid bounds of
innodb_buffer_pool_size.
buf_buddy_shrink(): Replaces buf_buddy_realloc().
Part of the work is deferred to buf_buddy_condense_free(),
which is executed when we are not holding any
buf_pool.page_hash latch.
buf_buddy_condense_free(): Do not relocate blocks.
buf_buddy_free_low(): Do not care about buffer pool shrinking.
This will be handled by buf_buddy_shrink() and
buf_buddy_condense_free().
buf_buddy_alloc_zip(): Assert !buf_pool.contains_zip()
when we are allocating from the binary buddy system.
Previously we were asserting this on multiple recursion levels.
buf_buddy_block_free(), buf_buddy_free_low():
Assert !buf_pool.contains_zip().
buf_buddy_alloc_from(): Remove the redundant parameter j.
buf_flush_LRU_list_batch(): Add the parameter to_withdraw
to keep track of buf_pool.n_blocks_to_withdraw.
buf_do_LRU_batch(): Skip buf_free_from_unzip_LRU_list_batch()
if we are shrinking the buffer pool. In that case, we want
to minimize the page relocations and just finish as quickly
as possible.
trx_purge_attach_undo_recs(): Limit purge_sys.n_pages_handled()
in every iteration, in case the buffer pool is being shrunk
in the middle of a purge batch.
Reviewed by: Debarun Banerjee
- During the prepare of an incremental backup, mariabackup creates a
new file in the target directory with a default file size of
4 * innodb_page_size. While applying a .delta file to the corresponding
data file, it encounters the FSP_SIZE modification on page 0 and
tries to extend the file to that size, which is 4 pages in our case.
Since the table is in compressed row format, the page size for the
table is 8k. This led to the tablespace file being shrunk from 65536
to 32768 bytes. This issue happens only on Windows because
os_file_set_size() doesn't check the current size
and shrinks the file.
Solution:
========
xtrabackup_apply_delta(): Check the current size before
setting the size of the file.
At the start of mariadb-backup --backup, trigger a flush of the
InnoDB buffer pool, so that as little log as possible will have
to be copied.
The previously debug-build-only interface
SET GLOBAL innodb_log_checkpoint_now=ON;
will be made available on all builds, and
mariadb-backup --backup will invoke it, unless the option
--skip-innodb-log-checkpoint-now is specified.
Reviewed by: Vladislav Vaintroub
- Remove the redundant check of TRX_SYS page change in
wf_incremental_process()
- Remove the double casting of srv_undo_tablespaces
in write_backup_config_file()
- Remove the unused variables like checkpoint_lsn_start
and checkpoint_no_start.
This is a regression caused by commit 1c55b845e0fe337e647ba230288ed13e966cb7c7.
When running mariabackup with --prepare --export options, a deprecation
warning appears because the program name is set to "mysqld" instead of "mariadbd".
The fix ensures that when running in mysqld mode, we properly set the
program name to "mariadbd" to avoid the deprecation warning while
maintaining the original functionality.
* Migrate `sql/share/errmsg-utf8.txt` to use suffix-based, `-Wformat`
-compatible `my_snprintf` format extensions introduced in MDEV-21978
* Update relevant tests caught by BuildBot as well
While GCC `-Wformat` (with `ATTRIBUTE_FORMAT`) can catch obsolete or
malformed format string literals, formats originating from other sources
(such as this translations file) (still) require manual review.
This commit also escapes the only (1) instance of an existing string
that conflicts with the newly introduced suffixes:
(Not all `printf`s go to `my_snprintf`, thus I `grep`ped and
confirmed that this does indeed land on `my_snprintf` eventually.)
chi "不能%sSLAVE'%.*s'"
This commit also fixes the following: (You’re welcome.)
* Delete extraneous spaces after the `%` (they’re all Swahili)
* Update `extra/comp_err.c`
* Add the missing standard C/C++ specifiers `c`, `i`, `o`, `p` and `X`
(Especially `%i`: it otherwise was complaining about the new `%iE`)
* Remove the old and obsolete extension formats `%b`, `%M` and `%T`
* rpl.rpl_system_versioning_partitions updated for MDEV-32188
* innodb.row_size_error_log_warnings_3 changed error for MDEV-33658
(checks are done in a different order)
followup 136e866119779668736a4d52ae3301e1f6e3eff2
Remove HAVE_CONFIG_H from wolfssl compilation.
WolfSSL knows about it, and would include the server's config.h, which is
mostly fine, but the server pretends to have HAVE_GMTIME_R on Windows, which
leads to compilation problems. In any case, on Windows, there is no need
for config.h for WolfSSL, and no need for gmtime_r/_s(), as gmtime() is
thread-safe on Windows (it returns a pointer to a thread-local struct).
Work around build bugs with preprocessor flags du jour
1. OPENSSL_ALL does not work anymore (error in ssl.h),
nor does WOLFSSL_MYSQL_COMPATIBLE, when building the library.
2. OPENSSL_EXTRA has to be used instead of OPENSSL_ALL now.
WOLFSSL_MYSQL_COMPATIBLE needs to be used to work around their conflicting
definition of protocol_version, which is used in server code.
3. -D_CRT_USE_CONFORMING_ANNEX_K_TIME to force C11-correct definition of
gmtime_s on Windows, set some other flags WOLFSSL_MYSQL_COMPATIBLE was
previously defining
4. Use HAVE_EMPTY_AGGREGATES=0 to work around build error on clang
(error: struct has size 0 in C, size 1 in C++ [-Werror,-Wextern-c-compat]
WOLF_AGG_DUMMY_MEMBER;)
extra/mariabackup/xtrabackup.cc:3407:15: error: conversion from ‘lsn_t’ {aka ‘long long unsigned int’} to ‘size_t’ {aka ‘unsigned int’} may change value [-Werror=conversion]
followup for 6acada713a95
innodb_log_file_mmap: Use a constant documentation string that
refers to persistent memory also when it is not available in the build.
HAVE_INNODB_MMAP: Remove, and unconditionally enable this code.
log_mmap(): On 32-bit systems, ensure that the size fits in 32 bits.
log_t::resize_start(), log_t::resize_abort(): Only handle memory-mapping
if HAVE_PMEM is defined. The generic memory-mapped interface is only for
reading the log in recovery. Writable memory mappings are only for
persistent memory, that is, Linux file systems with mount -o dax.
Reviewed by: Debarun Banerjee, Otto Kekäläinen
Now that ut_fold_ulint_pair() and ut_fold_binary() are no longer needed
for anything else than compatibility with old InnoDB data files that may
use innodb_checksum_algorithm=innodb, let us move the code to a single
compilation unit.
Reviewed by: Vladislav Lesin
Let us implement a simple fixed-size allocator for the adaptive hash
index, instead of complicating mem_heap_t or mem_block_info_t.
MEM_HEAP_BTR_SEARCH: Remove.
mem_block_info_t::free_block(), mem_heap_free_block_free(): Remove.
mem_heap_free_top(), mem_heap_get_top(): Remove.
btr_sea::partition::spare: Replaces mem_block_info_t::free_block.
This keeps one spare block per adaptive hash index partition, to
process an insert.
We must not wait for buf_pool.mutex while holding
any btr_sea::partition::latch. That is why we cache one block for
future allocations. This is protected by a new
btr_sea::partition::blocks_mutex in order to relieve pressure on
btr_sea::partition::latch.
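The idea of caching one spare block under its own small mutex can be
sketched as follows. The types block and partition are simplified,
hypothetical stand-ins, and std::make_unique stands in for the
allocation that may have to wait for buf_pool.mutex in the real code.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

struct block { unsigned char bytes[16384]; }; // hypothetical fixed-size block

struct partition
{
  std::mutex blocks_mutex;      // protects only `spare`
  std::unique_ptr<block> spare; // at most one cached block

  // Call BEFORE acquiring the partition latch. The slow allocation
  // happens here, outside of any latch or mutex.
  void prepare_insert()
  {
    {
      std::lock_guard<std::mutex> g(blocks_mutex);
      if (spare)
        return;
    }
    std::unique_ptr<block> fresh = std::make_unique<block>();
    std::lock_guard<std::mutex> g(blocks_mutex);
    if (!spare)
      spare = std::move(fresh);
    assert(spare);
  }

  // Call WHILE holding the partition latch. This never allocates, so it
  // never has to wait for the buffer pool.
  std::unique_ptr<block> take_spare()
  {
    std::lock_guard<std::mutex> g(blocks_mutex);
    return std::move(spare);
  }
};
```

An insert would call prepare_insert() first, then acquire the latch and
use take_spare() only if the current block has run out of space.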
btr_sea::partition::prepare_insert(): Replaces
btr_search_check_free_space_in_heap().
btr_sea::partition::erase(): Replaces ha_search_and_delete_if_found().
btr_sea::partition::cleanup_after_erase(): Replaces most of
ha_delete_hash_node(). Unlike the previous implementation, we will
retain a spare block for prepare_insert().
This should reduce some contention on buf_pool.mutex.
btr_search.n_parts: Replaces btr_ahi_parts.
btr_search.enabled: Replaces btr_search_enabled. This must hold
whenever buf_block_t::index is set while a thread is holding a
btr_sea::partition::latch.
dict_index_t::search_info: Remove pointer indirection, and use
Atomic_relaxed or Atomic_counter for most fields.
btr_search_guess_on_hash(): Let the caller ensure that latch_mode is
BTR_MODIFY_LEAF or BTR_SEARCH_LEAF. Release btr_sea::partition::latch
before buffer-fixing the block. The page latch that we already acquired
is preventing buffer pool eviction. We must validate both
block->index and block->page.state while holding part.latch
in order to avoid race conditions with buffer page relocation
or buf_pool_t::resize().
btr_search_check_guess(): Remove the constant parameter
can_only_compare_to_cursor_rec=false.
ahi_node: Replaces ha_node_t.
This has been tested by running the regression test suite
with the adaptive hash index enabled:
./mtr --mysqld=--loose-innodb-adaptive-hash-index=ON
Reviewed by: Vladislav Lesin
- InnoDB fails to recover the full crc32 encrypted page from
doublewrite buffer. The reason is that buf_dblwr_t::recover()
fails to identify the space id from the page because the page has
been encrypted from FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION bytes.
Fix:
===
buf_dblwr_t::recover(): preserve any pages whose space_id
does not match a known tablespace. These could be encrypted pages
of tablespaces that had been created with
innodb_checksum_algorithm=full_crc32.
buf_page_t::read_complete(): If the page looks corrupted and the
tablespace is encrypted and in full_crc32 format, try to
restore the page from doublewrite buffer.
recv_dblwr_t::recover_encrypted_page(): Find the page which
has the same page number and try to decrypt the page using
space->crypt_data. After decryption, compare the space id.
Write the recovered page back to the file.
Note: Changes to the test innodb.stats_persistent
in commit e5c4c0842d0a11b9919efcda09377083a4a0d69a (MDEV-35443)
are not merged, because the test scenario is impossible
due to commit e66928ab28dbff3563a134c423b4ac1eb348f7c0 (MDEV-33462).
fil_space_t::create(): Instead of invoking the default fil_space_t
constructor on a zero-filled buffer, allocate an uninitialized buffer
and invoke an explicitly defined constructor on it. Also, specify
initializer expressions for all constant data members, so that all of them
will be initialized in the constructor.
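The construction pattern described for fil_space_t::create() can be
sketched in general terms like this. The struct space and its members are
hypothetical; the point is only the combination of an uninitialized
buffer, placement new, and initializers for every member.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <new>

struct space // hypothetical stand-in for fil_space_t
{
  const uint32_t id;           // constant members are set by the constructor
  const uint32_t flags;
  bool being_imported = false; // default member initializers cover the rest
  uint32_t n_pending = 0;

  space(uint32_t id, uint32_t flags) : id(id), flags(flags) {}
};

// Allocate an uninitialized (not zero-filled) buffer and construct in place.
static space *space_create(uint32_t id, uint32_t flags)
{
  void *buf = std::malloc(sizeof(space));
  return buf ? new (buf) space(id, flags) : nullptr;
}

// Destroy the object explicitly before freeing the raw buffer.
static void space_free(space *s)
{
  if (!s)
    return;
  s->~space();
  std::free(s);
}
```

Because no member relies on the buffer having been zero-filled, tools
that detect uninitialized reads can catch any member that was forgotten.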
fil_space_t::being_imported: Replaces part of fil_space_t::purpose.
fil_space_t::is_being_imported(), fil_space_t::is_temporary():
Replaces fil_space_t::purpose.
fil_space_t::id: Changed the type from ulint to uint32_t to reduce
incompatibility with later branches that include
commit ca501ffb04246dcaa1f1d433d916d8436e30602e (MDEV-26195).
fil_space_t::try_to_close(): Do not attempt to close files that are
in an I/O bound phase of ALTER TABLE…IMPORT TABLESPACE.
log_file_op, first_page_init, recv_spaces_t:
Use uint32_t for the tablespace id.
Reviewed by: Debarun Banerjee
- Innochecksum misinterprets freed pages as active ones.
This leads the user to think that there are too many valid
pages.
- To avoid this confusion, innochecksum introduces the
option --skip-freed-pages (or -r) to skip freed
pages while dumping or printing the summary of the tablespace.
- Innochecksum can safely assume that a page is freed if
the respective extent doesn't belong to a segment and the page is marked as
freed in XDES_BITMAP in the extent descriptor page.
- Innochecksum shouldn't assume that a zero-filled page is an extent
descriptor page.
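The freed-page test can be illustrated with a simplified sketch. The
struct xdes_entry is a hypothetical, already parsed form of one extent
descriptor, and the bitmap layout (two bits per page, the first bit of
each pair meaning "free", 64 pages per extent) is an assumption made for
this example rather than something stated above.

```cpp
#include <cassert>
#include <cstdint>

struct xdes_entry // hypothetical parsed extent descriptor
{
  bool belongs_to_segment; // true if the extent is owned by a segment
  uint8_t bitmap[16];      // ASSUMED: 2 bits per page, 64 pages per extent
};

// A page counts as freed only if its extent is not owned by any segment
// and the page's "free" bit is set in the bitmap.
inline bool page_is_freed(const xdes_entry &x, unsigned page_in_extent)
{
  assert(page_in_extent < 64);
  const unsigned bit = 2 * page_in_extent; // ASSUMED: first bit of the pair
  const bool free_bit = x.bitmap[bit / 8] & (1u << (bit % 8));
  return !x.belongs_to_segment && free_bit;
}
```

A tool applying this rule would also have to verify first that the
descriptor page itself is valid, because a zero-filled page must not be
interpreted as an extent descriptor page.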
Reviewed-by: Marko Mäkelä