Let us make innodb_buffer_pool_filename a read-only variable
so that a malicious user cannot cause an important file to be
deleted on InnoDB shutdown. An attempt to delete a directory
will fail because it is not a regular file, but what if the
variable pointed to (say) ibdata1, ib_logfile0 or some *.ibd file?
It does not seem to make much sense for this parameter to be
configurable in the first place, but we will not change that in order
to avoid breaking compatibility.
We will remove the parameter innodb_disallow_writes because it is badly
designed and implemented. The parameter was never allowed at startup.
It was only internally used by Galera snapshot transfer.
If a user executed
SET GLOBAL innodb_disallow_writes=ON;
the server could hang even on subsequent read operations.
During Galera snapshot transfer, we will block writes
to implement an rsync friendly snapshot, as follows:
sst_flush_tables() will acquire a global lock by executing
FLUSH TABLES WITH READ LOCK, which will block any writes
at the high level.
sst_disable_innodb_writes(), invoked via ha_disable_internal_writes(true),
will suspend or disable InnoDB background tasks or threads that could
initiate writes. As part of this, log_make_checkpoint() will be invoked
to ensure that anything in the InnoDB buf_pool.flush_list will be written
to the data files. This has the nice side effect that the Galera joiner
will avoid crash recovery.
The changes to sql/wsrep.cc and to the tests are based on a prototype
that was developed by Jan Lindström.
Reviewed by: Jan Lindström
.. to be the same as startup.
In resolving MDEV-27461, BUF_LRU_MIN_LEN (256) is the minimum number of
pages for the innodb buffer pool size. Obviously we need more than just
flushing pages. Taking the 16k page size and its default minimum, an
extra 25% is needed on top of the flushing pages to make a workable buffer
pool.
The minimum innodb_buffer_pool_chunk_size (1M) restricts the minimum
otherwise we'd have a pool made up of different chunk sizes.
The resulting minimum innodb buffer pool sizes are:
Page Size, Previously minimum (startup), with change.
4k 5M 2M
8k 5M 3M
16k 5M 5M
32k 24M 10M
64k 24M 20M
With this patch, SET GLOBAL innodb_buffer_pool_size minimums are
enforced.
The evident minimum system variable size for innodb_buffer_pool_size
is 2M, however this is only setable if using 4k page size. As
the order of the page_size and buffer_pool_size aren't fixed, we can't
hide this change.
Subsequent changes:
* innodb_buffer_pool_resize_with_chunks.test - raised of pool resize due to new
minimums. Chunk size also needed increase as the test was for
pool_size < chunk_size to generate a warning.
* Removed srv_buf_pool_min_size and replaced use with MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* Removed srv_buf_pool_def_size and replaced constant defination in
MYSQL_SYSVAR_LONGLONG(buffer_pool_size)
* Reordered ha_innodb to allow for direct use of MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* Moved buf_pool_size_align into ha_innodb to access to MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* loose-innodb_disable_resize_buffer_pool_debug is needed in the
innodb.restart.opt test so that under debug mode, resizing of the
innodb buffer pool can occur.
innodb_debug_sync was introduced in commit
b393e2cb0c and reverted in
commit fc58c17216 due to memory leak reported
by valgrind, see MDEV-21336.
The leak is now fixed by adding `rw_lock_free(&slot->debug_sync_lock)`
after background thread working loop is finished, and the patch is
reapplied, with respect to c++98 fixes by Marko.
The missing DEBUG_SYNC for MDEV-18546 in row0vers.cc is also reapplied.
As suggested by Vladislav Vaintroub, let us remove misleading
and malformatted startup messages.
Even if the global variable srv_use_atomic_writes were set, we would
still invoke my_test_if_atomic_write() to check if writes are atomic
with a particular page size.
When using the default innodb_page_size=16k, page writes should be
atomic on NTFS when using ROW_FORMAT=COMPRESSED and KEY_BLOCK_SIZE<=4.
Disabling srv_use_atomic_writes when innodb_file_per_table=OFF does
not make sense, because that is a dynamic parameter.
We also correct the documentation string of innodb_use_atomic_writes
and remove the duplicate variable innobase_use_atomic_writes.
The debug parameter innodb_simulate_comp_failures injected compression
failures for ROW_FORMAT=COMPRESSED tables, breaking the pre-existing
logic that I had implemented in the InnoDB Plugin for MySQL 5.1 to prevent
compressed page overflows. A much better check is already achieved by
defining UNIV_ZIP_COPY at the compilation time.
(Only UNIV_ZIP_DEBUG is part of cmake -DWITH_INNODB_EXTRA_DEBUG=ON.)
The parameter innodb_idle_flush_pct that was introduced in
MariaDB Server 10.1.2 by MDEV-6932 has no effect ever since
the InnoDB changes from MySQL 5.7.9 were applied in
commit 2e814d4702.
Let us declare the parameter as deprecated and having no effect.
Let us introduce a dummy variable innodb_max_purge_lag_wait
for waiting that the InnoDB history list length is below
the user-specified limit. Specifically,
SET GLOBAL innodb_max_purge_lag_wait=0;
should wait for all history to be purged. This could be useful
when upgrading from an older version to MariaDB 10.3 or later,
to avoid hitting MDEV-15912.
Note: the history cannot be purged if there exist transactions
that may see old versions.
Reviewed by: Vladislav Vaintroub
MariaDB 10.2.2 inherited from MySQL 5.7 a perceived optimization
of ALTER TABLE, which skips the writing of redo log records.
In MDEV-16809 we introduced a parameter that allows the redo log to
be written, so that Mariabackup would not be impacted, but we kept
the MySQL 5.7 behaviour enabled by default (innodb_log_optimize_ddl=ON).
As noted in MDEV-19747 (Deprecate and ignore innodb_log_optimize_ddl,
implemented in MariaDB 10.5.1), omitting the redo log writes can
actually reduce performance, because we will have to wait for the data
pages to be written out. When the redo log file is configured to be
large enough, it actually can be much faster to write the redo log and
avoid the extra page flushing.
When the redo log is omitted (innodb_log_optimize_ddl=ON), also
Mariabackup may have to perform a lot of extra work, to re-copy the
entire data file if it is possible that any log was omitted during
the backup.
Starting with MariaDB 10.5.1, the parameter innodb_log_optimize_ddl
is deprecated and ignored. We hereby deprecate (but will not ignore)
the parameter in earlier versions as well.
Regretfully, the parameter innodb_log_checksums was introduced
in MySQL 5.7.9 (the first GA release of that series) by
mysql/mysql-server@af0acedd88
which partly replaced a parameter that had been introduced in 5.7.8
mysql/mysql-server@22ba38218e
as innodb_log_checksum_algorithm.
Given that the CRC-32C operations are accelerated on many processor
implementations (AMD64 with SSE4.2; since MDEV-22669 also on IA-32
with SSE4.2, POWER 8 and later, ARMv8 with some extensions)
and by lookup tables when only generic SISD instructions are available,
there should be no valid reason to disable checksums.
In MariaDB 10.5.2, as a preparation for MDEV-12353, MDEV-19543 deprecated
and ignored the parameter innodb_log_checksums altogether. This should
imply that after a clean shutdown with innodb_log_checksums=OFF one
cannot upgrade to MariaDB Server 10.5 at all.
Due to these problems, let us deprecate the parameter innodb_log_checksums
and honor it only during server startup.
The command SET GLOBAL innodb_log_checksums will always set the
parameter to ON.
The usage message for the innodb_compression_algorithm system variable
did not list snappy, which was added as an optional compression algorithm
in MariaDB 10.1.3 and might actually work since
commit 90c52e5291 (MDEV-12615)
in MariaDB 10.1.24.
Unfortunately, we will include also unavailable compression algorithms
in the list, because ENUM parameters allow numeric values, and we do
not want innodb_compression_algorithm=3 to change meaning depending on
the way how the source code was compiled.
For no good reason, innodb_encryption_threads was limited to
4,294,967,295. Expectedly, the server would crash if such an
insane value was specified. Let us limit the maximum to 255.
The encryption threads are not doing much useful work.
They are basically only dirtying pages by performing
dummy writes via the redo log. The encryption key rotation
or the in-place addition or removal of encryption
will take place in the page cleaner.
In a quick test on a 20-core CPU (40 threads in total),
the sweet spot on an otherwise idle server seemed to be
innodb_encryption_threads=16 for the test
encryption.encrypt_and_grep. The new limit 255 should be
more than enough for even bigger servers.
This essentially reverts commit b393e2cb0c.
The leak might have been fixed, but because the
DEBUG_SYNC instrumentation for InnoDB purge threads was reverted
in 10.5 commit 5e62b6a5e0
as part of introducing a thread pool, it is easiest to revert
the entire change.
Let us limit the maximum value of the debug parameter
innodb_data_file_size to 256 MiB. It is only being used
in the test innodb.log_data_file_size, and the size
of the system tablespace should never exceed some 70 MiB
in ./mtr. Thus, 256 MiB should be a reasonable limit.
The fact that negative values that are passed to unsigned parameters
wrap around to the maximum value appears to be a regression due to
commit 18ef02b04d
and has been filed as bug MDEV-22219.
If a table is altered using the MDEV-11369/MDEV-15562/MDEV-13134
ALGORITHM=INSTANT, it can force the table to use a non-canonical
format:
* A hidden metadata record at the start of the clustered index
is used to store each column's DEFAULT value. This makes it possible
to add new columns that have default values without rebuilding the table.
* Starting with MDEV-15562 in MariaDB Server 10.4, a BLOB in the
hidden metadata record is used to store column mappings. This makes
it possible to drop or reorder columns without rebuilding the table.
This also makes it possible to add columns to any position or drop
columns from any position in the table without rebuilding the table.
If a column is dropped without rebuilding the table, old records
will contain garbage in that column's former position, and new records
will be written with NULL values, empty strings, or dummy values.
This is generally not a problem. However, there may be cases where
users may want to avoid putting a table into this format.
For example, users may want to ensure that future UPDATE operations
after an ADD COLUMN will be performed in-place, to reduce write
amplification. (Instantly added columns are essentially always
variable-length.) Users might also want to avoid bugs similar to
MDEV-19916, or they may want to be able to export tables to
older versions of the server.
We will introduce the option innodb_instant_alter_column_allowed,
with the following values:
* never (0): Do not allow instant add/drop/reorder,
to maintain format compatibility with MariaDB 10.x and MySQL 5.x.
If the table (or partition) is not in the canonical format, then
any ALTER TABLE (even one that does not involve instant column
operations) will force a table rebuild.
* add_last (1, default in 10.3): Store a hidden metadata record that
allows columns to be appended to the table instantly (MDEV-11369).
In 10.4 or later, if the table (or partition) is not in this format,
then any ALTER TABLE (even one that does not involve column changes)
will force a table rebuild.
Starting with 10.4:
* add_drop_reorder (2, default): Like 'add_last', but allow the
metadata record to store a column map, to support instant
add/drop/reorder of columns (MDEV-15562).
The only change is a change of the version number.
In MySQL 5.6.46, the copyright comments in a number of files were changed
in mysql/mysql-server@f1a006ece7
but there was no functional change to InnoDB code.
This was also reflected by XtraDB. We are not changing the copyright
comments in MariaDB Server for now.
Between MySQL 5.6.46 and 5.6.47, InnoDB was not changed at all.
Actually, we had forgotten to update the InnoDB version number to
5.6.46. With this change, we are updating InnoDB
from 5.6.45 to 5.6.47 and XtraDB from 5.6.45-86.1 to 5.6.46-86.2.
InnoDB RNG maintains global state, causing otherwise unnecessary bus
traffic. Even worse, this is cross-mutex traffic. That is, different
mutexes suffer from contention.
Fixed delay of 4 was verified to give best throughput by OLTP update
index and read-write benchmarks on Intel Broadwell (2/20/40) and
ARM (1/46/46).
This is a backport of ce04790065 from
MariaDB Server 10.3.
To diagnose a hang in slow shutdown (innodb_fast_shutdown=0),
let us introduce a Boolean startup option in debug builds
that will cause the contents of the InnoDB change buffer
to be dumped to the server error log at startup.