1
0
mirror of https://github.com/MariaDB/server.git synced 2025-12-07 17:42:39 +03:00
Commit Graph

329 Commits

Author SHA1 Message Date
Marko Mäkelä
23047d3ed4 Merge 10.4 into 10.5 2020-05-18 17:30:02 +03:00
Marko Mäkelä
9e6e43551f Merge 10.3 into 10.4
We will expose some more std::atomic internals in Atomic_counter,
so that dict_index_t::lock will support the default assignment operator.
2020-05-16 07:39:15 +03:00
Marko Mäkelä
3d0bb2b7f1 Merge 10.2 into 10.3 2020-05-15 19:11:57 +03:00
Marko Mäkelä
ad6171b91c MDEV-22456 Dropping the adaptive hash index may cause DDL to lock up InnoDB
If the InnoDB buffer pool contains many pages for a table or index
that is being dropped or rebuilt, and if many of such pages are
pointed to by the adaptive hash index, dropping the adaptive hash index
may consume a lot of time.

The time-consuming operation of dropping the adaptive hash index entries
is being executed while the InnoDB data dictionary cache dict_sys is
exclusively locked.

It is not actually necessary to drop all adaptive hash index entries
at the time a table or index is being dropped or rebuilt. We can let
the LRU replacement policy of the buffer pool take care of this gradually.
For this to work, we must detach the dict_table_t and dict_index_t
objects from the main dict_sys cache, and once the last
adaptive hash index entry for the detached table is removed
(when the garbage page is evicted from the buffer pool) we can free
the dict_table_t and dict_index_t object.

Related to this, in MDEV-16283, we made ALTER TABLE...DISCARD TABLESPACE
skip both the buffer pool eviction and the drop of the adaptive hash index.
We shifted the burden to ALTER TABLE...IMPORT TABLESPACE or DROP TABLE.
We can remove the eviction from DROP TABLE. We must retain the eviction
in the ALTER TABLE...IMPORT TABLESPACE code path, so that in case the
discarded table is being re-imported with the same tablespace identifier,
the fresh data from the imported tablespace will replace any stale pages
in the buffer pool.

rpl.rpl_failed_drop_tbl_binlog: Remove the test. DROP TABLE can
no longer be interrupted inside InnoDB.

fseg_free_page(), fseg_free_step(), fseg_free_step_not_header(),
fseg_free_page_low(), fseg_free_extent(): Remove the parameter
that specifies whether the adaptive hash index should be dropped.

btr_search_lazy_free(): Lazily free an index when the last
reference to it is dropped from the adaptive hash index.

buf_pool_clear_hash_index(): Declare static, and move to the
same compilation unit with the bulk of the adaptive hash index
code.

dict_index_t::clone(), dict_index_t::clone_if_needed():
Clone an index that is being rebuilt while adaptive hash index
entries exist. The original index will be inserted into
dict_table_t::freed_indexes and dict_index_t::set_freed()
will be called.

dict_index_t::set_freed(), dict_index_t::freed(): Note that
or check whether the index has been freed. We will use the
impossible page number 1 to denote this condition.

dict_index_t::n_ahi_pages(): Replaces btr_search_info_get_ref_count().

dict_index_t::detach_columns(): Move the assignment n_fields=0
to ha_innobase_inplace_ctx::clear_added_indexes().
We must have access to the columns when freeing the
adaptive hash index. Note: dict_table_t::v_cols[] will remain
valid. If virtual columns are dropped or added, the table
definition will be reloaded in ha_innobase::commit_inplace_alter_table().

buf_page_mtr_lock(): Drop a stale adaptive hash index if needed.

We will also reduce the number of btr_get_search_latch() calls
and enclose some more code inside #ifdef BTR_CUR_HASH_ADAPT
in order to benefit cmake -DWITH_INNODB_AHI=OFF.
2020-05-15 17:23:08 +03:00
Marko Mäkelä
7bcaa541aa Merge 10.4 into 10.5 2020-05-05 21:16:22 +03:00
Marko Mäkelä
2c3c851d2c Merge 10.3 into 10.4 2020-05-05 20:33:10 +03:00
Oleksandr Byelkin
7fb73ed143 Merge branch '10.2' into 10.3 2020-05-04 16:47:11 +02:00
Daniel Black
ba2061da52 MDEV-21595: innodb offset_t rename to rec_offs
thanks to:

perl -i -pe 's/\boffset_t\b/rec_offs/g' $(git grep -lw offset_t storage/innobase)
2020-04-29 12:02:47 +03:00
Marko Mäkelä
30c9833751 MDEV-22332: Assertion mtr_started == mtr.is_active() failed
Commit 5defdc382b
(which aimed to reduce sizeof(mtr_t) for non-debug builds)
replaced the ternary mtr_t::status with two debug-only bool
data members m_start, m_commit and inadvertently made the
(now debug-only) predicate mtr_t::is_active() wrongly hold
after mtr_t::commit().

mtr_t::is_active(): Evaluate both m_start and m_commit,
to be compatible with the old definition.

row_merge_read_clustered_index(): Correct a debug assertion.
2020-04-22 12:36:11 +03:00
Daniel Black
e8351934b6 Merge pull request #1221 from grooverdan/10.4-MDEV-18851-multiple-sized-large-page-support
MDEV-18851: multiple sized large page support (linux)
2020-04-02 23:54:08 +04:00
Marko Mäkelä
5203bc10f1 Merge 10.4 into 10.5 2020-03-21 11:37:10 +02:00
Marko Mäkelä
bd3c8f47cd Merge 10.3 into 10.4 2020-03-20 22:06:55 +02:00
Marko Mäkelä
44298e4dea Merge 10.2 into 10.3
Also, clean up the test innodb_gis.geometry a little further.
2020-03-20 18:12:17 +02:00
Marko Mäkelä
6960e9ed24 MDEV-21983: Crash on DROP/RENAME TABLE after DISCARD TABLESPACE
fil_delete_tablespace(): Remove the unused parameter drop_ahi,
and add the parameter if_exists=false. We want to suppress
error messages if we know that the tablespace has been discarded.

dict_table_rename_in_cache(): Pass the new parameter to
fil_delete_tablespace(), that is, do not complain about
missing tablespace if the tablespace has been discarded.

row_make_new_pathname(): Declare as static.

row_drop_table_for_mysql(): Tolerate !table->data_dir_path
when the tablespace has been discarded.

row_rename_table_for_mysql(): Skip part of the RENAME TABLE
when fil_space_get_first_path() returns NULL.
2020-03-19 14:23:47 +02:00
Marko Mäkelä
f224525204 MDEV-21907: InnoDB: Enable -Wconversion on clang and GCC
The -Wconversion in GCC seems to be stricter than in clang.
GCC at least since version 4.4.7 issues truncation warnings for
assignments to bitfields, while clang 10 appears to only issue
warnings when the sizes in bytes rounded to the nearest integer
powers of 2 are different.

Before GCC 10.0.0, -Wconversion required more casts and would not
allow some operations, such as x<<=1 or x+=1 on a data type that
is narrower than int.

GCC 5 (but not GCC 4, GCC 6, or any later version) is complaining
about x|=y even when x and y are compatible types that are narrower
than int.  Hence, we must rewrite some x|=y as
x=static_cast<byte>(x|y) or similar, or we must disable -Wconversion.

In GCC 6 and later, the warning for assigning wider to bitfields
that are narrower than 8, 16, or 32 bits can be suppressed by
applying a bitwise & with the exact bitmask of the bitfield.
For older GCC, we must disable -Wconversion for GCC 4 or 5 in such
cases.

The bitwise negation operator appears to promote short integers
to a wider type, and hence we must add explicit truncation casts
around them. Microsoft Visual C does not allow a static_cast to
truncate a constant, such as static_cast<byte>(1) truncating int.
Hence, we will use the constructor-style cast byte(~1) for such cases.

This has been tested at least with GCC 4.8.5, 5.4.0, 7.4.0, 9.2.1, 10.0.0,
clang 9.0.1, 10.0.0, and MSVC 14.22.27905 (Microsoft Visual Studio 2019)
on 64-bit and 32-bit targets (IA-32, AMD64, POWER 8, POWER 9, ARMv8).
2020-03-12 19:46:41 +02:00
Marko Mäkelä
574d8b2940 MDEV-21907: Fix most clang -Wconversion in InnoDB
Declare innodb_purge_threads as 4-byte integer (UINT)
instead of 4-or-8-byte (ULONG) and adjust the documentation string.
2020-03-11 08:29:48 +02:00
Sergei Golubchik
7af733a5a2 perfschema compilation, test and misc fixes 2020-03-10 19:24:23 +01:00
Marko Mäkelä
a4ab54d70f MDEV-14425 Cleanup: Use std::atomic for some log_sys members
Some fields were protected by log_sys.mutex, which adds quite some
overhead for readers. Some readers were submitting dirty reads.

log_t::lsn: Declare private and atomic. Add wrappers get_lsn()
and set_lsn() that will use relaxed memory access. Many accesses
to log_sys.lsn are still protected by log_sys.mutex; we avoid the
mutex for some readers.

log_t::flushed_to_disk_lsn: Declare private and atomic, and move
to the same cache line with log_t::lsn.

log_t::buf_free: Declare as size_t, and move to the same cache line
with log_t::lsn.

log_t::check_flush_or_checkpoint_: Declare private and atomic,
and move to the same cache line with log_t::lsn.

log_get_lsn(): Define as an alias of log_sys.get_lsn().

log_get_lsn_nowait(), log_peek_lsn(): Remove.

log_get_flush_lsn(): Define as an alias of log_sys.get_flush_lsn().

log_t::initiate_write(): Replaces log_buffer_sync_in_background().
2020-03-05 16:21:31 +02:00
Marko Mäkelä
37e7bde12a MDEV-14425 preparation: Remove log_t::append_on_checkpoint
Simplify the logging of ALTER TABLE operations, by making use of the
TRX_UNDO_RENAME_TABLE undo log record that was introduced in
commit 0bc36758ba.

commit_try_rebuild(): Invoke row_rename_table_for_mysql() and
actually rename the files before committing the transaction.

fil_mtr_rename_log(), commit_cache_rebuild(),
log_append_on_checkpoint(), row_merge_rename_tables_dict(): Remove.

mtr_buf_copy_t, log_t::append_on_checkpoint: Remove.

row_rename_table_for_mysql(): If !use_fk, ignore missing foreign
keys. Remove a call to dict_table_rename_in_cache(), because
trx_rollback_to_savepoint() should invoke the function if needed.
2020-03-03 22:25:20 +02:00
Marko Mäkelä
fc2f2fa853 MDEV-19747: Deprecate and ignore innodb_log_optimize_ddl
During native table rebuild or index creation, InnoDB used to skip
redo logging and write MLOG_INDEX_LOAD records to inform crash recovery
and Mariabackup of the gaps in redo log. This is fragile and prohibits
some optimizations, such as skipping the doublewrite buffer for
newly (re)initialized pages (MDEV-19738).

row_merge_write_redo(): Remove. We do not write MLOG_INDEX_LOAD
records any more. Instead, we write full redo log.

FlushObserver: Remove.

fseg_free_page_func(): Remove the parameter log. Redo logging
cannot be disabled.

fil_space_t::redo_skipped_count: Remove.

We cannot remove buf_block_t::skip_flush_check, because PageBulk
will temporarily generate invalid B-tree pages in the buffer pool.
2020-02-11 18:44:26 +02:00
Marko Mäkelä
5defdc382b Cleanup: Remove mtr_state_t and mtr_t::m_state
mtr_t::is_active(), mtr_t::is_committed(): Make debug-only.
2020-01-29 14:28:45 +02:00
Marko Mäkelä
ded128aa9b Merge 10.4 into 10.5 2020-01-20 16:48:56 +02:00
Marko Mäkelä
87a61355e8 Merge 10.3 into 10.4
The MDEV-17062 fix in commit c4195305b2
was omitted.
2020-01-20 15:49:48 +02:00
Marko Mäkelä
6373ec3ec7 Merge 10.2 into 10.3 2020-01-18 16:56:16 +02:00
Marko Mäkelä
c3695b4058 MDEV-21511: Remove unnecessary code
Now that we will be invoking dtuple_get_n_ext() instead of
letting btr_push_update_extern_fields() update an already
calculated value, it is unnecessary to calculate the n_ext
upfront.

row_rec_to_index_entry(), row_rec_to_index_entry_low():
Remove the output parameter n_ext.
2020-01-17 14:27:29 +02:00
Eugene Kosov
496532b5c5 MDEV-20950: Fix 32-bit Windows build 2019-12-21 21:36:25 +02:00
Marko Mäkelä
28c89b7151 Merge 10.4 into 10.5 2019-12-16 07:47:17 +02:00
Marko Mäkelä
8fa759a576 Merge 10.3 into 10.4
We disable the MDEV-21189 test galera.galera_partition
because it times out.
2019-12-13 17:30:37 +02:00
Marko Mäkelä
3466b47b0d Merge 10.2 into 10.3 2019-12-13 10:08:57 +02:00
Eugene Kosov
f0aa073f2b MDEV-20950 Reduce size of record offsets
offset_t: this is a type which represents one record offset.
It's unsigned short int.

a lot of functions: replace ulint with offset_t

btr_pcur_restore_position_func(),
page_validate(),
row_ins_scan_sec_index_for_duplicate(),
row_upd_clust_rec_by_insert_inherit_func(),
row_vers_impl_x_locked_low(),
trx_undo_prev_version_build():
  allocate record offsets on the stack instead of waiting for rec_get_offsets()
  to allocate it from mem_heap_t. So, reducing  memory allocations.

RECORD_OFFSET, INDEX_OFFSET:
  now it's less convenient to store pointers in offset_t*
  array. One pointer occupies now several offset_t. And those constant are start
  indexes into array to places where to store pointer values

REC_OFFS_HEADER_SIZE: adjusted for the new reality

REC_OFFS_NORMAL_SIZE:
  increase size from 100 to 300 which means less heap allocations.
  And sizeof(offset_t[REC_OFFS_NORMAL_SIZE]) now is 600 bytes which
  is smaller than previous 800 bytes.

REC_OFFS_SEC_INDEX_SIZE: adjusted for the new reality

rem0rec.h, rem0rec.ic, rem0rec.cc:
  various arguments, return values and local variables types were changed to
  fix numerous integer conversions issues.

enum field_type_t:
  offset types concept was introduces which replaces old offset flags stuff.
  Like in earlier version, 2 upper bits are used to store offset type.
  And this enum represents those types.

REC_OFFS_SQL_NULL, REC_OFFS_MASK: removed

get_type(), set_type(), get_value(), combine():
  these are convenience functions to work with offsets and it's types

rec_offs_base()[0]:
  still uses an old scheme with flags REC_OFFS_COMPACT and REC_OFFS_EXTERNAL

rec_offs_base()[i]:
  these have type offset_t now. Two upper bits contains type.
2019-12-13 00:26:50 +07:00
Vladislav Vaintroub
5e62b6a5e0 MDEV-16264 Use threadpool for Innodb background work.
Almost all threads have gone
- the "ticking" threads, that sleep a while then do some work)
(srv_monitor_thread, srv_error_monitor_thread, srv_master_thread)
were replaced with timers. Some timers are periodic,
e.g the "master" timer.

- The btr_defragment_thread is also replaced by a timer , which
reschedules it self when current defragment "item" needs throttling

- the buf_resize_thread and buf_dump_threads are substitutes with tasks
Ditto with page cleaner workers.

- purge workers threads are not tasks as well, and purge cleaner
coordinator is a combination of a task and timer.

- All AIO is outsourced to tpool, Innodb just calls thread_pool::submit_io()
and provides the callback.

- The srv_slot_t was removed, and innodb_debug_sync used in purge
is currently not working, and needs reimplementation.
2019-11-15 18:09:30 +01:00
Marko Mäkelä
0117d0e65a Merge 10.4 into 10.5 2019-11-11 15:21:58 +02:00
Marko Mäkelä
3da895a736 Merge 10.3 into 10.4 2019-11-11 15:03:46 +02:00
Marko Mäkelä
4fcfdb60e7 Merge 10.2 into 10.3 2019-11-11 14:56:51 +02:00
Marko Mäkelä
29d67d051a Cleanup btr_page_get_prev(), btr_page_get_next()
Remove the redundant parameter mtr_t*.

Make use of page_has_prev(), page_has_next() whenever possible.
2019-11-11 13:36:21 +02:00
Oleksandr Byelkin
3ad37ed0eb Merge 10.4 into 10.5 2019-11-07 08:52:30 +01:00
Marko Mäkelä
ec40980ddd Merge 10.3 into 10.4 2019-11-01 15:23:18 +02:00
Oleksandr Byelkin
55b2281a5d Merge branch '10.2' into 10.3 2019-10-31 10:58:06 +01:00
Sergei Golubchik
c13519312b MDEV-20799 DROP Virtual Column crashes MariaDB
use the correct table for evaluating virtual columns in the
InnoDB ALTER TABLE.
2019-10-28 08:40:48 +01:00
Marko Mäkelä
b42294bc64 MDEV-19514 Defer change buffer merge until pages are requested
We will remove the InnoDB background operation of merging buffered
changes to secondary index leaf pages. Changes will only be merged as a
result of an operation that accesses a secondary index leaf page,
such as a SQL statement that performs a lookup via that index,
or is modifying the index. Also ROLLBACK and some background operations,
such as purging the history of committed transactions, or computing
index cardinality statistics, can cause change buffer merge.
Encryption key rotation will not perform change buffer merge.

The motivation of this change is to simplify the I/O logic and to
allow crash recovery to happen in the background (MDEV-14481).
We also hope that this will reduce the number of "mystery" crashes
due to corrupted data. Because change buffer merge will typically
take place as a result of executing SQL statements, there should be
a clearer connection between the crash and the SQL statements that
were executed when the server crashed.

In many cases, a slight performance improvement was observed.

This is joint work with Thirunarayanan Balathandayuthapani
and was tested by Axel Schwenke and Matthias Leich.

The InnoDB monitor counter innodb_ibuf_merge_usec will be removed.

On slow shutdown (innodb_fast_shutdown=0), we will continue to
merge all buffered changes (and purge all undo log history).

Two InnoDB configuration parameters will be changed as follows:

innodb_disable_background_merge: Removed.
This parameter existed only in debug builds.
All change buffer merges will use synchronous reads.

innodb_force_recovery will be changed as follows:
* innodb_force_recovery=4 will be the same as innodb_force_recovery=3
(the change buffer merge cannot be disabled; it can only happen as
a result of an operation that accesses a secondary index leaf page).
The option used to be capable of corrupting secondary index leaf pages.
Now that capability is removed, and innodb_force_recovery=4 becomes 'safe'.
* innodb_force_recovery=5 (which essentially hard-wires
SET GLOBAL TRANSACTION ISOLATION LEVEL READ UNCOMMITTED)
becomes safe to use. Bogus data can be returned to SQL, but
persistent InnoDB data files will not be corrupted further.
* innodb_force_recovery=6 (ignore the redo log files)
will be the only option that can potentially cause
persistent corruption of InnoDB data files.

Code changes:

buf_page_t::ibuf_exist: New flag, to indicate whether buffered
changes exist for a buffer pool page. Pages with pending changes
can be returned by buf_page_get_gen(). Previously, the changes
were always merged inside buf_page_get_gen() if needed.

ibuf_page_exists(const buf_page_t&): Check if a buffered changes
exist for an X-latched or read-fixed page.

buf_page_get_gen(): Add the parameter allow_ibuf_merge=false.
All callers that know that they may be accessing a secondary index
leaf page must pass this parameter as allow_ibuf_merge=true,
unless it does not matter for that caller whether all buffered
changes have been applied. Assert that whenever allow_ibuf_merge
holds, the page actually is a leaf page. Attempt change buffer
merge only to secondary B-tree index leaf pages.

btr_block_get(): Add parameter 'bool merge'.
All callers of btr_block_get() should know whether the page could be
a secondary index leaf page. If it is not, we should avoid consulting
the change buffer bitmap to even consider a merge. This is the main
interface to requesting index pages from the buffer pool.

ibuf_merge_or_delete_for_page(), recv_recover_page(): Replace
buf_page_get_known_nowait() with much simpler logic, because
it is now guaranteed that that the block is x-latched or read-fixed.

mlog_init_t::mark_ibuf_exist(): Renamed from mlog_init_t::ibuf_merge().
On crash recovery, we will no longer merge any buffered changes
for the pages that we read into the buffer pool during the last batch
of applying log records.

buf_page_get_gen_known_nowait(), BUF_MAKE_YOUNG, BUF_KEEP_OLD: Remove.

btr_search_guess_on_hash(): Merge buf_page_get_gen_known_nowait()
to its only remaining caller.

buf_page_make_young_if_needed(): Define as an inline function.
Add the parameter buf_pool.

buf_page_peek_if_young(), buf_page_peek_if_too_old(): Add the
parameter buf_pool.

fil_space_validate_for_mtr_commit(): Remove a bogus comment
about background merge of the change buffer.

btr_cur_open_at_rnd_pos_func(), btr_cur_search_to_nth_level_func(),
btr_cur_open_at_index_side_func(): Use narrower data types and scopes.

ibuf_read_merge_pages(): Replaces buf_read_ibuf_merge_pages().
Merge the change buffer by invoking buf_page_get_gen().
2019-10-11 17:28:15 +03:00
Marko Mäkelä
a340af9223 btr_block_get(): Remove redundant parameters 2019-09-25 16:08:48 +03:00
Marko Mäkelä
5d0bab47fc btr_block_get(), btr_block_get_func(): Change the parameter to
const dict_index_t&

btr_level_list_remove(): Clean up the parameters. Renamed from
btr_level_list_remove_func().
2019-09-25 13:34:49 +03:00
Alexander Barkov
c1599821a5 Merge remote-tracking branch 'origin/10.4' into 10.5 2019-08-13 23:49:10 +04:00
Marko Mäkelä
624dd71b94 Merge 10.4 into 10.5 2019-08-13 18:57:00 +03:00
Vladislav Vaintroub
f61a980686 Update WolfSSL, remove older workarounds. 2019-07-28 13:45:15 +02:00
Marko Mäkelä
e9c1701e11 Merge 10.3 into 10.4 2019-07-25 18:42:06 +03:00
Marko Mäkelä
fdef9f9b89 Merge 10.2 into 10.3 2019-07-25 15:31:11 +03:00
Marko Mäkelä
a7e9395f9d fts_sync_table(), fts_sync() dead code removal
fts_sync(): Remove the constant parameter has_dict=false.

fts_sync_table(): Remove the constant parameter has_dict=false,
and the redundant parameter unlock_cache = !wait.
Make wait=true the default parameter.
2019-07-25 13:34:36 +03:00
Marko Mäkelä
ef44ec4afa Merge 10.2 into 10.3 2019-07-19 12:31:56 +03:00
Eugene Kosov
9c29d06862 MDEV-20097 potential use-after-free
row_merge_read_clustered_index(): fix one more place with buf and merge_buf[i]
2019-07-19 11:42:08 +03:00