During the UPDATE of PRIMARY KEY columns, we may miscalculate the
size of the clustered index record.
row_upd_clust_rec_by_insert(): Pass the total number of off-page columns,
which may include such columns that were inherited from the record
and not created as part of the UPDATE operation.
This is based on
mysql/mysql-server@490c45e8c8
which is a follow-up to
mysql/mysql-server@1fa475b85d
which we filed and fixed as MDEV-21511.
No test case was provided by Oracle.
The function wsrep_on() was being called rather frequently
in InnoDB and XtraDB. Let us cache it in trx_t and invoke
trx_t::is_wsrep() instead.
innobase_trx_init(): Cache trx->wsrep = wsrep_on(thd).
ha_innobase::write_row(): Replace many repeated calls to current_thd,
and test the cheapest condition first.
There is no background change buffer merge any more.
Change buffer merge will only take place during a slow shutdown
(a shutdown initiated after SET GLOBAL innodb_fast_shutdown=0).
The -Wconversion in GCC seems to be stricter than in clang.
GCC at least since version 4.4.7 issues truncation warnings for
assignments to bitfields, while clang 10 appears to only issue
warnings when the sizes in bytes rounded to the nearest integer
powers of 2 are different.
Before GCC 10.0.0, -Wconversion required more casts and would not
allow some operations, such as x<<=1 or x+=1 on a data type that
is narrower than int.
GCC 5 (but not GCC 4, GCC 6, or any later version) is complaining
about x|=y even when x and y are compatible types that are narrower
than int. Hence, we must rewrite some x|=y as
x=static_cast<byte>(x|y) or similar, or we must disable -Wconversion.
In GCC 6 and later, the warning for assigning wider to bitfields
that are narrower than 8, 16, or 32 bits can be suppressed by
applying a bitwise & with the exact bitmask of the bitfield.
For older GCC, we must disable -Wconversion for GCC 4 or 5 in such
cases.
The bitwise negation operator appears to promote short integers
to a wider type, and hence we must add explicit truncation casts
around them. Microsoft Visual C does not allow a static_cast to
truncate a constant, such as static_cast<byte>(1) truncating int.
Hence, we will use the constructor-style cast byte(~1) for such cases.
This has been tested at least with GCC 4.8.5, 5.4.0, 7.4.0, 9.2.1, 10.0.0,
clang 9.0.1, 10.0.0, and MSVC 14.22.27905 (Microsoft Visual Studio 2019)
on 64-bit and 32-bit targets (IA-32, AMD64, POWER 8, POWER 9, ARMv8).
page_mem_alloc_free(), page_dir_set_n_heap(), page_ptr_set_direction():
Merge with the callers.
page_direction_reset(), page_direction_increment(),
page_zip_dir_insert(), page_zip_write_rec_ext(), page_zip_write_rec():
Add the parameter mtr, and write log.
PageBulk::insert(), PageBulk::finish(): Write log for all changes.
page_cur_rec_insert(), page_cur_insert_rec_write_log(),
page_cur_insert_rec_write_log(): Remove.
page_rec_set_next(), page_header_set_field(), page_header_set_ptr():
Remove. Use lower-level operations with or without logging.
page_zip_dir_add_slot(): Move to the same compilation unit with
its only caller, page_cur_insert_rec_zip().
page_cur_insert_rec_zip(): Mark pieces of code that must be skipped
once this task is completed.
btr_defragment_chunk(): Before starting a mini-transaction that
is writing (a lot), invoke log_free_check(). This should allow
the test innodb.innodb_defrag_concurrent to pass with the
mtr default_mysqld.cnf setting of innodb_log_file_size=10M.
MLOG_BUF_MARGIN: Remove.
btr_cur_upd_rec_sys(): Replaces row_upd_rec_sys_fields() and
implements redo logging.
row_upd_rec_sys_fields_in_recovery(): Remove, and merge to the
only remaining caller btr_cur_parse_update_in_place().
btr_cur_del_mark_set_clust_rec_log(),
btr_cur_del_mark_set_sec_rec_log(),
btr_cur_set_deleted_flag_for_ibuf():
Remove, and replace with btr_rec_set_deleted<bool>().
page_zip_rec_set_deleted(): Add the parameter mtr, and write a
MLOG_ZIP_WRITE_STRING record to the log.
Now that we will be invoking dtuple_get_n_ext() instead of
letting btr_push_update_extern_fields() update an already
calculated value, it is unnecessary to calculate the n_ext
upfront.
row_rec_to_index_entry(), row_rec_to_index_entry_low():
Remove the output parameter n_ext.
offset_t: this is a type which represents one record offset.
It's unsigned short int.
a lot of functions: replace ulint with offset_t
btr_pcur_restore_position_func(),
page_validate(),
row_ins_scan_sec_index_for_duplicate(),
row_upd_clust_rec_by_insert_inherit_func(),
row_vers_impl_x_locked_low(),
trx_undo_prev_version_build():
allocate record offsets on the stack instead of waiting for rec_get_offsets()
to allocate it from mem_heap_t. So, reducing memory allocations.
RECORD_OFFSET, INDEX_OFFSET:
now it's less convenient to store pointers in offset_t*
array. One pointer occupies now several offset_t. And those constant are start
indexes into array to places where to store pointer values
REC_OFFS_HEADER_SIZE: adjusted for the new reality
REC_OFFS_NORMAL_SIZE:
increase size from 100 to 300 which means less heap allocations.
And sizeof(offset_t[REC_OFFS_NORMAL_SIZE]) now is 600 bytes which
is smaller than previous 800 bytes.
REC_OFFS_SEC_INDEX_SIZE: adjusted for the new reality
rem0rec.h, rem0rec.ic, rem0rec.cc:
various arguments, return values and local variables types were changed to
fix numerous integer conversions issues.
enum field_type_t:
offset types concept was introduces which replaces old offset flags stuff.
Like in earlier version, 2 upper bits are used to store offset type.
And this enum represents those types.
REC_OFFS_SQL_NULL, REC_OFFS_MASK: removed
get_type(), set_type(), get_value(), combine():
these are convenience functions to work with offsets and it's types
rec_offs_base()[0]:
still uses an old scheme with flags REC_OFFS_COMPACT and REC_OFFS_EXTERNAL
rec_offs_base()[i]:
these have type offset_t now. Two upper bits contains type.
Passing buf_block_t helps us avoid calling
mlog_write_initial_log_record_fast() and page_get_page_no(),
and allows us to implement more debug checks, such as
that on ROW_FORMAT=COMPRESSED index pages, only the page header
may be modified by MLOG_MEMSET records.
fseg_n_reserved_pages(): Add a buf_block_t parameter.
mtr_t::write(): Replaces mlog_write_ulint(), mlog_write_ull().
Optimize away writes if the page contents does not change,
except when a dummy write has been explicitly requested.
Because the member function template takes a block descriptor as a
parameter, it is possible to introduce better consistency checks.
Due to this, the code for handling file-based lists, undo logs
and user transactions was refactored to pass around buf_block_t.
buf_read_ibuf_merge_pages(): Discard any page numbers that are
outside the current bounds of the tablespace, by invoking the
function ibuf_delete_recs() that was introduced in MDEV-20934.
This could avoid an infinite change buffer merge loop on
innodb_fast_shutdown=0, because normally the change buffer merge
would only be attempted if a page was successfully loaded into
the buffer pool.
dict_drop_index_tree(): Add the parameter trx_t*.
To prevent the DROP TABLE crash, do not invoke btr_free_if_exists()
if the entire .ibd file will be dropped. Thus, we will avoid a crash
if the BTR_SEG_LEAF or BTR_SEG_TOP of the index is corrupted,
and we will also avoid unnecessarily accessing the to-be-dropped
tablespace via the buffer pool.
In MariaDB 10.2, we disable the DROP TABLE fix if innodb_safe_truncate=0,
because the backup-unsafe MySQL 5.7 WL#6501 form of TRUNCATE TABLE
requires that the individual pages be freed inside the tablespace.
Apart from page latches (buf_block_t::lock), mini-transactions
are keeping track of at most one dict_index_t::lock and
fil_space_t::latch at a time, and in a rare case, purge_sys.latch.
Let us introduce interfaces for acquiring an index latch
or a tablespace latch.
In a later version, we may want to introduce mtr_t members
for holding a latched dict_index_t* and fil_space_t*,
and replace the remaining use of mtr_t::m_memo
with std::set<buf_block_t*> or with a map<buf_block_t*,byte*>
pointing to log records.
Lock wait can happen on secondary index when doing FK checks for wsrep.
We should just return error to upper layer and applier will retry
operation when needed.
MDEV-16210 original case was wrongly allowed versioned DELETE from
referenced table where reference is by non-primary key. InnoDB UPDATE
has optimization for new rows not changing its clustered index
position. In this case InnoDB doesn't update all secondary indexes and
misses the one holding the referenced key. The fix was to disable this
optimization for versioned DELETE. In case of versioned DELETE we
forcely update all secondary indexes and therefore check them for
constraints.
But the above fix raised another problem with versioned DELETE on
foreign table side. In case when there was no corresponding record in
referenced table (illegal foreign reference can be done with "set
foreign_key_checks=off") there was spurious constraint check (because
versioned DELETE is actually UPDATE) and hence the operation failed
with constraint error.
MDEV-16210 tried to fix the above problem by checking foreign table
instead of referenced table and that at least was illegal.
Constraint check is done by row_ins_check_foreign_constraint() no
matter what kind of table is checked, referenced or foreign
(controlled by check_ref argument).
Referenced table is checked by row_upd_check_references_constraints().
Foreign table is checked by row_ins_check_foreign_constraints().
Current fix rolls back the wrong fix for the above problem and
disables referenced table check for DELETE on foreign side by
introducing `check_foreign` argument which when set to *false* skips
row_ins_check_foreign_constraints() call.
Constraint check is done on secondary index update.
F.ex. DELETE does row_upd_sec_index_entry() and checks constraints in
row_upd_check_references_constraints(). UPDATE is optimized for the
case when order is not changed (node->cmpl_info & UPD_NODE_NO_ORD_CHANGE)
and doesn't do row_upd_sec_index_entry(), so it doesn't check constraints.
Since for versioned DELETE we do UPDATE actually, but expect behaviour
of DELETE in terms of constraints, we should deny this optimization to
get constraints checked.
Fix wrong referenced table check when versioned DELETE inserts history in parent
table. Set check_ref to false in this case.
Removed unused dup_chk_only argument for row_ins_sec_index_entry() and
added check_ref argument.
MDEV-18057 fix was superseded by this fix and reverted.
foreign.test:
All key_type combinations: pk, unique, sec(ondary).
row_upd_build_difference_binary(): Correctly handle the
case where columns (or clustered index fields) have been added
since the 'entry' was originally created. In this case,
the update vector must replace any missing columns with the
default values of the instantly added columns.