Some fixes related to commit f838b2d7998f18ac2a1bb9d56081aac6e563de1e and
Rows_log_event::do_apply_event() and Update_rows_log_event::do_exec_row()
for system-versioned tables were provided by Nikita Malyavin.
This was required by test versioning.rpl,trx_id,row.
Problem:
========
- InnoDB reads the length of the variable length field wrongly
while applying the modification log of instant table.
Solution:
========
rec_init_offsets_comp_ordinary(): For the temporary instant
file record, InnoDB should read the length of the variable length
field from the record itself.
In any test that uses wait_all_purged.inc, ensure that InnoDB tables
will be created without persistent statistics.
This is a follow-up to commit cd04673a177d40f7c409284d87ead851ec775c36
after a similar failure was observed in the innodb_zip.blob test.
The motivation of introducing the parameter
innodb_purge_rseg_truncate_frequency in
mysql/mysql-server@28bbd66ea5 and
mysql/mysql-server@8fc2120fed
seems to have been to avoid stalls due to freeing undo log pages
or truncating undo log tablespaces. In MariaDB Server,
innodb_undo_log_truncate=ON should be a much lighter operation
than in MySQL, because it will not involve any log checkpoint.
Another source of performance stalls should be
trx_purge_truncate_rseg_history(), which is shrinking the history list
by freeing the undo log pages whose undo records have been purged.
To alleviate that, we will introduce a purge_truncation_task that will
offload this from the purge_coordinator_task. In that way, the next
innodb_purge_batch_size pages may be parsed and purged while the pages
from the previous batch are being freed and the history list being shrunk.
The processing of innodb_undo_log_truncate=ON will still remain the
responsibility of the purge_coordinator_task.
purge_coordinator_state::count: Remove. We will ignore
innodb_purge_rseg_truncate_frequency, and act as if it had been
set to 1 (the maximum shrinking frequency).
purge_coordinator_state::do_purge(): Invoke an asynchronous task
purge_truncation_callback() to free the undo log pages.
purge_sys_t::iterator::free_history(): Free those undo log pages
that have been processed. This used to be a part of
trx_purge_truncate_history().
purge_sys_t::clone_end_view(): Take a new value of purge_sys.head
as a parameter, so that it will be updated while holding exclusive
purge_sys.latch. This is needed for race-free access to the field
in purge_truncation_callback().
Reviewed by: Vladislav Lesin
The MDEV-29693 conflict resolution is from Monty, as well as is
a bug fix where ANALYZE TABLE wrongly built histograms for
single-column PRIMARY KEY.
Also includes a fix for safe_malloc error reporting.
Other things:
- Copied main.log_slow from 10.4 to avoid mtr issue
Disabled test:
- spider/bugfix.mdev_27239 because we started to get
+Error 1429 Unable to connect to foreign data source: localhost
-Error 1158 Got an error reading communication packets
- main.delayed
- Bug#54332 Deadlock with two connections doing LOCK TABLE+INSERT DELAYED
This part is disabled for now as it fails randomly with different
warnings/errors (no corruption).
trx_t::commit_empty(): A special case of transaction "commit" when
the transaction was actually rolled back or the persistent undo log
is empty. In this case, we need to change the undo log header state to
TRX_UNDO_CACHED and move the undo log from rseg->undo_list to
rseg->undo_cached for fast reuse. Furthermore, unless this is the only
undo log record in the page, we will remove the record and rewind
TRX_UNDO_PAGE_START, TRX_UNDO_PAGE_FREE, TRX_UNDO_LAST_LOG.
We must also ensure that the system-wide transaction identifier
will be persisted up to this->id, so that there will not be warnings or
errors due to a PAGE_MAX_TRX_ID being too large. We might have modified
secondary index pages before being rolled back, and any changes of
PAGE_MAX_TRX_ID are never rolled back.
Even though it is not going to be written persistently anywhere,
we will invoke trx_sys.assign_new_trx_no(this), so that in the test
innodb.instant_alter everything will be purged as expected.
trx_t::write_serialisation_history(): Renamed from
trx_write_serialisation_history(). If there is no undo log,
invoke commit_empty().
trx_purge_add_undo_to_history(): Simplify an assertion and remove a
comment. This function will not be invoked on an empty undo log anymore.
trx_undo_header_create(): Add a debug assertion.
trx_undo_mem_create_at_db_start(): Remove a duplicated assignment.
Reviewed by: Vladislav Lesin
Tested by: Matthias Leich
purge_sys_t::sees(): Wrapper for view.sees().
trx_purge_truncate_history(): Invoke purge_sys.sees() instead of
comparing to head.trx_no, to determine if undo pages can be safely freed.
The test innodb.cursor-restore-locking was adjusted by Vladislav Lesin,
as was the the debug instrumentation in row_purge_del_mark().
Reviewed by: Vladislav Lesin
The deprecated parameters will be removed:
innodb_defragment
innodb_defragment_n_pages
innodb_defragment_stats_accuracy
innodb_defragment_fill_factor_n_recs
innodb_defragment_fill_factor
innodb_defragment_frequency
The mysql.innodb_index_stats.stat_name values 'n_page_split' and
'n_pages_freed' will lose their special meaning.
The related changes to OPTIMIZE TABLE in InnoDB will be removed as well.
The parameter innodb_optimize_fulltext_only will retain its special
meaning in OPTIMIZE TABLE.
Tested by: Matthias Leich
There is a little used option innodb_defragment that would make
OPTIMIZE TABLE not rebuild the table as usual for InnoDB, but
instead cause the index B-trees to be optimized in place.
This option uses excessive locking (exclusively locking index trees).
It never covered SPATIAL INDEX or FULLTEXT INDEX. Storage space
was never reclaimed.
Because this option is not particularly useful and causes a
maintenance burden (most recently in
commit de4030e4d49805a7ded5c0bfee01cc3fd7623522),
it is best to deprecate it, to prepare for its removal.
- InnoDB DDL results in `Duplicate entry' if concurrent DML throws
duplicate key error. The following scenario explains the problem
connection con1:
ALTER TABLE t1 FORCE;
connection con2:
INSERT INTO t1(pk, uk) VALUES (2, 2), (3, 2);
In connection con2, InnoDB throws the 'DUPLICATE KEY' error because
of unique index. Alter operation will throw the error when applying
the concurrent DML log.
- Inserting the duplicate key for unique index logs the insert
operation for online ALTER TABLE. When insertion fails,
transaction does rollback and it leads to logging of
delete operation for online ALTER TABLE.
While applying the insert log entries, alter operation
encounters 'DUPLICATE KEY' error.
- To avoid the above fake duplicate scenario, InnoDB should
not write any log for online ALTER TABLE before DML transaction
commit.
- User thread which does DML can apply the online log if
InnoDB ran out of online log and index is marked as completed.
Set online log error if apply phase encountered any error.
It can also clear all other indexes log, marks the newly
added indexes as corrupted.
- Removed the old online code which was a part of DML operations
commit_inplace_alter_table() : Does apply the online log
for the last batch of secondary index log and does frees
the log for the completed index.
trx_t::apply_online_log: Set to true while writing the undo
log if the modified table has active DDL
trx_t::apply_log(): Apply the DML changes to online DDL tables
dict_table_t::is_active_ddl(): Returns true if the table
has an active DDL
dict_index_t::online_log_make_dummy(): Assign dummy value
for clustered index online log to indicate the secondary
indexes are being rebuild.
dict_index_t::online_log_is_dummy(): Check whether the online
log has dummy value
ha_innobase_inplace_ctx::log_failure(): Handle the apply log
failure for online DDL transaction
row_log_mark_other_online_index_abort(): Clear out all other
online index log after encountering the error during
row_log_apply()
row_log_get_error(): Get the error happened during row_log_apply()
row_log_online_op(): Does apply the online log if index is
completed and ran out of memory. Returns false if apply log fails
UndorecApplier: Introduced a class to maintain the undo log
record, latched undo buffer page, parse the undo log record,
maintain the undo record type, info bits and update vector
UndorecApplier::get_old_rec(): Get the correct version of the
clustered index record that was modified by the current undo
log record
UndorecApplier::clear_undo_rec(): Clear the undo log related
information after applying the undo log record
UndorecApplier::log_update(): Handle the update, delete undo
log and apply it on online indexes
UndorecApplier::log_insert(): Handle the insert undo log
and apply it on online indexes
UndorecApplier::is_same(): Check whether the given roll pointer
is generated by the current undo log record information
trx_t::rollback_low(): Set apply_online_log for the transaction
after partially rollbacked transaction has any active DDL
prepare_inplace_alter_table_dict(): After allocating the online
log, InnoDB does create fulltext common tables. Fulltext index
doesn't allow the index to be online. So removed the dead
code of online log removal
Thanks to Marko Mäkelä for providing the initial prototype and
Matthias Leich for testing the issue patiently.
- Server incorrectly downgrading the MDL after prepare phase when
table is empty. mdl_exclusive_after_prepare is being set in
prepare phase only. But mdl_exclusive_after_prepare condition was
misplaced and checked before prepare phase by
commit d270525dfde86bcb92a2327234a0954083e14a94 and it is now
changed to check after prepare phase.
- main.innodb_mysql_sync test case was changed to avoid locking
optimization when table is empty.
MDEV-23805 simplified the treatment of empty tables during ALTER TABLE,
which could prevent the scenarios that were previously reported and
fixed as MDEV-16131 and MDEV-24730.
With the MDEV-23805 fix, the statement
SET DEBUG_SYNC = 'now WAIT_FOR copied';
could occasionally time out, depending on timing.
Apparently, there was a race condition where purge could resume
(and empty the table) before ALTER TABLE got the chance to execute.
We must prevent the purge of history from running before
ALTER TABLE has started executing.
- In ha_innobase::prepare_inplace_alter_table(), InnoDB should
check whether the table is empty. If the table is empty then
server should avoid downgrading the MDL after prepare phase.
It is more like instant alter, does change only in dicationary
and metadata.
- Changed few debug test case to make non-empty DDL table
In other ROW_FORMAT than REDUNDANT, the InnoDB record header
size calculation depends on dict_index_t::n_core_null_bytes.
In ROW_FORMAT=REDUNDANT, the record header always is 6 bytes
plus n_fields or 2*n_fields bytes, depending on the maximum
record size. But, during online ALTER TABLE, the log records
in the temporary file always use a format similar to
ROW_FORMAT=DYNAMIC, even omitting the 5-byte fixed-length part
of the header.
While creating a temporary file record for a ROW_FORMAT=REDUNDANT
table, InnoDB must refer to dict_index_t::n_nullable.
The field dict_index_t::n_core_null_bytes is only valid for
other than ROW_FORMAT=REDUNDANT tables.
The bug does not affect MariaDB 10.3, because only
commit 7a27db778e3e5a04271568a94c75157bb6fb48f1 (MDEV-15563)
allowed an ALGORITHM=INSTANT change of a NOT NULL column to
NULL in a ROW_FORMAT=REDUNDANT table.
The fix was developed by Thirunarayanan Balathandayuthapani
and tested by Matthias Leich. The test case was simplified by me.
Between btr_pcur_store_position() and btr_pcur_restore_position()
it is possible that purge empties a table and enlarges
index->n_core_fields and index->n_core_null_bytes.
Therefore, we must cache index->n_core_fields in
btr_pcur_t::old_n_core_fields so that btr_pcur_t::old_rec can be
parsed correctly.
Unfortunately, this is a huge change, because we will replace
"bool leaf" parameters with "ulint n_core"
(passing index->n_core_fields, or 0 for non-leaf pages).
For special cases where we know that index->is_instant() cannot hold,
we may also pass index->n_fields.
In commit eaeb8ec4b87882711ecb8e1c7476a6e410d5d2a9 (MDEV-24653)
an incorrect debug assertion was introduced.
btr_pcur_store_position(): If the only record in the page is the
instant ALTER TABLE metadata record, we cannot expect there to be
a successor page. The situation could be improved by MDEV-24673 later.
Online log for insert operation of redundant table fails with
index->is_instant() assert. Purge can reset the n_core_fields when
alter is waiting to upgrade MDL for commit phase of DDL. In the
meantime, any insert DML tries to log the operation fails with
index is not being instant.
row_log_get_n_core_fields(): Get the n_core_fields of online log
for the given index.
rec_get_converted_size_comp_prefix_low(): Use n_core_fields of online
log when InnoDB calculates the size of data tuple during redundant
row format table rebuild.
rec_convert_dtuple_to_rec_comp(): Use n_core_fields of online log
when InnoDB does the conversion of data tuple to record during
redudant row format table rebuild.
- Adding the test case which has more than 129 instant columns.
We may end up with an empty leaf page (containing only an ADD COLUMN
metadata record) that is not the root page.
innobase_add_instant_try(): Disable an optimization for a non-canonical
empty table that contains a metadata record somewhere else than in
the root page.
btr_pcur_store_position(): Tolerate a non-canonical empty table.