MDEV-11415 Remove excessive undo logging during ALTER TABLE…ALGORITHM=COPY
Move a test from innodb.rename_table_debug to innodb.alter_copy.
ha_innobase::extra(HA_EXTRA_BEGIN_ALTER_COPY): Register id-versioned
tables so that mysql.transaction_registry will be updated, even for
empty tables that are subjected to ALTER TABLE…ALGORITHM=COPY.
If a crash occurs during ALTER TABLE…ALGORITHM=COPY, InnoDB would spend
a lot of time rolling back writes to the intermediate copy of the table.
To reduce the amount of busy work done, a work-around was introduced in
commit fd069e2bb3 in MySQL 4.1.8 and 5.0.2,
to commit the transaction after every 10,000 inserted rows.
A proper fix would have been to disable the undo logging altogether and
to simply drop the intermediate copy of the table on subsequent server
startup. This is what happens in MariaDB 10.3 with MDEV-14717,MDEV-14585.
In MariaDB 10.2, the intermediate copy of the table would be left behind
with a name starting with the string #sql.
This is a backport of a bug fix from MySQL 8.0.0 to MariaDB,
contributed by jixianliang <271365745@qq.com>.
Unlike recent MySQL, MariaDB supports ALTER IGNORE. For that operation
InnoDB must for now keep the undo logging enabled, so that the latest
row can be rolled back in case of an error.
In Galera cluster, the LOAD DATA statement will retain the existing
behaviour and commit the transaction after every 10,000 rows if
the parameter wsrep_load_data_splitting=ON is set. The logic to do
so (the wsrep_load_data_split() function and the call
handler::extra(HA_EXTRA_FAKE_START_STMT)) are joint work
by Ji Xianliang and Marko Mäkelä.
The original fix:
Author: Thirunarayanan Balathandayuthapani <thirunarayanan.balathandayuth@oracle.com>
Date: Wed Dec 2 16:09:15 2015 +0530
Bug#17479594 AVOID INTERMEDIATE COMMIT WHILE DOING ALTER TABLE ALGORITHM=COPY
Problem:
During ALTER TABLE, we commit and restart the transaction for every
10,000 rows, so that the rollback after recovery would not take so long.
Fix:
Suppress the undo logging during copy alter operation. If fts_index is
present then insert directly into fts auxiliary table rather
than doing at commit time.
ha_innobase::num_write_row: Remove the variable.
ha_innobase::write_row(): Remove the hack for committing every 10000 rows.
row_lock_table_for_mysql(): Remove the extra 2 parameters.
lock_get_src_table(), lock_is_table_exclusive(): Remove.
Reviewed-by: Marko Mäkelä <marko.makela@oracle.com>
Reviewed-by: Shaohua Wang <shaohua.wang@oracle.com>
Reviewed-by: Jon Olav Hauglid <jon.hauglid@oracle.com>
- Galera tests that was not updated with connection change
messages
- Test where out of memory error was changed (We are now using the
standard out of memory error in most places)
- Removed tokudb tests that uses include files that doesn't exist
in MariaDB
- Removed not supported mariadb startup option from option file
To disable debug instrumentation, save and restore the original value
of the variable DEBUG_DBUG. Assigning -d,... will enable the output of
a lot of unrelated DBUG messages to the server error log.
To disable debug instrumentation, save and restore the original value
of the variable DEBUG_DBUG. Assigning -d,... will enable the output of
a lot of unrelated DBUG messages to the server error log.
InnoDB limited the maximum number of bytes per character to 4.
But, the filename character set that was introduced in MySQL 5.1
uses up to 5 bytes per character.
To allow InnoDB tables to be created with wider characters, let
us split the mbminmaxlen fields into mbminlen, mbmaxlen, and increase
the limit to 7 bytes per character. This will increase the payload size
of dtype_t and dict_col_t by one bit. The storage size will be unchanged
(54 bits and 77 bits will use the same number of bytes as the
previous sizes 53 and 76 bits).
MDEV-14511 tried to avoid some consistency problems related to InnoDB
persistent statistics. The persistent statistics are being written by
an InnoDB internal SQL interpreter that requires the InnoDB data dictionary
cache to be locked.
Before MDEV-14511, the statistics were written during DDL in separate
transactions, which could unnecessarily reduce performance (each commit
would require a redo log flush) and break atomicity, because the statistics
would be updated separately from the dictionary transaction.
However, because it is unacceptable to hold the InnoDB data dictionary
cache locked while suspending the execution for waiting for a
transactional lock (in the mysql.innodb_index_stats or
mysql.innodb_table_stats tables) to be released, any lock conflict
was immediately be reported as "lock wait timeout".
To fix MDEV-14941, an attempt to reduce these lock conflicts by acquiring
transactional locks on the user tables in both the statistics and DDL
operations was made, but it would still not entirely prevent lock conflicts
on the mysql.innodb_index_stats and mysql.innodb_table_stats tables.
Fixing the remaining problems would require a change that is too intrusive
for a GA release series, such as MariaDB 10.2.
Thefefore, we revert the change MDEV-14511. To silence the
MDEV-13201 assertion, we use the pre-existing flag trx_t::internal.
If InnoDB is killed while ALTER TABLE...ALGORITHM=COPY is in progress,
after recovery there could be undo log records some records that were
inserted into an intermediate copy of the table. Due to these undo log
records, InnoDB would resurrect locks at recovery, and the intermediate
table would be locked while we are trying to drop it. This would cause
a call to row_rename_table_for_mysql(), either from
row_mysql_drop_garbage_tables() or from the rollback of a RENAME
operation that was part of the ALTER TABLE.
row_rename_table_for_mysql(): Do not attempt to parse FOREIGN KEY
constraints when renaming from #sql-something to #sql-something-else,
because it does not make any sense.
row_drop_table_for_mysql(): When deferring DROP TABLE due to locks,
do not rename the table if its name already starts with the #sql-
prefix, which is what row_mysql_drop_garbage_tables() uses.
Previously, the too strict prefix #sql-ib was used, and some
tables were renamed unnecessarily.
innodb.truncate_inject: Replacement for innodb_zip.wl6501_error_1
Note: unlike MySQL, in some cases TRUNCATE does not return
an error in MariaDB. This should be fixed in the scope of
MDEV-13564 or similar.
recv_log_recover_10_3(): Determine if a log from MariaDB 10.3 is clean.
recv_find_max_checkpoint(): Allow startup with a clean 10.3 redo log.
srv_prepare_to_delete_redo_log_files(): When starting up with a 10.3 log,
display a "Downgrading redo log" message instead of "Upgrading".
The XtraDB option innodb_track_changed_pages causes
the function log_group_read_log_seg() to be invoked
even when recv_sys==NULL, leading to the SIGSEGV.
This regression was caused by
MDEV-11027 InnoDB log recovery is too noisy
This bug affected tables where the PRIMARY KEY contains variable-length
columns, and ROW_FORMAT is COMPACT or DYNAMIC.
rec_init_offsets_comp_ordinary(): Do not short-cut the parsing
of the record header for records that contain explicit values
for instantly added columns.
rec_copy_prefix_to_buf(): Copy more header for records that
contain explicit values for instantly added columns.
innodb/buf_LRU_get_free_block
Add debug instrumentation to produce error message about
no free pages. Print error message only once and do not
enable innodb monitor.
xtradb/buf_LRU_get_free_block
Add debug instrumentation to produce error message about
no free pages. Print error message only once and do not
enable innodb monitor. Remove code that does not seem to
be used.
innodb-lru-force-no-free-page.test
New test case to force produce desired error message.
dict_foreign_find_index(): Ignore incompletely created indexes.
After a failed ADD UNIQUE INDEX, an incompletely created index
could be left behind until the next ALTER TABLE statement.
There is a race condition on a test. con1 is the older transaction
as it query started wait first. Test continues so that con1 gets
lock wait timeout first. There is possibility that default connection
gets lock timeout also or as con1 is rolled back it gets the locks
it waited and does the update. Fixed by removing query outputs as
they could vary and accepting success from default connection
query.
While the redo log format was changed in MariaDB 10.3.2 and 10.3.3
due to MDEV-12288 and MDEV-11369, it should be technically possible
to upgrade from a crashed MariaDB 10.2 instance.
On a related note, it should be possible for Mariabackup 10.3
to create a backup from a running MariaDB Server 10.2.
mlog_id_t: Put back the 10.2 specific redo log record types
MLOG_UNDO_INSERT, MLOG_UNDO_ERASE_END, MLOG_UNDO_INIT,
MLOG_UNDO_HDR_REUSE.
trx_undo_parse_add_undo_rec(): Parse or apply MLOG_UNDO_INSERT.
trx_undo_erase_page_end(): Apply MLOG_UNDO_ERASE_END.
trx_undo_parse_page_init(): Parse or apply MLOG_UNDO_INIT.
trx_undo_parse_page_header_reuse(): Parse or apply MLOG_UNDO_HDR_REUSE.
recv_log_recover_10_2(): Remove. Always parse the redo log from 10.2.
recv_find_max_checkpoint(), recv_recovery_from_checkpoint_start():
Always parse the redo log from MariaDB 10.2.
recv_parse_or_apply_log_rec_body(): Parse or apply
MLOG_UNDO_INSERT, MLOG_UNDO_ERASE_END, MLOG_UNDO_INIT.
srv_prepare_to_delete_redo_log_files(),
innobase_start_or_create_for_mysql(): Upgrade from a previous (supported)
redo log format.
trx_undo_page_report_modify(): For SPATIAL INDEX, keep logging
updated off-page columns twice, so that
the minimum bounding rectangle (MBR) will be logged.
Avoiding the redundant logging would require larger changes
to the undo log format.
row_build_index_entry_low(): Handle SPATIAL_UNKNOWN more robustly,
by refusing to purge the record from the spatial index.
We can get this code when processing old undo log from 10.2.10 or
10.2.11 (the releases affected by MDEV-14799, which was a regression
from MDEV-14051).