mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-07-04 01:23:45 +03:00

Author	SHA1	Message	Date
Marko Mäkelä	0936c13809	MDEV-33993 Possible server hang on DROP INDEX or RENAME INDEX commit_try_norebuild(): Add the parameter statistics_exist, similar to commit_try_rebuild(). If the InnoDB statistics tables did not exist, we will not attempt to update statistics later on during the transaction. Thanks to Matthias Leich for originally reproducing this scenario.	2024-04-25 13:44:10 +03:00
Thirunarayanan Balathandayuthapani	fc391df546	MDEV-28399 Assertion failure in dict_table_check_for_dup_indexes upon concurrent DML/DDL Problem: ======== InnoDB DDL fails to remove the newly added table or index from dictionary and index stub from the table cache if the alter transaction encounters DEADLOCK error in commit phase. Solution: ======== Restart the alter table transaction if it encounters DEADLOCK in commit phase. So that index stubs and index, table removal from dictionary can be done in rollback_inplace_alter_table() - Added one assert in rollback_inplace_alter_table() to indicate that the online log for the old table shouldn't exist.	2022-05-02 12:52:16 +05:30
Thirunarayanan Balathandayuthapani	4b80c11f52	MDEV-15250 UPSERT during ALTER TABLE results in 'Duplicate entry' error for alter - InnoDB DDL results in `Duplicate entry' if concurrent DML throws duplicate key error. The following scenario explains the problem connection con1: ALTER TABLE t1 FORCE; connection con2: INSERT INTO t1(pk, uk) VALUES (2, 2), (3, 2); In connection con2, InnoDB throws the 'DUPLICATE KEY' error because of unique index. Alter operation will throw the error when applying the concurrent DML log. - Inserting the duplicate key for unique index logs the insert operation for online ALTER TABLE. When insertion fails, transaction does rollback and it leads to logging of delete operation for online ALTER TABLE. While applying the insert log entries, alter operation encounters 'DUPLICATE KEY' error. - To avoid the above fake duplicate scenario, InnoDB should not write any log for online ALTER TABLE before DML transaction commit. - User thread which does DML can apply the online log if InnoDB ran out of online log and index is marked as completed. Set online log error if apply phase encountered any error. It can also clear all other indexes log, marks the newly added indexes as corrupted. - Removed the old online code which was a part of DML operations commit_inplace_alter_table() : Does apply the online log for the last batch of secondary index log and does frees the log for the completed index. trx_t::apply_online_log: Set to true while writing the undo log if the modified table has active DDL trx_t::apply_log(): Apply the DML changes to online DDL tables dict_table_t::is_active_ddl(): Returns true if the table has an active DDL dict_index_t::online_log_make_dummy(): Assign dummy value for clustered index online log to indicate the secondary indexes are being rebuild. dict_index_t::online_log_is_dummy(): Check whether the online log has dummy value ha_innobase_inplace_ctx::log_failure(): Handle the apply log failure for online DDL transaction row_log_mark_other_online_index_abort(): Clear out all other online index log after encountering the error during row_log_apply() row_log_get_error(): Get the error happened during row_log_apply() row_log_online_op(): Does apply the online log if index is completed and ran out of memory. Returns false if apply log fails UndorecApplier: Introduced a class to maintain the undo log record, latched undo buffer page, parse the undo log record, maintain the undo record type, info bits and update vector UndorecApplier::get_old_rec(): Get the correct version of the clustered index record that was modified by the current undo log record UndorecApplier::clear_undo_rec(): Clear the undo log related information after applying the undo log record UndorecApplier::log_update(): Handle the update, delete undo log and apply it on online indexes UndorecApplier::log_insert(): Handle the insert undo log and apply it on online indexes UndorecApplier::is_same(): Check whether the given roll pointer is generated by the current undo log record information trx_t::rollback_low(): Set apply_online_log for the transaction after partially rollbacked transaction has any active DDL prepare_inplace_alter_table_dict(): After allocating the online log, InnoDB does create fulltext common tables. Fulltext index doesn't allow the index to be online. So removed the dead code of online log removal Thanks to Marko Mäkelä for providing the initial prototype and Matthias Leich for testing the issue patiently.	2022-04-25 18:52:19 +05:30
Marko Mäkelä	58fe6b47d4	MDEV-26903: Assertion ctx->trx->state == TRX_STATE_ACTIVE on DROP INDEX rollback_inplace_alter_table(): Tolerate a case where the transaction is not in an active state. If ha_innobase::commit_inplace_alter_table() failed with a deadlock, the transaction would already have been rolled back. This omission of error handling was introduced in commit `1bd681c8b3` (MDEV-25506 part 3). After commit `c3c53926c4` (MDEV-26554) it became easier to trigger DB_DEADLOCK during exclusive table lock acquisition in ha_innobase::commit_inplace_alter_table(). lock_table_low(): Add DBUG injection "innodb_table_deadlock".	2021-10-26 09:54:37 +03:00
Marko Mäkelä	6e390a62ba	MDEV-26772 InnoDB DDL fails with DUPLICATE KEY error ha_innobase::delete_table(): When the table that is being dropped has a name starting with #sql, temporarily set innodb_lock_wait_timeout=0 while attempting to lock the persistent statistics tables. If the statistics tables cannot be locked, pretend that statistics did not exist and carry on with dropping the table. The SQL layer is not really prepared for failures of this operation. This is what fixes the test case. ha_innobase::rename_table(): When renaming a table from a name that starts with #sql, try to lock the statistics tables with an immediate timeout, and ignore the statistics if the locks were not available. In fact, during any rename from a #sql name, dict_stats_rename_table() should have no effect, because already when an earlier rename to a #sql name took place we should have deleted the statistics for the table using the non-#sql name. This change is just analogous to the ha_innobase::delete_table().	2021-10-19 19:54:29 +03:00
Marko Mäkelä	14f1453b35	MDEV-7318: Fix a test case	2020-05-22 19:56:55 +03:00
Aleksey Midenkov	193725b81e	MDEV-7318 RENAME INDEX This patch adds support of RENAME INDEX operation to the ALTER TABLE statement. Code which determines if ALTER TABLE can be done in-place for "simple" storage engines like MyISAM, Heap and etc. was updated to handle ALTER TABLE ... RENAME INDEX as an in-place operation. Support for in-place ALTER TABLE ... RENAME INDEX for InnoDB was covered by MDEV-13301. Syntax changes ============== A new type of <alter_specification> is added: <rename index clause> ::= RENAME ( INDEX \| KEY ) <oldname> TO <newname> Where <oldname> and <newname> are identifiers for old name and new name of the index. Semantic changes ================ The result of "ALTER TABLE t1 RENAME INDEX a TO b" is a table which contents and structure are identical to the old version of 't1' with the only exception index 'a' being called 'b'. Neither <oldname> nor <newname> can be "primary". The index being renamed should exist and its new name should not be occupied by another index on the same table. Related to: WL#6555, MDEV-13301	2020-03-03 13:50:33 +03:00
Sergei Golubchik	6bb11efa4a	Merge branch '10.2' into 10.3	2019-01-03 13:09:41 +01:00
Marko Mäkelä	68143c8905	MDEV-17470: Fix the test for --embedded	2018-12-29 10:57:26 +02:00
Eugene Kosov	c5a5eaa9a9	MDEV-17470 Orphan temporary files after interrupted ALTER cause InnoDB: Operating system error number 17 and eventual fatal error 71 Orphan #sql* tables may remain after ALTER TABLE was interrupted by timeout or KILL or client disconnect. This is a regression caused by MDEV-16515. Similar to temporary tables (MDEV-16647), we had better ignore the KILL when dropping the original table in the final part of ALTER TABLE. Closes #1020	2018-12-28 17:05:48 +02:00
Marko Mäkelä	df563e0c03	Merge 10.2 into 10.3 main.derived_cond_pushdown: Move all 10.3 tests to the end, trim trailing white space, and add an "End of 10.3 tests" marker. Add --sorted_result to tests where the ordering is not deterministic. main.win_percentile: Add --sorted_result to tests where the ordering is no longer deterministic.	2018-11-06 09:40:39 +02:00
Marko Mäkelä	30c3d6db32	MDEV-17533 Merge new release of InnoDB 5.6.42 to 10.0 Also, add a test for a bug that does not seem to affect MariaDB.	2018-10-25 13:05:23 +03:00
Marko Mäkelä	a4948dafcd	MDEV-11369 Instant ADD COLUMN for InnoDB For InnoDB tables, adding, dropping and reordering columns has required a rebuild of the table and all its indexes. Since MySQL 5.6 (and MariaDB 10.0) this has been supported online (LOCK=NONE), allowing concurrent modification of the tables. This work revises the InnoDB ROW_FORMAT=REDUNDANT, ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC so that columns can be appended instantaneously, with only minor changes performed to the table structure. The counter innodb_instant_alter_column in INFORMATION_SCHEMA.GLOBAL_STATUS is incremented whenever a table rebuild operation is converted into an instant ADD COLUMN operation. ROW_FORMAT=COMPRESSED tables will not support instant ADD COLUMN. Some usability limitations will be addressed in subsequent work: MDEV-13134 Introduce ALTER TABLE attributes ALGORITHM=NOCOPY and ALGORITHM=INSTANT MDEV-14016 Allow instant ADD COLUMN, ADD INDEX, LOCK=NONE The format of the clustered index (PRIMARY KEY) is changed as follows: (1) The FIL_PAGE_TYPE of the root page will be FIL_PAGE_TYPE_INSTANT, and a new field PAGE_INSTANT will contain the original number of fields in the clustered index ('core' fields). If instant ADD COLUMN has not been used or the table becomes empty, or the very first instant ADD COLUMN operation is rolled back, the fields PAGE_INSTANT and FIL_PAGE_TYPE will be reset to 0 and FIL_PAGE_INDEX. (2) A special 'default row' record is inserted into the leftmost leaf, between the page infimum and the first user record. This record is distinguished by the REC_INFO_MIN_REC_FLAG, and it is otherwise in the same format as records that contain values for the instantly added columns. This 'default row' always has the same number of fields as the clustered index according to the table definition. The values of 'core' fields are to be ignored. For other fields, the 'default row' will contain the default values as they were during the ALTER TABLE statement. (If the column default values are changed later, those values will only be stored in the .frm file. The 'default row' will contain the original evaluated values, which must be the same for every row.) The 'default row' must be completely hidden from higher-level access routines. Assertions have been added to ensure that no 'default row' is ever present in the adaptive hash index or in locked records. The 'default row' is never delete-marked. (3) In clustered index leaf page records, the number of fields must reside between the number of 'core' fields (dict_index_t::n_core_fields introduced in this work) and dict_index_t::n_fields. If the number of fields is less than dict_index_t::n_fields, the missing fields are replaced with the column value of the 'default row'. Note: The number of fields in the record may shrink if some of the last instantly added columns are updated to the value that is in the 'default row'. The function btr_cur_trim() implements this 'compression' on update and rollback; dtuple::trim() implements it on insert. (4) In ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC records, the new status value REC_STATUS_COLUMNS_ADDED will indicate the presence of a new record header that will encode n_fields-n_core_fields-1 in 1 or 2 bytes. (In ROW_FORMAT=REDUNDANT records, the record header always explicitly encodes the number of fields.) We introduce the undo log record type TRX_UNDO_INSERT_DEFAULT for covering the insert of the 'default row' record when instant ADD COLUMN is used for the first time. Subsequent instant ADD COLUMN can use TRX_UNDO_UPD_EXIST_REC. This is joint work with Vin Chen (陈福荣) from Tencent. The design that was discussed in April 2017 would not have allowed import or export of data files, because instead of the 'default row' it would have introduced a data dictionary table. The test rpl.rpl_alter_instant is exactly as contributed in pull request #408. The test innodb.instant_alter is based on a contributed test. The redo log record format changes for ROW_FORMAT=DYNAMIC and ROW_FORMAT=COMPACT are as contributed. (With this change present, crash recovery from MariaDB 10.3.1 will fail in spectacular ways!) Also the semantics of higher-level redo log records that modify the PAGE_INSTANT field is changed. The redo log format version identifier was already changed to LOG_HEADER_FORMAT_CURRENT=103 in MariaDB 10.3.1. Everything else has been rewritten by me. Thanks to Elena Stepanova, the code has been tested extensively. When rolling back an instant ADD COLUMN operation, we must empty the PAGE_FREE list after deleting or shortening the 'default row' record, by calling either btr_page_empty() or btr_page_reorganize(). We must know the size of each entry in the PAGE_FREE list. If rollback left a freed copy of the 'default row' in the PAGE_FREE list, we would be unable to determine its size (if it is in ROW_FORMAT=COMPACT or ROW_FORMAT=DYNAMIC) because it would contain more fields than the rolled-back definition of the clustered index. UNIV_SQL_DEFAULT: A new special constant that designates an instantly added column that is not present in the clustered index record. len_is_stored(): Check if a length is an actual length. There are two magic length values: UNIV_SQL_DEFAULT, UNIV_SQL_NULL. dict_col_t::def_val: The 'default row' value of the column. If the column is not added instantly, def_val.len will be UNIV_SQL_DEFAULT. dict_col_t: Add the accessors is_virtual(), is_nullable(), is_instant(), instant_value(). dict_col_t::remove_instant(): Remove the 'instant ADD' status of a column. dict_col_t::name(const dict_table_t& table): Replaces dict_table_get_col_name(). dict_index_t::n_core_fields: The original number of fields. For secondary indexes and if instant ADD COLUMN has not been used, this will be equal to dict_index_t::n_fields. dict_index_t::n_core_null_bytes: Number of bytes needed to represent the null flags; usually equal to UT_BITS_IN_BYTES(n_nullable). dict_index_t::NO_CORE_NULL_BYTES: Magic value signalling that n_core_null_bytes was not initialized yet from the clustered index root page. dict_index_t: Add the accessors is_instant(), is_clust(), get_n_nullable(), instant_field_value(). dict_index_t::instant_add_field(): Adjust clustered index metadata for instant ADD COLUMN. dict_index_t::remove_instant(): Remove the 'instant ADD' status of a clustered index when the table becomes empty, or the very first instant ADD COLUMN operation is rolled back. dict_table_t: Add the accessors is_instant(), is_temporary(), supports_instant(). dict_table_t::instant_add_column(): Adjust metadata for instant ADD COLUMN. dict_table_t::rollback_instant(): Adjust metadata on the rollback of instant ADD COLUMN. prepare_inplace_alter_table_dict(): First create the ctx->new_table, and only then decide if the table really needs to be rebuilt. We must split the creation of table or index metadata from the creation of the dictionary table records and the creation of the data. In this way, we can transform a table-rebuilding operation into an instant ADD COLUMN operation. Dictionary objects will only be added to cache when table rebuilding or index creation is needed. The ctx->instant_table will never be added to cache. dict_table_t::add_to_cache(): Modified and renamed from dict_table_add_to_cache(). Do not modify the table metadata. Let the callers invoke dict_table_add_system_columns() and if needed, set can_be_evicted. dict_create_sys_tables_tuple(), dict_create_table_step(): Omit the system columns (which will now exist in the dict_table_t object already at this point). dict_create_table_step(): Expect the callers to invoke dict_table_add_system_columns(). pars_create_table(): Before creating the table creation execution graph, invoke dict_table_add_system_columns(). row_create_table_for_mysql(): Expect all callers to invoke dict_table_add_system_columns(). create_index_dict(): Replaces row_merge_create_index_graph(). innodb_update_n_cols(): Renamed from innobase_update_n_virtual(). Call my_error() if an error occurs. btr_cur_instant_init(), btr_cur_instant_init_low(), btr_cur_instant_root_init(): Load additional metadata from the clustered index and set dict_index_t::n_core_null_bytes. This is invoked when table metadata is first loaded into the data dictionary. dict_boot(): Initialize n_core_null_bytes for the four hard-coded dictionary tables. dict_create_index_step(): Initialize n_core_null_bytes. This is executed as part of CREATE TABLE. dict_index_build_internal_clust(): Initialize n_core_null_bytes to NO_CORE_NULL_BYTES if table->supports_instant(). row_create_index_for_mysql(): Initialize n_core_null_bytes for CREATE TEMPORARY TABLE. commit_cache_norebuild(): Call the code to rename or enlarge columns in the cache only if instant ADD COLUMN is not being used. (Instant ADD COLUMN would copy all column metadata from instant_table to old_table, including the names and lengths.) PAGE_INSTANT: A new 13-bit field for storing dict_index_t::n_core_fields. This is repurposing the 16-bit field PAGE_DIRECTION, of which only the least significant 3 bits were used. The original byte containing PAGE_DIRECTION will be accessible via the new constant PAGE_DIRECTION_B. page_get_instant(), page_set_instant(): Accessors for the PAGE_INSTANT. page_ptr_get_direction(), page_get_direction(), page_ptr_set_direction(): Accessors for PAGE_DIRECTION. page_direction_reset(): Reset PAGE_DIRECTION, PAGE_N_DIRECTION. page_direction_increment(): Increment PAGE_N_DIRECTION and set PAGE_DIRECTION. rec_get_offsets(): Use the 'leaf' parameter for non-debug purposes, and assume that heap_no is always set. Initialize all dict_index_t::n_fields for ROW_FORMAT=REDUNDANT records, even if the record contains fewer fields. rec_offs_make_valid(): Add the parameter 'leaf'. rec_copy_prefix_to_dtuple(): Assert that the tuple is only built on the core fields. Instant ADD COLUMN only applies to the clustered index, and we should never build a search key that has more than the PRIMARY KEY and possibly DB_TRX_ID,DB_ROLL_PTR. All these columns are always present. dict_index_build_data_tuple(): Remove assertions that would be duplicated in rec_copy_prefix_to_dtuple(). rec_init_offsets(): Support ROW_FORMAT=REDUNDANT records whose number of fields is between n_core_fields and n_fields. cmp_rec_rec_with_match(): Implement the comparison between two MIN_REC_FLAG records. trx_t::in_rollback: Make the field available in non-debug builds. trx_start_for_ddl_low(): Remove dangerous error-tolerance. A dictionary transaction must be flagged as such before it has generated any undo log records. This is because trx_undo_assign_undo() will mark the transaction as a dictionary transaction in the undo log header right before the very first undo log record is being written. btr_index_rec_validate(): Account for instant ADD COLUMN row_undo_ins_remove_clust_rec(): On the rollback of an insert into SYS_COLUMNS, revert instant ADD COLUMN in the cache by removing the last column from the table and the clustered index. row_search_on_row_ref(), row_undo_mod_parse_undo_rec(), row_undo_mod(), trx_undo_update_rec_get_update(): Handle the 'default row' as a special case. dtuple_t::trim(index): Omit a redundant suffix of an index tuple right before insert or update. After instant ADD COLUMN, if the last fields of a clustered index tuple match the 'default row', there is no need to store them. While trimming the entry, we must hold a page latch, so that the table cannot be emptied and the 'default row' be deleted. btr_cur_optimistic_update(), btr_cur_pessimistic_update(), row_upd_clust_rec_by_insert(), row_ins_clust_index_entry_low(): Invoke dtuple_t::trim() if needed. row_ins_clust_index_entry(): Restore dtuple_t::n_fields after calling row_ins_clust_index_entry_low(). rec_get_converted_size(), rec_get_converted_size_comp(): Allow the number of fields to be between n_core_fields and n_fields. Do not support infimum,supremum. They are never supposed to be stored in dtuple_t, because page creation nowadays uses a lower-level method for initializing them. rec_convert_dtuple_to_rec_comp(): Assign the status bits based on the number of fields. btr_cur_trim(): In an update, trim the index entry as needed. For the 'default row', handle rollback specially. For user records, omit fields that match the 'default row'. btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete(): Skip locking and adaptive hash index for the 'default row'. row_log_table_apply_convert_mrec(): Replace 'default row' values if needed. In the temporary file that is applied by row_log_table_apply(), we must identify whether the records contain the extra header for instantly added columns. For now, we will allocate an additional byte for this for ROW_T_INSERT and ROW_T_UPDATE records when the source table has been subject to instant ADD COLUMN. The ROW_T_DELETE records are fine, as they will be converted and will only contain 'core' columns (PRIMARY KEY and some system columns) that are converted from dtuple_t. rec_get_converted_size_temp(), rec_init_offsets_temp(), rec_convert_dtuple_to_temp(): Add the parameter 'status'. REC_INFO_DEFAULT_ROW = REC_INFO_MIN_REC_FLAG \| REC_STATUS_COLUMNS_ADDED: An info_bits constant for distinguishing the 'default row' record. rec_comp_status_t: An enum of the status bit values. rec_leaf_format: An enum that replaces the bool parameter of rec_init_offsets_comp_ordinary().	2017-10-06 09:50:10 +03:00
Marko Mäkelä	d8d39721df	Follow-up to MDEV-12042 (test innodb_page_size variants) innodb_page_size_small: A new set of combinations, for innodb_page_size up to 16k. In MariaDB 10.0, this does not make a difference, but in 10.1 and later, innodb_page_size would cover 32k and 64k, for which ROW_FORMAT=COMPRESSED is not available. Enable these combinations in a few InnoDB tests.	2017-06-06 09:34:09 +03:00
Marko Mäkelä	8e36216a06	Import two ALTER TABLE…ALGORITHM=INPLACE tests from MySQL 5.6. Also, revert part of MDEV-7685 that added an InnoDB abort when ALTER TABLE…ALGORITHM=INPLACE is reporting that it ran out of file space.	2017-04-05 14:46:35 +03:00

15 Commits