mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-11-09 11:41:36 +03:00

Author	SHA1	Message	Date
Thirunarayanan Balathandayuthapani	8a4d3a044f	MDEV-36017 Alter table aborts when temporary directory is full Problem: ======= - During inplace algorithm, concurrent DML fails to write the log operation into the temporary file. InnoDB fails to mark the error for the online log. - ddl_log_write() releases the global ddl lock prematurely before releasing the log memory entry Fix: === row_log_online_op(): Mark the error in online log when InnoDB ran out of temporary space fil_space_extend_must_retry(): Mark the os_has_said_disk_full as true if os_file_set_size() fails btr_cur_pessimistic_update(): Return error code when btr_cur_pessimistic_insert() fails ddl_log_write(): Release the global ddl lock after releasing the log memory entry when error was encountered btr_cur_optimistic_update(): Relax the assertion that blob pointer can be null during rollback because InnoDB can run out of space while allocating the external page row_undo_mod_upd_exist_sec(): Remove the assertion which says that InnoDB should fail to build index entry when rolling back an incomplete transaction after crash recovery. This scenario can happen when InnoDB ran out of space. row_upd_changes_ord_field_binary_func(): Relax the assertion so that an externally stored field can be null when InnoDB ran out of space.	2025-05-25 09:11:41 +05:30
Marko Mäkelä	82d7419e06	MDEV-34388: Stack overflow on Alpine Linux page_is_corrupted(): Do not allocate the buffers from stack, but from the heap, in xb_fil_cur_open(). row_quiesce_write_cfg(): Issue one type of message when we fail to create the .cfg file. update_statistics_for_table(), read_statistics_for_table(), delete_statistics_for_table(), rename_table_in_stat_tables(): Use a common stack buffer for Index_stat, Column_stat, Table_stat. ha_connect::FileExists(): Invoke push_warning_printf() so that we can avoid allocating a buffer for snprintf(). translog_init_with_table(): Do not duplicate TRANSLOG_PAGE_SIZE_BUFF. Let us also globally enable the GCC 4.4 and clang 3.0 option -Wframe-larger-than=16384 to reduce the possibility of introducing such stack overflow in the future. For RocksDB and Mroonga we relax these limits. Reviewed by: Vladislav Lesin	2025-05-20 17:27:05 +03:00
Vlad Lesin	47e687b109	MDEV-36639 innodb_snapshot_isolation=1 gives error for not committed row changes The solution is to check if the transaction, which modified a record, is still active in lock_clust_rec_read_check_and_lock(). If yes, then just request a lock. If no, then, depending on if the current transaction read view can see the changes, return either DB_RECORD_CHANGED or request a lock. We can do the check in lock_clust_rec_read_check_and_lock() because transaction tries to set a lock on the record which cursor points to after transaction resuming and cursor position restoring. If the lock already exists, then we don't request the lock again. But for the current commit it's important that lock_clust_rec_read_check_and_lock() will be invoked again for the same record, so we can do the check again after transaction, which modified a record, was committed or rolled back. MDEV-33802(`4aa9291`) is partially reverted. If some transaction holds implicit lock on some record and transaction with snapshot isolation level requests conflicting lock on the same record, it should be blocked instead of returning DB_RECORD_CHANGED to have ability to continue execution when implicit lock owner is rolled back. The construction -------------------------------------------------------------------------- let $wait_condition= select count(*) = 1 from information_schema.processlist where state = 'Updating' and info = 'UPDATE t SET b = 2 WHERE a'; --source include/wait_condition.inc -------------------------------------------------------------------------- is not reliable enough to make sure transaction is blocked in test case, the test failed sporadically with -------------------------------------------------------------------------- ./mtr --max-test-fail=1 --parallel=96 lock_isolation{,,,,,,,}{,,,}{,,} \ --repeat=500 -------------------------------------------------------------------------- command. That's why it was replaced with debug sync-points. 
Reviewed by: Marko Mäkelä	2025-04-22 20:41:43 +03:00
Thirunarayanan Balathandayuthapani	dac3d702f7	MDEV-36649 dict_acquire_mdl_shared() aborts when table mode is DICT_TABLE_OP_OPEN_ONLY_IF_CACHED - InnoDB fails to check whether the table is being dropped or evicted while acquiring the MDL for the table when table open operation mode is DICT_TABLE_OP_OPEN_ONLY_IF_CACHED. This is caused by the commit `337bf8ac4b` (MDEV-36122) Fix: === dict_acquire_mdl_shared(): If the table is evicted or dropped when table operation mode is DICT_TABLE_OP_OPEN_ONLY_IF_CACHED then return nullptr	2025-04-22 15:17:29 +05:30
Marko Mäkelä	db4763a0d1	Fix a slow test When we expect a lock wait timeout, let us override the default innodb_lock_wait_timeout=50 with the minimum timeout of 1 second.	2025-04-07 10:25:34 +03:00
Thirunarayanan Balathandayuthapani	b11772d9a5	MDEV-33167 ASAN errors in dict_sys_t::load_table / get_foreign_key_info after failing to load a table Problem: ======= - While loading the foreign key constraints for the parent table, if child table wasn't open then InnoDB uses the parent table heap to store the child table name in fk_tables list. If the consecutive foreign key relation for the parent table fails with error, InnoDB evicts the parent table from memory. But InnoDB accesses the evicted table memory again in dict_sys.load_table() Solution: ======== dict_load_table_one(): In case of error, remove the child table names which were added during dict_load_foreigns()	2025-04-03 17:39:40 +05:30
Thirunarayanan Balathandayuthapani	0d7ef4f478	MDEV-36236 Instant alter aborts when InnoDB fails to rollback instant operation Problem: ======== - When InnoDB does consecutive instant alter operations and the first instant DDL fails, it fails to reset the old instant information in the table during rollback. This leads the consecutive instant alter to have a wrong assumption about the existing instant column information. Fix: ==== dict_table_t::instant_column(): Duplicate the instant information field of the table. By doing this, InnoDB alter retains the old instant information and resets it during rollback operation	2025-04-03 13:09:08 +05:30
Marko Mäkelä	4c0e2f1aca	MDEV-35813: even more robust test case The test in commit `1756b0f37d` is occasionally failing if there are unexpectedly many page cleaner batches that are updating the log checkpoint by small amounts. This occurs in particular when running the server under Valgrind. Let us insert the same number of records with a larger number of statements in a hope that the test would then be more likely to pass.	2025-04-02 08:12:29 +03:00
Marko Mäkelä	191209d8ab	Merge 10.5 into 10.6	2025-03-26 17:09:57 +02:00
Thirunarayanan Balathandayuthapani	1f4a901576	MDEV-36281 DML aborts during online virtual index Reason: ======= - An InnoDB DML commit aborts the server while InnoDB builds an online virtual index. During online DDL, a concurrent DML commit operation reads the undo log record and the related current version of the clustered index record. Based on the operation, InnoDB builds the old tuple and new tuple for the table. If the concurrent online index can be affected by the operation, InnoDB builds the entry for the index and logs the operation. The problematic case is the update operation, for which InnoDB builds the update vector. But while building the old row, InnoDB fails to fill the non-affected virtual column. This leads to a server abort while building the entry for the index. Fix: === - First, fill the virtual column entries for the new row. Duplicate the old row based on new row and change only the affected fields in old row based on the update vector.	2025-03-26 12:48:39 +01:00
Marko Mäkelä	1756b0f37d	MDEV-35813: more robust test case Let us integrate the test case with innodb.page_cleaner so that there will be less interference from log writes due to checkpoints. Also, make the test compatible with ./mtr --cursor-protocol.	2025-03-18 10:41:38 +02:00
Marko Mäkelä	0e8e0065d6	MDEV-35813 test case	2025-03-17 16:21:09 +02:00
Kristian Nielsen	acaf07daed	Add --source include/long_test.inc to some tests This will make mysql-test-run.pl try to schedule these long-running (> 60 seconds) tests early in --parallel runs, which helps avoid that the testsuite gets stuck with a few long-running tests at the end while most other test workers are idle. This speed up mtr --parallel=96 with 25 seconds for me. Reviewed-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2025-03-15 11:15:54 +01:00
Marko Mäkelä	c07e355c40	MDEV-36015: unrepresentable value in row_parse_int() row_parse_int(): Refactor the code and define the function static in one compilation unit. For any negative values, we must return 0. row_search_get_max_rec(), row_search_max_autoinc(): Moved to the same compilation unit with row_parse_int(). We also remove a work-around of an internal compiler error when targeting ARMv8 on GCC 4.8.5, a compiler that is no longer supported. Reviewed by: Debarun Banerjee	2025-02-13 15:10:53 +01:00
Vlad Lesin	6e6fcf4d43	MDEV-34489 innodb.innodb_row_lock_time_ms fails The test fails trying to compare (innodb/lock)_row_lock_time_avg with some limit. We can't predict (innodb/lock)_row_lock_time_avg value, because it's counted as the whole waiting time divided by the amount of waits. Both waiting time and amount of waits depend on the previous tests execution. The corresponding counters in lock_sys can't be reset with any query. Remove (innodb/lock)_row_lock_time_avg comparison from the test. information_schema.global_status.innodb_row_lock_time can't be reset, compare its difference instead of absolute value. Reviewed by: Marko Mäkelä	2025-02-04 19:14:41 +03:00
Marko Mäkelä	900bbbe4a8	MDEV-33295 innodb.doublewrite occasionally fails When the first attempt of XA ROLLBACK is expected to fail, some recovered changes could be written back through the doublewrite buffer. Should that happen, the next recovery attempt (after mangling the data file t1.ibd further) could fail because no copy of the affected pages would be available in the doublewrite buffer. To prevent this from happening, ensure that the doublewrite buffer will not be used and no log checkpoint occurs during the previous failed recovery attempt. Also, let a successful XA ROLLBACK serve the additional purpose of freeing a BLOB page and therefore rewriting page 0, which we must then be able to recover despite induced corruption. In the last restart step, we will tolerate an unexpected checkpoint, because one is frequently occurring on FreeBSD and AIX, despite our efforts to force a buffer pool flush before each "no checkpoint" section.	2025-02-03 08:11:43 +02:00
Sergei Golubchik	066e8d6aea	Merge branch '10.5' into 10.6	2025-01-29 11:17:38 +01:00
Marko Mäkelä	3cfffb4de6	MDEV-35962 CREATE INDEX fails to heal a FOREIGN KEY constraint commit_cache_norebuild(): Replace any newly added indexes in the attached foreign key constraints.	2025-01-29 09:04:50 +02:00
Nikita Malyavin	ecaedbe299	MDEV-33658 1/2 Refactoring: extract Key length initialization mysql_prepare_create_table: Extract a Key initialization part that relates to length calculation and long unique index designation. append_system_key_parts call also moves there. Move this initialization before the duplicate elimination. Extract WITHOUT OVERLAPS check into a separate function. It had to be moved earlier in the code to preserve the order of the error checks, as in the tests.	2025-01-26 16:15:46 +01:00
Marko Mäkelä	d4da659b43	MDEV-35854: Simplify dict_get_referenced_table() innodb_convert_name(): Convert a schema or table name to my_charset_filename compatible format. dict_table_lookup(): Replaces dict_get_referenced_table(). Make the callers responsible for invoking innodb_convert_name(). innobase_casedn_str(): Remove. Let us invoke my_casedn_str() directly. dict_table_rename_in_cache(): Do not duplicate a call to dict_mem_foreign_table_name_lookup_set(). innobase_convert_to_filename_charset(): Defined static in the only compilation unit that needs it. dict_scan_id(): Remove the constant parameters table_id=FALSE, accept_also_dot=TRUE. Invoke strconvert() directly. innobase_convert_from_id(): Remove; only called from dict_scan_id(). innobase_convert_from_table_id(): Remove (dead code). table_name_t::dblen(), table_name_t::basename(): In non-debug builds, tolerate names that may miss a '/' separator. Reviewed by: Debarun Banerjee	2025-01-23 14:38:08 +02:00
Marko Mäkelä	82310f926b	MDEV-29182 Assertion fld->field_no < table->n_v_def failed on cascade row_ins_cascade_calc_update_vec(): Skip any virtual columns in the update vector of the parent table. Based on mysql/mysql-server@0ac176453b Reviewed by: Debarun Banerjee	2025-01-22 17:22:07 +02:00
Thirunarayanan Balathandayuthapani	0301ef38b4	MDEV-35445 Disable foreign key column nullability check for strict sql mode - MDEV-34392(commit `cc810e64d4`) adds the check for nullability of foreign key column when foreign key relation is of UPDATE_CASCADE or UPDATE SET NULL. This check makes DDL fail when it violates foreign key nullability. This patch basically does the nullability check for foreign key column only for strict sql mode	2025-01-21 18:52:33 +05:30
Marko Mäkelä	98dbe3bfaf	Merge 10.5 into 10.6	2025-01-20 09:57:37 +02:00
Marko Mäkelä	f521b8ac21	MDEV-35723: applying non-zero offset to null pointer in INSERT row_mysql_read_blob_ref(): Correctly handle what Field_blob::store() generates for length=0.	2025-01-17 12:34:03 +02:00
Marko Mäkelä	b82abc7163	MDEV-35701 trx_t::autoinc_locks causes unnecessary dynamic memory allocation trx_t::autoinc_locks: Use small_vector<lock_t*,4> in order to avoid any dynamic memory allocation in the most common case (a statement is holding AUTO_INCREMENT locks on at most 4 tables or partitions). lock_cancel_waiting_and_release(): Instead of removing elements from the middle, simply assign nullptr, like lock_table_remove_autoinc_lock(). The added test innodb.auto_increment_lock_mode covers the dynamic memory allocation as well as nondeterministically (occasionally) covers the out-of-order lock release in lock_table_remove_autoinc_lock(). Reviewed by: Debarun Banerjee	2025-01-15 16:55:01 +02:00
Sergei Golubchik	c478b1ba08	MDEV-35598 foreign key error is unnecessary truncated truncate it at 512 bytes (max allowed by the protocol), not 192	2025-01-09 10:00:36 +01:00
Sergei Golubchik	0706c01b88	cleanup: innodb.innodb_information_schema don't disable query/result log unless the output is unstable. and even then don't, but replace away unstable parts.	2025-01-09 10:00:35 +01:00
Marko Mäkelä	6d4841ae26	MDEV-35647 Possible hang during CREATE TABLE…SELECT error handling ha_innobase::delete_table(): Clear trx->dict_operation_lock_mode after, not before invoking trx->rollback(), so that row_undo_mod_parse_undo_rec() will be invoked with dict_locked=true and dict_sys_t::freeze() will not be invoked for loading a table definition. Inside dict_sys_t::freeze(), an assertion !have_any() would fail when the current thread is already holding the latch. This fixes up commit `c5fd9aa562` (MDEV-25919). Reviewed by: Debarun Banerjee	2025-01-08 13:29:16 +02:00
Monty	88d9348dfc	Remove dates from all rdiff files	2025-01-05 16:40:11 +02:00
Marko Mäkelä	e5c4c0842d	MDEV-35443: opt_search_plan_for_table() may degrade to full table scan opt_calc_index_goodness(): Correct an inaccurate condition. We can very well use a clustered index of a table that is subject to online rebuild. But we must not choose an index that has not been committed (it is a secondary index that was not fully created) or that is corrupted or not a normal B-tree index. opt_search_plan_for_table(): Remove some redundant code, now that opt_calc_index_goodness() checks against corrupted indexes. The test case allows this code to be exercised. The main observation in the following: ./mtr --rr innodb.stats_persistent rr replay var/log/mysqld.1.rr/latest-trace should be that when opt_search_plan_for_table() is being invoked by dict_stats_update_persistent() on the being-altered statistics table in the 2nd call after ha_innobase::inplace_alter_table(), and the fix in opt_calc_index_goodness() is absent, it would choose the code path if (n_fields == 0), that is, a full table scan, instead of searching for the record. The GDB commands to execute in "rr replay" would be as follows: break ha_innobase::inplace_alter_table continue break opt_search_plan_for_table continue continue next next … Reviewed by: Vladislav Lesin	2024-12-19 14:05:16 +02:00
Marko Mäkelä	ddd7d5d8e3	MDEV-24035 Failing assertion: UT_LIST_GET_LEN(lock.trx_locks) == 0 causing disruption and replication failure Under unknown circumstances, the SQL layer may wrongly disregard an invocation of thd_mark_transaction_to_rollback() when an InnoDB transaction had been aborted (rolled back) due to one of the following errors: * HA_ERR_LOCK_DEADLOCK * HA_ERR_RECORD_CHANGED (if innodb_snapshot_isolation=ON) * HA_ERR_LOCK_WAIT_TIMEOUT (if innodb_rollback_on_timeout=ON) Such an error used to cause a crash of InnoDB during transaction commit. These changes aim to catch and report the error earlier, so that not only this crash can be avoided but also the original root cause be found and fixed more easily later. The idea of this fix is from Michael 'Monty' Widenius. HA_ERR_ROLLBACK: A new error code that will be translated into ER_ROLLBACK_ONLY, signalling that the current transaction has been aborted and the only allowed action is ROLLBACK. trx_t::state: Add TRX_STATE_ABORTED that is like TRX_STATE_NOT_STARTED, but noting that the transaction had been rolled back and aborted. trx_t::is_started(): Replaces trx_is_started(). ha_innobase: Check the transaction state in various places. Simplify the logic around SAVEPOINT. ha_innobase::is_valid_trx(): Replaces ha_innobase::is_read_only(). The InnoDB logic around transaction savepoints, commit, and rollback was unnecessarily complex and might have contributed to this inconsistency. So, we are simplifying that logic as well. trx_savept_t: Replace with const undo_no_t*. When we rollback to a savepoint, all we need to know is the number of undo log records that must survive. trx_named_savept_t, DB_NO_SAVEPOINT: Remove. We can store undo_no_t directly in the space allocated at innobase_hton->savepoint_offset. fts_trx_create(): Do not copy previous savepoints. fts_savepoint_rollback(): If a savepoint was not found, roll back everything after the default savepoint of fts_trx_create(). 
The test innodb_fts.savepoint is extended to cover this code. Reviewed by: Vladislav Lesin Tested by: Matthias Leich	2024-12-12 18:02:00 +02:00
Marko Mäkelä	19acb0257e	MDEV-35508 Race condition between purge and secondary index INSERT or UPDATE row_purge_remove_sec_if_poss_leaf(): If there is an active transaction that is not newer than PAGE_MAX_TRX_ID, return the bogus value 1 so that row_purge_remove_sec_if_poss_tree() is guaranteed to recheck if the record needs to be purged. It could be the case that an active transaction would insert this record between the time this check completed and row_purge_remove_sec_if_poss_tree() acquired a latch on the secondary index leaf page again. row_purge_del_mark_error(), row_purge_check(): Some unlikely code refactored into separate non-inline functions. trx_sys_t::find_same_or_older_low(): Move the unlikely and bulky part of trx_sys_t::find_same_or_older() to a non-inline function. trx_sys_t::find_same_or_older_in_purge(): A variant of trx_sys_t::find_same_or_older() for use in the purge subsystem, with potential concurrent access of the same trx_t object from multiple threads. trx_t::max_inactive_id_atomic: An Atomic_relaxed alias of the regular data field trx_t::max_inactive_id, which we use on systems that have native 64-bit loads or stores. On any 64-bit system that seems to be supported by GCC, Clang or MSVC, relaxed atomic loads and stores use the regular load and store instructions. On -march=i686 the 64-bit atomic loads and stores would use an XMM register. This fixes a regression that had been introduced in commit `b7b9f3ce82` (MDEV-34515). There would be messages [ERROR] InnoDB: tried to purge non-delete-marked record in index in the server error log, and an assertion ut_ad(0) would cause a crash of debug instrumented builds. This could also cause incorrect results for MVCC reads and corrupted secondary indexes. The debug instrumented test case was written by Debarun Banerjee. Reviewed by: Debarun Banerjee	2024-11-29 10:44:38 +02:00
Thirunarayanan Balathandayuthapani	9ba18d1aa0	MDEV-35394 Innochecksum misinterprets freed pages - Innochecksum misinterprets the freed pages as active one. This leads the user to think there are too many valid pages exist. - To avoid this confusion, innochecksum introduced one more option --skip-freed-pages and -r to avoid the freed pages while dumping or printing the summary of the tablespace. - Innochecksum can safely assume the page is freed if the respective extent doesn't belong to a segment and marked as freed in XDES_BITMAP in extent descriptor page. - Innochecksum shouldn't assume that zero-filled page as extent descriptor page. Reviewed-by: Marko Mäkelä	2024-11-27 13:00:51 +05:30
Thirunarayanan Balathandayuthapani	8e1cf078a0	MDEV-35363 Avoid cloning of table statistics while saving the InnoDB table stats - Remove the test case added as a part of the commit `98d57719e2` (MDEV-32667)	2024-11-14 15:32:55 +05:30
Thirunarayanan Balathandayuthapani	074831ec61	Merge branch 10.5 into 10.6	2024-11-08 18:17:15 +05:30
Thirunarayanan Balathandayuthapani	7afee25b08	MDEV-35115 Inconsistent Replace behaviour when multiple unique index exist - Replace statement fails with duplicate key error when multiple unique key conflict happens. Reason is that Server expects the InnoDB engine to store the conflicting keys in ascending order. But the InnoDB doesn't store the conflicting keys in ascending order. Fix: === - Enable HA_DUPLICATE_KEY_NOT_IN_ORDER for InnoDB storage engine only when unique index order is different in .frm and innodb dictionary.	2024-11-08 16:46:41 +05:30
Marko Mäkelä	ba4541ba7f	MDEV-29015/MDEV-29260/MDEV-34938 test fixup The merge `f00711bba2` included a change of the test innodb.log_file_name, which would try to ensure that in the presence of the code fix `decdd4bf49` we would get an error on Linux when invoking lseek() on a directory. It turns out that this is not the case in at least one Linux based cloud environment.	2024-11-08 09:55:47 +02:00
Thirunarayanan Balathandayuthapani	98d57719e2	MDEV-32667 dict_stats_save_index_stat() reads uninitialized index->stats_error_printed Problem: ======== - dict_stats_table_clone_create() does not initialize the flag stats_error_printed in either dict_table_t or dict_index_t. Because dict_stats_save_index_stat() is operating on a copy of a dict_index_t object, it appears that dict_index_t::stats_error_printed will always be false for actual metadata objects, and uninitialized in dict_stats_save_index_stat(). Solution: ========= dict_stats_table_clone_create(): Assign stats_error_printed for table and index while copying the statistics	2024-11-08 11:35:19 +05:30
Oleksandr Byelkin	f00711bba2	Merge branch '10.5' into 10.6	2024-10-29 14:20:03 +01:00
Sergei Golubchik	3cd706b107	MDEV-35236 Assertion `(mem_root->flags & 4) == 0' failed in safe_lexcstrdup_root Post-fix for MDEV-35144. Cannot allocate options values on the statement arena, because HA_CREATE_INFO is shallow-copied for every execution, so if the option_list was initially empty, it will be reset for every execution and any values allocated on the statement arena will be lost. Cannot allocate option values on the execution arena, because HA_CREATE_INFO is shallow-copied for every execution, so if the option_list was initially NOT empty, any values appended to the end will be preserved and if they're on the execution arena their content will be destroyed. Let's use thd->change_item_tree() to save and restore necessary pointers for every execution. followup for `3da565c41d`	2024-10-23 14:58:57 +02:00
Vlad Lesin	8c7786e7d5	MDEV-34690 lock_rec_unlock_unmodified() causes deadlock lock_rec_unlock_unmodified() is executed either under lock_sys.wr_lock() or under a combination of lock_sys.rd_lock() + record locks hash table cell latch. It also requests page latch to check if locked records were changed by the current transaction or not. Usually InnoDB requests page latch to find the certain record on the page, and then requests lock_sys and/or record lock hash cell latch to request record lock. lock_rec_unlock_unmodified() requests the latches in the opposite order, what causes deadlocks. One of the possible scenario for the deadlock is the following: thread 1 - lock_rec_unlock_unmodified() is invoked under locks hash table cell latch, the latch is acquired; thread 2 - purge thread acquires page latch and tries to remove delete-marked record, it invokes lock_update_delete(), which requests locks hash table cell latch, held by thread 1; thread 1 - requests page latch, held by thread 2. To fix it we need to release lock_sys.latch and/or lock hash cell latch, acquire page latch and re-acquire lock_sys related latches. When lock_sys.latch and/or lock hash cell latch are released in lock_release_on_prepare() and lock_release_on_prepare_try(), the page on which the current lock is held, can be merged. In this case the bitmap of the current lock must be cleared, and the new lock must be added to the end of trx->lock.trx_locks list, or bitmap of already existing lock must be changed. The new field trx_lock_t::set_nth_bit_calls indicates if new locks (bits in existing lock bitmaps or new lock objects) were created during the period when lock_sys was released in trx->lock.trx_locks list iteration loop in lock_release_on_prepare() or lock_release_on_prepare_try(). And, if so, we traverse the list again. The block can be freed during pages merging, what causes assertion failure in buf_page_get_gen(), as btr_block_get() passes BUF_GET as page get mode to it. 
That's why page_get_mode parameter was added to btr_block_get() to pass BUF_GET_POSSIBLY_FREED from lock_release_on_prepare() and lock_release_on_prepare_try() to buf_page_get_gen(). As searching for id of trx, which modified secondary index record, is quite expensive operation, restrict its usage for master. System variable was added to remove the restriction for testing simplifying. The variable exists only either for debug build or for build with -DINNODB_ENABLE_XAP_UNLOCK_UNMODIFIED_FOR_PRIMARY option to increase the probability of catching bugs for release build with RQG. Note that the code, which does primary index lookup to find out what transaction modified secondary index record, is necessary only when there is no primary key and no unique secondary key on replica with row based replication, because only in this case extra X locks on unmodified records can be set during scan phase. Reviewed by Marko Mäkelä.	2024-10-23 12:36:17 +03:00
Vlad Lesin	92180ad513	MDEV-34466 XA prepare don't release unmodified records for some cases There is no need to exclude exclusive non-gap locks from the procedure of locks releasing on XA PREPARE execution in lock_release_on_prepare_try() after commit `17e59ed3aa` (MDEV-33454), because lock_rec_unlock_unmodified() should check if the record was modified with the XA, and release the lock if it was not. lock_release_on_prepare_try(): don't skip X-locks, let lock_rec_unlock_unmodified() to process them. lock_sec_rec_some_has_impl(): add template parameter for not acquiring trx_t::mutex for the case if a caller already holds the mutex, don't crash if lock's bitmap is clean. row_vers_impl_x_locked(), row_vers_impl_x_locked_low(): add new argument to skip trx_t::mutex acquiring. rw_trx_hash_t::validate_element(): don't acquire trx_t::mutex if the current thread already holds it. Thanks to Andrei Elkin for finding the bug. Reviewed by Marko Mäkelä, Debarun Banerjee.	2024-10-23 12:36:17 +03:00
Marko Mäkelä	1cad1dbde6	MDEV-35235 innodb_snapshot_isolation=ON fails to signal transaction rollback convert_error_code_to_mysql(): Treat DB_DEADLOCK and DB_RECORD_CHANGED in the same way, that is, signal to the SQL layer that the transaction had been rolled back.	2024-10-23 07:55:22 +03:00
Sergei Golubchik	3a1cf2c85b	MDEV-34679 ER_BAD_FIELD uses non-localizable substrings	2024-10-17 21:37:37 +02:00
Sergei Golubchik	3da565c41d	MDEV-35144 CREATE TABLE ... LIKE uses current innodb_compression_default instead of the create value When adding a column or index that uses plugin-defined sysvar-based options with CREATE ... LIKE the server was using the current value of the sysvar, not the default one. Because parse_option_list() function was used both in create and open and it tried to guess when it's create (need to use current sysvar value and add a new name=value pair to the list) or open (need to use default, without extending the list). Let's move the list extending functionality into a separate function and call it explicitly when needed. Operations that add new objects (CREATE, ALTER ... ADD) will extend the list, other operations (ALTER, CREATE ... LIKE, open) will not.	2024-10-17 16:28:39 +02:00
Marko Mäkelä	bb47e575de	MDEV-34830: LSN in the future is not being treated as serious corruption The invariant of write-ahead logging is that before any change to a page is written to the data file, the corresponding log record must first have been durably written. On crash recovery, there were some sloppy checks for this. Let us implement accurate checks and flag an inconsistency as a hard error, so that we can avoid further corruption of a corrupted database. For data extraction from the corrupted database, innodb_force_recovery can be used. Before recovery is reading any data pages or invoking buf_dblwr_t::recover() to recover torn pages from the doublewrite buffer, InnoDB will have parsed the log until the final LSN and updated log_sys.lsn to that. So, we can rely on log_sys.lsn at all times. The doublewrite buffer recovery has been refactored in such a way that the recv_sys.dblwr.pages may be consulted while discovering files and their page sizes, but nothing will be written back to data files before buf_dblwr_t::recover() is invoked. A section of the test mariabackup.innodb_redo_overwrite that is parsing some mariadb-backup --backup output has been removed, because that output "redo log block is overwritten" would often be missing in a Microsoft Windows environment as a result of these changes. recv_max_page_lsn, recv_lsn_checks_on: Remove. recv_sys_t::validate_checkpoint(): Validate the write-ahead-logging condition at the end of the recovery. recv_dblwr_t::validate_page(): Keep track of the maximum LSN (if we are checking a non-doublewrite copy of a page) but do not complain LSN being in the future. The doublewrite buffer is a special case, because it will be read early during recovery. Besides, starting with commit `762bcb81b5` the dblwr=true copies of pages may legitimately be "too new". recv_dblwr_t::find_page(): Find a valid page with the smallest FIL_PAGE_LSN that is in the valid range for recovery. 
recv_dblwr_t::restore_first_page(): Replaced by find_page(). Only buf_dblwr_t::recover() will write to data files. buf_dblwr_t::recover(): Simplify the message output. Do attempt doublewrite recovery on user page read error. Ignore doublewrite pages whose FIL_PAGE_LSN is outside the usable bounds. Previously, we could wrongly recover a too new page from the doublewrite buffer. It is unlikely that this could have led to an actual error. Write back all recovered pages from the doublewrite buffer here, including for the first page of any tablespace. buf_page_is_corrupted(): Distinguish the return values CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER. buf_page_check_corrupt(): Return the error code DB_CORRUPTION in case the LSN is in the future. Datafile::read_first_page(): Handle FSP_SPACE_FLAGS=0xffffffff in the same way on both 32-bit and 64-bit architectures. Datafile::read_first_page_flags(): Split from read_first_page(). Take a copy of the first page as a parameter. recv_sys_t::free_corrupted_page(): Take the file as a parameter and return whether a message was displayed. This avoids some duplicated and incomplete error messages. buf_page_t::read_complete(): Remove some redundant output and always display the name of the corrupted file. Never return DB_FAIL; use it only in internal error handling. IORequest::read_complete(): Assume that buf_page_t::read_complete() will have reported any error. fil_space_t::set_corrupted(): Return whether this is the first time the tablespace had been flagged as corrupted. Datafile::validate_first_page(), fil_node_open_file_low(), fil_node_open_file(), fil_space_t::read_page0(), fil_node_t::read_page0(): Add a parameter for a copy of the first page, and a parameter to indicate whether the FIL_PAGE_LSN check should be suppressed. Before buf_dblwr_t::recover() is invoked, we cannot validate the FIL_PAGE_LSN, but we can trust the FSP_SPACE_FLAGS and the tablespace ID that may be present in a potentially too new copy of a page. 
Reviewed by: Debarun Banerjee	2024-10-17 17:24:20 +03:00
Vladislav Vaintroub	c1fc59277a	MDEV-34929 page-compressed tables do not work on Windows Remove workaround for MDEV-13941, it served for 5 years, and all affected pre-release 10.2 installations should have been already fixed in between. Apparently InnoDB is using the is_sparse parameter in os_file_set_size() inconsistently, and it passes is_sparse=false now during first file extension. With the MDEV-13941 workaround in place, it would unsparse the file, which makes compression not work at all anymore.	2024-10-16 16:02:13 +02:00
Thirunarayanan Balathandayuthapani	6aaae4c03b	MDEV-35122 Incorrect NULL value handling for instantly dropped BLOB columns Problem: ======= - Redundant table fails to insert into the table after instant drop blob column. Instant drop column only marks the column as hidden, and a consecutive insert statement tries to insert a NULL value for the dropped BLOB column and returns the fixed length of the blob type as 65535. This leads to a row size too large error. Fix: ==== For redundant table, if the non-fixed dropped column can be null then set the length of the field type as 0.	2024-10-15 12:04:37 +05:30
Yuchen Pei	cd5577ba4a	Merge branch '10.5' into 10.6	2024-10-15 16:00:44 +11:00
Thirunarayanan Balathandayuthapani	5777d9f282	MDEV-35116 InnoDB fails to set error index for HA_ERR_NULL_IN_SPATIAL - InnoDB fails to set the index information or index number for the spatial index error HA_ERR_NULL_IN_SPATIAL. row_build_spatial_index_key(): Initialize the tmp_mbr array completely. check_if_supported_inplace_alter(): Fix the spelling mistake of alter	2024-10-14 14:28:24 +05:30

3687 Commits