mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-08-07 00:04:31 +03:00

Author	SHA1	Message	Date
Sergei Golubchik	bead24b7f3	mariadb-test: wait on disconnect Remove one of the major sources of race condiitons in mariadb-test. Normally, mariadb_close() sends COM_QUIT to the server and immediately disconnects. In mariadb-test it means the test can switch to another connection and sends queries to the server before the server even started parsing the COM_QUIT packet and these queries can see the connection as fully active, as it didn't reach dispatch_command yet. This is a major source of instability in tests and many - but not all, still less than a half - tests employ workarounds. The correct one is a pair count_sessions.inc/wait_until_count_sessions.inc. Also very popular was wait_until_disconnected.inc, which was completely useless, because it verifies that the connection is closed, and after disconnect it always is, it didn't verify whether the server processed COM_QUIT. Sadly the placebo was as widely used as the real thing. Let's fix this by making mariadb-test `disconnect` command _to wait_ for the server to confirm. This makes almost all workarounds redundant. In some cases count_sessions.inc/wait_until_count_sessions.inc is still needed, though, as only `disconnect` command is changed: * after external tools, like `exec $MYSQL` * after failed `connect` command * replication, after `STOP SLAVE` * Federated/CONNECT/SPIDER/etc after `DROP TABLE` and also in some XA tests, because an XA transaction is dissociated from the THD very late, after the server has closed the client connection. Collateral cleanups: fix comments, remove some redundant statements: * DROP IF EXISTS if nothing is known to exist * DROP table/view before DROP DATABASE * REVOKE privileges before DROP USER etc	2025-07-16 09:14:33 +07:00
Marko Mäkelä	ddd7d5d8e3	MDEV-24035 Failing assertion: UT_LIST_GET_LEN(lock.trx_locks) == 0 causing disruption and replication failure Under unknown circumstances, the SQL layer may wrongly disregard an invocation of thd_mark_transaction_to_rollback() when an InnoDB transaction had been aborted (rolled back) due to one of the following errors: * HA_ERR_LOCK_DEADLOCK * HA_ERR_RECORD_CHANGED (if innodb_snapshot_isolation=ON) * HA_ERR_LOCK_WAIT_TIMEOUT (if innodb_rollback_on_timeout=ON) Such an error used to cause a crash of InnoDB during transaction commit. These changes aim to catch and report the error earlier, so that not only this crash can be avoided but also the original root cause be found and fixed more easily later. The idea of this fix is from Michael 'Monty' Widenius. HA_ERR_ROLLBACK: A new error code that will be translated into ER_ROLLBACK_ONLY, signalling that the current transaction has been aborted and the only allowed action is ROLLBACK. trx_t::state: Add TRX_STATE_ABORTED that is like TRX_STATE_NOT_STARTED, but noting that the transaction had been rolled back and aborted. trx_t::is_started(): Replaces trx_is_started(). ha_innobase: Check the transaction state in various places. Simplify the logic around SAVEPOINT. ha_innobase::is_valid_trx(): Replaces ha_innobase::is_read_only(). The InnoDB logic around transaction savepoints, commit, and rollback was unnecessarily complex and might have contributed to this inconsistency. So, we are simplifying that logic as well. trx_savept_t: Replace with const undo_no_t*. When we rollback to a savepoint, all we need to know is the number of undo log records that must survive. trx_named_savept_t, DB_NO_SAVEPOINT: Remove. We can store undo_no_t directly in the space allocated at innobase_hton->savepoint_offset. fts_trx_create(): Do not copy previous savepoints. fts_savepoint_rollback(): If a savepoint was not found, roll back everything after the default savepoint of fts_trx_create(). The test innodb_fts.savepoint is extended to cover this code. Reviewed by: Vladislav Lesin Tested by: Matthias Leich	2024-12-12 18:02:00 +02:00
Marko Mäkelä	e441c32a0b	Merge 10.5 into 10.6	2023-01-03 18:13:11 +02:00
Vlad Lesin	3ddc00dc3b	MDEV-30225 RR isolation violation with locking unique search Before the fix next-key lock was requested only if a record was delete-marked for locking unique search in RR isolation level. There can be several delete-marked records for the same unique key, that's why InnoDB scans the records until eighter non-delete-marked record is reached or all delete-marked records with the same unique key are scanned. For range scan next-key locks are used for RR to protect scanned range from inserting new records by other transactions. And this is the reason of why next-key locks are used for delete-marked records for unique searches. If a record is not delete-marked, the requested lock type was "not-gap". When a record is not delete-marked during lock request by trx 1, and some other transaction holds conflicting lock, trx 1 creates waiting not-gap lock on the record and suspends. During trx 1 suspending the record can be delete-marked. And when the lock is granted on conflicting transaction commit or rollback, its type is still "not-gap". So we have "not-gap" lock on delete-marked record for RR. And this let some other transaction to insert some record with the same unique key when trx 1 is not committed, what can cause isolation level violation. The fix is to set next-key locks for both delete-marked and non-delete-marked records for unique search in RR.	2022-12-20 11:31:49 +03:00

4 Commits