After the merging of MDEV-24915, 10.6 branch has regressions with handling of concurrent write load against two or more cluster nodes. These regressions may surface as cluster hanging, node crashes or data inconsistency. With some test scenarios, the only visible symptom could be that the BF victim aborting happens only by innodb lock wait timeout expiration. This would result only to poor performance (by default 50 sec hang for each BF conflict), and could be somewhat difficult to diagnose. This pull request has following fixes to handle concurrent write load from multiple nodes: In lock_wait_wsrep_kill(), the victim trx was expected to be only in TRX_STATE_ACTIVE state. With the delayed BF conflict handling, it can happen that victim has advanced into pre commit state. This was fixed by choosing victim both in TRX_STATE_ACTIVE and TRX_STATE_PREPARED states. Victim transaction may be in several different states at the time of detected lock conflict, and due to delayed BF aborting practice in MDEV-24915, the victim may advance further before the actual BF aborting takes place. The BF aborting in MDEV-24915 did not wake the victim, if it was in the state of waiting for some other lock (than the one that was blocking the high priority thread). This anomaly caused the innodb lock wait timeout expiration delays and poor performance symptom. To fix this, lock_wait_wsrep_kill() now looks if victim is in lock waiting state, and uses lock_cancel_waiting_and_release() to cancel this lock wait. wsrep_bf_abort() checks if the victim has active transaction (in wsrep-lib), and starts a new transaction if there was no active transaction before. Due to late BF aborting, the victim may have e.g. failed in certification and is already aborting or has aborted at this stage. This has caused problems in testing where BF aborter tries to BF abort himself. The fix in wsrep_bf_abort() now skips the BF abort, if victim is aborting or has aborted. Victim may not have started transaction yet in wsrep context, but it may have acquired MDL locks (due to DDL execution), and this has caused BF conflict. Such case does not require aborting in wsrep or replication provider state. BF aborting could cause BF-BF conflict scenario, if victim was already aborted and changed to replayer having high priority as well. This BF-BF conflict scenario is now avoided in lock_wait_wsrep() where we now check if blocking lock holder is also high priority and is ordered before, caller should wait for the lock in this situation. The natural innodb deadlock resolving algorithm could pick BF thread as deadlock victim. This is fixed by giving max weigh to BF threads in Deadlock::report(). MDEV-24341 has changed excution paths in do_command() and this affects BF aborted victim execution. This PR fixes one assert in do_command(): DBUG_ASSERT(!thd->async_state.pending_ops()) Which fired if the thd was BF aborted earlier. This assert is now changed to allow pending_ops() if thd was BF aborted before. With these fixes, long term highly conflicting write load could be run against to node cluster. If binlogging is configured, log_slave_updates should be also set.
Code status:
MariaDB: The open source relational database
MariaDB was designed as a drop-in replacement of MySQL(R) with more features, new storage engines, fewer bugs, and better performance.
MariaDB is brought to you by the MariaDB Foundation and the MariaDB Corporation. Please read the CREDITS file for details about the MariaDB Foundation, and who is developing MariaDB.
MariaDB is developed by many of the original developers of MySQL who now work for the MariaDB Corporation, the MariaDB Foundation and by many people in the community.
MySQL, which is the base of MariaDB, is a product and trademark of Oracle Corporation, Inc. For a list of developers and other contributors, see the Credits appendix. You can also run 'SHOW authors' to get a list of active contributors.
A description of the MariaDB project and a manual can be found at:
https://mariadb.com/kb/en/mariadb-vs-mysql-features/
https://mariadb.com/kb/en/mariadb-versus-mysql-compatibility/
https://mariadb.com/kb/en/new-and-old-releases/
Help
More help is available from the Maria Discuss mailing list https://launchpad.net/~maria-discuss, MariaDB's Zulip instance, https://mariadb.zulipchat.com/ and the #maria IRC channel on Freenode.
Live QA for beginner contributors
MariaDB has a dedicated time each week when we answer new contributor questions live on Zulip and IRC. From 8:00 to 10:00 UTC on Mondays, and 10:00 to 12:00 UTC on Thursdays, anyone can ask any questions they’d like, and a live developer will be available to assist.
New contributors can ask questions any time, but we will provide immediate feedback during that interval.
Licensing
NOTE:
MariaDB is specifically available only under version 2 of the GNU General Public License (GPLv2). (I.e. Without the "any later version" clause.) This is inherited from MySQL. Please see the README file in the MySQL distribution for more information.
License information can be found in the COPYING file. Third party license information can be found in the THIRDPARTY file.
Bug Reports
Bug and/or error reports regarding MariaDB should be submitted at: https://jira.mariadb.org
For reporting security vulnerabilities see: https://mariadb.org/about/security-policy/
The code for MariaDB, including all revision history, can be found at: https://github.com/MariaDB/server