mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-08-07 00:04:31 +03:00

Author	SHA1	Message	Date
Sachin	706a7101bf	MDEV-23089 rpl_parallel2 fails in 10.5. Problem: rpl_parallel2 was failing non-deterministically. Analysis: When FLUSH TABLES WITH READ LOCK is executed, it allows all worker threads to complete their ongoing transactions and then pauses them. In this state FTWRL proceeds to acquire the global read lock. FTWRL first blocks threads from starting new commits, then upgrades the lock to block commit of existing transactions. Step1: FLUSH TABLES WITH READ LOCK - Blocks new commits. Step2: * STOP SLAVE command enables 'force_abort=1', which unblocks workers; they continue to execute events. * T1: Waits in 'record_gtid' call to update 'gtid_slave_pos' table with its current GTID, but it is blocked because of Step1. * T2: Holds COMMIT lock and waits for T1 to commit. Step3: FLUSH TABLES WITH READ LOCK - Waiting to get BLOCK_COMMIT. This results in deadlock. When the STOP SLAVE command allows paused workers to proceed, workers should skip the execution of all further events, similar to 'conservative' parallel mode. Solution: We will assign 1 to skip_event_group when we are aborted in do_ftwrl_wait. rpl_parallel_entry->pause_sub_id is only reset when force_abort is off in rpl_pause_after_ftwrl.	2020-08-03 17:07:16 +05:30
Monty	f383cbcb03	Added some selects to rpl_parallel2.test to find out where it fails in buildbot	2015-11-18 14:46:30 +02:00
Kristian Nielsen	ba02550166	MDEV-7818: Deadlock occurring with parallel replication and FTWRL Problem is that FLUSH TABLES WITH READ LOCK first blocks threads from starting new commits, then waits for running commits to complete. But in-order parallel replication needs commits to happen in a particular order, so this can easily deadlock. To fix this problem, this patch introduces a way to temporarily pause the parallel replication worker threads. Before starting FTWRL, we let all worker threads complete in-progress transactions, and then wait. Then we proceed to take the global read lock. Once the lock is obtained, we unpause the worker threads. Now commits are blocked from starting by the global read lock, so the deadlock will no longer occur.	2015-11-13 14:02:15 +01:00
Kristian Nielsen	682ed005c5	MDEV-8294: Inconsistent behavior of slave parallel threads at runtime There were some cases where the slave SQL thread could stop without the pool of parallel replication worker threads being correctly de-activated.	2015-06-10 11:57:42 +02:00
Kristian Nielsen	50b42441a6	MDEV-7241: rpl.rpl_parallel2 fails sporadically in buildbot There was a race, a small window between updating slave position and updating Seconds_Behind_Master, during which the test case could see the wrong value. Fix by waiting for the expected status to appear.	2014-12-02 09:27:22 +01:00
unknown	8cc6e90d74	MDEV-5509: Seconds_behind_master incorrect in parallel replication The problem was a race between the SQL driver thread and the worker threads. The SQL driver thread would set rli->last_master_timestamp to zero to mark that it has caught up with the master, while the worker threads would set it to the timestamp of the executed event. This can happen out-of-order in parallel replication, causing the "caught up" status to be overwritten and Seconds_Behind_Master to wrongly grow when the slave is idle. To fix, introduce a separate flag rli->sql_thread_caught_up to mark that the SQL driver thread is caught up. This avoids issues with worker threads overwriting the SQL driver thread status. In parallel replication, we then make SHOW SLAVE STATUS check in addition that all worker threads are idle before showing Seconds_Behind_Master as 0 due to slave idle.	2014-01-08 11:00:44 +01:00

6 Commits
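
The MDEV-5509 message above describes replacing a zero-timestamp sentinel with a separate caught-up flag. The following is a minimal Python sketch of that idea, not the server's C++ code: the `RelayLogInfo` and `Worker` classes and the `seconds_behind_master` function are hypothetical stand-ins, and only the field names `last_master_timestamp` and `sql_thread_caught_up` are taken from the commit message.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Worker:
    """Hypothetical stand-in for one parallel replication worker thread."""
    idle: bool = True


@dataclass
class RelayLogInfo:
    """Hypothetical model of the replica state named in MDEV-5509."""
    last_master_timestamp: int = 0      # timestamp of the last executed event
    sql_thread_caught_up: bool = False  # set only by the SQL driver thread
    workers: List[Worker] = field(default_factory=list)


def seconds_behind_master(rli: RelayLogInfo, now: int) -> int:
    """Report 0 only if the driver is caught up AND every worker is idle."""
    if rli.sql_thread_caught_up and all(w.idle for w in rli.workers):
        return 0
    return max(0, now - rli.last_master_timestamp)


def worker_finishes_event(rli: RelayLogInfo, worker: Worker, event_ts: int) -> None:
    """A worker records its event timestamp; it never touches the flag."""
    rli.last_master_timestamp = event_ts
    worker.idle = True
```

Because the worker writes only `last_master_timestamp`, a worker that finishes after the driver has marked itself caught up can no longer overwrite the "caught up" status, which is the out-of-order race the commit message describes.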