mirror of
https://github.com/MariaDB/server.git
synced 2025-08-07 00:04:31 +03:00
MDEV-11853: semisync thread can be killed after sync binlog but before ACK in the sync state
Problem: ======== If a primary is shutdown during an active semi-sync connection during the period when the primary is awaiting an ACK, the primary hard kills the active communication thread and does not ensure the transaction was received by a replica. This can lead to an inconsistent replication state. Solution: ======== During shutdown, the primary should wait for an ACK or timeout before hard killing a thread which is awaiting a communication. We extend the `SHUTDOWN WAIT FOR SLAVES` logic to identify and ignore any threads waiting for a semi-sync ACK in phase 1. Then, before stopping the ack receiver thread, the shutdown is delayed until all waiting semi-sync connections receive an ACK or time out. The connections are then killed in phase 2. Notes: 1) There remains an unresolved corner case that affects this patch. MDEV-28141: Slave crashes with Packets out of order when connecting to a shutting down master. Specifically, If a slave is connecting to a master which is actively shutting down, the slave can crash with a "Packets out of order" assertion error. To get around this issue in the MTR tests, the primary will wait a small amount of time before phase 1 killing threads to let the replicas safely stop (if applicable). 2) This patch also fixes MDEV-28114: Semi-sync Master ACK Receiver Thread Can Error on COM_QUIT Reviewed By ============ Andrei Elkin <andrei.elkin@mariadb.com>
This commit is contained in:
@@ -640,7 +640,8 @@ THD::THD(my_thread_id id, bool is_wsrep_applier)
|
||||
#ifdef HAVE_REPLICATION
|
||||
,
|
||||
current_linfo(0),
|
||||
slave_info(0)
|
||||
slave_info(0),
|
||||
is_awaiting_semisync_ack(0)
|
||||
#endif
|
||||
#ifdef WITH_WSREP
|
||||
,
|
||||
|
Reference in New Issue
Block a user