mirror of https://github.com/MariaDB/server.git (synced 2025-08-30 11:22:14 +03:00)
Bug#37780: Make KILL reliable (main.kill fails randomly)
- A prerequisite cleanup patch for making KILL reliable.

The test case main.kill did not work reliably.

The following problems have been identified:

1. A kill signal could be lost if it arrived shortly before a thread started reading on the client connection.

2. A kill signal could be lost if it arrived shortly before a thread started waiting on a condition variable.

These problems have been solved as follows. Please see the added code comments for more details.

1. There is no safe way to detect when a thread enters the blocking state of a read(2) or recv(2) system call, where it can be interrupted by a signal. Hence it is not possible to wait for the right moment to send a kill signal. It has been decided not to fix this in the code. Instead, the test case repeats the KILL statement until the connection terminates.

2. Before waiting on a condition variable, we register it together with a synchronizing mutex in THD::mysys_var. After this, we need to test THD::killed again. In some places we tested it only in a loop condition before the registration. When THD::killed was set between this test and the registration, we entered the wait without noticing the killed flag. Additional checks have been introduced where required.

In addition to the above, the main.kill test case has been rewritten. All sleeps have been replaced by Debug Sync Facility synchronization. A couple of sync points have been added to the server code.

To avoid further problems if the test case fails in spite of the fixes, the test case has been added to the "experimental" list for now.

- Most of the work on this patch is authored by Ingo Struewing
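The second problem (a kill arriving between the test of the killed flag and the registration of the condition variable) can be illustrated with a small standalone sketch. It is not the server code: `Session`, `kill_session`, `wait_for_event`, and the `hook` parameter are hypothetical names, the `enter_cond`/`exit_cond` helpers here are simplified stand-ins, and the locking order was chosen for the sketch rather than copied from the server. The hook plays a role similar to a sync point by making the race window reproducible.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical stand-in for per-connection state; NOT the server's THD.
struct Session {
  std::atomic<bool> killed{false};
  std::mutex reg_lock;  // guards the two registration pointers below
  std::mutex *current_mutex = nullptr;
  std::condition_variable *current_cond = nullptr;
};

// Publish what this session is about to wait on, so a killer can wake it.
void enter_cond(Session &s, std::mutex &m, std::condition_variable &cv) {
  std::lock_guard<std::mutex> g(s.reg_lock);
  assert(s.current_cond == nullptr);  // no nested waits in this sketch
  s.current_mutex = &m;
  s.current_cond = &cv;
}

// Withdraw the registration once the wait is over.
void exit_cond(Session &s) {
  std::lock_guard<std::mutex> g(s.reg_lock);
  s.current_mutex = nullptr;
  s.current_cond = nullptr;
}

// Killer side: set the flag first, then wake the registered waiter, if any.
void kill_session(Session &s) {
  s.killed.store(true);
  std::lock_guard<std::mutex> g(s.reg_lock);
  if (s.current_cond != nullptr) {
    std::lock_guard<std::mutex> m(*s.current_mutex);
    s.current_cond->notify_all();
  }
}

// Returns true if the wait finished without the session being killed.
// `hook` (an assumption of this sketch) runs between the first test of the
// flag and the registration, which is exactly the window described above.
bool wait_for_event(Session &s, std::mutex &m, std::condition_variable &cv,
                    const bool &ready,
                    const std::function<void()> &hook = nullptr) {
  if (s.killed.load()) return false;  // first test, before registration
  if (hook) hook();                   // a kill here finds nothing to wake
  enter_cond(s, m, cv);
  {
    std::unique_lock<std::mutex> lk(m);
    // Second test, after registration. Without `!s.killed.load()` here, a
    // kill that arrived in the window above would leave this thread waiting.
    while (!ready && !s.killed.load()) cv.wait(lk);
  }
  exit_cond(s);
  return !s.killed.load();
}

// Kill lands in the window between the first test and the registration.
bool demo_kill_in_window() {
  Session s;
  std::mutex m;
  std::condition_variable cv;
  bool ready = false;
  return wait_for_event(s, m, cv, ready, [&s] { kill_session(s); });
}

// Kill lands while the thread is most likely already blocked in wait().
bool demo_kill_while_waiting() {
  Session s;
  std::mutex m;
  std::condition_variable cv;
  bool ready = false;
  bool result = true;
  std::thread t([&] { result = wait_for_event(s, m, cv, ready); });
  std::this_thread::sleep_for(std::chrono::milliseconds(50));
  kill_session(s);
  t.join();
  return result;
}
```

Both demo functions return false, meaning the wait was abandoned because of the kill. If the loop condition in `wait_for_event` tested only `ready`, `demo_kill_in_window` would block forever, which is the kind of hang the additional checks are meant to prevent.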
@@ -712,6 +712,22 @@ bool do_command(THD *thd)
 
   net_new_transaction(net);
 
+  /*
+    Synchronization point for testing of KILL_CONNECTION.
+    This sync point can wait here, to simulate slow code execution
+    between the last test of thd->killed and blocking in read().
+
+    The goal of this test is to verify that a connection does not
+    hang, if it is killed at this point of execution.
+    (Bug#37780 - main.kill fails randomly)
+
+    Note that the sync point wait itself will be terminated by a
+    kill. In this case it consumes a condition broadcast, but does
+    not change anything else. The consumed broadcast should not
+    matter here, because the read/recv() below doesn't use it.
+  */
+  DEBUG_SYNC(thd, "before_do_command_net_read");
+
   if ((packet_length= my_net_read(net)) == packet_error)
   {
     DBUG_PRINT("info",("Got error %d reading command from socket %s",