mirror of https://github.com/MariaDB/server.git synced 2026-01-06 05:22:24 +03:00

MDEV-742 XA PREPAREd transaction survive disconnect/server restart

Lifted the long-standing XA limitation whereby a transaction was rolled
back when its connection closed, even if the XA was already prepared.

A prepared XA transaction is now made to survive connection close and
server restart.
The patch consists of

    - a binary logging extension that writes the prepared part of an XA
      transaction, identified by its XID, in a new
      XA_prepare_log_event. The conclusion part, with the Commit or
      Rollback decision, is logged separately as a Query_log_event.
      That is, in the binlog the XA consists of two separate groups of
      events.

      As a result, a whole XA may interleave in the binlog with
      other XA:s or regular transactions, with no harm to
      replication or data consistency.

      Gtid_log_event receives two more flags to identify which of the
      two XA phases of the transaction it represents. When either flag
      is set, XID info is also added to the event.

      When the binlog is ON on the server, XID::formatID is
      constrained to 4 bytes.

    - engines are made aware of the server policy of keeping
      user-prepared XA:s, so they (InnoDB, RocksDB) no longer roll
      them back in their disconnect methods.

    - the slave applier is refined to cope with two-phase logged XA:s,
      including in parallel modes of execution.

This patch does not address crash-safe logging of the new events, which
is being addressed by MDEV-21469.
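
The two-group layout described above can be pictured with a toy model. This is only a sketch in Python, not server code: the event kinds ("XA_PREPARE", "XA_COMMIT", "XA_ROLLBACK", "REGULAR") and the helper pending_xids() are hypothetical names chosen for illustration. It shows how a prepare group and its completion group may be separated in the binlog by other transactions, and which XIDs are still awaiting a decision.

```python
def pending_xids(binlog):
    """Return XIDs whose prepare group has no completion group yet.

    `binlog` is a list of (kind, xid) pairs; this is a toy stand-in
    for event groups, not the real binlog format.
    """
    pending = []
    for kind, xid in binlog:
        if kind == "XA_PREPARE":
            pending.append(xid)
        elif kind in ("XA_COMMIT", "XA_ROLLBACK") and xid in pending:
            pending.remove(xid)
    return pending


# trx_1 is prepared, other work is logged, then trx_1 is completed.
toy_binlog = [
    ("XA_PREPARE", "trx_1"),
    ("REGULAR", None),
    ("XA_PREPARE", "trx_2"),
    ("XA_COMMIT", "trx_1"),
]
print(pending_xids(toy_binlog))  # ['trx_2']
```

In this model trx_2 plays the role of a prepared XA that would still be listed after a disconnect or restart until someone issues its Commit or Rollback.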

CORNER CASES: read-only, pure myisam, binlog-*, @@skip_log_bin, etc

These are addressed according to the following policies.
1. A read-only XA has its XID marked at disconnect, so that any future
   completion fails with ER_XA_RBROLLBACK.

2. A binlog-* filtered XA that changes engine data is regarded as
   loggable even when nothing got cached for the binlog.  An empty
   XA-prepare group is recorded. The subsequent Commit-or-Rollback
   succeeds in the Engine(s) and is recorded into the binlog as well.

3. The same applies to a non-transactional engine XA.

4. @@skip_log_bin=OFF does not record anything at XA-prepare
   (obviously), but the completion event is recorded into the binlog,
   which admits inconsistency with the slave.

The following actions are taken by the patch.

At XA-prepare:
   when the binlog cache is empty - don't do anything to the binlog if
   RO, otherwise write an empty XA_prepare (assert(binlog-filter case)).

At Disconnect:
   when Prepared && RO (=> no binlogging was done)
     set Xid_cache_element::error := ER_XA_RBROLLBACK
     *keep* XID in the cache, and rollback the transaction.

At XA-"complete":
   Discover the error, if any. If there is one, don't binlog the
   "complete" and return the error to the user.
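
The three actions above can be traced with a small toy model. This is a sketch under stated assumptions, not the server implementation: the class ToyXaServer and its method names are hypothetical, the binlog is a plain list, and the model removes the XID from the cache once the completion has been attempted, which the message does not spell out. It also binlogs every error-free completion, which is a simplification.

```python
ER_XA_RBROLLBACK = 1402  # numeric code also used by the tests (--error 0,1402)


class ToyXaServer:
    """Toy model of the prepare / disconnect / complete policy."""

    def __init__(self):
        self.xid_cache = {}  # xid -> {"read_only": bool, "error": int}
        self.binlog = []     # list of (event, xid, payload) tuples

    def xa_prepare(self, xid, read_only, binlog_cache):
        if binlog_cache:
            self.binlog.append(("XA_prepare", xid, tuple(binlog_cache)))
        elif not read_only:
            # Empty cache but engine data changed (binlog-filter case):
            # an empty XA_prepare group is written.
            self.binlog.append(("XA_prepare", xid, ()))
        # Read-only with an empty cache: nothing is written.
        self.xid_cache[xid] = {"read_only": read_only, "error": 0}

    def disconnect(self, xid):
        entry = self.xid_cache[xid]
        if entry["read_only"]:
            # Keep the XID in the cache, but mark it to fail later.
            entry["error"] = ER_XA_RBROLLBACK

    def xa_complete(self, xid, decision):
        # Assumption: the entry leaves the cache on any completion attempt.
        entry = self.xid_cache.pop(xid)
        if entry["error"]:
            return entry["error"]  # not binlogged, error goes to the user
        self.binlog.append((decision, xid, ()))
        return 0


server = ToyXaServer()
server.xa_prepare("trx_ro", read_only=True, binlog_cache=[])
server.disconnect("trx_ro")
print(server.xa_complete("trx_ro", "XA COMMIT"))  # 1402
print(server.binlog)                              # []
```

A read-write XA follows the other path: its prepare group is written at XA-prepare, disconnect leaves it untouched, and the later completion is appended to the toy binlog.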

Kudos
-----
Alexey Botchkov initially drove this work.
Sergei Golubchik, Sergei Petrunja, Marko Mäkelä provided a number of
good recommendations.
Sergei Voitovich made a magnificent review and improvements to the code.
They all deserve a bunch of thanks for getting this work done!
Andrei Elkin
2019-03-31 01:47:28 +04:00
parent 5754ea2eca
commit c8ae357341
72 changed files with 8869 additions and 222 deletions

@@ -0,0 +1,31 @@
#
# This file initiates connections to run XA transactions up to
# their prepare.
# The connection name, the transaction name and its content depend on
# the supplied parameters.
#
# param $type type of transaction
# param $index index that identifies the connection among those of type $type
# param $sql_init1 a query to execute once connection is established
# param $sql_init2 a query to execute once connection is established
# param $sql_doit a query to execute inside transaction
# Note, the query may depend on tables created by caller
#
--connect (conn$index$type, 127.0.0.1,root,,test,$MASTER_MYPORT,)
if ($sql_init1)
{
--eval $sql_init1
}
if ($sql_init2)
{
--eval $sql_init2
}
--eval XA START 'trx$index$type'
if ($sql_doit)
{
--eval $sql_doit
}
--eval XA END 'trx$index$type'
--eval XA PREPARE 'trx$index$type'

@@ -0,0 +1,37 @@
#
# This file disconnects two connections. One actively and one through
# kill. It is included by binlog_xa_prepared_do_and_restart.
#
# param $type type of transaction
# param $terminate_with how to conclude the actively disconnected one:
# XA COMMIT or XA ROLLBACK
# param $conn3_id connection id of the connection being killed.
# param $num_trx_prepared number of transactions prepared so far
#
--connection default
--echo *** $num_trx_prepared prepared transactions must be in the list ***
--replace_column 2 LEN1 3 LEN2 4 TRX_N
XA RECOVER;
--connection conn1$type
--let $conn1_id=`SELECT connection_id()`
--disconnect conn1$type
--connection default
--let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn1_id
--source include/wait_condition.inc
# It will conclude now
--error 0,1402
--eval $terminate_with 'trx1$type'
--replace_result $conn3_id CONN_ID
--eval KILL connection $conn3_id
--let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn3_id
--source include/wait_condition.inc
# It will conclude now
--error 0,1402
--eval $terminate_with 'trx3$type'

@@ -0,0 +1,323 @@
#
# This file creates various kinds of prepared XA transactions,
# manipulates their connection state and examines how their prepared
# status behaves while the transaction is disconnected, killed or
# the server is shut down.
# The file can be sourced multiple times.
# param $restart_number (the number of inclusions so far) adjusts
# the verification logic.
#
# param [in] $conn_number Total number of connections, each performing
# one insert into the table.
# param [in] $commit_number Number of commits on either
# side of the server restart.
# param [in] $rollback_number The same as the above, but for rollbacks.
# param [in] $term_number Number of transactions that are terminated
# before the server restarts
# param [in] $killed_number Instead of disconnecting, make some
# connections killed when their
# transactions got prepared.
# param [in] $server_disconn_number Make some connections disconnected
# by shutdown rather than actively
# param [in] $post_restart_conn_number Number of "warmup" connections
# after server restart; they all commit
# param [out] restart_number Counter to be incremented at the end of the test
#
# The test consists of three sections:
# I. Corner cases check
# II. Regular case check
# III. Post server-restart verification
#
# I. Corner cases of
#
# A. XA with an update to a temp table
# B. XA with SELECT
# C. XA empty
# Demonstrate their XA status upon prepare and how they react to disconnect and
# shutdown.
# In each of A,B,C three prepared transactions are set up.
# trx1 is for disconnection, trx2 for shutdown, trx3 for being killed.
# The A case additionally contains some checks of prohibited XA state transitions.
#
# D. Prove that a non-prepared XA is still cleared out by disconnection.
#
#
# A. A prepared XA that touches only a temp table is recovered only formally,
# so that a post-recovery XA COMMIT or XA ROLLBACK has no effect.
--let $type = tmp
--let $index = 1
--let $sql_init1 = SET @@sql_log_bin = OFF
--let $sql_init2 = CREATE TEMPORARY TABLE tmp$index (a int) ENGINE=innodb
--let $sql_doit = INSERT INTO tmp$index SET a=$index
--source suite/binlog/include/binlog_xa_prepare_connection.inc
--let $index = 2
--source suite/binlog/include/binlog_xa_prepare_connection.inc
--let $index = 3
--source suite/binlog/include/binlog_xa_prepare_connection.inc
--let $conn3_id=`SELECT connection_id()`
#
# Various prohibited XA state changes to test here:
#
--connection default
# Stealing is not allowed
--error ER_XAER_NOTA
--eval XA COMMIT 'trx1$type'
--error ER_XAER_NOTA
--eval XA ROLLBACK 'trx1$type'
# Before disconnect: creating a duplicate is not allowed
--error ER_XAER_DUPID
--eval XA START 'trx1$type'
# Manipulate now the prepared transactions.
# Two to terminate, one to leave out.
--let $terminate_with = XA COMMIT
--let $num_trx_prepared = $index
--source suite/binlog/include/binlog_xa_prepare_disconnect.inc
#
# B. A "read-only" (select) prepared XA is recovered only formally,
# so that a post-recovery XA COMMIT or XA ROLLBACK has no effect.
#
--let $type=ro
--let $index = 1
--let $sql_init1 =
--let $sql_init2 =
--let $sql_doit = SELECT * from t ORDER BY a
--source suite/binlog/include/binlog_xa_prepare_connection.inc
--let $index = 2
--source suite/binlog/include/binlog_xa_prepare_connection.inc
--let $index = 3
--source suite/binlog/include/binlog_xa_prepare_connection.inc
--let $conn3_id=`SELECT connection_id()`
--let $terminate_with = XA ROLLBACK
# Two of the three transactions prepared in the above section were terminated.
--inc $num_trx_prepared
--source suite/binlog/include/binlog_xa_prepare_disconnect.inc
#
# C. An empty prepared XA is recovered only formally,
# so that a post-recovery XA COMMIT or XA ROLLBACK has no effect.
#
--let $type=empty
--let $index = 1
--let $sql_init1 =
--let $sql_init2 =
--let $sql_doit =
--source suite/binlog/include/binlog_xa_prepare_connection.inc
--let $index = 2
--source suite/binlog/include/binlog_xa_prepare_connection.inc
--let $index = 3
--source suite/binlog/include/binlog_xa_prepare_connection.inc
--let $conn3_id=`SELECT connection_id()`
--let $terminate_with = XA COMMIT
--inc $num_trx_prepared
--source suite/binlog/include/binlog_xa_prepare_disconnect.inc
#
# D. A non-prepared XA is cleared out at disconnect,
# with no effect left on the data either.
# A few more prohibited XA state transitions are checked.
#
--let $type=unprepared
--let $prev_count=`SELECT count(*) from t`
--connect(conn1$type, 127.0.0.1,root,,test,$MASTER_MYPORT,)
--eval XA START 'trx1$type'
INSERT INTO t set a=0;
--eval XA END 'trx1$type'
--error ER_XAER_RMFAIL
INSERT INTO t set a=0;
--error ER_XAER_RMFAIL
--eval XA START 'trx1$type'
--error ER_XAER_RMFAIL
--eval XA START 'trx1$type'
--disconnect conn1$type
--connection default
# No such transactions
--error ER_XAER_NOTA
--eval XA COMMIT 'trx1$type'
if (`SELECT count(*) > $prev_count from t`)
{
--echo *** Unexpected commit to the table. ***
--die
}
#
# II. Regular case.
#
# Prepared transactions get disconnected in three ways:
# actively, being killed and by the server shutdown.
#
--let $i=0
while ($i < $conn_number)
{
--connect (conn$i, 127.0.0.1,root,,test,$MASTER_MYPORT,)
--let $conn_id=`SELECT connection_id()`
--disable_reconnect
SET @@binlog_format = STATEMENT;
if (`SELECT $i % 2`)
{
SET @@binlog_format = ROW;
}
--eval XA START 'trx_$i'
--eval INSERT INTO t SET a=$i
--eval XA END 'trx_$i'
--eval XA PREPARE 'trx_$i'
--let $disc_via_kill=`SELECT $conn_number - $i <= $killed_number`
if (!$disc_via_kill)
{
--let $disc_via_shutdown=`SELECT $conn_number - $i <= $killed_number + $server_disconn_number`
if (!$disc_via_shutdown)
{
--disconnect conn$i
}
}
if ($disc_via_kill)
{
--connection default
--replace_result $conn_id CONN_ID
--eval KILL CONNECTION $conn_id
}
if (!$disc_via_shutdown)
{
--connection default
--let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn_id
--source include/wait_condition.inc
}
--inc $i
}
# [0, $rollback_number - 1] are rolled back now
--connection default
--let $i=0
while ($i < $rollback_number)
{
--eval XA ROLLBACK 'trx_$i'
--inc $i
}
# [$rollback_number, $rollback_number + $commit_number - 1] get committed
while ($i < $term_number)
{
--eval XA COMMIT 'trx_$i'
--inc $i
}
--source include/$how_to_restart
#
# III. Post server-restart verification.
# It concludes the surviving XA:s with the number of commits and rollbacks
# configured in the 1st part, to check the expected results at the end.
# The cleanup section consists of explicit disconnects (for connections
# killed, or not disconnected before shutdown).
#
# New XA can be prepared and committed
--let $k = 0
while ($k < $post_restart_conn_number)
{
--connect (conn_restart_$k, 127.0.0.1,root,,test,$MASTER_MYPORT,)
--let $conn_id=`SELECT connection_id()`
--eval XA START 'new_trx_$k'
--eval INSERT INTO t SET a=$k
--eval XA END 'new_trx_$k'
--eval XA PREPARE 'new_trx_$k'
--disconnect conn_restart_$k
--connection default
--let $wait_condition= SELECT count(*) = 0 FROM v_processlist WHERE PROCESSLIST_ID = $conn_id
--source include/wait_condition.inc
--inc $k
}
--connection default
--let $k = 0
while ($k < $post_restart_conn_number)
{
--eval XA COMMIT 'new_trx_$k'
--inc $k
}
#
# Symmetrically to the pre-restart, the resurrected trx:s are committed
# [$term_number, $term_number + $commit_number - 1]
# and the rest is rolled back.
#
--let $i = $term_number
while ($i < `SELECT $term_number + $commit_number`)
{
# Expected to fail
--error ER_XAER_DUPID
--eval XA START 'trx_$i'
--eval XA COMMIT 'trx_$i'
--inc $i
}
while ($i < $conn_number)
{
# Expected to fail
--error ER_XAER_DUPID
--eval XA START 'trx_$i'
--eval XA ROLLBACK 'trx_$i'
--inc $i
}
#
# Verification of correct results of recovered XA transaction handling:
#
SELECT * FROM t;
--let $type=tmp
--disconnect conn2$type
--disconnect conn3$type
--let $type=ro
--disconnect conn2$type
--disconnect conn3$type
--let $type=empty
--disconnect conn2$type
--disconnect conn3$type
--let $i= $conn_number
--let $k= 0
--let $expl_disconn_number = `SELECT $killed_number + $server_disconn_number`
while ($k < $expl_disconn_number)
{
--connection default
--error ER_XAER_NOTA
--eval XA ROLLBACK 'trx_$i'
--dec $i
--disconnect conn$i
--inc $k
}
--inc $restart_number

@@ -0,0 +1,33 @@
RESET MASTER;
CREATE TABLE t1 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
connect con1,localhost,root,,;
SET DEBUG_SYNC= "at_unlog_xa_prepare SIGNAL con1_ready WAIT_FOR con1_go";
XA START '1';
INSERT INTO t1 SET a=1;
XA END '1';
XA PREPARE '1';;
connection default;
SET DEBUG_SYNC= "now WAIT_FOR con1_ready";
FLUSH LOGS;
FLUSH LOGS;
FLUSH LOGS;
show binary logs;
Log_name File_size
master-bin.000001 #
master-bin.000002 #
master-bin.000003 #
master-bin.000004 #
include/show_binlog_events.inc
Log_name Pos Event_type Server_id End_log_pos Info
master-bin.000004 # Format_desc # # SERVER_VERSION, BINLOG_VERSION
master-bin.000004 # Gtid_list # # [#-#-#]
master-bin.000004 # Binlog_checkpoint # # master-bin.000001
SET DEBUG_SYNC= "now SIGNAL con1_go";
connection con1;
*** master-bin.000004 checkpoint must show up now ***
connection con1;
XA ROLLBACK '1';
SET debug_sync = 'reset';
connection default;
DROP TABLE t1;
SET debug_sync = 'reset';

File diff suppressed because it is too large

File diff suppressed because it is too large

@@ -0,0 +1,57 @@
--source include/have_innodb.inc
--source include/have_debug.inc
--source include/have_debug_sync.inc
--source include/have_binlog_format_row.inc
RESET MASTER;
CREATE TABLE t1 (a INT PRIMARY KEY, b MEDIUMTEXT) ENGINE=Innodb;
# Test that
# 1. XA PREPARE is binlogged before the XA has been prepared in Engine
# 2. While XA PREPARE is already binlogged in an old binlog file that has been rotated,
# a Binlog checkpoint is not generated for the latest log until
# XA PREPARE returns, e.g. OK, to the client.
# con1 will hang before doing commit checkpoint, blocking RESET MASTER.
connect(con1,localhost,root,,);
SET DEBUG_SYNC= "at_unlog_xa_prepare SIGNAL con1_ready WAIT_FOR con1_go";
XA START '1';
INSERT INTO t1 SET a=1;
XA END '1';
--send XA PREPARE '1';
connection default;
SET DEBUG_SYNC= "now WAIT_FOR con1_ready";
FLUSH LOGS;
FLUSH LOGS;
FLUSH LOGS;
--source include/show_binary_logs.inc
--let $binlog_file= master-bin.000004
--let $binlog_start= 4
--source include/show_binlog_events.inc
SET DEBUG_SYNC= "now SIGNAL con1_go";
connection con1;
reap;
--echo *** master-bin.000004 checkpoint must show up now ***
--source include/wait_for_binlog_checkpoint.inc
# Todo: think about the error code returned, move to an appropriate test, or remove
# connection default;
#--error 1399
# DROP TABLE t1;
connection con1;
XA ROLLBACK '1';
SET debug_sync = 'reset';
# Clean up.
connection default;
DROP TABLE t1;
SET debug_sync = 'reset';

@@ -0,0 +1,102 @@
--source include/have_innodb.inc
--source include/have_perfschema.inc
#
# The test verifies binlogging of XA transaction and state of prepared XA
# as far as binlog is concerned.
#
# The prepared XA transactions can be disconnected from the client,
# discovered from another connection and committed or rolled back
# later. They also survive the server restart. The test runs two
# loops, each consisting of generating prepared XA:s, manipulating
# them, and a server restart followed by completion of the surviving
# XA:s.
#
# A prepared XA does not become available to an external connection
# until the connection that either leaves actively or is killed
# has completed a necessary part of its cleanup.
# Selecting from P_S.threads provides a method to learn that.
#
# Total number of connections, each performing one insert into the table
--let $conn_number=20
# Number of rollbacks and commits from either side of the server restart
--let $rollback_number=5
--let $commit_number=5
# Number of transactions that are terminated before server restarts
--let $term_number=`SELECT $rollback_number + $commit_number`
# Instead of disconnect make some connections killed when their
# transactions got prepared.
--let $killed_number=5
# make some connections disconnected by shutdown rather than actively
--let $server_disconn_number=5
--let $prepared_at_server_restart = `SELECT $conn_number - $term_number`
# Number of "warmup" connections after server restart; they all commit
--let $post_restart_conn_number=10
# Counter to be used in GTID consistency check.
# It's incremented per each non-XA transaction commit.
# Local to this file variable to control one-phase commit loop
--let $one_phase_number = 5
--connection default
# Remove possibly preceding binlogs and clear initialization-time
# GTID executed info. In the following, all transactions are counted
# to conduct verification at the end of the test.
if (`SELECT @@global.log_bin`)
{
RESET MASTER;
}
# Disconnected and follower threads need synchronization
CREATE VIEW v_processlist as SELECT * FROM performance_schema.threads where type = 'FOREGROUND';
--eval call mtr.add_suppression("Found $prepared_at_server_restart prepared XA transactions")
CREATE TABLE t (a INT) ENGINE=innodb;
# Counter is incremented at the end of post restart to
# reflect number of loops done in correctness computation.
--let $restart_number = 0
--let $how_to_restart=restart_mysqld.inc
--source suite/binlog/include/binlog_xa_prepared_do_and_restart.inc
--let $how_to_restart=kill_and_restart_mysqld.inc
--source suite/binlog/include/binlog_xa_prepared_do_and_restart.inc
--connection default
# A few XA:s that commit in one phase, not subject to the server restart
# nor reconnect.
# This piece of the test is related to the mysqlbinlog recovery check below.
--let $k = 0
while ($k < $one_phase_number)
{
--eval XA START 'one_phase_trx_$k'
--eval INSERT INTO t SET a=$k
--eval XA END 'one_phase_trx_$k'
--eval XA COMMIT 'one_phase_trx_$k' ONE PHASE
--inc $k
}
SELECT SUM(a) FROM t;
DROP TABLE t;
DROP VIEW v_processlist;
let $outfile= $MYSQLTEST_VARDIR/tmp/mysqlbinlog.sql;
if (`SELECT @@global.log_bin`)
{
# Recording proper samples of binlogged prepared XA:s
--source include/show_binlog_events.inc
--exec $MYSQL_BINLOG -R --to-last-log master-bin.000001 > $outfile
}
--echo All transactions must be completed, to empty-list the following:
XA RECOVER;
if (`SELECT @@global.log_bin`)
{
--exec $MYSQL test < $outfile
--remove_file $outfile
XA RECOVER;
}
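
As a worked illustration of the parameters set in the test file above, the following Python sketch recomputes which connection index is disconnected in which way and how its transaction is concluded. It mirrors the arithmetic of the mysqltest loops (the `$conn_number - $i <= $killed_number` style conditions and the rollback/commit index ranges); the function name test_plan() and the returned labels are hypothetical and exist only for this illustration.

```python
def test_plan(conn_number=20, rollback_number=5, commit_number=5,
              killed_number=5, server_disconn_number=5):
    """Map each connection index to (how it disconnects, fate of its XA)."""
    term_number = rollback_number + commit_number
    result = {}
    for i in range(conn_number):
        # Same conditions as $disc_via_kill / $disc_via_shutdown.
        if conn_number - i <= killed_number:
            how = "kill"
        elif conn_number - i <= killed_number + server_disconn_number:
            how = "shutdown"
        else:
            how = "disconnect"
        # Index ranges used by the rollback and commit loops.
        if i < rollback_number:
            fate = "rollback before restart"
        elif i < term_number:
            fate = "commit before restart"
        elif i < term_number + commit_number:
            fate = "commit after restart"
        else:
            fate = "rollback after restart"
        result[i] = (how, fate)
    return result


plan = test_plan()
print(plan[0])   # ('disconnect', 'rollback before restart')
print(plan[12])  # ('shutdown', 'commit after restart')
print(plan[19])  # ('kill', 'rollback after restart')
```

With the default values, indexes 0 to 9 disconnect actively and are concluded before the restart, while indexes 10 to 19 stay prepared across the restart, which matches $prepared_at_server_restart being $conn_number - $term_number.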

@@ -0,0 +1,11 @@
###############################################################################
# Bug#12161 Xa recovery and client disconnection
# Testing new server options and binary logging prepared XA transaction.
###############################################################################
#
# MIXED mode is chosen because formats are varied inside the sourced tests.
#
--source include/have_binlog_format_mixed.inc
--source suite/binlog/t/binlog_xa_prepared.inc