Allocate separate client state in mock_client_service::replay()
for replaying step.
Added two new test cases for streaming replication replay after
BF abort.
* Move resetting is_bf_immutable_ into trasaction::cleanup()
to ensure that it is reset to false regardless how the
transaction terminates.
* Removed redundant lock()/unlock() methods from
mock_client_state.
The transaction state is set to s_ordered_commit in
ordered_commit(). However, this is too late for making the
transaction immune for BF aborts after commit order has
been established, which happens in before_commit().
Moving the state change into before_commit() would be the
right thing to do, but that would require too many fixes
to existing applications which are using the lib.
In order to make the transaction immune for BF abort
after it has been ordered to commit, introduce additional
boolean flag which is set to true at the end of before_commit()
and is taken into account in bf_abort().
Streaming rollback for total order BF abort used regular
BF abort codepath, which was not correct because the streaming
rollback must fully complete before total order operation executes.
Fixed this by adjusting bf_aborted_in_total_order_
before streaming_rollback() gets called.
The condition to skip changing to `s_joined` for all codepaths
which return from donor state. Extracted the logic into separate
method.
Commented start_sst_action in mock_server_service.
When the donor lost its donor state during SST due to cluster
partitioning, the state was erranously changed to `s_joined`
in `start_sst()` and `sst_sent()`, which caused assertion failures
in state checking.
Fixed by changing state to `s_joined` only if donor is still in
`s_donor` state.
Use pointers to pass state objects to service constructors
to work around GCC 12 warning
error: member ‘wsrep::mock_storage_service::client_state_’
is used uninitialized
Changed server_state public methods sst_received() and wait_until_state()
to report errors as return value instead of throwing exceptions.
This was done to gradually get rid of public methods which report
errors via exceptions.
This change was part of MDEV-30419.
Report event will write json formatted event into report
file.
Include Boost headers as system headers to avoid generating
excessive warnings. Enable extra tests for selected compilers
in actions.
This patch introduces a queue to store ids of transactions that failed
to send a rollback fragment in streaming_rollback(). This is to avoid
potentially missed rollback fragments when a cluster splits and then
later reforms. Rollback fragments would be missing if a node rolled
back a transaction locally (either BFed or voluntary rollback) while
non-primary, and the attempt to send rollback fragment failed in
transaction::streaming_rollback().
Transaction that fail to send rollback fragment can proceed to
rollback locally. However we must ensure that rollback fragments for
those transactions are eventually delivered by the cluster. This must
be done before a potentially conflicting writeset causes BF-BF
conflicts in the rest of the cluster.
Handle the case were prepare is bf aborted after it has replicated a
fragment, and before the command finishes in
after_command_before_result() and after_command_after_result() hooks.
Client state end_rsu() didn't reset toi_mode to m_undefined,
which caused an assertion when NBO was started after RSU.
As a fix, reset toi_mode to m_undefined in end_rsu() after
changing mode.
This patch adds the possibility to have client commands that do not
return results from DBMS. While processing such commands we must be
able to preserve errors until the next interaction with client.
Specifically if the transaction is bf aborted while processing such
a non-returning command, then we have to keep the deadlock error until
the client issues a command that may return the error.
To handle such cases, client_state::before_command() now takes
parameter keep_command_error. The DBMS is supposed set
keep_command_error true to instruct wsrep-lib to preserve errors (if
any) until the next command which sets keep_command_error false.
Dealing with a case where current client command does not return result.
Work in progress.
Fix typo and add assertions in keep_command_error()
Make keep_command_error a parameter to before_commit()
Fix comment about keep_command_error
Handle keep_command_error with s_must_abort in wsrep_before_command()
Fix unit test
* Added unit tests for transaction::xa_detach() and
transaction::xa_replay()
* Added unit tests for wsrep::xid
* Fixed minor issues pointed out by reviewer
This patch implments replaying for prepared XA transactions.
Replay may happen in the following cases:
1) The transaction is BF aborted in prepared state and is idle. In
that case, the transaction is handed over to rollbacker for replay.
2) The transaction is BF aborted while executing the
commit (i.e. before or after successful certification). In
which case the transaction replays itself from fragment storage.
3) The transaction is BF aborted while certifying its commit
fragment. This case is handled like replay for streaming transactions,
where the provider is directly involved and re-delivers the last
fragment.
Add support for detaching XA transactions. This is useful for handling
the case where the DBMS client has a transaction in prepared state and
disconnects. Before disconnect, the DBMS calls the newly introduced
client_state::xa_detach(), to cleanup the local transaction and
convert it to a high priority transaction. The DBMS may later attempt
to terminate the transaction through client_state::commit_by_xid() or
client_state::rollback_by_xid().
Also in this patch:
- Fix client_state::close() so that it does not rollback transactions
in prepared state
- Changed class wsrep::xid representation to hold enough information
so that DBMS can convert to its native representation
- Fix potential infinite loop in
server_state::find_streaming_applier(wsrep:xid&)
- Append SR keys on prepare fragment and make it pa_unsafe
- Handle one phase commit (simply fall back to two phase)
- Do not rollback prepared streaming clients in
server_state::close_orphaned_transactions()
Modified tests to verify that client_service::will_replay() is
called whenever it is determined that the transaction must replay.
Added a test to verify behavior when provider::commit_order_enter()
returns BF abort error.
Moved call to client_service::will_replay() into transaction::state()
to ensure that it is always called when shift to s_must_replay
happens.
An assertion
`server_state_.rollback_mode() == wsrep::server_state::rm_async`
fired in `client_state::before_command()` if a BF abort happened
between calls to wait_rollback_complete_and_acquire_ownership()
and before_command().
This commit adds a test to reproduce the assertion and verify
the correct behavior, as well as removes the incorrect assertion
to fix the issue.
which complicates diagnostics and debugging.
Don't ignore provider return codes and more verbose error logging for
sst_sent(), sst_received(), set_encryption_key() methods
Refs codership/wsrep-lib#127
After a local certification failure, commit order is released without
the setting the current position in DBMS. Which results in diverging
positions between provider and DBMS, if clean shutdown happens right
after local certification failure.
This patch add method set_position() to server_service class. So that
wsrep-lib can instruct DBMS to set the current position after local
certification failure releases commit order.
Shutting down the provider may cause replication/appling failures, which
may further result to disconnect calls from failing operations.
Allow concurrent disconnect requests to deal with such a situations.
Adjusted transaction_xa_applying unit test to change in
applying_client_fixture. The fixture does not start transaction
and the transaction needs to be started in test explicitly.
If timeout option is give, enter_toi_local() and begin_nbo_phase_one()
retry provider::enter_toi() as long as return status indicates
certification failure, given timeout expires or the client is interrupted.
- High priority interface method to apply NBO begin, separate from
apply_toi() in order to avoid implementation to force interpreting
ws_meta flags.
- Method to put client_state into NBO mode when applying NBO begin.
The client_state will process in m_local mode.
- Unit tests for applying NBO
- Implemented calls to enter and leave NBO phase one and two
- Extended client_state mode checking to include m_nbo
- Changed client_state state and mode change sanity checks to
print a warning and assert() instead of throwing exceptions
to be more graceful in release builds.
Access to empty vector by using operator[] may cause stdlib++
assertions to fail. Replaced the vector data access to use data()
method which is valid operation even if the vector is empty.
Added unit test to reproduce assertion with empty mutable_buffer access.
Added -D_GLIBCXX_ASSERTIONS preprocessor option to debug builds
to catch standard library misuse.
Added gcc 8 and gcc9 into travis build matrix.
Remove methods `is_xa()`, `is_xa_prepare()`, and `xid()` from
client_service interface. Instead, transactions are explicitly
assigned their xid, through at start of XA.
* Add method `restore_prepared_transaction` to `client_state` class
which restores a transaction state from storage given its xid.
* Add method `commit_or_rollback_by_xid` to terminate prepared XA
transactions by xid.
* Make sure that transactions in prepared state are not rolled back
when their master fails/partitions away.