Make is_rollbacker_active() public so that the BF thread can
check if the rollbacker was started or not.
Also don't unlock the lock for launching the background
rollbacker to avoid race conditions in accessing the
victim state.
Added methods bf_abort() and total_order_bf_abort() which take
wsrep::unique_lock<wsrep::mutex> as argument to allow caller
to grab the mutex before attempting BF abort. The old calls
were kept for backwards compatibility and wrap the new calls
with internal locking.
Removed calls to assert() from public headers to have
full control when assertions are enabled in wsrep-lib
code regardless of parent project build configuration.
Moved methods containing assertions and non-trivial
code from headers into compilation units.
* Removed transaction::p_unsafe_ member
* Changed transaction::pa_unsafe(bool) to modify flags member directly
* Modified transaction.cpp to use transaction.pa_unsafe(bool) rather than
directly changing transaction's flag
* added method mark_transaction_pa_unsafe() for client_state,
application will use this
The method takes already locked lock object as an argument.
The caller must ensure that the lock object owns the underlying mutex.
Replaced homegrown wsrep::unique_lock with type alias from
std::unique_lock.
This patch adds the possibility to have client commands that do not
return results from DBMS. While processing such commands we must be
able to preserve errors until the next interaction with client.
Specifically if the transaction is bf aborted while processing such
a non-returning command, then we have to keep the deadlock error until
the client issues a command that may return the error.
To handle such cases, client_state::before_command() now takes
parameter keep_command_error. The DBMS is supposed set
keep_command_error true to instruct wsrep-lib to preserve errors (if
any) until the next command which sets keep_command_error false.
Dealing with a case where current client command does not return result.
Work in progress.
Fix typo and add assertions in keep_command_error()
Make keep_command_error a parameter to before_commit()
Fix comment about keep_command_error
Handle keep_command_error with s_must_abort in wsrep_before_command()
Fix unit test
This patch implments replaying for prepared XA transactions.
Replay may happen in the following cases:
1) The transaction is BF aborted in prepared state and is idle. In
that case, the transaction is handed over to rollbacker for replay.
2) The transaction is BF aborted while executing the
commit (i.e. before or after successful certification). In
which case the transaction replays itself from fragment storage.
3) The transaction is BF aborted while certifying its commit
fragment. This case is handled like replay for streaming transactions,
where the provider is directly involved and re-delivers the last
fragment.
Add support for detaching XA transactions. This is useful for handling
the case where the DBMS client has a transaction in prepared state and
disconnects. Before disconnect, the DBMS calls the newly introduced
client_state::xa_detach(), to cleanup the local transaction and
convert it to a high priority transaction. The DBMS may later attempt
to terminate the transaction through client_state::commit_by_xid() or
client_state::rollback_by_xid().
Also in this patch:
- Fix client_state::close() so that it does not rollback transactions
in prepared state
- Changed class wsrep::xid representation to hold enough information
so that DBMS can convert to its native representation
- Fix potential infinite loop in
server_state::find_streaming_applier(wsrep:xid&)
- Append SR keys on prepare fragment and make it pa_unsafe
- Handle one phase commit (simply fall back to two phase)
- Do not rollback prepared streaming clients in
server_state::close_orphaned_transactions()
If the transaction fails during replay because of certification
failure, the provider will return control to applier without
terminating the transaction and transaction remains in
s_replaying.
Fixed transaction::after_statement() to handle the state changes
correctly if certification failure is returned from replay.
Replaying was extracted to separate private method from
after_statement(). Removed transaction::after_replay() as it
seems now unnecessary and it bypassed state change sanity checks.
Allowed replaying -> committed transaction transition to handle
the situation where DBMS allocates a new context and client_state
to do the replay.
when losing error voting:
- if NBO has failed locally (DBMS side), don't override original DBMS
error so it gets reported to the client
- otherwise, report "query interrupted" instead of "error during commit"
- Retry enter_toi() in poll_enter_toi() also for error_connection_failed
which means that the connectivity to the cluster has been lost,
a.k.a non-prim.
Certification keys are needed for NBO end to resolve dependencies
for the write sets which follow NBO end. Without keys the following
write sets do not detect dependency to NBO event and may start applying
too early.
If timeout option is give, enter_toi_local() and begin_nbo_phase_one()
retry provider::enter_toi() as long as return status indicates
certification failure, given timeout expires or the client is interrupted.
- High priority interface method to apply NBO begin, separate from
apply_toi() in order to avoid implementation to force interpreting
ws_meta flags.
- Method to put client_state into NBO mode when applying NBO begin.
The client_state will process in m_local mode.
- Unit tests for applying NBO
- Implemented calls to enter and leave NBO phase one and two
- Extended client_state mode checking to include m_nbo
- Changed client_state state and mode change sanity checks to
print a warning and assert() instead of throwing exceptions
to be more graceful in release builds.
Remove methods `is_xa()`, `is_xa_prepare()`, and `xid()` from
client_service interface. Instead, transactions are explicitly
assigned their xid, through at start of XA.
* Add method `restore_prepared_transaction` to `client_state` class
which restores a transaction state from storage given its xid.
* Add method `commit_or_rollback_by_xid` to terminate prepared XA
transactions by xid.
* Make sure that transactions in prepared state are not rolled back
when their master fails/partitions away.
next_fragment is called outside the scope of a high_priority_switch,
and we may be in a different thread context then the streaming applier
was created in
Added a wsrep::thread_service interface to allow application to
inject instrumented thread, mutex and condition variable implementation
for provider.
The interface is defined in include/wsrep/thread_service.hpp.
Sample implementation is provided in dbsim/db_threads.[h|c]pp.
This patch will also clean up some remaining dependencies to
wsrep-API compilation units so that the dependency to wsrep-API
is header only. This will extending the provider support to
later wsrep-API versions.
Sanity checks to detect concurrency bugs were assuming a threading
model where each client state would always be processed within
single thread of execution. This however may be too strong assumption
if the application uses some kind of thread pooling.
This patch relaxes those assumptions by removing current_thread_id_
from client_state and relaxing assertions against owning_thread_id_.
This patch also adds a new method
wait_rollback_complete_and_acquire_ownership() into
client_state. This method is idempotent and can be used to gain
control to client_state before before_command() is called.
The method will wait until possible background rollback process is
over and marks the state to s_exec to protect the state against
new background rollbacks.
Other fixes/improvements:
- High priority globals state is restored after discarding streaming.
- Allowed server_state transition donor -> synced.
- Client state method store_globals() was renamed to acquire_ownership()
to better describe the intent. Method store_globals() was left for
backwards compatibility and marked deprecated.
- populate and pass real error description buffer to provider in case
of applying error
- return 0 from server_state::on_apply() if error voting confirmed
consistency
- remove fragments and rollback after fragment applying failure
- always release streaming applier on commit or rollback
Storing information that background rollbacker in ongoing in client state has_rollback_
This can be used for detecting if there is ongoing background rollback,
and client should keep waiting in before_command() entry to avoid conflicts
in accessing client state during background rollbacking.
transaction::bf_abort() is modified to set has_rollback_ flag when
backgroung rollbacking has been assigned for the client
sync_rollback_complete() method has been modified to reset the backround
rollbacker flag
* Check fragment removal error code in prepare phase. It is possible
that the transaction gets BF aborted during fragment removal.
* Mark fragment certified in certify_fragment() even if the provider
returns cert failed error. With current wsrep-API error codes
it may not be possible to distinquish certification failure
and BF abort during fragment replication. This may also be a
provider bug. As a result rollback fragment may sometimes be
replicated when it would not be necessary.
* Added server_id into transaction in order to be able to stop
streaming applier during high priority BF abort
* Added missing commit fragment applying
* Don't clear fragments for replaying SR transaction