Sanity checks to detect concurrency bugs were assuming a threading
model where each client state would always be processed within
single thread of execution. This however may be too strong assumption
if the application uses some kind of thread pooling.
This patch relaxes those assumptions by removing current_thread_id_
from client_state and relaxing assertions against owning_thread_id_.
This patch also adds a new method
wait_rollback_complete_and_acquire_ownership() into
client_state. This method is idempotent and can be used to gain
control to client_state before before_command() is called.
The method will wait until possible background rollback process is
over and marks the state to s_exec to protect the state against
new background rollbacks.
Other fixes/improvements:
- High priority globals state is restored after discarding streaming.
- Allowed server_state transition donor -> synced.
- Client state method store_globals() was renamed to acquire_ownership()
to better describe the intent. Method store_globals() was left for
backwards compatibility and marked deprecated.
- populate and pass real error description buffer to provider in case
of applying error
- return 0 from server_state::on_apply() if error voting confirmed
consistency
- remove fragments and rollback after fragment applying failure
- always release streaming applier on commit or rollback
If the application uses caching for client sessions, the
client_state object may be reused. This will cause the opened
client session to have unexpected value for sync_wait_gtid and
last_written_gtid.
In order to work around the problem, reset sync_wait_gtid and
last_written_gtid in client_state::open().
If the size of a SR fragment exceeds the maximum size that the
replication provider allows us to replicate, then we are expected to
set the client error code to e_error_during_commit.
However, client_state::after_statement() unconditionally overrides it
to error e_deadlock_error.
Fixes client_state::after_statement() so that it overrided the error
only if noerror has been set yet.
Storing information that background rollbacker in ongoing in client state has_rollback_
This can be used for detecting if there is ongoing background rollback,
and client should keep waiting in before_command() entry to avoid conflicts
in accessing client state during background rollbacking.
transaction::bf_abort() is modified to set has_rollback_ flag when
backgroung rollbacking has been assigned for the client
sync_rollback_complete() method has been modified to reset the backround
rollbacker flag
* Check error code from fragment release
* Always call streaming rollback from must abort if the
transaction is in executing phase. This is needed to ensure
that rollback fragment replication happens before rollback starts
* Initiate streaming rollback from certify fragment if BF abort
happens after fragment certification.
* Check fragment removal error code in prepare phase. It is possible
that the transaction gets BF aborted during fragment removal.
* Mark fragment certified in certify_fragment() even if the provider
returns cert failed error. With current wsrep-API error codes
it may not be possible to distinquish certification failure
and BF abort during fragment replication. This may also be a
provider bug. As a result rollback fragment may sometimes be
replicated when it would not be necessary.
* Count separately fragments certified and fragments stored in
streaming context. Storing the fragment may ultimately fail
due to BF abort even if the fragment was succesfully certified.
Therefore we need to have separate counter for certified fragments
to determine if the transaction is streaming and seqnos of fragments
which have been succesfully stored.
* Provider release is called only after succesful fragment certification
and fragment store.
* Fixed handling of write sets with rollback flag set in apply_write_set()
* Handle BF rollback also in after_statement() call.
* Added missing after_apply() call when handling rollback fragment.
* Fixed state changes when rollback is starated during preparing state.
The intended purpose for local mode was local storage access without
entering replication hooks. However, the same can be achieved with
high priority mode. Removed replicating mode and use local instead to
denote locally processing clients.
quite useful as there might not be enough information for it
after the statement has been processed. Better to handle retrying
on DBMS side. Also removed after_statement_result enumeration and
return plain int from after_statement().
* Pass condition variable for client_state
* Notify all cond waiters when changing the transcation status to
aborted
* Wait for aborting transaction state aborted in before_command
Depending on the DBMS client session allocation strategy the
client id may or may not be available when the client_session
is constructed, therefore there should be a method to assign
an id after construction. Close/cleanup methods were added to
clean up open transactions appropriately.