Introduced server_service recover_streaming_appliers() interface
call which will be called in total order whenever streaming appliers
must be recovered. The call comes with two overloads, one which
can be called from client context (e.g. after SST has been received)
and the other from high priority context (e.g. view event handling).
The client context overload should be eventually be deprecated once
there is a mechanism to make provider signal that it has joined to
the cluster and will start applying events.
* Implemented encryption callback and enc_set_key
* Added pure virtual functions for encryption functionality
* Set enc key if provider was not loaded on time
In general the position where the storage recovers after a SST
cannot be known untile the recovery process is over. This in turn
means that the position cannot be known when the server_state
sst_received() method is called. Worked around the problem by
introducing get_position() method into server service which
can be used to get the position from stable storage after SST
has completed and the state has been recovered.
* Adds method wsrep::transaction::streaming_step() so that there is a
single place where streaming context unit counter is udpated.
The method also checks that some data has been generated before
attempting fragment replication.
* Emit a warning if there is an attempt to replicate a fragment and
there is no data to replicate.
Intruduced server_state::interrupt_state_waiters() to interrupt
all waiters inside server_state::wait_until_state(). This mechanism
is needed when an error is encountered during state change processing
and waiting threads may need to be interrupted to check and handle
the error condition.
Made server_state::wait_until_state() to throw exception if the
wait was interrupted and the new server state is either disconnecting
or disconnected, which usually indicates error condition.
Transition from server_state connected state to disconnecting must
be allowed to deal with errors during server startup.
Added SST first test cases for server_state transitions:
* Successful join via SST
* Error in connect state
* Error in joiner state
Use iterators for scanning members vector in order to avoid
issues with integer signedness and range checks. The vector is
usually rather small and not in hot codepath, so performance
is here not an issue.
Added unit test for member_index() method.
Provider desync may return an error if the provider cannot communicate
with rest of the cluster. However, this is acceptable for example
if the node has dropped from primary view. Instead of returning
error immediately after failed desync(), attempt to pause the provider
regardless of the error. If pause operation fails, error is returned.
In order to avoid resync in resume_and_resync() in the case desync
failed in desync_and_pause(), new member variable desynced_on_pause_
was introduced to decide whether to resync or not in resume_and_resync().
This variable is protected by pause()/resume() calls since they do
not allow concurrent pause/resume operations.
Storing information that background rollbacker in ongoing in client state has_rollback_
This can be used for detecting if there is ongoing background rollback,
and client should keep waiting in before_command() entry to avoid conflicts
in accessing client state during background rollbacking.
transaction::bf_abort() is modified to set has_rollback_ flag when
backgroung rollbacking has been assigned for the client
sync_rollback_complete() method has been modified to reset the backround
rollbacker flag
- fixed node ID assertion in on_connect() method,
fixed "sanity checks" to allow reconnection to primary component
- fixed code duplication in on_view() method
When member joins the group and needs to receive an SST it won't
receive the corresponding menbership view event because the SST
happens after the event and will already include the effects of
all events ordered before it. The view then must be recovered from
the received state.
Minor renames and cleanups.
References codership/wsrep-lib#18
it on disconnect.
- Don't rely on own index from the view because the view may come from
another member (IST/SST), instead always determine own index from own ID.
Refs codership/wsrep-lib#13
There are some corner cases where keys with two parts are needed
for a transaction. Relaxed the assertion and sanity check so that
at least two key parts are needed for each key which is assigned
to a transaction.
with superproject definitions.
Avoid using client_service::do_2pc() in before_commit() to
determine if 2pc is actually happening, will use transaction
states to deduce that. client_service::do_2pc() should be deprecated.
Fixed a compiler warning in db_high_priority_service.cpp.
Moved SR fragment removal for total order BFd SR transactions
into after_rollback() call to avoid deadlocking while trying
to access storage before rolling back the transaction.
* Check fragment removal error code in prepare phase. It is possible
that the transaction gets BF aborted during fragment removal.
* Mark fragment certified in certify_fragment() even if the provider
returns cert failed error. With current wsrep-API error codes
it may not be possible to distinquish certification failure
and BF abort during fragment replication. This may also be a
provider bug. As a result rollback fragment may sometimes be
replicated when it would not be necessary.
* Count separately fragments certified and fragments stored in
streaming context. Storing the fragment may ultimately fail
due to BF abort even if the fragment was succesfully certified.
Therefore we need to have separate counter for certified fragments
to determine if the transaction is streaming and seqnos of fragments
which have been succesfully stored.
* Provider release is called only after succesful fragment certification
and fragment store.
* Fixed handling of write sets with rollback flag set in apply_write_set()
SR tranasctions are BF aborted or rolled back on primary view
changes according to the following rules:
* Ongoing local SR transactions are BF aborted if the processing
server is not found from the current view.
* All remote SR transactions whose origin server is not included in the
current view are rolled back.
* Enable codepath to BF abort high priority SR applier
* Pass ws_handle, ws_meta to high priority service rollback
call to allow total ordering of rollback process
* Added server_id into transaction in order to be able to stop
streaming applier during high priority BF abort
* Added missing commit fragment applying
* Don't clear fragments for replaying SR transaction