when losing error voting:
- if NBO has failed locally (DBMS side), don't override original DBMS
error so it gets reported to the client
- otherwise, report "query interrupted" instead of "error during commit"
- Retry enter_toi() in poll_enter_toi() also for error_connection_failed
which means that the connectivity to the cluster has been lost,
a.k.a non-prim.
Certification keys are needed for NBO end to resolve dependencies
for the write sets which follow NBO end. Without keys the following
write sets do not detect dependency to NBO event and may start applying
too early.
- Set both current error and current error status if provider enter_toi()
or leave_toi() fails.
- Leave NBO mode if TOI cannot be entered in begin_nbo_phase_two().
If timeout option is give, enter_toi_local() and begin_nbo_phase_one()
retry provider::enter_toi() as long as return status indicates
certification failure, given timeout expires or the client is interrupted.
- High priority interface method to apply NBO begin, separate from
apply_toi() in order to avoid implementation to force interpreting
ws_meta flags.
- Method to put client_state into NBO mode when applying NBO begin.
The client_state will process in m_local mode.
- Unit tests for applying NBO
- Implemented calls to enter and leave NBO phase one and two
- Extended client_state mode checking to include m_nbo
- Changed client_state state and mode change sanity checks to
print a warning and assert() instead of throwing exceptions
to be more graceful in release builds.
In convert_streaming_client_to_applier() the new streaming applier
is created and deleted if the server has been disconnected. However,
releasing streaming applier may modify thread local storage. Call
store_globals() to restore thread local storage before returning.
Sanity checks to detect concurrency bugs were assuming a threading
model where each client state would always be processed within
single thread of execution. This however may be too strong assumption
if the application uses some kind of thread pooling.
This patch relaxes those assumptions by removing current_thread_id_
from client_state and relaxing assertions against owning_thread_id_.
This patch also adds a new method
wait_rollback_complete_and_acquire_ownership() into
client_state. This method is idempotent and can be used to gain
control to client_state before before_command() is called.
The method will wait until possible background rollback process is
over and marks the state to s_exec to protect the state against
new background rollbacks.
Other fixes/improvements:
- High priority globals state is restored after discarding streaming.
- Allowed server_state transition donor -> synced.
- Client state method store_globals() was renamed to acquire_ownership()
to better describe the intent. Method store_globals() was left for
backwards compatibility and marked deprecated.
- populate and pass real error description buffer to provider in case
of applying error
- return 0 from server_state::on_apply() if error voting confirmed
consistency
- remove fragments and rollback after fragment applying failure
- always release streaming applier on commit or rollback
Added version header which contains definitions for major, minor
and patch version numbers, as well as for lowest and highest supported
wsrep-API versions. The library versioning follows Semantic Versioning.
Handle CMake policy CMP0048 in top level CMakeLists.txt.
Write wsrep-lib logging facility output from unit tests into file.
The argument --wsrep-log-file can be controlled where the
log from wsrep-lib is written during unit tests. The default is
'wsrep-lib_test.log' in the working directory. In order to
get the log written into stdout, use empty value,
i.e. --wsrep-log-file=''.
Made travis test run a bit more verbose.
Fixes a bug where the fact that an SR master leaves the primary view
gets missed. When two consecutive primary views have the same
membership we now assume that every SR needs to be rolled back, as the
system may have been through a state of only non-primary components.