mirror of https://github.com/codership/wsrep-lib.git synced 2025-06-30 18:01:53 +03:00

58 Commits

Author SHA1 Message Date
6df17812d9 Introduce set_provider_factory() method for server_state
This allows injecting an application allocated provider into
server_state.

After this virtual provider getter is unnecessary. Made the getter
normal method and fixed unit tests accordingly.
2023-02-17 13:04:52 +02:00
275a0af8c5 Return error codes instead of throwing exception
Changed server_state public methods sst_received() and wait_until_state()
to report errors as return value instead of throwing exceptions.
This was done to gradually get rid of public methods which report
errors via exceptions.

This change was part of MDEV-30419.
2023-01-18 13:47:10 +02:00
31a35bf573 Remove obsolete wsrep::server_state::last_committed_gtid() method 2021-11-30 15:05:38 +02:00
22921e7082 Cache rollback events that failed to replicate for later retry
This patch introduces a queue to store ids of transactions that failed
to send a rollback fragment in streaming_rollback(). This is to avoid
potentially missed rollback fragments when a cluster splits and then
later reforms. Rollback fragments would be missing if a node rolled
back a transaction locally (either BFed or voluntary rollback) while
non-primary, and the attempt to send rollback fragment failed in
transaction::streaming_rollback().
Transactions that fail to send a rollback fragment can proceed to
roll back locally. However, we must ensure that rollback fragments for
those transactions are eventually delivered by the cluster. This must
be done before a potentially conflicting writeset causes BF-BF
conflicts in the rest of the cluster.
2021-09-30 10:41:57 +02:00
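
The retry queue described in the commit above can be illustrated with a minimal sketch. This is not the wsrep-lib implementation: the class name rollback_retry_queue, its methods, and the send callback are hypothetical names chosen for illustration, and transaction ids are assumed to be plain 64-bit integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>

// Hypothetical helper, not part of wsrep-lib: remembers transactions whose
// rollback fragment could not be sent so that sending can be retried later.
class rollback_retry_queue
{
public:
    // Record a transaction whose rollback fragment failed to replicate.
    void on_send_failed(std::uint64_t transaction_id)
    {
        pending_.push_back(transaction_id);
    }

    // Try each queued id once. `send` returns true when the fragment was
    // delivered. Ids that fail again stay queued in their original order.
    // Returns the number of ids still pending.
    std::size_t retry(const std::function<bool(std::uint64_t)>& send)
    {
        assert(static_cast<bool>(send));
        for (std::size_t n = pending_.size(); n > 0; --n)
        {
            const std::uint64_t id = pending_.front();
            pending_.pop_front();
            if (!send(id))
            {
                pending_.push_back(id);
            }
        }
        return pending_.size();
    }

    bool empty() const { return pending_.empty(); }

private:
    std::deque<std::uint64_t> pending_;
};
```

In this sketch the local rollback is never blocked by a failed send; the id is only remembered, and retry() would be called once the node is able to replicate again.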
a12b814270 Fix various spelling errors
e.g.
- succesfully -> successfully
- preceeding -> preceding
2021-02-04 17:08:08 +02:00
d0255569b0 Add getter for variable desynced_on_pause 2020-04-03 10:10:29 +02:00
66ee7bed1b Add type wsrep::xid
Create type `wsrep::xid`, and change all signatures that take
`std::string xid` to take `wsrep::xid xid`.
2019-10-18 09:36:18 +02:00
052247144f Support recovery of XA transactions
* Add method `restore_prepared_transaction` to `client_state` class
  which restores a transaction state from storage given its xid.
* Add method `commit_or_rollback_by_xid` to terminate prepared XA
  transactions by xid.
* Make sure that transactions in prepared state are not rolled back
  when their master fails/partitions away.
2019-10-16 10:16:39 +02:00
eb4cf86c1e Implemented thread service support.
Added a wsrep::thread_service interface to allow application to
inject instrumented thread, mutex and condition variable implementation
for provider.

The interface is defined in include/wsrep/thread_service.hpp.
Sample implementation is provided in dbsim/db_threads.[h|c]pp.

This patch will also clean up some remaining dependencies on
wsrep-API compilation units so that the dependency on wsrep-API
is header only. This will allow extending the provider support to
later wsrep-API versions.
2019-10-14 09:30:15 +03:00
ae746fb289 fixing reviewer comments
- style fixes
- small improvement to avoid unnecessary search on close_orphaned_sr
2019-03-05 10:53:21 +01:00
5ef5becea6 removing previous_primary_view from public iface and style fixes 2019-03-05 10:34:30 +01:00
71f3fb2d01 close SR transactions on equal consecutive views
Fixes a bug where the fact that an SR master leaves the primary view
gets missed. When two consecutive primary views have the same
membership we now assume that every SR needs to be rolled back, as the
system may have been through a state of only non-primary components.
2019-03-05 09:41:48 +01:00
be98517cb3 Debug log level implementation
Debug log will now filter output based on debug level that is enabled.
2019-02-13 13:05:45 +02:00
e7d72ae7f6 codership/mariadb-wsrep#27 Galera cache encryption
* Created interface class for encryption support
* Implemented function for setting enc key to provider, callback function for encryption/decryption
2019-02-01 16:57:34 +01:00
a6b38d2428 codership/wsrep-lib#54 Service call to recover streaming appliers
Introduced server_service recover_streaming_appliers() interface
call which will be called in total order whenever streaming appliers
must be recovered. The call comes with two overloads, one which
can be called from client context (e.g. after SST has been received)
and the other from high priority context (e.g. view event handling).

The client context overload should eventually be deprecated once
there is a mechanism to make provider signal that it has joined to
the cluster and will start applying events.
2019-01-21 17:00:08 +02:00
47263df442 Revert "codership/mariadb-wsrep#27 Galera cache encryption"
This reverts commit 7e9419e811.
2019-01-21 14:12:28 +02:00
476bcdb41e Revert "codership/mariadb-wsrep#27 Galera cache encryption fixup"
This reverts commit 043e8bc2ea.
2019-01-21 14:12:10 +02:00
043e8bc2ea codership/mariadb-wsrep#27 Galera cache encryption fixup
Fixup to enable/disable encryption on provider loading
2019-01-20 15:20:52 +02:00
7e9419e811 codership/mariadb-wsrep#27 Galera cache encryption
* Implemented encryption callback and enc_set_key
* Added pure virtual functions for encryption functionality
* Set enc key if provider was not loaded on time
2019-01-19 23:58:20 +01:00
89b3561ad8 Read recovered position from sst_received() after initialization
In general the position where the storage recovers after an SST
cannot be known until the recovery process is over. This in turn
means that the position cannot be known when the server_state
sst_received() method is called. Worked around the problem by
introducing get_position() method into server service which
can be used to get the position from stable storage after SST
has completed and the state has been recovered.
2019-01-15 12:35:06 +02:00
a3a632cafe Added method to check if provider is loaded 2019-01-10 10:04:58 +01:00
e9bb552096 codership/wsrep-lib#34 Provided a method to interrupt state waiters
Introduced server_state::interrupt_state_waiters() to interrupt
all waiters inside server_state::wait_until_state(). This mechanism
is needed when an error is encountered during state change processing
and waiting threads may need to be interrupted to check and handle
the error condition.

Made server_state::wait_until_state() throw an exception if the
wait was interrupted and the new server state is either disconnecting
or disconnected, which usually indicates an error condition.
2018-12-20 19:35:31 +02:00
e81c66cd59 Fixed assertion on server_state connected - disconnecting transition
Transition from server_state connected state to disconnecting must
be allowed to deal with errors during server startup.

Added SST first test cases for server_state transitions:
* Successful join via SST
* Error in connect state
* Error in joiner state
2018-12-20 19:35:31 +02:00
ae0109f9b3 codership/wsrep-lib#34 Refactored view handling
Extracted on_primary_view(), on_non_primary_view() out of on_view().
2018-12-20 19:35:31 +02:00
76424ad515 codership/wsrep-lib#34 Unit test for sync-disconnect-sync
Added unit test for sync-disconnect-sync transition without SST.
2018-12-20 19:35:31 +02:00
256cd6ae60 codership/wsrep-lib#32 Allow transient desync errors in desync_and_pause()
Provider desync may return an error if the provider cannot communicate
with the rest of the cluster. However, this is acceptable for example
if the node has dropped from primary view. Instead of returning
error immediately after failed desync(), attempt to pause the provider
regardless of the error. If pause operation fails, error is returned.
In order to avoid resync in resume_and_resync() in the case desync
failed in desync_and_pause(), new member variable desynced_on_pause_
was introduced to decide whether to resync or not in resume_and_resync().
This variable is protected by pause()/resume() calls since they do
not allow concurrent pause/resume operations.
2018-12-13 13:04:41 +02:00
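
The desync/pause logic described in the commit above can be sketched roughly as follows. This is an illustration under assumptions, not the actual server_state code: fake_provider and pauser are hypothetical stand-ins, return value 0 is assumed to mean success, and undoing a successful desync when pause fails is an assumption of this sketch that the commit message does not state.

```cpp
#include <cassert>

// Hypothetical stand-in for a provider; not the wsrep-lib provider API.
struct fake_provider
{
    bool desync_ok;   // whether desync() succeeds
    bool pause_ok;    // whether pause() succeeds
    int resync_calls; // number of resync() calls observed

    int desync() { return desync_ok ? 0 : 1; }
    int pause() { return pause_ok ? 0 : 1; }
    void resume() { }
    void resync() { ++resync_calls; }
};

// Hypothetical wrapper showing the role of desynced_on_pause_.
class pauser
{
public:
    explicit pauser(fake_provider& provider)
        : provider_(provider)
        , paused_(false)
        , desynced_on_pause_(false)
    { }

    // Returns 0 on success, non-zero if pausing the provider failed.
    int desync_and_pause()
    {
        assert(!paused_);
        // A failed desync is tolerated; only remember whether it succeeded.
        desynced_on_pause_ = (provider_.desync() == 0);
        if (provider_.pause() != 0)
        {
            // Assumption of this sketch: undo a successful desync on failure.
            if (desynced_on_pause_)
            {
                provider_.resync();
                desynced_on_pause_ = false;
            }
            return 1;
        }
        paused_ = true;
        return 0;
    }

    void resume_and_resync()
    {
        assert(paused_);
        provider_.resume();
        paused_ = false;
        // Resync only if the matching desync actually succeeded.
        if (desynced_on_pause_)
        {
            provider_.resync();
            desynced_on_pause_ = false;
        }
    }

    bool desynced_on_pause() const { return desynced_on_pause_; }

private:
    fake_provider& provider_;
    bool paused_;
    bool desynced_on_pause_;
};
```

The sketch does no locking of its own; as the commit message notes, the flag is protected by the fact that pause/resume operations are not allowed to run concurrently.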
3950ea3027 Refs codership/wsrep-lib#18 Small fixups
- fixed node ID assertion in on_connect() method,
  fixed "sanity checks" to allow reconnection to primary component
- fixed code duplication in on_view() method
2018-11-23 23:27:09 +02:00
fb14883547 Recover current view from state after SST.
When a member joins the group and needs to receive an SST it won't
receive the corresponding membership view event because the SST
happens after the event and will already include the effects of
all events ordered before it. The view then must be recovered from
the received state.

Minor renames and cleanups.

References codership/wsrep-lib#18
2018-11-12 12:47:42 +02:00
ea9971d54b - Initialize member cluster ID only on connection to cluster and forget
  it on disconnect.
- Don't rely on own index from the view because the view may come from
  another member (IST/SST); instead always determine own index from own ID.

Refs codership/wsrep-lib#13
2018-11-09 00:42:05 +02:00
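
Determining the own index from the own ID, as the commit above describes, amounts to a lookup in the view's member list. A minimal sketch, assuming member IDs are represented as strings and using the hypothetical function name find_own_index (neither is taken from wsrep-lib):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the position of own_id in the member list,
// or -1 if this server is not a member of the view.
std::ptrdiff_t find_own_index(const std::vector<std::string>& member_ids,
                              const std::string& own_id)
{
    assert(!own_id.empty());
    for (std::size_t i = 0; i < member_ids.size(); ++i)
    {
        if (member_ids[i] == own_id)
        {
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    return -1;
}
```

Because the result depends only on the member list and the own ID, it stays valid even when the view was delivered by another member.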
1c0a82f5b1 Typo fixes in server_state.hpp 2018-10-24 11:35:28 +03:00
435b589ff5 Merge branch 'master' into deadlock-warnings 2018-10-17 17:09:05 +03:00
a9abb3a80a Extracted mock_server_service out of mock_server_state. 2018-10-17 12:07:41 +03:00
7c6ee3f61f In order to avoid potential deadlocks, release client_state lock when
calling server state methods which may acquire server_state mutex.

Fixed compilation errors in release mode.
2018-10-15 16:35:19 +03:00
c0c977f9ab Added GPLv2 licence and copyright headers. 2018-10-15 15:14:22 +03:00
31f244c3b3 Fixed compilation on Ubuntu 18.04 / GCC 7.3.0 2018-10-02 21:41:14 +03:00
5bf8ad1294 Close SR transactions when disconnecting from the group.
Moved SR fragment removal for total order BFd SR transactions
into after_rollback() call to avoid deadlocking while trying
to access storage before rolling back the transaction.
2018-07-19 15:13:27 +03:00
86472ee420 Implemented SR transaction rollbacking during configuration changes.
SR transactions are BF aborted or rolled back on primary view
changes according to the following rules:
* Ongoing local SR transactions are BF aborted if the processing
  server is not found from the current view.
* All remote SR transactions whose origin server is not included in the
  current view are rolled back.
2018-07-14 16:11:13 +03:00
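
The two rules in the commit above can be written as a small decision function. This is a sketch under assumptions rather than the wsrep-lib code: server IDs are modelled as strings, and sr_transaction, sr_action and action_on_primary_view are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a streaming replication (SR) transaction.
struct sr_transaction
{
    std::string origin_server_id; // server where the transaction started
    unsigned long long id;
};

enum class sr_action { keep, bf_abort, rollback };

// Decide what to do with an SR transaction when a new primary view arrives.
sr_action action_on_primary_view(const std::vector<std::string>& view_members,
                                 const std::string& own_id,
                                 const sr_transaction& txn)
{
    assert(!own_id.empty());
    const bool origin_in_view =
        std::find(view_members.begin(), view_members.end(),
                  txn.origin_server_id) != view_members.end();
    if (origin_in_view)
    {
        return sr_action::keep;
    }
    // Origin is missing from the view: local transactions are BF aborted,
    // remote ones are rolled back.
    return txn.origin_server_id == own_id ? sr_action::bf_abort
                                          : sr_action::rollback;
}
```

A transaction counts as local here when its origin ID equals the ID of the processing server, so the first rule applies only when the processing server itself is absent from the view.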
2ac13100f7 Refactored storage service out of client service interface. 2018-07-07 18:06:37 +03:00
d80a69fe90 Defined log_state_change() interface in server_service.
The interface method can be used to notify the DBMS implementation
about state changes in well defined order. The call will be done
under server_state mutex protection.
2018-07-05 12:45:22 +03:00
fcefe9f03b Provide additional provider error status. Fixed IST handling. 2018-07-05 11:31:47 +03:00
b3de50fa05 Implemented convenience methods to desync/pause, resume/resync. 2018-07-04 18:12:42 +03:00
c552d944ed Deprecated sst_transferred(), always use sst_received() 2018-07-03 10:20:36 +03:00
635eaf4c29 Refactored high priority service out of client service. 2018-07-02 18:22:24 +03:00
db18e91c42 Implemented client last_written_gtid, sync_wait 2018-06-30 07:44:09 +03:00
3d2af88428 Propagate incoming address to provider. 2018-06-29 17:46:11 +03:00
0851970c53 Bootstrap server service, fixes to server state management
* Added bootstrap service call to do DBMS side bootstrap operations
  during the cluster bootstrap.
* Added last_committed_gtid() to provider interface
* Implemented wait_for_gtid() provider call
* Pass initial position to the server state
2018-06-29 11:54:33 +03:00
fd9cf87141 * Return provider status from provider connect
* Call to get server status variables along with provider variables
* Deal with intermediate non-prims
2018-06-27 15:36:52 +03:00
b1a374a9ba Fixed pause/desync logic. Allow concurrent callers to desync,
this should be dealt with by the provider. However, only one
thread is allowed to call pause at a time to keep track
of implicit desyncs when pausing the provider.
2018-06-26 13:42:42 +03:00
2a53198f5c Protocol version and connected gtid
* Propagate server max protocol version to provider init options
* Store gtid from connected call to make cluster id and the connect
  position available
2018-06-26 11:34:05 +03:00
d3821d88a5 Partial implementation of methods needed for SST.
* server_state desync()/resync() and pause()/resume()
* Fixes to server_state state machine
2018-06-24 14:35:47 +03:00