mirror of https://github.com/codership/wsrep-lib.git synced 2025-07-30 07:23:07 +03:00
Commit Graph

360 Commits

Author SHA1 Message Date
3f449c6318 Remove calls to client_service::will_replay()
Method client_service::will_replay() is now called whenever the
transaction state changes to s_must_replay, in transaction::state().
2020-10-27 09:40:42 +01:00
6752a4504f Address review comments
* Added unit tests for transaction::xa_detach() and
  transaction::xa_replay()
* Added unit tests for wsrep::xid
* Fixed minor issues pointed out by reviewer
2020-10-26 14:22:22 +01:00
39e37d3a39 Fix BF abort one phase XA transactions
Assertion is_streaming() would trigger in transaction::before_commit()
if a one phase XA transaction was BF aborted at the right
time (because one phase XA transaction is not streaming yet at commit
time).
2020-10-26 14:22:22 +01:00
588166e183 Add debug logging to transaction::commit_or_rollback_by_xid 2020-10-26 14:22:22 +01:00
0172d0fe4f Remove unused debug sync point 2020-10-26 14:22:22 +01:00
b12bbd059c Support for replaying prepared XA transactions
This patch implements replaying for prepared XA transactions.
Replay may happen in the following cases:

1) The transaction is BF aborted in prepared state and is idle. In
that case, the transaction is handed over to rollbacker for replay.

2) The transaction is BF aborted while executing the
commit (i.e. before or after successful certification). In
that case, the transaction replays itself from fragment storage.

3) The transaction is BF aborted while certifying its commit
fragment. This case is handled like replay for streaming transactions,
where the provider is directly involved and re-delivers the last
fragment.
2020-10-26 14:22:22 +01:00
965642eded Support for detaching prepared XA transactions
Add support for detaching XA transactions. This is useful for handling
the case where the DBMS client has a transaction in prepared state and
disconnects. Before disconnect, the DBMS calls the newly introduced
client_state::xa_detach() to clean up the local transaction and
convert it to a high priority transaction. The DBMS may later attempt
to terminate the transaction through client_state::commit_by_xid() or
client_state::rollback_by_xid().

Also in this patch:

- Fix client_state::close() so that it does not roll back transactions
  in prepared state
- Changed class wsrep::xid representation to hold enough information
  so that DBMS can convert to its native representation
- Fix potential infinite loop in
  server_state::find_streaming_applier(wsrep::xid&)
- Append SR keys on prepare fragment and make it pa_unsafe
- Handle one phase commit (simply fall back to two phase)
- Do not roll back prepared streaming clients in
  server_state::close_orphaned_transactions()
2020-10-26 14:20:21 +01:00
2da6e4894e Unnecessary adopt/start transaction in rollback_fragment()
This patch changes the handling of a rollback fragment so that
the high_priority_service adopts and starts a new transaction only
if fragment removal has to be performed.
When no fragment removal happens, starting a new transaction is
unnecessary: a dummy write set is logged instead and the transaction
is not cleaned up properly in DBMS side.
2020-10-26 10:21:59 +01:00
7245db4704 Enable -Wsuggest-override if supported by the compiler. 2020-10-22 17:31:21 +03:00
d1482feb32 Ensure that client_service::will_replay() is called.
Modified tests to verify that client_service::will_replay() is
called whenever it is determined that the transaction must replay.

Added a test to verify behavior when provider::commit_order_enter()
returns BF abort error.

Moved call to client_service::will_replay() into transaction::state()
to ensure that it is always called when shift to s_must_replay
happens.
2020-10-19 06:12:17 +03:00
ec767cd3f0 Ostream operator for key type for better readability. 2020-10-19 05:31:20 +03:00
04944b4d10 Unlock before releasing aborted transaction in provider.
Having an aborted transaction hold a lock while the transaction is
released in the provider may cause a deadlock if:
- The transaction was BF aborted before it was known that the
  latest fragment was successfully replicated,
- Transaction was going to be released on provider side, but
  it waited for commit order,
- BF thread tried to grab lock for BF aborting, perhaps for
  second time.

As a fix, unlock the lock protecting victim transaction for
the duration of transaction release.
2020-10-13 17:51:51 +03:00
eac56f19e0 Remove dead code 2020-10-05 15:10:48 +02:00
ae4e58ba03 Check existence of dl library.
Not all platforms have a separate dl library; on some, dlopen() and
friends are included in libc.

Check existence of dl lib and store into WSREP_LIB_LIBDL if found.
2020-07-31 11:13:15 +03:00
3e5a28df32 codership/wsrep-lib#135 Fix wrong assertion in before_command().
An assertion

  `server_state_.rollback_mode() == wsrep::server_state::rm_async`

fired in `client_state::before_command()` if a BF abort happened
between calls to wait_rollback_complete_and_acquire_ownership()
and before_command().

This commit adds a test to reproduce the assertion and verify
the correct behavior, as well as removes the incorrect assertion
to fix the issue.
2020-07-24 10:46:48 +03:00
daae4a9c35 Some methods in wsrep-lib still hide/ignore return codes from provider
which complicates diagnostics and debugging.

Don't ignore provider return codes, and add more verbose error logging
for the sst_sent(), sst_received() and set_encryption_key() methods.

Refs codership/wsrep-lib#127
2020-07-14 12:50:04 +03:00
94f5696010 Provide logger callback for wsrep_load 2020-04-17 17:24:23 +02:00
0cec027030 Refs codership/wsrep-lib#124 32-bit compilation fix 2020-03-20 13:52:50 +00:00
a17b65a25f Set server position after local certification failure
After a local certification failure, commit order is released without
setting the current position in the DBMS. This results in diverging
positions between provider and DBMS if a clean shutdown happens right
after the local certification failure.
This patch adds the method set_position() to the server_service class, so
that wsrep-lib can instruct the DBMS to set the current position after a
local certification failure releases commit order.
2020-01-07 11:20:21 +01:00
76f7249b8d Incorrect assertion and state handling in after_replay().
If the transaction fails during replay because of certification
failure, the provider will return control to applier without
terminating the transaction and transaction remains in
s_replaying.

Fixed transaction::after_statement() to handle the state changes
correctly if certification failure is returned from replay.
Replaying was extracted to separate private method from
after_statement(). Removed transaction::after_replay() as it
seems now unnecessary and it bypassed state change sanity checks.

Allowed replaying -> committed transaction transition to handle
the situation where DBMS allocates a new context and client_state
to do the replay.
2019-12-28 12:28:38 +02:00
90157ed1b0 Allow concurrent server_state disconnect operations.
Shutting down the provider may cause replication/applying failures, which
may in turn result in disconnect calls from the failing operations.
Allow concurrent disconnect requests to deal with such situations.
2019-12-08 13:42:11 +02:00
57523eea75 enter_toi polling fix 2019-12-08 12:52:36 +02:00
7d8583983f Fixes to review comments
- Increased loop sleep in poll_enter_toi()
- Fixed typos in comments
- Got rid of unnecessary ostringstreams
2019-12-08 12:52:36 +02:00
3fd20c4e4d Fixed compilation for gcc 4.4 2019-12-08 12:52:36 +02:00
043ff7a7e9 remove has_error arg from begin_nbo_phase_two 2019-12-08 12:52:36 +02:00
3389b7ad3c better error handling for NBO failures
when losing error voting:
- if NBO has failed locally (DBMS side), don't override original DBMS
  error so it gets reported to the client
- otherwise, report "query interrupted" instead of "error during commit"
2019-12-08 12:52:36 +02:00
b63e753aec removed unnecessary leave_toi and related TODO 2019-12-08 12:52:36 +02:00
f27f549479 poll_enter_toi timeout handling 2019-12-08 12:52:36 +02:00
e0f9550967 handle certification error explicitly when entering TOI 2019-12-08 12:52:36 +02:00
922ce579c7 Clear NBO meta on failure, reset current error status after command. 2019-12-08 12:52:36 +02:00
086c466637 - Added wait-until parameter for begin_nbo_phase_two().
- Retry enter_toi() in poll_enter_toi() also for error_connection_failed
  which means that the connectivity to the cluster has been lost,
  a.k.a non-prim.
2019-12-08 12:52:36 +02:00
750052b640 Fixed timeout condition in poll_enter_toi() 2019-12-08 12:52:36 +02:00
3a1b194741 Pass certification keys also for NBO end.
Certification keys are needed for NBO end to resolve dependencies
for the write sets which follow NBO end. Without keys the following
write sets do not detect dependency to NBO event and may start applying
too early.
2019-12-08 12:52:36 +02:00
58cea10577 Release TOI critical section in poll_enter_toi() in case of error. 2019-12-08 12:52:36 +02:00
4ff55088b1 Fix NBO error handling
- Set both current error and current error status if provider enter_toi()
  or leave_toi() fails.
- Leave NBO mode if TOI cannot be entered in begin_nbo_phase_two().
2019-12-08 12:52:36 +02:00
e700ce8c79 Added short sleep between calls to enter_toi(). 2019-12-08 12:52:36 +02:00
aaa92e130b Made gcc 4.4 work. 2019-12-08 12:52:36 +02:00
b05abb005f Chrono definitions to work around g++ 4.4 C++11 incompatibilities. 2019-12-08 12:52:36 +02:00
55fdbb7a05 Added timeout option to enter_toi_local() and begin_nbo_phase_one()
If the timeout option is given, enter_toi_local() and begin_nbo_phase_one()
retry provider::enter_toi() as long as the return status indicates
certification failure, until the given timeout expires or the client is
interrupted.
2019-12-08 12:52:36 +02:00
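The retry rule above can be sketched as a small polling loop. This is an illustration under stated assumptions, not the wsrep-lib implementation: `enter_status`, `poll_enter()` and the callback parameters are hypothetical, and the sketch simply returns the last status when the deadline passes or the client is interrupted.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical status codes; the real provider status enum is not shown here.
enum class enter_status { success, certification_failed, other_error };

enter_status poll_enter(
    const std::function<enter_status()>& try_enter,
    const std::function<bool()>& interrupted,
    std::chrono::steady_clock::time_point wait_until,
    std::chrono::milliseconds pause = std::chrono::milliseconds(1))
{
    for (;;)
    {
        const enter_status status = try_enter();
        // Only certification failure is retried; anything else is final.
        if (status != enter_status::certification_failed)
        {
            return status;
        }
        // Give up when interrupted or when the deadline has passed.
        if (interrupted() ||
            std::chrono::steady_clock::now() >= wait_until)
        {
            return status;
        }
        // Short sleep between attempts to avoid a busy loop.
        std::this_thread::sleep_for(pause);
    }
}
```

Passing a deadline rather than a duration keeps the total wait bounded no matter how long each attempt takes.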
594e34052d handle nbo apply error
also, remove outdated comment
2019-12-08 12:52:36 +02:00
5298d2340e error parameter to nbo calls and m_undefined for toi_mode
toi_mode is set only when actually inside phase one and two.
In between it goes back to m_undefined.
2019-12-08 12:52:36 +02:00
0b12869715 NBO begin error handling, unit test 2019-12-08 12:52:36 +02:00
e9bd950ee6 Fixed nbo_meta handling, release commit order for NBO begin. 2019-12-08 12:52:36 +02:00
c7a25b15db Ignore NBO end event in applier, it will be handled via local TOI 2019-12-08 12:52:36 +02:00
24ad144db3 - Remove unneeded keys from nbo phase two begin.
- Save nbo meta for phase two
- Assign trx_meta in mutable_ws_meta
2019-12-08 12:52:36 +02:00
85a03394cc NBO applying
- High priority interface method to apply NBO begin, separate from
  apply_toi() so that the implementation is not forced to interpret
  ws_meta flags.
- Method to put client_state into NBO mode when applying NBO begin.
  The client_state will process in m_local mode.
- Unit tests for applying NBO
2019-12-08 12:52:36 +02:00
1267e29b8f Implementation of client_state NBO operations.
- Implemented calls to enter and leave NBO phase one and two
- Extended client_state mode checking to include m_nbo
- Changed client_state state and mode change sanity checks to
  print a warning and assert() instead of throwing exceptions
  to be more graceful in release builds.
2019-12-08 12:52:36 +02:00
9b25cebdf1 codership/wsrep-lib#117 Fixed empty vector access.
Accessing an empty vector with operator[] may cause libstdc++
assertions to fail. Replaced the vector data access with the data()
method, which is a valid operation even if the vector is empty.

Added unit test to reproduce assertion with empty mutable_buffer access.

Added -D_GLIBCXX_ASSERTIONS preprocessor option to debug builds
to catch standard library misuse.

Added gcc 8 and gcc 9 into travis build matrix.
2019-12-05 14:27:35 +02:00
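The difference between the two access styles can be shown with a small sketch. `byte_buffer` is a hypothetical stand-in for a growable buffer and is not the wsrep-lib `mutable_buffer` class.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a growable byte buffer; not the wsrep-lib class.
class byte_buffer
{
public:
    void push_back(const char* begin, const char* end)
    {
        buffer_.insert(buffer_.end(), begin, end);
    }

    // data() is well defined for an empty vector; the returned pointer
    // must not be dereferenced while size() is zero. Using &buffer_[0]
    // here would be undefined behavior on an empty vector and aborts
    // when the library is built with _GLIBCXX_ASSERTIONS.
    const char* data() const { return buffer_.data(); }

    std::size_t size() const { return buffer_.size(); }

private:
    std::vector<char> buffer_;
};
```

Callers that pass `data()` and `size()` together to a C-style interface work unchanged for the empty case.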
c9513bd2e4 Fixed compilation errors with GCC 4.7
- Do not use [[noreturn]] with GCC older than 4.8.
- Use if/else instead of ternary operator in transaction
  commit_or_rollback_by_xid() to avoid debug build failure with
  GCC 4.7.
2019-10-28 17:22:09 +02:00
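A compiler version guard for `[[noreturn]]` might look like the sketch below. The macro name `SKETCH_NORETURN` and the function are hypothetical, and dropping the attribute entirely on GCC older than 4.8 is an assumption; the commit message only says the attribute is not used there.

```cpp
#include <stdexcept>

// Assumption: on GCC older than 4.8 the attribute is simply dropped.
// Clang also defines __GNUC__, so it is excluded from the version check.
#if defined(__GNUC__) && !defined(__clang__) && \
    (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8))
#define SKETCH_NORETURN
#else
#define SKETCH_NORETURN [[noreturn]]
#endif

// Never returns normally: always throws.
SKETCH_NORETURN void throw_runtime_error(const char* message)
{
    throw std::runtime_error(message);
}
```

The attribute only affects diagnostics and optimization, so omitting it on old compilers does not change behavior.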
66ee7bed1b Add type wsrep::xid
Create type `wsrep::xid`, and change all signatures that take
`std::string xid` to take `wsrep::xid xid`.
2019-10-18 09:36:18 +02:00