1
0
mirror of https://github.com/MariaDB/server.git synced 2025-07-24 19:42:23 +03:00
Commit Graph

31 Commits

Author SHA1 Message Date
feeeacc4d7 MDEV-30955 Explicit locks released too early in rollback path
Assertion `thd->mdl_context.is_lock_owner()` fires when a client is
disconnected, while transaction and and a table is opened through
`HANDLER` interface.
Reason for the assertion is that when a connection closes, its ongoing
transaction is eventually rolled back in
`Wsrep_client_state::bf_rollback()`. This method also releases explicit
which are expected to survive beyond the transaction lifetime.
This patch also removes calls to `mysql_ull_cleanup()`. User level
locks are not supported in combination with Galera, making these calls
unnecessary.
2023-04-18 13:57:59 +02:00
cadc3efcdd MDEV-27317 wsrep_checkpoint order violation due to certification failure
With binlogs enabled, debug assertion ut_ad(xid_seqno > wsrep_seqno)
fired in trx_rseg_update_wsrep_checkpoint() when an applier thread
synced the seqno out of order for write set which had failed
certification. This was caused by releasing commit
order too early when binlogs were on, allowing group
commit to run in parallel and commit following transactions
too early.

Fixed by extending the commit order critical section to cover
call to wsrep_set_SE_checkpoint() also when binlogs are on.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2023-03-31 12:41:24 +02:00
cd97523dcf MDEV-30317 Transaction savepoint may cause failure in galera replaying
Created mtr test for reproducing the crash

Developed actual fix for the issue.
Setting THD::system_thread_info.rpl_sql_info for replayer thread,
same way as it is handled for appliers.

Recorded test result, with the fix

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2023-01-13 13:11:03 +02:00
3c92050d1c Fix build without either ENABLED_DEBUG_SYNC or DBUG_OFF
There are separate flags DBUG_OFF for disabling the DBUG facility
and ENABLED_DEBUG_SYNC for enabling the DEBUG_SYNC facility.
Let us allow debug builds without DEBUG_SYNC.

Note: For CMAKE_BUILD_TYPE=Debug, CMakeLists.txt will continue to
define ENABLED_DEBUG_SYNC.
2022-09-23 17:37:52 +03:00
2917bd0d2c Reduce compilation dependencies on wsrep_mysqld.h
Making changes to wsrep_mysqld.h causes large parts of server code to
be recompiled. The reason is that wsrep_mysqld.h is included by
sql_class.h, even tough very little of wsrep_mysqld.h is needed in
sql_class.h. This commit introduces a new header file, wsrep_on.h,
which is meant to be included from sql_class.h, and contains only
macros and variable declarations used to determine whether wsrep is
enabled.
Also, header wsrep.h should only contain definitions that are also
used outside of sql/. Therefore, move WSREP_TO_ISOLATION* and
WSREP_SYNC_WAIT macros to wsrep_mysqld.h.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-08-31 11:05:23 +03:00
507030c492 MDEV-27713 Crash after a conflict of applier thread with stored procedure call by event scheduler
When thread is BF aborted by high priority service, ULL (user level
locks need to be removed and released). Calling directly release of lock for
MDL_EXPLICIT type doesn't clear also `thd->ull_hash`. Method
`mysql_ull_cleanup` will properly clear all information about ULL locks
for thread.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-18 08:30:26 +02:00
15b691b7bd After-merge fix f84e28c119
In a rebase of the merge, two preceding commits were accidentally reverted:
commit 112b23969a (MDEV-26308)
commit ac2857a5fb (MDEV-25717)

Thanks to Daniele Sciascia for noticing this.
2021-08-25 17:35:44 +03:00
f84e28c119 Merge 10.3 into 10.4 2021-08-18 16:51:52 +03:00
ac2857a5fb MDEV-25717 Assertion `owning_thread_id_ == wsrep::this_thread::get_id()'
A test case to reproduce the issue. The actual fix is in galera
library.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-08-18 12:28:11 +03:00
112b23969a MDEV-26308 : Galera test failure on galera.galera_split_brain
Contains following fixes:

* allow TOI commands to timeout while trying to acquire TOI with
override lock_wait_timeout with a LONG_TIMEOUT only after
succesfully entering TOI
* only ignore lock_wait_timeout on TOI
* fix galera_split_brain test as TOI operation now returns ER_LOCK_WAIT_TIMEOUT after lock_wait_timeout
* explicitly test for TOI

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-08-18 08:57:33 +03:00
5511c21b41 MDEV-23851 BF-BF Conflict issue because of UK GAP locks
Add missing dbug sync point and correct the test case.
2021-03-17 08:35:38 +02:00
41a961d85c MDEV-24327 wsrep XID checkpointing order violation with cert failures
Handling of write sets, which fail in certification happens differently than
with write sets which pass certification. When certification fails, the
write set applying can be skipped and applier needs only to take care of
wsrep XID checkpointing. With current implementation, this can rush ahead
of wsrep XID checkpointing of successful write sets.

The fix in this PR registers wsrep XID checkpointing of certification failure cases in group commit,
which guarantees that XID ceckpointing order is synchronized with real committing transactions.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2020-12-17 09:14:49 +02:00
87fa6d2c5c MDEV-24327 wsrep XID checkpointing order with log_slave_updates=OFF
If log_slave_updates==OFF, wsrep applier threads used to be configured
with option: thd->variables.option_bits&= ~(OPTION_BIN_LOG);
(i.e. like sql_log_bin=ON). And this was regardless of log-bin configuration.

With this, having configuration of: --log-bin && --log-slave-updates=OFF,
local threads used binlogging, but applier threads did not. And further:
local threads went through binlog group commit, while applier threads did
direct commits. This resulted in situation, where applier threads entered
earlier in wsrep XID checkpointing, and could sync their wsrep XID out of order.
Later local thread commit would see that higher seqno was already checkpointed,
and fire an assert because of this.

As a fix, applier threads are now forced to enable binlogging regardless of
log-slave-updates configuration.

This PR comes with new mtr test: galera.MDEV-24327, which causes a scenario
where applier transaction is applied and committed while earlier local transaction
is parked before commit order monitor enter. A buggy mariadb versoin would fail
for assertion because of wsrep XID checkpoint order violation.
2020-12-17 08:12:58 +02:00
24ec8eaf66 MDEV-15532 after-merge fixes from Monty
The Galera tests were massively failing with debug assertions.
2020-12-02 16:16:29 +02:00
571764c04f MDEV-23584 : Galera assertion failure "thd->transaction.stmt.is_empty() at transaction.cc:69
If statement still contains changes it must be committed
before actual transaction is committed.

This assertion was found using randgen and happens only on
applier. No repeatable test case found.
2020-08-26 13:34:51 +03:00
3d76af0814 MDEV-22998 Free thd->mem_root at applier commit or rollback. 2020-07-24 19:33:45 +03:00
bdcecfa22c MDEV-22021: Galera database could get inconsistent with rollback to savepoint
When binlog is disabled, WSREP will not behave correctly when
SAVEPOINT ROLLBACK is executed and we will not rollback transaction.
2020-03-31 14:18:21 +03:00
2d4b6571ec Wsrep position not updated in InnoDB after certification failures (#1432)
A certification failure followed by a clean shutdown would cause an
inconsistency between the sequence number stored in innodb and the
sequence number stored in provider.
This happened both in the case of local certification failure, and in
the case where dummy writeset is applied.
The fix consists of:
- updating wsrep position after dummy writeset is delivered in
 `Wsrep_high_priority_service::log_dummy_write_set()`
- updating wsrep position while releasing commit order in wsrep-lib
 side

Added two tests which stress the situation where a server is shutdown
after a certification failure.
2020-01-14 07:33:02 +02:00
13b3d7f1f1 MDEV-20793 Assertion failed after replay.
Assertion failed in wsrep-lib after transaction replay which
failed due to conflict in certification.

- Implemented reproducible test case MDEV-20793 to reproduce the crash.
- Fixed wsrep-lib to deal with certification error during replay.
2019-12-31 11:46:55 +02:00
67e063eb94 Update wsrep-lib. (#1426)
This commit updates the wsrep-lib. The changes are a cleanup in
client_state TOI processing and stub methods for future extensions.
2019-12-16 07:50:15 +02:00
f4ba775914 MDEV-17099 Preliminary changes for Galera XA support (#1404)
Redo changes reverted in commit
8f46e3833c, this time without build
issues in wsrep-lib.
2019-10-30 10:45:22 +02:00
8f46e3833c Revert MDEV-17099 Preliminary changes for Galera XA support (#1401)
This reverts commit 2b5f4b3ed6
due to build failures.
2019-10-28 16:16:21 +02:00
2b5f4b3ed6 MDEV-17099 Preliminary changes for Galera XA support (#1401)
Update wsrep-lib, and adapt to wsrep-lib interface changes.
2019-10-24 14:05:32 +03:00
efefafd02f fix for thread getting stuck after BF ABORT (#1362)
- Fixes a situation in which a thread gets BF aborted and does not send the reply back to
  the client, even though the connection is still alive. That caused
  both sides to hang waiting for the next message. Now we explicitly
  check that the connection is still alive.
- MTR test for the above
- Replaced thd->killed assignments to thd->reset_kill_query where applicable.
2019-09-17 10:58:20 +03:00
9487e0b259 MDEV-19826 10.4 seems to crash with "pool-of-threads" (#1370)
MariaDB 10.4 was crashing when thread-handling was set to
pool-of-threads and wsrep was enabled.

There were two apparent reasons for the crash:
- Connection handling in threadpool_common.cc was missing calls to
  control wsrep client state.
- Thread specific storage which contains thread variables (THR_KEY_mysys)
  was not handled appropriately by wsrep patch when pool-of-threads
  was configured.

This patch addresses the above issues in the following way:
- Wsrep client state open/close was moved in thd_prepare_connection() and
  end_connection() to have common handling for one-thread-per-connection
  and pool-of-threads.
- Thread local storage handling in wsrep patch was reworked by introducing
  set of wsrep_xxx_threadvars() calls which replace calls to
  THD store_globals()/reset_globals() and deal with thread handling
  specifics internally.

Wsrep-lib was updated to version which relaxes internal concurrency
related sanity checks.

Rollback code from wsrep_rollback_process() was extracted to separate calls
for better readability.

Post rollback thread was removed as it was completely unused.
2019-08-30 08:42:24 +03:00
819c40d694 - wsrep-lib update (SR cleanups and voting support) (#1359)
- TOI error ignoring fix (wsrep_ignore_apply_errors)
2019-07-22 16:34:12 +03:00
eb872ceb27 Fixed wsrep replaying for stored procedures (#1256)
- Changed replaying to always allocate a separate THD object
  for applying log events. This is to avoid tampering original
  THD state during replay process.
- Return success from sp_instr_stmt::exec_core() if replaying
  succeeds.
- Do not push warnings/errors into diagnostics area if the
  transaction must be replayed. This is to avoid reporting
  transient errors to the client.

Added two tests galera_sp_bf_abort, galera_sp_insert_parallel.
Wsrep-lib position updated.
2019-04-06 12:33:51 +03:00
1ef50a34ec 10.4 wsrep group commit fixes (#1224)
* MDEV-16509 Improve wsrep commit performance with binlog disabled

Release commit order critical section early after trx_commit_low() if
binlog is not transaction coordinator. In order to avoid two phase commit,
binlog_hton is not registered for THD during IO_CACHE population.

Implemented a test which verifies that the transactions release
commit order early.

This optimization will change behavior during recovery as the commit
is not two phase when binlog is off. Fixed and recorded wsrep-recover-v25
and wsrep-recover to match the behavior.

* MDEV-18730 Ordering for wsrep binlog group commit

Previously out of order execution was allowed for wsrep commits.
Established proper ordering by populating wait_for_commit
for every wsrep THD and making group commit leader to wait for
prior commits before proceeding to trx_group_commit_leader().

* MDEV-18730 Added a test case to verify correct commit ordering

* MDEV-16509, MDEV-18730 Review fixes

Use WSREP_EMULATE_BINLOG() macro to decide if the binlog_hton
should be registered. Whitespace/syntax fixes and cleanups.

* MDEV-16509 Require binlog for galera_var_innodb_disallow_writes test

If the commit to InnoDB is done in one phase, the native InnoDB behavior
is that the transaction is committed in memory before it is persisted to
disk. This means that the innodb_disallow_writes=ON may not prevent
transaction to become visible to other readers before commit is completely
over. On the other hand, if the commit is two phase (as it is with binlog),
the transaction will be blocked in prepare phase.

Fixed the test to use binlog, which enforces two phase commit, which
in turn makes commit to block before the changes become visible to
other connections. This guarantees that the test produces expected
result.
2019-03-15 07:09:13 +02:00
a8ff4f243d MDEV-18631 Fix streaming replication with wsrep_gtid_mode=ON
With wsrep_gtid_mode=ON, the appropriate commit hooks were not
called in all cases for applied streaming transactions.

As a fix, removed all special handling of commit order critical
section from Wsrep_high_priority_service and Wsrep_storage_service.
Now commit order critical section is always entered in ha_commit_trans().

Check for wsrep_run_commit_hook is now done in handler.cc, log.cc.
This makes it explicit that the transaction is an active wsrep
transaction which must go through commit hooks.
2019-03-04 09:00:25 +02:00
7ae685d032 Fixed replaying bugs found with multimaster load
The replayer did not signal replaying waiters. Added
mysql_cond_broadcast() after replaying is over.

Assertion on client error failed after replay attempt failed due
to certification failure. At this point the transaction does not
go through client state, so the client error cannot be overridden.
Assign ER_LOCK_DEADLOCK to thd directly instead.

Use timed cond wait when waiting for replayers to finish and
check if the transaction has been BF aborted during the wait.
2019-02-19 17:09:19 +02:00
36a2a185fe Galera4 2019-01-23 15:30:00 +04:00