1
0
mirror of https://github.com/MariaDB/server.git synced 2025-07-20 10:24:14 +03:00
Commit Graph

229 Commits

Author SHA1 Message Date
a70a1cf3f4 Merge branch '10.3' into 10.4 2022-05-08 23:03:08 +02:00
9614fde1aa Merge branch '10.2' into 10.3 2022-05-03 10:59:54 +02:00
a7923b37c4 MDEV-28263 mariadb-tzinfo-to-sql binlog fixes
The --skip-write-binlog message was confusing that it only had
an effect if the galera was enabled. There are uses beyond galera
so we apply SET SESSION SQL_LOG_BIN=0 as implied by the option
without being conditional on the wsrep status.

Remove wsrep.mysql_tzinfo_to_sql_symlink{,_skip} tests as they offered
no additional coverage beyond main.mysql_tzinfo_to_sql_symlink as no
server testing was done.

Introduced a variant of the galera.mariadb_tzinfo_to_sql as
galera.mysql_tzinfo_to_sql, which does testing using the mysql client
rather than directly importing into the server via mysqltest.

Update man page and mysql_tzinfo_to_sql to having a --skip-write-binlog
option.

merge notes:
10.4:
- conflicts in tztime.cc can revert to this version of --help text.
- tztime.cc - merge execute immediate @prep1, and leave %s%s trunc_tables, lock_tables
  after that.
10.6:
- Need to remove the not_embedded.inc in mysql_tzinfo_to_sql.test and
  replace it with no_protocol.inc
- leave both mysql_tzinfo_to_sql.test and mariadb_tzinfo_to_sql.sql
  tests.
- sql/tztime.cc - keep entirely 10.6 version.
2022-04-23 14:20:22 +10:00
5e04c08d0a MDEV-23326: mtr fix - slow on timezone intialisation
Fix wsrep.mysql_tzinfo_to_sql_symlink_skip test result.
2022-01-14 18:29:24 +11:00
6b4f0d782c MDEV-23326: Aria significantly slow on timezone intialisation
The --skip-write-binary-log added to mysql_tzinfo_to_sql in
MDEV-18778 was only effective if galera was enabled on the server.
This is because it tied together three concepts under one option:
1. binary logging
2. wsrep replication, and
3. using innodb as a transitional table type.

Change 1: small change in help option to reflect this.

To solve the performance problem with Aria tables, LOCK TABLES WRITE
is used to eliminate the need to fdatasync until the UNLOCK TABLES.

If galera isn't enabled, then we also want to use the LOCK TABLE WRITE
mechanism.

The START TRANSACTION added in MDEV-23440 needed to be moved to
before LOCK TABLES otherwise it would cancel their effect.

TRUNCATE TABLE statements also need to be before the LOCK TABLES.

When changing back from InnoDB to Aria, include the ORDER BY that
was originally there in 6aaccbcbf7 and matching the final ALTER
TABLE in the timezonedir branch.

Running: mariadb-tzinfo-to-sql --skip-write-binlog /usr/share/zoneinfo
now generates 16 Aria_transaction_log_syncs from 7053.
2022-01-12 10:39:31 +11:00
157b3a637f MDEV-23328 Server hang due to Galera lock conflict resolution
Mutex order violation when wsrep bf thread kills a conflicting trx,
the stack is

          wsrep_thd_LOCK()
          wsrep_kill_victim()
          lock_rec_other_has_conflicting()
          lock_clust_rec_read_check_and_lock()
          row_search_mvcc()
          ha_innobase::index_read()
          ha_innobase::rnd_pos()
          handler::ha_rnd_pos()
          handler::rnd_pos_by_record()
          handler::ha_rnd_pos_by_record()
          Rows_log_event::find_row()
          Update_rows_log_event::do_exec_row()
          Rows_log_event::do_apply_event()
          Log_event::apply_event()
          wsrep_apply_events()

and mutexes are taken in the order

          lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data

When a normal KILL statement is executed, the stack is

          innobase_kill_query()
          kill_handlerton()
          plugin_foreach_with_mask()
          ha_kill_query()
          THD::awake()
          kill_one_thread()

        and mutexes are

          victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex

This patch is the plan D variant for fixing potetial mutex locking
order exercised by BF aborting and KILL command execution.

In this approach, KILL command is replicated as TOI operation.
This guarantees total isolation for the KILL command execution
in the first node: there is no concurrent replication applying
and no concurrent DDL executing. Therefore there is no risk of
BF aborting to happen in parallel with KILL command execution
either. Potential mutex deadlocks between the different mutex
access paths with KILL command execution and BF aborting cannot
therefore happen.

TOI replication is used, in this approach,  purely as means
to provide isolated KILL command execution in the first node.
KILL command should not (and must not) be applied in secondary
nodes. In this patch, we make this sure by skipping KILL
execution in secondary nodes, in applying phase, where we
bail out if applier thread is trying to execute KILL command.
This is effective, but skipping the applying of KILL command
could happen much earlier as well.

This also fixed unprotected calls to wsrep_thd_abort
that will use wsrep_abort_transaction. This is fixed
by holding THD::LOCK_thd_data while we abort transaction.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-10-29 10:00:17 +03:00
db50ea3ad3 MDEV-23328 Server hang due to Galera lock conflict resolution
Mutex order violation when wsrep bf thread kills a conflicting trx,
the stack is

          wsrep_thd_LOCK()
          wsrep_kill_victim()
          lock_rec_other_has_conflicting()
          lock_clust_rec_read_check_and_lock()
          row_search_mvcc()
          ha_innobase::index_read()
          ha_innobase::rnd_pos()
          handler::ha_rnd_pos()
          handler::rnd_pos_by_record()
          handler::ha_rnd_pos_by_record()
          Rows_log_event::find_row()
          Update_rows_log_event::do_exec_row()
          Rows_log_event::do_apply_event()
          Log_event::apply_event()
          wsrep_apply_events()

and mutexes are taken in the order

          lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data

When a normal KILL statement is executed, the stack is

          innobase_kill_query()
          kill_handlerton()
          plugin_foreach_with_mask()
          ha_kill_query()
          THD::awake()
          kill_one_thread()

        and mutexes are

          victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex

This patch is the plan D variant for fixing potetial mutex locking
order exercised by BF aborting and KILL command execution.

In this approach, KILL command is replicated as TOI operation.
This guarantees total isolation for the KILL command execution
in the first node: there is no concurrent replication applying
and no concurrent DDL executing. Therefore there is no risk of
BF aborting to happen in parallel with KILL command execution
either. Potential mutex deadlocks between the different mutex
access paths with KILL command execution and BF aborting cannot
therefore happen.

TOI replication is used, in this approach,  purely as means
to provide isolated KILL command execution in the first node.
KILL command should not (and must not) be applied in secondary
nodes. In this patch, we make this sure by skipping KILL
execution in secondary nodes, in applying phase, where we
bail out if applier thread is trying to execute KILL command.
This is effective, but skipping the applying of KILL command
could happen much earlier as well.

This also fixed unprotected calls to wsrep_thd_abort
that will use wsrep_abort_transaction. This is fixed
by holding THD::LOCK_thd_data while we abort transaction.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-10-29 07:57:18 +03:00
f59f5c4a10 Revert MDEV-25114
Revert 88a4be75a5 and
9d97f92feb, which had been
prematurely pushed by accident.
2021-09-24 16:21:20 +03:00
88a4be75a5 MDEV-25114 Crash: WSREP: invalid state ROLLED_BACK (FATAL)
This patch is the plan D variant for fixing potetial mutex locking
order exercised by BF aborting and KILL command execution.

In this approach, KILL command is replicated as TOI operation.
This guarantees total isolation for the KILL command execution
in the first node: there is no concurrent replication applying
and no concurrent DDL executing. Therefore there is no risk of
BF aborting to happen in parallel with KILL command execution
either. Potential mutex deadlocks between the different mutex
access paths with KILL command execution and BF aborting cannot
therefore happen.

TOI replication is used, in this approach,  purely as means
to provide isolated KILL command execution in the first node.
KILL command should not (and must not) be applied in secondary
nodes. In this patch, we make this sure by skipping KILL
execution in secondary nodes, in applying phase, where we
bail out if applier thread is trying to execute KILL command.
This is effective, but skipping the applying of KILL command
could happen much earlier as well.

This patch also fixes mutex locking order and unprotected
THD member accesses on bf aborting case. We try to hold
THD::LOCK_thd_data during bf aborting. Only case where it
is not possible is at wsrep_abort_transaction before
call wsrep_innobase_kill_one_trx where we take InnoDB
mutexes first and then THD::LOCK_thd_data.

This will also fix possible race condition during
close_connection and while wsrep is disconnecting
connections.

Added wsrep_bf_kill_debug test case

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-09-24 09:47:31 +03:00
217caf27e3 Update result files for Galera library 26.4.9 2021-07-27 10:34:45 +03:00
473e85e931 MDEV-25591 : Test case cleanups
galera_var_wsrep_on_off : Add wait conditions to make sure DDL is
replicated before continuing.

wsrep.[variables|variables_debug] :  Remove unnecessary parts
and add check to correct number of variables or skip

galera_ssl_reload: Add version check and SSL checks.
2021-05-04 11:34:06 +03:00
9e6310e323 Fix MTR test wsrep.variables_debug
The test was changing variable wsrep_provider dynamically,
but wsrep_provider was recently made read-only.

followup for ce3a2a688d
2021-04-26 09:56:46 +03:00
eb4123eefc More fixes to variable wsrep_on
* Disallow setting wsrep_on = 1 if wsrep_provider is unset. Also, move
  wsrep_on_basic from sys_vars to wsrep suite: this test now requires
  to run with wsrep_provider set
* Disallow setting @@session.wsrep_on = 1 when @@global.wsrep_on = 0
* Handle the case where a new connection turns @@global.wsrep_on from
  off to on. In this case we would miss a call to wsrep_open, causing
  unexpected states in wsrep::client_state (causing assertions).
* Disable wsrep.MDEV-22443 because it is no longer possible to enable
  wsrep_on, if server is started with wsrep_provider='none'

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-04-20 08:24:14 +03:00
e841957416 Merge branch '10.3' into 10.4 2021-02-23 09:25:57 +01:00
0ab1e3914c Merge branch '10.2' into 10.3 2021-02-22 22:42:27 +01:00
a638f1577a Merge branch 'bb-10.2-release' into 10.2 2021-02-22 18:43:03 +01:00
53123dfa3e Merge branch 'bb-10.3-release' into bb-10.4-release 2021-02-19 00:19:42 +01:00
0d55b020e1 Merge branch 'bb-10.2-release' into bb-10.3-release 2021-02-18 22:09:53 +01:00
ce3a2a688d make @@wsrep_provider and @@wsrep_notify_cmd read-only
this should simplify run-time cluster management
2021-02-18 19:03:01 +01:00
4d300ab1a8 MDEV-24867 : wsrep.variables MTR failed: Result length mismatch
Stabilize test case.
2021-02-17 10:28:37 +02:00
00a313ecf3 Merge branch 'bb-10.3-release' into bb-10.4-release
Note, the fix for "MDEV-23328 Server hang due to Galera lock conflict resolution"
was null-merged. 10.4 version of the fix is coming up separately
2021-02-12 17:44:22 +01:00
60ea09eae6 Merge branch '10.2' into 10.3 2021-02-01 13:49:33 +01:00
900a14754a Fix wsrep.variables 2021-01-26 14:21:33 +02:00
be5fce16a0 MDEV-24596 : Assertion `state_ == s_exec || state_ == s_quitting' failed in wsrep::client_state::disable_streaming
There were multiple problems here
* wsrep_trx_fragment_size should not be set when wsrep is disabled or provider is not loaded
* wsrep_trx_fragment_unit should not be set when wsrep is disabled or provider is not loaded
* wsrep_debug has no effect if wsrep is disabled or provider is not loaded
* wsrep_start_position should not be set when wsrep is disabled or provider is not loaded any other value than default
* wsrep_start_position should be changed only when we are joiner or initialized
* wsrep_start_position should be allowed to set only a value that exits, thus
we need to add error handling to wsrep_sst_complete
2021-01-21 11:41:29 +02:00
6f50f51e60 MDEV-21494 : Galera test sporadic failure on galera.galera_defaults
Make sure that we operate with correct Galera library version
and do not print wsrep_provider_options field.
2020-11-19 12:42:54 +02:00
4b3690b504 fixup 67cb7ea22a 2020-11-03 14:48:08 +02:00
67cb7ea22a Clean up wsrep.variables 2020-11-03 09:12:06 +02:00
2391582ec3 Merge remote-tracking branch 10.2 into 10.3 2020-11-03 09:00:23 +02:00
94859d985e Clean up wsrep.variables 2020-11-03 08:49:10 +02:00
ec0e9d6f76 MDEV-22681 EXECUTE IMMEDIATE crashes server if wsrep is on.
A wsrep transaction was started for EXECUTE IMMEDIATE, which
caused assertion failure when the executed statement was
CREATE TABLE which should be executed in TOI mode.

As a fix, don't start wsrep transaction for EXECUTE IMMEDIATE
to let the wsrep state logic to be handled from inside stored
procedure codepath.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2020-10-28 09:51:35 +02:00
45e10d46d8 After-merge fix to wsrep.variables
We must make the test depend on a debug version of Galera.
2020-10-22 13:50:06 +03:00
46957a6a77 Merge 10.3 into 10.4 2020-10-22 13:27:18 +03:00
e3d692aa09 Merge 10.2 into 10.3 2020-10-22 08:26:28 +03:00
fdf87973cb MDEV-23081 Stray XA transactions at startup, with wsrep_on=OFF
Change xarecover_handlerton so that transaction with WSREP prefixed
xids are rolled back when Galera is disabled.

Reviewd-by: Jan Lindström <jan.lindstrom@mariadb.com>
2020-10-21 16:29:07 +03:00
a3eddd9f11 MDEV-23659 : Update Galera disabled.def file
Changes to be committed:
	modified:   mysql-test/suite/galera/disabled.def
	modified:   mysql-test/suite/wsrep/disabled.def
2020-10-10 08:50:50 +03:00
fc3b5c7db3 MDEV-17585 : wsrep.variables failed in buildbot with deadlock on CREATE USER
Stabilize test by using correct galera library and restore
original galera cluster at end.
2020-10-10 08:50:50 +03:00
7edfb72eff Fix MTR test wsrep.variables
Require galera_have_debug_sync.inc and re-record to include new
variables exposed by latest galera library.
2020-09-28 12:17:05 +03:00
de76bebc57 MDEV-23659 : Update Galera disabled.def file 2020-09-14 18:21:48 +03:00
b69e980a38 MDEV-20581 Fix MTR test wsrep.variables
Made the test work with --repeat option by adding
galera_wait_ready.inc at the end of test. Recorded the test output.
2020-09-14 18:21:17 +03:00
2fa9f8c53a Merge 10.3 into 10.4 2020-08-20 11:01:47 +03:00
de0e7cd72a Merge 10.2 into 10.3 2020-08-20 09:12:16 +03:00
bfba2bce6a Merge 10.1 into 10.2 2020-08-20 06:00:36 +03:00
f8bf5b0f84 MDEV-23466 SIGABRT on SELECT WSREP_LAST_SEEN_GTID
SELECT WSREP_LAST_SEEN_GTID aborts the server if no provider is
loaded.
2020-08-19 13:12:00 +03:00
fe3284b2cc MDEV-23092 SIGABRT when setting invalid wsrep_provider
Some invalid wsrep_provider paths may be interpreted as a valid
directory. For example '/invalid/libgalera_smm.so' with UTF character
set is interpreted as '/', which is a valid directory. A early check
that wsrep_provider should not be a directory fixes it.
2020-08-19 13:12:00 +03:00
09dd06f14a MDEV-22443 wsrep::runtime_error on START TRANSACTION
This happens with global wsrep_on disabled and local wsrep_on enabled.
The fix consists in avoiding sync wait when global wsrep_on is
disabled.
2020-08-19 13:12:00 +03:00
b970363acf MDEV-23440: mysql_tzinfo_to_sql to use transactions
Since MDEV-18778, timezone tables get changed to innodb
to allow them to be replicated to other galera nodes.

Even without galera, timezone tables could be declared innodb.
With the standalone innodb tables, the mysql_tzinfo_to_sql takes
approximately 27 seconds.

With the transactions enabled in this patch, 1.2 seconds is
the approximate load time.

While explicit checks for the engine of the time zone tables could be
done, or checks against !opt_skip_write_binlog, non-transactional
storage engines will just ignore the transactional state without
even a warning so its safe to enact globally.

Leap seconds are pretty much ignored as they are a single insert
statement and have gone out of favour as they have caused MariaDB
stalls in the past.
2020-08-15 14:02:05 +10:00
eae968f62d Merge 10.3 into 10.4 2020-08-10 21:08:46 +03:00
bafc5c1321 Merge 10.2 into 10.3 2020-08-10 18:40:57 +03:00
1dec60c795 MDEV-22626: mysql_tzinfo_to_sql not replicates timezone to galeranodes if only 1 timezone will be loaded.
Move alter to InnoDB earlier to more correct place to handle
also if only a one timezone file is loaded.
2020-08-07 09:06:13 +03:00
c2db9397c7 MDEV-18565 Galera mtr-suite fails if galera library is not installed
revert/simplify f5390eea9a

remove galera-specific checks from mtr and the main suite
2020-04-27 09:22:36 +02:00