1
0
mirror of https://github.com/MariaDB/server.git synced 2025-11-09 11:41:36 +03:00
Commit Graph

8068 Commits

Author SHA1 Message Date
Michael Widenius
7af50e4df4 MDEV-32551: "Read semi-sync reply magic number error" warnings on master
rpl_semi_sync_slave_enabled_consistent.test and the first part of
the commit message comes from Brandon Nesterenko.

A test to show how to induce the "Read semi-sync reply magic number
error" message on a primary. In short, if semi-sync is turned on
during the hand-shake process between a primary and replica, but
later a user negates the rpl_semi_sync_slave_enabled variable while
the replica's IO thread is running; if the io thread exits, the
replica can skip a necessary call to kill_connection() in
repl_semisync_slave.slave_stop() due to its reliance on a global
variable. Then, the replica will send a COM_QUIT packet to the
primary on an active semi-sync connection, causing the magic number
error.

The test in this patch exits the IO thread by forcing an error;
though note a call to STOP SLAVE could also do this, but it ends up
needing more synchronization. That is, the STOP SLAVE command also
tries to kill the VIO of the replica, which makes a race with the IO
thread to try and send the COM_QUIT before this happens (which would
need more debug_sync to get around). See THD::awake_no_mutex for
details as to the killing of the replica’s vio.

Notes:
- The MariaDB documentation does not make it clear that when one
  enables semi-sync replication it does not matter if one enables
  it first in the master or slave. Any order works.

Changes done:
- The rpl_semi_sync_slave_enabled variable is now a default value for
  when semisync is started. The variable does not anymore affect
  semisync if it is already running. This fixes the original reported
  bug.  Internally we now use repl_semisync_slave.get_slave_enabled()
  instead of rpl_semi_sync_slave_enabled. To check if semisync is
  active on should check the @@rpl_semi_sync_slave_status variable (as
  before).
- The semisync protocol conflicts in the way that the original
  MySQL/MariaDB client-server protocol was designed (client-server
  send and reply packets are strictly ordered and includes a packet
  number to allow one to check if a packet is lost). When using
  semi-sync the master and slave can send packets at 'any time', so
  packet numbering does not work. The 'solution' has been that each
  communication starts with packet number 1, but in some cases there
  is still a chance that the packet number check can fail.  Fixed by
  adding a flag (pkt_nr_can_be_reset) in the NET struct that one can
  use to signal that packet number checking should not be done. This
  is flag is set when semi-sync is used.
- Added Master_info::semi_sync_reply_enabled to allow one to configure
  some slaves with semisync and other other slaves without semisync.
  Removed global variable semi_sync_need_reply that would not work
  with multi-master.
- Repl_semi_sync_master::report_reply_packet() can now recognize
  the COM_QUIT packet from semisync slave and not give a
  "Read semi-sync reply magic number error" error for this case.
  The slave will be removed from the Ack listener.
- On Windows, don't stop semisync Ack listener just because one
  slave connection is using socket_id > FD_SETSIZE.
- Removed busy loop in Ack_receiver::run() by using
 "Self-pipe trick" to signal new slave and stop Ack_receiver.
- Changed some Repl_semi_sync_slave functions that always returns 0
  from int to void.
- Added Repl_semi_sync_slave::slave_reconnect().
- Removed dummy_function Repl_semi_sync_slave::reset_slave().
- Removed some duplicate semisync notes from the error log.
- Add test of "if (get_slave_enabled() && semi_sync_need_reply)"
  before calling Repl_semi_sync_slave::slave_reply().
  (Speeds up the code as we can skip all initializations).
- If epl_semisync_slave.slave_reply() fails, we disable semisync
  for that connection.
- We do not call semisync.switch_off() if there are no active slaves.
  Instead we check in Repl_semi_sync_master::commit_trx() if there are
  no active threads. This simplices the code.
- Changed assert() to DBUG_ASSERT() to ensure that the DBUG log is
  flushed in case of asserts.
- Removed the internal rpl_semi_sync_slave_status as it is not needed
  anymore. The @@rpl_semi_sync_slave_status status variable is now
  mapped to rpl_semi_sync_enabled.
- Removed rpl_semi_sync_slave_enabled  as it is not needed anymore.
  Repl_semi_sync_slave::get_slave_enabled() contains the active status.
- Added checking that we do not add a slave twice with
  Ack_receiver::add_slave(). This could happen with old code.
- Removed Repl_semi_sync_master::check_and_switch() as it is not
  needed anymore.
- Ensure that when we call Ack_receiver::remove_slave() that the slave
  is removed from the listener before function returns.
- Call listener.listen_on_sockets() outside of mutex for better
  performance and less contested mutex.
- Ensure that listening is ignoring newly added slaves when checking for
  responses.
- Fixed the master ack_receiver listener is not killed if there are no
  connected slaves (and thus stop semisync handling of future
  connections). This could happen if all slaves sockets where would be
  marked as unreliable.
- Added unlink() to base_ilist_iterator and remove() to
  I_List_iterator. This enables us to remove 'dead' slaves in
  Ack_recever::run().
- kill_zombie_dump_threads() now does killing of dump threads properly.
  - It can now kill several threads (should be impossible but could
    happen if IO slaves reconnects very fast).
  - We now wait until the dump thread is done before starting the
    dump.
- Added an error if kill_zombie_dump_threads() fails.
- Set thd->variables.server_id before calling
  kill_zombie_dump_threads(). This simplies the code.
- Added a lot of comments both in code and tests.
- Removed DBUG_EVALUATE_IF "failed_slave_start" as it is not used.

Test changes:
- rpl.rpl_session_var2 added which runs rpl.rpl_session_var test with
  semisync enabled.
- Some timings changed slight with startup of slave which caused
  rpl_binlog_dump_slave_gtid_state_info.text to fail as it checked the
  error log file before the slave had started properly. Fixed by
  adding wait_for_pattern_in_file.inc that allows waiting for the
  pattern to appear in the log file.
- Tests have been updated so that we first set
  rpl_semi_sync_master_enabled on the master and then set
  rpl_semi_sync_slave_enabled on the slaves (this is according to how
  the MariaDB documentation document how to setup semi-sync).
- Error text "Master server does not have semi-sync enabled" has been
  replaced with "Master server does not support semi-sync" for the
  case when the master supports semi-sync but semi-sync is not
  enabled.

Other things:
- Some trivial cleanups in Repl_semi_sync_master::update_sync_header().
- We should in 11.3 changed the default value for
  rpl-semi-sync-master-wait-no-slave from TRUE to FALSE as the TRUE
  does not make much sense as default. The main difference with using
  FALSE is that we do not wait for semisync Ack if there are no slave
  threads.  In the case of TRUE we wait once, which did not bring any
  notable benefits except slower startup of master configured for
  using semisync.

Co-author: Brandon Nesterenko <brandon.nesterenko@mariadb.com>

This solves the problem reported in MDEV-32960 where a new
slave may not be registered in time and the master disables
semi sync because of that.
2024-01-23 13:03:11 +02:00
Marko Mäkelä
9374772ecd Merge 10.11 into 11.0 2024-01-19 09:07:48 +02:00
Marko Mäkelä
ad13fb36bf Merge 10.6 into 10.11 2024-01-17 17:37:15 +02:00
Yuchen Pei
931df937e9 MDEV-32559 failing spider signal_ddl_recovery_done callback should result in spider deinit
Since 0930eb86cb, system table creation
needed for spider init is delayed to the signal_ddl_recovery_done
callback. Since it is part of the init, failure should result in
spider deinit.

We also remove the call to spider_init_system_tables() from
spider_db_init(), as it was removed in the commit mentioned above and
accidentally restored in a merge.
2024-01-16 17:17:50 +11:00
Yuchen Pei
d06b6de305 Merge branch '10.5' into 10.6 2024-01-11 12:59:22 +11:00
Sergei Golubchik
761d5c8987 MDEV-33092 Undefined reference to concurrency on Solaris
remove thr_setconcurrency()
followup for 8bbcaab160

Fix by Rainer Orth
2024-01-10 10:16:20 +01:00
Yuchen Pei
c9902a20b3 Merge branch '10.4' into 10.5 2024-01-10 18:01:46 +11:00
Marko Mäkelä
613d019497 MDEV-33160 show_status_array() calls various functions via incompatible pointer
In commit b4ff64568c the
signature of mysql_show_var_func was changed, but not all functions
of that type were adjusted.

When the server is configured with `cmake -DWITH_ASAN=ON` and
compiled with clang, runtime errors would be flagged for invoking
functions through an incompatible function pointer.

Reviewed by: Michael 'Monty' Widenius
2024-01-04 12:50:05 +02:00
Sergei Golubchik
c154aafe1a Merge remote-tracking branch '11.3' into 11.4 2023-12-21 15:40:55 +01:00
Sergei Golubchik
7f0094aac8 Merge branch '11.2' into 11.3 2023-12-21 02:14:59 +01:00
Sergei Golubchik
fef31a26f3 Merge branch '11.1' into 11.2 2023-12-20 23:43:05 +01:00
Sergei Golubchik
7a5448f8da Merge branch '11.0' into 11.1 2023-12-19 20:11:54 +01:00
Sergei Golubchik
8c8bce05d2 Merge branch '10.11' into 11.0 2023-12-19 15:53:18 +01:00
Sergei Golubchik
fd0b47f9d6 Merge branch '10.6' into 10.11 2023-12-18 11:19:04 +01:00
Sergei Golubchik
e95bba9c58 Merge branch '10.5' into 10.6 2023-12-17 11:20:43 +01:00
Sergei Golubchik
4231cf6d3f MDEV-32617 deprecate secure_auth=0 2023-12-12 15:21:28 +01:00
Sergei Golubchik
98a39b0c91 Merge branch '10.4' into 10.5 2023-12-02 01:02:50 +01:00
Monty
bc6b6cf6a7 Add back --debug option to mariadbd
This option was never supposed to be depricated.
Almost all MariaDB binaries also supports the --debug option.
2023-11-28 19:19:10 +02:00
Daniel Black
6218b5f7cb MDEV-32567 Remove thr_alarm from server codebase - socket activation fix
Systemd socket activation cannot handle a shutdown on the file
descriptor[1].

Enumerate past the socket activation descriptors.

If there was no shutdown to trigger the breaking of the event loop,
then write to the termination_event_fd that was setup during
the socket activation code for this purpose.

As abort_loop= true is already set at the top of break_connect_loop,
and this is checked in loop before sockets are processed, no
additional checking to read from the termination_event_fd is needed.

Without socket activation defined, or used, termination_event_fd keeps
its -1 default value.

Close the eventfd outside the while loop so retries can happen if
the write fails for some reason.

ref[1]: https://www.freedesktop.org/software/systemd/man/latest/sd_listen_fds.html

Reviewed by: Vladislav Vaintroub
2023-11-23 11:52:38 +11:00
Vladislav Vaintroub
013fc02a23 MDEV-32567 Remove thr_alarm from server codebase
This allows to simplify net_real_read() and net_real_write() a bit.

Removed some superfluous #ifdef/ifndef MYSQL_SERVER from net_serv.cc
The code always runs in server, either normal or embedded.
Dead code for switching socket between blocking and non-blocking modes,
is also removed.

Removed pthread_kill() with alarm signal that woke up main thread on
server shutdown. Used shutdown(2) on polling sockets instead, to the same
effect.

Removed yet another superstitious pthread_kill(), that ran on non-Windows
in terminate_slave_thread().
2023-11-23 11:52:38 +11:00
Vladislav Vaintroub
3424ed7d42 MDEV-32189 Use icu for timezones on windows
Use ICU to work with timezones, to retrieve current timezone name,
abbreviation, and offset from GMT. However in case TZ environment variable
is used to set timezone, and ICU does not have corresponding one,
C runtime functions will be used.

Moved some of timezone handling to mysys.
Added unit tests.
2023-11-21 21:35:02 +01:00
Vladislav Vaintroub
bb8e1bf7a2 Merge 11.3 into 11.4 2023-11-21 15:43:20 +01:00
Alexey Botchkov
801b45bf4f MDEV-26890 : Crash on shutdown, with active binlog dump threads
Backported from 10.7.
   The reason for the crash was a bug in MDEV-19275, after which
   shutdown does not wait for binlog threads anymore.
2023-11-05 23:35:31 +04:00
Alexey Botchkov
1fa196a559 MDEV-27595 Backport SQL service, introduced by MDEV-19275.
The SQL SERVICE backported into the 10.4.
2023-11-05 23:35:31 +04:00
Kristian Nielsen
b8f9f796ff MDEV-31273: Precompute binlog checksums
Compute binlog checksums (when enabled) already when writing events
into the statement or transaction caches, where before it was done
when the caches are copied to the real binlog file. This moves the
checksum computation outside of holding LOCK_log, improving
scalabitily.

At stmt/trx cache write time, the final end_log_pos values are not
known, so with this patch these will be set to 0. Events that are
written directly to the binlog file (not through stmt/trx cache) keep
the correct end_log_pos value. The GTID and COMMIT/XID events at the
start and end of event groups are written directly, so the zero
end_log_pos is only for events in the middle of event groups, which
do not negatively affect replication.

An option --binlog-legacy-event-pos, off by default, is provided to
disable this behavior to provide backwards compatibility with any
external applications that might rely on end_log_pos in events in the
middle of event groups.

Checksums cannot be pre-computed when binlog encryption is enabled, as
encryption relies on correct end_log_pos to provide part of the
nonce/IV.

Checksum pre-computation is also disabled for WSREP/Galera, as it uses
events differently in its write-sets and so on. Extending pre-computation of
checksums to Galera where it makes sense could be added in a future patch.

The current --binlog-checksum configuration is saved in
binlog_cache_data at transaction start and used to pre-compute
checksums in cache, if applicable. When the cache is later copied to
the binlog, a check is made if the saved value still matches the
configured global value; if so, the events are block-copied directly
into the binlog file. If --binlog-checksum was changed during the
transaction, events are re-written to the binlog file one-by-one and
the checksums recomputed/discarded as appropriate.

Reviewed-by: Monty <monty@mariadb.org>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2023-10-27 19:57:43 +02:00
Marko Mäkelä
7b842f1536 Merge 11.2 into 11.3 2023-10-27 10:48:29 +03:00
Yuchen Pei
d0f8dfbcf0 Merge branch '11.1' into 11.2 2023-10-27 18:11:56 +11:00
Yuchen Pei
c4a506f0bf MDEV-32525 Validate --redirect_url supplied as server flag
Like sql_mode, we factor out of ON_CHECK function for export, to be
used in get_options() during server startup, for validation of
--redirect_url value.
2023-10-26 17:55:13 +11:00
Sergei Golubchik
d16817c477 typo fixed. it's wsrep-causal-reads 2023-10-25 17:08:11 +02:00
Marko Mäkelä
9b2a65e41a Merge 11.0 into 11.1 2023-10-19 08:26:16 +03:00
Marko Mäkelä
be24e75229 Merge 10.11 into 11.0 2023-10-19 08:12:16 +03:00
Marko Mäkelä
2ecc0443ec Merge 10.10 into 10.11 2023-10-17 16:04:21 +03:00
Marko Mäkelä
d5e15424d8 Merge 10.6 into 10.10
The MDEV-29693 conflict resolution is from Monty, as well as is
a bug fix where ANALYZE TABLE wrongly built histograms for
single-column PRIMARY KEY.
Also includes a fix for safe_malloc error reporting.

Other things:
- Copied main.log_slow from 10.4 to avoid mtr issue

Disabled test:
- spider/bugfix.mdev_27239 because we started to get
  +Error	1429 Unable to connect to foreign data source: localhost
  -Error	1158 Got an error reading communication packets
- main.delayed
  - Bug#54332 Deadlock with two connections doing LOCK TABLE+INSERT DELAYED
    This part is disabled for now as it fails randomly with different
    warnings/errors (no corruption).
2023-10-14 13:36:11 +03:00
Monty
8edef482a7 Changed some malloc() calls to my_malloc()
- hostnames in hostname_cache added
- Some Galera (WSREP) allocations
- Table caches
2023-10-03 08:25:30 +03:00
Sergei Golubchik
4c3584b510 MDEV-32104 add removed command line options back as noops 2023-09-30 14:43:12 +02:00
Sergei Golubchik
6b9e1220ee MDEV-31811 deprecate old_mode values
mark non-default values of old_mode as deprecated.
print a warning when they're set from the command line and in SQL.
2023-09-30 14:43:12 +02:00
Sergei Golubchik
82174dae06 MDEV-32104 remove deprecated features
In particular:

* @@debug
  deprecated since 5.5.37
* sr_YU locale
  deprecated since 10.0.11
* "engine_condition_pushdown" in the @@optimizer_switch
  deprecated since 10.1.1
* @@date_format, @@datetime_format, @@time_format, @@max_tmp_tables
  deprecated since  10.1.2
* @@wsrep_causal_reads
  deprecated since 10.1.3
* "parser" in mroonga table comment
  deprecated since 10.2.11
2023-09-30 14:43:12 +02:00
Sergei Golubchik
37e854f34a Merge branch '11.1' into 11.2 2023-09-29 16:01:59 +02:00
Sergei Golubchik
05d850d4b3 Merge branch '11.0' into 11.1 2023-09-29 13:58:47 +02:00
Sergei Golubchik
3f6bccb888 Merge branch '10.11' into 11.0 2023-09-29 12:24:54 +02:00
Marko Mäkelä
6a470db552 Merge 10.5 into 10.6 2023-09-14 15:25:53 +03:00
Yuchen Pei
cb1965bd9d Merge branch '10.4' into 10.5 2023-09-14 16:30:11 +10:00
Daniel Black
1831f8e4d7 MDEV-31369 Disable TLS v1.0 and 1.1 for MariaDB
Remove TLSv1.1 from the default tls_version system variable.

Output a warning if TLSv1.0 or TLSv1.1 are selected.

Thanks Tingyao Nian for the feature request.
2023-09-13 20:17:29 +10:00
Marko Mäkelä
0dd25f28f7 Merge 10.5 into 10.6 2023-09-11 14:46:39 +03:00
Marko Mäkelä
f8f7d9de2c Merge 10.4 into 10.5 2023-09-11 11:29:31 +03:00
Monty
69c420be3d Added support for --skip-secure-file-priv
This works the same as secure-file-priv="", but is more obvious way to
turn of secure-file-priv.
2023-09-09 15:14:44 +03:00
Daniel Black
91ab819451 MDEV-25177 Better indication of refusing to start because of ProtectHome
Create test for for case insensitive gives a basic warning on creating
a test file and the next thing a user might see is an abort.

ProtectHome and other systemd setting protect system services from
accessing user data. Unfortunately some of our users do put things
on /home due space or other reasons.

Rather than enumberate the systemd options in a very clunkly fragile
way we put an error associated with the "Can't create test file" and
hope the user can work it out from there.

%M tip thanks Sergei.
2023-09-04 21:26:05 +10:00
Monty
a6bf4b5807 MDEV-29693 ANALYZE TABLE still flushes table definition cache when engine-independent statistics is used
This commits enables reloading of engine-independent statistics
without flushing the table from table definition cache.

This is achieved by allowing multiple version of the
TABLE_STATISTICS_CB object and having independent pointers to it in
TABLE and TABLE_SHARE.  The TABLE_STATISTICS_CB object have reference
pointers and are freed when no one is pointing to it anymore.

TABLE's TABLE_STATISTICS_CB pointer is updated to use the
TABLE_SHARE's pointer when read_statistics_for_tables() is called at
the beginning of a query.

Main changes:
- read_statistics_for_table() will allocate an new TABLE_STATISTICS_CB
  object.
- All get_stat_values() functions has a new parameter that tells
  where collected data should be stored. get_stat_values() are not
  using the table_field object anymore to store data.
- All get_stat_values() functions returns 1 if they found any
  data in the statistics tables.

Other things:
- Fixed INSERT DELAYED to not read statistics tables.
- Removed Statistics_state from TABLE_STATISTICS_CB as this is not
  needed anymore as wer are not changing TABLE_SHARE->stats_cb while
  calculating or loading statistics.
- Store values used with store_from_statistical_minmax_field() in
  TABLE_STATISTICS_CB::mem_root. This allowed me to remove the function
  delete_stat_values_for_table_share().
  - Field_blob::store_from_statistical_minmax_field() is implemented
    but is not normally used as we do not yet support EIS statistics
    for blobs. For example Field_blob::update_min() and
    Field_blob::update_max() are not implemented.
    Note that the function can be called if there is an concurrent
    "ALTER TABLE MODIFY field BLOB" running because of a bug in
    ALTER TABLE where it deletes entries from column_stats
    before it has an exclusive lock on the table.
- Use result of field->val_str(&val) as a pointer to the result
  instead of val (safetly fix).
- Allocate memory for collected statistics in THD::mem_root, not in
  in TABLE::mem_root. This could cause the TABLE object to grow if a
  ANALYZE TABLE was run many times on the same table.
  This was done in allocate_statistics_for_table(),
  create_min_max_statistical_fields_for_table() and
  create_min_max_statistical_fields_for_table_share().
- Store in TABLE_STATISTICS_CB::stats_available which statistics was
  found in the statistics tables.
- Removed index_table from class Index_prefix_calc as it was not used.
- Added TABLE_SHARE::LOCK_statistics to ensure we don't load EITS
  in parallel. First thread will load it, others will reuse the
  loaded data.
- Eliminate read_histograms_for_table(). The loading happens within
  read_statistics_for_tables() if histograms are needed.
  One downside is that if we have read statistics without histograms
  before and someone requires histograms, we have to read all statistics
  again (once) from the statistics tables.
  A smaller downside is the need to call alloc_root() for each
  individual histogram. Before we could allocate all the space for
  histograms with a single alloc_root.
- Fixed bug in MyISAM and Aria where they did not properly notice
  that table had changed after analyze table. This was not a problem
  before this patch as then the MyISAM and Aria tables where flushed
  as part of ANALYZE table which did hide this issue.
- Fixed a bug in ANALYZE table where table->records could be seen as 0
  in collect_statistics_for_table(). The effect of this unlikely bug
  was that a full table scan could be done even if
  analyze_sample_percentage was not set to 1.
- Changed multiple mallocs in a row to use multi_alloc_root().
- Added a mutex protection in update_statistics_for_table() to ensure
  that several tables are not updating the statistics at the same time.

Some of the changes in sql_statistics.cc are based on a patch from
Oleg Smirnov <olernov@gmail.com>

Co-authored-by: Oleg Smirnov <olernov@gmail.com>
Co-authored-by: Vicentiu Ciorbaru <cvicentiu@gmail.com>
Reviewer: Sergei Petrunia <sergey@mariadb.com>
2023-08-18 13:28:39 +03:00
Alexander Barkov
75f25e4ca7 MDEV-30164 System variable for default collations
This patch adds a way to override default collations
(or "character set collations") for desired character sets.

The SQL standard says:
> Each collation known in an SQL-environment is applicable to one
> or more character sets, and for each character set, one or more
> collations are applicable to it, one of which is associated with
> it as its character set collation.

In MariaDB, character set collations has been hard-coded so far,
e.g. utf8mb4_general_ci has been a hard-coded character set collation
for utf8mb4.

This patch allows to override (globally per server, or per session)
character set collations, so for example, uca1400_ai_ci can be set as a
character set collation for Unicode character sets
(instead of compiled xxx_general_ci).

The array of overridden character set collations is stored in a new
(session and global) system variable @@character_set_collations and
can be set as a comma separated list of charset=collation pairs, e.g.:

SET @@character_set_collations='utf8mb3=uca1400_ai_ci,utf8mb4=uca1400_ai_ci';

The variable is empty by default, which mean use the hard-coded
character set collations (e.g. utf8mb4_general_ci for utf8mb4).

The variable can also be set globally by passing to the server startup command
line, and/or in my.cnf.
2023-07-17 14:56:17 +04:00
Marko Mäkelä
2867894ac6 Merge 11.1 into 11.2 2023-06-28 09:39:28 +03:00