mirror of https://github.com/MariaDB/server.git synced 2025-08-12 20:49:12 +03:00
Commit Graph

7813 Commits

Author SHA1 Message Date
Julius Goryavsky
3cd9f9d1b3 Merge branch '10.5' into '10.6' 2024-12-18 05:09:23 +01:00
Daniele Sciascia
d72c5d1ace Fixup for MDEV-35446
The previous commit for fixing MDEV-35446 disabled setting
Galera errors on COM_STMT_PREPARE commands.
As a side effect, a number of tests started to fail
due to the client receiving different error codes from the
ones expected in the test, depending on whether --ps-protocol
was used.
Also, in the case of test galera_ftwrl, it was found that
it is expected that during COM_STMT_PREPARE command, we
may perform a sync wait operation, which can fail with
LOCK_WAIT_TIMEOUT error.
The revised fix consists of moving the call to
wsrep_after_command_before_result() earlier, so that we check for
BF aborts or errors during statement prepare, before sending
the statement metadata message back to the client.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-12-17 09:52:32 +01:00
Kristian Nielsen
0166c89e02 Merge 10.5 -> 10.6
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-12-05 09:20:36 +01:00
Daniele Sciascia
85bcc7d263 MDEV-35446 Sporadic failure of galera.galera_insert_multi
Test failed sporadically when --ps-protocol was enabled:
a transaction that was BF aborted on COMMIT would succeed
instead of reporting the expected deadlock error.
The reason for the failure was that, depending on timing,
the transaction was BF aborted while the COMMIT statement
was being prepared through a COM_STMT_PREPARE command.
In the failing cases, the transaction was BF aborted
after COM_STMT_PREPARE had already disabled the diagnostics
area of the client. The attempt to override the deadlock error
towards the end of dispatch_command() would then be skipped,
resulting in a successful COMMIT even though the transaction
was aborted.
This bug affected the following MTR tests:
 - galera_insert_multi
 - galera_nopk_unicode

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-12-03 14:55:09 +01:00
Oleksandr Byelkin
f00711bba2 Merge branch '10.5' into 10.6 2024-10-29 14:20:03 +01:00
Jan Lindström
b3be3c2157 MDEV-30653 : With wsrep_mode=REPLICATE_ARIA only part of mixed-engine transactions is replicated
Replication of non-transactional engines is experimental and
uses TOI. This naturally means that if there is an open transaction
with a transactional engine, its changes will be rolled back.

Fixed by reporting an error, with a warning, if a non-transactional
engine is part of a multi-engine transaction.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-10-23 04:00:52 +02:00
Monty
bddbef3573 MDEV-34533 asan error about stack overflow when writing record in Aria
The problem was that when using clang + asan, we do not get a correct value
for the thread stack as some local variables are not allocated at the
normal stack.

It looks like, for example, clang 18.1.3, when compiling with
-O2 -fsanitize=address, puts local variables and things allocated by
alloca() in other areas than on the stack.

The following gdb session shows the issue:

Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
    (connect=0x5080000027b8,
    put_in_cache=<optimized out>) at sql/sql_connect.cc:1399

THD *thd;
1399      thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0

The address of thd is about 22 MiB (0x1600b60 bytes) away from the
stack pointer.

(gdb) info reg
...
rsp            0x7fffef4e7bc0      0x7fffef4e7bc0
...
r13            0x7fffedee7060      140737185214560

r13 points to the address of thd, probably in some kind of
"local stack" used by the sanitizer.

I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects were stored in a local heap,
not on the stack.

To solve this issue in a portable way, I have added two functions:

my_get_stack_pointer() returns the address of the current stack pointer.
The code is using asm instructions for intel 32/64 bit, powerpc,
arm 32/64 bit and sparc 32/64 bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64 bit we are using _AddressOfReturnAddress()

As a fallback for other compilers/arch we use the address of a local
variable.

my_get_stack_bounds() returns the address of the stack base and the
stack size using pthread_attr_getstack() or NtCurrentTeb(), with a
fallback to the address of a local variable and a user-provided
stack size.

Server changes are:

- Moving setting of thread_stack to THD::store_globals() using
  my_get_stack_bounds().
- Removing setting of thd->thread_stack, except in functions that
  allocate a lot on the stack before calling store_globals().  When
  using estimates for stack start, we reduce stack_size with
  MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
  before calling store_globals().
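
The fallback path described above can be sketched as follows. This is a
minimal illustration, not the server code: the names approx_stack_pointer,
stack_bytes_between, usable_stack_size and kStackSafeMargin are
hypothetical, and only the value 8192 comes from the commit message. The
real functions prefer inline asm or platform APIs where available.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Value taken from the commit message (MY_STACK_SAFE_MARGIN); the constant
// name used here is illustrative only.
constexpr std::size_t kStackSafeMargin = 8192;

// Fallback estimate of the stack pointer: the address of a local variable.
// Under a sanitizer this address may not lie on the real stack, which is
// exactly the problem the commit describes.
inline std::uintptr_t approx_stack_pointer() {
  volatile char marker = 0;
  return reinterpret_cast<std::uintptr_t>(&marker);
}

// Distance between two stack addresses, independent of growth direction.
inline std::size_t stack_bytes_between(std::uintptr_t a, std::uintptr_t b) {
  return static_cast<std::size_t>(a > b ? a - b : b - a);
}

// When the stack start is only an estimate, shrink the usable size by the
// safety margin to account for stack consumed before the estimate was taken.
inline std::size_t usable_stack_size(std::size_t stack_size,
                                     bool start_is_estimate) {
  if (!start_is_estimate) return stack_size;
  return stack_size > kStackSafeMargin ? stack_size - kStackSafeMargin : 0;
}
```

A caller would record approx_stack_pointer() once at thread start and later
compare stack_bytes_between(start, approx_stack_pointer()) against
usable_stack_size() before a deep recursion.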

I also added a unittest, stack_allocation-t, to verify the new code.

Reviewed-by: Sergei Golubchik <serg@mariadb.org>
2024-10-16 17:24:46 +03:00
Alexander Barkov
a931da82fa MDEV-34123 CONCAT Function Returns Unexpected Empty Set in Query
Search conditions were evaluated using val_int(), which was wrong.
Fixing the code to use val_bool() instead.

Details:
- Adding a new item_base_t::IS_COND flag which marks Items used
  as <search condition> in WHERE, HAVING, JOIN ON, CASE WHEN clauses.
  The flag is set at parse time.
  These expressions must be evaluated using val_bool() rather than val_int().

  Note, the optimizer creates more Items which are used as search conditions.
  Most of these items are not marked with IS_COND yet. This is OK for now,
  but eventually these Items can also be fixed to have the flag.

- Adding a method Item::is_cond() which tests if the Item has the IS_COND flag.

- Implementing Item_cache_bool. It evaluates the cached expression using
  val_bool() rather than val_int().
  Overriding Type_handler_bool::Item_get_cache() to create Item_cache_bool.

- Implementing Item::save_bool_in_field(). It uses val_bool() rather than
  val_int() to evaluate the expression.

- Implementing Type_handler_bool::Item_save_in_field()
  using Item::save_bool_in_field().

- Fixing all Item_bool_func descendants to implement a virtual val_bool()
  rather than a virtual val_int().

- To find places where val_int() should be fixed to val_bool(), a few
  DBUG_ASSERT(!is_cond()) were added into val_int() implementations
  of selected (most frequent) classes:

  Item_field
  Item_str_func
  Item_datefunc
  Item_timefunc
  Item_datetimefunc
  Item_cache_bool
  Item_bool_func
  Item_func_hybrid_field_type
  Item_basic_constant descendants

- Fixing all places where DBUG_ASSERT() happened during an "mtr" run
  to use val_bool() instead of val_int().
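
Why integer evaluation is the wrong tool for a search condition can be
sketched with a toy item type. NumericItem and the two condition_via_*
helpers are hypothetical and only illustrate the general hazard (integer
conversion can turn a non-zero value into 0); they are not the server's
Item classes or the exact code path of the reported bug.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for an expression item holding a numeric value.
struct NumericItem {
  double value;

  // Integer evaluation rounds to the nearest integer, so 0.3 becomes 0.
  long long val_int() const { return std::llround(value); }

  // Boolean evaluation only asks whether the value is non-zero.
  bool val_bool() const { return value != 0.0; }
};

// Old style: the condition is decided through integer conversion.
inline bool condition_via_int(const NumericItem &item) {
  return item.val_int() != 0;
}

// New style: the condition is decided through boolean evaluation.
inline bool condition_via_bool(const NumericItem &item) {
  return item.val_bool();
}
```

For the value 0.3 the two helpers disagree: the integer path treats the
condition as false, while the boolean path treats it as true.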
2024-10-08 11:58:46 +02:00
Marko Mäkelä
7e0afb1c73 Merge 10.5 into 10.6 2024-10-03 09:31:39 +03:00
sjaakola
cf0c3ec274 MDEV-30307 KILL command inside a transaction causes problem for galera replication
Added a new test scenario to the galera.galera_bf_kill
test to make the issue surface. The test scenario has
a multi-statement transaction containing a KILL command.
When the KILL is submitted, another transaction is
replicated, which causes a BF abort of the KILL command
processing. Handling the BF abort rollback while executing
the KILL command causes the node to hang in this scenario.

The sql_kill() and sql_kill_user() functions now
perform an implicit commit before starting the KILL command
execution. Because of the implicit commit, the KILL execution
no longer happens inside a transaction context.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-27 19:26:26 +02:00
Jan Lindström
a50a5e0f3b MDEV-34647 : 'INSERT...SELECT' on MyISAM table suddenly replicated by Galera
Replication of MyISAM and Aria DML is experimental and best
effort only. An earlier change made INSERT SELECT on both
MyISAM and Aria replicate using TOI and STATEMENT
replication. Replication should happen only if the user
has set the needed wsrep_mode setting.

Note: This commit contains additional changes compared
to those already made for the 10.5 branch.

+ small refactoring after main fix.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-02 00:13:05 +02:00
Julius Goryavsky
bac0804d81 Merge branch '10.5' into '10.6' 2024-09-01 06:51:25 +02:00
Julius Goryavsky
b65bbb2fae MDEV-34647: small refactoring after main fix 2024-08-30 21:50:33 +02:00
Jan Lindström
b1d74b7e72 MDEV-33997 : Assertion `((WSREP_PROVIDER_EXISTS_ && this->variables.wsrep_on) && wsrep_emulate_bin_log) || mysql_bin_log.is_open()' failed in int THD::binlog_write_row(TABLE*, bool, const uchar*)
The problem was that we did not detect that the table was
partitioned, in which case we must find the actual underlying
storage engine.

We should not use RSU for non-InnoDB tables.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-08-29 13:41:23 +02:00
Marko Mäkelä
757c368139 Merge 10.5 into 10.6 2024-08-14 10:56:11 +03:00
Jan Lindström
eb30a9d633 MDEV-34647 : 'INSERT...SELECT' on MyISAM table suddenly replicated by Galera
Replication of MyISAM and Aria DML is experimental and best
effort only. An earlier change made INSERT SELECT on both
MyISAM and Aria replicate using TOI and STATEMENT
replication. Replication should happen only if the user
has set the needed wsrep_mode setting.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-08-04 17:56:39 +02:00
Hugo Wen
811614d412 MDEV-34625 Fix undefined behavior of using uninitialized member variables
Commit a8a75ba2d causes the MariaDB server to crash, usually with signal
11, at random code locations due to invalid pointer values during any
table operation. This issue occurs when the server is built with -O3 and
other customized compiler flags.

For example, the command `use db1;` causes the server to crash in the
`check_table_access` function at line sql_parse.cc:7080 because
`tables->correspondent_table` is an invalid pointer value of 0x1.

The crashes are due to undefined behavior from using uninitialized
variables. The problematic commit a8a75ba2d introduces code that
allocates memory and sets it to 0 using thd->calloc before initializing
it with a placement new operation.
This process depends on setting memory to 0 to initialize member
variables not explicitly set in the constructor. However, the compiler
can optimize out the memset/bfill, leading to uninitialized values and
unpredictable issues.

Once a constructor function initializes an object, any uninitialized
variables within that object are subject to undefined behavior. The
state of memory before the constructor runs, whether it involves
memset or was used for other purposes, is irrelevant after the
placement new operation.

This behavior can be demonstrated with this
[test](https://gcc.godbolt.org/z/5n87z1raG) I wrote to examine the
assembly code. The code in MariaDB can be abstracted to the following,
though it has many layers wrapped around it and more complex logic,
causing slight differences in optimization in the MariaDB build.
To summarize, on x86, the memset in the following code is optimized out
with both -O2 and -O3 in GCC 13, and is only preserved in the much older
GCC 4.9.

    struct S {
      int i;     // uninitialized in constructor
      S() {};
    };
    int bar() {
      void *buf = malloc(sizeof(S));
      memset(buf, 0, sizeof(S));       // optimized out
      S* s = new(buf) S;
      return s->i;
    }

With GCC13 -O3:

    bar():
          sub     rsp, 8
          mov     edi, 4
          call    malloc
          mov     eax, DWORD PTR [rax]
          add     rsp, 8
          ret

With GCC4.9 -O3:

    bar():
          sub     rsp, 8
          mov     edi, 4
          call    malloc
          mov     DWORD PTR [rax], 0
          xor     eax, eax
          add     rsp, 8
          ret

Now we ensure the constructor initializes variables correctly by running
the reset() function in the constructor to perform the memset/bfill(0)
operation. After applying the fix, the crash is gone.
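
The shape of the fix can be sketched on the same abstracted struct. This is
an illustration under assumptions: the commit says reset() performs a
memset/bfill(0), whereas the sketch below assigns the member directly, and
read_after_placement_new is a hypothetical helper.

```cpp
#include <cassert>
#include <cstring>
#include <new>

struct S {
  int i;
  // The constructor establishes the initial state itself, so the result
  // no longer depends on what the memory held before placement new.
  S() { reset(); }
  void reset() { i = 0; }
};

inline int read_after_placement_new() {
  alignas(S) unsigned char buf[sizeof(S)];
  std::memset(buf, 0xFF, sizeof(buf));  // deliberately non-zero garbage
  S *s = new (buf) S;                   // constructor sets i to 0
  int result = s->i;
  s->~S();
  return result;
}
```

Because the member is written inside the constructor, the compiler cannot
treat its value as indeterminate after the placement new.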

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services.
2024-07-30 20:18:28 +01:00
Yuchen Pei
f071b7620b Merge branch '10.5' into 10.6 2024-07-16 15:54:22 +08:00
Daniel Black
cf1c381bb8 MDEV-34099: AddressSanitizer running out of memory regardless of stack_thread size
Address sanitizers know how to detect stack overruns, so there is
no point in us doing it.

This is evidenced by the perfschema tests, where significant test
failures occurred because this function failed under ASAN (MDEV-33210).

Also, since clang-16, we cannot assume much about how local
variables are allocated on the stack (MDEV-31605).

Disabling check idea thanks to Sanja.
2024-07-15 18:02:49 +01:00
Alexander Barkov
e56040fee8 Merge remote-tracking branch 'origin/10.5' into 10.6 2024-07-08 18:59:04 +04:00
Anson Chung
215fab68db Perform simple fixes for cppcheck findings
Rectify cases of mismatched brackets and address
possible cases of division by zero by checking if
the denominator is zero before dividing.

No functional changes were made.
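
The denominator check mentioned above follows a common pattern, sketched
here with a hypothetical helper (checked_div is not a function from the
source tree). Signed overflow cases such as LLONG_MIN / -1 are not handled
by this sketch.

```cpp
#include <cassert>

// Hypothetical helper: returns `fallback` instead of dividing by zero.
inline long long checked_div(long long numerator, long long denominator,
                             long long fallback) {
  if (denominator == 0) return fallback;  // guard before dividing
  return numerator / denominator;
}
```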

All new code of the whole pull request, including one or several
files that are either new files or modified ones, are contributed
under the BSD-new license. I am contributing on behalf of my
employer Amazon Web Services, Inc.
2024-07-08 10:51:48 +01:00
Yuchen Pei
d7042ec4da Merge branch '10.5' into 10.6 2024-06-26 09:16:54 +08:00
Rex
d513a4ce74 MDEV-19520 Extend condition normalization to include 'NOT a'
Having Item_func_not items in item trees breaks assumptions during the
optimization phase about transformation possibilities in fix_fields().
Remove Item_func_not by extending normalization during parsing.

Reviewed by Oleksandr Byelkin (sanja@mariadb.com)
2024-06-25 04:51:29 +11:00
Marko Mäkelä
0076eb3d4e Merge 10.5 into 10.6 2024-06-24 13:09:47 +03:00
Marko Mäkelä
d9dd673fee MDEV-12008 fixup: Do not add a new error code
New error codes can only be added in the latest major version.
Adding ER_KILL_DENIED_HIGH_PRIORITY would shift by one all
error codes that were added in MariaDB Server 10.6 or later.
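
The shift can be illustrated with two hypothetical enumerations. The names
and the starting value 4000 are made up for the example; only the mechanism
(sequential numbering, so an insertion renumbers everything after it) is
what the commit message describes.

```cpp
#include <cassert>

// Hypothetical numbering in the newer release series, where a code was
// already added after ER_EXISTING_B.
namespace newer_series {
enum Error { ER_EXISTING_A = 4000, ER_EXISTING_B, ER_NEWER_ONLY };
}

// The same list after a new code is inserted in the older series and
// merged up: everything after the insertion point moves by one.
namespace after_insertion {
enum Error { ER_EXISTING_A = 4000, ER_EXISTING_B, ER_INSERTED, ER_NEWER_ONLY };
}

static_assert(newer_series::ER_NEWER_ONLY == 4002, "original number");
static_assert(after_insertion::ER_NEWER_ONLY == 4003, "shifted by one");
```

Clients that match on the numeric value of the later code would silently
see a different error after such a merge.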

This amends commit 1001dae186

Suggested by: Sergei Golubchik
2024-06-24 12:08:13 +03:00
Dave Gosselin
db0c28eff8 MDEV-33746 Supply missing override markings
Find and fix missing virtual override markings.  Updates cmake
maintainer flags to include -Wsuggest-override and
-Winconsistent-missing-override.
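
What the marking buys can be sketched with two hypothetical classes
(Handler and CappedHandler are not from the source tree):

```cpp
#include <cassert>

struct Handler {
  virtual ~Handler() = default;
  virtual int rows(int limit) const { return limit; }
};

struct CappedHandler : Handler {
  // 'override' makes the compiler verify that this matches a base-class
  // virtual. Dropping 'const' here by mistake would then be a compile
  // error instead of silently declaring an unrelated function.
  int rows(int limit) const override { return limit < 10 ? limit : 10; }
};

// Calls through the base interface, so the override must really override.
inline int rows_via_base(const Handler &h, int limit) { return h.rows(limit); }
```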
2024-06-20 11:32:13 -04:00
Jan Lindström
1001dae186 MDEV-12008 : Change error code for Galera unkillable threads
Changed the error code for Galera unkillable threads to
ER_KILL_DENIED_HIGH_PRIORITY, with the message:

This is a high priority thread/query and cannot be killed
without compromising the consistency of the cluster

A warning is also produced:
  Thread %lld is [wsrep applier|high priority] and cannot be killed

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-06-19 14:07:34 +02:00
Marko Mäkelä
27834ebc91 Merge 10.5 into 10.6 2024-06-10 15:22:15 +03:00
Alexander Barkov
21f56583bf MDEV-32376 SHOW CREATE DATABASE statement crashes the server when db name contains some unicode characters, ASAN stack-buffer-overflow
Adding a check of the length of lex->name to show_create_db().

Without this check, writes beyond the end of db_name_buff were possible
with an overly long database name.
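
A length check of this kind can be sketched as below. The helper
copy_name_checked is hypothetical and is not the code added to
show_create_db(); it only shows the pattern of rejecting a name that does
not fit before writing into a fixed-size buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies `name` into `buf` only if it fits together with the terminating
// NUL; otherwise leaves `buf` untouched and reports failure.
inline bool copy_name_checked(char *buf, std::size_t buf_size,
                              const char *name, std::size_t name_len) {
  if (buf_size == 0 || name_len >= buf_size) return false;
  std::memcpy(buf, name, name_len);
  buf[name_len] = '\0';
  return true;
}
```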
2024-06-10 09:31:14 +04:00
Julius Goryavsky
0d85c905c4 MDEV-34269: post-fix code simplification
The code is slightly simplified taking into account
the fact that partition_ht() always returns a normal
hton when there is no partitioning.
2024-06-07 18:26:08 +02:00
Jan Lindström
0172887980 MDEV-34269 : 10.11.8 cluster becomes inconsistent when using composite primary key and partitioning
This is a regression from commit 3228c08fa8. The problem is that
when the table's storage engine is determined, there should be a
check whether the table is partitioned and, if it is, the storage
engine implementing the partitions should be determined.

The reported bug is reproducible only with --log-bin, so make
sure the tests changed by 3228c08fa8 and the new test are run
with --log-bin and with binlog disabled.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-06-07 18:26:08 +02:00
Marko Mäkelä
5ba542e9ee Merge 10.5 into 10.6 2024-05-30 14:27:07 +03:00
Vladislav Vaintroub
736449d30f MDEV-34205: ASAN stack buffer overflow in strxnmov() in frm_file_exists
Correct the second parameter for strxnmov to prevent potential buffer
overflows. The second parameter must be one less than the size of the
input buffer to avoid writing past the end of the buffer.

While the second parameter is usually correct, there are exceptions
that need fixing.

This commit addresses the issue within frm_file_exists() and other
affected places.
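
The off-by-one contract can be sketched with a hypothetical helper that
follows the same rule as the one described above: it writes at most max_len
characters and then a terminating NUL, so the destination must hold
max_len + 1 bytes. concat_max is an illustration only, not strxnmov()
itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Concatenates a and b into dst, writing at most max_len characters plus
// a terminating NUL. Returns a pointer to the terminating NUL.
inline char *concat_max(char *dst, std::size_t max_len, const char *a,
                        const char *b) {
  char *end = dst + max_len;
  const char *parts[] = {a, b};
  for (const char *src : parts) {
    while (*src != '\0' && dst < end) *dst++ = *src++;
  }
  *dst = '\0';  // this write is why the caller passes sizeof(buf) - 1
  return dst;
}
```

With an 8-byte buffer the caller passes sizeof(buf) - 1, which leaves room
for the NUL; passing sizeof(buf) would let the terminator land one byte
past the end.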
2024-05-23 22:08:27 +02:00
Alexander Barkov
310fd6ff69 Backporting bug fixes made by MDEV-31340 from 11.5
The patch for MDEV-31340 fixed the following bugs:

MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0
MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0
MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0
MDEV-33088 Cannot create triggers in the database `MYSQL`
MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0
MDEV-33108 TABLE_STATISTICS and INDEX_STATISTICS are case insensitive with lower-case-table-names=0
MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0
MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0
MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS
MDEV-33120 System log table names are case insensitive with lower-case-table-names=0

Backporting the fixes from 11.5 to 10.5
2024-05-21 14:58:01 +04:00
Marko Mäkelä
829cb1a49c Merge 10.5 into 10.6 2024-04-17 14:14:58 +03:00
Dave Gosselin
a8a75ba2d0 Factor TABLE_LIST creation from add_table_to_list
Ideally our methods and functions should do one thing, do that well,
and do only that.  add_table_to_list does far more than adding a
table to a list, so this commit factors the TABLE_LIST creation out
to a new TABLE_LIST constructor.  It then uses placement new()
to create it in the correct memory area (result of thd->calloc).
Benefits of this approach:
 1. add_table_to_list now returns as early as possible on an error
 2. fewer side-effects incurred on creating the TABLE_LIST object
 3. TABLE_LIST won't be calloc'd if copy_to_db fails
 4. local declarations moved closer to their respective first uses
 5. improved code readability and logical flow
Also factored a couple of other functions to keep the happy path
more to the left, which makes them easier to follow at a glance.
2024-04-16 10:09:43 -04:00
Oleksandr Byelkin
9b18275623 Merge branch '10.4' into 10.5 2024-04-16 11:04:14 +02:00
Kristian Nielsen
16aa4b5f59 Merge from 10.4 to 10.5
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-04-15 17:46:49 +02:00
Daniele Sciascia
c71dc39529 MDEV-26499 Fix error "mysql_shutdown failed" during MTR tests
- Fix to avoid mysqltest client getting killed abruptly during
  mysql_shutdown(). When Galera replication is shut down, wait for
  THDs with `thd->stmt_da()->is_eof()` to disconnect (these are about
  to disconnect anyway).
- Extract duplicate code from `wsrep_stop_replication()` and
  `wsrep_shutdown_replication()` in a new function.
- No need to use a custom `shutdown_mysqld.inc` in galera
  suite. Delete it, so that the one in `mysql-test/include/` is used.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-03-27 04:31:45 +01:00
Kristian Nielsen
ef7abc881c MDEV-10793: MDEV-33292: main.kill_processlist-6619 fails sporadically in buildbot
There were several races in the main.kill_processlist-6619 testcase:

 - Lingering connections from a previous test case could be visible in SHOW
   PROCESSLIST and cause .result diff.
 - A sync point "dispatch_command_end" was ineffective, as it was consumed at
   the end of the SET DEBUG command itself.
 - The signal from sync point "before_execute_sql_command" could override an
   earlier signal, causing DEBUG_SYNC timeout and test failure.
 - The final SHOW PROCESSLIST could occasionally see a connection in state
   "Busy" instead of the expected "Sleep".

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-03-14 22:48:12 +01:00
Dmitry Shulga
d7758debae MDEV-33218: Assertion `active_arena->is_stmt_prepare_or_first_stmt_execute() || active_arena->state == Query_arena::STMT_SP_QUERY_ARGUMENTS' failed in st_select_lex::fix_prepare_information
In case there is a view that queried from a stored routine or
a prepared statement and this temporary table is dropped between
executions of SP/PS, then it leads to hitting an assertion
at the SELECT_LEX::fix_prepare_information. The fired assertion
was added by the commit 85f2e4f8e8
(MDEV-32466: Potential memory leak on executing of create view statement).
Firing of this assertion means a memory leak on execution of SP/PS.
Moreover, if the added assert is commented out, different result sets
can be produced by the statement SELECT * FROM the hidden table.

Both hitting the assertion and different result sets have the same root
cause. This cause is usage of temporary table's metadata after the table
itself has been dropped. To fix the issue, reload the cache of stored
routines. To do so, the cache of stored routines is reset at the end of
execution of the function dispatch_command(). The next time any stored
routine is called, it will be loaded from the table mysql.proc. This
happens inside the method Sp_handler::sp_cache_routine, where loading of
a stored routine is performed in case it is missing from the cache.
Loading is performed unconditionally, while previously it was controlled
by the parameter lookup_only. For that reason the signature of the method
Sroutine_hash_entry::sp_cache_routine was changed by removing the unused
parameter lookup_only.
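
The reset-then-reload behaviour can be sketched with a small cache class.
RoutineCache and its loader callback are hypothetical; in the server the
definitions come from mysql.proc and the cache is reset at the end of
dispatch_command(), which this sketch does not model.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

class RoutineCache {
 public:
  using Loader = std::function<std::string(const std::string &)>;

  explicit RoutineCache(Loader loader) : loader_(std::move(loader)) {}

  // Returns the cached definition, loading it unconditionally on a miss.
  const std::string &get(const std::string &name) {
    auto it = cache_.find(name);
    if (it == cache_.end()) {
      ++loads_;
      it = cache_.emplace(name, loader_(name)).first;
    }
    return it->second;
  }

  // Drops every entry so the next get() reloads fresh definitions.
  void clear() { cache_.clear(); }

  int loads() const { return loads_; }

 private:
  Loader loader_;
  std::unordered_map<std::string, std::string> cache_;
  int loads_ = 0;
};
```

After clear(), the first lookup of any routine goes back to the loader, so
metadata captured before a table was dropped cannot be reused.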

Clearing of sp caches affects the test main.lock_sync since it forces
opening and locking the table mysql.proc but the test assumes that each
statement locks its tables once during its execution. To keep this invariant
the debug sync points with names "before_lock_tables_takes_lock" and
"after_lock_tables_takes_lock" are not activated on handling the table
mysql.proc
2024-03-14 15:43:03 +07:00
Monty
9a132d423a MDEV-33620 Improve times and states in show processlist for replication
This makes it easier to find out what replication workers are
doing and what they are waiting for.

Things changed in processlist:
- Slave_SQL time was not consistent. Now time for state "Slave has
  read all relay log; waiting for more updates" shows how long it has
  waited for getting the next event.
- Slave_worker threads often showed "Closing tables" for a long
  time.  Now the state is reverted to the previous state after
  "Closing tables" is done.
- Commit and Rollback states were not shown for replication (and some
  other threads). Now Commit and Rollback states are always shown and
  the state is reverted to the previous state when the Commit/Rollback
  has finished.

Code changes:
- Added thd->set_time_for_next_stage() for parallel replication when
  starting to wait for prior transactions to commit, group commit,
  FTWRL, and free space in the thread pool.
  Previously we reset the time only after the above events.
- Moved THD_STAGE_INFO(stage_rollback) and THD_STAGE_INFO(stage_commit)
  from sql_parse.cc to transaction.cc to ensure this is done for
  all commits and not only 'normal connection queries'.

Test case changes:
- close_thread_tables() reverting the stage to the previous stage caused
  the counter in performance_schema to be increased. In many cases it is
  the 'sql/starting' stage that was affected.
- We only change to the "Commit" stage if there is a need for a commit.
  This caused some "Commit" stages to disappear from perfschema reports.

TODO in 11.#:
- Slave_IO always shows "Waiting for master to send event" and the time is
  from SLAVE START. In 11.# we should change this to be the time since
  reading the last event.
2024-03-08 15:23:17 +02:00
Marko Mäkelä
691f923906 Merge 10.5 into 10.6 2024-02-13 20:42:59 +02:00
Marko Mäkelä
8ec12e0d6d Merge 10.4 into 10.5 2024-02-12 11:38:13 +02:00
Jan Lindström
3228c08fa8 MDEV-22063 : Assertion `0' failed in wsrep::transaction::before_rollback
The problem was that REPLACE was using a consistency check that started
TOI and we tried to roll it back.

Do not use wsrep_before_rollback and wsrep_after_rollback if
we are running a consistency check, because no writeset keys are
added in that case. Do not allow consistency check usage
if the storage engine of the target table is not InnoDB; instead
give a warning. REPLACE|SELECT INTO ... SELECT will now use
TOI if the storage engine of the target table is not InnoDB,
to maintain consistency between Galera nodes.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-01-29 06:34:46 +01:00
Monty
2fcb5d651b Fixed possible mutex-wrong-order with KILL USER
The old code collected a list of THDs, protected the THDs from being
deleted by locking two mutexes, and then later in a separate loop
sent a kill signal to each THD.

The problem with this approach is that, as THDs can be reused,
the second time the THD is killed, the mutexes can be taken in
a different order, which signals failures in safe_mutex.

Fixed by sending the kill signal directly rather than collecting the
THDs in a list to be signaled later.  This is the same approach we are
using in kill_zombie_dump_threads().
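
The direct-signal approach can be sketched with standard mutexes. Session,
SessionList and kill_user are hypothetical stand-ins for THD, the server's
thread list and the KILL USER code; the point of the sketch is only that
every session is signalled while the list is walked, always taking the list
lock first and the per-session lock second.

```cpp
#include <cassert>
#include <list>
#include <mutex>
#include <string>
#include <utility>

struct Session {
  explicit Session(std::string u) : user(std::move(u)) {}
  std::string user;
  bool killed = false;
  std::mutex data_lock;  // protects `killed`
};

struct SessionList {
  std::mutex list_lock;  // protects `sessions`
  std::list<Session> sessions;
};

// Signals each matching session during the walk instead of collecting
// them for a later loop, so the lock order is the same on every call.
inline int kill_user(SessionList &all, const std::string &user) {
  int signalled = 0;
  std::lock_guard<std::mutex> list_guard(all.list_lock);
  for (Session &s : all.sessions) {
    if (s.user != user) continue;
    std::lock_guard<std::mutex> session_guard(s.data_lock);
    s.killed = true;
    ++signalled;
  }
  return signalled;
}
```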

Other things:
- Reset safe_mutex_t->locked_mutex when freed (Safety fix)
2024-01-23 13:03:11 +02:00
Michael Widenius
7af50e4df4 MDEV-32551: "Read semi-sync reply magic number error" warnings on master
rpl_semi_sync_slave_enabled_consistent.test and the first part of
the commit message comes from Brandon Nesterenko.

A test to show how to induce the "Read semi-sync reply magic number
error" message on a primary. In short, if semi-sync is turned on
during the hand-shake process between a primary and replica, but
later a user negates the rpl_semi_sync_slave_enabled variable while
the replica's IO thread is running; if the IO thread exits, the
replica can skip a necessary call to kill_connection() in
repl_semisync_slave.slave_stop() due to its reliance on a global
variable. Then, the replica will send a COM_QUIT packet to the
primary on an active semi-sync connection, causing the magic number
error.

The test in this patch exits the IO thread by forcing an error;
though note a call to STOP SLAVE could also do this, but it ends up
needing more synchronization. That is, the STOP SLAVE command also
tries to kill the VIO of the replica, which makes a race with the IO
thread to try and send the COM_QUIT before this happens (which would
need more debug_sync to get around). See THD::awake_no_mutex for
details as to the killing of the replica’s vio.

Notes:
- The MariaDB documentation does not make it clear that when one
  enables semi-sync replication it does not matter if one enables
  it first in the master or slave. Any order works.

Changes done:
- The rpl_semi_sync_slave_enabled variable is now a default value for
  when semisync is started. The variable no longer affects
  semisync if it is already running. This fixes the original reported
  bug.  Internally we now use repl_semisync_slave.get_slave_enabled()
  instead of rpl_semi_sync_slave_enabled. To check if semisync is
  active one should check the @@rpl_semi_sync_slave_status variable (as
  before).
- The semisync protocol conflicts with the way that the original
  MySQL/MariaDB client-server protocol was designed (client-server
  send and reply packets are strictly ordered and include a packet
  number to allow one to check if a packet is lost). When using
  semi-sync the master and slave can send packets at 'any time', so
  packet numbering does not work. The 'solution' has been that each
  communication starts with packet number 1, but in some cases there
  is still a chance that the packet number check can fail.  Fixed by
  adding a flag (pkt_nr_can_be_reset) in the NET struct that one can
  use to signal that packet number checking should not be done. This
  flag is set when semi-sync is used.
- Added Master_info::semi_sync_reply_enabled to allow one to configure
  some slaves with semisync and other slaves without semisync.
  Removed global variable semi_sync_need_reply that would not work
  with multi-master.
- Repl_semi_sync_master::report_reply_packet() can now recognize
  the COM_QUIT packet from semisync slave and not give a
  "Read semi-sync reply magic number error" error for this case.
  The slave will be removed from the Ack listener.
- On Windows, don't stop semisync Ack listener just because one
  slave connection is using socket_id > FD_SETSIZE.
- Removed busy loop in Ack_receiver::run() by using
 "Self-pipe trick" to signal new slave and stop Ack_receiver.
- Changed some Repl_semi_sync_slave functions that always return 0
  from int to void.
- Added Repl_semi_sync_slave::slave_reconnect().
- Removed dummy_function Repl_semi_sync_slave::reset_slave().
- Removed some duplicate semisync notes from the error log.
- Add test of "if (get_slave_enabled() && semi_sync_need_reply)"
  before calling Repl_semi_sync_slave::slave_reply().
  (Speeds up the code as we can skip all initializations).
- If repl_semisync_slave.slave_reply() fails, we disable semisync
  for that connection.
- We do not call semisync.switch_off() if there are no active slaves.
  Instead we check in Repl_semi_sync_master::commit_trx() if there are
  no active threads. This simplifies the code.
- Changed assert() to DBUG_ASSERT() to ensure that the DBUG log is
  flushed in case of asserts.
- Removed the internal rpl_semi_sync_slave_status as it is not needed
  anymore. The @@rpl_semi_sync_slave_status status variable is now
  mapped to rpl_semi_sync_enabled.
- Removed rpl_semi_sync_slave_enabled  as it is not needed anymore.
  Repl_semi_sync_slave::get_slave_enabled() contains the active status.
- Added checking that we do not add a slave twice with
  Ack_receiver::add_slave(). This could happen with old code.
- Removed Repl_semi_sync_master::check_and_switch() as it is not
  needed anymore.
- Ensure that when we call Ack_receiver::remove_slave() that the slave
  is removed from the listener before function returns.
- Call listener.listen_on_sockets() outside of mutex for better
  performance and less contested mutex.
- Ensure that listening is ignoring newly added slaves when checking for
  responses.
- Fixed so that the master ack_receiver listener is not killed if there
  are no connected slaves (which would stop semisync handling of future
  connections). This could happen if all slave sockets were
  marked as unreliable.
- Added unlink() to base_ilist_iterator and remove() to
  I_List_iterator. This enables us to remove 'dead' slaves in
  Ack_receiver::run().
- kill_zombie_dump_threads() now does killing of dump threads properly.
  - It can now kill several threads (should be impossible but could
    happen if IO slaves reconnect very fast).
  - We now wait until the dump thread is done before starting the
    dump.
- Added an error if kill_zombie_dump_threads() fails.
- Set thd->variables.server_id before calling
  kill_zombie_dump_threads(). This simplifies the code.
- Added a lot of comments both in code and tests.
- Removed DBUG_EVALUATE_IF "failed_slave_start" as it is not used.
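
The packet-number item in the list above can be sketched as follows. Only
the flag name pkt_nr_can_be_reset comes from the commit message; NetState,
accept_packet and the resynchronisation step are assumptions made for
illustration and do not reproduce the server's NET struct or its read path.

```cpp
#include <cassert>

struct NetState {
  unsigned char expected_pkt_nr = 0;
  bool pkt_nr_can_be_reset = false;  // flag name from the commit message
};

// Returns true if the packet is accepted. In strict mode an unexpected
// packet number is an error; with the flag set, the check is skipped and
// the expected number is resynchronised to the received one.
inline bool accept_packet(NetState &net, unsigned char received_pkt_nr) {
  if (received_pkt_nr != net.expected_pkt_nr && !net.pkt_nr_can_be_reset) {
    return false;  // strict ordering: treat as a lost or reordered packet
  }
  net.expected_pkt_nr = static_cast<unsigned char>(received_pkt_nr + 1);
  return true;
}
```

A regular client connection keeps the flag cleared and so retains the lost
packet detection, while a semi-sync connection sets it and tolerates
packets that restart their numbering.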

Test changes:
- rpl.rpl_session_var2 added which runs rpl.rpl_session_var test with
  semisync enabled.
- Some timings changed slightly with the startup of the slave, which caused
  rpl_binlog_dump_slave_gtid_state_info.test to fail as it checked the
  error log file before the slave had started properly. Fixed by
  adding wait_for_pattern_in_file.inc that allows waiting for the
  pattern to appear in the log file.
- Tests have been updated so that we first set
  rpl_semi_sync_master_enabled on the master and then set
  rpl_semi_sync_slave_enabled on the slaves (this follows how
  the MariaDB documentation describes setting up semi-sync).
- Error text "Master server does not have semi-sync enabled" has been
  replaced with "Master server does not support semi-sync" for the
  case when the master supports semi-sync but semi-sync is not
  enabled.

Other things:
- Some trivial cleanups in Repl_semi_sync_master::update_sync_header().
- In 11.3 we should change the default value for
  rpl-semi-sync-master-wait-no-slave from TRUE to FALSE as TRUE
  does not make much sense as a default. The main difference with using
  FALSE is that we do not wait for a semisync Ack if there are no slave
  threads.  In the case of TRUE we wait once, which did not bring any
  notable benefits except slower startup of a master configured for
  using semisync.

Co-author: Brandon Nesterenko <brandon.nesterenko@mariadb.com>

This solves the problem reported in MDEV-32960 where a new
slave may not be registered in time and the master disables
semi sync because of that.
2024-01-23 13:03:11 +02:00
Sergei Golubchik
e95bba9c58 Merge branch '10.5' into 10.6 2023-12-17 11:20:43 +01:00
Sergei Golubchik
98a39b0c91 Merge branch '10.4' into 10.5 2023-12-02 01:02:50 +01:00
Monty
83214c3406 Improve reporting from sf_report_leaked_memory()
Other things:
- Added DBUG_EXECUTE_IF("print_allocated_thread_memory") at end of query
  to more easily find unfreed memory allocated by THD
- Removed free_root() from plugin_init() that did nothing.
2023-11-27 19:08:14 +02:00