This is actually an existing problem in the old binlog implementation, and
this patch is applicable to old binlog also. The problem is that RESET
MASTER can run concurrently with binlog dump threads / connected slaves.
This will remove the binlog from under the feet of the reader, which can
cause all sorts of strange behaviour.
This patch fixes the problem by disallowing to run RESET MASTER when dump
threads (or other RESET MASTER or SHOW BINARY LOGS) are running. An error is
thrown in this case, user must stop slaves and/or kill dump threads to make
the RESET MASTER go through. A slave that connects in the middle of RESET
MASTER will wait for it to complete.
Fix a lot of test cases to kill any lingering dump threads before doing
RESET MASTER, mostly just by sourcing include/kill_binlog_dump_threads.inc.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
DROP TABLE on child and UPDATE of parent table can cause an MDL BF-BF
conflict when applied concurrently.
DROP TABLE takes MDL locks on both child and its parent table, however
it only it did not add certification keys for the parent table.
This patch adds the following:
* Append certification keys corresponding to all parent tables
before DROP TABLE replication.
* Fix wsrep_append_fk_parent_table() so that it works when it is
given a table list containing temporary tables.
* Make sure function wsrep_append_fk_parent_table() is only called
for local transaction. That was not the case for ALTER TABLE.
* Add a test case that verifies that UPDATE parent depends on
preceeding DROP TABLE child.
* Adapt galera_ddl_fk_conflict test to work with DROP TABLE as well.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
The previous commit for fixing MDEV-35446 disabled setting
Galera errors on COM_STMT_PREPARE commands.
As a side effect, a number of tests were started to fail
due to the client receiving different error codes from the
ones expected in the test dependending on whether --ps-protocol
was used.
Also, in the case of test galera_ftwrl, it was found that
it is expected that during COM_STMT_PREPARE command, we
may perform a sync wait operation, which can fail with
LOCK_WAIT_TIMEOUT error.
The revised fix consists in anticipating the call to
wsrep_after_command_before_result(), so that we check for
BF aborts or errors during statement prepare, before sending
back the statement metadata message to client.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Test failed sporadically when --ps-protocol was enabled:
a transaction that was BF aborted on COMMIT would succeed
instead of reporting the expected deadlock error.
The reason for the failure was that, depending on timing,
the transaction was BF aborted while the COMMIT statement
was being prepared through a COM_STMT_PREPARE command.
In the failing cases, the transaction was BF aborted
after COM_STMT_PREPARE had already disabled the diagnostics
area of the client. Attempt to override the deadlock error
towards the end of dispatch_command() would be skipped,
resulting in a successful COMMIT even if the transaction
is aborted.
This bug affected the following MTR tests:
- galera_insert_multi
- galera_nopk_unicode
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Replication of non-transactional engines is experimental and
uses TOI. This naturally means that if there is open transaction
with transactional engine it's changes will be rolled back.
Fixed by adding error message if non-transactional engine
is part of multi-engine transaction with warning.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
The problem was that when using clang + asan, we do not get a correct value
for the thread stack as some local variables are not allocated at the
normal stack.
It looks like that for example clang 18.1.3, when compiling with
-O2 -fsanitize=addressan it puts local variables and things allocated by
alloca() in other areas than on the stack.
The following code shows the issue
Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
(connect=0x5080000027b8,
put_in_cache=<optimized out>) at sql/sql_connect.cc:1399
THD *thd;
1399 thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0
The address of thd is 24M away from the stack pointer
(gdb) info reg
...
rsp 0x7fffef4e7bc0 0x7fffef4e7bc0
...
r13 0x7fffedee7060 140737185214560
r13 is pointing to the address of the thd. Probably some kind of
"local stack" used by the sanitizer
I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects was stored in a local heap,
not on the stack.
To solve this issue in a portable way, I have added two functions:
my_get_stack_pointer() returns the address of the current stack pointer.
The code is using asm instructions for intel 32/64 bit, powerpc,
arm 32/64 bit and sparc 32/64 bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64 bit we are using _AddressOfReturnAddress()
As a fallback for other compilers/arch we use the address of a local
variable.
my_get_stack_bounds() that will return the address of the base stack
and stack size using pthread_attr_getstack() or NtCurrentTed() with
fallback to using the address of a local variable and user provided
stack size.
Server changes are:
- Moving setting of thread_stack to THD::store_globals() using
my_get_stack_bounds().
- Removing setting of thd->thread_stack, except in functions that
allocates a lot on the stack before calling store_globals(). When
using estimates for stack start, we reduce stack_size with
MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
before calling store_globals().
I also added a unittest, stack_allocation-t, to verify the new code.
Reviewed-by: Sergei Golubchik <serg@mariadb.org>
Search conditions were evaluated using val_int(), which was wrong.
Fixing the code to use val_bool() instead.
Details:
- Adding a new item_base_t::IS_COND flag which marks Items used
as <search condition> in WHERE, HAVING, JOIN ON, CASE WHEN clauses.
The flag is at the parse time.
These expressions must be evaluated using val_bool() rather than val_int().
Note, the optimizer creates more Items which are used as search conditions.
Most of these items are not marked with IS_COND yet. This is OK for now,
but eventually these Items can also be fixed to have the flag.
- Adding a method Item::is_cond() which tests if the Item has the IS_COND flag.
- Implementing Item_cache_bool. It evaluates the cached expression using
val_bool() rather than val_int().
Overriding Type_handler_bool::Item_get_cache() to create Item_cache_bool.
- Implementing Item::save_bool_in_field(). It uses val_bool() rather than
val_int() to evaluate the expression.
- Implementing Type_handler_bool::Item_save_in_field()
using Item::save_bool_in_field().
- Fixing all Item_bool_func descendants to implement a virtual val_bool()
rather than a virtual val_int().
- To find places where val_int() should be fixed to val_bool(), a few
DBUG_ASSERT(!is_cond()) where added into val_int() implementations
of selected (most frequent) classes:
Item_field
Item_str_func
Item_datefunc
Item_timefunc
Item_datetimefunc
Item_cache_bool
Item_bool_func
Item_func_hybrid_field_type
Item_basic_constant descendants
- Fixing all places where DBUG_ASSERT() happened during an "mtr" run
to use val_bool() instead of val_int().
Added new test scenario in galera.galera_bf_kill
test to make the issue surface. The tetst scenario has
a multi statement transaction containing a KILL command.
When the KILL is submitted, another transaction is
replicated, which causes BF abort for the KILL command
processing. Handling BF abort rollback while executing
KILL command causes node hanging, in this scenario.
sql_kill() and sql_kill_user() functions have now fix,
to perform implicit commit before starting the KILL command
execution. BEcause of the implicit commit, the KILL execution
will not happen inside transaction context anymore.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Replication of MyISAM and Aria DML is experimental and best
effort only. Earlier change make INSERT SELECT on both
MyISAM and Aria to replicate using TOI and STATEMENT
replication. Replication should happen only if user
has set needed wsrep_mode setting.
Note: This commit contains additional changes compared
to those already made for the 10.5 branch.
+ small refactoring after main fix.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Problem was that we did not found that table was partitioned
and then we should find what is actual underlaying storage
engine.
We should not use RSU for !InnoDB tables.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Replication of MyISAM and Aria DML is experimental and best
effort only. Earlier change make INSERT SELECT on both
MyISAM and Aria to replicate using TOI and STATEMENT
replication. Replication should happen only if user
has set needed wsrep_mode setting.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Commit a8a75ba2d causes the MariaDB server to crash, usually with signal
11, at random code locations due to invalid pointer values during any
table operation. This issue occurs when the server is built with -O3 and
other customized compiler flags.
For example, the command `use db1;` causes server to crash in the
`check_table_access` function at line sql_parse.cc:7080 because
`tables->correspondent_table` is an invalid pointer value of 0x1.
The crashes are due to undefined behavior from using uninitialized
variables. The problematic commit a8a75ba2d introduces code that
allocates memory and sets it to 0 using thd->calloc before initializing
it with a placement new operation.
This process depends on setting memory to 0 to initialize member
variables not explicitly set in the constructor. However, the compiler
can optimize out the memset/bfill, leading to uninitialized values and
unpredictable issues.
Once a constructor function initializes an object, any uninitialized
variables within that object are subject to undefined behavior. The
state of memory before the constructor runs, whether it involves
memset or was used for other purposes, is irrelevant after the
placement new operation.
This behavior can be demonstrated with this
[test](https://gcc.godbolt.org/z/5n87z1raG) I wrote to examine the
assembly code. The code in MariaDB can be abstracted to the following,
though it has many layers wrapped around it and more complex logic,
causing slight differences in optimization in the MariaDB build.
To summarize, on x86, the memset in the following code is optimized out
with both -O2 and -O3 in GCC 13, and is only preserved in the much older
GCC 4.9.
struct S {
int i; // uninitialized in consturctor
S() {};
};
int bar() {
void *buf = malloc(sizeof(S));
memset(buf, 0, sizeof(S)); // optimized out
S* s = new(buf) S;
return s->i;
}
With GCC13 -O3:
bar():
sub rsp, 8
mov edi, 4
call malloc
mov eax, DWORD PTR [rax]
add rsp, 8
ret
With GCC4.9 -O3
bar():
sub rsp, 8
mov edi, 4
call malloc
mov DWORD PTR [rax], 0
xor eax, eax
add rsp, 8
ret
Now we ensure the constructor initializes variables correctly by running
the reset() function in the constructor to perform the memset/bfill(0)
operation. After applying the fix, the crash is gone.
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services.
Address Sanitizer's know how to detect stack overrun, so there's
no point in us doing it.
As evidenced by perfschema tests where signficant test failures
because this function failed under ASAN (MDEV-33210).
Also, so since clang-16, we cannot assume much about how local
variables are allocated on the stack (MDEV-31605).
Disabling check idea thanks to Sanja.