1
0
mirror of https://github.com/MariaDB/server.git synced 2025-11-25 17:25:02 +03:00
Commit Graph

20 Commits

Author SHA1 Message Date
Monty
bddbef3573 MDEV-34533 asan error about stack overflow when writing record in Aria
The problem was that when using clang + asan, we do not get a correct value
for the thread stack as some local variables are not allocated at the
normal stack.

It looks like that for example clang 18.1.3, when compiling with
-O2 -fsanitize=addressan it puts local variables and things allocated by
alloca() in other areas than on the stack.

The following code shows the issue

Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
    (connect=0x5080000027b8,
    put_in_cache=<optimized out>) at sql/sql_connect.cc:1399

THD *thd;
1399      thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0

The address of thd is 24M away from the stack pointer

(gdb) info reg
...
rsp            0x7fffef4e7bc0      0x7fffef4e7bc0
...
r13            0x7fffedee7060      140737185214560

r13 is pointing to the address of the thd. Probably some kind of
"local stack" used by the sanitizer

I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects was stored in a local heap,
not on the stack.

To solve this issue in a portable way, I have added two functions:

my_get_stack_pointer() returns the address of the current stack pointer.
The code is using asm instructions for intel 32/64 bit, powerpc,
arm 32/64 bit and sparc 32/64 bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64 bit we are using _AddressOfReturnAddress()

As a fallback for other compilers/arch we use the address of a local
variable.

my_get_stack_bounds() that will return the address of the base stack
and stack size using pthread_attr_getstack() or NtCurrentTed() with
fallback to using the address of a local variable and user provided
stack size.

Server changes are:

- Moving setting of thread_stack to THD::store_globals() using
  my_get_stack_bounds().
- Removing setting of thd->thread_stack, except in functions that
  allocates a lot on the stack before calling store_globals().  When
  using estimates for stack start, we reduce stack_size with
  MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
  before calling store_globals().

I also added a unittest, stack_allocation-t, to verify the new code.

Reviewed-by: Sergei Golubchik <serg@mariadb.org>
2024-10-16 17:24:46 +03:00
Oleksandr Byelkin
6cfd2ba397 Merge branch '10.4' into 10.5 2023-11-08 12:59:00 +01:00
Andrei
9c43343213 MDEV-32365 detailize the semisync replication magic number error
Semisync ack (master side) receiver thread is made to report
details of faced errors.
In case of 'magic byte' error, a hexdump of the received packet
is always (level) NOTEd into the error log.
In other cases an exact server level error is print out
as a warning (as it may not be critical) under log_warnings > 2.

An MTR test added for the magic byte error. For others existing mtr
tests cover that, provided log_warnings > 2 is set.
2023-10-26 20:24:44 +03:00
Marko Mäkelä
559efad44e Merge 10.4 into 10.5 2021-04-27 09:10:47 +03:00
Marko Mäkelä
90a306a7ab Merge 10.3 into 10.4 2021-04-27 08:53:50 +03:00
Sujatha
391f1aa6ee MDEV-24773: slave_compressed_protocol doesn't work properly with semi-sync replication
Back port upstream fix

commit 1800b015a1d487330f7b15f2020b887be348a66b
Author: Venkatesh Duggirala <venkatesh.duggirala@oracle.com>
Date:   Fri Sep 8 20:29:22 2017 +0530

Bug#26027024    SLAVE_COMPRESSED_PROTOCOL DOESN'T WORK WITH
SEMI-SYNC REPLICATION IN MYSQL-5.7

Analysis: In mysql-5.6, dump thread (the thread that is created
on Master after Slave requested for a binlog dump) is also used
to receive acknowledgements from the Slave and act on them accordingly.
For performance reasons, a special thread called Ack Receiver thread
is added in mysql-5.7 Semi synchronous replication plugin.
This thread does not have special handling to receive acknowledgements
if Slave has enabled compression in the protocol. Hence Master is
unable to handle any slave if Slave_compressed_protocol is enabled
on it.

Fix: Enable compress flag on the communication channels if the Slave
has Slave_compressed_protocol ON.
2021-04-26 11:09:39 +05:30
Sergei Golubchik
7af733a5a2 perfschema compilation, test and misc fixes 2020-03-10 19:24:23 +01:00
Marko Mäkelä
d3dcec5d65 Merge 10.3 into 10.4 2019-05-05 15:06:44 +03:00
Sergey Vojtovich
779fb636da Revert THD::THD(skip_global_sys_var_lock) argument
Originally introduced by e972125f1 to avoid harmless wait for
LOCK_global_system_variables in a newly created thread, which creation was
initiated by system variable update.

At the same time it opens dangerous hole, when system variable update
thread already released LOCK_global_system_variables and ack_receiver
thread haven't yet completed new THD construction. In this case THD
constructor goes completely unprotected.

Since ack_receiver.stop() waits for the thread to go down, we have to
temporarily release LOCK_global_system_variables so that it doesn't
deadlock with ack_receiver.run(). Unfortunately it breaks atomicity
of rpl_semi_sync_master_enabled updates and makes them not serialized.

LOCK_rpl_semi_sync_master_enabled was introduced to workaround the above.
TODO: move ack_receiver start/stop into repl_semisync_master
enable_master/disable_master under LOCK_binlog protection?

Part of MDEV-14984 - regression in connect performance
2019-05-03 16:46:11 +04:00
Sergey Vojtovich
9824ec81aa Removed redundant service_thread_count
In contrast to thread_count, which is decremented by THD destructor,
this one was most probably intended to be decremented after all THD
destructors are done.

THD_count class was added to achieve similar effect with thread_count.

Aim is to reduce usage of LOCK_thread_count and COND_thread_count.
Part of MDEV-15135.
2019-01-28 17:39:08 +04:00
Andrei Elkin
573c4db57a MDEV-17437 followup: fixing compilation on non-HAVE_POLL. 2018-11-13 14:54:33 +02:00
Andrei Elkin
6db773a542 MDEV-17437 Semisync master fires invalid fd value assert
The semisync ack collector hits fd's out-of-bound value assert through
a stack of

/usr/sbin/mysqld(_ZN12Ack_receiver17get_slave_socketsEP6fd_setPj+0x70)[0x7fa3bbe27400]

/usr/sbin/mysqld(_ZN12Ack_receiver3runEv+0x540)[0x7fa3bbe27980]

/usr/sbin/mysqld(ack_receive_handler+0x19)[0x7fa3bbe27a79]

The reason of the failure must be the same as in https://bugs.mysql.com/bug.php?id=79865
whose fixes are applied with minor changes.

Specifically, the semisync ack thread is changed to use poll()
instead of select() on platforms where the former is defined.
On the systems that still use select(), Ack receive thread will generate
an error and semi sync will be switched off. Windows systems is exception
case because on windows this limitation does not exists.

The sustain manual testing with `mysqlslap --concurrency > 1024' in "background" while the slave io thread is restarting multiple times.
2018-11-13 10:29:18 +02:00
Sergei Golubchik
08b91548db MDEV-16495 mariadb segfaults at start on FreeBSD
don't use MY_MUTEX_INIT_FAST in constructors of statically
allocated objects.
2018-08-12 11:37:42 +02:00
Vladislav Vaintroub
6c279ad6a7 MDEV-15091 : Windows, 64bit: reenable and fix warning C4267 (conversion from 'size_t' to 'type', possible loss of data)
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. change local/member variables to size_t
when appropriate.

This fix excludes rocksdb, spider,spider, sphinx and connect for now.
2018-02-06 12:55:58 +00:00
Aleksey Midenkov
c59c1a0736 System Versioning 1.0 pre8
Merge branch '10.3' into trunk
2018-01-10 12:36:55 +03:00
Sergei Golubchik
e577b5667a fix compilation w/o P_S 2018-01-09 14:21:08 +03:00
Vladislav Vaintroub
a603b46593 Fix warnings 2018-01-07 11:37:38 +00:00
Andrei Elkin
f279d3c43a MDEV-13073. This part converts the Ali patch`s identifiers to the MariaDB standard. Also some renaming is done as well as white spaces removal. 2017-12-18 13:43:38 +02:00
Andrei Elkin
6a84e3407d MDEV-13073. This patch replaces semisync's native function_enter,exit
and its custom trace faciltiy with standard DBUG_ based equivalents.
2017-12-18 13:43:38 +02:00
Andrei Elkin
e972125f11 MDEV-13073 This part merges the Ali semisync related changes
and specifically the ack receiving functionality.
Semisync is turned to be static instead of plugin so its functions
are invoked at the same points as RUN_HOOKS.
The RUN_HOOKS and the observer interface remain to be removed by later
patch.

Todo:
  React on killed status by repl_semisync_master.wait_after_sync(). Currently
  Repl_semi_sync_master::commit_trx does not check the killed status.

  There were few bugfixes found that are present in mysql and its unclear
  whether/how they are covered. Those include:

  Bug#15985893: GTID SKIPPED EVENTS ON MASTER CAUSE SEMI SYNC TIME-OUTS
  Bug#17932935 CALLING IS_SEMI_SYNC_SLAVE() IN EACH FUNCTION CALL
                 HAS BAD PERFORMANCE
  Bug#20574628: SEMI-SYNC REPLICATION PERFORMANCE DEGRADES WITH A HIGH NUMBER OF THREADS
2017-12-18 13:43:37 +02:00