mirror of https://github.com/MariaDB/server.git synced 2025-08-08 11:22:35 +03:00
Commit Graph

97 Commits

Author SHA1 Message Date
Oleksandr Byelkin
a8d4642375 Merge branch '10.11' into 11.4 2025-04-26 10:53:02 +02:00
Vladislav Vaintroub
b005b6097f Cleanup CMake code (Windows-specific)
Prepare for a more modern CMake version than the current minimum.

- Use CMAKE_MSVC_RUNTIME_LIBRARY instead of the custom MSVC_CRT_TYPE.
- Replace CMAKE_{C,CXX}_FLAGS modifications with
  add_compile_definitions/options and add_link_options.
  The older method already broke with new pcre2.
- Fix clang-cl compilation and ASAN build.
- Avoid modifying CMAKE_C_STANDARD_LIBRARIES/CMAKE_CXX_STANDARD_LIBRARIES,
  as this is discouraged by CMake.
- Reduce system checks.
2025-04-04 08:58:40 +02:00
Marko Mäkelä
b53b81e937 Merge 11.2 into 11.4 2024-10-03 14:32:14 +03:00
Marko Mäkelä
6acada713a MDEV-34062: Implement innodb_log_file_mmap on 64-bit systems
When using the default innodb_log_buffer_size=2m, mariadb-backup --backup
would spend a lot of time re-reading and re-parsing the log. For reads,
it would be beneficial to memory-map the entire ib_logfile0 to the
address space (typically 48 bits or 256 TiB) and read it from there,
both during --backup and --prepare.

We will introduce the Boolean read-only parameter innodb_log_file_mmap
that will be OFF by default on most platforms, to avoid aggressive
read-ahead of the entire ib_logfile0 when only a tiny portion would be
accessed. On Linux and FreeBSD the default is innodb_log_file_mmap=ON,
because those platforms define a specific mmap(2) option for enabling
such read-ahead and therefore it can be assumed that the default would
be on-demand paging. This parameter will only have impact on the initial
InnoDB startup and recovery. Any writes to the log will use regular I/O,
except when the ib_logfile0 is stored in a specially configured file system
that is backed by persistent memory (Linux "mount -o dax").

We also experimented with allowing writes of the ib_logfile0 via a
memory mapping and decided against it. A fundamental problem would be
unnecessary read-before-write in case of a major page fault, that is,
when a new, not yet cached, virtual memory page in the circular
ib_logfile0 is being written to. There appears to be no way to tell
the operating system that we do not care about the previous contents of
the page, or that the page fault handler should just zero it out.

Many references to HAVE_PMEM have been replaced with references to
HAVE_INNODB_MMAP.

The predicate log_sys.is_pmem() has been replaced with
log_sys.is_mmap() && !log_sys.is_opened().

Memory-mapped regular files differ from MAP_SYNC (PMEM) mappings in
that an open file handle to ib_logfile0 will be retained. In both
code paths, log_sys.is_mmap() will hold. Holding a file handle open will
allow log_t::clear_mmap() to disable the interface with fewer operations.

It should be noted that ever since
commit 685d958e38 (MDEV-14425)
most 64-bit Linux systems in our CI environment
(s390x a.k.a. IBM System Z being a notable exception) read and write
/dev/shm/*/ib_logfile0 via a memory mapping, pretending that it is
persistent memory (mount -o dax). So, the memory mapping based log
parsing that this change is enabling by default on Linux and FreeBSD
has already been extensively tested on Linux.

::log_mmap(): If a log cannot be opened as PMEM and the desired access
is read-only, try to open a read-only memory mapping.

xtrabackup_copy_mmap_snippet(), xtrabackup_copy_mmap_logfile():
Copy the InnoDB log in mariadb-backup --backup from a memory
mapped file.
2024-09-26 18:47:12 +03:00
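
A minimal sketch of the read-only mapping idea from commit 6acada713a,
assuming a POSIX system. The names ReadOnlyMapping, map_readonly() and
unmap() are hypothetical and are not taken from the MariaDB source.
Unlike the commit, which retains the file handle for regular files, this
sketch closes the descriptor once the mapping exists.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper type: a read-only view of a whole file.
struct ReadOnlyMapping
{
  const unsigned char *data; // nullptr if the mapping failed
  size_t size;
};

// Map the entire file read-only. Pages are faulted in on demand when read.
static ReadOnlyMapping map_readonly(const char *path)
{
  assert(path != nullptr);
  ReadOnlyMapping m= {nullptr, 0};
  int fd= open(path, O_RDONLY);
  if (fd < 0)
    return m;
  struct stat st;
  if (fstat(fd, &st) == 0 && st.st_size > 0)
  {
    void *p= mmap(nullptr, size_t(st.st_size), PROT_READ, MAP_SHARED, fd, 0);
    if (p != MAP_FAILED)
    {
      m.data= static_cast<const unsigned char*>(p);
      m.size= size_t(st.st_size);
    }
  }
  close(fd); // the mapping remains valid after close()
  return m;
}

// Release a mapping created by map_readonly().
static void unmap(ReadOnlyMapping &m)
{
  if (m.data)
    munmap(const_cast<unsigned char*>(m.data), m.size);
  m.data= nullptr;
  m.size= 0;
}
```

A caller would parse records directly from m.data instead of repeatedly
reading the file into a small buffer.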
Oleksandr Byelkin
99b370e023 Merge branch '11.2' into 11.4 2024-05-21 19:38:51 +02:00
Marko Mäkelä
f4d75452fc MDEV-33379 innodb_log_file_buffering=OFF causes corruption on bcachefs
Apparently, invoking fcntl(fd, F_SETFL, O_DIRECT) will lead to
unexpected behaviour on Linux bcachefs and possibly other file systems,
depending on the operating system version. So, let us avoid doing that,
and instead just attempt to pass the O_DIRECT flag to open(). This should
make us compatible with NetBSD, IBM AIX, as well as Solaris and its
derivatives.

We will only implement innodb_log_file_buffering=OFF on systems where
we can determine the physical block size (typically 512 or 4096 bytes).
Currently, those operating systems are Linux and Microsoft Windows.

HAVE_FCNTL_DIRECT, os_file_set_nocache(): Remove.

OS_FILE_OVERWRITE, OS_FILE_CREATE_PATH: Remove (never used parameters).

os_file_log_buffered(), os_file_log_maybe_unbuffered(): Helper functions.

os_file_create_func(): When applicable, initially attempt to open files
in O_DIRECT mode. For type==OS_LOG_FILE && create_mode != OS_FILE_CREATE
we will first invoke stat(2) on the file name to find out if the size
is compatible with O_DIRECT. If create_mode == OS_FILE_CREATE, we will
invoke fstat(2) on the created log file afterwards, and may close and
reopen the file in O_DIRECT mode if applicable.

create_temp_file(): Support O_DIRECT. This is only used if O_TMPFILE is
available and innodb_disable_sort_file_cache=ON (non-default value).
Notably, that setting never worked on Microsoft Windows.

row_merge_file_create_mode(): Split from row_merge_file_create_low().
Create a temporary file in the specified mode.

Reviewed by: Vladislav Vaintroub
2024-02-20 13:43:19 +02:00
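
The approach in commit f4d75452fc (pass O_DIRECT to open() rather than
setting it later with fcntl()) might look roughly like the sketch below.
The helper name open_prefer_direct() and the fallback to buffered I/O on
EINVAL are assumptions made for illustration; the stat(2)-based size
check described in the commit is not shown.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: open a file, preferring unbuffered (O_DIRECT) I/O.
// On return, *direct tells whether O_DIRECT is actually in effect.
static int open_prefer_direct(const char *path, int flags, bool *direct)
{
  assert(path && direct);
  *direct= false;
#ifdef O_DIRECT
  int fd= open(path, flags | O_DIRECT, 0600);
  if (fd >= 0)
  {
    *direct= true;
    return fd;
  }
  if (errno != EINVAL)
    return -1; // a real error, unrelated to O_DIRECT support
#endif
  // Either O_DIRECT is not defined on this platform,
  // or the file system rejected it: use buffered I/O.
  return open(path, flags, 0600);
}
```

Because the flag is requested at open() time, the file never changes
caching mode while it is open.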
Marko Mäkelä
7f7329f092 MDEV-33379 innodb_log_file_buffering=OFF causes corruption on bcachefs
Apparently, invoking fcntl(fd, F_SETFL, O_DIRECT) will lead to
unexpected behaviour on Linux bcachefs and possibly other file systems,
depending on the operating system version. So, let us avoid doing that,
and instead just attempt to pass the O_DIRECT flag to open(). This should
make us compatible with NetBSD, IBM AIX, as well as Solaris and its
derivatives.

This fix does not change the fact that we had only implemented
innodb_log_file_buffering=OFF on systems where we can determine the
physical block size (typically 512 or 4096 bytes).
Currently, those operating systems are Linux and Microsoft Windows.

HAVE_FCNTL_DIRECT, os_file_set_nocache(): Remove.

OS_FILE_OVERWRITE, OS_FILE_CREATE_PATH: Remove (never used parameters).

os_file_log_buffered(), os_file_log_maybe_unbuffered(): Helper functions.

os_file_create_simple_func(): When applicable, initially attempt to
open files in O_DIRECT mode.

os_file_create_func(): When applicable, initially attempt to
open files in O_DIRECT mode.
For type==OS_LOG_FILE && create_mode != OS_FILE_CREATE
we will first invoke stat(2) on the file name to find out if the size
is compatible with O_DIRECT. If create_mode == OS_FILE_CREATE, we will
invoke fstat(2) on the created log file afterwards, and may close and
reopen the file in O_DIRECT mode if applicable.

create_temp_file(): Support O_DIRECT. This is only used if O_TMPFILE is
available and innodb_disable_sort_file_cache=ON (non-default value).
Notably, that setting never worked on Microsoft Windows.

row_merge_file_create_mode(): Split from row_merge_file_create_low().
Create a temporary file in the specified mode.

Reviewed by: Vladislav Vaintroub
2024-02-20 11:22:45 +02:00
Oleksandr Byelkin
fa69b085b1 Merge branch '11.3' into 11.4 2024-02-15 13:53:21 +01:00
Marko Mäkelä
a6290a5bc5 MDEV-33095 innodb_flush_method=O_DIRECT creates excessive errors on Solaris
The directio(3C) function on Solaris is supported on NFS and UFS
while the majority of users should be on ZFS, which is a copy-on-write
file system that implements transparent compression and therefore
cannot support unbuffered I/O.

Let us remove the call to directio() and simply treat
innodb_flush_method=O_DIRECT in the same way as the previous
default value innodb_flush_method=fsync on Solaris. Also, let us
remove some dead code around calls to os_file_set_nocache() on
platforms where fcntl(2) is not usable with O_DIRECT.

On IBM AIX, O_DIRECT is not documented for fcntl(2), only for open(2).
2024-01-19 15:34:33 +11:00
Yuchen Pei
d06b6de305 Merge branch '10.5' into 10.6 2024-01-11 12:59:22 +11:00
Sergei Golubchik
761d5c8987 MDEV-33092 Undefined reference to concurrency on Solaris
remove thr_setconcurrency()
followup for 8bbcaab160

Fix by Rainer Orth
2024-01-10 10:16:20 +01:00
Vladislav Vaintroub
3fad2b1155 MDEV-33096 mysys/my_timezone.cc does not compile on AIX
AIX compilation failed because glibc's non-standard extensions to
`struct tm` were used: the additional members tm_gmtoff and tm_zone.

The patch fixes it by adding a corresponding compile-time check.

Additionally, for the calculation of GMT offset on AIX, a portable
variant of timegm() was required. The implementation here is inspired by
SergeyD's answer on Stack Overflow:
https://stackoverflow.com/questions/16647819/timegm-cross-platform
2023-12-22 13:17:55 +01:00
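
One way to write a portable timegm() replacement, as needed by commit
3fad2b1155, is sketched below. It uses the widely known days-from-civil
calculation and is not necessarily the code in mysys/my_timezone.cc. The
names days_from_civil() and portable_timegm() are hypothetical, and the
struct tm fields are assumed to be already normalized.

```cpp
#include <cassert>
#include <cstdint>
#include <ctime>

// Days since 1970-01-01 for a proleptic Gregorian date (month is 1..12).
static int64_t days_from_civil(int64_t y, int m, int d)
{
  y-= m <= 2; // treat January and February as part of the previous year
  const int64_t era= (y >= 0 ? y : y - 399) / 400;
  const int64_t yoe= y - era * 400;                           // [0, 399]
  const int64_t doy= (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;
  const int64_t doe= yoe * 365 + yoe / 4 - yoe / 100 + doy;   // [0, 146096]
  return era * 146097 + doe - 719468;
}

// Convert broken-down UTC time to seconds since the epoch,
// without consulting the local time zone.
static int64_t portable_timegm(const struct tm *t)
{
  assert(t && t->tm_mon >= 0 && t->tm_mon <= 11);
  const int64_t days= days_from_civil(int64_t(t->tm_year) + 1900,
                                      t->tm_mon + 1, t->tm_mday);
  return days * 86400 + t->tm_hour * 3600 + t->tm_min * 60 + t->tm_sec;
}
```

A GMT offset can then be derived by converting the localtime() fields of
a timestamp with this function and subtracting the original timestamp.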
Vladislav Vaintroub
f8600b1755 MDEV-32567 Remove thr_alarm from server codebase
Remove alarm() remnants

- Replace thread-unsafe use of alarm() inside my_lock.c with a
  timed loop.
- Remove configure time checks
- Remove mysys my_alarm.c/my_alarm.h
2023-11-23 11:52:38 +11:00
Marko Mäkelä
3c7887a85f Merge 10.5 into 10.6 2022-09-05 10:09:03 +03:00
Daniel Black
43037a5a48 Merge branch 10.4 into 10.5 2022-08-31 11:06:14 +10:00
Daniel Black
cf1a944f5b Merge 10.3 into 10.4 2022-08-31 10:52:53 +10:00
Daniel Black
129616c70a MDEV-28592 disks plugin - getmntinfo (BSD) & getmntent (AIX)
Thanks to references from Brad Smith, BSDs use getmntinfo as
a system call for mounted filesystems.

Most BSDs return statfs structures (and we use OSX's statfs64),
but NetBSD uses a statvfs structure.

Simplify Linux getmntent_r to just use getmntent.

AIX uses getmntent.

An attempt at writing Solaris compatibility with
a small bit of HPUX compatibility was made based on man page
entries only. Fixes welcome.

statvfs structures now use f_bsize for consistency with statfs

Test case adjusted as PATH_MAX is OS defined (e.g. 1023 on AIX)

Fixes: 0ee5cf837e

also fixes:

MDEV-27818: Disk plugin does not show zpool mounted devices

This is because zpool mount points don't begin with /.

Due to the proliferation of multiple filesystem types since this
was written, we restrict the entries listed in the disks plugin
to exclude:
* read-only mount points (no point monitoring, and
  includes squash, snaps, sysfs, procfs, cgroups...)
* mount points that aren't directories (excludes /etc/hostname and
  similar mounts in containers). (getmntent (Linux/AIX) only)
* file systems where there is no capacity listed (excludes various
  virtual filesystem types).

Reviewer: Sergei Golubchik
2022-08-31 10:32:04 +10:00
Brad Smith
f02ca429f7 Revert aligned_alloc() addition from MDEV-28836
As pointed out with MDEV-29308 there are issues with the code as is.
MariaDB is built as C++11 / C99. aligned_alloc() is not guaranteed
to be exposed when building with any mode other than C++17 / C11.
The other *BSDs have their stdlib.h header expose the function
with C++11 anyway, but the issue exists in the C99 code too; the
build just does not use -Werror. Linux globally defines _GNU_SOURCE,
hiding the issue as well.
2022-08-22 09:10:40 +03:00
Marko Mäkelä
30914389fe Merge 10.5 into 10.6 2022-07-27 17:52:37 +03:00
Marko Mäkelä
098c0f2634 Merge 10.4 into 10.5 2022-07-27 17:17:24 +03:00
Oleksandr Byelkin
3bb36e9495 Merge branch '10.3' into 10.4 2022-07-27 11:02:57 +02:00
Vladislav Vaintroub
990ddaba1e Windows - reduce irrelevant CMake system checks 2022-07-18 14:59:07 +02:00
Marko Mäkelä
3794673111 MDEV-28836: Memory alignment cleanup
Table_cache_instance: Define the structure aligned at
the CPU cache line, and remove a pad[] data member.
Krunal Bauskar reported this to improve performance on ARMv8.

aligned_malloc(): Wrapper for the Microsoft _aligned_malloc()
and the ISO/IEC 9899:2011 <stdlib.h> aligned_alloc().
Note: The parameters are in the Microsoft order (size, alignment),
opposite of aligned_alloc(alignment, size).
Note: The standard defines that size must be an integer multiple
of alignment. It is enforced by AddressSanitizer but not by GNU libc
on Linux.

aligned_free(): Wrapper for the Microsoft _aligned_free() and
the standard free().

HAVE_ALIGNED_ALLOC: A new test. Unfortunately, support for
aligned_alloc() may still be missing on some platforms.
We will fall back to posix_memalign() for those cases.

HAVE_MEMALIGN: Remove, along with any use of the nonstandard memalign().

PFS_ALIGNEMENT (sic): Removed; we will use CPU_LEVEL1_DCACHE_LINESIZE.

PFS_ALIGNED: Defined using the C++11 keyword alignas.

buf_pool_t::page_hash_table::create(),
lock_sys_t::hash_table::create(),
lock_sys_t::hash_table::resize(): Pad the allocation size to an
integer multiple of the alignment.

Reviewed by: Vladislav Vaintroub
2022-06-21 16:59:49 +03:00
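
The wrapper pair described in commit 3794673111 could be sketched as
follows. This is an illustration only: the names carry a _sketch suffix
to mark them as hypothetical, pad_to_alignment() is an invented helper,
and the non-Windows branch uses only the posix_memalign() fallback that
the commit mentions rather than aligned_alloc().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>
#ifdef _WIN32
#include <malloc.h>
#endif

// Round size up to an integer multiple of a power-of-two alignment.
static size_t pad_to_alignment(size_t size, size_t alignment)
{
  assert(alignment && !(alignment & (alignment - 1)));
  return (size + alignment - 1) & ~(alignment - 1);
}

// Parameters are in the Microsoft order (size, alignment).
static void *aligned_malloc_sketch(size_t size, size_t alignment)
{
#ifdef _WIN32
  return _aligned_malloc(size, alignment);
#else
  void *p= nullptr;
  // posix_memalign() returns 0 on success and leaves errno untouched.
  if (posix_memalign(&p, alignment, pad_to_alignment(size, alignment)))
    return nullptr;
  return p;
#endif
}

static void aligned_free_sketch(void *p)
{
#ifdef _WIN32
  _aligned_free(p);
#else
  free(p);
#endif
}
```

Padding the size keeps the call valid even for allocators that enforce
the "size is a multiple of alignment" rule.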
Marko Mäkelä
a722ee88f3 Merge 10.5 into 10.6 2021-06-01 11:39:38 +03:00
Daniel Black
90adf2aa59 perfschema: use glibc gettid if available 2021-06-01 13:51:39 +10:00
Marko Mäkelä
6729dd894c Merge 10.5 into 10.6 2021-04-14 13:39:28 +03:00
David CARLIER
9a3cbc0541 mysqld: print status display subset of memory usage.
Bring the memory usage display in debug builds roughly level with
Linux's mallinfo* API by using the Darwin malloc_zone API.

Closes: #1803
2021-04-14 19:21:35 +10:00
Marko Mäkelä
5eae8c2742 Merge 10.4 into 10.5 2021-03-31 11:05:21 +03:00
Marko Mäkelä
50de71b026 Merge 10.3 into 10.4 2021-03-31 09:47:14 +03:00
Marko Mäkelä
d6d3d9ae2f Merge 10.2 into 10.3 2021-03-31 08:01:03 +03:00
Vladislav Vaintroub
add24e7889 Windows - suppress nonsensical (for this OS) system check.
Amends 48141f3c17
2021-03-30 16:15:24 +11:00
Vladislav Vaintroub
8048831a5b Windows - suppress nonsensical (for this OS) system check.
Amends 48141f3c17
2021-03-29 14:35:12 +02:00
Eugene Kosov
62e4aaa240 cleanup: os_thread_sleep() -> std::this_thread::sleep_for()
The std version has the advantage of more convenient unit handling from
std::chrono. There is no longer any need to multiply or divide to convert
values to microseconds.
2021-03-19 11:44:03 +03:00
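
The unit handling that commit 62e4aaa240 refers to can be illustrated
with a short sketch. The function names to_microseconds() and
sleep_at_least() are hypothetical; only std::chrono and
std::this_thread::sleep_for() come from the standard library.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Any coarser std::chrono duration converts implicitly to microseconds,
// so callers never multiply or divide by hand.
static long long to_microseconds(std::chrono::microseconds d)
{
  return d.count();
}

// Sleep for at least the given duration, whatever unit the caller used.
static void sleep_at_least(std::chrono::microseconds d)
{
  assert(d.count() >= 0);
  std::this_thread::sleep_for(d);
}
```

A call such as sleep_at_least(std::chrono::milliseconds(5)) states its
unit at the call site instead of passing a bare 5000.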
Marko Mäkelä
a62a675fd2 Merge 10.5 into 10.6 2020-11-23 17:57:58 +02:00
Daniel Black
3b486c28f7 MDEV-24125: linux large pages, linux/mman.h needed
CentOS/RHEL 7 define the MAP_HUGE_SHIFT constant
in linux/mman.h, which therefore needs to be included.
2020-11-19 16:30:17 +11:00
Vladislav Vaintroub
f950559f66 Windows : reduce useless system checks 2020-11-12 21:26:34 +00:00
Oleksandr Byelkin
48b5777ebd Merge branch '10.4' into 10.5 2020-08-04 17:24:15 +02:00
Oleksandr Byelkin
57325e4706 Merge branch '10.3' into 10.4 2020-08-03 14:44:06 +02:00
Oleksandr Byelkin
c32f71af7e Merge branch '10.2' into 10.3 2020-08-03 13:41:29 +02:00
Oleksandr Byelkin
ef7cb0a0b5 Merge branch '10.1' into 10.2 2020-08-02 11:05:29 +02:00
Karthik Kamath
e6cb263ef3 MDEV-15961: Fix stacktraces under FreeBSD (aarch64)
Largely based on MySQL commit
75271e51d6

MySQL Ref:
    BUG#24566529: BACKPORT BUG#23575445 TO 5.6

    (cut)
    Also, the PTR_SANE macro which tries to check if a pointer
    is invalid (used when printing pointer values in stack traces)
    gave false negatives on OSX/FreeBSD. On these platforms we
    now simply check if the pointer is non-null. This also removes
    a sbrk() deprecation warning when building on OS X. (It was
    before only disabled with building using XCode).

Removed execinfo path of MySQL patch that was already included.

sbrk doesn't exist on FreeBSD aarch64.

Removed HAVE_BSS_START based detection and replaced with __linux__
as it doesn't exist on OSX, Solaris or Windows.  __bss_start
exists on multiple Linux architectures.

Tested on FreeBSD and Linux x86_64. Being in FreeBSD ports for 2
years implies good testing on all FreeBSD architectures
too. MySQL-8.0.21 code is functionally identical to original commit.
2020-07-28 11:10:25 +10:00
Daniel Black
e8351934b6 Merge pull request #1221 from grooverdan/10.4-MDEV-18851-multiple-sized-large-page-support
MDEV-18851: multiple sized large page support (linux)
2020-04-02 23:54:08 +04:00
Vladislav Vaintroub
334ab8c6a7 Improve cmake performance on Windows
by suppressing Unix only system checks
2020-03-25 13:09:44 +01:00
Oleksandr Byelkin
b8c0e49670 Merge commit '10.3' into 10.4 2020-03-11 13:27:10 +01:00
Oleksandr Byelkin
440452628d Merge branch '10.2' into 10.3 2020-03-06 23:28:26 +01:00
Anel Husakovic
0d1dd2e79d Clean wrong cherry-pick from previous commit
- Delete variable HAVE_PTHREAD_CONDATTR_SETCLOCK and check
- Delete second HAVE_PTHREAD_KEY_DELETE
2020-02-20 09:25:11 +01:00
Daniel Black
fb01cc3766 my_getncpus based on threads available
Detecting the CPUs based on sysconf of the online CPUs can significantly
overestimate the number of CPUs available.

Whether via numactl, cgroups, taskset, systemd constraints, docker
containers or other mechanisms, the number of threads mysqld
can run on can be considerably lower.

As such we use the pthread_getaffinity_np function on Linux and FreeBSD
(identical API) to get the number of CPUs.

The number of CPUs is the default for the thread_pool_size and a too
high default will result in large memory usage and high context
switching overhead.

Closes PR #922
2020-02-20 08:44:20 +01:00
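
The affinity-based count from commit fb01cc3766 might be sketched like
this for Linux with glibc. The function name usable_cpus() is
hypothetical, and the fallback to sysconf() when the affinity query
fails is an assumption. FreeBSD is left out because its headers and set
type differ.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // for pthread_getaffinity_np() and CPU_COUNT()
#endif
#include <cassert>
#include <pthread.h>
#include <sched.h>
#include <unistd.h>

// Number of CPUs that the calling thread is allowed to run on.
static int usable_cpus()
{
  int result= 0;
#if defined(__linux__)
  cpu_set_t set;
  CPU_ZERO(&set);
  if (pthread_getaffinity_np(pthread_self(), sizeof set, &set) == 0)
    result= CPU_COUNT(&set);
#endif
  if (result < 1)
  {
    // Fallback: all online CPUs, which may overestimate in containers.
    const long online= sysconf(_SC_NPROCESSORS_ONLN);
    result= online > 0 ? static_cast<int>(online) : 1;
  }
  assert(result >= 1);
  return result;
}
```

Running the process under taskset -c 0 would be expected to make
usable_cpus() report a single CPU even on a large machine.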
Robert Bindar
e8392e58b2 MDEV-19696 - Cleanup gcc sync builtins
Since 10.4 requires a C++11-capable compiler, gcc sync builtins became
dead code. Remove relevant cmake checks and clean up include files.
2019-07-03 12:11:22 +03:00
Marko Mäkelä
5e929ee8a0 MDEV-19845: Define my_timer_cycles() inline
On clang, use __builtin_readcyclecounter() when available.
Hinted by Sergey Vojtovich. (This may lead to runtime failure
on ARM systems. The hardware should be available on ARMv8 (AArch64),
but access to it may require special privileges.)

We remove support for the proprietary Sun Microsystems compiler,
and rely on clang or the __GNUC__ assembler syntax instead.

For now, we retain support for IA-64 (Itanium) and 32-bit SPARC,
even though those platforms are likely no longer widely used.

We remove support for clock_gettime(CLOCK_SGI_CYCLE),
because Silicon Graphics ceased supporting IRIX in December 2013.
This was the only cycle timer interface available for MIPS.

On PowerPC, we rely on the GCC 4.8 __builtin_ppc_get_timebase()
(or clang __builtin_readcyclecounter()), which should be equivalent
to the old assembler code on both 64-bit and 32-bit targets.
2019-06-28 19:19:31 +03:00
Oleksandr Byelkin
c07325f932 Merge branch '10.3' into 10.4 2019-05-19 20:55:37 +02:00