mirror of https://github.com/MariaDB/server.git synced 2025-07-29 05:21:33 +03:00

MDEV-4991: GTID binlog indexing

Improve the performance of slave connect using B+-Tree indexes on each binlog
file. The index allows fast lookup of a GTID position to the corresponding
offset in the binlog file, as well as lookup of a position to find the
corresponding GTID position.

This eliminates a costly sequential scan of the starting binlog file
to find the GTID starting position when a slave connects. This is
especially costly if the binlog file is not cached in memory (IO
cost), or if it is encrypted or a lot of slaves connect simultaneously
(CPU cost).
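
As a rough illustration of why an index removes most of the scan, here is a
minimal sketch in Python. It is not the server's implementation: it assumes a
single replication domain (so a GTID position is reduced to one increasing
sequence number), a sorted list in place of the on-disk B+-tree, and the
hypothetical helper names scan_from and lookup_with_index.

```python
import bisect

def scan_from(events, start_offset, target_seq_no):
    """Sequentially scan (seq_no, offset) events, in file order, from start_offset.

    Returns (offset of the first event after target_seq_no, events examined).
    """
    offsets = [off for _, off in events]
    first = bisect.bisect_left(offsets, start_offset)
    examined = 0
    for seq_no, off in events[first:]:
        examined += 1
        if seq_no > target_seq_no:
            return off, examined
    return None, examined

def lookup_with_index(index, events, target_seq_no):
    """Use a sparse, sorted index of (seq_no, offset) records to shorten the scan."""
    keys = [seq_no for seq_no, _ in index]
    # Last index record at or before the target; fall back to the file start.
    i = bisect.bisect_right(keys, target_seq_no) - 1
    start = index[i][1] if i >= 0 else 0
    return scan_from(events, start, target_seq_no)

# Made-up data: 1000 events of 100 bytes each, one index record per 10 events.
events = [(s, s * 100) for s in range(1, 1001)]
index = events[::10]
full_scan = scan_from(events, 0, 537)
indexed = lookup_with_index(index, events, 537)
```

With this made-up data both calls find the same offset, but the indexed lookup
examines only the events after the nearest index record instead of everything
from the start of the file.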

The size of the index files is generally less than 1% of the binlog data, so
it is not expected to be an issue.

Most of the work writing the index is done as a background task, in
the binlog background thread. This minimises the performance impact on
transaction commit. A simple global mutex is used to protect index
reads and (background) index writes; this is fine as slave connect is
a relatively infrequent operation.

Here are the user-visible options and status variables. The feature is on by
default and is expected to need no tuning or configuration for most users.

binlog_gtid_index
  On by default. Can be used to disable the indexes for testing purposes.

binlog_gtid_index_page_size (default 4096)
  Page size to use for the binlog GTID index. This is the size of the nodes
  in the B+-tree used internally in the index. A very small page size (64 is
  the minimum) will be less efficient, but can be used to stress the
  B+-tree code during testing.

binlog_gtid_index_span_min (default 65536)
  Control sparseness of the binlog GTID index. If set to N, at most one
  index record will be added for every N bytes of binlog file written.
  This can be used to reduce the number of records in the index, at
  the cost only of having to scan a few more events in the binlog file
  before finding the target position.
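
The effect of binlog_gtid_index_span_min can be sketched as follows. This is
only an illustration under assumptions: the function name build_sparse_index
is hypothetical, and the exact rule the server uses to decide when to emit a
record is not shown in this commit message. The sketch simply emits a record
when at least span_min bytes have been written since the previous one.

```python
def build_sparse_index(events, span_min):
    """Return index records for (gtid, offset) events given in file order.

    At most one record is kept for every span_min bytes of binlog written.
    """
    index = []
    last_indexed = None
    for gtid, offset in events:
        if last_indexed is None or offset - last_indexed >= span_min:
            index.append((gtid, offset))
            last_indexed = offset
    return index

# Made-up data: 200 events of 1000 bytes each.
events = [(n, n * 1000) for n in range(200)]
sparse = build_sparse_index(events, 65536)  # the default span
dense = build_sparse_index(events, 1)       # one record per event
```

A larger span gives fewer index records, and a lookup then has to scan at most
roughly one span of binlog data past the record it starts from.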

Two status variables are available to monitor the use of the GTID indexes:

  Binlog_gtid_index_hit
  Binlog_gtid_index_miss

The "hit" status increments for each successful lookup in a GTID index.
The "miss" status increments when a lookup is not possible. This indicates
that the index file is missing (e.g. the binlog was written by an old server
version without GTID index support) or corrupt.
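
A monitoring script could turn the two counters into a hit ratio. The sketch
below assumes the values have already been fetched (for example with SHOW
GLOBAL STATUS) into a dict of strings; the function name gtid_index_hit_ratio
and the sample numbers are hypothetical.

```python
def gtid_index_hit_ratio(status):
    """Return hits / (hits + misses), or None if no lookups were recorded."""
    hits = int(status.get("Binlog_gtid_index_hit", 0))
    misses = int(status.get("Binlog_gtid_index_miss", 0))
    total = hits + misses
    if total == 0:
        return None
    return hits / total

# Made-up sample values, as they might be read from the server.
sample = {"Binlog_gtid_index_hit": "98", "Binlog_gtid_index_miss": "2"}
ratio = gtid_index_hit_ratio(sample)
```

A ratio that stays well below 1 after old binlog files have been purged would
suggest missing or corrupt index files.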

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Kristian Nielsen
2023-09-08 13:12:49 +02:00
parent 20741b9237
commit d039346a7a
32 changed files with 4315 additions and 256 deletions


@@ -98,6 +98,17 @@ The following specify which files/extra groups are read (specified before remain
involve user-defined functions (i.e. UDFs) or the UUID()
function; for those, row-based binary logging is
automatically used.
--binlog-gtid-index Enable the creation of a GTID index for every binlog
file, and the use of such index for speeding up GTID
lookup in the binlog.
(Defaults to on; use --skip-binlog-gtid-index to disable.)
--binlog-gtid-index-page-size=#
Page size to use for the binlog GTID index.
--binlog-gtid-index-span-min=#
Control sparseness of the binlog GTID index. If set to N,
at most one index record will be added for every N bytes
of binlog file written, to reduce the size of the index.
Normally does not need tuning.
--binlog-ignore-db=name
Tells the master that updates to the given database
should not be logged to the binary log.
@@ -1597,6 +1608,9 @@ binlog-direct-non-transactional-updates FALSE
binlog-expire-logs-seconds 0
binlog-file-cache-size 16384
binlog-format MIXED
binlog-gtid-index TRUE
binlog-gtid-index-page-size 4096
binlog-gtid-index-span-min 65536
binlog-legacy-event-pos FALSE
binlog-optimize-thread-scheduling TRUE
binlog-row-event-max-size 8192