mirror of https://github.com/MariaDB/server.git synced 2025-08-08 11:22:35 +03:00

MDEV-34705: Binlog-in-engine: Read side of out-of-band binlogging

With this commit, the out-of-band binlogging of large event groups in
multiple smaller records interleaved with other event groups is now working.

When a binlog cache reaches @@binlog_cache_size, it is no longer flushed to
disk; instead, the cache contents are binlogged as an out-of-band record.
Then at transaction commit, a commit record is written containing just the
GTID and a link to the out-of-band data.
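
The spill-then-commit flow described above can be sketched roughly as
follows. This is a minimal illustration, not the code in this commit: the
names Record, Binlog and TrxCache, the record fields, and the handling of
small transactions (data carried inline in the commit record) are all
assumptions made for the example, and the linkage between out-of-band
records is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record layout, for illustration only.
struct Record {
  enum Kind { OOB, COMMIT } kind;
  uint64_t trx_id;    // writer of the record (illustration only)
  std::string gtid;   // set on COMMIT records only
  std::string data;   // event data
  long oob_link;      // COMMIT: index of the trx's last OOB record, or -1
};

// Shared append-only log; records of different transactions interleave.
struct Binlog {
  std::vector<Record> records;
};

class TrxCache {
public:
  TrxCache(Binlog &log, uint64_t trx_id, size_t cache_size)
    : log_(log), trx_id_(trx_id), cache_size_(cache_size), last_oob_(-1)
  { assert(cache_size > 0); }

  // Buffer an event; once the cache is full, binlog it out-of-band.
  void write_event(const std::string &ev) {
    cache_ += ev;
    if (cache_.size() >= cache_size_)
      spill();
  }

  // Write the commit record: the GTID plus a link to the out-of-band data.
  // Assumption: a transaction that never spilled keeps its data inline.
  void commit(const std::string &gtid) {
    if (last_oob_ != -1 && !cache_.empty())
      spill();
    Record r = { Record::COMMIT, trx_id_, gtid, cache_, last_oob_ };
    log_.records.push_back(r);
    cache_.clear();
    last_oob_ = -1;
  }

private:
  void spill() {
    Record r = { Record::OOB, trx_id_, std::string(), cache_, -1 };
    log_.records.push_back(r);
    last_oob_ = (long)log_.records.size() - 1;
    cache_.clear();
  }

  Binlog &log_;
  uint64_t trx_id_;
  size_t cache_size_;   // plays the role of @@binlog_cache_size
  std::string cache_;
  long last_oob_;
};
```

With two TrxCache objects sharing one Binlog, the out-of-band records of a
large transaction end up interleaved with the records of other transactions,
and only the final commit record ties them to a GTID.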

To facilitate append-only operation, the binlogged records do not have a
"next" pointer. Instead, they are written out as a forest of perfect binary
trees, the leftmost leaf of one tree pointing to the root of the previous
tree. The binlog reader uses this structure to efficiently read out the
event group data consecutively for the binlog dump thread, while needing
only O(log(N)) memory during the read.
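
To make the forest structure concrete, here is a small self-contained
sketch. It is one possible reading of the description above, not the on-disk
format or the code of this commit: OobNode, OobWriter, read_in_order and the
use of vector indices as "pointers" are all hypothetical. The sketch assumes
that a new record becomes the parent of the two newest trees when they have
equal height, and is otherwise written as a leaf that points back to the
newest existing root.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One out-of-band record. All links point backwards, to earlier records.
struct OobNode {
  int left;    // left child, or -1 for a leaf
  int right;   // right child, or -1 for a leaf
  int prev;    // leaf only: newest root when the leaf was written, or -1
  std::string data;
};

struct Root { int idx; int height; };

struct OobWriter {
  std::vector<OobNode> log;   // stands in for the append-only binlog
  std::vector<Root> roots;    // roots of the current forest, oldest first

  // Append one chunk; returns the index of the new record.
  int append(const std::string &chunk) {
    OobNode n = { -1, -1, -1, chunk };
    int height = 0;
    size_t k = roots.size();
    if (k >= 2 && roots[k - 1].height == roots[k - 2].height) {
      // Two newest trees have equal height: join them under the new node.
      n.left = roots[k - 2].idx;
      n.right = roots[k - 1].idx;
      height = roots[k - 1].height + 1;
      roots.pop_back();
      roots.pop_back();
    } else if (k >= 1) {
      n.prev = roots[k - 1].idx;   // new leaf links to the previous root
    }
    log.push_back(n);
    Root r = { (int)log.size() - 1, height };
    roots.push_back(r);
    return r.idx;
  }
};

// Read all chunks in write order, starting from the last record written.
// Working state is the list of roots plus one traversal stack. The result
// vector is only for the demo; a real reader would stream each chunk.
std::vector<std::string> read_in_order(const std::vector<OobNode> &log,
                                       int last) {
  assert(last >= 0 && (size_t)last < log.size());
  std::vector<int> roots;                 // newest root first
  for (int r = last; r != -1; ) {
    roots.push_back(r);
    int n = r;
    while (log[n].left != -1)             // descend to the leftmost leaf
      n = log[n].left;
    r = log[n].prev;                      // hop to the previous tree
  }
  std::vector<std::string> out;
  std::vector<std::pair<int, bool> > stack;   // (node, children pushed?)
  for (size_t i = roots.size(); i-- > 0; ) {  // oldest tree first
    stack.push_back(std::make_pair(roots[i], false));
    while (!stack.empty()) {
      int n = stack.back().first;
      if (log[n].left == -1 || stack.back().second) {
        out.push_back(log[n].data);       // post-order: children came first
        stack.pop_back();
      } else {
        stack.back().second = true;
        stack.push_back(std::make_pair(log[n].right, false));
        stack.push_back(std::make_pair(log[n].left, false));
      }
    }
  }
  return out;
}
```

In this sketch the records come out in post-order, which is the same as the
order they were written in. The list of roots and the traversal stack both
stay logarithmic in the number of records, which is where the O(log(N))
bound would come from under these assumptions.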

As part of this commit, the existing binlog reader code is refactored and
greatly improved, with a much cleaner explicit state machine and handling of
chunk/page/file boundaries etc.

Also fixes some bugs in gtid_search::find_gtid_pos().

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Kristian Nielsen
2024-12-21 09:41:55 +01:00
parent 9230e75249
commit 6f6baf9655
7 changed files with 1114 additions and 349 deletions


@@ -5845,21 +5845,6 @@ struct handler_binlog_event_group_info {
Class for reading a binlog implemented in an engine.
*/
class handler_binlog_reader {
-public:
-/* ToDo: Should some of this state go to the derived class, in case different engines might want to do something different? */
-/* The file number of the currently-being-read binlog file. */
-uint64_t cur_file_no;
-/* The current offset into the binlog file. */
-uint64_t cur_file_offset;
-/*
-Open file handle of binlog file.
-This may be NULL if the currently-being-read binlog file is "hot" and
-is being read from in-memory buffers while the data may not yet be
-written out to the file on the OS level.
-*/
-File cur_file;
private:
/* Position and length of any remaining data in buf[]. */
uint32_t buf_data_pos;
@@ -5869,8 +5854,7 @@ private:
public:
handler_binlog_reader()
-: cur_file_no(~(uint64_t)0), cur_file_offset(0), cur_file((File)-1),
-buf_data_pos(0), buf_data_remain(0)
+: buf_data_pos(0), buf_data_remain(0)
{ }
virtual ~handler_binlog_reader() { };
virtual int read_binlog_data(uchar *buf, uint32_t len) = 0;
@@ -5879,14 +5863,6 @@ public:
rpl_binlog_state_base *state) = 0;
int read_log_event(String *packet, uint32_t ev_offset, size_t max_allowed);
-/*
-cur_file_no -> implicitly gives file/tablespace
-cur_file_offset -> implicitly gives page
-cur_file_fh -> open fh, if any
-cur_chunk_len
-cur_chunk_sofar
-*/
};
#endif /* HANDLER_INCLUDED */