Problem:
=======
There are two issues that are addressed in this patch:
1) SHOW BINARY LOGS uses caching to store the binary logs that exist
in the log directory; however, if new events are written to the logs,
the caching strategy is unaware. This is okay for users, as it is
okay for SHOW to return slightly old data. The test, however, can
result in inconsistent data. It runs two connections concurrently,
where one shows the logs, and the other adds a new file. The output
of SHOW BINARY LOGS then depends on when the cache is built, with
respect to the time that the second connection rotates the logs.
2) There is a race condition between RESET MASTER and SHOW BINARY
LOGS. More specifically, where they both need the binary log lock to
begin, SHOW BINARY LOGS only needs the lock to build its cache. If
RESET MASTER is issued after SHOW BINARY LOGS has built its cache and
before it has returned the results, the presented data may be
incorrect.
Solution:
========
1) As it is okay for users to see stale data, to make the test
consistent, use DEBUG_SYNC to force the race condition (problem 2) to
make SHOW BINARY LOGS build a cache before RESET MASTER is called.
Then, use additional logic from the next part of the solution to
rebuild the cache.
2) Use an Atomic_counter to keep track of the number of times RESET
MASTER has been called. If the value of the counter changes after
building the cache, the cache should be rebuilt and the analysis
should be restarted.
Reviewed By:
============
Andrei Elkin: <andrei.elkin@mariadb.com>
Problem:
========
Aborting OPTIMIZE TABLE still logs in binary logs and replicates to the
Slave server. "Optimize table" command under execution, is killed by using
"Ctrl-C" as shown below.
MariaDB [test]> optimize table t2;
^CCtrl-C -- query killed. Continuing normally.
In spite of query execution being interrupted the query gets written to
binary log.
Analysis:
========
Admin command execution logic is not handling KILL command, hence it
ignores the KILL command and completes its execution.
Fix:
===
Check for thread killed notification, during admin command execution and
handle it. If thread kill occurs prior to any table modification the query
will not be written to binary log. If kill happens after at least one table
is modified then the query will be written to binary log. Ex: command in
execution is 'OPTIMIZE TABLE t1,t2' and the thread kill happens after t1
table is modified then 'OPTIMIZE TABLE t1,t2' will be written to binary log
as admin commands will not make the slave to diverge from master.
There was race between a committing transaction and the following in binlog
order FLUSH LOGS that could create a 2nd Binlog checkpoint (BCP) event
in the new file *before* the first logged-in-old-binlog transaction gets committed in
Innodb. That would cause the transaction loss at recovery, should
the server stop right after the BCP.
The race is tackled by enforcing the necessary set of mutexes to be acquired
by FLUSH-LOGS handler in the correct order (of the group commit leader
pattern).
Note, there remain two cases where a similar race is still possible:
- the above race as it is when the server is run with ("unlikely")
non-default `--binlog-optimize-thread-scheduling=0` (MDEV-24530), and
- at unlikely event of bin-logging of Incident event (MDEV-24531) that
also triggers binlog rotation,
in both cases though with lesser chances after the current fixes.
1. wait for the binlog thread to reach the certain state, don't use
a debug_sync that's incorrectly placed to detect the state
2. no need to do a (non-deterministic) `show binlog events` to verify
what is guaranteed by the directly preceding line
Problem:
========
When O_TMPFILE is not supported mysqlbinlog outputs the error to standard
stream as a warning which breaks PITR:
ERROR 1064 (42000) at line 382: You have an error in your SQL syntax; check
the manual that corresponds to your MariaDB server version for the right
syntax to use near 'mysqlbinlog: O_TMPFILE is not supported on /tmp (disabling
future attempts)
Analysis:
=========
'mysqlbinlog' utility is used to perform point-in-time-recovery based on binary
log. It converts the events in the binary log files, from binary format to text
so that they can be viewed or applied. This output can be saved to a file and
it can be sourced back to mysql client. The mysqlbinlog utility stores the
text output into IO_CACHE and when it is full the data is written to a temp
file. The temporary file creation is attempted using 'O_TMPFILE' flag. If the
underlying filesystem doesn't support this operation, a note is printed on to
standard error and file creation is done without O_TMPFILE' flag. If standard
error is redirected to standard output the note gets written to the sql file
as shown below.
/bld/client/mysqlbinlog: O_TMPFILE is not supported on /tmp (disabling future
attempts)
table id 32
When the sql file is used for PITR, it leads to a syntax error as it is not a
valid sql command.
Fix:
====
Make 'my_message_stderr' to ignore messages which are flagged as ME_NOTE and
ME_ERROR_LOG_ONLY. ME_ERROR_LOG_ONLY flag is applicable to server. In order to
print an informational note to stderr stream, ME_NOTE flag without
ME_ERROR_LOG_ONLY flag should be specified. 'my_message_stderr' should print
messages flagged with ME_WARNING or ME_FATAL to stderr stream.
The crash was caused by improper raising of an error or replication checksum
verification at time of the server initialization. As there is no THD object
associated with the main initializing thread yet the error text should be
assigned with calling a respective macro that is aware of that possibility.
Fixed accordingly.
[At merging to 10.4 the new test result file needs
+# restart: --master_verify_checksum=ON --debug_dbug=+d,corrupt_read_log_event_char
that mtr run will hint on.]
Analysis:
========
"mysqlbinlog -v" option will reconstruct row events and display them as
commented SQL statements. If this option is given twice, the output includes
comments to indicate column data types and some metadata.
`log_event_print_value` is the function reponsible for printing values and
their types. This function doesn't handle GEOMETRY type. Hence the above error
gets printed.
Fix:
===
Add support for GEOMETRY datatype.
Problem:
=======
SHOW BINLOG EVENTS FROM <"random"-pos> caused a variety of failures as
reported in MDEV-18046. They are fixed but that approach is not future-proof
as well as is not optimal to create extra check for being constructed event
parameters.
Analysis:
=========
"show binlog events from <pos>" code considers the user given position as a
valid event start position. The code starts reading data from this event start
position onwards and tries to map it to a set of known events. Each event has
a specific event structure and asserts have been added to ensure that, read
event data, satisfies the event specific requirements. When a random position
is supplied to "show binlog events command" the event structure specific
checks will fail and they result in assert.
For example: https://jira.mariadb.org/browse/MDEV-18046
In the bug description user executes CREATE TABLE/INSERT and ALTER SQL
commands.
When a crazy offset like "SHOW BINLOG EVENTS FROM 365" is provided code
assumes offset 365 as valid event begin and proceeds to EVENT_LEN_OFFSET reads
some random length and comes up with a crazy event which didn't exits in the
binary log. In this quoted example scenario, event read at offset 365 is
considered as "Update_rows_log_event", which is not present in binary log.
Since this is a random event its validation fails and code results in
assert/segmentation fault, as shown below.
mysqld: /data/src/10.4/sql/log_event.cc:10863: Rows_log_event::Rows_log_event(
const char*, uint, const Format_description_log_event*):
Assertion `var_header_len >= 2' failed.
181220 15:27:02 [ERROR] mysqld got signal 6 ;
#7 0x00007fa0d96abee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#8 0x000055e744ef82de in Rows_log_event::Rows_log_event (this=0x7fa05800d390,
buf=0x7fa05800d080 "", event_len=254, description_event=0x7fa058006d60) at
/data/src/10.4/sql/log_event.cc:10863
#9 0x000055e744f00cf8 in Update_rows_log_event::Update_rows_log_event
Since we are reading random data repeating the same command SHOW BINLOG EVENTS
FROM 365 produces different types of crashes with different events. MDEV-18046
reported 10 such crashes.
In order to avoid such scenarios user provided starting offset needs to be
validated for its correctness. Best way of doing this is to make use of
checksums if they are available. MDEV-18046 fix introduced the checksum based
validation.
The issue still remains in cases where binlog checksums are disabled. Please
find the following bug reports.
MDEV-22473: binlog.binlog_show_binlog_event_random_pos failed in buildbot,
server crashed in read_log_event
MDEV-22455: Server crashes in Table_map_log_event,
binlog.binlog_invalid_read_in_rotate failed in buildbot
Fix:
====
When binlog checksum is disabled, perform scan(via reading event by event), to
validate the requested FROM <pos> offset. Starting from offset 4 read the
event_length of next_event in the binary log. Using the next_event length
advance current offset to point to next event. Repeat this process till the
current offset is less than or equal to crazy offset. If current offset is
higher than crazy offset provide appropriate invalid input offset error.
(This commit is exclusively for 10.1 branch, do not merge it to upper ones)
In case of a pattern of non-STMT_END-marked Rows-log-event (A) followed by
a STMT_END marked one (B) mysqlbinlog mixes up the base64 encoded rows events
with their pseudo sql representation produced by the verbose option:
BINLOG '
base64 encoded data for A
### verbose section for A
base64 encoded data for B
### verbose section for B
'/*!*/;
In effect the produced BINLOG '...' query is not valid and is rejected with the error.
Examples of this way malformed BINLOG could have been found in binlog_row_annotate.result
that gets corrected with the patch.
The issue is fixed with introduction an auxiliary IO_CACHE to hold on the verbose
comments until the terminal STMT_END event is found. The new cache is emptied
out after two pre-existing ones are done at that time.
The correctly produced output now for the above case is as the following:
BINLOG '
base64 encoded data for A
base64 encoded data for B
'/*!*/;
### verbose section for A
### verbose section for B
Thanks to Alexey Midenkov for the problem recognition and attempt to tackle,
Venkatesh Duggirala who produced a patch for the upstream whose
idea is exploited here, as well as to MDEV-23077 reporter LukeXwang who
also contributed a piece of a patch aiming at this issue.
Extra: mysqlbinlog_row_minimal refined to not produce mutable numeric values into the result file.
(This commit is for 10.3 and upper branches)
In case of a pattern of non-STMT_END-marked Rows-log-event (A) followed by
a STMT_END marked one (B) mysqlbinlog mixes up the base64 encoded rows events
with their pseudo sql representation produced by the verbose option:
BINLOG '
base64 encoded data for A
### verbose section for A
base64 encoded data for B
### verbose section for B
'/*!*/;
In effect the produced BINLOG '...' query is not valid and is rejected with the error.
Examples of this way malformed BINLOG could have been found in binlog_row_annotate.result
that gets corrected with the patch.
The issue is fixed with introduction an auxiliary IO_CACHE to hold on the verbose
comments until the terminal STMT_END event is found. The new cache is emptied
out after two pre-existing ones are done at that time.
The correctly produced output now for the above case is as the following:
BINLOG '
base64 encoded data for A
base64 encoded data for B
'/*!*/;
### verbose section for A
### verbose section for B
Thanks to Alexey Midenkov for the problem recognition and attempt to tackle,
and to Venkatesh Duggirala who produced a patch for the upstream whose
idea is exploited here, as well as to MDEV-23077 reporter LukeXwang who
also contributed a piece of a patch aiming at this issue.
(This commit is exclusively for 10.2 branch. Do not merge it to 10.3)
In case of a pattern of non-STMT_END-marked Rows-log-event (A) followed by
a STMT_END marked one (B) mysqlbinlog mixes up the base64 encoded rows events
with their pseudo sql representation produced by the verbose option:
BINLOG '
base64 encoded data for A
### verbose section for A
base64 encoded data for B
### verbose section for B
'/*!*/;
In effect the produced BINLOG '...' query is not valid and is rejected with the error.
Examples of this way malformed BINLOG could have been found in binlog_row_annotate.result
that gets corrected with the patch.
The issue is fixed with introduction an auxiliary IO_CACHE to hold on the verbose
comments until the terminal STMT_END event is found. The new cache is emptied
out after two pre-existing ones are done at that time.
The correctly produced output now for the above case is as the following:
BINLOG '
base64 encoded data for A
base64 encoded data for B
'/*!*/;
### verbose section for A
### verbose section for B
Thanks to Alexey Midenkov for the problem recognition and attempt to tackle,
and to Venkatesh Duggirala who produced a patch for the upstream whose
idea is exploited here, as well as to MDEV-23077 reporter LukeXwang who
also contributed a piece of a patch aiming at this issue.