Analysis:
========
RESET MASTER TO # command deletes all binary log files listed in the index
file, resets the binary log index file to be empty, and creates a new binary
log with number #. When the user provided binary log number is greater than
the max allowed value '2147483647' server fails to generate a new binary log.
The RESET MASTER statement marks the binlog closure status as
'LOG_CLOSE_TO_BE_OPENED' and exits. Statements which follow RESET MASTER
try to write to binary log they find the log_state != LOG_CLOSED and
proceed to write to binary log cache and it results in crash.
Fix:
===
During MYSQL_BIN_LOG open, if generation of new binary log name fails then the
"log_state" needs to be marked as "LOG_CLOSED". With this further statements
will find binary log as closed and they will skip writing to the binary log.
The assert was caused by early cleanup of a user variable participant
in BINLOG @var,@var where it plays twice. That was unexpected by the base
code to clear its value prematurely.
Fixed with relocating the user var destruction after operations with
its value is over.
Changed wording in error messages from MySQL to MariaDB. In
cases where the word server could be used instead it was done.
Tests that have these errors recorded were updated.
MDEV-21605 Clean up and speed up interfaces for binary row logging
MDEV-21617 Bug fix for previous version of this code
The intention is to have as few 'if' as possible in ha_write() and
related functions. This is done by pre-calculating once per statement the
row_logging state for all tables.
Benefits are simpler and faster code both when binary logging is disabled
and when it's enabled.
Changes:
- Added handler->row_logging to make it easy to check it table should be
row logged. This also made it easier to disabling row logging for system,
internal and temporary tables.
- The tables row_logging capabilities are checked once per "statements
that updates tables" in THD::binlog_prepare_for_row_logging() which
is called when needed from THD::decide_logging_format().
- Removed most usage of tmp_disable_binlog(), reenable_binlog() and
temporary saving and setting of thd->variables.option_bits.
- Moved checks that can't change during a statement from
check_table_binlog_row_based() to check_table_binlog_row_based_internal()
- Removed flag row_already_logged (used by sequence engine)
- Moved binlog_log_row() to a handler::
- Moved write_locked_table_maps() to THD::binlog_write_table_maps() as
most other related binlog functions are in THD.
- Removed binlog_write_table_map() and binlog_log_row_internal() as
they are now obsolete as 'has_transactions()' is pre-calculated in
prepare_for_row_logging().
- Remove 'is_transactional' argument from binlog_write_table_map() as this
can now be read from handler.
- Changed order of 'if's in handler::external_lock() and wsrep_mysqld.h
to first evaluate fast and likely cases before more complex ones.
- Added error checking in ha_write_row() and related functions if
binlog_log_row() failed.
- Don't clear check_table_binlog_row_based_result in
clear_cached_table_binlog_row_based_flag() as it's not needed.
- THD::clear_binlog_table_maps() has been replaced with
THD::reset_binlog_for_next_statement()
- Added 'MYSQL_OPEN_IGNORE_LOGGING_FORMAT' flag to open_and_lock_tables()
to avoid calculating of binary log format for internal opens. This flag
is also used to avoid reading statistics tables for internal tables.
- Added OPTION_BINLOG_LOG_OFF as a simple way to turn of binlog temporary
for create (instead of using THD::sql_log_bin_off.
- Removed flag THD::sql_log_bin_off (not needed anymore)
- Speed up THD::decide_logging_format() by remembering if blackhole engine
is used and avoid a loop over all tables if it's not used
(the common case).
- THD::decide_logging_format() is not called anymore if no tables are used
for the statement. This will speed up pure stored procedure code with
about 5%+ according to some simple tests.
- We now get annotated events on slave if a CREATE ... SELECT statement
is transformed on the slave from statement to row logging.
- In the original code, the master could come into a state where row
logging is enforced for all future events if statement could be used.
This is now partly fixed.
Other changes:
- Ensure that all tables used by a statement has query_id set.
- Had to restore the row_logging flag for not used tables in
THD::binlog_write_table_maps (not normal scenario)
- Removed injector::transaction::use_table(server_id_type sid, table tbl)
as it's not used.
- Cleaned up set_slave_thread_options()
- Some more DBUG_ENTER/DBUG_RETURN, code comments and minor indentation
changes.
- Ensure we only call THD::decide_logging_format_low() once in
mysql_insert() (inefficiency).
- Don't annotate INSERT DELAYED
- Removed zeroing pos_in_table_list in THD::open_temporary_table() as it's
already 0
MDEV-19964 S3 replication support
Added new configure options:
s3_slave_ignore_updates
"If the slave has shares same S3 storage as the master"
s3_replicate_alter_as_create_select
"When converting S3 table to local table, log all rows in binary log"
This allows on to configure slaves to have the S3 storage shared or
independent from the master.
Other thing:
Added new session variable '@@sql_if_exists' to force IF_EXIST to DDL's.
Lifted long standing limitation to the XA of rolling it back at the
transaction's
connection close even if the XA is prepared.
Prepared XA-transaction is made to sustain connection close or server
restart.
The patch consists of
- binary logging extension to write prepared XA part of
transaction signified with
its XID in a new XA_prepare_log_event. The concusion part -
with Commit or Rollback decision - is logged separately as
Query_log_event.
That is in the binlog the XA consists of two separate group of
events.
That makes the whole XA possibly interweaving in binlog with
other XA:s or regular transaction but with no harm to
replication and data consistency.
Gtid_log_event receives two more flags to identify which of the
two XA phases of the transaction it represents. With either flag
set also XID info is added to the event.
When binlog is ON on the server XID::formatID is
constrained to 4 bytes.
- engines are made aware of the server policy to keep up user
prepared XA:s so they (Innodb, rocksdb) don't roll them back
anymore at their disconnect methods.
- slave applier is refined to cope with two phase logged XA:s
including parallel modes of execution.
This patch does not address crash-safe logging of the new events which
is being addressed by MDEV-21469.
CORNER CASES: read-only, pure myisam, binlog-*, @@skip_log_bin, etc
Are addressed along the following policies.
1. The read-only at reconnect marks XID to fail for future
completion with ER_XA_RBROLLBACK.
2. binlog-* filtered XA when it changes engine data is regarded as
loggable even when nothing got cached for binlog. An empty
XA-prepare group is recorded. Consequent Commit-or-Rollback
succeeds in the Engine(s) as well as recorded into binlog.
3. The same applies to the non-transactional engine XA.
4. @@skip_log_bin=OFF does not record anything at XA-prepare
(obviously), but the completion event is recorded into binlog to
admit inconsistency with slave.
The following actions are taken by the patch.
At XA-prepare:
when empty binlog cache - don't do anything to binlog if RO,
otherwise write empty XA_prepare (assert(binlog-filter case)).
At Disconnect:
when Prepared && RO (=> no binlogging was done)
set Xid_cache_element::error := ER_XA_RBROLLBACK
*keep* XID in the cache, and rollback the transaction.
At XA-"complete":
Discover the error, if any don't binlog the "complete",
return the error to the user.
Kudos
-----
Alexey Botchkov took to drive this work initially.
Sergei Golubchik, Sergei Petrunja, Marko Mäkelä provided a number of
good recommendations.
Sergei Voitovich made a magnificent review and improvements to the code.
They all deserve a bunch of thanks for making this work done!
MDEV-18046: Assortment of crashes, assertion failures and ASAN errors in mysql_show_binlog_events
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following assert when ASAN is enabled.
uint32 binlog_get_uncompress_len(const char*):
Assertion `(buf[0] & 0xe0) == 0x80' failed
Fix:
===
**Part11: Converted debug assert to error handler code**
Problem:
========
SHOW BINLOG EVENTS FROM <pos> causes a variety of failures, some of which are
listed below. It is not a race condition issue, but there is some
non-determinism in it.
Analysis:
========
"show binlog events from <pos>" code considers the user given position as a
valid event start position. The code starts reading data from this event start
position onwards and tries to map it to a set of known events. Each event has
a specific event structure and asserts have been added to ensure that read
event data satisfies the event specific requirements. When a random position
is supplied to "show binlog events command" the event structure specific
checks will fail and they result in assert.
Fix:
====
The fix is split into different parts. Each part addresses either an ASAN
issue or an assert/crash.
**Part1: Checksum based position validation when checksum is enabled**
Using checksum validate the very first event read at the user specified
position. If there is a checksum mismatch report an appropriate error for the
invalid event.
Relates to MDEV-17863 DROP TEMPORARY TABLE creates a transaction in
binary log on read only server
Other things:
- Fixed that insert into normal_table select from tmp_table is
replicated as row events if tmp_table doesn't exists on slave.
- Any temporary tables created under read-only mode will never be logged
to binary log. Any usage of these tables to update normal tables, even
after read-only has been disabled, will use row base logging (as the
temporary table will not be on the slave).
- Analyze, check and repair table will not be logged in read-only mode.
Other things:
- Removed not used varaibles in
MYSQL_BIN_LOG::flush_and_set_pending_rows_event.
- Set table_share->table_creation_was_logged for all normal tables.
- THD::binlog_query() now returns -1 if statement was not logged., This
is used to update table_share->table_creation_was_logged.
- Don't log admin statements in opt_readonly is set.
- Table's that doesn't have table_creation_was_logged will set binlog format to row
logging.
- Removed not needed/wrong setting of table->s->table_creation_was_logged
in create_table_from_items()
calc_field_event_length should accurately calculate the size of BLOB type
fields, Instead of returning just the bytes taken by length it should return
length bytes + actual length.
calc_field_event_length should accurately calculate the size of BLOB type
fields, Instead of returning just the bytes taken by length it should return
length bytes + actual length.
Cherry-pick the commits the mysql and some changes.
WL#4618 RBR: extended table metadata in the binary log
This patch extends Table Map Event. It appends some new fields for
more metadata. The new metadata includes:
- Signedness of Numberic Columns
- Character Set of Character Columns and Binary Columns
- Column Name
- String Value of SET Columns
- String Value of ENUM Columns
- Primary Key
- Character Set of SET Columns and ENUM Columns
- Geometry Type
Some of them are optional, the patch introduces a GLOBAL system
variable to control it. It is binlog_row_metadata.
- Scope: GLOBAL
- Dynamic: Yes
- Type: ENUM
- Values: {NO_LOG, MINIMAL, FULL}
- Default: NO_LOG
Only Signedness, character set and geometry type are logged if it is MINIMAL.
Otherwise all of them are logged.
Also add a binlog_type_info() to field, So that we can have extract
relevant binlog info from field.