This patch augments Gtid_log_event with the user thread-id.
In particular that compensates for the loss of this info in
Rows_log_events.
Gtid_log_event::thread_id gets visible in mysqlbinlog output like
#231025 16:21:45 server id 1 end_log_pos 537 CRC32 0x1cf1d963 GTID 0-1-2 ddl thread_id=10
as a 32 bit unsigned integer. Note this is a 32-bit value, as
the connection id can only be 32 bits (see MDEV-15089 for
details).
While the size of Gtid event has grown by 4 bytes
replication from OLD <-> NEW is not affected by it. This patch
also slightly changes the logic to convert Gtid events to Query
events for older replicas which don't support Gtid. Instead of
hard-coding the padding of the sys var section of the generated
Query event, the length to pad is dynamically calculated based
on the length of the Gtid event.
This work was started by the late Sujatha Sivakumar.
Brandon Nesterenko took it over, reviewed initial patches and
extended the work.
Also thanks to Andrei for his help in finalizing the fixes for
MDEV-33924, which were squashed into this patch.
Reviewed-by:
=============
Andrei Elkin <andrei.elkin@mariadb.com>
Kristian Nielsen <knielsen@knielsen-hq.org>
This reverts commit c37b2087b4.
In c37b20887, when re-binlogging a GTID event on a replica,
it will overwrite the thread_id from the primary to be the
value of the slave applier (SQL thread or parallel worker).
This should be the value of the original thread_id on the
master connection though, to both help track temporary
tables, and be consistent with Query_log_event.
Reverting the commit to re-target 11.5, so we can re-test
with the corrected thread_id.
This patch augments Gtid_log_event with the user thread-id.
In particular that compensates for the loss of this info in
Rows_log_events.
Gtid_log_event::thread_id gets visible in mysqlbinlog output like
#231025 16:21:45 server id 1 end_log_pos 537 CRC32 0x1cf1d963 GTID 0-1-2 ddl thread_id=10
as 64 bit unsigned integer.
While the size of Gtid event has grown by 8-9 bytes
replication from OLD <-> NEW is not affected by it.
This work was started by the late Sujatha Sivakumar.
Brandon Nesterenko took it over, reviewed initial patches and extended
the work.
Reviewed-by: <andrei.elkin@mariadb.com>
Commit a923d6f49c disabled numeric setting
of character_set_* variables with non-default values:
MariaDB [(none)]> set character_set_client=224;
ERROR 1115 (42000): Unknown character set: '224'
However the corresponding binlog functionality still write numeric
values for log event, and this will break binlog replay if the value is
not default. Now make the server use 'String' type for
'character_set_client' when generating binlog events
Before:
/*!\C utf8mb4 *//*!*/;
SET @@session.character_set_client=224,@@session.collation_connection=224,@@session.collation_server=33/*!*/;
After:
/*!\C utf8mb4 *//*!*/;
SET @@session.character_set_client=utf8mb4,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
Note: prior to the previous commit, setting with '224' or '45' or
'utf8mb4' have the same effect, as they all set the parameter to
'utf8mb4'.
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
- Major rewrite of ddl_log.cc and ddl_log.h
- ddl_log.cc described in the beginning how the recovery works.
- ddl_log.log has unique signature and is dynamic. It's easy to
add more information to the header and other ddl blocks while still
being able to execute old ddl entries.
- IO_SIZE for ddl blocks is now dynamic. Can be changed without affecting
recovery of old logs.
- Code is more modular and is now usable outside of partition handling.
- Renamed log file to dll_recovery.log and added option --log-ddl-recovery
to allow one to specify the path & filename.
- Added ddl_log_entry_phase[], number of phases for each DDL action,
which allowed me to greatly simply set_global_from_ddl_log_entry()
- Changed how strings are stored in log entries, which allows us to
store much more information in a log entry.
- ddl log is now always created at start and deleted on normal shutdown.
This simplices things notable.
- Added probes debug_crash_here() and debug_simulate_error() to simply
crash testing and allow crash after a given number of times a probe
is executed. See comments in debug_sync.cc and rename_table.test for
how this can be used.
- Reverting failed table and view renames is done trough the ddl log.
This ensures that the ddl log is tested also outside of recovery.
- Added helper function 'handler::needs_lower_case_filenames()'
- Extend binary log with Q_XID events. ddl log handling is using this
to check if a ddl log entry was logged to the binary log (if yes,
it will be deleted from the log during ddl_log_close_binlogged_events()
- If a DDL entry fails 3 time, disable it. This is to ensure that if
we have a crash in ddl recovery code the server will not get stuck
in a forever crash-restart-crash loop.
mysqltest.cc changes:
- --die will now replace $variables with their values
- $error will contain the error of the last failed statement
storage engine changes:
- maria_rename() was changed to be more robust against crashes during
rename.
Problem:
========
During point in time recovery of binary log syntax error is reported for
BEGIN statement and recovery fails.
Analysis:
=========
In MariaDB 10.3 and later, setting the sql_mode system variable to Oracle
allows the server to understand a subset of Oracle's PL/SQL language. When
sql_mode=ORACLE is set, it switches the parser from the MariaDB parser to
Oracle compatible parser. With this change 'BEGIN' is not considered as
'START TRANSACTION'. Hence the syntax error is reported.
Fix:
===
At preset 'BEGIN' query is generated from 'Gtid_log_event::print'. The current
session specific 'sql_mode' information is not present as part of
'Gtid_log_event'. If it was available then, mysqlbinlog tool can make use of
'sql_mode == ORACLE' and can output "START TRANSACTION" in this particular
mode and for other sql_modes it will write "BEGIN" as part of output. Since it
is not available 'mysqlbinlog' tool will output all 'BEGIN' statements as
'START TRANSACTION' irrespective of 'sql_mode'.
MDEV-19964 S3 replication support
Added new configure options:
s3_slave_ignore_updates
"If the slave has shares same S3 storage as the master"
s3_replicate_alter_as_create_select
"When converting S3 table to local table, log all rows in binary log"
This allows on to configure slaves to have the S3 storage shared or
independent from the master.
Other thing:
Added new session variable '@@sql_if_exists' to force IF_EXIST to DDL's.
Problem:
=======
rpl_blackhole.test fails when executed with following options
mysqld=--binlog_annotate_row_events=1, mysqld=--replicate_annotate_row_events=1
Test output:
------------
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
rpl.rpl_blackhole_bug 'mix' [ pass ] 791
rpl.rpl_blackhole_bug 'row' [ fail ]
Replicate_Wild_Ignore_Table
Last_Errno 1032
Last_Error Could not execute Update_rows_v1 event on table test.t1; Can't find
record in 't1', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's
master log master-bin.000001, end_log_pos 1510
Analysis:
=========
Enabling "replicate_annotate_row_events" on slave, Tells the slave to write
annotate rows events received from the master to its own binary log. The
received annotate events are applied after the Gtid event as shown below.
thd->query() will be set to the actual query received from the master, through
annotate event. Annotate_rows event should not be deleted after the event is
applied as the thd->query will be used to generate new Annotate_rows event
during applying the subsequent Rows events. After the last Rows event has been
applied, the saved Annotate_rows event (if any) will be deleted.
In balckhole engine all the DML operations are noops as they donot store any
data. They simply return success without doing any operation. But the existing
strictly expects thd->query() to be 'NULL' to identify that row based
replication is in use. This assumption will fail when row annotations are
enabled as the query is not 'NULL'. Hence various row based operations like
'update', 'delete', 'index lookup' will fail when row annotations are enabled.
Fix:
===
Extend the row based replication check to include row annotations as well.
i.e Either the thd->query() is NULL or thd->query() points to query and row
annotations are in use.