Problem:
=======
rpl_blackhole.test fails when executed with following options
mysqld=--binlog_annotate_row_events=1, mysqld=--replicate_annotate_row_events=1
Test output:
------------
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
rpl.rpl_blackhole_bug 'mix' [ pass ] 791
rpl.rpl_blackhole_bug 'row' [ fail ]
Replicate_Wild_Ignore_Table
Last_Errno 1032
Last_Error Could not execute Update_rows_v1 event on table test.t1; Can't find
record in 't1', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's
master log master-bin.000001, end_log_pos 1510
Analysis:
=========
Enabling "replicate_annotate_row_events" on slave, Tells the slave to write
annotate rows events received from the master to its own binary log. The
received annotate events are applied after the Gtid event as shown below.
thd->query() will be set to the actual query received from the master, through
annotate event. Annotate_rows event should not be deleted after the event is
applied as the thd->query will be used to generate new Annotate_rows event
during applying the subsequent Rows events. After the last Rows event has been
applied, the saved Annotate_rows event (if any) will be deleted.
In balckhole engine all the DML operations are noops as they donot store any
data. They simply return success without doing any operation. But the existing
strictly expects thd->query() to be 'NULL' to identify that row based
replication is in use. This assumption will fail when row annotations are
enabled as the query is not 'NULL'. Hence various row based operations like
'update', 'delete', 'index lookup' will fail when row annotations are enabled.
Fix:
===
Extend the row based replication check to include row annotations as well.
i.e Either the thd->query() is NULL or thd->query() points to query and row
annotations are in use.
Simulation of a big-sized event in rpl.rpl_semi_sync_skip_repl did not clean
up after itself so screw the last binlog event offset which could jump
backwards.
The test is refined to rotate a binlog file with simulation and use the next
one for logics of the test incl master-slave synchonization.
Problem:
========
We have a Master/Master Setup on two servers, but are only writing to one of
those servers (so it is essentially Master/Slave) We upgraded from 10.1.* to
10.2.22 last week and starting with the upgrade, we are getting duplicate key
errors on the slave. BINLOG=mixed.
Analysis:
=========
This issue happens with LOCK TABLES and binlog_format=MIXED combination. When an
UNSAFE statement is encountered in 'MIXED' mode, it is logged in the form of
'ROW' format. For all the tables that are part of LOCK TABLES list their table maps
are written into the binary log. For each table in the list a check is
done to see if 'check_table_binlog_row_based_done' flag is set or not. If it is not set
a check process is initiated to see if table qualifies for row based binary
logging or not and 'check_table_binlog_row_based_done' is set. This flag will be
cleared at the time of closing thread tables.
But there can be special cases where the LOCK TABLES contains more number of
tables but the unsafe query is actually using subset of tables from LOCK TABLES
list.
For example: LOCK TABLES locks t1,t2,t3 but the unsafe statement makes use of
only two tables t1,t3. In this case the 'check_table_binlog_row_based_done' flag
is enabled for table 't2' while writing table map, but 'close_thread_tables'
function call will not reset this flag. Since the flag is not cleared for table
't2' even a safe statement which used t2 will be logged in the form of row based
format.
This leads to an assert on debug builds and causes duplicate entries in release
builds. In release builds a statement is logged in the form of both ROW and
STATEMENT format. This causes the slave to fail with duplicate key error.
Fix:
===
During 'close_thread_tables' when LOCK TABLE modes are active "ha_reset" is done
for all the tables which were part of current statement. As mentioned in the
example 'ha_reset' is called for tables 't1' and 't3'. This will clear the
'check_table_binlog_row_based_done' flag. At this point add a check for the rest
of the tables to see if 'check_table_binlog_row_based_done' is enabled or not.
If enabled clear the flag.
The patch fixes a fired assert in the semisync master module. The assert
caught attempt to switch semisync off (per rpl_semi_sync_master_wait_no_slave = OFF)
when it was not even initialized (per rpl_semi_sync_master_enabled = OFF).
The switching-off execution branch is relocated under one that executes
enable_master() first.
A minor cleaup is done to remove the int return from two functions that
did not return anything but an error which could not happen in the functions.
Problem:
========
When attempting to delay a Slave attached with GTID, there appears to be an
extra delay applied initially. For example, this output reflects a Slave that is
already delayed by 43200 seconds. When switching to GTID replication,
replication is paused until SQL_Remaining_Delay counts down to 0:
CHANGE MASTER TO master_use_gtid=current_pos; CHANGE MASTER TO
MASTER_DELAY=43200;
Seconds_Behind_Master: 44847
Using_Gtid: Current_Pos
SQL_Delay: 43200
SQL_Remaining_Delay: 43089
Slave_SQL_Running_State: Waiting until MASTER_DELAY seconds after master
executed event
Analysis:
=========
When slave initiates a GTID based connection request to master, the master sends
two GTID_LIST events. The first one is actual GTID_LIST event and the second
one is a fake GTID_LIST event. This is sent by master to provide its current
binlary log file position. The fake GTID_LIST events will have their ev->when=0.
'when' (the timestamp) is set to 0 so that slave could distinguish between real
and fake Rotate events.
On slave side when MASTER_DELAY is configured to "X" the applier will ensure
that there is a time delay of "X" seconds before the event is applied.
General behaviour of MASTER_DELAY example:-
Master
timestamp of event e1=10
timestamp of event e2=11
On slave MASTER_DELAY=5
Event e1 will be applied at = 15
e2 will be applied at =16
In bug scenario:-
On Master: With GTIDs
timestamp of event e1=10
timestamp of event e2=0
On Slave:
e1 will be applied at = 10 + 5 =15
For e2, since "e2->when=0" e2->when is set to current timestamp.
i.e since the e2->when and current timestamp on slave is the same applier waits
for additional master_delay=5 seconds. the ev->when contributes to
"rli->last_master_timestamp".
rli->last_master_timestamp= ev->when + (time_t) ev->exec_time;
Fake events should not update the "ev->when" to "current timestamp" on slave.
Fix:
===
Remove the assignment of current timestamp to "ev->when" when "ev->when=0".
select from I_S
Problem:
========
When applier thread tries to access 'variable_name' of
INFORMATION_SCHEMA.SESSION_VARIABLES table through triggers, it results in an
abnormal exit of slave server.
Analysis:
========
At the time of replication of stored routines and triggers, their associated
security context will be sent by the master. The applier thread on the slave
server will use this information to set the required security context for the
execution of stored routines and triggers. This is achieved as follows.
->The stored routine object has a member named 'm_security_ctx' which holds the
security context received from master.
->The applier thread's security_ctx is stored into a 'backup' object.
->Set the applier thread's security_ctx to 'm_security_ctx'.
->Upon the completion of stored routine execution restore the original security
context of applier thread from the backup.
During the above process the 'm_security_ctx' object is not initialized
properly. Hence the 'external_user' of 'm_security_ctx' has invalid value for
this variable and accessing this variable results in abnormal exit of server.
Fix:
===
Invoke the Security_context::init() call from the constructor of stored routine
so that 'm_security_ctx' gets initialized properly.
The patches features an optional shutdown behavior to hold on until
after all connected slaves have been sent the last binlogged event.
The connected slave is one whose START SLAVE has been acknowledged and
that was not stopped since that though it could be technically
reconnecting in background.
The solution therefore disallows killing the dump thread until is has
found EOF of the latest binlog file. It is up to the shutdown
requester (DBA) to set up a sufficiently large shutdown timeout value
for shudown to wait patiently until lagging behind slaves have been
synchronized. On the other hand if a specific slave needs exclusion
from synchronization the DBA would have to stop it manually which
would terminate its dump thread.
`mysqladmin shutdown' is extended with a `--wait_for_all_slaves' option
which translates to `SHUTDOW WAIT FOR ALL SLAVES' sql query
to enable the feature on the client side.
The patch also performs a small refactoring of the server shutdown
around close_connections() to introduce kill thread phases which
are two as of current.
post-merge changes:
* handle password expiration on old tables like everything else -
make changes in memory, even if they cannot be done on disk
* merge "debug" tests with non-debug tests, they don't use dbug anyway
* only run rpl password expiration in MIXED mode, it doesn't replicate
anything, so no need to repeat it thrice
* restore update_user_table_password() prototype, it should not change
ACL_USER, this is done in acl_user_update()
* don't parse json twice in get_password_lifetime and get_password_expired
* remove LEX_USER::is_changing_password, see if there was any auth instead
* avoid overflow in expiration calculations
* don't initialize Account_options in the constructor, it's bzero-ed later
* don't create ulong sysvars - they're not portable, prefer uint or ulonglong
* misc simplifications
32 bit int
Row-based slave applier could not parse correctly the table id when
the value exceeded the max of 32 bit unsigned int.
The reason turns out in that the being parsed value placeholder
was sized as 4 bytes.
The type is fixed to ulonglong.
Additionally the patch works around Rows_log_event::m_table_id 4 bytes
size on 32 bits platforms. In case of last_table_id value overflows
the 4 byte max, there won't be the zero value for m_table_id generated
and the first wrapped-around value is one, this is thanks to excluding
UINT_MAX32 + 1 from TABLE_SHARE::table_map_id.