* be strict in CREATE TABLE, just like in ALTER TABLE, because
CREATE TABLE, just like ALTER TABLE, can be rolled back for any engine
* but don't auto-convert warnings into errors for engine warnings
(handler::create) - this matches ALTER TABLE behavior
* and not when creating a default record, these errors are handled
specially (and replaced with ER_INVALID_DEFAULT)
* always issue a Note when a non-unique key is truncated, because it's
not a Warning that can be converted to an Error. Before this commit
it was a Note for blobs and a Warning for all other data types.
Problem:
=======
Upon deleting or updating a row in a parent table (with primary key), if
the child table has virtual column and an associated key with ON UPDATE
CASCADE/ON DELETE CASCADE, it will result in slave crash.
Analysis:
========
Tables which are related through foreign key require prelocking similar to
triggers. i.e If a table has triggers/foreign keys we should add all tables
and routines used by them to the prelocking set. This prelocking happens
during 'open_and_lock_tables' call. Each table being opened is checked for
foreign key references. If foreign key reference exists then the child
table is opened and it is linked to the table_list. Upon any modification
to parent table its corresponding child tables are retried from table_list
and they are updated accordingly. This prelocking work fine on master.
On slave prelocking works for following cases.
- Statement/mixed based replication
- In row based replication when trigger execution is enabled through
'slave_run_triggers_for_rbr=YES/LOGGING/ENFORCE'
Otherwise it results in an assert/crash, as the parent table will not find
the corresponding child table and it will be NULL. Dereferencing NULL
pointer leads to slave server exit.
Fix:
===
Introduce a new 'slave_fk_event_map' flag similar to 'trg_event_map'. This
flag will ensure that when foreign key is enabled in row based replication
all the parent and child tables are prelocked, so that parent is able to
locate the child table.
Note: This issue is specific to slave, hence only slave needs to be
upgraded.
Post push fix to address test failure.
Problem:
=======
rpl.rpl_drop_temp_table_invaid_lex added as part bug fix has occasional
failures in build bot.
MTR's internal check of the test case
'rpl.rpl_drop_temp_table_invaid_lex' failed.
Variable_name Value
-Slave_open_temp_tables 0
+Slave_open_temp_tables 1
Analysis:
=========
The reason for the failure is that the DROP TEMPORARY TABLE command which
gets generated on connection disconnect might not have reached the slave
and hence the temp table remains on the slave.
Fix:
===
On master, upon disconnect, wait till connection is completely gone. Then
ensure that DROP TEMPORARY table statement is available in the binary log.
Sync the slave with master and check that temporary table count is zero on
slave. Fixed a typo in test name.
Test we can ALTER log tables directly when not being written
to.
This removes the contraint in the rpl_mysql_upgrade.test such
that we can run mysql_upgrade --write-binlog all the way
through to a replica. We test this in the replication scenario
where the mysql.{slow,general}_log tables aren't being
written to.
Reviewers: Vicențiu Ciorbaru, Anel Husakovic
MDEV-17109 rpl.rpl_parallel_retry fails in buildbot with timeout
I was not able to prove that this fix works, but at least it simplifies
the problem as it removes some possible timing issues.
Problem:- Test case uses socket which does not work on windows, So mysqlslap
falls back to default connection which is defined in my.cnf and connects
to master instead of slave. Since start slave/stop slave is executed on
master we get error MASTER_HOST was not set
Solution:- Use Port instead of socket
Analysis:
========
Slave server will send COM_REGISTER_SLAVE command at the time of establishing
a connection to master. If master is down, then the command will fail and
COM_REGISTER_SLAVE failed warning is reported.
'rpl_binlog_index.test' shutsdown the master and it relocates binary logs to a
new location and attempts to start master by pointing 'log-bin' to new
location. During this process the slave threads are active. IO thread actively
checks for the presence of master when it finds that the connection is lost it
attempts a reconnect, as master is down COM_REGISTER_SLAVE command fails.
As part of fix, stop the slave threads and then shutdown the master and do the
binlog relocation. Once master is restarted start the slave threads and sync
them with the master. In test binary logs and index files on master are
relocated to /tmpdir but during master restart only --log-bin option is
provided, this is incorrect. Even --log-bin-index also should be pointed to
/tmpdir otherwise upon master server restart two index files will be created.
One master-bin.index in /tmpdir and a new master-bin.index as per log_basename
in datadir. Due to this slave will fail to connect to master.
'rpl_gtid_crash.test' tests following scenario "crashing master, causing slave
IO thread to reconnect while SQL thread is running". When IO thread tries to
connect to crashed master on slow platforms COM_REGISTER_SLAVE command fails.
This is expected hence the warning should be added to suppression list.
Shutdown of mtr tests may be too impatient, esp on CI environment where
10 seconds of `arg` of `shutdown_server arg` may not be enough for the clean
shutdown to complete.
This is fixed to remove explicit non-zero timeout argument to
`shutdown_server` from all mtr tests. mysqltest computes 60 seconds default
value for the timeout for the argless `shutdown_server` command.
This policy is additionally ensured with a compile time assert.
Problem:- rpl_parallel2 was failing non-deterministically
Analysis:-
When FLUSH TABLES WITH READ LOCK is executed, it will allow all worker
threads to complete their ongoing transactions and then it will pause them.
At this state FTWRL will proceed to acquire global read lock. FTWRL first
blocks threads from starting new commits, then upgrades the lock to block
commit of existing transactions.
Step1:
FLUSH TABLES WITH READ LOCK - Blocks new commits
Step2:
* STOP SLAVE command enables 'force_abort=1' which unblocks workers,
they continue to execute events.
* T1: Waits in 'record_gtid' call to update 'gtid_slave_pos' table with
its current GTID, but it is blocked becuase of Step1.
* T2: Holds COMMIT lock and waits for T1 to commit.
Step3:
FLUSH TABLES WITH READ LOCK - Waiting to get BLOCK_COMMIT.
This results in deadlock. When STOP SLAVE command allows paused workers to
proceed, workers should skip the execution of all further events, similar
to 'conservative' parallel mode.
Solution:-
We will assign 1 to skip_event_group when we are aborted in do_ftwrl_wait.
rpl_parallel_entry->pause_sub_id is only reset when force_abort is off in
rpl_pause_after_ftwrl.