Parallel slave failed to retry in retry_event_group() with error
WSREP: Parallel slave worker failed at wsrep_before_command() hook
Fix wsrep transaction cleanup/restart in retry_event_group() to properly
clean up previous transaction by calling wsrep_after_statement().
Also move call to reset error after call to wsrep_after_statement()
to make sure that it remains effective.
Add a MTR test galera_as_slave_parallel_retry to reproduce the error
when the fix is not present.
Other issues which were detected when testing with sysbench:
Check if parallel slave is killed for retry before waiting for prior
commits in THD::wsrep_parallel_slave_wait_for_prior_commit(). This
is required with slave-parallel-mode=optimistic to avoid deadlock
when a slave later in commit order manages to reach prepare phase
before a lock conflict is detected.
Suppress wsrep applier specific warning for slave threads.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Test failed sporadically when --ps-protocol was enabled:
a transaction that was BF aborted on COMMIT would succeed
instead of reporting the expected deadlock error.
The reason for the failure was that, depending on timing,
the transaction was BF aborted while the COMMIT statement
was being prepared through a COM_STMT_PREPARE command.
In the failing cases, the transaction was BF aborted
after COM_STMT_PREPARE had already disabled the diagnostics
area of the client. Attempt to override the deadlock error
towards the end of dispatch_command() would be skipped,
resulting in a successful COMMIT even if the transaction
is aborted.
This bug affected the following MTR tests:
- galera_insert_multi
- galera_nopk_unicode
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
I could not reproduce reported assertion. However, I could
reporoduce test failure because missing wait_condition
and error in test case. Added missing wait_condition and
fixed error in test case to make test stable.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
For some reason InnoDB was disabled at --wsrep-recover call.
Added --loose-innodb to command like to make sure InnoDB is
enabled.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Make sure that node_1 remains in primary view by increasing it's
weight. Add suppression on expected warnings because we kill
node_2.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Ignoring configured server_id should not be a warning because
correct configuration is documented. Changed message to info
level with more detailed message what was configured and
what will be actually used.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
While applying CTAS log event, we peek the relay log to see if CTAS
contains inserted rows or if it's empty.
The peek function didn't check for end-of-file condition when tried to
get the next event from the log, and thus it hanged.
The fix includes checking for end-of-file while peeking for log events
and considering returned XID_EVENT value as a sign of an empty CTAS.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Clean up configuration and tests. Add wait conditions to make
sure test continues from clean state.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Don't allow the referencing key column from NULL TO NOT NULL
when
1) Foreign key constraint type is ON UPDATE SET NULL
2) Foreign key constraint type is ON DELETE SET NULL
3) Foreign key constraint type is UPDATE CASCADE and referenced
column declared as NULL
Don't allow the referenced key column from NOT NULL to NULL
when foreign key constraint type is UPDATE CASCADE
and referencing key columns doesn't allow NULL values
get_foreign_key_info(): InnoDB sends the information about
nullability of the foreign key fields and referenced key fields.
fk_check_column_changes(): Enforce the above rules for COPY
algorithm
innobase_check_foreign_drop_col(): Checks whether the dropped
column exists in existing foreign key relation
innobase_check_foreign_low() : Enforce the above rules for
INPLACE algorithm
dict_foreign_t::check_fk_constraint_valid(): This is used
by CREATE TABLE statement to check nullability for foreign
key relation.
Added new test scenario in galera.galera_bf_kill
test to make the issue surface. The tetst scenario has
a multi statement transaction containing a KILL command.
When the KILL is submitted, another transaction is
replicated, which causes BF abort for the KILL command
processing. Handling BF abort rollback while executing
KILL command causes node hanging, in this scenario.
sql_kill() and sql_kill_user() functions have now fix,
to perform implicit commit before starting the KILL command
execution. BEcause of the implicit commit, the KILL execution
will not happen inside transaction context anymore.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Add wait_until_ready waits after wsrep_on is set on again to
make sure that node is ready for next step before continuing.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Stabilize test by reseting DEBUG_SYNC and add wait_condition
for expected table contents.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
It's possible that MDL conflict handling code is called more
than once for a transaction when:
- it holds more than one conflicting MDL lock
- reschedule_waiters() is executed,
which results in repeated attempts to BF-abort already aborted
transaction.
In such situations, it might be that BF-aborting logic sees
a partially rolled back transaction and erroneously decides
on future actions for such a transaction.
The specific situation tested and fixed is when a SR transaction
applied in the node gets BF-aborted by a started TOI operation.
It's then caught with the server transaction already rolled back,
but with no MDL locks yet released. This caused wrong state
detection for such a transaction during repeated MDL conflict
handling code execution.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
* Fixes galera.galera_bf_kill_debug test case.
* Enable galera_ssl_upgrade, galera_ssl_reload, galera_pc_bootstrap
* Add MDEV to disabled tests that miss it
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Problem was that wsrep_schema tables were not marked as
category information. Fix allows access to wsrep_schema
tables even when node is detached.
This is 10.4-10.9 version of fix.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
int wsrep_thd_append_key(THD*, const wsrep_key*, int, Wsrep_service_key_type)
CREATE TABLE [SELECT|REPLACE SELECT] is CTAS and idea was that
we force ROW format. However, it was not correctly enforced
and keys were appended before wsrep transaction was started.
At THD::decide_logging_format we should force used stmt binlog
format to ROW in CTAS case and produce a warning if used
binlog format was not ROW.
At ha_innodb::update_row we should not append keys similarly
as in ha_innodb::write_row if sql_command is SQLCOM_CREATE_TABLE.
Improved error logging on ::write_row, ::update_row and ::delete_row
if wsrep key append fails.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Replication of MyISAM and Aria DML is experimental and best
effort only. Earlier change make INSERT SELECT on both
MyISAM and Aria to replicate using TOI and STATEMENT
replication. Replication should happen only if user
has set needed wsrep_mode setting.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Modified test configuration file to use wsrep_sync_wait
to make sure committed transactions are replicated before
next operation.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
The problem was in error message suppression, which did not match
the actual warning messages, due to bad quotations.
Changed warnings message suppressions to more simple format.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
DML transactions on FK-child tables also get table locks
on FK-parent tables. If there is a DML transaction holding
such a lock, and a TOI transaction starts, the latter
BF-aborts the former and puts itself into a waiting state.
If at this moment another DML transaction on FK-child table
starts, it doesn't check that the transaction waiting on
a parent table lock is TOI, and it erroneously BF-aborts
the waiting TOI transaction.
The fix: don't roll back high-priority transaction waiting
on a lock in InnoDB, instead roll back an incoming DML
transaction.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
New error codes can only be added in the latest major version.
Adding ER_KILL_DENIED_HIGH_PRIORITY would shift by one all
error codes that were added in MariaDB Server 10.6 or later.
This amends commit 1001dae186
Suggested by: Sergei Golubchik
Changed error code for Galera unkillable threads to
be ER_KILL_DENIED_HIGH_PRIORITY giving message
This is a high priority thread/query and cannot be killed
without the compromising consistency of the cluster
also a warning is produced
Thread %lld is [wsrep applier|high priority] and cannot be killed
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
This is regression from commit 3228c08fa8. Problem is that
when table storage engine is determined there should be
check is table partitioned and if it is then determine
partition implementing storage engine.
Reported bug is reproducible only with --log-bin so make
sure tests changed by 3228c08fa8 and new test are run
with --log-bin and binlog disabled.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Avoid starting transactions in wsrep-lib side when wsrep is
disabled. It is unnecessary, and causes spurious deadlock errors on
transaction clean up.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Problem was that updates to mysql.gtid_slave_pos table were
replicated even when they were newer used and because that
newer deleted. Avoid replication of mysql.gtid_slave_pos
table if wsrep_gtid_mode=OFF.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
The test that triggers multi-master conflict between two CTAS commands
uses LOCK/UNLOCK TABLES to block local CTAS from progress. It could
result in a race when UNLOCK TABLES command is issued a bit earlier
then needed, causing local CTAS to run further and change wsrep
transaction state, so that a different code path is taken later and
the original error gets overridden, causing the test to fail.
The solution is to replace LOCK/UNLOCK TABLES with debug sync points.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
on disable_indexes(HA_KEY_SWITCH_NONUNIQ_SAVE) the engine does
not know that the long unique is logically unique, because on the
engine level it is not. And the engine disables it,
Change the disable_indexes/enable_indexes API. Instead of the enum
mode, send a key_map of indexes that should be enabled. This way the
server will decide what is unique, not the engine.
Replicated events have time associated with them from originating
node which will be used for commit timestamp. Associated time can
be set in past before event is even applied.
For WSREP replication we don't need to use time information from
event.
Addressed review comments:
Jan Lindström <jan.lindstrom@galeracluster.com>
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Tests using MW-369.inc sometimes hanged after
signaling two debug sync points inside a Galera
library. Replaced Galera library sync point
with server code sync point when possible and
added more wait_conditions to make sure we are
in correct state.
Tests effected: MW-369, MW-402, MDEV-27276, and
mysql-wsrep#332.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
This commit contains a fix for the code that extracts and parses
the CN (common name, domain name) record from certificates using
the openssl utility. This code is also made common to the rsync
and mariabackup scripts. There is also some systematization of
the use of 'printf' and 'echo' builtins/utilities.
On Windows systems, occurrences of ERROR_SHARING_VIOLATION due to
conflicting share modes between processes accessing the same file can
result in CreateFile failures.
mysys' my_open() already incorporates a workaround by implementing
wait/retry logic on Windows.
But this does not help if files are opened using shell redirection like
mysqltest traditionally did it, i.e via
--echo exec "some text" > output_file
In such cases, it is cmd.exe, that opens the output_file, and it
won't do any sharing-violation retries.
This commit addresses the issue by introducing a new built-in command,
'write_line', in mysqltest. This new command serves as a brief alternative
to 'write_file', with a single line output, that also resolves variables
like "exec" would.
Internally, this command will use my_open(), and therefore retry-on-error
logic.
Hopefully this will eliminate the very sporadic "can't open file because
it is used by another process" error on CI.