Problem: master executed a statement that would fail on slave
(namely, DROP USER 'create_rout_db'@'localhost').
Then the test did:
--let $rpl_only_running_threads= 1
--source include/rpl_reset.inc
rpl_reset.inc calls rpl_sync.inc, which first checks which of
the threads are running and then syncs those threads that are
running. If the SQL thread fails after the check, the sync will
fail. So there was a race in the test and it failed on some
slow hosts.
Fix: Don't replicate the failing statement.
Normally, auto_increment value is generated for the column by
inserting either NULL or 0 into it. NO_AUTO_VALUE_ON_ZERO
suppresses this behavior for 0 so that only NULL generates
the auto_increment value. This behavior is also followed by
a slave, specifically by the SQL Thread, when applying events
in the statement format from a master. However, when applying
events in the row format, the flag was ignored thus causing
an assertion failure:
"Assertion failed: next_insert_id == 0, file .\handler.cc"
In fact, we never need to generate a auto_increment value for
the column when applying events in row format on slave. So we
don't allow it to happen by using 'MODE_NO_AUTO_VALUE_ON_ZERO'.
Refactoring: Get rid of all the sql_mode checks to rows_log_event
when applying it for avoiding problems caused by the inconsistency
of the sql_mode on slave and master as the sql_mode is not set for
Rows_log_event.
Problem: Warnings for truncated data were generated on hosts with
long host names because @@hostname was inserted into a CHAR(40) column.
Fix: Change CHAR(40) to TEXT.
Major replication test framework cleanup. This does the following:
- Ensure that all tests clean up the replication state when they
finish, by making check-testcase check the output of SHOW SLAVE STATUS.
This implies:
- Slave must not be running after test finished. This is good
because it removes the risk for sporadic errors in subsequent
tests when a test forgets to sync correctly.
- Slave SQL and IO errors must be cleared when test ends. This is
good because we will notice if a test gets an unexpected error in
the slave threads near the end.
- We no longer have to clean up before a test starts.
- Ensure that all tests that wait for an error in one of the slave
threads waits for a specific error. It is no longer possible to
source wait_for_slave_[sql|io]_to_stop.inc when there is an error
in one of the slave threads. This is good because:
- If a test expects an error but there is a bug that causes
another error to happen, or if it stops the slave thread without
an error, then we will notice.
- When developing tests, wait_for_*_to_[start|stop].inc will fail
immediately if there is an error in the relevant slave thread.
Before this patch, we had to wait for the timeout.
- Remove duplicated and repeated code for setting up unusual replication
topologies. Now, there is a single file that is capable of setting
up arbitrary topologies (include/rpl_init.inc, but
include/master-slave.inc is still available for the most common
topology). Tests can now end with include/rpl_end.inc, which will clean
up correctly no matter what topology is used. The topology can be
changed with include/rpl_change_topology.inc.
- Improved debug information when tests fail. This includes:
- debug info is printed on all servers configured by include/rpl_init.inc
- User can set $rpl_debug=1, which makes auxiliary replication files
print relevant debug info.
- Improved documentation for all auxiliary replication files. Now they
describe purpose, usage, parameters, and side effects.
- Many small code cleanups:
- Made have_innodb.inc output a sensible error message.
- Moved contents of rpl000017-slave.sh into rpl000017.test
- Added mysqltest variables that expose the current state of
disable_warnings/enable_warnings and friends.
- Too many to list here: see per-file comments for details.
Post-push fixes:
- fixed platform dependent result files
- appeasing valgrind warnings:
Fault injection was also uncovering a previously existing
potential mem leaks. For BUG#46166 testing purposes, fixed
by forcing handling the leak when injecting faults.
"load data infile .." allowed for access to
unautohorized tables.
Due to a faulty if-statement it was possible to
circumvent the secure_file_priv restriction.
--Bug#52157 various crashes and assertions with multi-table update, stored function
--Bug#54475 improper error handling causes cascading crashing failures in innodb/ndb
--Bug#57703 create view cause Assertion failed: 0, file .\item_subselect.cc, line 846
--Bug#57352 valgrind warnings when creating view
--Recently discovered problem when a nested materialized derived table is used
before being populated and it leads to incorrect result
We have several modes when we should disable subquery evaluation.
The reasons for disabling are different. It could be
uselessness of the evaluation as in case of 'CREATE VIEW'
or 'PREPARE stmt', or we should disable subquery evaluation
if tables are not locked yet as it happens in bug#54475, or
too early evaluation of subqueries can lead to wrong result
as it happened in Bug#19077.
Main problem is that if subquery items are treated as const
they are evaluated in ::fix_fields(), ::fix_length_and_dec()
of the parental items as a lot of these methods have
Item::val_...() calls inside.
We have to make subqueries non-const to prevent unnecessary
subquery evaluation. At the moment we have different methods
for this. Here is a list of these modes:
1. PREPARE stmt;
We use UNCACHEABLE_PREPARE flag.
It is set during parsing in sql_parse.cc, mysql_new_select() for
each SELECT_LEX object and cleared at the end of PREPARE in
sql_prepare.cc, init_stmt_after_parse(). If this flag is set
subquery becomes non-const and evaluation does not happen.
2. CREATE|ALTER VIEW, SHOW CREATE VIEW, I_S tables which
process FRM files
We use LEX::view_prepare_mode field. We set it before
view preparation and check this flag in
::fix_fields(), ::fix_length_and_dec().
Some bugs are fixed using this approach,
some are not(Bug#57352, Bug#57703). The problem here is
that we have a lot of ::fix_fields(), ::fix_length_and_dec()
where we use Item::val_...() calls for const items.
3. Derived tables with subquery = wrong result(Bug19077)
The reason of this bug is too early subquery evaluation.
It was fixed by adding Item::with_subselect field
The check of this field in appropriate places prevents
const item evaluation if the item have subquery.
The fix for Bug19077 fixes only the problem with
convert_constant_item() function and does not cover
other places(::fix_fields(), ::fix_length_and_dec() again)
where subqueries could be evaluated.
Example:
CREATE TABLE t1 (i INT, j BIGINT);
INSERT INTO t1 VALUES (1, 2), (2, 2), (3, 2);
SELECT * FROM (SELECT MIN(i) FROM t1
WHERE j = SUBSTRING('12', (SELECT * FROM (SELECT MIN(j) FROM t1) t2))) t3;
DROP TABLE t1;
4. Derived tables with subquery where subquery
is evaluated before table locking(Bug#54475, Bug#52157)
Suggested solution is following:
-Introduce new field LEX::context_analysis_only with the following
possible flags:
#define CONTEXT_ANALYSIS_ONLY_PREPARE 1
#define CONTEXT_ANALYSIS_ONLY_VIEW 2
#define CONTEXT_ANALYSIS_ONLY_DERIVED 4
-Set/clean these flags when we perform
context analysis operation
-Item_subselect::const_item() returns
result depending on LEX::context_analysis_only.
If context_analysis_only is set then we return
FALSE that means that subquery is non-const.
As all subquery types are wrapped by Item_subselect
it allow as to make subquery non-const when
it's necessary.
Auto increment value wraps when performing a bulk insert with
auto_increment_increment and auto_increment_offset greater than
one.
The fix:
If overflow happened then return MAX_ULONGLONG value as an
indication of overflow and check this before storing the
value into the field in update_auto_increment().
When a query fails with a different error on the slave,
the sql thread outputs a message (M) containing:
1. the error message format for the master error code
2. the master error code
3. the error message for the slave's error code
4. the slave error code
Given that the slave has no information on the error message
itself that the master outputs, it can only print its own
version of the message format (but stripped from the
additional data if the message format requires). This may
confuse users.
To fix this we augment the slave's message (M) to explicitly
state that the master's message is actually an error message
format, the one associated with the given master error code
and that the slave server knows about.
when generating new name.
If find_uniq_filename returns an error, then this error is not
being propagated upwards, and execution does not report error to
the user (although a entry in the error log is generated).
Additionally, some more errors were ignored in new_file_impl:
- when writing the rotate event
- when reopening the index and binary log file
This patch addresses this by propagating the error up in the
execution stack. Furthermore, when rotation of the binary log
fails, an incident event is written, because there may be a
chance that some changes for a given statement, were not properly
logged. For example, in SBR, LOAD DATA INFILE statement requires
more than one event to be logged, should rotation fail while
logging part of the LOAD DATA events, then the logged data would
become inconsistent with the data in the storage engine.
InnoDB AUTOINC code expects the locks to be released in strict reverse order
at the end of the statement. However, nested stored proedures and partition
tables break this rule. We now allow the locks to be deleted from the
trx->autoinc_locks vector in any order but optimise for the common (old) case.
rb://441 Approved by Marko Makela
win x86 debug_max
The windows MTR run exhibited a different test execution
ordering (due to the fact that in these platforms MTR is invoked
with --parallel > 1). This uncovered a bug in the aforementioned
test case, which is triggered by the following conditions:
1. server is not restarted between two different tests;
2. the test before binlog.binlog_row_failure_mixing_engines
issues flush logs;
3. binlog.binlog_row_failure_mixing_engines uses binlog
positions to limit the output of show_binlog_events;
4. binlog.binlog_row_failure_mixing_engines does not state which
binlog file to use, thence it uses a wrong binlog file with
the correct position.
There are two possible fixes: 1. make sure that the test start
from a clean slate - binlog wise; 2. in addition to the position,
also state the binary log file before sourcing
show_binlog_events.inc .
We go for fix#1, ie, deploy a RESET MASTER before the test is
actually started.
When using BINLOG statement to execute rows log events, session variables
foreign_key_checks and unique_checks are changed temporarily. As each rows
log event has their own special session environment and its own
foreign_key_checks and unique_checks can be different from current session
which executing the BINLOG statement. But these variables are not restored
correctly after BINLOG statement. This problem will cause that the following
statements fail or generate unexpected data.
In this patch, code is added to backup and restore these two variables.
So BINLOG statement will not affect current session's variables again.
In case of low memory sort buffer QUICK_INDEX_MERGE_SELECT creates
temporary file where is stores row ids which meet QUICK_SELECT ranges
except of clustered pk range, clustered range is processed separately.
In init_read_record we check if temporary file is used and choose
appropriate record access method. It does not take into account that
temporary file contains partial result in case of QUICK_INDEX_MERGE_SELECT
with clustered pk range.
The fix is always to use rr_quick if QUICK_INDEX_MERGE_SELECT
with clustered pk range is used.
The test result differs on windows, since
it writes out 'localhost:<port>' instead of
only 'localhost', since it uses tcp/ip instead
of unix sockets on windows.
Fixed by replacing that column.
Also requires --big-test from some long running tests
and added a weekly run of all test requiring --big-test.
with on duplicate key update
There was a missed corner case in the partitioning
handler, which caused the next_insert_id to be changed
in the second level handlers (i.e the hander of a partition),
which caused this debug assertion.
The solution was to always ensure that only the partitioning
level generates auto_increment values, since if it was done
within a partition, it may fail to match the partition
function.
ALTER TABLE RENAME, DISABLE KEYS.
The code of ALTER TABLE RENAME, DISABLE KEYS could
issue a commit while holding LOCK_open mutex.
This is a regression introduced by the fix for
Bug 54453.
This failed an assert guarding us against a potential
deadlock with connections trying to execute
FLUSH TABLES WITH READ LOCK.
The fix is to move acquisition of LOCK_open outside
the section that issues ha_autocommit_or_rollback().
LOCK_open is taken to protect against concurrent
operations with .frms and the table definition
cache, and doesn't need to cover the call to commit.
A test case added to innodb_mysql.test.
The patch is to be null-merged to 5.5, which
already has 54453 null-merged to it.