rpl_packet sporadically failed with a timeout on PB when stopping the
slave. The root cause was that STOP SLAVE stopped the
IO thread first and then the SQL thread. The IO thread could
stop after replicating only part of a
transaction that the SQL thread was executing. The SQL thread would then
hang if the transaction could not be rolled back safely.
After this patch, STOP SLAVE stops the SQL thread first and then the IO
thread, which guarantees that the IO thread fetches the rest of the
events of the transaction that the SQL thread is executing, so that the SQL
thread can finish the transaction if it cannot be rolled back safely.
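The effect of the stop order can be illustrated with a toy model. This is
only a sketch: the function name and its parameters are hypothetical and
it does not reflect the server's actual threading code.

```python
def sql_thread_can_finish(events_in_txn, fetched, io_first):
    """Toy model of STOP SLAVE; not the server implementation.

    events_in_txn: number of events in the transaction being applied
    fetched:       events already in the relay log when STOP SLAVE runs
    io_first:      True models the old order (IO thread stopped first)
    """
    if io_first:
        # Old order: the IO thread stops at once, so the relay log
        # never receives the remaining events of the transaction.
        available = fetched
    else:
        # New order: the IO thread keeps fetching while the SQL thread
        # winds down, so the rest of the transaction still arrives.
        available = events_in_txn
    return available >= events_in_txn


# A 5-event transaction of which only 3 events were fetched so far.
old_order = sql_thread_can_finish(5, 3, io_first=True)
new_order = sql_thread_can_finish(5, 3, io_first=False)
```

In this model the old order leaves the SQL thread waiting for events that
never arrive, while the new order lets it finish.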
Added the auxiliary files below to make the test code neater.
restart_slave_sql.inc
rpl_connection_master.inc
rpl_connection_slave.inc
rpl_connection_slave1.inc
Undoing the patch: it complicates the code but does not solve the problem.
I do not believe a newline mismatch could be the cause of this failure.
First, I cannot see how this could be a problem: mtr ignores the newline
when reading the expect file, and the file is both written and read on Windows.
Second, if this really were the problem, the failure would have been deterministic:
either the newline is correctly interpreted or it is not.
Backported the fix to 5.1.
Problem: the auxiliary test files rpl_start_server.inc and rpl_stop_server.inc
write a file that is later read by mtr. The bug was that the file was written
with platform-dependent newline terminators, i.e., \r\n on Windows, whereas mtr
only understands \n.
Fix: write the file so that it uses \n on all platforms.
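As an illustration of the idea (not the code used in the .inc files), the
following Python sketch writes a file with \n terminators on every
platform. The helper names and the file content "restart" are assumptions
made for the example.

```python
import os
import tempfile


def write_expect_file(path, command):
    # newline="\n" turns off the translation of "\n" to os.linesep,
    # so the file gets a bare "\n" even on Windows.
    with open(path, "w", newline="\n") as f:
        f.write(command + "\n")


def read_raw(path):
    # Read the bytes back untranslated to inspect the terminator.
    with open(path, "rb") as f:
        return f.read()


path = os.path.join(tempfile.mkdtemp(), "mysqld.1.expect")
write_expect_file(path, "restart")
assert read_raw(path) == b"restart\n"
```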
The test started failing on the same day the patch for BUG 49978 was
pushed. BUG 49978 changed part of the replication testing
infrastructure in mysql-test-run. This caused the test to fail
sporadically with result differences in relay log file
names. When the test fails, the relay-log filenames are shifted by
one, e.g.:
-show relaylog events in 'slave-relay-bin.000002' from <binlog_start>;
+show relaylog events in 'slave-relay-bin.000003' from <binlog_start>;
The problem was caused by a bad cleanup when using the include
files:
- include/setup_fake_relay_log.inc
- include/cleanup_fake_relay_log.inc
These would leave a spurious relay-log file around (not listed in
slave-relay-bin.index), causing the server to shift the name of
the relay logs by one, even when cleaning up with RESET SLAVE.
We fix this by removing the relay-log file when it is no longer
needed, i.e., at setup time and after recreating the fake relay-log
index.
Additionally, to make the affected test more resilient, we
deployed a call to rpl_reset.inc (which resets both master and
slave, including log files) before actually running the test
case.
Finally, apart from the reported bug, we also fix: (a) an
unrelated issue with the failing test itself (in some cases, the
test did not set the log file name to use when it should have);
(b) one typo.
Problem: DATE_ADD() is a hybrid function and can return
DATE, DATETIME or VARCHAR data type depending on arguments.
In case of VARCHAR data type, DATE_ADD() reported "binary" character set,
which was wrong.
Fix: make DATE_ADD() return @character_set_connection in VARCHAR context.
@ mysql-test/include/ctype_numconv.inc
Adding tests
@ mysql-test/r/ctype_binary.result
Adding tests
@ mysql-test/r/ctype_cp1251.result
Adding tests
@ mysql-test/r/ctype_latin1.result
Adding tests
@ mysql-test/r/ctype_ucs.result
Adding tests
@ mysql-test/r/ctype_utf8.result
Adding tests
@ sql/item_strfunc.cc
- Moving code from Item_str_ascii_func::val_str() to
Item_str_func::val_str_from_val_str_ascii(), as
this code needs to be shared by Item_date_add_interval.
- Adding str2 parameter to be used as a buffer, instead of
using private ascii_buf member.
@ sql/item_strfunc.h
- Moving code from Item_str_ascii_func::val_str() to
Item_str_func::val_str_from_val_str_ascii()
- Removing "String *val_str_convert_from_ascii(String *str, String *ascii_buf)"
prototype as it was neither used nor defined.
@ sql/item_timefunc.h
- Overriding the parent's charset_for_protocol() method,
because we need to behave differently in VARCHAR and DATE/DATETIME context.
- Adding ascii_buf for conversion.
- Adding val_str_ascii() prototype.
- Adding val_str() which uses newly added
Item_str_func::val_str_from_val_str_ascii(),
passing ascii_buf as a conversion buffer.
Put descriptions of plugins into a separate file read by MTR
MTR itself has generalised code to read this and set env. variables
Removed the *SO variables, updated some tests accordingly
New commit: added optional list of plugin names for _LOAD variable
Also made changes for the new AUTH_* plugins
Major replication test framework cleanup. This does the following:
- Ensure that all tests clean up the replication state when they
finish, by making check-testcase check the output of SHOW SLAVE STATUS.
This implies:
- Slave must not be running after test finished. This is good
because it removes the risk for sporadic errors in subsequent
tests when a test forgets to sync correctly.
- Slave SQL and IO errors must be cleared when test ends. This is
good because we will notice if a test gets an unexpected error in
the slave threads near the end.
- We no longer have to clean up before a test starts.
- Ensure that all tests that wait for an error in one of the slave
threads wait for a specific error. It is no longer possible to
source wait_for_slave_[sql|io]_to_stop.inc when there is an error
in one of the slave threads. This is good because:
- If a test expects an error but there is a bug that causes
another error to happen, or if it stops the slave thread without
an error, then we will notice.
- When developing tests, wait_for_*_to_[start|stop].inc will fail
immediately if there is an error in the relevant slave thread.
Before this patch, we had to wait for the timeout.
- Remove duplicated and repeated code for setting up unusual replication
topologies. Now, there is a single file that is capable of setting
up arbitrary topologies (include/rpl_init.inc, but
include/master-slave.inc is still available for the most common
topology). Tests can now end with include/rpl_end.inc, which will clean
up correctly no matter what topology is used. The topology can be
changed with include/rpl_change_topology.inc.
- Improved debug information when tests fail. This includes:
- debug info is printed on all servers configured by include/rpl_init.inc
- User can set $rpl_debug=1, which makes auxiliary replication files
print relevant debug info.
- Improved documentation for all auxiliary replication files. Now they
describe purpose, usage, parameters, and side effects.
- Many small code cleanups:
- Made have_innodb.inc output a sensible error message.
- Moved contents of rpl000017-slave.sh into rpl000017.test
- Added mysqltest variables that expose the current state of
disable_warnings/enable_warnings and friends.
- Too many to list here: see per-file comments for details.
Manual merge from mysql-5.1-bugteam into mysql-5.5-bugteam.
Conflicts
=========
Text conflict in sql/log.cc
Text conflict in sql/log.h
Text conflict in sql/slave.cc
Text conflict in sql/sql_parse.cc
Text conflict in sql/sql_priv.h
when generating new name.
If find_uniq_filename returns an error, this error is not
propagated upwards, and execution does not report an error to
the user (although an entry in the error log is generated).
Additionally, some more errors were ignored in new_file_impl:
- when writing the rotate event
- when reopening the index and binary log file
This patch addresses this by propagating the error up in the
execution stack. Furthermore, when rotation of the binary log
fails, an incident event is written, because some changes for a
given statement may not have been properly
logged. For example, in SBR, a LOAD DATA INFILE statement requires
more than one event to be logged. Should rotation fail while
logging part of the LOAD DATA events, the logged data would
become inconsistent with the data in the storage engine.
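The propagation pattern described above can be sketched as follows. This
is a toy model, not the code in sql/log.cc: the class, the step names and
the "INCIDENT" marker are hypothetical stand-ins.

```python
class RotationError(Exception):
    """Raised when any step of the (toy) rotation fails."""


class ToyBinlog:
    # Hypothetical step names mirroring the failure points listed above.
    STEPS = ("generate_name", "write_rotate_event", "reopen_files")

    def __init__(self, fail_step=None):
        self.fail_step = fail_step
        self.log = []

    def _run(self, step):
        if step == self.fail_step:
            raise RotationError(step)
        self.log.append(step)

    def rotate(self):
        try:
            for step in self.STEPS:
                self._run(step)
        except RotationError:
            # Record an incident so a reader of the log knows that some
            # changes may be missing, then pass the error to the caller
            # instead of swallowing it.
            self.log.append("INCIDENT")
            raise


ok = ToyBinlog()
ok.rotate()

bad = ToyBinlog(fail_step="write_rotate_event")
try:
    bad.rotate()
    reported = False
except RotationError:
    reported = True
```

The point of the sketch is that the caller of rotate() sees the failure,
and the log carries a marker for the possibly incomplete statement.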
Problem: MySQL cp1251 did not support 'U+20AC EURO SIGN'
which was assigned a few years ago to 0x88.
Fix: adding mapping: 0x88 <-> U+20AC
@ mysql-test/include/ctype_8bit.inc
New shared file to test 8bit character sets.
@ mysql-test/r/ctype_cp1251.result
@ mysql-test/t/ctype_cp1251.test
Adding tests
@ sql/share/charsets/cp1251.xml
Adding mapping
@ strings/ctype-extra.c
Regenerating ctype-extra.c using strings/conf_to_src
according to new cp1251.xml
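For reference, the same mapping can be checked against Python's built-in
cp1251 codec. This only illustrates the 0x88 <-> U+20AC pair; it does not
exercise the MySQL tables generated from cp1251.xml, and the helper name
is made up for the example.

```python
EURO = "\u20ac"  # U+20AC EURO SIGN


def cp1251_roundtrip(ch):
    # Encode one character to cp1251 and decode it back.
    raw = ch.encode("cp1251")
    return raw, raw.decode("cp1251")


raw, back = cp1251_roundtrip(EURO)
assert raw == b"\x88"
assert back == EURO
```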
Problem: LIKE over an indexed column optimized away good results,
because my_like_range_utf32/utf16 returned wrong ranges for contractions.
Contraction related code was missing in my_like_range_utf32/utf16,
but did exist in my_like_range_ucs2/utf8.
It was forgotten in utf32/utf16 versions (during mysql-6.0 push/revert mess).
Fix:
The patch removes individual functions my_like_range_ucs2,
my_like_range_utf16, my_like_range_utf32 and introduces a single function
my_like_range_generic() instead. The new function handles contractions
correctly. It can handle any character set with cs->min_sort_char and
cs->max_sort_char represented in Unicode code points.
added:
@ mysql-test/include/ctype_czech.inc
@ mysql-test/include/ctype_like_ignorable.inc
@ mysql-test/r/ctype_like_range.result
@ mysql-test/t/ctype_like_range.test
Adding tests
modified:
@ include/m_ctype.h
- Adding helper functions for contractions.
- Prototypes: removing ucs2,utf16,utf32 functions, adding generic function.
@ mysql-test/r/ctype_uca.result
@ mysql-test/r/ctype_utf16_uca.result
@ mysql-test/r/ctype_utf32_uca.result
@ mysql-test/t/ctype_uca.test
@ mysql-test/t/ctype_utf16_uca.test
@ mysql-test/t/ctype_utf32_uca.test
- Adding tests.
@ strings/ctype-mb.c
- Pad function did not put the last character.
- Implementing my_like_range_generic(), a universal replacement
for three separate functions
my_like_range_ucs2(), my_like_range_utf16() and my_like_range_utf32(),
with correct contraction handling.
@ strings/ctype-ucs2.c
- my_fill_mb2 did not put the high byte, as previously
it was used to put only characters in the ASCII range.
Now it puts the high byte as well
(needed to populate cs->max_sort_char correctly).
- Adding DBUG_ASSERT()
- Removing character set specific functions:
my_like_range_ucs2(), my_like_range_utf16() and my_like_range_utf32().
- Using my_like_range_generic() instead of the old functions.
@ strings/ctype-uca.c
- Using generic function instead of the old character set specific ones.
@ sql/item_create.cc
@ sql/item_strfunc.cc
@ sql/item_strfunc.h
- Adding SQL functions LIKE_RANGE_MIN and LIKE_RANGE_MAX,
available only in debug build to make sure like_range()
works correctly for all character sets and collations.
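To show what a "like range" is, here is a deliberately simplified sketch
that turns a LIKE pattern into a minimum and a maximum key. It is not
my_like_range_generic(): it ignores contractions and collation weights
entirely, and the function name, parameters and padding rule are
assumptions made for the example.

```python
def like_range(pattern, length, min_char, max_char, escape="\\"):
    """Return (min_key, max_key) for a LIKE pattern; toy version only."""
    low, high = [], []
    chars = iter(pattern)
    for ch in chars:
        if len(low) == length or ch == "%":
            # Key is full, or the rest of the pattern matches anything.
            break
        if ch == "_":
            # Any single character: widest possible bounds at this position.
            low.append(min_char)
            high.append(max_char)
            continue
        if ch == escape:
            # The escaped character is taken literally.
            ch = next(chars, escape)
        low.append(ch)
        high.append(ch)
    pad = length - len(low)
    return "".join(low) + min_char * pad, "".join(high) + max_char * pad


prefix_range = like_range("ab%", 4, "\x00", "\uffff")
```

A range that is too narrow makes the optimizer skip matching rows, which
is how wrong ranges led to good results being optimized away.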
Regression introduced by WL#2649.
Problem: queries with date/datetime columns did not use indexes:
set names non_latin1_charset;
select * from date_index_test
where date_column between '2010-09-01' and '2010-10-01';
Before WL#2649, indexes worked fine because the charset of
date/datetime columns was BINARY, which always won.
Fix: testing that collation of the operation matches collation
of the field is only needed in case of "real" string data types.
For DATE, DATETIME it's not needed.
@ mysql-test/include/ctype_numconv.inc
@ mysql-test/r/ctype_binary.result
@ mysql-test/r/ctype_cp1251.result
@ mysql-test/r/ctype_latin1.result
@ mysql-test/r/ctype_ucs.result
@ mysql-test/r/ctype_utf8.result
Adding tests
@ sql/field.h
Adding new method Field_str::match_collation_to_optimize_range()
for use in opt_range.cc to distinguish between
"real string" types like CHAR, VARCHAR, TEXT
(Field_string, Field_varstring, Field_blob)
and "almost string" types DATE, TIME, DATETIME
(Field_newdate, Field_datetime, Field_time, Field_timestamp)
@ sql/opt_range.cc
Using new method instead of checking result_type() against STRING result.
Note:
Another part of this problem (which is not a regression)
is submitted separately (see bug#58329).
bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ
LOCK" and bug #54673 "It takes too long to get readlock for
'FLUSH TABLES WITH READ LOCK'".
The first bug manifested itself as a deadlock which occurred
when a connection, which had some table open through HANDLER
statement, tried to update some data through DML statement
while another connection tried to execute FLUSH TABLES WITH
READ LOCK concurrently.
What happened was that FTWRL in the second connection managed
to perform the first step of GRL acquisition and thus blocked all
upcoming DML. After that it started to wait for the table opened
through the HANDLER statement to be flushed. When the first connection
tried to execute DML, it started to wait for the GRL, i.e. for the
second connection, creating a deadlock.
The second bug manifested itself as starvation of FLUSH TABLES
WITH READ LOCK statements in cases when there was a constant
stream of concurrent DML statements (in two or more
connections).
This happened because requests for protection against GRL,
which were acquired by DML statements, ignored the presence of a
pending GRL, and thus the latter was starved.
This patch solves both these problems by re-implementing GRL
using metadata locks.
As in the old implementation, acquisition of GRL in the new
implementation is a two-step process. During the first step we block
all concurrent DML and DDL statements by acquiring global S
metadata lock (each DML and DDL statement acquires global IX
lock for its duration). During the second step we block commits
by acquiring global S lock in COMMIT namespace (commit code
acquires global IX lock in this namespace).
Note that, unlike in the old implementation, acquisition of
protection against GRL in DML and DDL is semi-automatic.
We assume that any statement which should be blocked by GRL
either opens and acquires write locks on tables or acquires
metadata locks on the objects it is going to modify. For any such
statement a global IX metadata lock is automatically acquired
for its duration.
The first problem is solved because waits for GRL become
visible to the deadlock detector in the metadata locking subsystem,
and thus deadlocks like the one in the first bug become impossible.
The second problem is solved because global S locks which
are used for GRL implementation are given preference over
IX locks which are acquired by concurrent DML (and we can
switch to fair scheduling in future if needed).
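The two-step scheme can be sketched with a toy lock table. This is an
illustration only: the class and function names are hypothetical, the
compatibility matrix is an assumption of the sketch (S/S and IX/IX
coexist, S and IX conflict), and neither the deadlock detector nor the
preference of S over IX is modelled.

```python
# Compatibility of a requested mode with an already granted mode.
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "IX"): False,
    ("IX", "S"): False,
    ("IX", "IX"): True,
}


class ToyNamespace:
    """One lock namespace (e.g. global or commit) with granted locks."""

    def __init__(self):
        self.granted = []  # list of (owner, mode)

    def try_acquire(self, owner, mode):
        for other, held in self.granted:
            if other != owner and not COMPATIBLE[(mode, held)]:
                return False
        self.granted.append((owner, mode))
        return True

    def release(self, owner):
        self.granted = [(o, m) for o, m in self.granted if o != owner]


def try_ftwrl(owner, global_ns, commit_ns):
    """Step 1 blocks DML/DDL, step 2 blocks commits."""
    if not global_ns.try_acquire(owner, "S"):
        return "Waiting for global read lock"
    if not commit_ns.try_acquire(owner, "S"):
        return "Waiting for commit lock"
    return "acquired"


global_ns, commit_ns = ToyNamespace(), ToyNamespace()
global_ns.try_acquire("dml1", "IX")          # a running DML statement
state_while_dml = try_ftwrl("ftwrl", global_ns, commit_ns)
global_ns.release("dml1")
state_after_dml = try_ftwrl("ftwrl", global_ns, commit_ns)
```

Once the S locks are granted, a new IX request in either namespace fails,
which corresponds to DML, DDL and commits being blocked.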
Important change:
FTWRL/GRL no longer blocks DML and DDL on temporary tables.
Before this patch behavior was not consistent in this respect:
in some cases DML/DDL statements on temporary tables were
blocked while in others they were not. Since the main use cases
for FTWRL are various forms of backups and temporary tables are
not preserved during backups we have opted for consistently
allowing DML/DDL on temporary tables during FTWRL/GRL.
Important change:
This patch changes the thread state names which are used when
DML/DDL or FTWRL is waiting for the global read lock. It is now
either "Waiting for global read lock" or "Waiting for commit
lock", depending on which stage FTWRL is in.
Incompatible change:
To solve a deadlock in the events code which was exposed by this
patch we had to replace the LOCK_event_metadata mutex with
metadata locks on events. As a result we have to prohibit
DDL on events under LOCK TABLES.
This patch also adds extensive test coverage for interaction
of DML/DDL and FTWRL.
The performance of the new and old global read lock implementations
was compared in sysbench tests. There was no significant
difference between the new and old implementations.
The thing is that the following attributes are fixed (remembered) when a trigger
is created:
- character_set_client
- character_set_results
- collation_connection
There are two triggers created in mysql-test/include/mtr_warnings.sql.
They were created using "current default" character set / collation.
is_triggers.test shows definition of these triggers including recorded
character set information.
The problem was that if "current default" changed, the recorded character
set information was not accurate.
There might be two ways to fix that:
a) update is_triggers.test so that it does not put character-set information
into result-file;
b) update mtr_warnings.sql so that the triggers are created using
hard-coded character sets.
This patch implements option b).
This is the 5.5 version of the fix. The 5.1 version was too complicated to
merge and was null merged.
This is a regression from the fix for bug no 38999. A storage engine capable
of reading only a subset of a table's columns updates corresponding bits in
the read buffer to signal that it has read NULL values for the corresponding
columns. It cannot, and should not, update any other bits. Bug no 38999
occurred because the implementation of UPDATE statements compares the NULL bits
using memcmp, inadvertently comparing bits that were never requested from the
storage engine. The regression was caused by the storage engine trying to
alleviate the situation by writing to all NULL bits, even those that it had no
knowledge of. This has devastating effects for the index merge algorithm,
which relies on all NULL bits, except those explicitly requested, being left
unchanged.
The fix reverts the fix for bug no 38999 in both InnoDB and InnoDB plugin and
changes the server's method of comparing records. For engines that always read
entire rows, we proceed as usual. For engines capable of reading only selected
columns, the record buffers are now compared on a column-by-column basis. An
assertion was also added so that non-comparable buffers are never read. Some
relevant copy-pasted code was also consolidated in a new function.
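The difference between the two comparison strategies can be sketched as
below. This is not the server's record comparison code: rows are modelled
as Python lists with None standing in for NULL, and the function and
parameter names are hypothetical.

```python
def records_differ(old, new, read_set, reads_entire_rows):
    """Toy comparison of two row images.

    read_set: indexes of the columns that were requested from the engine.
    """
    if reads_entire_rows:
        # Every column is valid, so a whole-buffer comparison is safe
        # (the analogue of memcmp over the record).
        return old != new
    # Only requested columns hold meaningful values; the others may
    # contain anything and must not influence the result.
    return any(old[i] != new[i] for i in read_set)


old_row = [1, None, 7]
new_row = [1, "junk", 7]   # column 1 was never requested from the engine

partial = records_differ(old_row, new_row, {0, 2}, reads_entire_rows=False)
whole = records_differ(old_row, new_row, {0, 2}, reads_entire_rows=True)
```

With the whole-buffer comparison the unrequested column makes the rows
look different; the column-by-column comparison ignores it.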
Problem: CASE didn't work with a mixture of different character
sets in THEN/ELSE in some cases.
This happened because after character set aggregation
newly created Item_func_conv_charset items corresponding
to THEN/ELSE arguments were not put back to args[] array.
Fix:
put all Item_func_conv_charset back to args[].
@ mysql-test/include/ctype_numconv.inc
@ mysql-test/r/ctype_ucs.result
Adding tests
@ sql/item_cmpfunc.cc
Put "agg" back to args[] after character set aggregation.
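The nature of the bug can be shown with a small model. It is a sketch
only: items are modelled as (value, charset) tuples, the conversion
wrapper as a ("conv", ...) tuple, and all names are hypothetical rather
than taken from sql/item_cmpfunc.cc.

```python
def aggregate_charsets(args, target, write_back):
    """Toy model of character set aggregation over THEN/ELSE arguments."""
    agg = list(args)  # temporary working array, like "agg" above
    for i, (value, charset) in enumerate(agg):
        if charset != target:
            # Stand-in for wrapping the argument in a conversion item.
            agg[i] = (("conv", value), target)
    if write_back:
        # The fix: store the converted items back into args.
        args[:] = agg
    return args


buggy = aggregate_charsets([("a", "latin1"), ("b", "utf8")], "utf8", False)
fixed = aggregate_charsets([("a", "latin1"), ("b", "utf8")], "utf8", True)
```

Without the write-back step the caller keeps using the unconverted
argument, so the mixture of character sets remains.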