THAN LOCALHOST
This is a test bug and the explanation for the behaviour can be found
on the bug page.Modifying the select to select user where user!=root for the line where
failure is encountered on machines with no hostname other than the localhost.
"SHOW PROCESSLIST"
Analysis:
----------
The problem here is, if one connection changes its
default db and at the same time another connection executes
"SHOW PROCESSLIST", when it wants to read db of the another
connection then there is a chance of accessing the invalid
memory.
The db name stored in THD is not guarded while changing user
DB and while reading the user DB in "SHOW PROCESSLIST".
So, if THD.db is freed by thd "owner" thread and if another
thread executing "SHOW PROCESSLIST" statement tries to read
and copy THD.db at the same time then we may endup in the issue
reported here.
Fix:
----------
Used mutex "LOCK_thd_data" to guard THD.db while freeing it
and while copying it to processlist.
STATUS OF ROLLBACKED TRANSACTION" and bug #17054007 - "TRANSACTION
IS NOT FULLY ROLLED BACK IN CASE OF INNODB DEADLOCK".
The problem in the first bug report was that although deadlock involving
metadata locks was reported using the same error code and message as InnoDB
deadlock it didn't rollback transaction like the latter. This caused
confusion to users as in some cases after ER_LOCK_DEADLOCK transaction
could have been restarted immediately and in some cases rollback was
required.
The problem in the second bug report was that although InnoDB deadlock
caused transaction rollback in all storage engines it didn't cause release
of metadata locks. So concurrent DDL on the tables used in transaction was
blocked until implicit or explicit COMMIT or ROLLBACK was issued in the
connection which got InnoDB deadlock.
The former issue has stemmed from the fact that when support for detection
and reporting metadata locks deadlocks was added we erroneously assumed
that InnoDB doesn't rollback transaction on deadlock but only last statement
(while this is what happens on InnoDB lock timeout actually) and so didn't
implement rollback of transactions on MDL deadlocks.
The latter issue was caused by the fact that rollback of transaction due
to deadlock is carried out by setting THD::transaction_rollback_request
flag at the point where deadlock is detected and performing rollback
inside of trans_rollback_stmt() call when this flag is set. And
trans_rollback_stmt() is not aware of MDL locks, so no MDL locks are
released.
This patch solves these two problems in the following way:
- In case when MDL deadlock is detect transaction rollback is requested
by setting THD::transaction_rollback_request flag.
- Code performing rollback of transaction if THD::transaction_rollback_request
is moved out from trans_rollback_stmt(). Now we handle rollback request
on the same level as we call trans_rollback_stmt() and release statement/
transaction MDL locks.
IN TIME RECOVERY FAILURE ON SLAVES
Problem:
DROP TEMP TABLE IF EXISTS commands can cause point
in time recovery (re-applying binlog) failures.
Analyses:
In RBR, 'DROP TEMPORARY TABLE' commands are
always binlogged by adding 'IF EXISTS' clauses.
Also, the slave SQL thread will not check replicate.* filter
rules for "DROP TEMPORARY TABLE IF EXISTS" queries.
If log-slave-updates is enabled on slave, these queries
will be binlogged in the format of "USE `db`;
DROP TEMPORARY TABLE IF EXISTS `t1`;" irrespective
of filtering rules and irrespective of the `db` existence.
When users try to recover slave from it's own binlog,
use `db` command might fail if `db` is not present on slave.
Fix:
At the time of writing the 'DROP TEMPORARY TABLE
IF EXISTS' query into the binlog, 'use `db`' will not be
present and the table name in the query will be a fully
qualified table name.
Eg:
'USE `db`; DROP TEMPORARY TABLE IF EXISTS `t1`;'
will be logged as
'DROP TEMPORARY TABLE IF EXISTS `db`.`t1`;'.
Problem:
sys_vars.rpl_init_slave_func test was failing sporadically
on 5.5+.
Fix:
Added assert condition after wait for checks.
Recorded test and enabled it.
BUG#12535301- SYS_VARS.RPL_INIT_SLAVE_FUNC MISMATCHES IN DAILY-5.5
Problem:
sys_vars.rpl_init_slave_func test was not recorded after
the last edit. It was disabled on 5.1 after seeing failures
due to the above reason.
No old failures as this suite never ran with pb2 on 5.1
Fix:
Added assert condition after wait for checks.
Recorded test and enabled it.
TO INCONSISTENCY
PROBLEM
--------
When we drop a partitoned table , we first gather the
information about partitions in the table from the
table_name.par file and store it in an internal data
structure.Then we delete this file and the data in
the table. If the server crashes after deleting the
file,then after recovering we cannot access the table
.Even we cannot drop the table ,because drop algorithm
requires par file to read the partition information.
FIX
---
1. We move the part of deleting par file after deleting
all the table data from the storage egine.
2. During drop operation if we detect that the par
file is missing then we delete the .frm file,since
there is no way of recovering without par file.
[Approved by Mattias rb#2576 ]
BY BINLOG_KILLED_SIMULATE.TEST
'mysqbinlog' tool creates a temporary file while
preparing LOAD DATA QUERY. These files needs to be deleted
at the end of the test script otherwise these files are
left out in the daily-run machines, causing
"no space on device issues"
Fix:
Delete them at the end of these test scripts
1) execute mysqlbinlog with --local-load option to
create these files in a specified tmpdir
2) delete the tmpdir at the end of the test script
USING THE PLUGIN INTERFACE.
ISSUE: No support for floating-point plugin
system variables.
SOLUTION: Allowing plugins to define and expose floating-point
system variables of type double. MYSQL_SYSVAR_DOUBLE
and MYSQL_THDVAR_DOUBLE are added.
ISSUE: Fractional part of the def, min, max values of system
variables are ignored.
SOLUTION: Adding functions that are used to store the raw
representation of a double in the raw bits of unsigned
longlong in a way that the binary representation
remains the same.
WITH COMPOSITE KEY COLUMNS
Problem:-
While running a SELECT query with several AGGR(DISTINCT) function
and these are referring to different field of same composite key,
Returned incorrect value.
Analysis:-
In a table, where we have composite key like (a,b,c)
and when we give a query like
select COUNT(DISTINCT b), SUM(DISTINCT a) from ....
here, we first make a list of items in Aggr(distinct) function
(which is a, b), where order of item doesn't matter.
and then we see, whether we have a composite key where the prefix
of index columns matches the items of the aggregation function.
(in this case we have a,b,c).
if yes, so we can use loose index scan and we need not perform
duplicate removal to distinct in our aggregate function.
In our table, we traverse column marked with <-- and get the result as
(a,b,c) count(distinct b) sum(distinct a)
treated as count b treated as sum(a)
(1,1,2)<-- 1 1
(1,2,2)<-- 1++=2 1+1=2
(1,2,3)
(2,1,2)<-- 2++=3 1+1+2=4
(2,2,2)<-- 3++=4 1+1+2+2=6
(2,2,3)
result will be 4,6, but it should be (2,3)
As in this case, our assumption is incorrect. If we have
query like
select count(distinct a,b), sum(distinct a,b)from ..
then we can use loose index scan
Solution:-
In our query, when we have more then one aggr(distinct) function
then they should refer to same fields like
select count(distinct a,b), sum(distinct a,b) from ..
-->we can use loose scan index as both aggr(distinct) refer to same fields a,b.
If they are referring to different field like
select count(distinct a), sum(distinct b) from ..
-->will not use loose scan index as both aggr(distinct) refer to different fields.
WITH A PORT NUMBER ENCLOSED IN QUOTES
Problem: mysqldump --dump-slave --include-master-host-port
prints the CHANGE MASTER command in the generated logical
backup. The PORT number that is generated with this command
is a string and should be an integer.
Fix: Remove the Enclosed quotes for port number.
SCHEDULER DROPS EVENTS
Problem: On a semi sync enabled server (Master/Slave),
if event scheduler drops an event after completion,
server crashes.
Analaysis: If an event is created with "ON COMPLETION
NOT PRESERVE" clause, event scheduler deletes the event
upon event completion(expiration) and the thread object
will be destroyed. In the destructor of the thread object,
mysys_var member is set to zero explicitly. Later from
the same destructor call(same execution path),
incase of semi sync enabled server, while cleanup is called,
THD::mysys_var member is accessed by THD::enter_cond()
function which causes server to crash.
Fix: mysys_var should not be explicitly set to zero and
also it is not required.
--BINLOG-IGNORE-DB AND FULLY QUALIFIED TABLE
Problem:
=======
An ALTER TABLE statement is not written to binlog if server
started with "--binlog-ignore-db some database" and 'fully
qualified' table names are used in the ALTER TABLE statement
altering table different from current database context.
Analysis:
========
The above mentioned problem not only affects "ALTER TABLE"
statements but also to all kind of statements. Once the
current default database becomes "NULL" none of the
statements will be binlogged.
The current behaviour is such that if the user has specified
restrictions on which database needs to be replicated and the
default db is not specified, then do not replicate.
This means that "NULL" is considered to be equivalent to
everything (default db = null implied ignore don't log the
statement).
Fix:
===
"NULL" should not be considered as equivalent to everything.
Since the filtering criteria is not equal to "NULL" the
statement should be logged into binlog.
(Based on Sinisa's patch)
Added a version checking facility to mysql_upgrade.
The versions used for checking is the version of the
server that mysql_upgrade is going to upgrade and the
server version that mysql_upgrade was build/distributed
with.
Also added an option '--version-check' to enable/disable
the version checking.
PLATFORM= MACOSX10.6 X86_64 MAX
Problem: The test was failing on pb2's mac machine because
it was not cleaned up properly. The test checks if
the command 'start slave until' throws a proper
error when issued with a wrong number/type of
parameters. After this,the replication stream was
stopped using the include file 'rpl_end.inc'.
The errors thrown earlier left the slave in an
inconsistent state to be closed by the include
file which was caught by the mac machine.
Fix: Started slave by invoking start_slave.inc to have a
working slave before calling rpl_reset.inc
Problem: The test file was not in a good shape. It tested
start slave until relay log file/pos combination
wrongly. A couple of commands were executed at
master and replicated at slave. Next, the
coordinates in terms of relay log file and pos
were noted down followed by reset slave and start
slave until saved relay log file/pos. Reset slave
deletes all relay log files and makes the slave
forget its replication position. So, using the
saved coordiantes after reset slave is wrong.
Fix: Split the test in two parts:
a) Test for start slave until master log file/pos and
checking for correct errors in the failure
scenarios.
b) Test for start slave until relay log file/pos.
Problem: The variables auto_increment_increment and
auto_increment_offset were set in the the include
file rpl_init.inc. This was only configured for
some connections that are rarely used by test
cases, so likely that it will cause confusion.
If replication tests want to setup these variables
they should do so explicitly.
Fix:
a) Removed code to set the variables
auto_increment_increment and auto_increment_offset
in the include file.
b) Updated tests files using the same.
post push fix:
rpl_stm_until.test was disabled because of
this bug. Enabled and fixed it.
Removed a part of the test that was obsolete.
It tested replication from 4.0 master to 5.0
slave.
post push fix:
rpl_stm_until.test was disabled because of
this bug. Enabled and fixed it.
Removed a part of the test that was obsolete.
It tested replication from 4.0 master to 5.0
slave.
DOWNGRADED FROM 5.6.11 TO 5.6.10
Problem was new syntax not accepted by previous version.
Fixed by adding version comment of /*!50531 around the
new syntax.
Like this in the .frm file:
'PARTITION BY KEY /*!50611 ALGORITHM = 2 */ () PARTITIONS 3'
and also changing the output from SHOW CREATE TABLE to:
CREATE TABLE t1 (a INT)
/*!50100 PARTITION BY KEY */ /*!50611 ALGORITHM = 1 */ /*!50100 ()
PARTITIONS 3 */
It will always add the ALGORITHM into the .frm for KEY [sub]partitioned
tables, but for SHOW CREATE TABLE it will only add it in case it is the non
default ALGORITHM = 1.
Also notice that for 5.5, it will say /*!50531 instead of /*!50611, which
will make upgrade from 5.5 > 5.5.31 to 5.6 < 5.6.11 fail!
If one downgrades an fixed version to the same major version (5.5 or 5.6) the
bug 14521864 will be visible again, but unless the .frm is updated, it will
work again when upgrading again.
Also fixed so that the .frm does not get updated version
if a single partition check passes.