BUG#54872 MBR: replication failure caused by using tmp table inside transaction
Changed criteria to classify a statement as unsafe in order to reduce the
number of spurious warnings. So a statement is classified as unsafe when
there is on-going transaction at any point of the execution if:
1. The mixed statement is about to update a transactional table and
a non-transactional table.
2. The mixed statement is about to update a temporary transactional
table and a non-transactional table.
3. The mixed statement is about to update a transactional table and
read from a non-transactional table.
4. The mixed statement is about to update a temporary transactional
table and read from a non-transactional table.
5. The mixed statement is about to update a non-transactional table
and read from a transactional table when the isolation level is
lower than repeatable read.
After updating a transactional table if:
6. The mixed statement is about to update a non-transactional table
and read from a temporary transactional table.
7. The mixed statement is about to update a non-transactional table
and read from a temporary transactional table.
8. The mixed statement is about to update a non-transactionala table
and read from a temporary non-transactional table.
9. The mixed statement is about to update a temporary non-transactional
table and update a non-transactional table.
10. The mixed statement is about to update a temporary non-transactional
table and read from a non-transactional table.
11. A statement is about to update a non-transactional table and the
option variables.binlog_direct_non_trans_update is OFF.
The reason for this is that locks acquired may not protected a concurrent
transaction of interfering in the current execution and by consequence in
the result. So the patch reduced the number of spurious unsafe warnings.
Besides we fixed a regression caused by BUG#51894, which makes temporary
tables to go into the trx-cache if there is an on-going transaction. In
MIXED mode, the patch for BUG#51894 ignores that the trx-cache may have
updates to temporary non-transactional tables that must be written to the
binary log while rolling back the transaction.
So we fix this problem by writing the content of the trx-cache to the
binary log while rolling back a transaction if a non-transactional
temporary table was updated and the binary logging format is MIXED.
switching binlog format to ROW
BUG 52616 fixed the case which the user would switch from STMT to
ROW binlog format, but the server would silently ignore it. After
that fix thd->is_current_stmt_binlog_format_row() reports correct
value at logging time and events are logged in ROW (as expected)
instead of STMT as they were previously and wrongly logged.
However, the fix was only partially complete, because on
disconnect, at THD cleanup, the implicit logging of temporary
tables is conditionally performed. If the binlog_format==ROW and
thd->is_current_stmt_binlog_format_row() is true then DROPs are
not logged. Given that the user can switch from STMT to ROW, this
is wrong because the server cannot tell, just by relying on the
ROW binlog format, that the tables have been dropped before. This
is effectively similar to the MIXED scenario when a switch from
STMT to ROW is triggered.
We fix this by removing this condition from
close_temporary_tables.
DML flow and SAVEPOINT
The problem was that replication could break if a transaction involving
both transactional and non-transactional tables was rolled back to a
savepoint. It broke if a concurrent connection tried to drop a
transactional table which was locked after the savepoint was set.
This DROP TABLE completed when ROLLBACK TO SAVEPOINT was executed as the
lock on the table was dropped by the transaction. When the slave later
tried to apply the binlog, it would fail as the table would already
have been dropped.
The reason for the problem is that transactions involving both
transactional and non-transactional tables are written fully to the
binlog during ROLLBACK TO SAVEPOINT. At the same time, metadata locks
acquired after a savepoint, were released during ROLLBACK TO SAVEPOINT.
This allowed a second connection to drop a table only used between
SAVEPOINT and ROLLBACK TO SAVEPOINT. Which caused the transaction binlog
to refer to a non-existing table when it was written during ROLLBACK
TO SAVEPOINT.
This patch fixes the problem by not releasing metadata locks when
ROLLBACK TO SAVEPOINT is executed if binlogging is enabled.
The default storage engine is changed from MyISAM to
InnoDB, in all builds except for the embedded server.
In addition, the following system variables are
changed:
* innodb_file_per_table is enabled
* innodb_strict_mode is enabled
* innodb_file_format_name_update is changed
to 'Barracuda'
The test suite is changed so that tests that do not
explicitly include the have_innodb.inc are run with
--default-storage-engine=MyISAM. This is to ease the
transition, so that most regression tests are run
with the same engine as before.
Some tests are disabled for the embedded server
regression test, as the output of certain statements
will be different that for the regular server
(i.e SELECT @@default_storage_engine). This is to
ease transition.
and innodb_file_format_max two system variables. And this also fixes
bug #53654 after 2nd shutdown innodb_file_format_check attains strange
values.
rb://366 approved by Marko
------------------------------------------------------------
revno: 3506
revision-id: sergey.glukhov@sun.com-20100609121718-04mpk5kjxvnrxdu8
parent: sergey.glukhov@sun.com-20100609120734-ndy2281wau9067zv
committer: Sergey Glukhov <Sergey.Glukhov@sun.com>
branch nick: mysql-5.1-innodb
timestamp: Wed 2010-06-09 16:17:18 +0400
message:
Bug#38999 valgrind warnings for update statement in function compare_record()
(InnoDB plugin branch)
@ mysql-test/suite/innodb_plugin/r/innodb_mysql.result
test case
@ mysql-test/suite/innodb_plugin/t/innodb_mysql.test
test case
@ storage/innodb_plugin/row/row0sel.c
init null bytes with default values as they might be
left uninitialized in some cases and these uninited bytes
might be copied into mysql record buffer that leads to
valgrind warnings on next use of the buffer.
Bug#46527 COMMIT AND CHAIN RELEASE does not make sense
Bug#53343 completion_type=1, COMMIT/ROLLBACK AND CHAIN don't
preserve the isolation level
Bug#53346 completion_type has strange effect in a stored
procedure/prepared statement
Added test cases to verify the expected behaviour of :
SET SESSION TRANSACTION ISOLATION LEVEL,
SET TRANSACTION ISOLATION LEVEL,
@@completion_type,
COMMIT AND CHAIN,
ROLLBACK AND CHAIN
..and some combinations of the above
Logging slow stored procedures caused the slow log to write
very large lock times. The lock times was a result of a
negative number being cast to an unsigned integer.
The reason the lock time appeard negative was because
one of the measurements points was reset after execution
causing it to change order with the start time of the
statement.
This bug is related to bug 47905 which in turn was
introduced because of a joint fix for 12480,12481,12482 and 11587.
The fix is to only reset the start_time before any statement
execution in a SP while not resetting start_utime or
utime_after_lock which are used for measuring the
performance of the SP. Start_time is used to set the
timestamp on the replication event which controlls how
the slave interprets time functions like NOW().
locks for DML statements and changes the way MDL locks
are acquired/granted in contended case.
Instead of backing-off when a lock conflict is encountered
and waiting for it to go away before restarting open_tables()
process we now wait for lock to be released without releasing
any previously acquired locks. If conflicting lock goes away
we resume opening tables. If waiting leads to a deadlock we
try to resolve it by backing-off and restarting open_tables()
immediately.
As result both waiting for possibility to acquire and
acquiring of a metadata lock now always happen within the
same MDL API call. This has allowed to make release of a lock
and granting it to the most appropriate pending request an
atomic operation.
Thanks to this it became possible to wake up during release
of lock only those waiters which requests can be satisfied
at the moment as well as wake up only one waiter in case
when granting its request would prevent all other requests
from being satisfied. This solves thundering herd problem
which occured in cases when we were releasing some lock and
woke up many waiters for SNRW or X locks (this was the issue
in bug#52289 "performance regression for MyISAM in sysbench
OLTP_RW test".
This also allowed to implement more fair (FIFO) scheduling
among waiters with the same priority.
It also opens the door for introducing new types of requests
for metadata locks such as low-prio SNRW lock which is
necessary in order to support LOCK TABLES LOW_PRIORITY WRITE.
Notice that after this sometimes can report ER_LOCK_DEADLOCK
error in cases in which it has not happened before.
Particularly we will always report this error if waiting for
conflicting lock has happened in the middle of transaction
and resulted in a deadlock. Before this patch the error was
not reported if deadlock could have been resolved by backing
off all metadata locks acquired by the current statement.
Conflicts:
Text conflict in mysql-test/r/archive.result
Contents conflict in mysql-test/r/innodb_bug38231.result
Text conflict in mysql-test/r/mdl_sync.result
Text conflict in mysql-test/suite/binlog/t/disabled.def
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_binlog_format_errors.result
Text conflict in mysql-test/t/archive.test
Contents conflict in mysql-test/t/innodb_bug38231.test
Text conflict in mysql-test/t/mdl_sync.test
Text conflict in sql/sp_head.cc
Text conflict in sql/sql_show.cc
Text conflict in sql/table.cc
Text conflict in sql/table.h
When using Unique Keys with nullable parts in RBR, the slave can
choose the wrong row to update. This happens because a table with
an unique key containing nullable parts cannot strictly guarantee
uniqueness. As stated in the manual, for all engines, a UNIQUE
index allows multiple NULL values for columns that can contain
NULL.
We fix this at the slave by extending the checks before assuming
that the row found through an unique index is is the correct
one. This means that when a record (R) is fetched from the storage
engine and a key that is not primary (K) is used, the server does
the following:
- If K is unique and has no nullable parts, it returns R;
- Otherwise, if any field in the before image that is part of K
is null do an index scan;
- If there is no NULL field in the BI part of K, then return R.
A side change: renamed the existing test case file and added a
test case covering the changes in this patch.
innodb.innodb [ fail ]
Test ended at 2010-06-02 15:04:06
CURRENT_TEST: innodb.innodb
--- /usr/w/mysql-trunk-innodb/mysql-test/suite/innodb/r/innodb.result 2010-05-23 23:10:26.576407000 +0300
+++ /usr/w/mysql-trunk-innodb/mysql-test/suite/innodb/r/innodb.reject 2010-06-02 15:04:05.000000000 +0300
@@ -2648,7 +2648,7 @@
create table t4 (s1 char(2) binary,primary key (s1)) engine=innodb;
insert into t1 values (0x41),(0x4120),(0x4100);
insert into t2 values (0x41),(0x4120),(0x4100);
-ERROR 23000: Duplicate entry 'A\x00' for key 'PRIMARY'
+ERROR 23000: Duplicate entry 'A' for key 'PRIMARY'
insert into t2 values (0x41),(0x4120);
The change in the printout was introduced in:
------------------------------------------------------------
revno: 3008.6.2
revision-id: sergey.glukhov@sun.com-20100527160143-57nas8nplzpj26dz
parent: sergey.glukhov@sun.com-20100527155443-24vqi9o8rpnkyci7
committer: Sergey Glukhov <Sergey.Glukhov@sun.com>
branch nick: mysql-trunk-bugfixing
timestamp: Thu 2010-05-27 20:01:43 +0400
message:
Bug#52430 Incorrect key in the error message for duplicate key error involving BINARY type
For BINARY(N) strip trailing zeroes to make the error message nice-looking
@ mysql-test/r/errors.result
test case
@ mysql-test/r/type_binary.result
result fix
@ mysql-test/t/errors.test
test case
@ sql/key.cc
For BINARY(N) strip trailing zeroes to make the error message nice-looking
and its author (Sergey) did not notice the test failure because that test
has been disabled in his tree.
------------------------------------------------------------
revno: 3495
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: 5.1-innodb
timestamp: Wed 2010-06-02 13:37:14 +0300
message:
Bug#53674: InnoDB: Error: unlock row could not find a 4 mode lock on the record
In semi-consistent read, only unlock freshly locked non-matching records.
lock_rec_lock_fast(): Return LOCK_REC_SUCCESS,
LOCK_REC_SUCCESS_CREATED, or LOCK_REC_FAIL instead of TRUE/FALSE.
enum db_err: Add DB_SUCCESS_LOCKED_REC for indicating a successful
operation where a record lock was created.
lock_sec_rec_read_check_and_lock(),
lock_clust_rec_read_check_and_lock(), lock_rec_enqueue_waiting(),
lock_rec_lock_slow(), lock_rec_lock(), row_ins_set_shared_rec_lock(),
row_ins_set_exclusive_rec_lock(), sel_set_rec_lock(),
row_sel_get_clust_rec_for_mysql(): Return DB_SUCCESS_LOCKED_REC if a
new record lock was created. Adjust callers.
row_unlock_for_mysql(): Correct the function documentation.
row_prebuilt_t::new_rec_locks: Correct the documentation.
errors
In the fix of BUG#39934 in 5.1-rep+3, errors are generated when
binlog_format=row and a statement modifies a table restricted to
statement-logging (ER_BINLOG_ROW_MODE_AND_STMT_ENGINE); or if
binlog_format=statement and a statement modifies a table restricted to
row-logging (ER_BINLOG_STMT_MODE_AND_ROW_ENGINE).
However, some DDL statements that lock tables (e.g. ALTER TABLE,
CREATE INDEX and CREATE TRIGGER) were causing spurious errors,
although no row might be inserted into the binary log.
To fix the problem, we tagged statements that may generate
rows into the binary log and thence the warning messages are
only printed out when the appropriate conditions hold and rows
might be changed.
breaks
When a "CREATE TEMPORARY TABLE SELECT * FROM" was executed the OPTION_KEEP_LOG was
not set into the thd->variables.option_bits. For that reason, if the transaction
had updated only transactional engines and was rolled back at the end (.e.g due to
a deadlock) the changes were not written to the binary log, including the creation
of the temporary table.
To fix the problem, we have set the OPTION_KEEP_LOG into the
thd->variables.option_bits when a "CREATE TEMPORARY TABLE
SELECT * FROM" is executed.