The binlog_cache_use is incremented twice when changes to a transactional table
are committed, i.e. TC_LOG_BINLOG::log_xid calls is called. The problem happens
because log_xid calls both binlog_flush_stmt_cache and binlog_flush_trx_cache
without checking if such caches are empty thus unintentionally increasing the
binlog_cache_use value twice.
To fix the problem we avoided incrementing the binlog_cache_use if the cache is
empty. We also decided to increment binlog_cache_use when the cache is truncated
as the cache is used although its content is discarded and is not written to the
binary log.
Note that binlog_cache_use is incremented for both types of cache, transactional
and non-transactional and that the behavior presented in this patch also applies
to the binlog_cache_disk_use.
Finally, we re-organized the code around the functions binlog_flush_trx_cache and
binlog_flush_stmt_cache.
Some of the test cases reference to binlog position and
these position numbers are written into result explicitly.
It is difficult to maintain if log event format changes.
There are a couple of cases explicit position number appears,
we handle them in different ways
A. 'CHANGE MASTER ...' with MASTER_LOG_POS or/and RELAY_LOG_POS options
Use --replace_result to mask them.
B. 'SHOW BINLOG EVENT ...'
Replaced by show_binlog_events.inc or wait_for_binlog_event.inc.
show_binlog_events.inc file's function is enhanced by given
$binlog_file and $binlog_limit.
C. 'SHOW SLAVE STATUS', 'show_slave_status.inc' and 'show_slave_status2.inc'
For the test cases just care a few items in the result of 'SHOW SLAVE STATUS',
only the items related to each test case are showed.
'show_slave_status.inc' is rebuild, only the given items in $status_items
will be showed.
'check_slave_is_running.inc' and 'check_slave_no_error.inc'
and 'check_slave_param.inc' are auxiliary files helping
to show running status and error information easily.
Bug#20837 Apparent change of isolation level during transaction,
Bug#46527 COMMIT AND CHAIN RELEASE does not make sense,
Bug#53343 completion_type=1, COMMIT/ROLLBACK AND CHAIN don't
preserve the isolation level
Bug#53346 completion_type has strange effect in a stored
procedure/prepared statement
Make thd->tx_isolation mean strictly "current transaction
isolation level"
Make thd->variables.tx_isolation mean "current session isolation
level".
The current transaction isolation level is now established
at transaction start. If there was a SET TRANSACTION
ISOLATION LEVEL statement, the value is taken from it.
Otherwise, the session value is used.
A change in a session value, made while a transaction is active,
whereas still allowed, no longer has any effect on the
current transaction isolation level. This is an incompatible
change.
A change in a session isolation level, made while there is
no active transaction, overrides SET TRANSACTION statement,
if there was any.
Changed the impelmentation to not look at @@session.completion_type
in the parser, and thus fixed Bug#53346.
Changed the parser to not allow AND NO CHAIN RELEASE,
and thus fixed Bug#46527.
Changed the transaction API to take the current transaction
isolation level into account:
- BEGIN/COMMIT now do preserve the current transaction
isolation level if chaining is on.
- implicit commit, XA COMMIT or XA ROLLBACK or autocommit don't.
Clarified error messages related to unsafe statements:
- avoid the internal technical term "row injection"
- use 'binary log' instead of 'binlog'
- avoid the word 'unsafeness'
In auto-commit mode, updating both trx and non-trx tables (i.e. issuing a mixed
statement) causes the following sequence of events:
1 - "Flush trx changes" (MYSQL_BIN_LOG::write) - T1:
1.1 - mutex_lock (&LOCK_log)
1.2 - mutex_lock (&LOCK_prep_xids)
1.3 - increase prepared_xids
1.4 - mutex_unlock (&LOCK_prep_xids)
1.5 - mutex_unlock (&LOCK_log)
2 - "Flush non-trx changes" (MYSQL_BIN_LOG::write) - T1:
2.1 - mutex_lock (&LOCK_log)
2.2 - mutex_unlock (&LOCK_log)
3. "unlog" - T1
3.1 - mutex_lock (&LOCK_prep_xids)
3.2 - decrease prepared xids
3.3 - pthread_cond_signal(&COND_prep_xids);
3.4 - mutex_unlock (&LOCK_prep_xids)
The "FLUSH logs" command produces the following sequence of events:
1 - "FLUSH logs" command (MYSQL_BIN_LOG::new_file_impl) - user thread:
1.1 - mutex_lock (&LOCK_log)
1.2 - mutex_lock (&LOCK_prep_xids)
1.3 - while (prepared_xids) pthread_cond_wait(..., &LOCK_prep_xids);
1.4 - mutex_unlock (&LOCK_prep_xids)
1.5 - mutex_unlock (&LOCK_log)
A deadlock will arise if T1 flushes the trx changes and thus increases
prepared_xids but before it is able to continue the execution and flush the
non-trx changes, an user thread calls the "FLUSH logs" command and wait that
the prepared_xids is decreased and gets to zero. However, T1 cannot proceed
with the call to "Flush non-trx changes" because it will block in the mutex
"LOCK_log" and by consequence cannot complete the execution and call the
unlog to decrease the prepared_xids.
To fix the problem, we ensure that the non-trx changes are always flushed
before the trx changes.
Note that if you call "Flush non-trx changes" and a concurrent "FLUSH logs" is
issued, the "Flush non-trx changes" may block, but a deadlock will never happen
because the prepared_xids will eventually get to zero. Bottom line, there will
not be any transaction able to increase the prepared_xids because they will
block in the mutex "LOCK_log" (MYSQL_BIN_LOG::write) and those that increased
the prepared_xids will eventually commit and decrease the prepared_xids.
General overview:
The logic for switching to row format when binlog_format=MIXED had
numerous flaws. The underlying problem was the lack of a consistent
architecture.
General purpose of this changeset:
This changeset introduces an architecture for switching to row format
when binlog_format=MIXED. It enforces the architecture where it has
to. It leaves some bugs to be fixed later. It adds extensive tests to
verify that unsafe statements work as expected and that appropriate
errors are produced by problems with the selection of binlog format.
It was not practical to split this into smaller pieces of work.
Problem 1:
To determine the logging mode, the code has to take several parameters
into account (namely: (1) the value of binlog_format; (2) the
capabilities of the engines; (3) the type of the current statement:
normal, unsafe, or row injection). These parameters may conflict in
several ways, namely:
- binlog_format=STATEMENT for a row injection
- binlog_format=STATEMENT for an unsafe statement
- binlog_format=STATEMENT for an engine only supporting row logging
- binlog_format=ROW for an engine only supporting statement logging
- statement is unsafe and engine does not support row logging
- row injection in a table that does not support statement logging
- statement modifies one table that does not support row logging and
one that does not support statement logging
Several of these conflicts were not detected, or were detected with
an inappropriate error message. The problem of BUG#39934 was that no
appropriate error message was written for the case when an engine
only supporting row logging executed a row injection with
binlog_format=ROW. However, all above cases must be handled.
Fix 1:
Introduce new error codes (sql/share/errmsg.txt). Ensure that all
conditions are detected and handled in decide_logging_format()
Problem 2:
The binlog format shall be determined once per statement, in
decide_logging_format(). It shall not be changed before or after that.
Before decide_logging_format() is called, all information necessary to
determine the logging format must be available. This principle ensures
that all unsafe statements are handled in a consistent way.
However, this principle is not followed:
thd->set_current_stmt_binlog_row_based_if_mixed() is called in several
places, including from code executing UPDATE..LIMIT,
INSERT..SELECT..LIMIT, DELETE..LIMIT, INSERT DELAYED, and
SET @@binlog_format. After Problem 1 was fixed, that caused
inconsistencies where these unsafe statements would not print the
appropriate warnings or errors for some of the conflicts.
Fix 2:
Remove calls to THD::set_current_stmt_binlog_row_based_if_mixed() from
code executed after decide_logging_format(). Compensate by calling the
set_current_stmt_unsafe() at parse time. This way, all unsafe statements
are detected by decide_logging_format().
Problem 3:
INSERT DELAYED is not unsafe: it is logged in statement format even if
binlog_format=MIXED, and no warning is printed even if
binlog_format=STATEMENT. This is BUG#45825.
Fix 3:
Made INSERT DELAYED set itself to unsafe at parse time. This allows
decide_logging_format() to detect that a warning should be printed or
the binlog_format changed.
Problem 4:
LIMIT clause were not marked as unsafe when executed inside stored
functions/triggers/views/prepared statements. This is
BUG#45785.
Fix 4:
Make statements containing the LIMIT clause marked as unsafe at
parse time, instead of at execution time. This allows propagating
unsafe-ness to the view.
The binlog_innodb test was sensitive to what tests ran before it. Now run
FLUSH STATUS before performing operations that need to be checked.
sys_var_thd_ulong::update() was improperly casting an option value from
ulonglong to ulong before comparing it to the max allowed value. On systems
where ulong and ulonglong are of different size, this caused values greater
than ULONG_MAX to wrap around (not be truncated to ULONG_MAX, which appears to
have been the intention of the original coder), and caused some checks to work
incorrectly. This wasn't generally visible to the user, because later checks
would prevent the wrapped-around value from being used. But it caused warning
messages to differ between 32- and 64-bit platforms. Fix is to just remove the
cast. Also added a DBUG_ASSERT to ensure that the value really is capped
properly before finally stuffing it into the ulong.
Now, every transaction (including autocommit transactions) starts with
a BEGIN and ends with a COMMIT/ROLLBACK in the binlog.
Added a test case, and updated lots of test case result files.