Non-transactional updates that take place inside a transaction present problems
for logging because they are visible to other clients before the transaction
is committed, and they are not rolled back even if the transaction is rolled
back. It is not always possible to log correctly in statement format when both
transactional and non-transactional tables are used in the same transaction.
In the current patch, we ensure that such scenario is completely safe under the
ROW and MIXED modes.
Conflicts
=========
Text conflict in .bzr-mysql/default.conf
Text conflict in libmysqld/CMakeLists.txt
Text conflict in libmysqld/Makefile.am
Text conflict in mysql-test/collections/default.experimental
Text conflict in mysql-test/extra/rpl_tests/rpl_row_sp006.test
Text conflict in mysql-test/suite/binlog/r/binlog_tmp_table.result
Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result
Text conflict in mysql-test/suite/rpl/r/rpl_loaddata_fatal.result
Text conflict in mysql-test/suite/rpl/r/rpl_row_create_table.result
Text conflict in mysql-test/suite/rpl/r/rpl_row_sp006_InnoDB.result
Text conflict in mysql-test/suite/rpl/r/rpl_stm_log.result
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_circular_simplex.result
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_sp006.result
Text conflict in mysql-test/t/mysqlbinlog.test
Text conflict in sql/CMakeLists.txt
Text conflict in sql/Makefile.am
Text conflict in sql/log_event_old.cc
Text conflict in sql/rpl_rli.cc
Text conflict in sql/slave.cc
Text conflict in sql/sql_binlog.cc
Text conflict in sql/sql_lex.h
21 conflicts encountered.
NOTE
====
mysql-5.1-rpl-merge has been made a mirror of mysql-next-mr:
- "mysql-5.1-rpl-merge$ bzr pull ../mysql-next-mr"
This is the first cset (merge/...) committed after pulling
from mysql-next-mr.
file
Issuing 'FLUSH LOGS' does not close and reopen indexfile.
Instead a SEEK_SET is performed.
This patch makes index file to be closed and reopened whenever a
rotation happens (FLUSH LOGS is issued or binary log exceeds
maximum configured size).
Implemented the server infrastructure for the fix:
1. Added a function LEX_STRING *thd_query_string(THD) to return
a LEX_STRING structure instead of char *.
This is the function that must be called in innodb instead of
thd_query()
2. Did some encapsulation in THD : aggregated thd_query and
thd_query_length into a LEX_STRING and made accessor and mutator
methods for easy code updating.
3. Updated the server code to use the new methods where applicable.
not on predefined values
The default name of the PID file was constructed, as documented,
based on the hostname. This name was subsequently used as the
base for the general log file name. If the name of the PID
file was overridden in the configuration, and no explicit name
was set for the general log file, the path location for the
PID file was used also for the general log file.
A new variable, 'default_logfile_name', has been introduced. This name
is constructed based on the hostname, and is then used to
construct both the PID file and the general log file.
The general log file will now, unless explicitly set, be
located in the server data directory (as documentated in
the server docs)
Let
- T be a transactional table and N non-transactional table.
- B be begin, C commit and R rollback.
- N be a statement that accesses and changes only N-tables.
- T be a statement that accesses and changes only T-tables.
In RBR, changes to N-tables that happen early in a transaction are not immediately flushed
upon committing a statement. This behavior may, however, break consistency in the presence
of concurrency since changes done to N-tables become immediately visible to other
connections. To fix this problem, we do the following:
. B N N T C would log - B N C B N C B T C.
. B N N T R would log - B N C B N C B T R.
Note that we are not preserving history from the master as we are introducing a commit that
never happened. However, this seems to be more acceptable than the possibility of breaking
consistency in the presence of concurrency.
Let
- T be a transactional table and N non-transactional table.
- B be begin, C commit and R rollback.
- M be a mixed statement, i.e. a statement that updates both T and N.
- M* be a mixed statement that fails while updating either T or N.
This patch restore the behavior presented in 5.1.37 for rows either produced in
the RBR or MIXED modes, when a M* statement that happened early in a transaction
had their changes written to the binary log outside the boundaries of the
transaction and wrapped in a BEGIN/ROLLBACK. This was done to keep the slave
consistent with with the master as the rollback would keep the changes on N and
undo them on T. In particular, we do what follows:
. B M* T C would log - B M* R B T C.
Note that, we are not preserving history from the master as we are introducing a
rollback that never happened. However, this seems to be more acceptable than
making the slave diverge. We do not fix the following case:
. B T M* C would log B T M* C.
The slave will diverge as the changes on T tables that originated from the M
statement are rolled back on the master but not on the slave. Unfortunately, we
cannot simply rollback the transaction as this would undo any uncommitted
changes on T tables.
SBR is not considered in this patch because a failing statement is written to
the binary along with the error code and a slave executes and then rolls back
the statement when it has an associated error code, thus undoing the effects
on T. In RBR and MBR, a full-fledged fix will be pushed after the WL 2687.
BUG#31665 sync_binlog should cause relay logs to be synchronized
NOTE: Backporting the patch to next-mr.
Add sync_relay_log option to server, this option works for relay log
the same as option sync_binlog for binlog. This option also synchronize
master info to disk when set to non-zero value.
Original patches from Sinisa and Mark, with some modifications
beyond unsigned long.
BUG#44779: binlog.binlog_max_extension may be causing failure on
next test in PB
NOTE1: this is the backport to next-mr.
NOTE2: already includes patch for BUG#44779.
Binlog file extensions would turn into negative numbers once the
variable used to hold the value reached maximum for signed
long. Consequently, incrementing value to the next (negative) number
would lead to .000000 extension, causing the server to fail.
This patch addresses this issue by not allowing negative extensions
and by returning an error on find_uniq_filename, when the limit is
reached. Additionally, warnings are printed to the error log when the
limit is approaching. FLUSH LOGS will also report warnings to the
user, if the extension number has reached the limit. The limit has been
set to 0x7FFFFFFF as the maximum.
NOTE: this is the backport to next-mr.
When using replication, the slave will not log any slow query logs queries
replicated from the master, even if the option "--log-slow-slave-statements"
is set and these take more than "log_query_time" to execute.
In order to log slow queries in replicated thread one needs to set the
--log-slow-slave-statements, so that the SQL thread is initialized with the
correct switch. Although setting this flag correctly configures the slave
thread option to log slow queries, there is an issue with the condition that
is used to check whether to log the slow query or not. When replaying binlog
events the statement contains the SET TIMESTAMP clause which will force the
slow logging condition check to fail. Consequently, the slow query logging will
not take place.
This patch addresses this issue by removing the second condition from the
log_slow_statements as it prevents slow queries to be binlogged and seems
to be deprecated.
Problem: LOGGER::general_log_write() relied on valid "thd" parameter passed
but had inconsistent "if (thd)" check.
Fix: as we always pass a valid "thd" parameter to the method,
redundant check removed.
with gcc 4.3.2
This patch fixes a number of GCC warnings about variables used
before initialized. A new macro UNINIT_VAR() is introduced for
use in the variable declaration, and LINT_INIT() usage will be
gradually deprecated. (A workaround is used for g++, pending a
patch for a g++ bug.)
GCC warnings for unused results (attribute warn_unused_result)
for a number of system calls (present at least in later
Ubuntus, where the usual void cast trick doesn't work) are
also fixed.
binlog
Mixing transactional (T) and non-transactional (N) tables on behalf of a
transaction may lead to inconsistencies among masters and slaves in STATEMENT
mode. The problem stems from the fact that although modifications done to
non-transactional tables on behalf of a transaction become immediately visible
to other connections they do not immediately get to the binary log and therefore
consistency is broken. Although there may be issues in mixing T and M tables in
STATEMENT mode, there are safe combinations that clients find useful.
In this bug, we fix the following issue. Mixing N and T tables in multi-level
(e.g. a statement that fires a trigger) or multi-table table statements (e.g.
update t1, t2...) were not handled correctly. In such cases, it was not possible
to distinguish when a T table was updated if the sequence of changes was N and T.
In a nutshell, just the flag "modified_non_trans_table" was not enough to reflect
that both a N and T tables were changed. To circumvent this issue, we check if an
engine is registered in the handler's list and changed something which means that
a T table was modified.
Check WL 2687 for a full-fledged patch that will make the use of either the MIXED or
ROW modes completely safe.
binlog
The fix for BUG 43929 introduced a regression issue. In a nutshell, when a
statement that changes a non-transactional table fails, it is written to the
binary log with the error code appended. Unfortunately, after BUG 43929, this
failure was flushing the transactional chace causing mismatch between execution
and logging histories. To fix this issue, we avoid flushing the transactional
cache when a commit or rollback is not issued.
NOTE: This undoes changes by BUG#42829 in sql_class.cc:binlog_query().
I will revert the change in a post-push fix (the binlog filter should
be checked in sql_base.cc:decide_logging_format()).
General overview:
The logic for switching to row format when binlog_format=MIXED had
numerous flaws. The underlying problem was the lack of a consistent
architecture.
General purpose of this changeset:
This changeset introduces an architecture for switching to row format
when binlog_format=MIXED. It enforces the architecture where it has
to. It leaves some bugs to be fixed later. It adds extensive tests to
verify that unsafe statements work as expected and that appropriate
errors are produced by problems with the selection of binlog format.
It was not practical to split this into smaller pieces of work.
Problem 1:
To determine the logging mode, the code has to take several parameters
into account (namely: (1) the value of binlog_format; (2) the
capabilities of the engines; (3) the type of the current statement:
normal, unsafe, or row injection). These parameters may conflict in
several ways, namely:
- binlog_format=STATEMENT for a row injection
- binlog_format=STATEMENT for an unsafe statement
- binlog_format=STATEMENT for an engine only supporting row logging
- binlog_format=ROW for an engine only supporting statement logging
- statement is unsafe and engine does not support row logging
- row injection in a table that does not support statement logging
- statement modifies one table that does not support row logging and
one that does not support statement logging
Several of these conflicts were not detected, or were detected with
an inappropriate error message. The problem of BUG#39934 was that no
appropriate error message was written for the case when an engine
only supporting row logging executed a row injection with
binlog_format=ROW. However, all above cases must be handled.
Fix 1:
Introduce new error codes (sql/share/errmsg.txt). Ensure that all
conditions are detected and handled in decide_logging_format()
Problem 2:
The binlog format shall be determined once per statement, in
decide_logging_format(). It shall not be changed before or after that.
Before decide_logging_format() is called, all information necessary to
determine the logging format must be available. This principle ensures
that all unsafe statements are handled in a consistent way.
However, this principle is not followed:
thd->set_current_stmt_binlog_row_based_if_mixed() is called in several
places, including from code executing UPDATE..LIMIT,
INSERT..SELECT..LIMIT, DELETE..LIMIT, INSERT DELAYED, and
SET @@binlog_format. After Problem 1 was fixed, that caused
inconsistencies where these unsafe statements would not print the
appropriate warnings or errors for some of the conflicts.
Fix 2:
Remove calls to THD::set_current_stmt_binlog_row_based_if_mixed() from
code executed after decide_logging_format(). Compensate by calling the
set_current_stmt_unsafe() at parse time. This way, all unsafe statements
are detected by decide_logging_format().
Problem 3:
INSERT DELAYED is not unsafe: it is logged in statement format even if
binlog_format=MIXED, and no warning is printed even if
binlog_format=STATEMENT. This is BUG#45825.
Fix 3:
Made INSERT DELAYED set itself to unsafe at parse time. This allows
decide_logging_format() to detect that a warning should be printed or
the binlog_format changed.
Problem 4:
LIMIT clause were not marked as unsafe when executed inside stored
functions/triggers/views/prepared statements. This is
BUG#45785.
Fix 4:
Make statements containing the LIMIT clause marked as unsafe at
parse time, instead of at execution time. This allows propagating
unsafe-ness to the view.
The problem: described in the bug report.
The fix:
--increase buffers where it's necessary
(buffers which are used in stxnmov)
--decrease buffer lengths which are used
Large transactions and statements may corrupt the binary log if the size of the
cache, which is set by the max_binlog_cache_size, is not enough to store the
the changes.
In a nutshell, to fix the bug, we save the position of the next character in the
cache before starting processing a statement. If there is a problem, we simply
restore the position thus removing any effect of the statement from the cache.
Unfortunately, to avoid corrupting the binary log, we may end up loosing changes
on non-transactional tables if they do not fit in the cache. In such cases, we
store an Incident_log_event in order to stop the slave and alert users that some
changes were not logged.
Precisely, for every non-transactional changes that do not fit into the cache,
we do the following:
a) the statement is *not* logged
b) an incident event is logged after committing/rolling back the transaction,
if any. Note that if a failure happens before writing the incident event to
the binary log, the slave will not stop and the master will not have reported
any error.
c) its respective statement gives an error
For transactional changes that do not fit into the cache, we do the following:
a) the statement is *not* logged
b) its respective statement gives an error
To work properly, this patch requires two additional things. Firstly, callers to
MYSQL_BIN_LOG::write and THD::binlog_query must handle any error returned and
take the appropriate actions such as undoing the effects of a statement. We
already changed some calls in the sql_insert.cc, sql_update.cc and sql_insert.cc
modules but the remaining calls spread all over the code should be handled in
BUG#37148. Secondly, statements must be either classified as DDL or DML because
DDLs that do not get into the cache must generate an incident event since they
cannot be rolled back.
statements missed from general log
A refinement of the test in the previous patch to avoid
using sleep as a means to ensure that timestamps are
added to the log entries.
BEGIN/COMMIT/ROLLBACK was subject to replication db rules, and
caused the boundary of a transaction not recognized correctly
when these queries were ignored by the rules.
Fixed the problem by skipping replication db rules for these
statements.
Make the caller of Query_log_event, Execute_load_log_event
constructors and THD::binlog_query to provide the error code
instead of having the constructors to figure out the error code.
Bug#44091: libmysqld gets stuck waiting on mutex on initialization
The problem was that libmysqld wasn't enforcing a certain
initialization and deinitialization order for the mysys
library. Another problem was that the global object used
for management of log event handlers (aka LOGGER) wasn't
being prepared for a possible reutilization.
What leads to the hang/crash reported is that a failure
to load the language file triggers a double call of the
cleanup functions, causing an already destroyed mutex to
be used.
The solution is enforce a order on the initialization and
deinitialization of the mysys library within the libmysqld
library and to ensure that the global LOGGER object reset
it's internal state during cleanup.
When the thread executing a DDL was killed after finished its
execution but before writing the binlog event, the error code in
the binlog event could be set wrongly to ER_SERVER_SHUTDOWN or
ER_QUERY_INTERRUPTED.
This patch fixed the problem by ignoring the kill status when
constructing the event for DDL statements.
This patch also included the following changes in order to
provide the test case.
1) modified mysqltest to support variable for connection command
2) modified mysql-test-run.pl, add new variable MYSQL_SLAVE to
run mysql client against the slave mysqld.
The problem here seem to be that when mysql
is redirecting stderr to a file, stderr becomes
buffered, whereas it is unbuffered by definition.
The solution is to unbuffer it by setting buffer
to null.
- Remove bothersome warning messages. This change focuses on the warnings
that are covered by the ignore file: support-files/compiler_warnings.supp.
- Strings are guaranteed to be max uint in length