- MWL#47, allowing to annotate row-based binlog events with the SQL test of
the originating query (eg. in mysqlbinlog output).
- row_based_replication_without_primary_key.patch, providing more intelligent
selection of index to use on slave when applying row-based binlog events
for tables with no primary key.
- Make mysqlbinlog omit redundant `use` around BEGIN/SAVEPOINT/COMMIT/
ROLLBACK in 5.0 binlogs.
This patch adds options to annotate the binlog (and the mysqlbinlog
output) with the original SQL query for queries that are logged
using row-based replication.
From a user perspective, the problem is that a FLUSH LOGS or SIGHUP
signal could end up associating the stdout and stderr to random
files. In the case of this bug report, the streams would end up
associated to InnoDB ibd files.
The freopen(3) function is not thread-safe on FreeBSD. What this
means is that if another thread calls open(2) during freopen()
is executing that another thread's fd returned by open(2) may get
re-associated with the file being passed to freopen(3). See FreeBSD
PR number 79887 for reference:
http://www.freebsd.org/cgi/query-pr.cgi?pr=79887
This problem is worked around by substituting a internal hook within
the FILE structure. This avoids the loss of atomicity by not having
the original fd closed before its duplicated.
Patch based on the original work by Vasil Dimov.
include/my_sys.h:
Export my_freopen.
mysys/my_fopen.c:
Add a my_freopen abstraction to workaround bugs in specific OSes.
Add a prototype for getosreldate() as older FreeBSD versions did
not define one.
sql/log.cc:
Move freopen abstraction code over to mysys.
The streams are now only reopened for writing.
Post-push fixes:
- fixed platform dependent result files
- appeasing valgrind warnings:
Fault injection was also uncovering a previously existing
potential mem leaks. For BUG#46166 testing purposes, fixed
by forcing handling the leak when injecting faults.
Manual merge from mysql-5.1-bugteam into mysql-5.5-bugteam.
Conflicts
=========
Text conflict in sql/log.cc
Text conflict in sql/log.h
Text conflict in sql/slave.cc
Text conflict in sql/sql_parse.cc
Text conflict in sql/sql_priv.h
- Fixed problem with oqgraph and 'make dist'
Note that after this merge we have a problem show in join_outer where we examine too many rows in one specific case (related to BUG#57024).
This will be fixed when mwl#128 is merged into 5.3.
when generating new name.
If find_uniq_filename returns an error, then this error is not
being propagated upwards, and execution does not report error to
the user (although a entry in the error log is generated).
Additionally, some more errors were ignored in new_file_impl:
- when writing the rotate event
- when reopening the index and binary log file
This patch addresses this by propagating the error up in the
execution stack. Furthermore, when rotation of the binary log
fails, an incident event is written, because there may be a
chance that some changes for a given statement, were not properly
logged. For example, in SBR, LOAD DATA INFILE statement requires
more than one event to be logged, should rotation fail while
logging part of the LOAD DATA events, then the logged data would
become inconsistent with the data in the storage engine.
mysql-test/include/restart_mysqld.inc:
Refactored restart_mysqld so that it is not hardcoded for
mysqld.1, but rather for the current server.
mysql-test/suite/binlog/t/binlog_index.test:
The error on open of index and binary log on new_file_impl
is now caught. Thence the user will get an error message.
We need to accomodate this change in the test case for the
failing FLUSH LOGS.
mysql-test/suite/rpl/t/rpl_binlog_errors-master.opt:
Sets max_binlog_size to 4096.
mysql-test/suite/rpl/t/rpl_binlog_errors.test:
Added some test cases for asserting that the error is found
and reported.
sql/handler.cc:
Catching error now returned by unlog (in ha_commit_trans) and
returning it.
sql/log.cc:
Propagating errors from new_file_impl upwards. The errors that
new_file_impl catches now are:
- error on generate_new_name
- error on writing the rotate event
- error when opening the index or the binary log file.
sql/log.h:
Changing declaration of:
- rotate_and_purge
- new_file
- new_file_without_locking
- new_file_impl
- unlog
They now return int instead of void.
sql/mysql_priv.h:
Change signature of reload_acl_and_cache so that write_to_binlog
is an int instead of bool.
sql/mysqld.cc:
Redeclaring not_used var as int instead of bool.
sql/rpl_injector.cc:
Changes to catch the return from rotate_and_purge.
sql/slave.cc:
Changes to catch the return values for new_file and rotate_relay_log.
sql/slave.h:
Changes to rotate_relay_log declaration (now returns int
instead of void).
sql/sql_load.cc:
In SBR, some logging of LOAD DATA events goes through
IO_CACHE_CALLBACK invocation at mf_iocache.c:_my_b_get. The
IO_CACHE implementation is ignoring the return value for from
these callbacks (pre_read and post_read), so we need to find out
at the end of the execution if the error is set or not in THD.
sql/sql_parse.cc:
Catching the rotate_relay_log and rotate_and_purge return values.
Semantic change in reload_acl_and_cache so that we report errors
in binlog interactions through the write_to_binlog output parameter.
If there was any failure while rotating the binary log, we should
then report the error to the client when handling SQLCOMM_FLUSH.
the DROP statement ..."
Problem: When using temporary tables and closing a session, an
implicit DROP TEMPORARY TABLE IF EXISTS is written to the binary
log (while cleaning up the context of the session THD - see:
sql_class.cc:THD::cleanup which calls close_temporary_tables).
close_temporary_tables, first checks if the binary log is opened
and then proceeds to creating the DROP statements. Then, such
statements, are written to the binary log through
MYSQL_BIN_LOG::write(Log_event *). Inside, there is another check
if the binary log is opened and if not an error is returned. This
is where the faulty behavior is triggered. Given that the test
case replays a binary log, with temp tables statements, and right
after it issues RESET MASTER, there is a chance that is_open will
report false (when the mysql session is closed and the temporary
tables are written).
is_open may return false, because MYSQL_BIN_LOG::reset_logs is
not setting the correct flag (LOG_CLOSE_TO_BE_OPENED), on the
MYSQL_LOG_BIN::log_state (instead it sets just the
LOG_CLOSE_INDEX flag, leaving the log_state to
LOG_CLOSED). Thence, when writing the DROP statement as part of
the THD::cleanup, the thread could get a return value of false
for is_open - inside MYSQL_BIN_LOG::write, ultimately reporting
that it can't write the event to the binary log.
Fix: We fix this by adding the correct flag, missing in the
second close.
Open issues:
- A better fix for #57688; Igor is working on this
- Test failure in index_merge_innodb.test ; Igor promised to look at this
- Some Innodb tests fails (need to merge with latest xtradb) ; Kristian promised to look at this.
- Failing tests: innodb_plugin.innodb_bug56143 innodb_plugin.innodb_bug56632 innodb_plugin.innodb_bug56680 innodb_plugin.innodb_bug57255
- Werror is disabled; Should be enabled after merge with xtradb.
- Changed TABLE->alias to String to get fewer reallocs when alias are used.
- Preallocate some buffers
Changed some String->c_ptr() -> String->ptr() when \0 is not needed.
Fixed wrong usage of String->ptr() when we need a \0 terminated string.
Use my_strtod() instead of my_atof() to avoid having to add \0 to string.
c_ptr() -> c_ptr_safe() to avoid warnings from valgrind.
zr
sql/event_db_repository.cc:
Update usage of TABLE->alias
sql/event_scheduler.cc:
c_ptr() -> c_ptr_safe()
sql/events.cc:
c_ptr() -> ptr() as \0 was not needed
sql/field.cc:
Update usage of TABLE->alias
sql/field.h:
Update usage of TABLE->alias
sql/ha_partition.cc:
Update usage of TABLE->alias
sql/handler.cc:
Update usage of TABLE->alias
Fixed wrong usage of str.ptr()
sql/item.cc:
Fixed error where code wrongly assumed string was \0 terminated.
sql/item_func.cc:
c_ptr() -> c_ptr_safe()
Update usage of TABLE->alias
sql/item_sum.h:
Use my_strtod() instead of my_atof() to avoid having to add \0 to string
sql/lock.cc:
Update usage of TABLE->alias
sql/log.cc:
c_ptr() -> ptr() as \0 was not needed
sql/log_event.cc:
c_ptr_quick() -> ptr() as \0 was not needed
sql/opt_range.cc:
ptr() -> c_ptr() as \0 is needed
sql/opt_subselect.cc:
Update usage of TABLE->alias
sql/opt_table_elimination.cc:
Update usage of TABLE->alias
sql/set_var.cc:
ptr() -> c_ptr() as \0 is needed
c_ptr() -> c_ptr_safe()
sql/sp.cc:
c_ptr() -> ptr() as \0 was not needed
sql/sp_rcontext.cc:
Update usage of TABLE->alias
sql/sql_base.cc:
Preallocate buffers
Update usage of TABLE->alias
sql/sql_class.cc:
Fix arguments to sprintf() to work even if string is not \0 terminated
sql/sql_insert.cc:
Update usage of TABLE->alias
c_ptr() -> ptr() as \0 was not needed
sql/sql_load.cc:
Preallocate buffers
Trivial optimizations
sql/sql_parse.cc:
Trivial optimization
sql/sql_plugin.cc:
c_ptr() -> ptr() as \0 was not needed
sql/sql_select.cc:
Update usage of TABLE->alias
sql/sql_show.cc:
Update usage of TABLE->alias
sql/sql_string.h:
Added move() function to move allocated memory from one object to another.
sql/sql_table.cc:
Update usage of TABLE->alias
c_ptr() -> c_ptr_safe()
sql/sql_test.cc:
ptr() -> c_ptr_safe()
sql/sql_trigger.cc:
Update usage of TABLE->alias
c_ptr() -> c_ptr_safe()
sql/sql_update.cc:
Update usage of TABLE->alias
sql/sql_view.cc:
ptr() -> c_ptr_safe()
sql/sql_yacc.yy:
ptr() -> c_ptr()
sql/table.cc:
Update usage of TABLE->alias
sql/table.h:
Changed TABLE->alias to String to get fewer reallocs when alias are used.
storage/federatedx/ha_federatedx.cc:
Use c_ptr_safe() to ensure strings are \0 terminated.
storage/maria/ha_maria.cc:
Update usage of TABLE->alias
storage/myisam/ha_myisam.cc:
Update usage of TABLE->alias
storage/xtradb/row/row0sel.c:
Ensure that null bits in record are properly reset.
(Old code didn't work as row_search_for_mysql() can be called twice while reading fields from one row.
Before this fix, file io for the binary log file was not accounted properly,
and showed no io at all.
This bug was due to the following issues:
1) file io for the binlog was instrumented:
- sometime as "wait/io/file/sql/binlog"
- sometime as "wait/io/file/sql/MYSQL_LOG"
leading to inconsistent event_names.
2) the binlog file itself was using an IO_CACHE,
but the IO_CACHE implementation in mysys/mf_iocache.c was
not instrumented to make performance schema calls to record file io.
3) The "wait/io/file/sql/MYSQL_LOG" instrumentation was used
for several log files, such as:
- the binary log
- the slow log
- the query log
which caused file io in these different log files to be accounted
against the same instrument.
The instrumentation needs to have a finer grain and report io
in different event_names, because each file really serves a different purpose.
With this fix:
- the IO_CACHE implementation is now instrumented
- the "wait/io/file/sql/MYSQL_LOG" instrument has been removed
- binlog io is now always instrumented with "wait/io/file/sql/binlog"
- the slow log is instrumented with a new name, "wait/io/file/sql/slow_log"
- the query log is instrumented with a new name, "wait/io/file/sql/query_log"
Make the binlog handlerton participate in START TRANSACTION WITH CONSISTENT
SNAPSHOT, recording the binlog position corresponding to the snapshot taken
in other MVCC storage engines.
Expose this consistent binlog position as the new status variables
binlog_trx_file and binlog_trx_position. This enables to get a fully
non-locking snapshot of the database (including binlog position for
slave provisioning), avoiding the need for FLUSH TABLES WITH READ LOCK.
Modify mysqldump to detect if the server supports this new feature, and
if so, avoid FLUSH TABLES WITH READ LOCK for --single-transaction
--master-data snapshot backups.
After the WL#2687, the binlog_cache_size and max_binlog_cache_size affect both the
stmt-cache and the trx-cache. This means that the resource used is twice the amount
expected/defined by the user.
The binlog_cache_use is incremented when the stmt-cache or the trx-cache is used
and binlog_cache_disk_use is incremented when the disk space from the stmt-cache or the
trx-cache is used. This behavior does not allow to distinguish which cache may be harming
performance due to the extra disk accesses and needs to have its in-memory cache
increased.
To fix the problem, we introduced two new options and status variables related to the
stmt-cache:
Options:
. binlog_stmt_cache_size
. max_binlog_stmt_cache_size
Status Variables:
. binlog_stmt_cache_use
. binlog_stmt_cache_disk_use
So there are
. binlog_cache_size that defines the size of the transactional cache for
updates to transactional engines for the binary log.
. binlog_stmt_cache_size that defines the size of the statement cache for
updates to non-transactional engines for the binary log.
. max_binlog_cache_size that sets the total size of the transactional
cache.
. max_binlog_stmt_cache_size that sets the total size of the statement
cache.
. binlog_cache_use that identifies the number of transactions that used the
temporary transactional binary log cache.
. binlog_cache_disk_use that identifies the number of transactions that used
the temporary transactional binary log cache but that exceeded the value of
binlog_cache_size.
. binlog_stmt_cache_use that identifies the number of statements that used the
temporary non-transactional binary log cache.
. binlog_stmt_cache_disk_use that identifies the number of statements that used
the temporary non-transactional binary log cache but that exceeded the value of
binlog_stmt_cache_size.
include/my_sys.h:
Updated message on disk_writes' usage.
mysql-test/extra/binlog_tests/binlog_cache_stat.test:
Updated the test case and added code to check the new status variables
binlog_stmt_cache_use and binlog_stmt_cache_disk_use.
mysql-test/extra/rpl_tests/rpl_binlog_max_cache_size.test:
Updated the test case to use the new system variables max_binlog_stmt_cache_size and binlog_stmt_cache_size.
mysql-test/r/mysqld--help-notwin.result:
Updated the result file.
mysql-test/suite/binlog/r/binlog_mixed_cache_stat.result:
Updated the result file.
mysql-test/suite/binlog/r/binlog_row_cache_stat.result:
Updated the result file.
mysql-test/suite/binlog/r/binlog_stm_cache_stat.result:
Updated the result file.
mysql-test/suite/rpl/r/rpl_mixed_binlog_max_cache_size.result:
Update the result file.
mysql-test/suite/rpl/r/rpl_row_binlog_max_cache_size.result:
Update the result file.
mysql-test/suite/rpl/r/rpl_stm_binlog_max_cache_size.result:
Updated the result file.
mysql-test/suite/sys_vars/inc/binlog_stmt_cache_size_basic.inc:
Added a test case to check the binlog_stmt_cache_size.
mysql-test/suite/sys_vars/r/binlog_stmt_cache_size_basic_32.result:
Updated the result file.
mysql-test/suite/sys_vars/r/binlog_stmt_cache_size_basic_64.result:
Updated the result file.
mysql-test/suite/sys_vars/r/max_binlog_stmt_cache_size_basic.result:
Updated the result file.
mysql-test/suite/sys_vars/t/binlog_stmt_cache_size_basic_32.test:
Added a test case to check the binlog_stmt_cache_size.
mysql-test/suite/sys_vars/t/binlog_stmt_cache_size_basic_64.test:
Added a test case to check the binlog_stmt_cache_size.
mysql-test/suite/sys_vars/t/max_binlog_cache_size_func-master.opt:
Removed because there is no test case max_binlog_cache_size_func.
mysql-test/suite/sys_vars/t/max_binlog_stmt_cache_size_basic.test:
Added a test case to check the system variable max_binlog_stmt_cache_size.
sql/log.cc:
There two main changes in here:
. Changed the set_write_error() as an error message is set according
to the type of the cache.
. Created the function set_binlog_cache_info where references to the
appropriate status and system variables are set and the server can
smoothly compute statistics and set the maximum size for each cache.
sql/log.h:
Changed the signature of the function in order to identify the error message
to be printed out as there is a different error code for each type of cache.
sql/mysqld.cc:
Added new status variables binlog_stmt_cache_use and binlog_stmt_cache_disk_use.
sql/mysqld.h:
Added new system variables max_binlog_stmt_cache_size and binlog_stmt_cache_size.
sql/share/errmsg-utf8.txt:
Added new error message related to the statement cache.
sql/sys_vars.cc:
Added new system variables max_binlog_stmt_cache_size and binlog_stmt_cache_size.
The binlog_cache_use is incremented twice when changes to a transactional table
are committed, i.e. TC_LOG_BINLOG::log_xid calls is called. The problem happens
because log_xid calls both binlog_flush_stmt_cache and binlog_flush_trx_cache
without checking if such caches are empty thus unintentionally increasing the
binlog_cache_use value twice.
Although not explicitly mentioned in the bug, the binlog_cache_disk_use presents
the same problem.
The binlog_cache_use and binlog_cache_disk_use are status variables that are
incremented when transactional (i.e. trx-cache) or non-transactional (i.e.
stmt-cache) changes are committed. They are used to compute the ratio between
the use of disk and memory while gathering updates for a transaction.
The problem reported is fixed by avoiding incrementing the binlog_cache_use
and binlog_cache_disk_use if a cache is empty. We also have decided to increment
both variables when a cache is truncated as the cache is used although its
content is discarded and is not written to the binary log.
In this patch, we take the opportunity to re-organize the code around the
function binlog_flush_trx_cache and binlog_flush_stmt_cache.
mysql-test/extra/binlog_tests/binlog_cache_stat.test:
Updated the test case with comments and additional tests to check both
transactional and non-transactional changes and what happens when a
is transaction either committed or aborted.
mysql-test/suite/binlog/r/binlog_innodb.result:
Updated the result file.
mysql-test/suite/binlog/r/binlog_mixed_cache_stat.result:
Updated the result file.
mysql-test/suite/binlog/r/binlog_row_cache_stat.result:
Updated the result file.
mysql-test/suite/binlog/r/binlog_stm_cache_stat.result:
Updated the result file.
mysql-test/suite/binlog/t/binlog_mixed_cache_stat.test:
Updated the test case with a new source file.
mysql-test/suite/binlog/t/binlog_row_cache_stat.test:
Updated the test case with a new source file.
mysql-test/suite/binlog/t/binlog_stm_cache_stat.test:
Updated the test case with a new source file.
sql/log.cc:
There are four changes in here:
. Computed statistics on binlog_cache_use and binlog_cache_disk_use while
resting the cache.
. Refactored the code so binlog_flush_cache handles both the trx-cache and
stmt-cache.
. There are functions that encapsulate the calls to binlog_flush_cache and
make easier to read the code.
. binlog_commit_flush_stmt_cache is now taking into account the result:
success or error.
sql/log_event.h:
Guaranteed Xid_log_event is always classified as a transactional event.
Port the Facebook patch for releasing InnoDB row locks early to the
MWL#116 framework.
A new --innodb-release-locks-early option (off by default) enables a
prepare_ordered() handlerton method which will release row locks and
commit a transaction to memory immediately after successful prepare.
If the server subsequently tries to rollback (ie. due to binlog error),
crashes the server to prevent corrupting the InnoDB state.
mysql-test/r/innodb_release_row_locks_early.result:
Test case.
mysql-test/t/innodb_release_row_locks_early-master.opt:
Test case.
mysql-test/t/innodb_release_row_locks_early.test:
Test case.
sql/log.cc:
Add DEBUG_SYNC points for testing.
storage/xtradb/handler/ha_innodb.cc:
Release locks during prepare phase if --innodb-release-locks-early.
Crash the server if we are asked to rollback after releasing locks and
committing the transaction to memory.
storage/xtradb/include/srv0srv.h:
Add variable for --innodb-release-locks-early option.
storage/xtradb/include/trx0sys.ic:
If --innodb-release-locks-early, treat a transaction as committed to memory
as soon as it enters the TRX_PREPARED state.
storage/xtradb/srv/srv0srv.c:
Add variable for --innodb-release-locks-early option.
Remove the extra class hierarchy with classes TC_LOG_queued, TC_LOG_unordered,
and TC_LOG_group_commit, folding the code into the TC_LOG_MMAP and
TC_LOG_BINLOG classes. In particular TC_LOG_BINLOG is greatly simplified by
this, unifying the code path for transactional and non-transactional
commit.
Remove unnecessary locking of LOCK_log in MYSQL_BIN_LOG::write() (backport
of same fix from mysql-5.5).
Make TC_LOG_MMAP (and TC_LOG_DUMMY) derive directly from TC_LOG, avoiding the
inheritance hierarchy TC_LOG_queued->TC_LOG_unordered.
Put the wakeup facility for commit_ordered() calls into the THD class.
Some renaming to get better names.
Fix assorted warnings that are generated in optimized builds.
Most of it is silencing variables that are set but unused.
This patch also introduces the MY_ASSERT_UNREACHABLE macro
which helps the compiler to deduce that a certain piece of
code is unreachable.
include/my_compiler.h:
Use GCC's __builtin_unreachable if available. It allows
GCC to deduce the unreachability of certain code paths,
thus avoiding warnings that, for example, accused that a
variable could be used without being initialized (due to
unreachable code paths).
For crash testing: kill the server without generating core file.
include/my_dbug.h
Use kill(getpid(), SIGKILL) which cannot be caught by signal handlers.
All DBUG_XXX macros should be no-ops in optimized mode, do that for DBUG_ABORT as well.
sql/handler.cc
Kill server without generating core.
sql/log.cc
Kill server without generating core.
For crash testing: kill the server without generating core file.
include/my_dbug.h
Use kill(getpid(), SIGKILL) which cannot be caught by signal handlers.
All DBUG_XXX macros should be no-ops in optimized mode, do that for DBUG_ABORT as well.
sql/handler.cc
Kill server without generating core.
sql/log.cc
Kill server without generating core.
Now the actual binlog position for each commit is stored in THD, and XtraDB
can fetch the correct value from within commit_ordered() or commit().
mysql-test/r/group_commit_binlog_pos.result:
Test case for XtraDB binlog position.
mysql-test/t/group_commit_binlog_pos-master.opt:
Test case for XtraDB binlog position.
mysql-test/t/group_commit_binlog_pos.test:
Test case for XtraDB binlog position.
sql/log.cc:
Save binlog position corresponding to commit in THD, and make accessible to storage engine.
sql/sql_parse.cc:
Add generic crash point for use in test cases.
storage/xtradb/handler/ha_innodb.cc:
Update to use new method of getting current binlog position that works with group commit.
storage/xtradb/handler/ha_innodb.h:
Update to use new method of getting current binlog position that works with group commit.
The binlog_cache_use is incremented twice when changes to a transactional table
are committed, i.e. TC_LOG_BINLOG::log_xid calls is called. The problem happens
because log_xid calls both binlog_flush_stmt_cache and binlog_flush_trx_cache
without checking if such caches are empty thus unintentionally increasing the
binlog_cache_use value twice.
To fix the problem we avoided incrementing the binlog_cache_use if the cache is
empty. We also decided to increment binlog_cache_use when the cache is truncated
as the cache is used although its content is discarded and is not written to the
binary log.
Note that binlog_cache_use is incremented for both types of cache, transactional
and non-transactional and that the behavior presented in this patch also applies
to the binlog_cache_disk_use.
Finally, we re-organized the code around the functions binlog_flush_trx_cache and
binlog_flush_stmt_cache.
mysql-test/extra/binlog_tests/binlog_cache_stat.test:
Updated the test case with comments and additional tests to check both
transactional and non-transactional changes and what happens when a
is transaction either committed or aborted.
mysql-test/suite/binlog/r/binlog_innodb.result:
Updated the result file.
mysql-test/suite/binlog/r/binlog_mixed_cache_stat.result:
Updated the result file.
mysql-test/suite/binlog/r/binlog_row_cache_stat.result:
Updated the result file.
mysql-test/suite/binlog/r/binlog_stm_cache_stat.result:
Updated the result file.
mysql-test/suite/binlog/t/binlog_mixed_cache_stat.test:
Updated the test case with a new source file.
mysql-test/suite/binlog/t/binlog_row_cache_stat.test:
Updated the test case with a new source file.
mysql-test/suite/binlog/t/binlog_stm_cache_stat.test:
Updated the test case with a new source file.
sql/log_event.cc:
Introduced debug information to check if Query_log_event("BEGIN"),
Query_log_event("COMMIT") and Query_log_event("ROLLBACK").
sql/log_event.h:
Guaranteed Xid_log_event is always classified as a transactional event.