The binlog_cache_use is incremented twice when changes to a transactional table
are committed, i.e. TC_LOG_BINLOG::log_xid calls is called. The problem happens
because log_xid calls both binlog_flush_stmt_cache and binlog_flush_trx_cache
without checking if such caches are empty thus unintentionally increasing the
binlog_cache_use value twice.
To fix the problem we avoided incrementing the binlog_cache_use if the cache is
empty. We also decided to increment binlog_cache_use when the cache is truncated
as the cache is used although its content is discarded and is not written to the
binary log.
Note that binlog_cache_use is incremented for both types of cache, transactional
and non-transactional and that the behavior presented in this patch also applies
to the binlog_cache_disk_use.
Finally, we re-organized the code around the functions binlog_flush_trx_cache and
binlog_flush_stmt_cache.
In sql/log.c, member function wait_for_update_bin_log, a condition is entered with
THD::enter_cond but is not exited. This might leave dangling references to the
mutex/condition in the per-thread information area.
To fix the problem, we call exit_cond to properly remove references to the mutex,
LOCK_log.
temp table
This patch introduces two key changes in the replication's behavior.
Firstly, it reverts part of BUG#51894 which puts any update to temporary tables
into the trx-cache. Now, updates to temporary tables are handled according to
the type of their engines as a regular table.
Secondly, an unsafe mixed statement, (i.e. a statement that access transactional
table as well non-transactional or temporary table, and writes to any of them),
are written into the trx-cache in order to minimize errors in the execution when
the statement logging format is in use.
Such changes has a direct impact on which statements are classified as unsafe
statements and thus part of BUG#53259 is reverted.
After BUG#36649, warnings for sub-statements are cleared when a
new sub-statement is started. This is problematic since it suppresses
warnings for unsafe statements in some cases. It is important that we
always give a warning to the client, because the user needs to know
when there is a risk that the slave goes out of sync.
We fixed the problem by generating warning messages for unsafe statements
while returning from a stored procedure, function, trigger or while
executing a top level statement.
We also started checking unsafeness when both performance and log tables are
used. This is necessary after the performance schema which does a distinction
between performance and log tables.
In sql/log.c, member function wait_for_update_bin_log, a condition is entered with
THD::enter_cond but is not exited. This might leave dangling references to the
mutex/condition in the per-thread information area.
To fix the problem, we call exit_cond to properly remove references to the mutex,
LOCK_log.
A CREATE...SELECT that fails is written to the binary log if a non-transactional
statement is updated. If the logging format is ROW, the CREATE statement and the
changes are written to the binary log as distinct events and by consequence the
created table is not rolled back in the slave.
In this patch, we opted to let the slave goes out of sync by not writting to the
binary log the CREATE statement. We do this by simply reseting the binary log's
cache.
This patch also fixes Bug#55452 "SET PASSWORD is
replicated twice in RBR mode".
The goal of this patch is to remove the release of
metadata locks from close_thread_tables().
This is necessary to not mistakenly release
the locks in the course of a multi-step
operation that involves multiple close_thread_tables()
or close_tables_for_reopen().
On the same token, move statement commit outside
close_thread_tables().
Other cleanups:
Cleanup COM_FIELD_LIST.
Don't call close_thread_tables() in COM_SHUTDOWN -- there
are no open tables there that can be closed (we leave
the locked tables mode in THD destructor, and this
close_thread_tables() won't leave it anyway).
Make open_and_lock_tables() and open_and_lock_tables_derived()
call close_thread_tables() upon failure.
Remove the calls to close_thread_tables() that are now
unnecessary.
Simplify the back off condition in Open_table_context.
Streamline metadata lock handling in LOCK TABLES
implementation.
Add asserts to ensure correct life cycle of
statement transaction in a session.
Remove a piece of dead code that has also become redundant
after the fix for Bug 37521.
Fix warnings flagged by the new warning option -Wunused-but-set-variable
that was added to GCC 4.6 and that is enabled by -Wunused and -Wall. The
option causes a warning whenever a local variable is assigned to but is
later unused. It also warns about meaningless pointer dereferences.
Essentially, the problem is that safemalloc is excruciatingly
slow as it checks all allocated blocks for overrun at each
memory management primitive, yielding a almost exponential
slowdown for the memory management functions (malloc, realloc,
free). The overrun check basically consists of verifying some
bytes of a block for certain magic keys, which catches some
simple forms of overrun. Another minor problem is violation
of aliasing rules and that its own internal list of blocks
is prone to corruption.
Another issue with safemalloc is rather the maintenance cost
as the tool has a significant impact on the server code.
Given the magnitude of memory debuggers available nowadays,
especially those that are provided with the platform malloc
implementation, maintenance of a in-house and largely obsolete
memory debugger becomes a burden that is not worth the effort
due to its slowness and lack of support for detecting more
common forms of heap corruption.
Since there are third-party tools that can provide the same
functionality at a lower or comparable performance cost, the
solution is to simply remove safemalloc. Third-party tools
can provide the same functionality at a lower or comparable
performance cost.
The removal of safemalloc also allows a simplification of the
malloc wrappers, removing quite a bit of kludge: redefinition
of my_malloc, my_free and the removal of the unused second
argument of my_free. Since free() always check whether the
supplied pointer is null, redudant checks are also removed.
Also, this patch adds unit testing for my_malloc and moves
my_realloc implementation into the same file as the other
memory allocation primitives.
Apart strict-aliasing warnings, fix the remaining warnings
generated by GCC 4.4.4 -Wall and -Wextra flags.
One major source of warnings was the in-house function my_bcmp
which (unconventionally) took pointers to unsigned characters
as the byte sequences to be compared. Since my_bcmp and bcmp
are deprecated functions whose only difference with memcmp is
the return value, every use of the function is replaced with
memcmp as the special return value wasn't actually being used
by any caller.
There were also various other warnings, mostly due to type
mismatches, missing return values, missing prototypes, dead
code (unreachable) and ignored return values.
BUG#54872 MBR: replication failure caused by using tmp table inside transaction
Changed criteria to classify a statement as unsafe in order to reduce the
number of spurious warnings. So a statement is classified as unsafe when
there is on-going transaction at any point of the execution if:
1. The mixed statement is about to update a transactional table and
a non-transactional table.
2. The mixed statement is about to update a temporary transactional
table and a non-transactional table.
3. The mixed statement is about to update a transactional table and
read from a non-transactional table.
4. The mixed statement is about to update a temporary transactional
table and read from a non-transactional table.
5. The mixed statement is about to update a non-transactional table
and read from a transactional table when the isolation level is
lower than repeatable read.
After updating a transactional table if:
6. The mixed statement is about to update a non-transactional table
and read from a temporary transactional table.
7. The mixed statement is about to update a non-transactional table
and read from a temporary transactional table.
8. The mixed statement is about to update a non-transactionala table
and read from a temporary non-transactional table.
9. The mixed statement is about to update a temporary non-transactional
table and update a non-transactional table.
10. The mixed statement is about to update a temporary non-transactional
table and read from a non-transactional table.
11. A statement is about to update a non-transactional table and the
option variables.binlog_direct_non_trans_update is OFF.
The reason for this is that locks acquired may not protected a concurrent
transaction of interfering in the current execution and by consequence in
the result. So the patch reduced the number of spurious unsafe warnings.
Besides we fixed a regression caused by BUG#51894, which makes temporary
tables to go into the trx-cache if there is an on-going transaction. In
MIXED mode, the patch for BUG#51894 ignores that the trx-cache may have
updates to temporary non-transactional tables that must be written to the
binary log while rolling back the transaction.
So we fix this problem by writing the content of the trx-cache to the
binary log while rolling back a transaction if a non-transactional
temporary table was updated and the binary logging format is MIXED.
Conflicts:
Text conflict in mysql-test/r/archive.result
Contents conflict in mysql-test/r/innodb_bug38231.result
Text conflict in mysql-test/r/mdl_sync.result
Text conflict in mysql-test/suite/binlog/t/disabled.def
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_binlog_format_errors.result
Text conflict in mysql-test/t/archive.test
Contents conflict in mysql-test/t/innodb_bug38231.test
Text conflict in mysql-test/t/mdl_sync.test
Text conflict in sql/sp_head.cc
Text conflict in sql/sql_show.cc
Text conflict in sql/table.cc
Text conflict in sql/table.h
This patch fixes two problems described as follows:
1 - If there is an on-going transaction and a temporary table is created or
dropped, any failed statement that follows the "create" or "drop commands"
triggers a rollback and by consequence the slave will go out sync because
the binary log will have a wrong sequence of events.
To fix the problem, we changed the expression that evaluates when the
cache should be flushed after either the rollback of a statment or
transaction.
2 - When a "CREATE TEMPORARY TABLE SELECT * FROM" was executed the
OPTION_KEEP_LOG was not set into the thd->options. For that reason, if
the transaction had updated only transactional engines and was rolled
back at the end (.e.g due to a deadlock) the changes were not written
to the binary log, including the creation of the temporary table.
To fix the problem, we have set the OPTION_KEEP_LOG into the thd->options
when a "CREATE TEMPORARY TABLE SELECT * FROM" is executed.
MYSQL_BIN_LOG m_table_map_version member and it's associated
functions were not used in the logic of binlogging and replication,
this patch removed all related code.
transaction
BUG#52616 Temp table prevents switch binlog format from STATEMENT to ROW
Before the WL#2687 and BUG#46364, every non-transactional change that happened
after a transactional change was written to trx-cache and flushed upon
committing the transaction. WL#2687 and BUG#46364 changed this behavior and
non-transactional changes are now written to the binary log upon committing
the statement.
A binary log event is identified as transactional or non-transactional through
a flag in the Log_event which is set taking into account the underlie storage
engine on what it is stems from. In the current bug, this flag was not being
set properly when the DROP TEMPORARY TABLE was executed.
However, while fixing this bug we figured out that changes to temporary tables
should be always written to the trx-cache if there is an on-going transaction.
Otherwise, binlog events in the reversed order would be produced.
Regarding concurrency, keeping changes to temporary tables in the trx-cache is
also safe as temporary tables are only visible to the owner connection.
In this patch, we classify the following statements as unsafe:
1 - INSERT INTO t_myisam SELECT * FROM t_myisam_temp
2 - INSERT INTO t_myisam_temp SELECT * FROM t_myisam
3 - CREATE TEMPORARY TABLE t_myisam_temp SELECT * FROM t_myisam
On the other hand, the following statements are classified as safe:
1 - INSERT INTO t_innodb SELECT * FROM t_myisam_temp
2 - INSERT INTO t_myisam_temp SELECT * FROM t_innodb
The patch also guarantees that transactions that have a DROP TEMPORARY are
always written to the binary log regardless of the mode and the outcome:
commit or rollback. In particular, the DROP TEMPORARY is extended with the
IF EXISTS clause when the current statement logging format is set to row.
Finally, the patch allows to switch from STATEMENT to MIXED/ROW when there
are temporary tables but the contrary is not possible.
Adding my_global.h first in all files using
NO_EMBEDDED_ACCESS_CHECKS.
Correcting a merge problem resulting from a changed definition
of check_some_access compared to the original patches.
Conflicts:
Text conflict in mysql-test/suite/binlog/r/binlog_row_mix_innodb_myisam.result
Text conflict in sql/log.cc
Text conflict in sql/set_var.cc
Text conflict in sql/sql_class.cc
This patch:
- Moves all definitions from the mysql_priv.h file into
header files for the component where the variable is
defined
- Creates header files if the component lacks one
- Eliminates all include directives from mysql_priv.h
- Eliminates all circular include cycles
- Rename time.cc to sql_time.cc
- Rename mysql_priv.h to sql_priv.h
BUG#46364 introduced the flag binlog_direct_non_transactional_updates which
would make N-changes to be written to the binary log upon committing the
statement when "ON". On the other hand, when "OFF" the option was supposed
to mimic the behavior in 5.1. However, the implementation was not mimicking
the behavior correctly and the following bugs popped up:
Case #1: N-changes executed within a transaction would go into
the S-cache. When later in the same transaction a
T-change occurs, N-changes following it were written
to the T-cache instead of the S-cache. In some cases,
this raises problems. For example, a
Table_map_log_event being written initially into the
S-cache, together with the initial N-changes, would be
absent from the T-cache. This would log N-changes
orphaned from a Table_map_log_event (thence discarded
at the slave). (MIXED and ROW)
Case #2: When rolling back a transaction, the N-changes that
might be in the T-cache were disregarded and
truncated along with the T-changes. (MIXED and ROW)
Case #3: When a MIXED statement (TN) is ahead of any other
T-changes in the transaction and it fails, it is kept
in the T-cache until the transaction ends. This is
not the case in 5.1 or Betony (5.5.2). In these, the
failed TN statement would be written to the binlog at
the same instant it had failed and not deferred until
transaction end. (SBR)
To fix these problems, we have decided to do what follows:
For Case #1 and #2, we circumvent them:
1. by not letting binlog_direct_non_transactional_updates
affect MIXED and RBR. These modes will keep the behavior
provided by WL#2687. Although this will make Celosia to
behave differently from 5.1, an execution will be always
safe under such modes in the sense that slaves will never
go out sync. In 5.1, using either MIXED or ROW while
mixing N-statements and T-statements was not safe.
For Case #3, we don't actually fix it. We:
1. keep it and make all MIXED statements whether they end
up failing or not or whether they are up front in the
transaction or after some transactional change to always
be stored in the T-cache. This means that it is written
to the binary log on transaction commit/rollback only.
2. We make the warning message even more specific about the
MIXED statement and SBR.
Conflicts:
Text conflict in client/mysqlbinlog.cc
Text conflict in mysql-test/Makefile.am
Text conflict in mysql-test/collections/default.daily
Text conflict in mysql-test/r/mysqlbinlog_row_innodb.result
Text conflict in mysql-test/suite/rpl/r/rpl_typeconv_innodb.result
Text conflict in mysql-test/suite/rpl/t/rpl_get_master_version_and_clock.test
Text conflict in mysql-test/suite/rpl/t/rpl_row_create_table.test
Text conflict in mysql-test/suite/rpl/t/rpl_slave_skip.test
Text conflict in mysql-test/suite/rpl/t/rpl_typeconv_innodb.test
Text conflict in mysys/charset.c
Text conflict in sql/field.cc
Text conflict in sql/field.h
Text conflict in sql/item.h
Text conflict in sql/item_func.cc
Text conflict in sql/log.cc
Text conflict in sql/log_event.cc
Text conflict in sql/log_event_old.cc
Text conflict in sql/mysqld.cc
Text conflict in sql/rpl_utility.cc
Text conflict in sql/rpl_utility.h
Text conflict in sql/set_var.cc
Text conflict in sql/share/Makefile.am
Text conflict in sql/sql_delete.cc
Text conflict in sql/sql_plugin.cc
Text conflict in sql/sql_select.cc
Text conflict in sql/sql_table.cc
Text conflict in storage/example/ha_example.h
Text conflict in storage/federated/ha_federated.cc
Text conflict in storage/myisammrg/ha_myisammrg.cc
Text conflict in storage/myisammrg/myrg_open.c
When mysqlbinlog was given the --database=X flag, it always printed
'ROLLBACK TO', but the corresponding 'SAVEPOINT' statement was not
printed. The replicated filter(replicated-do/ignore-db) and binlog
filter (binlog-do/ignore-db) has the same problem. They are solved
in this patch together.
After this patch, We always check whether the query is 'SAVEPOINT'
statement or not. Because this is a literal check, 'SAVEPOINT' and
'ROLLBACK TO' statements are also binlogged in uppercase with no
any comments.
The binlog before this patch can be handled correctly except one case
that any comments are in front of the keywords. for example:
/* bla bla */ SAVEPOINT a;
/* bla bla */ ROLLBACK TO a;
Conflicts:
Text conflict in mysql-test/r/partition_innodb.result
Text conflict in sql/field.h
Text conflict in sql/item.h
Text conflict in sql/item_cmpfunc.h
Text conflict in sql/item_sum.h
Text conflict in sql/log_event_old.cc
Text conflict in sql/protocol.cc
Text conflict in sql/sql_select.cc
Text conflict in sql/sql_yacc.yy
for InnoDB
The class Field_bit_as_char stores the metadata for the
field incorrecly because bytes_in_rec and bit_len are set
to (field_length + 7 ) / 8 and 0 respectively, while
Field_bit has the correct values field_length / 8 and
field_length % 8.
Solved the problem by re-computing the values for the
metadata based on the field_length instead of using the
bytes_in_rec and bit_len variables.
To handle compatibility with old server, a table map
flag was added to indicate that the bit computation is
exact. If the flag is clear, the slave computes the
number of bytes required to store the bit field and
compares that instead, effectively allowing replication
*without conversion* from any field length that require
the same number of bytes to store.
In BUG#49562 we fixed the case where numeric user var events
would not serialize the flag stating whether the value was signed
or unsigned (unsigned_flag). This fixed the case that the slave
would get an overflow while treating the unsigned values as
signed.
In this bug, we find that the unsigned_flag can sometimes change
between the moment that the user value is recorded for binlogging
purposes and the actual binlogging time. Since we take the
unsigned_flag from the runtime variable data, at binlogging time,
and the variable value is comes from the copy taken earlier in
the execution, there may be inconsistency in the
User_var_log_event between the variable value and its
unsigned_flag.
We fix this by also copying the unsigned_flag of the
user_var_entry when its value is copied, for binlogging
purposes. Later, at binlogging time, we use the copied
unsigned_flag and not the one in the runtime user_var_entry
instance.
mode
When the master was executing in sql_mode='traditional' (which
implies that really_abort_on_warning returns TRUE - because of
MODE_STRICT_ALL_TABLES), the error code (ER_DUP_ENTRY in the
reported case) was not being set in the
Query_log_event. Therefore, even if a failure was to be expected
when replaying the statement on the slave, a failure would occur,
because the Query_log_event was not transporting the expected
error code, but 0 instead.
This was because when the master was getting the error code to
set it in the Query_log_event, the executing thread would be
assumed to have been killed:
THD::killed==THD::KILL_BAD_DATA. This would make the error code
fetch routine not to check thd->main_da.sql_errno(), but instead
the thd->killed value. What's more, is that the server would
thd->killed value if thd->killed == THD::KILL_BAD_DATA and return
0 instead. So this is a double inconsistency, as the we should
not even check thd->killed but rather thd->main_da.sql_errno().
We fix this by extending the condition used to choose whether to
check the thd->main_da.sql_errno() or thd->killed, so that it
takes into consideration the case when:
thd->killed==THD::KILL_BAD_DATA.