When using unique keys with nullable parts in RBR, the slave can
choose the wrong row to update. This happens because a table with
a unique key containing nullable parts cannot strictly guarantee
uniqueness: as stated in the manual, for all engines, a UNIQUE
index allows multiple NULL values for columns that can contain
NULL.
We fix this at the slave by extending the checks done before
assuming that the row found through a unique index is the correct
one. When a record (R) is fetched from the storage engine through
a key (K) that is not the primary key, the server now does the
following (sketched in code after the list):
- If K is unique and has no nullable parts, return R;
- Otherwise, if any field of the before image (BI) that is part of
K is NULL, do an index scan;
- If no field in the BI part of K is NULL, return R.
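For illustration, a minimal standalone sketch of that decision
logic (the structures below are simplified stand-ins, not the
server's actual KEY/Field classes):

  #include <cstddef>

  // Simplified, hypothetical stand-ins for the server's key descriptors.
  struct KeyPart {
    bool maybe_null;            // column of this key part allows NULL
    bool null_in_before_image;  // value is NULL in the row event's before image
  };

  struct Key {
    bool is_unique;
    const KeyPart *parts;
    size_t part_count;
  };

  enum RowLookupAction { TRUST_FETCHED_ROW, DO_INDEX_SCAN };

  // Decide whether the row fetched through the non-primary key K is trustworthy.
  RowLookupAction check_fetched_row(const Key &k) {
    bool has_nullable_part = false;
    bool null_in_bi = false;
    for (size_t i = 0; i < k.part_count; i++) {
      if (k.parts[i].maybe_null) has_nullable_part = true;
      if (k.parts[i].null_in_before_image) null_in_bi = true;
    }
    if (k.is_unique && !has_nullable_part)
      return TRUST_FETCHED_ROW;   // strict uniqueness: the fetched row is correct
    if (null_in_bi)
      return DO_INDEX_SCAN;       // NULLs defeat uniqueness: fall back to a scan
    return TRUST_FETCHED_ROW;     // no NULL in the BI part of K: trust the row
  }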
A side change: renamed the existing test case file and added a
test case covering the changes in this patch.
This patch fixes two problems described as follows:
1 - If there is an ongoing transaction and a temporary table is created or
dropped, any failed statement that follows the CREATE or DROP command
triggers a rollback, and as a consequence the slave goes out of sync
because the binary log ends up with a wrong sequence of events.
To fix the problem, we changed the expression that evaluates when the
cache should be flushed after the rollback of either a statement or a
transaction.
2 - When a "CREATE TEMPORARY TABLE SELECT * FROM" was executed the
OPTION_KEEP_LOG was not set into the thd->options. For that reason, if
the transaction had updated only transactional engines and was rolled
back at the end (.e.g due to a deadlock) the changes were not written
to the binary log, including the creation of the temporary table.
To fix the problem, we have set the OPTION_KEEP_LOG into the thd->options
when a "CREATE TEMPORARY TABLE SELECT * FROM" is executed.
sql/log.cc:
Reorganized the code based on the following functions:
- bool ending_trans(const THD* thd, const bool all);
- bool trans_has_updated_non_trans_table(const THD* thd);
- bool trans_has_no_stmt_committed(const THD* thd, const bool all);
- bool stmt_has_updated_non_trans_table(const THD* thd);
sql/log.h:
Added functions to organize the code in log.cc.
sql/log_event.cc:
Removed the OPTION_KEEP_LOG since it must be used only when
creating and dropping temporary tables.
sql/log_event_old.cc:
Removed the OPTION_KEEP_LOG since it must be used only when
creating and dropping temporary tables.
sql/sql_parse.cc:
When a "CREATE TEMPORARY TABLE SELECT * FROM" was executed the
OPTION_KEEP_LOG was not set into the thd->options.
To fix the problem, we have set the OPTION_KEEP_LOG into the
thd->options when a "CREATE TEMPORARY TABLE SELECT * FROM"
is executed.
When using a non-transactional table (t1) on the master with
autocommit disabled, no COMMIT ending the statement is recorded
in the binary log. Therefore, if the slave has t1 in a
transactional engine, it is as if a transaction is started but
never ended. This is actually BUG#29288 all over again.
We fix this by cherrypicking the cset for BUG#29288 which
was pushed to a later mysql version. The revision picked
was: mats@sun.com-20090923094343-bnheplq8n95opjay .
Additionally, a test case for covering the scenario depicted
in the bug report is included in this cset.
In RBR, sometimes table->s->last_null_bit_pos can be zero. This
has an impact on the slave when it compares records fetched from the
storage engine against records in the binary log event. If
last_null_bit_pos is zero, the slave, while comparing in the
log_event.cc:record_compare function, would set all bits in the last
null byte to 1 (assuming all 8 were unused). It would then lose the
ability to distinguish records that were similar in contents except
for the fact that some field was null in one record, but not in the
other. Ultimately this would cause wrong matches, and in the specific
case depicted in the bug report the same record would be updated
twice, resulting in a lost update.
Additionally, in the record_compare function the slave was setting the
X bit unconditionally. There are cases in which the X bit does not
exist in the record header. This could also lead to wrong matches
between records.
We fix both by conditionally setting the bits: (i) the unused null
bits are set only if last_null_bit_pos > 0; (ii) the X bit is set only
if HA_OPTION_PACK_RECORD is in use.
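A standalone sketch of that conditional bit handling (simplified;
the members below stand in for TABLE_SHARE fields, and the X-bit
condition is expressed through a precomputed flag):

  #include <cstdint>

  struct TableShareLike {
    uint32_t last_null_bit_pos;  // first unused bit position in the last null byte
    uint32_t null_bytes;         // number of null bytes in the record
    bool     has_x_bit;          // whether the record header carries an X bit
                                 // (tied to HA_OPTION_PACK_RECORD)
  };

  // Force the "don't care" bits of a record to known values before comparing,
  // but only when those bits actually exist.
  void normalize_record_bits(const TableShareLike &s, unsigned char *record) {
    if (s.has_x_bit)
      record[0] |= 0x01;                       // X bit present: set it

    if (s.last_null_bit_pos > 0)               // 0 means all 8 bits are in use
      record[s.null_bytes - 1] |=
          (unsigned char)(256U - (1U << s.last_null_bit_pos));
  }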
mysql-test/extra/rpl_tests/rpl_record_compare.test:
Shared part of the test case for MyISAM and InnoDB.
mysql-test/suite/rpl/t/rpl_row_rec_comp_innodb.test:
InnoDB test case.
mysql-test/suite/rpl/t/rpl_row_rec_comp_myisam.test:
MyISAM test case. Added also coverage for Field_bits case.
sql/log_event.cc:
Deployed conditional setting of unused bits at record_compare.
sql/log_event_old.cc:
Same change as in log_event.cc.
When mysqlbinlog was given the --database=X flag, it always printed
'ROLLBACK TO', but the corresponding 'SAVEPOINT' statement was not
printed. The replication filters (replicate-do/ignore-db) and binlog
filters (binlog-do/ignore-db) had the same problem. They are solved
together in this patch.
After this patch, we always check whether the query is a 'SAVEPOINT'
statement or not. Because this is a literal check, 'SAVEPOINT' and
'ROLLBACK TO' statements are also binlogged in uppercase and without
any leading comments.
Binary logs written before this patch can still be handled correctly,
except in the one case where comments appear in front of the keywords,
for example:
/* bla bla */ SAVEPOINT a;
/* bla bla */ ROLLBACK TO a;
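An illustrative version of the literal check (not the server's exact
helper):

  #include <cstring>

  // Because the check is purely textual, SAVEPOINT and ROLLBACK TO statements
  // are written to the binary log in uppercase and without leading comments.
  static bool is_savepoint_or_rollback_to(const char *query) {
    return std::strncmp(query, "SAVEPOINT", 9) == 0 ||
           std::strncmp(query, "ROLLBACK TO", 11) == 0;
  }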
The class Field_bit_as_char stores the metadata for the
field incorrectly because bytes_in_rec and bit_len are set
to (field_length + 7) / 8 and 0 respectively, while
Field_bit has the correct values field_length / 8 and
field_length % 8.
Solved the problem by re-computing the values for the
metadata based on field_length instead of using the
bytes_in_rec and bit_len variables.
To handle compatibility with old servers, a table map
flag was added to indicate that the bit computation is
exact. If the flag is clear, the slave computes the
number of bytes required to store the bit field and
compares that instead, effectively allowing replication
*without conversion* from any field length that requires
the same number of bytes to store.
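A sketch of the corrected computation and of the byte-count fallback
used when the table map flag is clear (simplified, not the exact
server code):

  #include <cstdint>

  struct BitMetadata { uint8_t bytes_in_rec; uint8_t bit_len; };

  // Recompute the metadata from field_length (the bit width of the column)
  // instead of trusting the bytes_in_rec/bit_len stored by Field_bit_as_char.
  BitMetadata bit_metadata_from_length(uint32_t field_length) {
    BitMetadata m;
    m.bytes_in_rec = (uint8_t)(field_length / 8);
    m.bit_len      = (uint8_t)(field_length % 8);
    return m;
  }

  // Compatibility path: without the "exact" table map flag, compare only the
  // number of bytes needed to store the field on each side.
  bool bit_fields_compatible(uint32_t master_bits, uint32_t slave_bits,
                             bool metadata_is_exact) {
    if (metadata_is_exact)
      return master_bits == slave_bits;
    return (master_bits + 7) / 8 == (slave_bits + 7) / 8;
  }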
mysql-test/suite/rpl/t/rpl_typeconv_innodb.test:
Adding test to check compatibility for bit field
replication when using InnoDB
sql/field.cc:
Extending compatible_field_size() with flags from
table map to allow fields to check master info.
sql/field.h:
Extending compatible_field_size() with flags from
table map to allow fields to check master info.
sql/log.cc:
Removing table map flags since they are not used
outside table map class.
sql/log_event.cc:
Removing flags parameter from table map constructor
since it is not used and does not have to be exposed.
sql/log_event.h:
Adding flag to denote that the bit length for the bit field type
is exact and not potentially rounded up to whole bytes.
sql/rpl_utility.cc:
Adding fields to table_def to store table map flags.
sql/rpl_utility.h:
Removing obsolete comment and adding flags to store
table map flags from master.
I found three issues during the analysis:
1. Memory leak caused by temp_buf not being freed;
2. Memory leak caused when handling argv;
3. Conditional jump that depended on uninitialized values.
Issue #1
--------
DESCRIPTION: when mysqlbinlog is reading from a remote location,
the event temp_buf references the incoming stream (in the NET
object), which is not freed by mysqlbinlog explicitly. On the
other hand, when it is reading a local binary log, temp_buf points
to a temporary buffer that needs to be explicitly freed. In both
cases temp_buf was not freed by mysqlbinlog; it was simply set to
0. This disregards the free required in the second case, thus
creating a memory leak.
FIX: we make temp_buf be conditionally freed, depending on the
value of remote_opt. A similar fix is already present in the most
recent codebases.
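A minimal sketch of the idea (simplified types; the real code
operates on the Log_event class and uses my_free from mysys):

  #include <cstdlib>

  struct EventLike { char *temp_buf; };

  void release_event_buffer(EventLike *ev, bool remote_opt) {
    if (!remote_opt && ev->temp_buf)
      std::free(ev->temp_buf);   // local binlog: mysqlbinlog owns the buffer
    ev->temp_buf = 0;            // remote: the buffer belongs to the NET object
  }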
Issue #2
--------
DESCRIPTION: load_defaults is called by parse_args; it reads
default options from configuration files and puts them BEFORE the
arguments that are already in argc and argv. This is done using a
MEM_ROOT. However, parse_args calls handle_options immediately
afterwards, which changes argv. Later, when freeing the defaults,
the pointer to the MEM_ROOT no longer matches, so the memory is
not freed:
void free_defaults(char **argv)
{
  MEM_ROOT ptr;
  memcpy_fixed((char*) &ptr, (char *) argv - sizeof(ptr), sizeof(ptr));
  free_root(&ptr, MYF(0));
}
FIX: we remove load_defaults from parse_args and call it
beforehand. Then we save the argv that carries the defaults in
defaults_argv BEFORE calling parse_args (which can then call
handle_options internally at will). This turned out to be
essentially a backport of BUG#38468 into 5.1, so I merged in the
test case as well and added an error check for the load_defaults
call.
Fix based on:
revid:zhenxing.he@sun.com-20091002081840-uv26f0flw4uvo33y
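Roughly, the reordering looks like this (a sketch only: the option
table, includes, and error details are omitted, and parse_args
stands for mysqlbinlog's own argument parser):

  static char **defaults_argv;

  static int setup_options(int *argc, char ***argv)
  {
    /* Read defaults from config files first; this replaces *argv with a
       vector allocated on a MEM_ROOT. */
    if (load_defaults("my", load_default_groups, argc, argv))
      return 1;

    /* Remember the pointer load_defaults handed back BEFORE handle_options
       (called inside parse_args) gets a chance to advance *argv. */
    defaults_argv= *argv;

    return parse_args(argc, argv);
  }

  static void cleanup_options(void)
  {
    if (defaults_argv)
      free_defaults(defaults_argv);   /* must receive the saved pointer */
  }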
Issue #3
--------
DESCRIPTION: the st_print_event_info constructor did not
initialize the sql_mode member, although it did initialize
sql_mode_inited (setting it to false). This later triggered a
valgrind warning when printing sql_mode in the event header, since
that printout is guarded by a check against the sql_mode_inited
and sql_mode variables, and sql_mode itself was uninitialized.
FIX: we add initialization of sql_mode to the
st_print_event_info constructor.
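The fix is a one-liner in the constructor's initializer list;
roughly (an abbreviated sketch, not the full initializer list):

  st_print_event_info::st_print_event_info()
    : /* ... */ sql_mode_inited(0), sql_mode(0) /* , ... */
  {
    /* ... */
  }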
client/mysqlbinlog.cc:
- Conditionally free ev->temp_buf.
- Save defaults_argv before handle_options is called.
mysql-test/t/mysqlbinlog.test:
Added test case from BUG#38468.
sql/log_event.cc:
Added initialization of sql_mode for st_print_event_info.
Compiling MySQL with gcc 4.3.2: this is the final patch in the
context of this bug.
cmd-line-utils/readline/rlmbutil.h:
Changed in a previous patch, reverted by a backport.
cmd-line-utils/readline/text.c:
Static var initialization.
extra/yassl/include/yassl_error.hpp:
SetErrorString handles errors outside of the YasslError
enum.
extra/yassl/src/ssl.cpp:
SetErrorString handles errors outside of the YasslError
enum.
extra/yassl/src/yassl_error.cpp:
SetErrorString handles errors outside of the YasslError
enum.
While processing a statement, down the mysql_parse execution
stack, thd->enable_slow_log can be assigned
opt_log_slow_admin_statements, depending on whether an
administrative statement, such as ALTER TABLE, OPTIMIZE,
ANALYZE, etc., is being executed or not. This can have an impact
on slow logging for statements that are executed after an
administrative statement has completed.
When executing statements directly from the user this is fine,
because thd->enable_slow_log is reset right at the beginning
of the dispatch_command function, i.e., every time a new statement
is set to execute.
On the other hand, for the slave SQL thread (sql_thd) the story
is a bit different. In SBR the sql_thd applies statements by
calling mysql_parse. Right after, it calls the log_slow_statement
function to log them if they take too long. Calling mysql_parse
directly is fine, but it also means that the dispatch_command
function is bypassed. As a consequence, thd->enable_slow_log does
not get a chance to be reset before the next statement to be
executed by the sql_thd. If the statement just executed by the
sql_thd was an administrative statement and logging of admin
statements was disabled, sql_thd->enable_slow_log remains set to
0 (disabled) from that moment on. End result: the sql_thd stops
logging slow statements.
We fix this by resetting the value of sql_thd->enable_slow_log to
the value of opt_log_slow_slave_statements right after
log_slow_statement is called by the sql_thd.
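The shape of the fix in the SQL thread's statement-application
path is roughly as follows (the call sequence is simplified;
opt_log_slow_slave_statements is the existing server option):

  mysql_parse(thd, query, query_length, &found_semicolon);
  log_slow_statement(thd);

  /*
    An admin statement may have left thd->enable_slow_log set to
    opt_log_slow_admin_statements; restore the slave policy so the next
    statement applied by the SQL thread follows --log-slow-slave-statements
    again.
  */
  thd->enable_slow_log= opt_log_slow_slave_statements;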
Rename method so as not to hide a base class method.
Reorder attributes initialization.
Remove unused variable.
Rework code to silence a warning due to assignment used as truth value.
sql/item_strfunc.cc:
Rename method so as not to hide a base class method.
sql/item_strfunc.h:
Rename method so as not to hide a base class method.
sql/log_event.cc:
Reorder attributes initialization.
sql/rpl_injector.cc:
Rework code to silence a warning due to assignment used as truth value.
sql/rpl_record.cc:
Remove unused variable.
sql/sql_db.cc:
Rework code to silence a warning due to assignment used as truth value.
sql/sql_parse.cc:
Rework code to silence a warning due to assignment used as truth value.
sql/sql_table.cc:
Rework code to silence a warning due to assignment used as truth value.
When replicating from a 4.1 master to a 5.0 slave, START SLAVE UNTIL can
stop too late.
The event length needed to calculate the beginning of an event did not
correspond to the master's genuine information at the event's execution
time: that piece of info was changed when the event was relay-logged,
due to the binlog_version<4 event conversion done by the IO thread.
Fixed by storing the master's genuine Query_log_event size in a new
status variable when the event is relay-logged. The stored info is
extracted at event execution time and used to calculate the correct
start position of the event in the until-pos stopping routine.
The new status variable's algorithm is only active when the event comes
from a master of version < 5.0 (binlog_version < 4).
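The essence of the calculation, as a hedged sketch (the names are
illustrative, not the exact routine):

  typedef unsigned long long ulonglong;

  // The start position of an event in the master's binary log is its end
  // position minus the size the event had ON THE MASTER. After the IO
  // thread's binlog_version<4 conversion, the relay-log copy has a different
  // length, so the genuine size saved in the new status variable is used.
  ulonglong master_event_start_pos(ulonglong master_end_log_pos,
                                   ulonglong original_event_size) {
    return master_end_log_pos - original_event_size;
  }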
mysql-test/r/rpl_until.result:
results changed.
mysql-test/std_data/bug47142_master-bin.000001:
A binary log produced by a 4.1 master is added, to be used in place of
one produced by the running 5.x master, as part of the Bug #47142
regression test.
mysql-test/t/rpl_until.test:
Regression test for Bug #47142 is added.
sql/log_event.cc:
Storing and extracting the master's genuine event size into/from the
status-var section of the event packet header.
A binlog_version<4 query log event is:
a. converted into the modern binlog_version==4 format, storing the
   original size of the event in the new status var; the converted
   representation goes into the relay log;
b. read back later, with the stored size used in the start-position
   calculation.
The new status var is active only for events that the IO thread
instantiates for the sake of the conversion.
sql/log_event.h:
Incrementing MAX_SIZE_LOG_EVENT_STATUS because of the new status var;
defining the new status variable to hold the master's genuine event size;
augmenting Query_log_event with a new member holding the value to store
into / extract from the status var of the event packet header.
- Marked a couple of tests with --big
- Fixed xtradb/handler/ha_innodb.cc to call explain_filename()
storage/xtradb/handler/ha_innodb.cc:
Call explain_filename() to get proper names for partitioned tables
In statement-based or mixed-mode replication, using DROP TEMPORARY TABLE
to drop multiple tables causes different errors on master and slave
when one or more of these tables do not exist, because when the
statement is executed on the slave, IF EXISTS is automatically added to
the query to ignore all ER_BAD_TABLE_ERROR errors.
To fix the problem, do not add IF EXISTS when executing DROP TEMPORARY
TABLE on the slave, and clear the ER_BAD_TABLE_ERROR error after
execution if the query does not expect any errors.
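A sketch of the slave-side reconciliation (simplified; the real check
sits in the query-event application code):

  /*
    After applying DROP TEMPORARY TABLE on the slave: if some table did not
    exist (ER_BAD_TABLE_ERROR) but the master recorded no error for the
    statement, clear the error so master and slave agree.
  */
  if (actual_error == ER_BAD_TABLE_ERROR && expected_error == 0)
    thd->clear_error();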
mysql-test/r/rpl_drop_temp.result:
Updated for the patch of bug#49137.
mysql-test/t/rpl_drop_temp.test:
Added the test file to verify whether DROP TEMPORARY TABLE on
multiple tables causes different errors on master and slave when
one or more of these tables do not exist.
sql/log_event.cc:
Added code to handle the above cases, which used to be handled
in sql_parse.cc.
sql/sql_parse.cc:
Removed the code that issues the 'Unknown table' error
if the temporary table does not exist when dropping
it on the slave. The cases described in the comments above
are now handled later in log_event.cc.
BUG#49481: RBR: MyISAM and bit fields may cause slave to stop on delete:
can't find record
BUG#49482: RBR: Replication may break on deletes when MyISAM tables +
char field are used
When using MyISAM tables, even though the null bit is set for some
fields, their old value is still in the row. This can cause the
comparison of records to fail when the slave is doing an index or
range scan.
We fix this by avoiding the memcmp for MyISAM tables when comparing
records. Additionally, when comparing field by field, we first check
that both fields are not null, and only then compare them. If just one
field is null, we return failure immediately. If both fields are null,
we move on to the next field.
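A standalone sketch of that field-by-field rule (FieldView is a
simplified stand-in for the server's Field objects):

  #include <cstddef>

  struct FieldView {
    bool is_null;
    long value;        // stand-in for the field's unpacked value
  };

  enum CompareResult { ROWS_MATCH, ROWS_DIFFER };

  CompareResult compare_records(const FieldView *a, const FieldView *b,
                                size_t field_count) {
    for (size_t i = 0; i < field_count; i++) {
      if (a[i].is_null && b[i].is_null)
        continue;                       // both NULL: field matches, move on
      if (a[i].is_null != b[i].is_null)
        return ROWS_DIFFER;             // exactly one NULL: fail immediately
      if (a[i].value != b[i].value)
        return ROWS_DIFFER;             // both set: compare the actual values
    }
    return ROWS_MATCH;
  }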
For tables with metadata sizes ranging from 251 to 255, the size
of the event data (m_data_size) was being improperly calculated
in the Table_map_log_event constructor. This was due to the fact
that, when writing the Table_map_log_event body (in
Table_map_log_event::write_data_body), a call to net_store_length
is made to pack m_field_metadata_size. net_store_length uses *one*
byte for storing m_field_metadata_size when it is smaller than 251,
but *three* bytes when it exceeds that value. BUG#42749 had already
pinpointed and fixed this fact, but the fix was incomplete, as the
calculation in the Table_map_log_event constructor considered 255
instead of 251 as the threshold for incrementing m_data_size by
three. Thus, the window for a mismatch between the number of bytes
written and the number of bytes accounted for in the event length
(m_data_size) was left open for m_field_metadata_size values
between 251 and 255.
We fix this by changing the condition in the Table_map_log_event
constructor to match the one in net_store_length, i.e., increment
by one byte if m_field_metadata_size < 251 and by three if it
exceeds this value.
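The accounting the constructor must mirror, as a small sketch
(only the length ranges relevant here are shown):

  #include <cstddef>

  // net_store_length packs a length below 251 in one byte and a length in
  // the range 251..65535 in three bytes (a prefix byte plus two value bytes).
  size_t packed_length_bytes(unsigned long field_metadata_size) {
    return (field_metadata_size < 251) ? 1 : 3;
  }

  // In the Table_map_log_event constructor:
  //   m_data_size += packed_length_bytes(m_field_metadata_size);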
mysql-test/suite/rpl/r/rpl_row_tbl_metadata.result:
Updated result file.
mysql-test/suite/rpl/t/rpl_row_tbl_metadata.test:
Changes to the original test case: added slave and moved
file into the rpl suite.
New test case: replicates two tables, one with 250 and
another with 252 bytes of metadata. This exercises the use
of 1 or 3 bytes when packing m_field_metadata_size.
sql/log_event.cc:
Made the m_data_size calculation for the table map log event
match the number of bytes used when packing the
m_field_metadata_size value (according to the net_store_length
function in pack.c).
In statement-based or mixed-mode replication, using DROP TEMPORARY TABLE
to drop multiple tables causes different errors on master and slave
when one or more of these tables do not exist, because when the
statement is executed on the slave, IF EXISTS is automatically added to
the query to ignore all ER_BAD_TABLE_ERROR errors.
To fix the problem, do not add IF EXISTS when executing DROP TEMPORARY
TABLE on the slave, and clear the ER_BAD_TABLE_ERROR error after
execution if the query does not expect any errors.
mysql-test/suite/rpl/r/rpl_drop_temp.result:
Updated for the patch of bug#49137.
mysql-test/suite/rpl/t/rpl_drop_temp.test:
Added the test file to verify whether DROP TEMPORARY TABLE on
multiple tables causes different errors on master and slave when
one or more of these tables do not exist.
sql/log_event.cc:
Added code to handle the above cases, which used to be handled
in sql_parse.cc.
sql/sql_parse.cc:
Removed the code that issues the 'Unknown table' error
if the temporary table does not exist when dropping
it on the slave. The cases described in the comments above
are now handled later in log_event.cc.
A 'LOAD DATA CONCURRENT [LOCAL] INFILE ...' statement is binlogged only as
'LOAD DATA [LOCAL] INFILE ...' in SBR and MBR. As a result, if replication
is on, queries on slaves can be blocked by the replication SQL thread.
This patch writes 'CONCURRENT' into the log event if the 'CONCURRENT'
option is present in the original statement, in SBR and MBR.
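Illustrative only (std::string stands in for the server's String class
used when rebuilding the statement text for the binary log):

  #include <string>

  std::string load_data_head(bool is_concurrent, bool is_local) {
    std::string q = "LOAD DATA ";
    if (is_concurrent)
      q += "CONCURRENT ";   // previously dropped from the binlogged text
    if (is_local)
      q += "LOCAL ";
    q += "INFILE ";
    return q;
  }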
sql/log_event.cc:
gcc 4.4.1 assumes that variables that you cast away will not change
(strict aliasing). The symptom was that mysql-test-run
binlog.binlog_database got errors in the log.
mysql-test/mysql-test-run.pl:
Fix Valgrind warnings: add more post-shutdown warning suppressions, and revert
bad previous change.
sql/log_event.cc:
Manually apply fix for Bug#48340 (basically missing initialisation
of thd->lex->local_file in Load_log_event::do_apply_event())
Valgrind reports a conditional jump that depends on uninitialized
data while doing a LOAD DATA, and only for this test case. The
test case checks that loading data from a 4.0 or 4.1 instance
into a 5.1 instance works. As such, it handles an old binary log
with a different set of events than the current 5.1 codebase uses.
See the following reference for details:
http://forge.mysql.com/wiki/MySQL_Internals_Binary_Log#LOAD_DATA_INFILE_Events
Problem:
The server is handling an Execute_load_log_event, which results
in reading a Load_log_event from the binary log and applying
it. When applying the Load_log_event, some variable setup is
done and then mysql_load is called. Late in mysql_load
execution, if not in row-mode logging, the event is
binlogged via write_execute_load_query_log_event.
In write_execute_load_query_log_event, thd->lex->local_file is
inspected. The problem is that it has not been set earlier in the
execution stack. This causes valgrind to report the warning.
Fix:
We fix this by initializing thd->lex->local_file to be the same
as the value of Load_log_event::local_fname, when lex_start is
called inside Load_log_event::do_apply_event.
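Sketched, the fix is one assignment next to the lex_start call inside
Load_log_event::do_apply_event (an abbreviated excerpt, not the full
function):

  lex_start(thd);
  /* Give write_execute_load_query_log_event a defined value to inspect. */
  thd->lex->local_file= local_fname;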
- Moved some code from innodb_plugin to xtradb, to ensure that all tests run
- Did changes in pbxt and maria storage engines because of changes in thd->query
- Reverted wrong code in sql_table.cc for how ROW_FORMAT is used.
This is a re-commit of Monty's merge to eliminate an extra commit from
MySQL-5.1.42 that was accidentally included in the merge.
This is a merge of the MySQL 5.1.41 clone-off (clone-5.1.41-build). In
case there are any extra changes done before final MySQL 5.1.41
release, these will need to be merged later before MariaDB 5.1.41
release.
In function log_event.cc:Query_log_event::write, there was a cast that
was triggering undefined behavior. The offending cast is the
following:
write_str_with_code_and_len((char **)(&start),
catalog, catalog_len, Q_CATALOG_NZ_CODE);
This results in calling write_str_with_code_and_len with the first
argument declared as (char **) while "start" is itself a pointer to
uchar (uchar *). Inside write_str_with_..., the content of start is
then updated:
(*dst)+= len;
This instruction should advance the (*dst) pointer (i.e., the "start"
argument from the caller's point of view, which actually points to
uchar rather than char) by catalog_len. However, this seems to break
strict-aliasing rules, ultimately causing the increment and
assignment to behave unexpectedly.
We fix this by removing the cast and by making the types match.
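A sketch of the aligned types (the body is simplified; the point is that
the helper now takes uchar** so the caller can pass &start with no cast):

  #include <cstring>

  typedef unsigned char uchar;

  static void write_str_with_code_and_len(uchar **dst, const char *src,
                                          size_t len, unsigned int code)
  {
    *(*dst)++ = (uchar) code;    // one-byte status code
    *(*dst)++ = (uchar) len;     // one-byte length
    std::memcpy(*dst, src, len);
    (*dst) += len;               // the compiler can now track this update
  }

  /* Call site, without the offending cast:
     write_str_with_code_and_len(&start, catalog, catalog_len, Q_CATALOG_NZ_CODE);
  */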
- Review fixes
client/Makefile.am:
- Make it build on Linux
client/mysqlbinlog.cc:
- Coding style fixes
- Better/more comments
- Use client/sql_string.*, not server's sql/sql_string.*.
- Don't declare a dummy TABLE_LIST structure in the client.
client/sql_string.h:
- Use client/sql_string.*, not server's sql/sql_string.*.
sql/log_event.cc:
- Fix coding style
- Introduce Log_event::event_owns_temp_buf which tells whether Log_event::temp_buf is 'owned' by the Log_event object and should be my_free'd on return.
This is needed because rewrite_db() needs to dispose of the buffer, and
- when mysqlbinlog is reading directly from binlog file, the buffer
should be freed
- when mysqlbinlog is reading from a server, the buffer is a part of network
buffer and shouldn't be freed.
sql/log_event.h:
Introduce Log_event::event_owns_temp_buf which tells whether Log_event::temp_buf is 'owned' by the Log_event object and should be my_free'd on return.
This is needed because rewrite_db() needs to dispose of the buffer, and
- when mysqlbinlog is reading directly from binlog file, the buffer
should be freed
- when mysqlbinlog is reading from a server, the buffer is a part of network
buffer and shouldn't be freed.
sql/mysqld.cc:
- Better/more comments
sql/rpl_filter.cc:
- #ifdef-out Rpl_filter::tables_ok from the client. This makes it
unnecessary to define a dummy TABLE_LIST in the client.
sql/rpl_filter.h:
- #ifdef-out Rpl_filter::tables_ok from the client. This makes it
unnecessary to define a dummy TABLE_LIST in the client.
sql/sql_string.cc:
- Use client/sql_string.*, not server's sql/sql_string.*.
sql/sql_string.h:
- Use client/sql_string.*, not server's sql/sql_string.*.