Row-based replication requires the types of columns on the
master and slave to be approximately the same (some safe
conversions between strings are allowed), but does not
allow safe conversions between fields of similar types such
as TINYINT and INT.
This patch implement type conversions between similar fields
on the master and slave.
The conversions are controlled using a new variable
SLAVE_TYPE_CONVERSIONS of type SET('ALL_LOSSY','ALL_NON_LOSSY').
Non-lossy conversions are any conversions that do not run the
risk of losing any information, while lossy conversions can
potentially truncate the value. The column definitions are
checked to decide if the conversion is acceptable.
If neither conversion is enabled, it is required that the
definitions of the columns are identical on master and slave.
Conversion is done by creating an internal conversion table,
unpacking the master data into it, and then copy the data to
the real table on the slave.
Temporary tables may set join->group to 0 even though there is
grouping. Also need to test if sum_func_count>0 when JOIN::exec()
decides whether to present results in a grouped manner.
columns without where/group
Simple SELECT with implicit grouping used to return many rows if
the query was ordered by the aggregated column in the SELECT
list. This was incorrect because queries with implicit grouping
should only return a single record.
The problem was that when JOIN:exec() decided if execution needed
to handle grouping, it was assumed that sum_func_count==0 meant
that there were no aggregate functions in the query. This
assumption was not correct in JOIN::exec() because the aggregate
functions might have been optimized away during JOIN::optimize().
The reason why queries without ordering behaved correctly was
that sum_func_count is only recalculated if the optimizer chooses
to use temporary tables (which it does in the ordered case).
Hence, non-ordered queries were correctly treated as grouped.
The fix for this bug was to remove the assumption that
sum_func_count==0 means that there is no need for grouping. This
was done by introducing variable "bool implicit_grouping" in the
JOIN object.
The BINLOG statement was sharing too much code with the slave SQL thread, introduced with
the patch for Bug#32407. This caused statements to be logged with the wrong server_id, the
id stored inside the events of the BINLOG statement rather than the id of the running
server.
Fix by rearranging code a bit so that only relevant parts of the code are executed by
the BINLOG statement, and the server_id of the server executing the statements will
not be overrided by the server_id stored in the 'format description BINLOG statement'.
The problem was in incorrect handling of predicates involving
NULL as a constant value by the range optimizer.
For example, when creating a SEL_ARG node from a condition of
the form "field < const" (which would normally result in the
"NULL < field < const" SEL_ARG), the special case when "const"
is NULL was not taken into account, so "NULL < field < NULL"
was produced for the "field < NULL" condition.
As a result, SEL_ARG structures of this form could not be
further optimized which in turn could lead to incorrectly
constructed SEL_ARG trees. In particular, code assuming SEL_ARG
structures to always form a sequence of ordered disjoint
intervals could enter an infinite loop under some
circumstances.
Fixed by changing get_mm_leaf() so that for any sargable
predicate except "<=>" involving NULL as a constant, "empty"
SEL_ARG is returned, since such a predicate is always false.
covering index
When two range predicates were combined under an OR
predicate, the algorithm tried to merge overlapping ranges
into one. But the case when a range overlapped several other
ranges was not handled. This lead to
1) ranges overlapping, which gave repeated results and
2) a range that overlapped several other ranges was cut off.
Fixed by
1) Making sure that a range got an upper bound equal to the
next range with a greater minimum.
2) Removing a continue statement
backport for bug#44059 from mysql-pe to mysql-5.1-bugteam
Using the partition with most rows instead of first partition
to estimate the cardinality of indexes.
is reached
Problem was bad error handling, leaving some new temporary
partitions locked and initialized and some not yet initialized
and locked, leading to a crash when trying to unlock the not
yet initialized and locked partitions
Solution was to unlock the already locked partitions, and not
include any of the new temporary partitions in later unlocks
can lead to bad memory access
Problem: Field_bit is the only field which returns INT_RESULT
and doesn't have unsigned flag. As it's not a descendant of the
Field_num, so using ((Field_num *) field_bit)->unsigned_flag may lead
to unpredictable results.
Fix: check the field type before casting.
buffering is used
FORCE INDEX FOR ORDER BY now prevents the optimizer from
using join buffering. As a result the optimizer can use
indexed access on the first table and doesn't need to
sort the complete resultset at the end of the statement.
Let
- T be a transactional table and N non-transactional table.
- B be begin, C commit and R rollback.
- N be a statement that accesses and changes only N-tables.
- T be a statement that accesses and changes only T-tables.
In RBR, changes to N-tables that happen early in a transaction are not immediately flushed
upon committing a statement. This behavior may, however, break consistency in the presence
of concurrency since changes done to N-tables become immediately visible to other
connections. To fix this problem, we do the following:
. B N N T C would log - B N C B N C B T C.
. B N N T R would log - B N C B N C B T R.
Note that we are not preserving history from the master as we are introducing a commit that
never happened. However, this seems to be more acceptable than the possibility of breaking
consistency in the presence of concurrency.
Let
- T be a transactional table and N non-transactional table.
- B be begin, C commit and R rollback.
- M be a mixed statement, i.e. a statement that updates both T and N.
- M* be a mixed statement that fails while updating either T or N.
This patch restore the behavior presented in 5.1.37 for rows either produced in
the RBR or MIXED modes, when a M* statement that happened early in a transaction
had their changes written to the binary log outside the boundaries of the
transaction and wrapped in a BEGIN/ROLLBACK. This was done to keep the slave
consistent with with the master as the rollback would keep the changes on N and
undo them on T. In particular, we do what follows:
. B M* T C would log - B M* R B T C.
Note that, we are not preserving history from the master as we are introducing a
rollback that never happened. However, this seems to be more acceptable than
making the slave diverge. We do not fix the following case:
. B T M* C would log B T M* C.
The slave will diverge as the changes on T tables that originated from the M
statement are rolled back on the master but not on the slave. Unfortunately, we
cannot simply rollback the transaction as this would undo any uncommitted
changes on T tables.
SBR is not considered in this patch because a failing statement is written to
the binary along with the error code and a slave executes and then rolls back
the statement when it has an associated error code, thus undoing the effects
on T. In RBR and MBR, a full-fledged fix will be pushed after the WL 2687.
SELECT ... WHERE ... IN (NULL, ...) does full table scan,
even if the same query without the NULL uses efficient range scan.
The bugfix for the bug 18360 introduced an optimization:
if
1) all right-hand arguments of the IN function are constants
2) result types of all right argument items are compatible
enough to use the same single comparison function to
compare all of them to the left argument,
then
we can convert the right-hand list of constant items to an array
of equally-typed constant values for the further
QUICK index access etc. (see Item_func_in::fix_length_and_dec()).
The Item_null constant item objects have STRING_RESULT
result types, so, as far as Item_func_in::fix_length_and_dec()
is aware of NULLs in the right list, this improvement efficiently
optimizes IN function calls with a mixed right list of NULLs and
string constants. However, the optimization doesn't affect mixed
lists of NULLs and integers, floats etc., because there is no
unique common comparator.
New optimization has been added to ignore the result type
of NULL constants in the static analysis of mixed right-hand lists.
This is safe, because at the execution phase we care about
presence of NULLs anyway.
1. The collect_cmp_types() function has been modified to optionally
ignore NULL constants in the item list.
2. NULL-skipping code of the Item_func_in::fix_length_and_dec()
function has been modified to work not only with in_string
vectors but with in_vectors of other types.
The problem is that there is only one autoinc value associated with
the query when binlogging. If more than one autoinc values are used
in the query, the autoinc values after the first one can be inserted
wrongly on slave. So these autoinc values can become inconsistent on
master and slave.
The problem is resolved by marking all the statements that invoke
a trigger or call a function that updated autoinc fields as unsafe,
and will switch to row-format in Mixed mode. Actually, the statement
is safe if just one autoinc value is used in sub-statement, but it's
impossible to check how many autoinc values are used in sub-statement.)
On Mac OS X or Windows, sending a SIGHUP to the server or a
asynchronous flush (triggered by flush_time), would cause the
server to crash.
The problem was that a hook used to detach client API handles
wasn't prepared to handle cases where the thread does not have
a associated session.
The solution is to verify whether the thread has a associated
session before trying to detach a handle.
'flush tables' crashes
The server crashes when 'show procedure status' and 'flush tables' are
run concurrently.
This is caused by the way mysql.proc table is added twice to the list
of table to lock although the requirements on the current locking API
assumes differently.
No test case is submitted because of the nature of the crash which is
currently difficult to reproduce in a deterministic way.
This is a backport from 5.1
Backport from 6.0 to 5.1.
Only those sync points are included, which are used in debug_sync.test.
The Debug Sync Facility allows to place synchronization points
in the code:
open_tables(...)
DEBUG_SYNC(thd, "after_open_tables");
lock_tables(...)
When activated, a sync point can
- Send a signal and/or
- Wait for a signal
Nomenclature:
- signal: A value of a global variable that persists
until overwritten by a new signal. The global
variable can also be seen as a "signal post"
or "flag mast". Then the signal is what is
attached to the "signal post" or "flag mast".
- send a signal: Assign the value (the signal) to the global
variable ("set a flag") and broadcast a
global condition to wake those waiting for
a signal.
- wait for a signal: Loop over waiting for the global condition until
the global value matches the wait-for signal.
Please find more information in the top comment in debug_sync.cc
or in the worklog entry.
replication
MySQL server uses wrong lock type (always TL_READ instead of
TL_READ_NO_INSERT when appropriate) for tables used in
subqueries of UPDATE statement. This leads in some cases to
a broken replication as statements are written in the wrong
order to the binlog.
The problem was that appending values to the end of an existing
ENUM or SET column was being treated as table data modification,
preventing a immediately (fast) table alteration that occurs when
only table metadata is being modified.
The cause was twofold: adding a enumeration or set members to the
end of the list of valid member values was not being considered
a "compatible" table alteration, and for SET columns, the check
was being done upon the max display length and not the underlying
(pack) length of the field.
The solution is to augment the function that checks wether two ENUM
or SET fields are compatible -- by comparing the pack lengths and
performing a limited comparison of the member values.
The bug is not related to MERGE table or TRIGGER. More correct description
would be 'assertion on multi-table UPDATE + NATURAL JOIN + MERGEABLE VIEW'.
On PREPARE stage(see test case) we call mark_common_columns() func which
creates ON condition for NATURAL JOIN and sets appropriate
table read_set bitmaps for fields which are used in ON condition.
On EXECUTE stage mark_common_columns() is not called, we set
necessary read_set bitmaps in setup_conds(). But 'B.f1' field
is already processed and related item alredy fixed before
setup_conds() as updated field and setup_conds can not set
read_set bitmap because of that.
The fix is to set read_set bitmap for appropriate table field even
if Item_direct_view_ref item which represents a refernce to this field
is fixed.
"load data" statements were written to the binlog as a mix of the original statement
and bits recreated from parse-info. This relied on implementation details and broke
with IGNORE_SPACES and versioned comments.
We now completely resynthesize the query for LOAD DATA for binlog (which among other
things normalizes them somewhat with regard to case, spaces, etc.).
We have already parsed the query properly, so we make use of that rather
than mix-and-match string literals and parsed items.
This should make us safe with regard to versioned comments, even those
spanning multiple tokens. Also no longer affected by IGNORE_SPACES.
view definition
During SHOW CREATE VIEW there is no reason to 'anonymize'
errors that name objects that a user does not have access
to. Moreover it was inconsistently implemented. For example
base tables being referenced from a view appear to be ok,
but not views. The manual on the other hand is clear: If a
user has the privileges SELECT and SHOW VIEW, the view
definition is available to that user, period. The fix
changes the behavior to support the manual.
trigger, merge table
The problem with break statements is that they have very
local effects. Hence a break statement within the inner loop
of a nested-loops join caused execution to proceed to the
next table even though a serious error occurred. The problem
was fixed by breaking out the inner loop into its own
method. The change empowers all errors to terminate the
execution.
The errors that will now halt multi-DELETE execution
altogether are
- triggers returning errors
- handler errors
- server being killed