General overview:
The logic for switching to row format when binlog_format=MIXED had
numerous flaws. The underlying problem was the lack of a consistent
architecture.
General purpose of this changeset:
This changeset introduces an architecture for switching to row format
when binlog_format=MIXED. It enforces the architecture where it has
to. It leaves some bugs to be fixed later. It adds extensive tests to
verify that unsafe statements work as expected and that appropriate
errors are produced by problems with the selection of binlog format.
It was not practical to split this into smaller pieces of work.
Problem 1:
To determine the logging mode, the code has to take several parameters
into account (namely: (1) the value of binlog_format; (2) the
capabilities of the engines; (3) the type of the current statement:
normal, unsafe, or row injection). These parameters may conflict in
several ways, namely:
- binlog_format=STATEMENT for a row injection
- binlog_format=STATEMENT for an unsafe statement
- binlog_format=STATEMENT for an engine only supporting row logging
- binlog_format=ROW for an engine only supporting statement logging
- statement is unsafe and engine does not support row logging
- row injection in a table that does not support statement logging
- statement modifies one table that does not support row logging and
one that does not support statement logging
Several of these conflicts were not detected, or were detected with
an inappropriate error message. The problem of BUG#39934 was that no
appropriate error message was written for the case when an engine
only supporting row logging executed a row injection with
binlog_format=ROW. However, all above cases must be handled.
Fix 1:
Introduce new error codes (sql/share/errmsg.txt). Ensure that all
conditions are detected and handled in decide_logging_format()
Problem 2:
The binlog format shall be determined once per statement, in
decide_logging_format(). It shall not be changed before or after that.
Before decide_logging_format() is called, all information necessary to
determine the logging format must be available. This principle ensures
that all unsafe statements are handled in a consistent way.
However, this principle is not followed:
thd->set_current_stmt_binlog_row_based_if_mixed() is called in several
places, including from code executing UPDATE..LIMIT,
INSERT..SELECT..LIMIT, DELETE..LIMIT, INSERT DELAYED, and
SET @@binlog_format. After Problem 1 was fixed, that caused
inconsistencies where these unsafe statements would not print the
appropriate warnings or errors for some of the conflicts.
Fix 2:
Remove calls to THD::set_current_stmt_binlog_row_based_if_mixed() from
code executed after decide_logging_format(). Compensate by calling the
set_current_stmt_unsafe() at parse time. This way, all unsafe statements
are detected by decide_logging_format().
Problem 3:
INSERT DELAYED is not unsafe: it is logged in statement format even if
binlog_format=MIXED, and no warning is printed even if
binlog_format=STATEMENT. This is BUG#45825.
Fix 3:
Made INSERT DELAYED set itself to unsafe at parse time. This allows
decide_logging_format() to detect that a warning should be printed or
the binlog_format changed.
Problem 4:
LIMIT clause were not marked as unsafe when executed inside stored
functions/triggers/views/prepared statements. This is
BUG#45785.
Fix 4:
Make statements containing the LIMIT clause marked as unsafe at
parse time, instead of at execution time. This allows propagating
unsafe-ness to the view.
Make the caller of Query_log_event, Execute_load_log_event
constructors and THD::binlog_query to provide the error code
instead of having the constructors to figure out the error code.
with seg fault
Multiple-table DELETE from a table joined to itself may cause
server crash. This was originally discovered with MEMORY engine,
but may affect other engines with different symptoms.
The problem was that the server violated SE API by performing
parallel table scan in one handler and removing records in
another (delete on the fly optimization).
on 5.0
The server crashes on an assert in net_end_statement indicating that the
Diagnostics area wasn't set properly during execution.
This happened on a multi table DELETE operation using the IGNORE keyword.
The keyword is suppose to allow for execution to continue on a best effort
despite some non-fatal errors. Instead execution stopped and no client
response was sent which would have led to a protocol error if it hadn't been
for the assert.
This patch corrects this issue by checking for the existence of an IGNORE
option before setting an error state during row-by-row delete iteration.
When the thread executing a DDL was killed after finished its
execution but before writing the binlog event, the error code in
the binlog event could be set wrongly to ER_SERVER_SHUTDOWN or
ER_QUERY_INTERRUPTED.
This patch fixed the problem by ignoring the kill status when
constructing the event for DDL statements.
This patch also included the following changes in order to
provide the test case.
1) modified mysqltest to support variable for connection command
2) modified mysql-test-run.pl, add new variable MYSQL_SLAVE to
run mysql client against the slave mysqld.
TRUNCATE TABLE fails to replicate when stmt-based binlogging is not supported.
There were two separate problems with the code, both of which are fixed with
this patch:
1. An error was printed by InnoDB for TRUNCATE TABLE in statement mode when
the in isolation levels READ COMMITTED and READ UNCOMMITTED since InnoDB
does permit statement-based replication for DML statements. However,
the TRUNCATE TABLE is not transactional, but is a DDL, and should therefore
be allowed to be replicated as a statement.
2. The statement was not logged in mixed mode because of the error above, but
the error was not reported to the client.
This patch fixes the problem by treating TRUNCATE TABLE a DDL, that is, it is
always logged as a statement and not reporting an error from InnoDB for TRUNCATE
TABLE.
When using CREATE TEMPORARY TABLE LIKE to create a temporary table,
or using TRUNCATE to delete all rows of a temporary table, they
did not set the tmp_table_used flag, and cause the omission of
"SET @@session.pseudo_thread_id" when dumping binlog with mysqlbinlog,
and cause error when replay the statements.
This patch fixed the problem by setting tmp_table_used in these two
cases. (Done by He Zhenxing 2009-01-12)
The special TRUNCATE TABLE (DDL) transaction wasn't being properly
rolled back if a error occurred during row by row deletion. The
error can be caused by a foreign key restriction imposed by InnoDB
SE and would cause the server to erroneously issue a implicit
commit.
The solution is to rollback the transaction if a truncation via row
by row deletion fails, otherwise commit. All effects of a TRUNCATE
ABLE operation are rolled back if a row by row deletion fails.
TRUNCATE TABLE for InnoDB tables returned a count showing an approximation
of the number of rows affected to gain efficiency.
Now the statement always returns 0 rows affected for clarity.
- In QUICK_INDEX_MERGE_SELECT::read_keys_and_merge: when we got table->sort from Unique,
tell init_read_record() not to use rr_from_cache() because a) rowids are already sorted
and b) it might be that the the data is used by filesort(), which will need record rowids
(which rr_from_cache() cannot provide).
- Fully de-initialize the table->sort read in QUICK_INDEX_MERGE_SELECT::get_next(). This fixes BUG#35477.
(bk trigger: file as fix for BUG#35478).
The bool data type was redefined to BOOL (4 bytes on windows).
Removed the #define and fixed some of the warnings that were uncovered
by this.
Note that the fix also disables 2 warnings :
4800 : 'type' : forcing value to bool 'true' or 'false' (performance warning)
4805: 'operation' : unsafe mix of type 'type' and type 'type' in operation
These warnings will be handled in a separate bug, as they are performance related or bogus.
Fixed to int the return type of functions that return more than
2 distinct values.
binlog_format=mixed
Statement-based replication of DELETE ... LIMIT, UPDATE ... LIMIT,
INSERT ... SELECT ... LIMIT is not safe as order of rows is not
defined.
With this fix, we issue a warning that this statement is not safe to
replicate in statement mode, or go to row-based mode in mixed mode.
Note that we may consider a statement as safe if ORDER BY primary_key
is present. However it may confuse users to see very similiar statements
replicated differently.
Note 2: regular UPDATE statement (w/o LIMIT) is unsafe as well, but
this patch doesn't address this issue. See comment from Kristian
posted 18 Mar 10:55.
In cases when TRUNCATE was executed by invoking mysql_delete() rather
than by table recreation (for example, when TRUNCATE was issued on
InnoDB table with is referenced by foreign key) triggers were invoked.
In debug builds this also led to crash because of an assertion, which
assumes that some preliminary actions take place before trigger
invocation, which doesn't happen in case of TRUNCATE.
The fix is not to execute triggers in mysql_delete() when this
function is used by TRUNCATE.
a SELECT doesn't cause ROLLBACK of statem".
The idea of the fix is to ensure that we always commit the current
statement at the end of dispatch_command(). In order to not issue
redundant disc syncs, an optimization of the two-phase commit
protocol is implemented to bypass the two phase commit if
the transaction is read-only.
the reason for the failure were incorrect asserts.
Removing asserts altogether as there is no the implication does not hold
(as explained in the comments for the file).
called from a SELECT doesn't cause ROLLBACK of state"
Make private all class handler methods (PSEA API) that may modify
data. Introduce and deploy public ha_* wrappers for these methods in
all sql/.
This necessary to keep track of all data modifications in sql/,
which is in turn necessary to be able to optimize two-phase
commit of those transactions that do not modify data.
cause ROLLBACK of statement", part 1. Review fixes.
Do not send OK/EOF packets to the client until we reached the end of
the current statement.
This is a consolidation, to keep the functionality that is shared by all
SQL statements in one place in the server.
Currently this functionality includes:
- close_thread_tables()
- log_slow_statement().
After this patch and the subsequent patch for Bug#12713, it shall also include:
- ha_autocommit_or_rollback()
- net_end_statement()
- query_cache_end_of_result().
In future it may also include:
- mysql_reset_thd_for_next_command().
error evaluating WHERE"
DELETE with a subquery in WHERE clause would sometimes ignore subquery
evaluation error and proceed with deletion.
The fix is to check for an error after evaluation of the WHERE clause
in DELETE.
Addressed review comments.
Query_log_event::error_code
A query can perform completely having the local var error of mysql_$query
zero, where $query in insert, update, delete, load,
and be binlogged with error_code e.g KILLED_QUERY while there is no
reason do to so.
That can happen because Query_log_event consults thd->killed flag to
evaluate error_code.
Fixed with implementing a scheme suggested and partly implemented at
time of bug@22725 work-on. error_status is cached immediatly after the
control leaves the main rows-loop and that instance always corresponds
to `error' the local of mysql_$query functions. The cached value
is passed to Query_log_event constructor, not the default thd->killed
which can be changed in between of the caching and the constructing.