revno: 4559
committer: Marc Alff <marc.alff@oracle.com>
branch nick: mysql-5.6-bug14741537-v4
timestamp: Thu 2012-11-08 22:40:31 +0100
message:
Bug#14741537 - MYSQL 5.6, GTID AND PERFORMANCE_SCHEMA
Before this fix, statements using performance_schema tables:
- were marked as unsafe for replication,
- caused warnings during execution,
- were written to the binlog, either in STATEMENT or ROW format.
When using replication with the new GTID feature,
unsafe warnings are elevated to errors,
which prevents using the performance_schema and GTID together.
The root cause of the problem is not related to raising warnings/errors
in some special cases, but deeper: statements involving the performance
schema should not even be written to the binary log in the first place,
because the content of the performance schema tables is 'local' to a server
instance, and may differ greatly between nodes in a replication
topology.
In particular, the DBA should be able to configure (INSERT, UPDATE, DELETE)
or flush (TRUNCATE) performance schema tables on one node,
without affecting other nodes.
This fix introduces the concept of a 'non-replicated' or 'local' table,
and adjusts the replication logic to ignore tables that are not replicated
when deciding if or how to log a statement to the binlog.
Note that while this issue was detected using the performance_schema,
other tables are also affected by the same problem.
This fix defines the following tables as 'local'; they are then never
replicated:
- performance_schema.*
- mysql.general_log
- mysql.slow_log
- mysql.slave_relay_log_info
- mysql.slave_master_info
- mysql.slave_worker_info
Existing behavior for information_schema.* is unchanged by this fix,
to limit the scope of changes.
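
As an illustration only, the list of 'local' tables above can be modeled as a small predicate. This is a sketch in Python, not the server code: the function name is_local_table() and the lower-casing of names are assumptions made for the example.

```python
# Tables in the mysql schema that this fix treats as 'local' (from the list above).
LOCAL_MYSQL_TABLES = {
    "general_log",
    "slow_log",
    "slave_relay_log_info",
    "slave_master_info",
    "slave_worker_info",
}


def is_local_table(db, table):
    """Return True if db.table is never replicated under this fix.

    Hypothetical helper: names are compared case-insensitively here,
    which is an assumption of this sketch.
    """
    db = db.lower()
    table = table.lower()
    if db == "performance_schema":
        return True  # performance_schema.* is local as a whole
    return db == "mysql" and table in LOCAL_MYSQL_TABLES
```

information_schema.* deliberately returns False here, since its behavior is unchanged by this fix.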
Code-wise, this fix implements the following changes:
1)
Performance schema tables are not using any replication flags,
since performance schema tables are not replicated.
2)
In open_table_from_share(),
tables with no replication capabilities (performance_schema.*),
tables with TABLE_CATEGORY_LOG (logs)
and tables with TABLE_CATEGORY_RPL_INFO (replication)
are marked as non replicated, with TABLE::no_replicate
3)
A new THD member, THD::m_binlog_filter_state,
indicates whether the current statement is written to the binlog
(the normal case for most statements), or is to be discarded
(because the statement affects non replicated tables).
4)
In THD::decide_logging_format(), the replication logic
is changed to take into account non replicated tables.
Statements that affect only non replicated tables are
executed normally (no warning or errors), but not written
to the binlog.
Statements that affect (i.e., write to) a replicated table
while also using (i.e., reading from or writing to) a non replicated table
are executed normally in MIXED and ROW binlog format,
and cause a new error in STATEMENT binlog format.
THD::decide_logging_format() uses THD::m_binlog_filter_state
to indicate if a statement is to be ignored, when writing to
the binlog.
5)
In THD::binlog_query(), statements marked as ignored
are not written to the binary log.
6)
For row based replication, the existing test for 'table->no_replicate'
has been moved from binlog_log_row() to check_table_binlog_row_based().
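
The rules in item 4 can be summarized by the following model. It is a simplified sketch in Python of the decision described above, not the code of THD::decide_logging_format(); the names TableRef, FilterState, MixedTablesError and decide_logging() are hypothetical, and the source does not name the values of THD::m_binlog_filter_state.

```python
from collections import namedtuple
from enum import Enum

# One table used by a statement: is it replicated, and does the statement write to it?
TableRef = namedtuple("TableRef", ["name", "replicated", "written"])


class FilterState(Enum):
    NORMAL = "normal"    # statement goes to the binlog as usual
    DISCARD = "discard"  # statement is executed but not binlogged


class MixedTablesError(Exception):
    """Stands in for the new error raised in STATEMENT binlog format."""


def decide_logging(tables, binlog_format):
    writes_replicated = any(t.replicated and t.written for t in tables)
    writes_local = any(not t.replicated and t.written for t in tables)
    uses_local = any(not t.replicated for t in tables)

    if writes_replicated and uses_local:
        # Writing a replicated table while reading or writing a local one.
        if binlog_format == "STATEMENT":
            raise MixedTablesError("statement mixes replicated and local tables")
        return FilterState.NORMAL  # executed normally in MIXED and ROW
    if writes_local:
        # Only non replicated tables are affected: run it, do not log it.
        return FilterState.DISCARD
    return FilterState.NORMAL
```

For example, a TRUNCATE of a performance schema table yields DISCARD in any format, while an INSERT into a replicated table that selects from a performance schema table is accepted in ROW or MIXED and rejected in STATEMENT.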
-Note 1031 Table storage engine for 't1' doesn't have this option
+Note 1031 Table storage engine for 'InnoDB' doesn't have this option
They were caused by a change in MariaDB which changed ER_ILLEGAL_HA message
text to be like:
"Storage engine InnoDB of the table `test`.`t1` doesn't have this option"
Some of the error calls were changed to pass the new parameters, but some were
left unchanged. Also the error text in errmsg-utf8.txt was not changed.
Implement a facility for the commit in one thread to wait for the commit of
another to complete first. The wait is done in a way that still allows a
waiter and a waitee to group commit together with a single fsync()
in both binlog and InnoDB. The wait is done efficiently with respect to
locking.
The patch was originally made to support TaoBao parallel replication with
in-order commit; now it will be adapted to also be used for parallel
replication of group-committed transactions.
A waiter THD registers itself with a prior waitee THD. The waiter will then
complete its commit at the earliest in the same group commit of the waitee
(when using binlog). The wait can also be done explicitly by the waitee.
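
The registration and wakeup protocol described above can be sketched as follows. This is a toy model in Python using threading.Event, not the server implementation: the class CommitSlot, its method names and run_in_order() are hypothetical, and group commit and fsync() are not modeled at all.

```python
import threading


class CommitSlot:
    """Hypothetical stand-in for the per-THD commit ordering state."""

    def __init__(self, name):
        self.name = name
        self._committed = threading.Event()
        self._prior = None  # the waitee this slot must commit after, if any

    def register_wait_for_prior_commit(self, waitee):
        self._prior = waitee

    def wait_for_prior_commit(self):
        if self._prior is not None:
            self._prior._committed.wait()

    def wakeup_subsequent_commits(self):
        self._committed.set()


def run_in_order(names):
    """Commit one transaction per thread, forcing the order given in names."""
    log = []
    log_lock = threading.Lock()
    slots = [CommitSlot(n) for n in names]
    for prior, current in zip(slots, slots[1:]):
        current.register_wait_for_prior_commit(prior)

    def commit(slot):
        slot.wait_for_prior_commit()      # block until the waitee has committed
        with log_lock:
            log.append(slot.name)         # the "commit" itself
        slot.wakeup_subsequent_commits()  # release whoever waits on this slot

    # Start the threads in reverse order to show that start order does not matter.
    threads = [threading.Thread(target=commit, args=(s,)) for s in reversed(slots)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Calling run_in_order(["T1", "T2", "T3"]) returns the names in commit order, which matches the registration order regardless of how the threads are scheduled.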
1. DROP DATABASE should use ha_discover_table_names(), not look at .frm files.
2. filename_to_tablename() also encodes temp file names #sql- -> #mysql50##sql
3. no special treatment for #sql- files, no TABLE_LIST::internal_tmp_table
4. also discover table file names that start with #
- Added test and extra code to ensure we don't leave keyread on for a handler table.
- Always create on-disk temporary files with long data pointers if SQL_SMALL_RESULT is not used. This ensures that we can handle temporary files bigger than 4G.
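
The 4G limit mentioned above follows from pointer width. The arithmetic can be checked with a short Python sketch, assuming a data pointer stores a byte offset into the file; the helper names are hypothetical and the actual pointer sizes used by the server are not stated here.

```python
def max_addressable_bytes(pointer_bytes):
    """Largest file size whose every byte offset fits in the pointer."""
    return 256 ** pointer_bytes


def min_pointer_bytes(file_size):
    """Smallest pointer width (in bytes) that can address file_size bytes."""
    width = 1
    while max_addressable_bytes(width) < file_size:
        width += 1
    return width
```

With these definitions a 4-byte pointer addresses exactly 4 GiB, so a temporary file that grows past that size needs a wider pointer.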
mysql-test/include/default_mysqld.cnf:
Run test suite with smaller aria keybuffer size
mysql-test/suite/maria/maria3.result:
Run test suite with smaller aria keybuffer size
mysql-test/suite/sys_vars/r/aria_pagecache_buffer_size_basic.result:
Run test suite with smaller aria keybuffer size
sql/handler.cc:
Disable key read (extra safety if something went wrong)
sql/multi_range_read.cc:
Ensure we don't leave keyread on for secondary_file
sql/opt_range.cc:
Simplify code with mark_columns_used_by_index_no_reset()
Ensure that read_keys_and_merge() disables keyread if it enables it
sql/opt_subselect.cc:
Remove no longer used argument for create_internal_tmp_table()
sql/sql_derived.cc:
Remove no longer used argument for create_internal_tmp_table()
sql/sql_select.cc:
Use 'enable_keyread()' instead of calling HA_EXTRA_RESET. (Makes debugging easier)
Always create on-disk temporary files with long data pointers if SQL_SMALL_RESULT is not used. This ensures that we can handle temporary files bigger than 4G.
Remove no longer used argument for create_internal_tmp_table()
More DBUG
sql/sql_select.h:
Remove no longer used argument for create_internal_tmp_table()
bzr merge lp:maria/5.5 -rtag:mariadb-5.5.31
Text conflict in cmake/cpack_rpm.cmake
Text conflict in debian/dist/Debian/control
Text conflict in debian/dist/Ubuntu/control
Text conflict in sql/CMakeLists.txt
Conflict adding file sql/db.opt. Moved existing file to sql/db.opt.moved.
Conflict adding file sql/db.opt.moved. Moved existing file to sql/db.opt.moved.moved.
Text conflict in sql/mysqld.cc
Text conflict in support-files/mysql.spec.sh
8 conflicts encountered.
* print "table doesn't exist in engine" when a table doesn't exist in the engine,
instead of "file not found" (if no file was involved)
* print a complete filename that cannot be found ('t1.MYI', not 't1')
* it's not an error for a DROP if a table doesn't exist in the engine (or some table
files cannot be found) - if the DROP succeeded regardless
DOWNGRADED FROM 5.6.11 TO 5.6.10
Problem was new syntax not accepted by previous version.
Fixed by wrapping the new syntax in a version comment
(/*!50611 in 5.6, /*!50531 in 5.5).
Like this in the .frm file:
'PARTITION BY KEY /*!50611 ALGORITHM = 2 */ () PARTITIONS 3'
and also changing the output from SHOW CREATE TABLE to:
CREATE TABLE t1 (a INT)
/*!50100 PARTITION BY KEY */ /*!50611 ALGORITHM = 1 */ /*!50100 ()
PARTITIONS 3 */
It will always add the ALGORITHM into the .frm for KEY [sub]partitioned
tables, but for SHOW CREATE TABLE it will only add it in case it is the non
default ALGORITHM = 1.
Also notice that for 5.5, it will say /*!50531 instead of /*!50611, which
will make upgrade from 5.5 > 5.5.31 to 5.6 < 5.6.11 fail!
If one downgrades a fixed version within the same major version (5.5 or 5.6),
bug 14521864 will be visible again, but unless the .frm is updated, it will
work again after upgrading again.
Also fixed so that the .frm does not get an updated version
if a single partition check passes.
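
To illustrate why the version comment makes the downgrade work, here is a minimal Python model of how a server of a given version treats /*!NNNNN ... */ comments. It is a sketch, not the server's lexer: the function apply_version_comments() is hypothetical, it only handles five-digit version numbers, and it normalizes whitespace for readability.

```python
import re

# Matches /*!NNNNN body */ where NNNNN is a five-digit version number.
_VERSION_COMMENT = re.compile(r"/\*!(\d{5})(.*?)\*/", re.DOTALL)


def apply_version_comments(sql, server_version):
    """Return the statement text as seen by a server of server_version.

    The body of a version comment is kept when the server version is at
    least the version in the comment, and dropped otherwise.
    """
    def expand(match):
        min_version = int(match.group(1))
        return match.group(2) if server_version >= min_version else " "

    expanded = _VERSION_COMMENT.sub(expand, sql)
    return " ".join(expanded.split())  # collapse whitespace and newlines


SHOW_CREATE = (
    "CREATE TABLE t1 (a INT)\n"
    "/*!50100 PARTITION BY KEY */ /*!50611 ALGORITHM = 1 */ /*!50100 ()\n"
    "PARTITIONS 3 */"
)
```

In this model a 5.6.10 server (50610) sees "PARTITION BY KEY () PARTITIONS 3" without the ALGORITHM clause, while 5.6.11 (50611) sees the clause. It also shows the upgrade problem noted above: a /*!50531 comment written by 5.5 is expanded by 5.6.10, which does not accept the new syntax.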