Before this fix, configuring the server with:
- performance_schema_events_waits_history_size=0
- performance_schema_events_waits_history_long_size=0
could cause a crash in the performance schema.
Setting these options to 0 is intended to be valid and supported,
and already works properly in MySQL 5.6 and up.
This fix backports the code fix and test cases from MySQL 5.6
to the MySQL 5.5 release.
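
The text above does not say where the crash occurred. As an illustration only, one plausible failure mode for a circular history buffer is index arithmetic against a capacity of zero. The sketch below (class and method names are hypothetical, not the server's code) treats size 0 as "history disabled" and returns before touching the buffer:

```python
class WaitHistory:
    """Fixed-size circular history of wait events (illustrative only)."""

    def __init__(self, size):
        self.size = size
        self.records = [None] * size
        self.next_index = 0

    def insert(self, event):
        # Size 0 means the history is disabled: return before any index
        # arithmetic, because `% 0` would raise ZeroDivisionError.
        if self.size == 0:
            return
        self.records[self.next_index % self.size] = event
        self.next_index += 1

    def snapshot(self):
        """Return the retained events, oldest first."""
        if self.size == 0:
            return []
        count = min(self.next_index, self.size)
        start = self.next_index - count
        return [self.records[i % self.size]
                for i in range(start, self.next_index)]
```

With this guard, a history configured with size 0 accepts inserts and simply retains nothing.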
Details:
- Archive storage engine file accesses were not instrumented and thus
were not shown in PS tables.
Fix:
- Added instrumentation code by using the PS APIs for I/O.
Before this fix, the test performance_schema.relaylog would fail
with sporadic failures related to statistics on update_cond.
The reason for these failures is that thread scheduling makes it
impossible to predict whether instrumented conditions will be used or not.
The fix is to relax the test case, to not collect statistics about:
- wait/synch/cond/sql/MYSQL_BIN_LOG::update_cond
- wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond
Bug 12430414 - THE TEST PERFSCHEMA.SELECTS.TEST CAN AFFECT SUCCEEDING TESTS
Bug 12430599 - THE TEST PERFSCHEMA.ONE_THREAD_PER_CON. CAN AFFECT SUCCEEDING TESTS
Bug 12431153 - THE TEST PERFSCHEMA.PFS_UPGRADE CAN AFFECT SUCCEEDING TEST
Before this fix, all the performance schema instrumentation for both the binary log
and the relay log would use the following instruments:
- wait/io/file/sql/binlog
- wait/io/file/sql/binlog_index
- wait/synch/mutex/sql/MYSQL_BIN_LOG::LOCK_index
- wait/synch/cond/sql/MYSQL_BIN_LOG::update_cond
This instrumentation is too general and can be more specific.
With this fix, the binlog instrumentation is identical,
and the relay log instrumentation is changed to:
- wait/io/file/sql/relaylog
- wait/io/file/sql/relaylog_index
- wait/synch/mutex/sql/MYSQL_RELAY_LOG::LOCK_index
- wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond
With this change, the performance instrumentation for the binary log and the relay log,
which share the same structure but have different uses, is more detailed.
This is especially important for hosts in the middle of a replication chain,
that are both masters (binlog) and slaves (relaylog).
The problem from a user point of view was that on Solaris the
time related functions (e.g. NOW(), SYSDATE(), etc) would always
return a fixed time.
This bug was caused by logic in the time-retrieving wrapper
function, which would only call the time() function every
half second. The interval between calls was calculated
using gethrtime(), and the logic relied on the time
returned by it being monotonic.
Unfortunately, due to bugs in the gethrtime() implementation,
there are some cases where the time returned by it can drift
(See Solaris bug id 6600939), potentially causing the interval
calculation logic to fail.
Since newer versions of Solaris (10+) have alleviated the
performance degradation associated with time(2), the solution is
simply to rely directly on time() at each invocation.
This simplification has an upside that it allows us to eliminate
a lock which was used to control access to the variables used
to track the half second interval, thus improving the overall
scalability of timekeeping related functions (e.g. NOW()).
Benchmark runs have shown no significant degradation associated
with this change. In fact, performance improves in cases
involving many connections.
In summary, the changes introduced by this patch are:
a) my_time() and my_micro_time_and_time() no longer use gethrtime().
Instead, time() and gettimeofday() are used respectively.
b) my_micro_time() is changed to not use gethrtime() so as to
have the same time source as my_micro_time_and_time().
There shouldn't be any performance impact from this change
since this function is used only a few times during statement
execution and, on Solaris, gettimeofday() shows acceptable
performance.
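
To illustrate why the half-second caching scheme is fragile, here is a minimal sketch. It assumes the misbehaving high-resolution timer shows up as readings that jump backwards (the text above only says the time "can drift"), and it leaves out the lock that guarded the cached variables. CachedClock, DirectClock and replay are hypothetical names, not the server's functions:

```python
class CachedClock:
    """Old scheme (sketch): refresh the wall-clock time only every half
    second, measuring the interval with a high-resolution timer."""

    def __init__(self, wall_clock, hires_clock, interval=0.5):
        self.wall_clock = wall_clock
        self.hires_clock = hires_clock
        self.interval = interval
        self.cached = wall_clock()
        self.last_refresh = hires_clock()

    def now(self):
        current = self.hires_clock()
        # Only correct if hires_clock never goes backwards.
        if current - self.last_refresh >= self.interval:
            self.cached = self.wall_clock()
            self.last_refresh = current
        return self.cached


class DirectClock:
    """New scheme (sketch): ask the wall clock on every call."""

    def __init__(self, wall_clock):
        self.wall_clock = wall_clock

    def now(self):
        return self.wall_clock()


def replay(values):
    """Return a zero-argument callable that yields the readings in order."""
    readings = iter(values)
    return lambda: next(readings)
```

If the high-resolution readings fall behind the last refresh point, the computed interval stays below half a second and CachedClock keeps returning the same cached value, which matches the fixed time users observed. DirectClock has no such state to get stuck in.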
Fixed the test case to be independent of build options used.
Removed the lowercase-table-names constraint, since performance schema tables are now in lowercase.
This is a code cleanup.
The implementation of a storage engine (subclasses of handler) is not supposed
to call my_error() directly inside the engine implementation,
but only return error codes, and report errors later at the demand
of the sql layer only (if needed), using handler::print_error().
This fix removes misplaced calls to my_error(),
and provides an implementation of print_error() instead.
Given that the sql layer implementation of create table, ha_create_table(),
does not use print_error() but returns ER_CANT_CREATE_TABLE directly,
the return code for create table statements using the performance schema
has changed to ER_CANT_CREATE_TABLE.
Adjusted the test suite accordingly.
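
The division of responsibility described above can be sketched as follows. This is a simplified model, not the handler API: the error code, the message table, and the class names are all hypothetical. The engine only returns codes, and the calling layer decides whether to ask the engine to report them:

```python
OK = 0
ERR_READ_ONLY = 1  # hypothetical engine error code

ERROR_MESSAGES = {ERR_READ_ONLY: "table is read only"}


class ReadOnlyEngine:
    """Engine side: return error codes, never report them directly."""

    def update_row(self, row):
        return ERR_READ_ONLY

    def print_error(self, code, sink):
        # Called by the SQL layer, and only when it wants a report.
        sink.append(ERROR_MESSAGES.get(code, "unknown error %d" % code))


class SqlLayer:
    """Caller side: decides if and when an engine error is reported."""

    def __init__(self):
        self.reported = []

    def run_update(self, engine, row):
        rc = engine.update_row(row)
        if rc != OK:
            engine.print_error(rc, self.reported)
        return rc
```

A caller that does not go through print_error(), as described above for ha_create_table(), would simply map the returned code to its own error instead.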
Before this fix, the test thread_cache failed with spurious failures.
The test used:
-- disconnect X
-- connect Y
while assuming that connection Y would reuse connection X's slot in the thread cache.
For this to happen, the disconnect X operation must be given enough time to complete,
otherwise connect Y can be executed in the server before X actually finishes.
This fix uses wait conditions to make the test execution more controlled,
and more reproducible.
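
The idea of a wait condition can be sketched outside the test framework. The example below is a toy model with hypothetical names (ThreadCache, wait_until, disconnect_then_connect): the asynchronous "disconnect" releases a slot some time later, and the "connect" step polls until the slot is visible before trying to reuse it:

```python
import threading
import time


def wait_until(condition, timeout=5.0, poll_interval=0.01):
    """Poll until condition() is true; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


class ThreadCache:
    """Toy thread cache that only counts idle slots."""

    def __init__(self):
        self._lock = threading.Lock()
        self.idle_slots = 0

    def release_slot(self):
        with self._lock:
            self.idle_slots += 1

    def acquire_slot(self):
        """Return True if an idle slot was reused, False otherwise."""
        with self._lock:
            if self.idle_slots > 0:
                self.idle_slots -= 1
                return True
            return False


def disconnect_then_connect(cache, disconnect_delay=0.05):
    # The disconnect completes asynchronously, some time after it is issued.
    threading.Timer(disconnect_delay, cache.release_slot).start()
    # Without this wait, the connect could run before the slot is released.
    if not wait_until(lambda: cache.idle_slots == 1):
        raise TimeoutError("disconnect did not complete in time")
    return cache.acquire_slot()
```

Calling cache.acquire_slot() immediately after starting the timer would usually find no idle slot, which is the race the original test was exposed to.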
Before this fix, the test myisam_file_io executed:
- (a) an update on setup_instruments to disable non myisam file io instruments
- (b) a truncate on events_waits_history_long
and later
- (c) a select on events_waits_history_long
Surprisingly, events that were supposed to be disabled in (a) and removed in (b)
were still found in (c).
This happened for events such as
wait/io/file/innodb/innodb_data_file fil0fil.c: sync
because the sync was started before (a) and completed after (b),
and as a consequence was added in the performance schema history, as expected.
Presence of these records in the history made the test fail.
This fix makes the test script more robust, to account for extra spilled wait records in (c).
This fix affects the test suite only.
Before this fix, the performance schema tests dml_*.test could
fail spuriously, depending on the table content.
This fix simplifies the SELECT tests in the dml_*.test scripts,
to only verify that the SELECT operation passed the security checks
and succeeded, which was the original intent of the test.
Usage of
--replace_column 1 # 2 # 3 # 4 # ...
to discard the test output was replaced by a simpler and more maintainable
--disable_result_log
which also works for empty tables.
Text conflict in mysql-test/suite/perfschema/r/dml_setup_instruments.result
Text conflict in mysql-test/suite/perfschema/r/global_read_lock.result
Text conflict in mysql-test/suite/perfschema/r/server_init.result
Text conflict in mysql-test/suite/perfschema/t/global_read_lock.test
Text conflict in mysql-test/suite/perfschema/t/server_init.test
bug #57006 "Deadlock between HANDLER and FLUSH TABLES WITH READ
LOCK" and bug #54673 "It takes too long to get readlock for
'FLUSH TABLES WITH READ LOCK'".
The first bug manifested itself as a deadlock which occurred
when a connection, which had some table open through HANDLER
statement, tried to update some data through DML statement
while another connection tried to execute FLUSH TABLES WITH
READ LOCK concurrently.
What happened was that FTWRL in the second connection managed
to perform the first step of GRL acquisition and thus blocked all
upcoming DML. After that it started to wait for the table opened
through the HANDLER statement to be flushed. When the first connection
tried to execute DML, it started to wait for GRL, that is, for the
second connection, creating a deadlock.
The second bug manifested itself as starvation of FLUSH TABLES
WITH READ LOCK statements in cases when there was a constant
stream of concurrent DML statements (in two or more
connections).
This happened because requests for protection against GRL,
which were acquired by DML statements, ignored the presence of
a pending GRL, and thus the latter was starved.
This patch solves both these problems by re-implementing GRL
using metadata locks.
Similar to the old implementation, acquisition of GRL in the new
implementation is two-step. During the first step we block
all concurrent DML and DDL statements by acquiring a global S
metadata lock (each DML and DDL statement acquires a global IX
lock for its duration). During the second step we block commits
by acquiring a global S lock in the COMMIT namespace (commit code
acquires a global IX lock in this namespace).
Note that, unlike in the old implementation, acquisition of
protection against GRL in DML and DDL is semi-automatic.
We assume that any statement which should be blocked by GRL
will either open and acquire write locks on tables or acquire
metadata locks on the objects it is going to modify. For any such
statement a global IX metadata lock is automatically acquired
for its duration.
The first problem is solved because waits for GRL become
visible to the deadlock detector in the metadata locking subsystem,
and thus deadlocks like the one in the first bug become impossible.
The second problem is solved because the global S locks which
are used for the GRL implementation are given preference over
the IX locks which are acquired by concurrent DML (and we can
switch to fair scheduling in the future if needed).
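
The two-step acquisition and the preference of S over IX can be sketched with a toy lock table. This is not the MDL subsystem: Namespace, request and release are hypothetical names, the model has no blocking or queues, and a failed request simply returns False. It assumes S is compatible with S and IX with IX, and that S and IX conflict:

```python
S, IX = "S", "IX"

# COMPATIBLE[(requested, held)]
COMPATIBLE = {(IX, IX): True, (S, S): True, (IX, S): False, (S, IX): False}


class Namespace:
    """One lock namespace, e.g. the global one or the COMMIT one."""

    def __init__(self, name):
        self.name = name
        self.granted = []       # (owner, lock_type) pairs
        self.pending_s = set()  # owners waiting for an S lock

    def request(self, owner, lock_type):
        """Grant the lock if possible; return True when granted."""
        conflict = any(not COMPATIBLE[(lock_type, held)]
                       for holder, held in self.granted if holder != owner)
        # Preference rule: a pending S request holds back new IX requests,
        # so a steady stream of DML cannot starve it.
        if lock_type == IX and self.pending_s:
            conflict = True
        if conflict:
            if lock_type == S:
                self.pending_s.add(owner)
            return False
        self.pending_s.discard(owner)
        self.granted.append((owner, lock_type))
        return True

    def release(self, owner):
        self.pending_s.discard(owner)
        self.granted = [(h, t) for h, t in self.granted if h != owner]
```

In this model, FTWRL would request S in the global namespace first and, once granted, S in the COMMIT namespace. A DML statement that arrives while the first S request is still pending is refused its IX lock, which is the behavior that prevents starvation.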
Important change:
FTWRL/GRL no longer blocks DML and DDL on temporary tables.
Before this patch behavior was not consistent in this respect:
in some cases DML/DDL statements on temporary tables were
blocked while in others they were not. Since the main use cases
for FTWRL are various forms of backups and temporary tables are
not preserved during backups we have opted for consistently
allowing DML/DDL on temporary tables during FTWRL/GRL.
Important change:
This patch changes the thread state names which are used when
DML/DDL or FTWRL is waiting for the global read lock. It is now
either "Waiting for global read lock" or "Waiting for commit
lock", depending on which stage FTWRL is in.
Incompatible change:
To solve a deadlock in the events code which was exposed by this
patch, we had to replace the LOCK_event_metadata mutex with
metadata locks on events. As a result, we had to prohibit
DDL on events under LOCK TABLES.
This patch also adds extensive test coverage for interaction
of DML/DDL and FTWRL.
The performance of the new and old global read lock implementations
was compared in sysbench tests. There was no significant
difference between the new and old implementations.
Before this fix, the performance schema tables were defined in UPPERCASE.
This was incompatible with the lowercase_table_names option, and caused
issues with the install / upgrade process, when changing the lower case
table names setting *after* the install or upgrade.
With this fix, all performance schema tables are exposed with lowercase names.
As a result, the name of the performance schema table is always lowercase,
no matter how / if / when the lowercase_table_names setting is changed.
This change is to align the 5.5 performance_schema.THREADS
table definition with the 5.6 performance_schema.THREADS table,
to facilitate the 5.5 -> 5.6 migration later.
In the table performance_schema.THREADS:
- renamed ID to PROCESSLIST_ID, removed not null
- changed NAME from varchar(64) to varchar(128)
to match the columns definitions from 5.6
Adjusted the test cases accordingly.
Note: this fix is for 5.5 only, to null merge into 5.6
Before this fix, the test output for perfschema.server_init would
vary between executions, because some of the objects tested were
not guaranteed to exist in all configurations / code paths.
This fix removes these weak tests.
Also, comments referring to abandoned code have been cleaned up.
Before this fix, the server could crash inside a memcpy when reading data
from the EVENTS_WAITS_CURRENT / HISTORY / HISTORY_LONG tables.
The root cause is that the length used in a memcpy could be corrupted,
when another thread writes data in the wait record being read.
Reading unsafe data is ok, per design choice, and the code does sanitize
the data in general, but did not sanitize the length given to memcpy.
The fix is to also sanitize the schema name / object name / file name
length when extracting the data to produce a row.
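
A minimal sketch of the length check, assuming the sanitizing step rejects a record whose length field is out of range (clamping would be another option; the text above does not say which is used). MAX_NAME_LENGTH and copy_name are hypothetical names:

```python
MAX_NAME_LENGTH = 64  # hypothetical capacity of the destination buffer


def copy_name(source, claimed_length, capacity=MAX_NAME_LENGTH):
    """Copy a name out of a record that another thread may be rewriting.

    The length field is read without synchronization, so it is treated as
    untrusted: the copy happens only if the length fits both the source
    and the destination. Returns None when the record must be skipped.
    """
    if claimed_length < 0:
        return None
    if claimed_length > capacity or claimed_length > len(source):
        return None
    return bytes(source[:claimed_length])
```

For example, copy_name(b"test_db", 7) yields the name, while a corrupted length such as 4000 makes the caller skip the row instead of copying past the buffer.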
With recent changes in the performance schema default sizing parameters,
the memory used by a mysqld binary increased accordingly.
This negatively affects the MTR test suite,
because running several tests in parallel now consumes more resources.
The fix is to leave the default production values unchanged,
and to configure the MTR environment to limit memory
used when running tests in the test suite, which is ok
because only a few objects are typically used within a test script.
This fix:
- changed the default configuration in MTR to use less memory
- adjusted the performance schema tests accordingly
Note that 1,000 mutex instances was too low and caused test failures
in the past in team trees, so the default used is now 10,000 in MTR.
The amount of memory used by the performance schema itself
can be observed with the statement SHOW ENGINE PERFORMANCE_SCHEMA STATUS.
Before this fix, the server did not recognize 'short' (as in -a)
options but only 'long' (as in --ansi) options
in the startup command line, due to earlier changes in 5.5
introduced for the performance schema.
The root cause is that handle_options() did not honor the
my_getopt_skip_unknown flag when parsing 'short' options.
The fix changes handle_options(), so that my_getopt_skip_unknown is
honored in all cases.
Note that there are limitations to this,
see the added doxygen documentation in handle_options().
The current usage of handle_options() by the server to
parse early performance schema options fits within the limitations.
This has been enforced by an assert for PARSE_EARLY options, for safety.
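
The behavior of a "skip unknown" flag that applies to both option forms can be sketched as below. This is not the my_getopt code and shares none of its limitations or features: the function reuses the name handle_options only for readability, its signature is hypothetical, and it handles only single-letter flags without arguments and --name[=value] options:

```python
def handle_options(argv, known_long, known_short, skip_unknown):
    """Parse argv and return (parsed, remaining).

    known_long:  set of recognized long option names
    known_short: dict mapping a short letter to its long option name
    Unknown options are kept in `remaining` when skip_unknown is true,
    for both short and long forms; otherwise they raise ValueError.
    """
    parsed, remaining = {}, []
    for arg in argv:
        if arg.startswith("--"):
            name, _, value = arg[2:].partition("=")
            known = name in known_long
            if known:
                parsed[name] = value if value else True
        elif arg.startswith("-") and len(arg) == 2:
            known = arg[1] in known_short
            if known:
                parsed[known_short[arg[1]]] = True
        else:
            remaining.append(arg)  # positional argument, passed through
            continue
        if known:
            continue
        if not skip_unknown:
            raise ValueError("unknown option: " + arg)
        remaining.append(arg)
    return parsed, remaining
```

A two-pass use mirrors the early-parsing scenario described above: the first pass knows only an early option (here a made-up --early-option) and skips -a and --ansi, and the second pass parses what was left over.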
Before this fix, some tests failed due to lack of instrumentation slots
in the performance schema, because the default sizing was too low.
Now that more code has been instrumented, the default sizing has to be adjusted
to match the current instrumentation consumption.
This change:
- increases the number of rwlock classes from 20 to 30,
- increases the number of rwlock and mutex instances to 1 million.
Both are to account for the volume of data instrumented
when the innodb storage engine is used (because of the innodb buffer pool).
Adjusted the test output accordingly.
temp table
This patch introduces two key changes in the replication's behavior.
Firstly, it reverts part of BUG#51894 which puts any update to temporary tables
into the trx-cache. Now, updates to temporary tables are handled according to
the type of their engines as a regular table.
Secondly, an unsafe mixed statement (i.e. a statement that accesses a transactional
table as well as a non-transactional or temporary table, and writes to any of them)
is written into the trx-cache in order to minimize errors in the execution when
the statement logging format is in use.
These changes have a direct impact on which statements are classified as unsafe
statements, and thus part of BUG#53259 is reverted.
TABLES <list> WITH READ LOCK are incompatible".
The problem was that FLUSH TABLES <list> WITH READ LOCK,
when issued while another connection had acquired the global
read lock using FLUSH TABLES WITH READ LOCK, was blocked
and had to wait until the global read lock was released.
This issue stemmed from the fact that the FLUSH TABLES <list>
WITH READ LOCK implementation acquired X metadata locks
on the tables to be flushed. Since acquiring these locks required
a global IX lock, this statement was incompatible with the global
read lock.
This patch addresses the problem by using the SNW type of metadata
lock for tables to be flushed by FLUSH TABLES <list> WITH
READ LOCK. It is OK to acquire them without a global IX lock
as long as we won't try to upgrade those locks. Since SNW
locks allow concurrent statements using the same table, FLUSH
TABLES <list> WITH READ LOCK now has to wait, after acquiring
metadata locks, until old versions of the tables to be flushed
go away. Since such waiting can lead to deadlock, the
MDL deadlock detector was extended to take into account
waits for flush and to resolve such deadlocks.
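
Why the detector needs to see flush waits can be shown with a generic wait-for graph. The sketch below is a plain depth-first cycle search, not the MDL deadlock detector, and merge_edges and find_deadlock are hypothetical names. Each edge means "this connection waits for that one":

```python
def merge_edges(*edge_maps):
    """Combine several waiter -> [holders] maps into one graph."""
    merged = {}
    for edges in edge_maps:
        for waiter, holders in edges.items():
            merged.setdefault(waiter, []).extend(holders)
    return merged


def find_deadlock(waits_for, start):
    """Return a cycle reachable from `start` as a list of nodes, or None."""
    path, on_path, visited = [], set(), set()

    def visit(node):
        if node in on_path:
            # Found a node already on the current path: that is a cycle.
            return path[path.index(node):] + [node]
        if node in visited:
            return None
        visited.add(node)
        on_path.add(node)
        path.append(node)
        for holder in waits_for.get(node, ()):
            cycle = visit(holder)
            if cycle:
                return cycle
        on_path.discard(node)
        path.pop()
        return None

    return visit(start)


# conn1 waits for a metadata lock held by conn2 ...
mdl_waits = {"conn1": ["conn2"]}
# ... while conn2 waits for conn1 to close an old version of a table.
flush_waits = {"conn2": ["conn1"]}
```

With only the metadata lock edges the graph has no cycle, so the deadlock goes unnoticed. Once the flush edges are merged in, the search from conn1 returns the cycle conn1, conn2, conn1, and one of the participants can be chosen as a victim.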
As a bonus, the code in open_tables() which was responsible for
waiting for old versions of tables to go away was refactored.
Now when we encounter an old version of a table in open_table()
we don't back off and wait for all old versions to go away,
but instead wait for this particular table to be flushed.
Such an approach, supported by deadlock detection, should reduce
the number of scenarios in which FLUSH TABLES aborts concurrent
multi-statement transactions.
Note that an active FLUSH TABLES <list> WITH READ LOCK still
blocks a concurrent FLUSH TABLES WITH READ LOCK statement,
as the former keeps tables open and thus prevents the
latter statement from doing the flush.