1
0
mirror of https://github.com/MariaDB/server.git synced 2025-12-01 17:39:21 +03:00
Commit Graph

248 Commits

Author SHA1 Message Date
Jan Lindström
01209de763 Merge remote-tracking branch 'origin/bb-10.1-jplindst' into 10.1 2017-08-29 20:30:18 +03:00
Marko Mäkelä
11352d52cd Merge 10.0 into 10.1 2017-08-28 15:05:46 +03:00
Marko Mäkelä
f87cb652ac MDEV-13637 InnoDB change buffer housekeeping can cause redo log overrun and possibly deadlocks
The function ibuf_remove_free_page() may be called while the caller
is holding several mutexes or rw-locks. Because of this, this
housekeeping loop may cause performance glitches for operations that
involve tables that are stored in the InnoDB system tablespace.
Also deadlocks might be possible.

The worst impact of all is that due to the mutexes being held, calls to
log_free_check() had to be skipped during this housekeeping.
This means that the cyclic InnoDB redo log may be overwritten.
If the system crashes during this, it would be unable to recover.

The entry point to the problematic code is ibuf_free_excess_pages().
It would make sense to call it before acquiring any mutexes or rw-locks,
in any 'pessimistic' operation that involves the system tablespace.

fseg_create_general(), fseg_alloc_free_page_general(): Do not call
ibuf_free_excess_pages() while potentially holding some latches.

ibuf_remove_free_page(): Do call log_free_check(), like every operation
that is about to generate redo log should do.

ibuf_free_excess_pages(): Remove some assertions that are replaced
by stricter assertions in the log_free_check() that is now called by
ibuf_remove_free_page().

row_mtr_start(): New function, to perform necessary preparations when
starting a mini-transaction for row operations. For pessimistic operations
on secondary indexes that are located in the system tablespace,
this includes calling ibuf_free_excess_pages().

row_undo_ins_remove_sec_low(), row_undo_mod_del_mark_or_remove_sec_low(),
row_undo_mod_del_unmark_sec_and_undo_update(): Call row_mtr_start().

row_ins_sec_index_entry(): Call ibuf_free_excess_pages() if the operation
may involve allocating pages and change buffering in the system tablespace.

row_upd_sec_index_entry(): Slightly refactor the code. The
delete-marking of the old entry is done in-place. It could be
change-buffered, but the old code should be unlikely to have
invoked ibuf_free_excess_pages() in this case.
2017-08-28 08:57:51 +03:00
Marko Mäkelä
582545a384 MDEV-13637 InnoDB change buffer housekeeping can cause redo log overrun and possibly deadlocks
The function ibuf_remove_free_page() may be called while the caller
is holding several mutexes or rw-locks. Because of this, this
housekeeping loop may cause performance glitches for operations that
involve tables that are stored in the InnoDB system tablespace.
Also deadlocks might be possible.

The worst impact of all is that due to the mutexes being held, calls to
log_free_check() had to be skipped during this housekeeping.
This means that the cyclic InnoDB redo log may be overwritten.
If the system crashes during this, it would be unable to recover.

The entry point to the problematic code is ibuf_free_excess_pages().
It would make sense to call it before acquiring any mutexes or rw-locks,
in any 'pessimistic' operation that involves the system tablespace.

fseg_create_general(), fseg_alloc_free_page_general(): Do not call
ibuf_free_excess_pages() while potentially holding some latches.

ibuf_remove_free_page(): Do call log_free_check(), like every operation
that is about to generate redo log should do.

ibuf_free_excess_pages(): Remove some assertions that are replaced
by stricter assertions in the log_free_check() that is now called by
ibuf_remove_free_page().

row_ins_sec_index_entry(), row_undo_ins_remove_sec_low(),
row_undo_mod_del_mark_or_remove_sec_low(),
row_undo_mod_del_unmark_sec_and_undo_update(): Call
ibuf_free_excess_pages() if the operation may involve allocating pages
and change buffering in the system tablespace.
2017-08-25 14:01:51 +03:00
Sergei Golubchik
27412877db Merge branch '10.2' into bb-10.2-ext 2017-08-25 10:25:48 +02:00
Marko Mäkelä
59caf2c3c1 MDEV-13485 MTR tests fail massively with --innodb-sync-debug
The parameter --innodb-sync-debug, which is disabled by default,
aims to find potential deadlocks in InnoDB.

When the parameter is enabled, lots of tests failed. Most of these
failures were due to bogus diagnostics. But, as part of this fix,
we are also fixing a bug in error handling code and removing dead
code, and fixing cases where an uninitialized mutex was being
locked and unlocked.

dict_create_foreign_constraints_low(): Remove an extraneous
mutex_exit() call that could cause corruption in an error handling
path. Also, do not unnecessarily acquire dict_foreign_err_mutex.
Its only purpose is to control concurrent access to
dict_foreign_err_file.

row_ins_foreign_trx_print(): Replace a redundant condition with a
debug assertion.

srv_dict_tmpfile, srv_dict_tmpfile_mutex: Remove. The
temporary file is never being written to or read from.

log_free_check(): Allow SYNC_FTS_CACHE (fts_cache_t::lock)
to be held.

ha_innobase::inplace_alter_table(), row_merge_insert_index_tuples():
Assert that no unexpected latches are being held.

sync_latch_meta_init(): Properly initialize dict_operation_lock_key
at SYNC_DICT_OPERATION. dict_sys->mutex is SYNC_DICT, and
the now-removed SRV_DICT_TMPFILE was wrongly registered at
SYNC_DICT_OPERATION.

buf_block_init(): Correctly register buf_block_t::debug_latch.
It was previously misleadingly reported as LATCH_ID_DICT_FOREIGN_ERR.

latch_level_t: Correct the relative latching order of
SYNC_IBUF_PESS_INSERT_MUTEX,SYNC_INDEX_TREE and
SYNC_FILE_FORMAT_TAG,SYNC_DICT_OPERATION to avoid bogus failures.

row_drop_table_for_mysql(): Avoid accessing btr_defragment_mutex
if the defragmentation thread has not been started. This is the
case during fts_drop_orphaned_tables() in recv_recovery_rollback_active().

fil_space_destroy_crypt_data(): Avoid acquiring fil_crypt_threads_mutex
when it is uninitialized. We may have created crypt_data before the
mutex was created, and the mutex creation would be skipped if
InnoDB startup failed or --innodb-read-only was specified.
2017-08-23 08:44:11 +03:00
Jan Lindström
c23efc7d50 Merge remote-tracking branch 'origin/10.0-galera' into 10.1 2017-08-21 13:35:00 +03:00
Marko Mäkelä
5d1c0d0086 MDEV-13331 FK DELETE CASCADE does not honor innodb_lock_wait_timeout
row_ins_check_foreign_constraint(): On timeout,
return DB_LOCK_WAIT_TIMEOUT instead of DB_LOCK_WAIT,
so that the lock wait will be properly terminated.
Also, replace some redundant assignments.

It looks like this bug was introduced in MySQL 5.7.8 by:

    commit a97f6b91227c7e0fc3151cfe5421891e79c12d19
    Author: Annamalai Gurusami <annamalai.gurusami@oracle.com>
    Date:   Tue Jun 9 16:02:31 2015 +0530

        Bug #20953265 INNODB: FAILING ASSERTION: RESULT != FTS_INVALID
2017-08-15 10:51:43 +03:00
sjaakola
396770fb67 Refs: MW-369 * changed parent row key type to S(hared), when FK child table is being updated or deleted 2017-08-14 13:40:37 +03:00
sjaakola
cc3bee92b6 Refs: MW-369 * changed insert for a FK child table to take exclusive lock on FK parent table 2017-08-14 13:09:16 +03:00
Alexander Barkov
8b2c7c9444 Merge remote-tracking branch 'origin/10.2' into bb-10.2-ext 2017-07-07 12:43:10 +04:00
Sergei Golubchik
f6633bf058 Merge branch '10.1' into 10.2 2017-07-05 19:08:55 +02:00
Alexander Barkov
5c0df0e4a8 Merge remote-tracking branch 'origin/10.2' into bb-10.2-ext 2017-07-04 15:31:25 +04:00
Marko Mäkelä
c436338d9d Assert that DB_TRX_ID must be set on delete-marked records
This is preparation for MDEV-12288, which would set DB_TRX_ID=0
when purging history. Also with that change in place, delete-marked
records must always refer to an undo log record via a nonzero
DB_TRX_ID column. (The DB_TRX_ID is only present in clustered index
leaf page records.)

btr_cur_parse_del_mark_set_clust_rec(), rec_get_trx_id():
Statically allocate the offsets
(should never use the heap). Add some debug assertions.

Replace some use of rec_get_trx_id() with row_get_rec_trx_id().

trx_undo_report_row_operation(): Add some sanity checks that are
common for all operations that produce undo log.
2017-07-01 11:02:58 +03:00
Sachin Setiya
e333d82964 MDEV-12398 All cluster nodes stop due to a foreign key constraint failure
Comment from Codership:-
To fix the problem, we changed the certification logic in galera to treat insert
on child table row as exclusive to prevent any operation on referenced
parent table row. At the same time, update and delete on
 child table row were demoted to "shared", which makes it possible to
update/delete referenced parent table row, but only in a later transaction.
 This change allows somewhat more concurrency for foreign key constrained
 transactions, but is still safe for correct certification end result.
2017-06-22 11:38:50 +05:30
Alexander Barkov
765347384a Merge remote-tracking branch 'origin/10.2' into bb-10.2-ext 2017-06-15 15:27:11 +04:00
Sachin Setiya
92209ac6f6 Merge tag 'mariadb-10.0.31' into 10.0-galera
Signed-off-by: Sachin Setiya <sachin.setiya@mariadb.com>
2017-05-30 15:28:52 +05:30
Alexander Barkov
9bc3225642 Merge tag 'mariadb-10.2.6' into bb-10.2-ext 2017-05-26 19:32:28 +04:00
Marko Mäkelä
8acf4d6f78 Follow-up fixes for MDEV-10139 Support for InnoDB SEQUENCE objects
Because SEQUENCE objects or NO_ROLLBACK tables do not support locking
or MVCC or transactions, avoid starting a transaction.

row_upd_step(): Do not start a transaction. Let the caller do that.

que_thr_step(): Call trx_start_if_not_started_xa() for QUE_NODE_UPDATE.
(The InnoDB SQL parser is not used for accessing NO_ROLLBACK tables.)

row_ins_step(): Correct a too strict assertion and comment about
concurrency. Multiple concurrent readers are allowed.

row_update_for_mysql_using_upd_graph(): Do not start the transaction
for NO_ROLLBACK tables.

row_search_mvcc(): For NO_ROLLBACK tables, skip locking even inside
LOCK TABLES. Only call trx_start_if_not_started() at the start
of a statement, not for each individual request.
2017-05-24 18:23:16 +03:00
Marko Mäkelä
70505dd45b Merge 10.1 into 10.2 2017-05-22 09:46:51 +03:00
Marko Mäkelä
13a350ac29 Merge 10.0 into 10.1 2017-05-19 12:29:37 +03:00
Marko Mäkelä
9f89b94ba6 MDEV-12358 Work around what looks like a bug in GCC 7.1.0
The parameter thr of the function btr_cur_optimistic_insert()
is not declared as nonnull, but GCC 7.1.0 with -O3 is wrongly
optimizing away the first part of the condition
UNIV_UNLIKELY(thr && thr_get_trx(thr)->fake_changes)
when the function is being called by row_merge_insert_index_tuples()
with thr==NULL.

The fake_changes is an XtraDB addition. This GCC bug only appears
to have an impact on XtraDB, not InnoDB.

We work around the problem by not attempting to dereference thr
when both BTR_NO_LOCKING_FLAG and BTR_NO_UNDO_LOG_FLAG are set
in the flags. Probably BTR_NO_LOCKING_FLAG alone should suffice.

btr_cur_optimistic_insert(), btr_cur_pessimistic_insert(),
btr_cur_pessimistic_update(): Correct comments that disagree with
usage and with nonnull attributes. No other parameter than thr can
actually be NULL.

row_ins_duplicate_error_in_clust(): Remove an unused parameter.

innobase_is_fake_change(): Unused function; remove.

ibuf_insert_low(), row_log_table_apply(), row_log_apply(),
row_undo_mod_clust_low():
Because we will be passing BTR_NO_LOCKING_FLAG | BTR_NO_UNDO_LOG_FLAG
in the flags, the trx->fake_changes flag will be treated as false,
which is the right thing to do at these low-level operations
(change buffer merge, ALTER TABLE…LOCK=NONE, or ROLLBACK).
This might be fixing actual XtraDB bugs.

Other callers that pass these two flags are also passing thr=NULL,
implying fake_changes=false. (Some callers in ROLLBACK are passing
BTR_NO_LOCKING_FLAG and a nonnull thr. In these callers, fake_changes
better be false, to avoid corruption.)
2017-05-17 16:09:22 +03:00
Aleksey Midenkov
7445be89af IB: correct way of using start_time_micro [fixes #189] 2017-05-11 12:57:48 +03:00
Marko Mäkelä
c22ef4df26 MDEV-12253 post-merge fix: Use accessors for dict_table_t::file_unreadable 2017-05-06 15:54:31 +03:00
Aleksey Midenkov
4383e16cbe IB: skip check_ref on historical record [fixes #101] 2017-05-05 20:36:23 +03:00
Aleksey Midenkov
46badf17c4 IB: FK cascade delete when parent versioned [fixes #101] 2017-05-05 20:36:23 +03:00
Aleksey Midenkov
d54d36c45e IB, SQL: (0.4) COMMIT_ID-based ordering of transactions
IB:
* removed CONCURR_TRX from VTQ;
* new fields in VTQ: COMMIT_ID, ISO_LEVEL.

SQL:
* renamed BEGIN_TS, COMMIT_TS to VTQ_BEGIN_TS, VTQ_COMMIT_TS;
* new functions: VTQ_COMMIT_ID, VTQ_ISO_LEVEL, VTQ_TRX_ID, VTQ_TRX_SEES, VTQ_TRX_SEES_EQ;
* versioned SELECT for IB uses VTQ_TRX_SEES, VTQ_TRX_SEES_EQ.

Closes #71
2017-05-05 20:36:17 +03:00
Kosov Eugene
20b2719f45 Misc: foreign check code cleanups 2017-05-05 20:36:16 +03:00
kevgs
a22cbc453f IB: (0.4) foreign keys for versioned tables (#58) 2017-05-05 20:36:15 +03:00
Aleksey Midenkov
5dea51657d IB: optimized update for non-versioned fields
Fixes #53
2017-05-05 20:36:14 +03:00
Aleksey Midenkov
5c4473dc74 IB: misc fix 2017-05-05 20:36:10 +03:00
Aleksey Midenkov
53a892fcfd IB: 0.2 part IV
* BEGIN_TS(), COMMIT_TS() SQL functions;
* VTQ instead of packed stores secs + usecs like my_timestamp_to_binary() does;
* versioned SELECT to IB is translated with COMMIT_TS();
* SQL fixes:
  - FOR_SYSTEM_TIME_UNSPECIFIED condition compares to TIMESTAMP_MAX_VALUE;
  - segfault fix #36: multiple execute of prepared stmt;
  - different tables to same stored procedure fix (#39)
* Fixes of previous parts: ON DUPLICATE KEY, other misc fixes.
2017-05-05 20:36:10 +03:00
Aleksey Midenkov
1ec7dbe176 IB: 0.2 part III
* versioned DML: INSERT, UPDATE, DELETE;
* general refactoring and fixes.

Warning: breaks 'insert' and 'update' tests since they require part IV.
2017-05-05 20:36:08 +03:00
Aleksey Midenkov
bdb12d1499 IB: 0.2 part II
* moved vers_notify_vtq() to commit phase;
* low_level insert (load test passed);
* rest of SYS_VTQ columns filled: COMMIT_TS, CONCURR_TRX;
* savepoints support;
* I_S.INNODB_SYS_VTQ adjustments:
  - limit to I_S_SYS_VTQ_LIMIT(10000) of most recent records;
  - CONCURR_TRX limit to I_S_MAX_CONCURR_TRX(100) with '...' truncation marker;
  - TIMESTAMP fields show fractions of seconds.
2017-05-05 20:36:08 +03:00
Aleksey Midenkov
84e1971128 IB: 0.2 part I
* SYS_VTQ internal InnoDB table;
* I_S.INNODB_SYS_VTQ table;
* vers_notify_vtq(): add record to SYS_VTQ on versioned DML;
* SYS_VTQ columns filled: TRX_ID, BEGIN_TS.
2017-05-05 20:36:07 +03:00
Alexander Barkov
ac53b49b1b Merge remote-tracking branch 'origin/10.2' into bb-10.2-ext 2017-05-05 16:12:54 +04:00
Marko Mäkelä
f9cc391863 Merge 10.1 into 10.2
This only merges MDEV-12253, adapting it to MDEV-12602 which is already
present in 10.2 but not yet in the 10.1 revision that is being merged.

TODO: Error handling in crash recovery needs to be improved.
If a page cannot be decrypted (or read), we should cleanly abort
the startup. If innodb_force_recovery is specified, we should
ignore the problematic page and apply redo log to other pages.
Currently, the test encryption.innodb-redo-badkey randomly fails
like this (the last messages are from cmake -DWITH_ASAN):

2017-05-05 10:19:40 140037071685504 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1635994
2017-05-05 10:19:40 140037071685504 [ERROR] InnoDB: Missing MLOG_FILE_NAME or MLOG_FILE_DELETE before MLOG_CHECKPOINT for tablespace 1
2017-05-05 10:19:40 140037071685504 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[2201] with error Data structure corruption
2017-05-05 10:19:41 140037071685504 [Note] InnoDB: Starting shutdown...
i=================================================================
==5226==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x612000018588 in thread T0
    #0 0x736750 in operator delete(void*) (/mariadb/server/build/sql/mysqld+0x736750)
    #1 0x1e4833f in LatchCounter::~LatchCounter() /mariadb/server/storage/innobase/include/sync0types.h:599:4
    #2 0x1e480b8 in LatchMeta<LatchCounter>::~LatchMeta() /mariadb/server/storage/innobase/include/sync0types.h:786:17
    #3 0x1e35509 in sync_latch_meta_destroy() /mariadb/server/storage/innobase/sync/sync0debug.cc:1622:3
    #4 0x1e35314 in sync_check_close() /mariadb/server/storage/innobase/sync/sync0debug.cc:1839:2
    #5 0x1dfdc18 in innodb_shutdown() /mariadb/server/storage/innobase/srv/srv0start.cc:2888:2
    #6 0x197e5e6 in innobase_init(void*) /mariadb/server/storage/innobase/handler/ha_innodb.cc:4475:3
2017-05-05 10:38:53 +03:00
Debarun Banerjee
4e41ac26f5 BUG#25082593 FOREIGN KEY VALIDATION DOESN'T NEED TO ACQUIRE GAP LOCK IN READ COMMITTED
Problem :
---------
This bug is filed from the base replication bug#25040331 where the
slave thread times out while INSERT operation waits on GAP lock taken
during Foreign Key validation.

The primary reason for the lock wait is because the statements are
getting replayed in different order. However, we also observed
two things ...

1. The slave thread could always use "Read Committed" isolation for
row level replication.

2. It is not necessary to have GAP locks in "READ Committed" isolation
level in innodb.

This bug is filed to address point(2) to avoid taking GAP locks during
Foreign Key validation.

Solution :
----------
Innodb is primarily designed for "Repeatable Read" and the GAP lock
behaviour is default. For "Read Committed" isolation, we have special
handling in row_search_mvcc to avoid taking the GAP lock while
scanning records.

While looking for Foreign Key, the code is following the default
behaviour taking GAP locks. The suggested fix is to avoid GAP
locking during FK validation similar to normal search operation
(row_search_mvcc) for "Read Committed" isolation level.

Reviewed-by: Sunny Bains <sunny.bains@oracle.com>

RB: 14526
2017-04-26 23:03:29 +03:00
Jan Lindström
765a43605a MDEV-12253: Buffer pool blocks are accessed after they have been freed
Problem was that bpage was referenced after it was already freed
from LRU. Fixed by adding a new variable encrypted that is
passed down to buf_page_check_corrupt() and used in
buf_page_get_gen() to stop processing page read.

This patch should also address following test failures and
bugs:

MDEV-12419: IMPORT should not look up tablespace in
PageConverter::validate(). This is now removed.

MDEV-10099: encryption.innodb_onlinealter_encryption fails
sporadically in buildbot

MDEV-11420: encryption.innodb_encryption-page-compression
failed in buildbot

MDEV-11222: encryption.encrypt_and_grep failed in buildbot on P8

Removed dict_table_t::is_encrypted and dict_table_t::ibd_file_missing
and replaced these with dict_table_t::file_unreadable. Table
ibd file is missing if fil_get_space(space_id) returns NULL
and encrypted if not. Removed dict_table_t::is_corrupted field.

Ported FilSpace class from 10.2 and using that on buf_page_check_corrupt(),
buf_page_decrypt_after_read(), buf_page_encrypt_before_write(),
buf_dblwr_process(), buf_read_page(), dict_stats_save_defrag_stats().

Added test cases when enrypted page could be read while doing
redo log crash recovery. Also added test case for row compressed
blobs.

btr_cur_open_at_index_side_func(),
btr_cur_open_at_rnd_pos_func(): Avoid referencing block that is
NULL.

buf_page_get_zip(): Issue error if page read fails.

buf_page_get_gen(): Use dberr_t for error detection and
do not reference bpage after we hare freed it.

buf_mark_space_corrupt(): remove bpage from LRU also when
it is encrypted.

buf_page_check_corrupt(): @return DB_SUCCESS if page has
been read and is not corrupted,
DB_PAGE_CORRUPTED if page based on checksum check is corrupted,
DB_DECRYPTION_FAILED if page post encryption checksum matches but
after decryption normal page checksum does not match. In read
case only DB_SUCCESS is possible.

buf_page_io_complete(): use dberr_t for error handling.

buf_flush_write_block_low(),
buf_read_ahead_random(),
buf_read_page_async(),
buf_read_ahead_linear(),
buf_read_ibuf_merge_pages(),
buf_read_recv_pages(),
fil_aio_wait():
        Issue error if page read fails.

btr_pcur_move_to_next_page(): Do not reference page if it is
NULL.

Introduced dict_table_t::is_readable() and dict_index_t::is_readable()
that will return true if tablespace exists and pages read from
tablespace are not corrupted or page decryption failed.
Removed buf_page_t::key_version. After page decryption the
key version is not removed from page frame. For unencrypted
pages, old key_version is removed at buf_page_encrypt_before_write()

dict_stats_update_transient_for_index(),
dict_stats_update_transient()
        Do not continue if table decryption failed or table
        is corrupted.

dict0stats.cc: Introduced a dict_stats_report_error function
to avoid code duplication.

fil_parse_write_crypt_data():
        Check that key read from redo log entry is found from
        encryption plugin and if it is not, refuse to start.

PageConverter::validate(): Removed access to fil_space_t as
tablespace is not available during import.

Fixed error code on innodb.innodb test.

Merged test cased innodb-bad-key-change5 and innodb-bad-key-shutdown
to innodb-bad-key-change2.  Removed innodb-bad-key-change5 test.
Decreased unnecessary complexity on some long lasting tests.

Removed fil_inc_pending_ops(), fil_decr_pending_ops(),
fil_get_first_space(), fil_get_next_space(),
fil_get_first_space_safe(), fil_get_next_space_safe()
functions.

fil_space_verify_crypt_checksum(): Fixed bug found using ASAN
where FIL_PAGE_END_LSN_OLD_CHECKSUM field was incorrectly
accessed from row compressed tables. Fixed out of page frame
bug for row compressed tables in
fil_space_verify_crypt_checksum() found using ASAN. Incorrect
function was called for compressed table.

Added new tests for discard, rename table and drop (we should allow them
even when page decryption fails). Alter table rename is not allowed.
Added test for restart with innodb-force-recovery=1 when page read on
redo-recovery cant be decrypted. Added test for corrupted table where
both page data and FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION is corrupted.

Adjusted the test case innodb_bug14147491 so that it does not anymore
expect crash. Instead table is just mostly not usable.

fil0fil.h: fil_space_acquire_low is not visible function
and fil_space_acquire and fil_space_acquire_silent are
inline functions. FilSpace class uses fil_space_acquire_low
directly.

recv_apply_hashed_log_recs() does not return anything.
2017-04-26 15:19:16 +03:00
Thirunarayanan Balathandayuthapani
f5759bd8d8 Bug #24347476 HIGH PRIORITY TRX FAILED TO KILL LOW PRIORITY TRX WHEN FOREIGN KEYS ARE INVOLVED
Problem:
=======
High priority transaction can't able to kill the blocking transaction
when foreign keys are involved. trx_kill_blocking() missing while checking
the foreign key constraint.

Fix:
===
Add trx_kill_blocking() while checking for the foreign key constraint.

Reviewed-by: Debarun Banerjee <debarun.banerjee@oracle.com>
RB: 13579
2017-04-24 15:29:12 +03:00
Marko Mäkelä
7c767a30a7 MDEV-10139 Support for InnoDB SEQUENCE objects
We introduce a NO_ROLLBACK flag for InnoDB tables. This flag only works
for tables that have a single index. Apart from undo logging, this flag
will also prevent locking and the assignment of DB_ROW_ID or DB_TRX_ID,
and imply READ UNCOMMITTED isolation. It is assumed that the SQL layer
is guaranteeing mutual exclusion.

After the initial insert of the single record during CREATE SEQUENCE,
InnoDB will be updating the single record in-place. This is crash-safe
thanks to the redo log. (That is, after a crash after CREATE SEQUENCE
was committed, the effect of sequence operations will be observable
fully or not at all.)

When it comes to the durability of the updates of SEQUENCE in
InnoDB, there is a clear analogy to MDEV-6076 Persistent AUTO_INCREMENT.
The updates would be made persistent by the InnoDB redo log flush
at transaction commit or rollback (or XA PREPARE), provided that
innodb_log_flush_at_trx_commit=1.

Similar to AUTO_INCREMENT, it is possible that the update of a SEQUENCE
in a middle of transaction becomes durable before the COMMIT/ROLLBACK of
the transaction, in case the InnoDB redo log is being flushed as a result
of the a commit or rollback of some other transaction, or as a result of
a redo log checkpoint that can be initiated at any time by operations that
are writing redo log.

dict_table_t::no_rollback(): Check if the table does not support rollback.

BTR_NO_ROLLBACK: Logging and locking flags for no_rollback() tables.

DICT_TF_BITS: Add the NO_ROLLBACK flag.

row_ins_step(): Assign 0 to DB_ROW_ID and DB_TRX_ID, and skip
any locking for no-rollback tables. There will be only a single row
in no-rollback tables (or there must be a proper PRIMARY KEY).

row_search_mvcc(): Execute the READ UNCOMMITTED code path for
no-rollback tables.

ha_innobase::external_lock(), ha_innobase::store_lock():
Block CREATE/DROP SEQUENCE in innodb_read_only mode.
This probably has no effect for CREATE SEQUENCE, because already
ha_innobase::create() should have been called (and refused)
before external_lock() or store_lock() is called.

ha_innobase::store_lock(): For CREATE SEQUENCE, do not acquire any
InnoDB locks, even though TL_WRITE is being requested. (This is just
a performance optimization.)

innobase_copy_frm_flags_from_create_info(), row_drop_table_for_mysql():
Disable persistent statistics for no_rollback tables.
2017-04-07 19:12:40 +04:00
Marko Mäkelä
97acc4a1c3 MDEV-12270 Port MySQL 8.0 Bug#21141390 REMOVE UNUSED FUNCTIONS AND CONVERT GLOBAL SYMBOLS TO STATIC
InnoDB defines some functions that are not called at all.
Other functions are called, but only from the same compilation unit.

Remove some function declarations and definitions, and add 'static'
keywords. Some symbols must be kept for separately compiled tools,
such as innochecksum.
2017-03-17 12:48:50 +02:00
Marko Mäkelä
4e1116b2c6 MDEV-12271 Port MySQL 8.0 Bug#23150562 REMOVE UNIV_MUST_NOT_INLINE AND UNIV_NONINL
Also, remove empty .ic files that were not removed by my MySQL commit.

Problem:
InnoDB used to support a compilation mode that allowed to choose
whether the function definitions in .ic files are to be inlined or not.
This stopped making sense when InnoDB moved to C++ in MySQL 5.6
(and ha_innodb.cc started to #include .ic files), and more so in
MySQL 5.7 when inline methods and functions were introduced
in .h files.

Solution:
Remove all references to UNIV_NONINL and UNIV_MUST_NOT_INLINE from
all files, assuming that the symbols are never defined.
Remove the files fut0fut.cc and ut0byte.cc which only mattered when
UNIV_NONINL was defined.
2017-03-17 12:42:07 +02:00
Marko Mäkelä
5ff6694d70 enum btr_latch_mode: Incorporate some flags.
This fixes some GCC 6.3.0 warnings and makes the code a little
more debugging-friendly.
2017-03-09 10:30:36 +02:00
Marko Mäkelä
89d80c1b0b Fix many -Wconversion warnings.
Define my_thread_id as an unsigned type, to avoid mismatch with
ulonglong.  Change some parameters to this type.

Use size_t in a few more places.

Declare many flag constants as unsigned to avoid sign mismatch
when shifting bits or applying the unary ~ operator.

When applying the unary ~ operator to enum constants, explictly
cast the result to an unsigned type, because enum constants can
be treated as signed.

In InnoDB, change the source code line number parameters from
ulint to unsigned type. Also, make some InnoDB functions return
a narrower type (unsigned or uint32_t instead of ulint;
bool instead of ibool).
2017-03-07 19:07:27 +02:00
Marko Mäkelä
8777458a6e MDEV-6076 Persistent AUTO_INCREMENT for InnoDB
This should be functionally equivalent to WL#6204 in MySQL 8.0.0, with
the notable difference that the file format changes are limited to
repurposing a previously unused data field in B-tree pages.

For persistent InnoDB tables, write the last used AUTO_INCREMENT
value to the root page of the clustered index, in the previously
unused (0) PAGE_MAX_TRX_ID field, now aliased as PAGE_ROOT_AUTO_INC.
Unlike some other previously unused InnoDB data fields, this one was
actually always zero-initialized, at least since MySQL 3.23.49.

The writes to PAGE_ROOT_AUTO_INC are protected by SX or X latch on the
root page. The SX latch will allow concurrent read access to the root
page. (The field PAGE_ROOT_AUTO_INC will only be read on the
first-time call to ha_innobase::open() from the SQL layer. The
PAGE_ROOT_AUTO_INC can only be updated when executing SQL, so
read/write races are not possible.)

During INSERT, the PAGE_ROOT_AUTO_INC is updated by the low-level
function btr_cur_search_to_nth_level(), adding no extra page
access. [Adaptive hash index lookup will be disabled during INSERT.]

If some rare UPDATE modifies an AUTO_INCREMENT column, the
PAGE_ROOT_AUTO_INC will be adjusted in a separate mini-transaction in
ha_innobase::update_row().

When a page is reorganized, we have to preserve the PAGE_ROOT_AUTO_INC
field.

During ALTER TABLE, the initial AUTO_INCREMENT value will be copied
from the table. ALGORITHM=COPY and online log apply in LOCK=NONE will
update PAGE_ROOT_AUTO_INC in real time.

innodb_col_no(): Determine the dict_table_t::cols[] element index
corresponding to a Field of a non-virtual column.
(The MySQL 5.7 implementation of virtual columns breaks the 1:1
relationship between Field::field_index and dict_table_t::cols[].
Virtual columns are omitted from dict_table_t::cols[]. Therefore,
we must translate the field_index of AUTO_INCREMENT columns into
an index of dict_table_t::cols[].)

Upgrade from old data files:

By default, the AUTO_INCREMENT sequence in old data files would appear
to be reset, because PAGE_MAX_TRX_ID or PAGE_ROOT_AUTO_INC would contain
the value 0 in each clustered index page. In new data files,
PAGE_ROOT_AUTO_INC can only be 0 if the table is empty or does not contain
any AUTO_INCREMENT column.

For backward compatibility, we use the old method of
SELECT MAX(auto_increment_column) for initializing the sequence.

btr_read_autoinc(): Read the AUTO_INCREMENT sequence from a new-format
data file.

btr_read_autoinc_with_fallback(): A variant of btr_read_autoinc()
that will resort to reading MAX(auto_increment_column) for data files
that did not use AUTO_INCREMENT yet. It was manually tested that during
the execution of innodb.autoinc_persist the compatibility logic is
not activated (for new files, PAGE_ROOT_AUTO_INC is never 0 in nonempty
clustered index root pages).

initialize_auto_increment(): Replaces
ha_innobase::innobase_initialize_autoinc(). This initializes
the AUTO_INCREMENT metadata. Only called from ha_innobase::open().

ha_innobase::info_low(): Do not try to lazily initialize
dict_table_t::autoinc. It must already have been initialized by
ha_innobase::open() or ha_innobase::create().

Note: The adjustments to class ha_innopart were not tested, because
the source code (native InnoDB partitioning) is not being compiled.
2016-12-16 09:19:19 +02:00
Sergei Golubchik
1cae1af6f9 MDEV-5800 InnoDB support for indexed vcols
* remove old 5.2+ InnoDB support for virtual columns
  * enable corresponding parts of the innodb-5.7 sources
  * copy corresponding test cases from 5.7
  * copy detailed Alter_inplace_info::HA_ALTER_FLAGS flags from 5.7
     - and more detailed detection of changes in fill_alter_inplace_info()
  * more "innodb compatibility hooks" in sql_class.cc to
     - create/destroy/reset a THD (used by background purge threads)
     - find a prelocked table by name
     - open a table (from a background purge thread)

  * different from 5.7:
    - new service thread "thd_destructor_proxy" to make sure all THDs are
      destroyed at the correct point in time during the server shutdown
    - proper opening/closing of tables for vcol evaluations in
       + FK checks (use already opened prelocked tables)
       + purge threads (open the table, MDLock it, add it to tdc, close
         when not needed)
    - cache open tables in vc_templ
    - avoid unnecessary allocations, reuse table->record[0] and table->s->default_values
    - not needed in 5.7, because it overcalculates:
      + tell the server to calculate vcols for an on-going inline ADD INDEX
      + calculate vcols for correct error messages

  * update other engines (mroonga/tokudb) accordingly
2016-12-12 20:27:42 +01:00
Marko Mäkelä
c868acdf65 MDEV-11487 Revert InnoDB internal temporary tables from WL#7682
WL#7682 in MySQL 5.7 introduced the possibility to create light-weight
temporary tables in InnoDB. These are called 'intrinsic temporary tables'
in InnoDB, and in MySQL 5.7, they can be created by the optimizer for
sorting or buffering data in query processing.

In MariaDB 10.2, the optimizer temporary tables cannot be created in
InnoDB, so we should remove the dead code and related data structures.
2016-12-09 12:05:07 +02:00
Sergey Vojtovich
f4d885c4e9 MDEV-10813 - Clean-up InnoDB atomics, memory barriers and mutexes
Replaced InnoDB atomic operations with server atomic operations.

Moved INNODB_RW_LOCKS_USE_ATOMICS - it is always defined (code won't compile
otherwise).

NOTE: InnoDB uses thread identifiers as a target for atomic operations.
Thread identifiers should be considered opaque: any attempt to use a
thread ID other than in pthreads calls is nonportable and can lead to
unspecified results.
2016-10-17 18:35:48 +04:00
Jan Lindström
fec844aca8 Merge InnoDB 5.7 from mysql-5.7.14.
Contains also:
       MDEV-10549 mysqld: sql/handler.cc:2692: int handler::ha_index_first(uchar*): Assertion `table_share->tmp_table != NO_TMP_TABLE || m_lock_type != 2' failed. (branch bb-10.2-jan)
       Unlike MySQL, InnoDB still uses THR_LOCK in MariaDB

       MDEV-10548 Some of the debug sync waits do not work with InnoDB 5.7 (branch bb-10.2-jan)
       enable tests that were fixed in MDEV-10549

       MDEV-10548 Some of the debug sync waits do not work with InnoDB 5.7 (branch bb-10.2-jan)
       fix main.innodb_mysql_sync - re-enable online alter for partitioned innodb tables
2016-09-08 15:49:03 +03:00