1
0
mirror of https://github.com/MariaDB/server.git synced 2025-08-30 11:22:14 +03:00
Commit Graph

11 Commits

Author SHA1 Message Date
Marko Mäkelä
f77329ace9 Bug#13721257 RACE CONDITION IN UPDATES OR INSERTS OF WIDE RECORDS
This bug was originally filed and fixed as Bug#12612184. The original
fix was buggy, and it was patched by Bug#12704861. Also that patch was
buggy (potentially breaking crash recovery), and both fixes were
reverted.

This fix was not ported to the built-in InnoDB of MySQL 5.1, because
the function signatures of many core functions are different from
InnoDB Plugin and later versions. The block allocation routines and
their callers would have to changed so that they handle block
descriptors instead of page frames.

When a record is updated so that its size grows, non-updated columns
can be selected for external (off-page) storage. The bug is that the
initially inserted updated record contains an all-zero BLOB pointer to
the field that was not updated. Only after the BLOB pages have been
allocated and written, the valid pointer can be written to the record.

Between the release of the page latch in mtr_commit(mtr) after
btr_cur_pessimistic_update() and the re-latching of the page in
btr_pcur_restore_position(), other threads can see the invalid BLOB
pointer consisting of 20 zero bytes. Moreover, if the system crashes
at this point, the situation could persist after crash recovery, and
the contents of the non-updated column would be permanently lost.

The problem is amplified by the ROW_FORMAT=DYNAMIC and
ROW_FORMAT=COMPRESSED that were introduced in
innodb_file_format=barracuda in InnoDB Plugin, but the bug does exist
in all InnoDB versions.

The fix is as follows. After a pessimistic B-tree operation that needs
to write out off-page columns, allocate the pages for these columns in
the mini-transaction that performed the B-tree operation (btr_mtr),
but write the pages in a separate mini-transaction (blob_mtr). Do
mtr_commit(blob_mtr) before mtr_commit(btr_mtr). A quirk: Do not reuse
pages that were previously freed in btr_mtr. Only write the off-page
columns to 'fresh' pages.

In this way, crash recovery will see redo log entries for blob_mtr
before any redo log entry for btr_mtr. It will apply the BLOB page
writes to pages that were marked free at that point. If crash recovery
fails to see all of the btr_mtr redo log, there will be some
unreachable BLOB data in free pages, but the B-tree will be in a
consistent state.

btr_page_alloc_low(): Renamed from btr_page_alloc(). Add the parameter
init_mtr. Return an allocated block, or NULL. If init_mtr!=mtr but
the page was already X-latched in mtr, do not initialize the page.

btr_page_alloc(): Wrapper for btr_page_alloc_for_ibuf() and
btr_page_alloc_low().

btr_page_free(): Add a debug assertion that the page was a B-tree page.

btr_lift_page_up(): Return the father block.

btr_compress(), btr_cur_compress_if_useful(): Add the parameter ibool
adjust, for adjusting the cursor position.

btr_cur_pessimistic_update(): Preserve the cursor position when
big_rec will be written and the new flag BTR_KEEP_POS_FLAG is defined.
Remove a duplicate rec_get_offsets() call. Keep the X-latch on
index->lock when big_rec is needed.

btr_store_big_rec_extern_fields(): Replace update_inplace with
an operation code, and local_mtr with btr_mtr. When not doing a
fresh insert and btr_mtr has freed pages, put aside any pages that
were previously X-latched in btr_mtr, and free the pages after
writing out all data. The data must be written to 'fresh' pages,
because btr_mtr will be committed and written to the redo log after
the BLOB writes have been written to the redo log.

btr_blob_op_is_update(): Check if an operation passed to
btr_store_big_rec_extern_fields() is an update or insert-by-update.

fseg_alloc_free_page_low(), fsp_alloc_free_page(),
fseg_alloc_free_extent(), fseg_alloc_free_page_general(): Add the
parameter init_mtr. Return an allocated block, or NULL. If
init_mtr!=mtr but the page was already X-latched in mtr, do not
initialize the page.

xdes_get_descriptor_with_space_hdr(): Assert that the file space
header is being X-latched.

fsp_alloc_from_free_frag(): Refactored from fsp_alloc_free_page().

fsp_page_create(): New function, for allocating, X-latching and
potentially initializing a page. If init_mtr!=mtr but the page was
already X-latched in mtr, do not initialize the page.

fsp_free_page(): Add ut_ad(0) to the error outcomes.

fsp_free_page(), fseg_free_page_low(): Increment mtr->n_freed_pages.

fsp_alloc_seg_inode_page(), fseg_create_general(): Assert that the
page was not previously X-latched in the mini-transaction. A file
segment or inode page should never be allocated in the middle of an
mini-transaction that frees pages, such as btr_cur_pessimistic_delete().

fseg_alloc_free_page_low(): If the hinted page was allocated, skip the
check if the tablespace should be extended. Return NULL instead of
FIL_NULL on failure. Remove the flag frag_page_allocated. Instead,
return directly, because the page would already have been initialized.

fseg_find_free_frag_page_slot() would return ULINT_UNDEFINED on error,
not FIL_NULL. Correct a bogus assertion.

fseg_alloc_free_page(): Redefine as a wrapper macro around
fseg_alloc_free_page_general().

buf_block_buf_fix_inc(): Move the definition from the buf0buf.ic to
buf0buf.h, so that it can be called from other modules.

mtr_t: Add n_freed_pages (number of pages that have been freed).

page_rec_get_nth_const(), page_rec_get_nth(): The inverse function of
page_rec_get_n_recs_before(), get the nth record of the record
list. This is faster than iterating the linked list. Refactored from
page_get_middle_rec().

trx_undo_rec_copy(): Add a debug assertion for the length.

trx_undo_add_page(): Return a block descriptor or NULL instead of a
page number or FIL_NULL.

trx_undo_report_row_operation(): Add debug assertions.

trx_sys_create_doublewrite_buf(): Assert that each page was not
previously X-latched.

page_cur_insert_rec_zip_reorg(): Make use of page_rec_get_nth().

row_ins_clust_index_entry_by_modify(): Pass BTR_KEEP_POS_FLAG, so that
the repositioning of the cursor can be avoided.

row_ins_index_entry_low(): Add DEBUG_SYNC points before and after
writing off-page columns. If inserting by updating a delete-marked
record, do not reposition the cursor or commit the mini-transaction
before writing the off-page columns.

row_build(): Tighten a debug assertion about null BLOB pointers.

row_upd_clust_rec(): Add DEBUG_SYNC points before and after writing
off-page columns. Do not reposition the cursor or commit the
mini-transaction before writing the off-page columns.

rb:939 approved by Jimmy Yang
2012-02-17 11:42:04 +02:00
Yasufumi Kinoshita
115f5e8551 Bug#12400341 INNODB CAN LEAVE ORPHAN IBD FILES AROUND
If we meet DB_TOO_MANY_CONCURRENT_TRXS during the execution tab_create_graph from row_create_table_for_mysql(), .ibd file for the table should be created already but was not deleted for the error handling.

rb:875 approved by Jimmy Yang
2012-01-10 14:18:58 +09:00
Marko Mäkelä
0ff2a182b6 Bug #11766513 - 59641: Prepared XA transaction in system after hard crash
causes future shutdown hang

InnoDB would hang on shutdown if any XA transactions exist in the
system in the PREPARED state. This has been masked by the fact that
MySQL would roll back any PREPARED transaction on shutdown, in the
spirit of Bug #12161 Xa recovery and client disconnection.

[mysql-test-run] do_shutdown_server: Interpret --shutdown_server 0 as
a request to kill the server immediately without initiating a
shutdown procedure.

xid_cache_insert(): Initialize XID_STATE::rm_error in order to avoid a
bogus error message on XA ROLLBACK of a recovered PREPARED transaction.

innobase_commit_by_xid(), innobase_rollback_by_xid(): Free the InnoDB
transaction object after rolling back a PREPARED transaction.

trx_get_trx_by_xid(): Only consider transactions whose
trx->is_prepared flag is set. The MySQL layer seems to prevent
attempts to roll back connected transactions that are in the PREPARED
state from another connection, but it is better to play it safe. The
is_prepared flag was introduced in the InnoDB Plugin.

trx_n_prepared: A new counter, counting the number of InnoDB
transactions in the PREPARED state.

logs_empty_and_mark_files_at_shutdown(): On shutdown, allow
trx_n_prepared transactions to exist in the system.

trx_undo_free_prepared(), trx_free_prepared(): New functions, to free
the memory objects of PREPARED transactions on shutdown. This is not
needed in the built-in InnoDB, because it would collect all allocated
memory on shutdown. The InnoDB Plugin needs this because of
innodb_use_sys_malloc.

trx_sys_close(): Invoke trx_free_prepared() on all remaining
transactions.
2011-04-07 21:12:54 +03:00
Vasil Dimov
844a8098aa Fix typo, should be UNIV_SYNC_DEBUG. 2010-09-15 19:58:36 +03:00
Vasil Dimov
c8a205c68b (partially) Fix Bug#55227 Fix compiler warnings in innodb with gcc 4.6
Fix compiler warning:
trx/trx0sys.c: In function 'trx_sys_create_doublewrite_buf':
trx/trx0sys.c:244:15: error: variable 'new_block' set but not used [-Werror=unused-but-set-variable]
2010-09-15 19:47:35 +03:00
Sergey Vojtovich
0aa607e414 Applying InnoDB snapshot
Detailed revision comments:

r6891 | vdimov | 2010-03-26 16:19:01 +0200 (Fri, 26 Mar 2010) | 5 lines
Non-functional change: update copyright year to 2010 of the files
that have been modified after 2010-01-01 according to svn.

for f in $(svn log -v -r{2010-01-01}:HEAD |grep "^   M " |cut -b 16- |sort -u) ; do sed -i "" -E 's/(Copyright \(c\) [0-9]{4},) [0-9]{4}, (.*Innobase Oy.+All Rights Reserved)/\1 2010, \2/' $f ; done
2010-04-01 17:01:22 +04:00
Sergey Vojtovich
2b38090990 Applying InnoDB snapshot
Detailed revision comments:

r6792 | marko | 2010-03-10 13:56:41 +0200 (Wed, 10 Mar 2010) | 1 line
branches/zip: Copy tests from branches/5.1 that were lost in some merge.
r6793 | marko | 2010-03-10 14:02:19 +0200 (Wed, 10 Mar 2010) | 60 lines
branches/zip: Merge revisions 6669:6788 from branches/5.1:

  ------------------------------------------------------------------------
  r6774 | calvin | 2010-03-03 23:56:10 +0200 (Wed, 03 Mar 2010) | 2 lines
  Changed paths:
     M /branches/5.1/trx/trx0sys.c

  branches/5.1: fix bug#51653: outdated reference to set-variable
  Non functional change.
  ------------------------------------------------------------------------
  r6780 | vasil | 2010-03-08 19:13:20 +0200 (Mon, 08 Mar 2010) | 4 lines
  Changed paths:
     M /branches/5.1/plug.in

  branches/5.1:

  Whitespace fixup.
  ------------------------------------------------------------------------
  r6783 | jyang | 2010-03-09 17:54:14 +0200 (Tue, 09 Mar 2010) | 9 lines
  Changed paths:
     M /branches/5.1/handler/ha_innodb.cc
     M /branches/5.1/mysql-test/innodb_bug21704.result
     A /branches/5.1/mysql-test/innodb_bug47621.result
     A /branches/5.1/mysql-test/innodb_bug47621.test

  branches/5.1: Fix bug #47621 "MySQL and InnoDB data dictionaries
  will become out of sync when renaming columns". MySQL does not
  provide new column name information to storage engine to
  update the system table. To avoid column name mismatch, we shall
  just request a table copy for now.

  rb://246 approved by Marko.
  ------------------------------------------------------------------------
  r6785 | vasil | 2010-03-10 09:04:38 +0200 (Wed, 10 Mar 2010) | 11 lines
  Changed paths:
     M /branches/5.1/mysql-test/innodb_bug38231.test

  branches/5.1:

  Add the missing --reap statements in innodb_bug38231.test. Probably MySQL
  enforced the presence of those recently and the test started failing like:

    main.innodb_bug38231                     [ fail ]
            Test ended at 2010-03-10 08:48:32

    CURRENT_TEST: main.innodb_bug38231
    mysqltest: At line 49: Cannot run query on connection between send and reap
  ------------------------------------------------------------------------
  r6788 | vasil | 2010-03-10 10:53:21 +0200 (Wed, 10 Mar 2010) | 8 lines
  Changed paths:
     M /branches/5.1/mysql-test/innodb_bug38231.test

  branches/5.1:

  In innodb_bug38231.test: replace the fragile sleep 0.2 that depends on timing
  with a more robust condition which waits for the TRUNCATE and LOCK commands
  to appear in information_schema.processlist. This could also break if there
  are other sessions executing the same SQL commands, but there are none during
  the execution of the mysql test.
  ------------------------------------------------------------------------
2010-04-01 16:20:37 +04:00
Sergey Vojtovich
eb27168c8a Applying InnoDB snapshot
Detailed revision comments:

r6275 | pekka | 2009-12-03 18:32:47 +0200 (Thu, 03 Dec 2009) | 10 lines
branches/zip: Minor changes which allow build with UNIV_HOTBACKUP
defined to succeed:

include/trx0sys.h: Allow Hot Backup build to see some
                   TRX_SYS_DOUBLEWRITE_... macros. 
trx/trx0sys.c:     Exclude trx_sys_close() function from Hot Backup build.
log/log0recv.[ch]: Exclude recv_sys_var_init() function from Hot Backup build.

This change should not affect !UNIV_HOTBACKUP build.
2010-04-01 15:01:13 +04:00
Satya B
a1092e9b66 Applying InnoDB Plugin 1.0.6 snapshot,part 1. Fixes BUG#45992 and BUG#46656
Detailed revision comments:

r6130 | marko | 2009-11-02 11:42:56 +0200 (Mon, 02 Nov 2009) | 9 lines
branches/zip: Free all resources at shutdown. Set pointers to NULL, so
that Valgrind will not complain about freed data structures that are
reachable via pointers.  This addresses Bug #45992 and Bug #46656.

This patch is mostly based on changes copied from branches/embedded-1.0,
mainly c5432, c3439, c3134, c2994, c2978, but also some other code was
copied.  Some added cleanup code is specific to MySQL/InnoDB.

rb://199 approved by Sunny Bains
2009-11-30 17:02:05 +05:30
Sergey Vojtovich
bae6276d45 Update to innoplug-1.0.4. 2009-07-30 17:42:56 +05:00
Satya B
3945d5e554 Adding innodb_plugin-1.0.4 as storage/innodb_plugin. 2009-05-27 15:15:59 +05:30