columns again
This is a follow-up to Bug #54358. Not all occurrences of the bug were fixed.
We need to check all calls to btr_copy_externally_stored_field_prefix_low()
and do the right thing when the pointer to the off-page column is null
(full of zero bytes).
It turns out that only the call to btr_copy_externally_stored_field_prefix()
in row_sel_sec_rec_is_for_blob() needs to be changed.
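The null-pointer check described above can be sketched as follows. This is a minimal illustration, not the InnoDB code: FIELD_REF_SIZE (assumed to be 20 bytes), field_ref_is_null() and copy_prefix_or_zero() are hypothetical names, and the off-page data is modelled as a plain byte array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed size of an off-page column reference, in bytes. */
#define FIELD_REF_SIZE 20

/* Returns nonzero if the reference is "null", that is, full of zero bytes. */
int field_ref_is_null(const unsigned char *ref)
{
	static const unsigned char zero[FIELD_REF_SIZE]; /* all zero bytes */

	return memcmp(ref, zero, FIELD_REF_SIZE) == 0;
}

/* Hypothetical prefix copy: returns the number of bytes copied to buf,
or 0 if the off-page column has not been written yet. */
size_t copy_prefix_or_zero(unsigned char *buf, size_t buf_len,
			   const unsigned char *ref,
			   const unsigned char *blob, size_t blob_len)
{
	size_t len;

	assert(buf != NULL);

	if (field_ref_is_null(ref)) {
		/* The caller must treat 0 as "column not available"
		instead of reading through the reference. */
		return 0;
	}

	len = blob_len < buf_len ? blob_len : buf_len;
	memcpy(buf, blob, len);
	return len;
}
```

A caller that compares a secondary index prefix against the off-page column would check for a zero return value before comparing anything.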
For fetching complete off-page columns rather than prefixes, the function
btr_rec_copy_externally_stored_field() already checks if the pointer
is null (all-zero). Two of its callers (row_merge_copy_blobs() and
row_sel_fetch_columns()) are never executed as READ UNCOMMITTED and can
rightfully assert that the fetch succeeded. The third caller,
row_sel_store_mysql_rec(), already does the right thing.
The calls in row_upd_ext_fetch() and trx_undo_page_fetch_ext() must
expect that the off-page column exists. Update and rollback are
locking operations, never READ UNCOMMITTED.
row_search_for_mysql(): When a secondary index record might not be
visible in the current transaction's read view and we consult the
clustered index and optionally some undo log records, return the
relevant columns of the clustered index record to MySQL instead of the
secondary index record.
ibuf_insert_to_index_page_low(): New function, refactored from
ibuf_insert_to_index_page().
ibuf_insert_to_index_page(): When we are inserting a record in place
of a delete-marked record and some fields of the record differ, update
that record just like row_ins_sec_index_entry_by_modify() would do.
btr_cur_update_alloc_zip(): Make the function public.
mysql_row_templ_t: Add clust_rec_field_no.
row_sel_store_mysql_rec(), row_sel_push_cache_row_for_mysql(): Add the
flag rec_clust, for returning data at clust_rec_field_no instead of
rec_field_no. Resurrect the debug assertion that the record not be
marked for deletion. (Bug #55626)
[UNIV_DEBUG || UNIV_IBUF_DEBUG] ibuf_debug, buf_page_get_gen(),
buf_flush_page_try():
Implement innodb_change_buffering_debug=1 for evicting pages from the
buffer pool, so that change buffering will be attempted more
frequently.
This is a regression from the fix for bug no 38999. A storage engine capable
of reading only a subset of a table's columns updates corresponding bits in
the read buffer to signal that it has read NULL values for the corresponding
columns. It cannot, and should not, update any other bits. Bug no 38999
occurred because the implementation of UPDATE statements compares the NULL bits
using memcmp, inadvertently comparing bits that were never requested from the
storage engine. The regression was caused by the storage engine trying to
alleviate the situation by writing to all NULL bits, even those that it had no
knowledge of. This has devastating effects for the index merge algorithm,
which relies on all NULL bits, except those explicitly requested, being left
unchanged.
The fix reverts the fix for bug no 38999 in both InnoDB and InnoDB plugin and
changes the server's method of comparing records. For engines that always read
entire rows, we proceed as usual. For engines capable of reading only select
columns, the record buffers are now compared on a column by column basis. An
assertion was also added so that non-comparable buffers are never read. Some
relevant copy-pasted code was also consolidated in a new function.
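The difference between memcmp and a column-by-column comparison can be sketched as follows. This is an illustration under assumptions, not the server code: struct col_def, records_differ() and the bitmap layout of read_set are hypothetical, and all columns are treated as fixed-width.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one fixed-width column. */
struct col_def {
	size_t		offset;		/* offset of the value in the record */
	size_t		len;		/* length of the value in bytes */
	size_t		null_byte;	/* byte that holds the NULL bit */
	unsigned char	null_mask;	/* the NULL bit within that byte */
};

/* Returns 1 if records a and b differ in any column that is set in
read_set (one bit per column), else 0. Columns outside read_set,
including their NULL bits, are never inspected. */
int records_differ(const unsigned char *a, const unsigned char *b,
		   const struct col_def *cols, size_t n_cols,
		   const unsigned char *read_set)
{
	size_t i;

	for (i = 0; i < n_cols; i++) {
		const struct col_def *c = &cols[i];
		int a_null;
		int b_null;

		if (!(read_set[i / 8] & (1u << (i % 8)))) {
			continue; /* column was not requested */
		}

		a_null = (a[c->null_byte] & c->null_mask) != 0;
		b_null = (b[c->null_byte] & c->null_mask) != 0;

		if (a_null != b_null) {
			return 1;
		}

		if (!a_null
		    && memcmp(a + c->offset, b + c->offset, c->len) != 0) {
			return 1;
		}
	}

	return 0;
}
```

With two buffers that differ only in an unrequested column, memcmp over the whole record reports a difference while records_differ() does not.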
Remove a bogus debug assertion that triggered the bug.
Add assertions precisely where records must not be delete-marked.
Add a comment to clarify when the record is allowed to be delete-marked.
(READ UNCOMMITTED access failure of off-page DYNAMIC or COMPRESSED columns).
Records that lack incompletely written externally stored columns may
be accessed by a READ UNCOMMITTED transaction even without involving a
crash during an INSERT or UPDATE operation. I verified this as follows.
(1) added a delay after the mini-transaction for writing the clustered
index 'stub' record was committed (patch attached)
(2) started mysqld in gdb, setting breakpoints where the
assertions about READ UNCOMMITTED were added in the bug fix
(3) invoked ibtest3 --create-options=key_block_size=2
to create BLOBs in a COMPRESSED table
(4) invoked the following:
yes 'set transaction isolation level read uncommitted;
checksum table blobt3;select sleep(1);'|mysql -uroot test
(5) noted that one of the breakpoints was triggered
(return(NULL) in btr_rec_copy_externally_stored_field())
=== modified file 'storage/innodb_plugin/row/row0ins.c'
--- storage/innodb_plugin/row/row0ins.c 2010-06-30 08:17:25 +0000
+++ storage/innodb_plugin/row/row0ins.c 2010-06-30 08:17:25 +0000
@@ -2120,6 +2120,7 @@ function_exit:
rec_t* rec;
ulint* offsets;
mtr_start(&mtr);
+ os_thread_sleep(5000000);
btr_cur_search_to_nth_level(index, 0, entry, PAGE_CUR_LE,
BTR_MODIFY_TREE, &cursor, 0,
=== modified file 'storage/innodb_plugin/row/row0upd.c'
--- storage/innodb_plugin/row/row0upd.c 2010-06-30 08:11:55 +0000
+++ storage/innodb_plugin/row/row0upd.c 2010-06-30 08:11:55 +0000
@@ -1763,6 +1763,7 @@ row_upd_clust_rec(
rec_offs_init(offsets_);
mtr_start(mtr);
+ os_thread_sleep(5000000);
ut_a(btr_pcur_restore_position(BTR_MODIFY_TREE, pcur, mtr));
rec = btr_cur_get_rec(btr_cur);
columns
When the server crashes after a record stub has been inserted and
before all its off-page columns have been written, the record will
contain incomplete off-page columns after crash recovery. Such records
may only be accessed at the READ UNCOMMITTED isolation level or when
rolling back a recovered transaction in recv_recovery_rollback_active().
Skip these records at the READ UNCOMMITTED isolation level.
TODO: Add assertions for checking the above assumptions hold when an
incomplete BLOB is encountered.
btr_rec_copy_externally_stored_field(): Return NULL if the field is
incomplete.
row_prebuilt_t::templ_contains_blob: Clarify what "BLOB" means in this
context. Hint: MySQL BLOBs are not the same as InnoDB BLOBs.
row_sel_store_mysql_rec(): Return FALSE if not all columns could be
retrieved. Previously this function always returned TRUE. Assert that
the record is not delete-marked.
row_sel_push_cache_row_for_mysql(): Return FALSE if not all columns
could be retrieved.
row_search_for_mysql(): Skip records containing incomplete off-page
columns. Assert that the transaction isolation level is READ
UNCOMMITTED.
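The skipping behaviour can be sketched with a toy scan loop. All names here (struct row, store_row(), scan()) are hypothetical, and an unwritten off-page column is modelled as a NULL pointer. The sketch only shows the control flow: the store function reports failure and the search loop skips the record.

```c
#include <assert.h>
#include <stddef.h>

enum isolation { READ_UNCOMMITTED, READ_COMMITTED };

/* Toy record: blob == NULL models an incompletely written off-page column. */
struct row {
	int		id;
	const char*	blob;
};

/* Returns 1 if all columns could be retrieved, 0 otherwise. */
int store_row(const struct row *r, const char **out)
{
	if (r->blob == NULL) {
		return 0;
	}

	*out = r->blob;
	return 1;
}

/* Copies the retrievable rows to out[] and returns their count. */
size_t scan(const struct row *rows, size_t n, enum isolation iso,
	    const char **out)
{
	size_t i;
	size_t n_out = 0;

	for (i = 0; i < n; i++) {
		if (!store_row(&rows[i], &out[n_out])) {
			/* Incomplete records are only expected at
			READ UNCOMMITTED; skip the record. */
			assert(iso == READ_UNCOMMITTED);
			continue;
		}

		n_out++;
	}

	return n_out;
}
```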
rb://380 approved by Jimmy Yang
(InnoDB plugin branch)
mysql-test/suite/innodb_plugin/r/innodb_mysql.result:
test case
mysql-test/suite/innodb_plugin/t/innodb_mysql.test:
test case
storage/innodb_plugin/row/row0sel.c:
Initialize the null bytes with default values, because they might be
left uninitialized in some cases. These uninitialized bytes
might be copied into the MySQL record buffer, which leads to
Valgrind warnings on the next use of the buffer.
In semi-consistent read, only unlock freshly locked non-matching records.
lock_rec_lock_fast(): Return LOCK_REC_SUCCESS,
LOCK_REC_SUCCESS_CREATED, or LOCK_REC_FAIL instead of TRUE/FALSE.
enum db_err: Add DB_SUCCESS_LOCKED_REC for indicating a successful
operation where a record lock was created.
lock_sec_rec_read_check_and_lock(),
lock_clust_rec_read_check_and_lock(), lock_rec_enqueue_waiting(),
lock_rec_lock_slow(), lock_rec_lock(), row_ins_set_shared_rec_lock(),
row_ins_set_exclusive_rec_lock(), sel_set_rec_lock(),
row_sel_get_clust_rec_for_mysql(): Return DB_SUCCESS_LOCKED_REC if a
new record lock was created. Adjust callers.
row_unlock_for_mysql(): Correct the function documentation.
row_prebuilt_t::new_rec_locks: Correct the documentation.
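Why the three-state return value matters can be sketched as follows. The enum values come from the description above; the type name lock_rec_status, the owner[] array used as a lock table, and semi_consistent_visit() are hypothetical simplifications.

```c
#include <assert.h>

enum lock_rec_status {
	LOCK_REC_FAIL,			/* conflicting lock exists */
	LOCK_REC_SUCCESS,		/* lock was already held */
	LOCK_REC_SUCCESS_CREATED	/* a new lock was created */
};

/* Toy lock table: owner[rec] is 0 if free, else the owning trx id. */
enum lock_rec_status lock_rec_fast(int *owner, int trx, int rec)
{
	assert(trx != 0);

	if (owner[rec] == trx) {
		return LOCK_REC_SUCCESS;
	}

	if (owner[rec] == 0) {
		owner[rec] = trx;
		return LOCK_REC_SUCCESS_CREATED;
	}

	return LOCK_REC_FAIL;
}

/* Visits one record. Returns -1 on lock conflict, else "matches".
A non-matching record is unlocked only if this call created the lock. */
int semi_consistent_visit(int *owner, int trx, int rec, int matches)
{
	enum lock_rec_status st = lock_rec_fast(owner, trx, rec);

	if (st == LOCK_REC_FAIL) {
		return -1;
	}

	if (!matches && st == LOCK_REC_SUCCESS_CREATED) {
		owner[rec] = 0; /* release the freshly created lock */
	}

	return matches;
}
```

With a plain TRUE/FALSE result the caller could not tell a pre-existing lock from a new one, and would risk releasing a lock that an earlier statement of the same transaction still needs.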
Detailed revision comments:
r6750 | marko | 2010-02-22 08:57:23 +0200 (Mon, 22 Feb 2010) | 2 lines
branches/zip: row_fetch_store_uint4(): Remove unused function.
This was added to trunk in r435.
Detailed revision comments:
r6749 | vasil | 2010-02-20 18:45:41 +0200 (Sat, 20 Feb 2010) | 5 lines
Non-functional change: update copyright year to 2010 of the files
that have been modified after 2010-01-01 according to svn.
for f in $(svn log -v -r{2010-01-01}:HEAD |grep "^ M " |cut -b 16- |sort -u) ; do sed -i "" -E 's/(Copyright \(c\) [0-9]{4},) [0-9]{4}, (.*Innobase Oy.+All Rights Reserved)/\1 2010, \2/' $f ; done
Detailed revision comments:
r6635 | marko | 2010-02-10 11:07:05 +0200 (Wed, 10 Feb 2010) | 4 lines
branches/zip: Clean up after r6559. Now that
btr_pcur_open_with_no_init() is a macro, do not mix preprocessor
directives in the macro invocation, because it is implementation-defined
whether that is going to work.
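The portable pattern can be sketched as follows. OPEN_CURSOR, USE_AHI, HAS_AHI and open_cursor() are hypothetical names, not the InnoDB ones. The C standard does not guarantee that a preprocessor directive inside the argument list of a macro invocation works, so the conditional is resolved before the invocation.

```c
#include <assert.h>

/* Hypothetical function-like macro standing in for a cursor-open call. */
#define OPEN_CURSOR(index, mode, has_ahi) \
	((index) * 100 + (mode) * 10 + (has_ahi))

/* Not portable: a directive inside the macro arguments.

	OPEN_CURSOR(index, mode,
	#ifdef USE_AHI
		    1
	#else
		    0
	#endif
		    );
*/

/* Portable: evaluate the condition outside the invocation. */
#ifdef USE_AHI
# define HAS_AHI 1
#else
# define HAS_AHI 0
#endif

int open_cursor(int index, int mode)
{
	assert(mode >= 0 && mode < 10);

	return OPEN_CURSOR(index, mode, HAS_AHI);
}
```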
Detailed revision comments:
r6447 | marko | 2010-01-13 17:43:44 +0200 (Wed, 13 Jan 2010) | 5 lines
branches/zip: row_sel_get_clust_rec_for_mysql(): On the READ UNCOMMITTED
isolation level, do not attempt to access a clustered index record
that has been marked for deletion. This fixes Issue #433.
Approved by Heikki over the IM.
Detailed revision comments:
r6426 | marko | 2010-01-12 15:36:14 +0200 (Tue, 12 Jan 2010) | 2 lines
branches/zip: row_sel_sec_rec_is_for_clust_rec(): Document the return value
more accurately.
Detailed revision comments:
r6348 | marko | 2009-12-22 11:04:34 +0200 (Tue, 22 Dec 2009) | 37 lines
branches/zip: Merge a change from MySQL:
r6351 | marko | 2009-12-22 11:11:18 +0200 (Tue, 22 Dec 2009) | 1 line
branches/zip: Remove an obsolete declaration of LOCK_thread_count.
r6352 | marko | 2009-12-22 12:33:01 +0200 (Tue, 22 Dec 2009) | 104 lines
branches/zip: Merge revisions 6206:6350 from branches/5.1,
except r6347, r6349, r6350 which were committed separately
to both branches, and r6310, which was backported from zip to 5.1.
------------------------------------------------------------------------
r6206 | jyang | 2009-11-20 09:38:43 +0200 (Fri, 20 Nov 2009) | 3 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Non-functional change, fix formatting.
------------------------------------------------------------------------
r6230 | sunny | 2009-11-24 23:52:43 +0200 (Tue, 24 Nov 2009) | 3 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
branches/5.1: Fix autoinc failing test results.
(this should be skipped when merging 5.1 into zip)
------------------------------------------------------------------------
r6231 | sunny | 2009-11-25 10:26:27 +0200 (Wed, 25 Nov 2009) | 7 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
M /branches/5.1/row/row0sel.c
branches/5.1: Fix BUG#49032 - auto_increment field does not initialize to last value in InnoDB Storage Engine.
We use the appropriate function to read the column value for non-integer
autoinc column types, namely float and double.
rb://208. Approved by Marko.
------------------------------------------------------------------------
r6232 | sunny | 2009-11-25 10:27:39 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix white space errors.
------------------------------------------------------------------------
r6233 | sunny | 2009-11-25 10:28:35 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/include/mach0data.h
M /branches/5.1/include/mach0data.ic
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix tests and make read float/double arg const.
------------------------------------------------------------------------
r6234 | sunny | 2009-11-25 10:29:03 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix whitespace issues.
------------------------------------------------------------------------
r6235 | sunny | 2009-11-26 01:14:42 +0200 (Thu, 26 Nov 2009) | 9 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Fix Bug#47720 - REPLACE INTO Autoincrement column with negative values.
This bug is similar to the negative autoinc filter patch from earlier,
with the additional handling of filtering out the negative column values
set explicitly by the user.
rb://184
Approved by Heikki.
------------------------------------------------------------------------
r6242 | vasil | 2009-11-27 22:07:12 +0200 (Fri, 27 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/export.sh
branches/5.1:
Minor changes to support plugin snapshots.
------------------------------------------------------------------------
r6306 | calvin | 2009-12-14 15:12:46 +0200 (Mon, 14 Dec 2009) | 5 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: fix bug#49267: innodb-autoinc.test fails on windows
because of different case mode
There is no change to the InnoDB code, only to fix test case by
changing "T1" to "t1".
------------------------------------------------------------------------
r6324 | jyang | 2009-12-17 06:54:24 +0200 (Thu, 17 Dec 2009) | 8 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/include/lock0lock.h
M /branches/5.1/include/srv0srv.h
M /branches/5.1/lock/lock0lock.c
M /branches/5.1/log/log0log.c
M /branches/5.1/srv/srv0srv.c
M /branches/5.1/srv/srv0start.c
branches/5.1: Fix bug #47814 - Diagnostics are frequently not
printed after a long lock wait in InnoDB. Separate out the
lock wait timeout check thread from monitor information
printing thread.
rb://200 Approved by Marko.
------------------------------------------------------------------------
r6364 | marko | 2009-12-26 21:06:31 +0200 (Sat, 26 Dec 2009) | 4 lines
branches/zip: ibuf_bitmap_get_map_page():
Define a wrapper macro that passes __FILE__, __LINE__ of the caller
to buf_page_get_gen().
This will ease the diagnosis of the likes of Issue #135.
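The wrapper-macro technique can be sketched as follows. page_get_func() and page_get() are hypothetical stand-ins, not the InnoDB functions; the point is only that __FILE__ and __LINE__ expand at the call site, so the callee can report who asked for the page.

```c
#include <assert.h>
#include <stdio.h>

/* Location of the most recent caller, kept for diagnostics. */
static const char*	last_file;
static unsigned		last_line;

/* Hypothetical low-level function that takes the caller's location. */
int page_get_func(int page_no, const char *file, unsigned line)
{
	assert(file != NULL);

	last_file = file;
	last_line = line;

	if (page_no < 0) {
		fprintf(stderr, "bad page %d requested at %s:%u\n",
			page_no, file, line);
		return -1;
	}

	return page_no;
}

/* Wrapper macro: passes the location of the invocation, not of the
function definition. */
#define page_get(page_no) \
	page_get_func((page_no), __FILE__, __LINE__)
```

Callers keep writing page_get(n); the extra arguments are supplied by the macro.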
Detailed revision comments:
r6285 | marko | 2009-12-09 09:24:50 +0200 (Wed, 09 Dec 2009) | 13 lines
branches/zip: row_sel_fetch_columns(): Remove redundant code that was
accidentally added in r1591, which introduced dfield_t::ext in order
to make the merge sort of fast index creation support externally
stored columns.
Initially, I tried to allocate the bit for dfield_t::ext from
dfield_t::len by making the length 31 bits and mapping UNIV_SQL_NULL
to something that would fit in it. Then I decided that it would be
too risky. The redundant check was part of the mapping. The
condition may have been dfield_is_null() initially.
This redundant code was noticed by Sergey Petrunya on the MySQL
internals list.
r6288 | marko | 2009-12-09 09:51:00 +0200 (Wed, 09 Dec 2009) | 15 lines
branches/zip: row_upd_copy_columns(): Remove redundant code that was
accidentally added in r1591, which introduced dfield_t::ext in order
to make the merge sort of fast index creation support externally
stored columns.
Initially, I tried to allocate the bit for dfield_t::ext from
dfield_t::len by making the length 31 bits and mapping UNIV_SQL_NULL
to something that would fit in it. Then I decided that it would be
too risky. The redundant check was part of the mapping. The
condition may have been dfield_is_null() initially.
This is similar to the redundant code in row_sel_fetch_columns() that
was noticed by Sergey Petrunya on the MySQL internals list and removed
in r6285. As far as I can tell, there are no redundant UNIV_SQL_NULL
assignments remaining after this change.