CREATE TABLE bug13510739 (c INTEGER NOT NULL, PRIMARY KEY (c)) ENGINE=INNODB;
INSERT INTO bug13510739 VALUES (1), (2), (3), (4);
DELETE FROM bug13510739 WHERE c=2;
HANDLER bug13510739 OPEN;
HANDLER bug13510739 READ `primary` = (2);
HANDLER bug13510739 READ `primary` NEXT; <-- crash
The bug is that in the particular testcase row_search_for_mysql() picked up
a delete-marked record and quit, leaving the cursor non-positioned state and
on the subsequent 'get next' call the code crashed because of the
non-positioned cursor.
In row0sel.cc (line numbers from mysql-trunk):
4653 if (rec_get_deleted_flag(rec, comp)) {
...
4679 if (index == clust_index && unique_search) {
4680
4681 err = DB_RECORD_NOT_FOUND;
4682
4683 goto normal_return;
4684 }
it quit from here, not storing the cursor position.
In contrast, if the record=2 is not found at all (e.g. sleep(1) after DELETE
to let the purge wipe it away completely) then 'get = 2' does find record=3
and quits from here:
4366 if (0 != cmp_dtuple_rec(search_tuple, rec, offsets)) {
...
4394 btr_pcur_store_position(pcur, &mtr);
4395
4396 err = DB_RECORD_NOT_FOUND;
4397 #if 0
4398 ut_print_name(stderr, trx, FALSE, index->name);
4399 fputs(" record not found 3\n", stderr);
4400 #endif
4401
4402 goto normal_return;
Another fix could be to extend the condition on line 4366 to hold only if
seach_tuple matches rec AND if rec is not delete marked.
Notice that in the above test case if we wait about 1 second somewhere after
DELETE and before 'get = 2', then the testcase does not crash and returns 4
instead. Not sure if this is the correct behavior, but this bugfix removes
the crash and makes the code return what it also returns in the non-crashing
case (if rec=2 is not found during 'get = 2', e.g. we have sleep(1) there).
Approved by: Marko (http://bur03.no.oracle.com/rb/r/863/)
When there is a secondary index on a column prefix of an externally
stored column and an entry in the secondary index is shorter than the
reserved prefix length, it should mean that the secondary index entry
is holding the complete column value. When comparing this secondary
index column value to the column in the clustered index row, we must
compare the entire prefix that was fetched from the clustered
index. The bug was that we would just compare that the column in the
clustered index starts with the value found in the secondary index
column.
This bug affects only the InnoDB Barracuda formats (ROW_FORMAT=DYNAMIC
and ROW_FORMAT=COMPRESSED), in which columns that are stored off-page
in the clustered index do not contain any prefix in the clustered
index record.
row_sel_sec_rec_is_for_blob(): Add the parameter prefix_len, for
ifield->prefix_len. Add some assertions.
Sorry, I did not manage to produce a test case. This patch does
produce correct results on the data set that Michael isolated on our
test machine. That was with the purge and background rollback
suspended, because they would make the bug go away.
rb:760 approved by Sunny Bains
This change is a followup to
vasil.dimov@oracle.com-20110907145810-v98kldmho23vhhic
which triggered the usage of the prefetch and valgrind tests spat lots of
warnings.
The prefetch code will be removed.
Discussed with: Marko (over IM)
row_sel_field_store_in_mysql_format(): Do not pad the unused part of
the buffer reserved for a True VARCHAR column (introduced in 5.0.3).
Add Valgrind instrumentation ensuring that the unused part will be
flagged uninitialized.
row_sel_copy_cached_field_for_mysql(): New function: Copy a field
that is in the MySQL row format, not copying the unused tail of
VARCHAR columns.
row_sel_pop_cached_row_for_mysql(): Invoke
row_sel_copy_cached_field_for_mysql() for copying fields.
When the row is long, copy it field-by-field.
rb:715 approved by Inaam Rana
columns again
This is follow-up to Bug #54358. Not all occurrences of the bug were fixed.
We need to check all calls to btr_copy_externally_stored_field_prefix_low()
and do the right thing when the pointer to the off-page column is null
(full of zero bytes).
It turns out that only the call to btr_copy_externally_stored_field_prefix()
in row_sel_sec_rec_is_for_blob() needs to be changed.
For fetching complete off-page columns rather than prefixes, the function
btr_rec_copy_externally_stored_field() already checks if the pointer
is null (all-zero). Two of its callers (row_merge_copy_blobs() and
row_sel_fetch_columns()) are never executed as READ COMMITTED and can
rightfully assert that the fetch succeeded. The third caller,
row_sel_store_mysql_rec(), already does the right thing.
The calls in row_upd_ext_fetch() and trx_undo_page_fetch_ext() must
expect that the off-page column exists. Update and rollback are
locking operations, never READ UNCOMMITTED.
row_search_for_mysql(): When a secondary index record might not be
visible in the current transaction's read view and we consult the
clustered index and optionally some undo log records, return the
relevant columns of the clustered index record to MySQL instead of the
secondary index record.
ibuf_insert_to_index_page_low(): New function, refactored from
ibuf_insert_to_index_page().
ibuf_insert_to_index_page(): When we are inserting a record in place
of a delete-marked record and some fields of the record differ, update
that record just like row_ins_sec_index_entry_by_modify() would do.
btr_cur_update_alloc_zip(): Make the function public.
mysql_row_templ_t: Add clust_rec_field_no.
row_sel_store_mysql_rec(), row_sel_push_cache_row_for_mysql(): Add the
flag rec_clust, for returning data at clust_rec_field_no instead of
rec_field_no. Resurrect the debug assertion that the record not be
marked for deletion. (Bug #55626)
[UNIV_DEBUG || UNIV_IBUF_DEBUG] ibuf_debug, buf_page_get_gen(),
buf_flush_page_try():
Implement innodb_change_buffering_debug=1 for evicting pages from the
buffer pool, so that change buffering will be attempted more
frequently.
This is a regression from the fix for bug no 38999. A storage engine capable
of reading only a subset of a table's columns updates corresponding bits in
the read buffer to signal that it has read NULL values for the corresponding
columns. It cannot, and should not, update any other bits. Bug no 38999
occurred because the implementation of UPDATE statements compare the NULL bits
using memcmp, inadvertently comparing bits that were never requested from the
storage engine. The regression was caused by the storage engine trying to
alleviate the situation by writing to all NULL bits, even those that it had no
knowledge of. This has devastating effects for the index merge algorithm,
which relies on all NULL bits, except those explicitly requested, being left
unchanged.
The fix reverts the fix for bug no 38999 in both InnoDB and InnoDB plugin and
changes the server's method of comparing records. For engines that always read
entire rows, we proceed as usual. For engines capable of reading only select
columns, the record buffers are now compared on a column by column basis. An
assertion was also added so that non comparable buffers are never read. Some
relevant copy-pasted code was also consolidated in a new function.
Remove a bogus debug assertion that triggered the bug.
Add assertions precisely where records must not be delete-marked.
And a comment to clarify when the record is allowed to be delete-marked.
(READ UNCOMMITTED access failure of off-page DYNAMIC or COMPRESSED columns).
Records that lack incompletely written externally stored columns may
be accessed by READ UNCOMMITTED transaction even without involving a
crash during an INSERT or UPDATE operation. I verified this as follows.
(1) added a delay after the mini-transaction for writing the clustered
index 'stub' record was committed (patch attached)
(2) started mysqld in gdb, setting breakpoints to the where the
assertions about READ UNCOMMITTED were added in the bug fix
(3) invoked ibtest3 --create-options=key_block_size=2
to create BLOBs in a COMPRESSED table
(4) invoked the following:
yes 'set transaction isolation level read uncommitted;
checksum table blobt3;select sleep(1);'|mysql -uroot test
(5) noted that one of the breakpoints was triggered
(return(NULL) in btr_rec_copy_externally_stored_field())
=== modified file 'storage/innodb_plugin/row/row0ins.c'
--- storage/innodb_plugin/row/row0ins.c 2010-06-30 08:17:25 +0000
+++ storage/innodb_plugin/row/row0ins.c 2010-06-30 08:17:25 +0000
@@ -2120,6 +2120,7 @@ function_exit:
rec_t* rec;
ulint* offsets;
mtr_start(&mtr);
+ os_thread_sleep(5000000);
btr_cur_search_to_nth_level(index, 0, entry, PAGE_CUR_LE,
BTR_MODIFY_TREE, &cursor, 0,
=== modified file 'storage/innodb_plugin/row/row0upd.c'
--- storage/innodb_plugin/row/row0upd.c 2010-06-30 08:11:55 +0000
+++ storage/innodb_plugin/row/row0upd.c 2010-06-30 08:11:55 +0000
@@ -1763,6 +1763,7 @@ row_upd_clust_rec(
rec_offs_init(offsets_);
mtr_start(mtr);
+ os_thread_sleep(5000000);
ut_a(btr_pcur_restore_position(BTR_MODIFY_TREE, pcur, mtr));
rec = btr_cur_get_rec(btr_cur);
columns
When the server crashes after a record stub has been inserted and
before all its off-page columns have been written, the record will
contain incomplete off-page columns after crash recovery. Such records
may only be accessed at the READ UNCOMMITTED isolation level or when
rolling back a recovered transaction in recv_recovery_rollback_active().
Skip these records at the READ UNCOMMITTED isolation level.
TODO: Add assertions for checking the above assumptions hold when an
incomplete BLOB is encountered.
btr_rec_copy_externally_stored_field(): Return NULL if the field is
incomplete.
row_prebuilt_t::templ_contains_blob: Clarify what "BLOB" means in this
context. Hint: MySQL BLOBs are not the same as InnoDB BLOBs.
row_sel_store_mysql_rec(): Return FALSE if not all columns could be
retrieved. Previously this function always returned TRUE. Assert that
the record is not delete-marked.
row_sel_push_cache_row_for_mysql(): Return FALSE if not all columns
could be retrieved.
row_search_for_mysql(): Skip records containing incomplete off-page
columns. Assert that the transaction isolation level is READ
UNCOMMITTED.
rb://380 approved by Jimmy Yang
(InnoDB plugin branch)
mysql-test/suite/innodb_plugin/r/innodb_mysql.result:
test case
mysql-test/suite/innodb_plugin/t/innodb_mysql.test:
test case
storage/innodb_plugin/row/row0sel.c:
init null bytes with default values as they might be
left uninitialized in some cases and these uninited bytes
might be copied into mysql record buffer that leads to
valgrind warnings on next use of the buffer.
In semi-consistent read, only unlock freshly locked non-matching records.
lock_rec_lock_fast(): Return LOCK_REC_SUCCESS,
LOCK_REC_SUCCESS_CREATED, or LOCK_REC_FAIL instead of TRUE/FALSE.
enum db_err: Add DB_SUCCESS_LOCKED_REC for indicating a successful
operation where a record lock was created.
lock_sec_rec_read_check_and_lock(),
lock_clust_rec_read_check_and_lock(), lock_rec_enqueue_waiting(),
lock_rec_lock_slow(), lock_rec_lock(), row_ins_set_shared_rec_lock(),
row_ins_set_exclusive_rec_lock(), sel_set_rec_lock(),
row_sel_get_clust_rec_for_mysql(): Return DB_SUCCESS_LOCKED_REC if a
new record lock was created. Adjust callers.
row_unlock_for_mysql(): Correct the function documentation.
row_prebuilt_t::new_rec_locks: Correct the documentation.
Detailed revision comments:
r6750 | marko | 2010-02-22 08:57:23 +0200 (Mon, 22 Feb 2010) | 2 lines
branches/zip: row_fetch_store_uint4(): Remove unused function.
This was added to trunk in r435.
Detailed revision comments:
r6749 | vasil | 2010-02-20 18:45:41 +0200 (Sat, 20 Feb 2010) | 5 lines
Non-functional change: update copyright year to 2010 of the files
that have been modified after 2010-01-01 according to svn.
for f in $(svn log -v -r{2010-01-01}:HEAD |grep "^ M " |cut -b 16- |sort -u) ; do sed -i "" -E 's/(Copyright \(c\) [0-9]{4},) [0-9]{4}, (.*Innobase Oy.+All Rights Reserved)/\1 2010, \2/' $f ; done
Detailed revision comments:
r6635 | marko | 2010-02-10 11:07:05 +0200 (Wed, 10 Feb 2010) | 4 lines
branches/zip: Clean up after r6559. Now that
btr_pcur_open_with_no_init() is a macro, do not mix preprocessor
directives in the macro invocation, because it is implementation-defined
whether that is going to work.
Detailed revision comments:
r6447 | marko | 2010-01-13 17:43:44 +0200 (Wed, 13 Jan 2010) | 5 lines
branches/zip: row_sel_get_clust_rec_for_mysql(): On the READ UNCOMMITTED
isolation level, do not attempt to access a clustered index record
that has been marked for deletion. This fixes Issue #433.
Approved by Heikki over the IM.
Detailed revision comments:
r6426 | marko | 2010-01-12 15:36:14 +0200 (Tue, 12 Jan 2010) | 2 lines
branches/zip: row_sel_sec_rec_is_for_clust_rec(): Document the return value
more accurately.
Detailed revision comments:
r6348 | marko | 2009-12-22 11:04:34 +0200 (Tue, 22 Dec 2009) | 37 lines
branches/zip: Merge a change from MySQL:
r6351 | marko | 2009-12-22 11:11:18 +0200 (Tue, 22 Dec 2009) | 1 line
branches/zip: Remove an obsolete declaration of LOCK_thread_count.
r6352 | marko | 2009-12-22 12:33:01 +0200 (Tue, 22 Dec 2009) | 104 lines
branches/zip: Merge revisions 6206:6350 from branches/5.1,
except r6347, r6349, r6350 which were committed separately
to both branches, and r6310, which was backported from zip to 5.1.
------------------------------------------------------------------------
r6206 | jyang | 2009-11-20 09:38:43 +0200 (Fri, 20 Nov 2009) | 3 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: Non-functional change, fix formatting.
------------------------------------------------------------------------
r6230 | sunny | 2009-11-24 23:52:43 +0200 (Tue, 24 Nov 2009) | 3 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
branches/5.1: Fix autoinc failing test results.
(this should be skipped when merging 5.1 into zip)
------------------------------------------------------------------------
r6231 | sunny | 2009-11-25 10:26:27 +0200 (Wed, 25 Nov 2009) | 7 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
M /branches/5.1/row/row0sel.c
branches/5.1: Fix BUG#49032 - auto_increment field does not initialize to last value in InnoDB Storage Engine.
We use the appropriate function to read the column value for non-integer
autoinc column types, namely float and double.
rb://208. Approved by Marko.
------------------------------------------------------------------------
r6232 | sunny | 2009-11-25 10:27:39 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix white space errors.
------------------------------------------------------------------------
r6233 | sunny | 2009-11-25 10:28:35 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/include/mach0data.h
M /branches/5.1/include/mach0data.ic
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix tests and make read float/double arg const.
------------------------------------------------------------------------
r6234 | sunny | 2009-11-25 10:29:03 +0200 (Wed, 25 Nov 2009) | 2 lines
Changed paths:
M /branches/5.1/row/row0sel.c
branches/5.1: This is an interim fix, fix whitepsace issues.
------------------------------------------------------------------------
r6235 | sunny | 2009-11-26 01:14:42 +0200 (Thu, 26 Nov 2009) | 9 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: Fix Bug#47720 - REPLACE INTO Autoincrement column with negative values.
This bug is similiar to the negative autoinc filter patch from earlier,
with the additional handling of filtering out the negative column values
set explicitly by the user.
rb://184
Approved by Heikki.
------------------------------------------------------------------------
r6242 | vasil | 2009-11-27 22:07:12 +0200 (Fri, 27 Nov 2009) | 4 lines
Changed paths:
M /branches/5.1/export.sh
branches/5.1:
Minor changes to support plugin snapshots.
------------------------------------------------------------------------
r6306 | calvin | 2009-12-14 15:12:46 +0200 (Mon, 14 Dec 2009) | 5 lines
Changed paths:
M /branches/5.1/mysql-test/innodb-autoinc.result
M /branches/5.1/mysql-test/innodb-autoinc.test
branches/5.1: fix bug#49267: innodb-autoinc.test fails on windows
because of different case mode
There is no change to the InnoDB code, only to fix test case by
changing "T1" to "t1".
------------------------------------------------------------------------
r6324 | jyang | 2009-12-17 06:54:24 +0200 (Thu, 17 Dec 2009) | 8 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
M /branches/5.1/include/lock0lock.h
M /branches/5.1/include/srv0srv.h
M /branches/5.1/lock/lock0lock.c
M /branches/5.1/log/log0log.c
M /branches/5.1/srv/srv0srv.c
M /branches/5.1/srv/srv0start.c
branches/5.1: Fix bug #47814 - Diagnostics are frequently not
printed after a long lock wait in InnoDB. Separate out the
lock wait timeout check thread from monitor information
printing thread.
rb://200 Approved by Marko.
------------------------------------------------------------------------
r6364 | marko | 2009-12-26 21:06:31 +0200 (Sat, 26 Dec 2009) | 4 lines
branches/zip: ibuf_bitmap_get_map_page():
Define a wrapper macro that passes __FILE__, __LINE__ of the caller
to buf_page_get_gen().
This will ease the diagnosis of the likes of Issue #135.
Detailed revision comments:
r6285 | marko | 2009-12-09 09:24:50 +0200 (Wed, 09 Dec 2009) | 13 lines
branches/zip: row_sel_fetch_columns(): Remove redundant code that was
accidentally added in r1591, which introduced dfield_t::ext in order
to make the merge sort of fast index creation support externally
stored columns,
Initially, I tried to allocate the bit for dfield_t::ext from
dfield_t::len by making the length 31 bits and mapping UNIV_SQL_NULL
to something that would fit in it. Then I decided that it would be
too risky. The redundant check was part of the mapping. The
condition may have been dfield_is_null() initially.
This redundant code was noticed by Sergey Petrunya on the MySQL
internals list.
r6288 | marko | 2009-12-09 09:51:00 +0200 (Wed, 09 Dec 2009) | 15 lines
branches/zip: row_upd_copy_columns(): Remove redundant code that was
accidentally added in r1591, which introduced dfield_t::ext in order
to make the merge sort of fast index creation support externally
stored columns.
Initially, I tried to allocate the bit for dfield_t::ext from
dfield_t::len by making the length 31 bits and mapping UNIV_SQL_NULL
to something that would fit in it. Then I decided that it would be
too risky. The redundant check was part of the mapping. The
condition may have been dfield_is_null() initially.
This is similar to the redundant code in row_sel_fetch_columns() that
was noticed by Sergey Petrunya on the MySQL internals list and removed
in r6285. As far as I can tell, there are no redundant UNIV_SQL_NULL
assignments remaining after this change.