CHECK_SIMPLE_EQUALITY
PROBLEM:
Crash in "check_simple_equality" when using a subquery with "IN" and
"ALL" in prepare.
ANALYSIS:
Crash can be reproduced using a simplified query like this one:
prepare s from "select 1 from g1 where 1 < all (
select @:=(1 in (select 1 from g1)) from g1)";
This bug is currently present only on 5.5.and 5.1. Its fixed as part
of work log(#1110) in 5.6. We are taking one change to fix this
in 5.5 and 5.1.
Problem seems to be present because we are trying to evaluate "is_null"
on an argument which is part of a subquery
(In Item_is_not_null_test::update_used_tables()).
But the condition to evaluate is only when we do not have a sub query
present, which means to say that "with_subselect" is not set.
With respect to the above query, we create an object of type
"Item_in_optimizer" which by definition is always associated with a
subquery. While in 5.6 we set "with_subselect" to true for
"Item_in_optimizer" object, we do not do the same in 5.5. This results in
the evaluation for "is_null" resulting in a coredump.
So, we are now setting "with_subselect" to true for "Item_in_optimizer"
in 5.1 and 5.5.
mysql-test/r/func_in.result:
Result file changes for the test case added
mysql-test/t/func_in.test:
Test case added for Bug#13012483
sql/item_cmpfunc.h:
Changed Item_in_optimizer::Item_in_optimizer( ) to set "with_subselect"
to true
Suppress innodb_bug34300 from failing if InnoDB prints:
120221 11:05:03 InnoDB: ERROR: the age of the last checkpoint is 9439048,
InnoDB: which exceeds the log group capacity 9433498.
by default the log capacity is 2 log files, 5 MB each.
This bug was originally filed and fixed as Bug#12612184. The original
fix was buggy, and it was patched by Bug#12704861. Also that patch was
buggy (potentially breaking crash recovery), and both fixes were
reverted.
This fix was not ported to the built-in InnoDB of MySQL 5.1, because
the function signatures of many core functions are different from
InnoDB Plugin and later versions. The block allocation routines and
their callers would have to changed so that they handle block
descriptors instead of page frames.
When a record is updated so that its size grows, non-updated columns
can be selected for external (off-page) storage. The bug is that the
initially inserted updated record contains an all-zero BLOB pointer to
the field that was not updated. Only after the BLOB pages have been
allocated and written, the valid pointer can be written to the record.
Between the release of the page latch in mtr_commit(mtr) after
btr_cur_pessimistic_update() and the re-latching of the page in
btr_pcur_restore_position(), other threads can see the invalid BLOB
pointer consisting of 20 zero bytes. Moreover, if the system crashes
at this point, the situation could persist after crash recovery, and
the contents of the non-updated column would be permanently lost.
The problem is amplified by the ROW_FORMAT=DYNAMIC and
ROW_FORMAT=COMPRESSED that were introduced in
innodb_file_format=barracuda in InnoDB Plugin, but the bug does exist
in all InnoDB versions.
The fix is as follows. After a pessimistic B-tree operation that needs
to write out off-page columns, allocate the pages for these columns in
the mini-transaction that performed the B-tree operation (btr_mtr),
but write the pages in a separate mini-transaction (blob_mtr). Do
mtr_commit(blob_mtr) before mtr_commit(btr_mtr). A quirk: Do not reuse
pages that were previously freed in btr_mtr. Only write the off-page
columns to 'fresh' pages.
In this way, crash recovery will see redo log entries for blob_mtr
before any redo log entry for btr_mtr. It will apply the BLOB page
writes to pages that were marked free at that point. If crash recovery
fails to see all of the btr_mtr redo log, there will be some
unreachable BLOB data in free pages, but the B-tree will be in a
consistent state.
btr_page_alloc_low(): Renamed from btr_page_alloc(). Add the parameter
init_mtr. Return an allocated block, or NULL. If init_mtr!=mtr but
the page was already X-latched in mtr, do not initialize the page.
btr_page_alloc(): Wrapper for btr_page_alloc_for_ibuf() and
btr_page_alloc_low().
btr_page_free(): Add a debug assertion that the page was a B-tree page.
btr_lift_page_up(): Return the father block.
btr_compress(), btr_cur_compress_if_useful(): Add the parameter ibool
adjust, for adjusting the cursor position.
btr_cur_pessimistic_update(): Preserve the cursor position when
big_rec will be written and the new flag BTR_KEEP_POS_FLAG is defined.
Remove a duplicate rec_get_offsets() call. Keep the X-latch on
index->lock when big_rec is needed.
btr_store_big_rec_extern_fields(): Replace update_inplace with
an operation code, and local_mtr with btr_mtr. When not doing a
fresh insert and btr_mtr has freed pages, put aside any pages that
were previously X-latched in btr_mtr, and free the pages after
writing out all data. The data must be written to 'fresh' pages,
because btr_mtr will be committed and written to the redo log after
the BLOB writes have been written to the redo log.
btr_blob_op_is_update(): Check if an operation passed to
btr_store_big_rec_extern_fields() is an update or insert-by-update.
fseg_alloc_free_page_low(), fsp_alloc_free_page(),
fseg_alloc_free_extent(), fseg_alloc_free_page_general(): Add the
parameter init_mtr. Return an allocated block, or NULL. If
init_mtr!=mtr but the page was already X-latched in mtr, do not
initialize the page.
xdes_get_descriptor_with_space_hdr(): Assert that the file space
header is being X-latched.
fsp_alloc_from_free_frag(): Refactored from fsp_alloc_free_page().
fsp_page_create(): New function, for allocating, X-latching and
potentially initializing a page. If init_mtr!=mtr but the page was
already X-latched in mtr, do not initialize the page.
fsp_free_page(): Add ut_ad(0) to the error outcomes.
fsp_free_page(), fseg_free_page_low(): Increment mtr->n_freed_pages.
fsp_alloc_seg_inode_page(), fseg_create_general(): Assert that the
page was not previously X-latched in the mini-transaction. A file
segment or inode page should never be allocated in the middle of an
mini-transaction that frees pages, such as btr_cur_pessimistic_delete().
fseg_alloc_free_page_low(): If the hinted page was allocated, skip the
check if the tablespace should be extended. Return NULL instead of
FIL_NULL on failure. Remove the flag frag_page_allocated. Instead,
return directly, because the page would already have been initialized.
fseg_find_free_frag_page_slot() would return ULINT_UNDEFINED on error,
not FIL_NULL. Correct a bogus assertion.
fseg_alloc_free_page(): Redefine as a wrapper macro around
fseg_alloc_free_page_general().
buf_block_buf_fix_inc(): Move the definition from the buf0buf.ic to
buf0buf.h, so that it can be called from other modules.
mtr_t: Add n_freed_pages (number of pages that have been freed).
page_rec_get_nth_const(), page_rec_get_nth(): The inverse function of
page_rec_get_n_recs_before(), get the nth record of the record
list. This is faster than iterating the linked list. Refactored from
page_get_middle_rec().
trx_undo_rec_copy(): Add a debug assertion for the length.
trx_undo_add_page(): Return a block descriptor or NULL instead of a
page number or FIL_NULL.
trx_undo_report_row_operation(): Add debug assertions.
trx_sys_create_doublewrite_buf(): Assert that each page was not
previously X-latched.
page_cur_insert_rec_zip_reorg(): Make use of page_rec_get_nth().
row_ins_clust_index_entry_by_modify(): Pass BTR_KEEP_POS_FLAG, so that
the repositioning of the cursor can be avoided.
row_ins_index_entry_low(): Add DEBUG_SYNC points before and after
writing off-page columns. If inserting by updating a delete-marked
record, do not reposition the cursor or commit the mini-transaction
before writing the off-page columns.
row_build(): Tighten a debug assertion about null BLOB pointers.
row_upd_clust_rec(): Add DEBUG_SYNC points before and after writing
off-page columns. Do not reposition the cursor or commit the
mini-transaction before writing the off-page columns.
rb:939 approved by Jimmy Yang
GRACEFUL SHUTDOWN
During startup mysql picks up .frm files from the tmpdir directory and
tries to drop those tables in the storage engine.
The problem is that when tmpdir ends in / then ha_innobase::delete_table()
is passed a string like "/var/tmp//#sql123", then it wrongly normalizes it
to "/#sql123" and calls row_drop_table_for_mysql() which of course fails
to delete the table entry from the InnoDB dictionary cache.
ha_innobase::delete_table() returns an error but nevertheless mysql wipes
away the .frm file and the entry in the InnoDB dictionary cache remains
orphaned with no easy way to remove it.
The "no easy" way to remove it is to create a similar temporary table again,
copy its .frm file to tmpdir under "#sql123.frm" and restart mysqld with
tmpdir=/var/tmp (no trailing slash) - this way mysql will pick the .frm file
after restart and will try to issue drop table for "/var/tmp/#sql123"
(notice do double slash), ha_innobase::delete_table() will normalize it to
"tmp/#sql123" and row_drop_table_for_mysql() will successfully remove the
table entry from the dictionary cache.
The solution is to fix normalize_table_name_low() to normalize things like
"/var/tmp//table" correctly to "tmp/table".
This patch also adds a test function which invokes
normalize_table_name_low() with various inputs to make sure it works
correctly and a mtr test that calls this test function.
Reviewed by: Marko (http://bur03.no.oracle.com/rb/r/929/)
CASES RESETS DATA POINTER TO SMAL
ISSUE: Myisamchk doing sort recover
on a table reduces data_file_length.
Maximum size of data file decreases,
lesser number of rows are stored.
SOLUTION: Size of data_file_length is
fixed to the original length.
CASES RESETS DATA POINTER TO SMAL
ISSUE: Myisamchk doing sort recover
on a table reduces data_file_length.
Maximum size of data file decreases,
lesser number of rows are stored.
SOLUTION: Size of data_file_length is
fixed to the original length.
BUG#13519696 - 62940: SELECT RESULTS VARY WITH VERSION AND
WITH/WITHOUT INDEX RANGE SCAN
BUG#13453382 - REGRESSION SINCE 5.1.39, RANGE OPTIMIZER WRONG
RESULTS WITH DECIMAL CONVERSION
BUG#13463488 - 63437: CHAR & BETWEEN WITH INDEX RETURNS WRONG
RESULT AFTER MYSQL 5.1.
Those are all cases where the range optimizer got it wrong
with > and >=.
mysql-test/r/range.result:
Without the code fix for DECIMAL, "select count(val) from t2 where val > 0.1155"
(which uses a range scan) returned 127 instead of 128);
Moreover, both
select * from t1 force index (primary) where a=1 and c>= 2.9;
and
select * from t1 force index (primary) where a=1 and c> 2.9;
would miss "1 1 3".
Without the code fix for strings, both
SELECT * FROM t1 WHERE F1 >= 'A ';
and
SELECT * FROM t1 WHERE F1 BETWEEN 'A ' AND 'AAAAA';
would miss "A A A".
sql/item.cc:
Preamble to the explanations below: opt_range.cc:get_mm_leaf() does
this (this is not changed by the patch): changes
column > value
to
column OP V
where:
* V is what is in "column" after we stored "value" in it
(such store operation may have done rounding...)
* OP is > or >=, depending on what's correct.
For example, if c is an INT column,
c > 2.9 is changed to
c OP 3
where OP is >= ('>' would not be correct).
The bugs below are cases where we chose OP wrongly.
Note that such transformations are visible in the optimizer trace.
1) Fix for STRING. In the scenario with CHAR(5) in range.test, this happens,
in get_mm_tree(), for the condition F1>='A ':
* value->save_in_field_no_warnings(field, 1) wants to store the right argument
(named 'item') into the CHAR(5) field; this stores 'A ' (the item's value)
padded with spaces (which changes nothing: still 'A ')
* we come to
case Item_func::GE_FUNC:
/* Don't use open ranges for partial key_segments */
if ((!(key_part->flag & HA_PART_KEY_SEG)) &&
(stored_field_cmp_to_item(param->thd, field, value) < 0))
tree->min_flag= NEAR_MIN;
tree->max_flag=NO_MAX_RANGE;
What this wants to do is: if the field's value is strictly smaller
than the item's, then ">=" can be changed to ">" (this is an optimization,
it can help pruning one useless partition).
* stored_field_cmp_to_item() is called; it compares the field's
and item's values: the item's value (Item_string::val_str()) is
'A ') and the field's value (Field_string::val_str()) is
'A' (yes val_str() removes end spaces unless sql_mode='PAD_CHAR_TO_FULL_LENGTH');
and the comparison is done with stringcmp() which considers
end spaces as relevant; as end spaces differ, function returns a
negative number, and ">='A '" becomes ">'A'" (i.e. the NEAR_MIN
flag is turned on).
During execution the index range scan code will search for "A", find
a match, but exclude it (because of ">"), wrongly.
The badness is the string comparison done by stored_field_cmp_to_item():
we use the reply of this function to determine where the index search
should start, so it should do comparison like index search does
comparisons; index search comparisons are ha_key_cmp() which uses
a collation-aware comparison (in our case, my_strnncollsp_simple(),
which ignores end spaces); so stored_field_cmp_to_item()
needs to do the same. When this is fixed, condition becomes
">='A '".
2) Fix for DECIMAL: just like in other comparisons in stored_field_cmp_to_item(),
we must first pass the field and then the item; otherwise expectations
on what <0 and >0 mean (inferiority, superiority) get violated.
In the test in range.test about c>2.9: c is an INT column, so 2.9
gets stored as 3, then stored_field_cmp_to_item() compares 3
and 2.9; because of the wrong order of arguments passed
to my_decimal_cmp(), range optimizer
thinks that 3 is < 2.9 and thus changes "c> 2.9" to "c> 3".
After fixing the order, it changes to the correct "c>= 3".
In the test in range.inc for val > 0.1155, it was changed to
val > 0.116, now it is changed to val >= 0.116.
- Reverting the patch for Bug # 12584302
The patch will be reverted in 5.1 and 5.5.
The patch will not be reverted in 5.6, the change will
be properly documented in 5.6.
- Backporting DBUG_ASSERT not to crash on '0000-01-00'
(already fixed in mysql-trunk (5.6))
Introducing new collations:
utf8_general_mysql500_ci and ucs2_general_mysql500_ci,
to reproduce behaviour of utf8_general_ci and ucs2_general_ci
from mysql-5.1.23 (and earlier).
The collations are added to simplify upgrade from mysql-5.1.23 and earlier.
Note: The patch does not make new server start over old data automatically.
Some manual upgrade procedures are assumed.
Paul: please get in touch with me to discuss upgrade procedures
when documenting this bug.
modified:
include/m_ctype.h
mysql-test/r/ctype_utf8.result
mysql-test/t/ctype_utf8.test
mysys/charset-def.c
strings/ctype-ucs2.c
strings/ctype-utf8.c
Test extra/rpl_tests/rpl_extra_col_master.test (used by
rpl_extra_col_master_*) ends with the active connection pointing to the
slave. Thus, the two last tests never succeed in changing the binlog
format of the master away from 'row'. With correct active connection
(master) tests fail for binlog 'statement' and 'mixed' formats.
Tests rpl_extra_col_master_* only run when binary log format is
row. Statement and mix replication do not make sense in this
tests since it will try to execute statements on columns that do
not exist. This fix is basically a backport from mysql-5.5, see
changes done as part of BUG 39934.
routines.
mysqldump in xml mode did not dump routines, events or
triggers.
This patch fixes this issue by fixing the if conditions
that disallowed the dump of above mentioned objects in
xml mode, and added the required code to enable dump
in xml format.
client/mysqldump.c:
BUG#11760384 - 52792: mysqldump in XML mode does not dump
routines.
Fixed some if conditions to allow execution of dump methods
for xml and further added the relevant code at places to produce
the dump in xml format.
mysql-test/r/mysqldump.result:
Added a test case for Bug#11760384.
mysql-test/t/mysqldump.test:
Added a test case for Bug#11760384.
If we meet DB_TOO_MANY_CONCURRENT_TRXS during the execution tab_create_graph from row_create_table_for_mysql(), .ibd file for the table should be created already but was not deleted for the error handling.
rb:875 approved by Jimmy Yang
If init_command was incorrect, we couldn't let users execute
queries, but we couldn't report the issue to the client either
as it does not expect error messages before even sending a
command. Thus, we simply disconnected them without throwing
a clear error.
We now go through the proper sequence once (without executing
any user statements) so we can report back what the problem
is. Only then do we disconnect the user.
As always, root remains unaffected by this as init_command is
(still) not executed for them.
mysql-test/r/init_connect.result:
We now report a proper error if init_command fails.
Expect as much.
mysql-test/t/init_connect.test:
We now report a proper error if init_command fails.
Expect as much.
sql/sql_connect.cc:
If init_command fails, throw an error explaining this to
the user.
The counter handler_read_key (SSV::ha_read_key_count) is incremented
incorrectly.
The mysql server maintains a per thread system_status_var (SSV)
object. This object contains among other things the counter
SSV::ha_read_key_count. The purpose of this counter is to measure the
number of requests to read a row based on a key (or the number of
index lookups).
This counter was wrongly incremented in the
ha_innobase::innobase_get_index(). The fix removes
this increment statement (for both innodb and innodb_plugin).
The various callers of the innobase_get_index() was checked to
determine if anybody must increment this counter (if they first call
innobase_get_index() and then perform an index lookup). It was found
that no caller of innobase_get_index() needs to worry about the
SSV::ha_read_key_count counter.
SMALL KEY CACHE
The server crashed on division by zero because the key cache was not
initialized and the block length was 0 which was used in a division.
The fix was to not allow CACHE INDEX if the key cache was not initiallized.
Thus never try LOAD INDEX INTO CACHE for an uninitialized key cache.
Also added some windows files/directories to .bzrignore.
AND HANG IN SHOW TABLE STATUS.
ISSUE: Table corruption due to concurrent queries.
Different threads running insert and check
query leads to table corruption. Not properly locked,
rows are inserted in between check query.
SOLUTION: In check query mutex lock is acquired
for a longer time to handle concurrent
insert and check query.
NOTE: Additionally we backported the fix for CHECKSUM
issue(bug#11758979).
rb://816
approved by: Marko Makela
The title is misleading. This bug was actually introduced by
bug 12635227 and was unearthed by a later optimization.
We need to free buf_page_t structs that we are allocating using
malloc() at shutdown.
The bug was accidentally fixed by fixing
Bug#11759688 52020: InnoDB can still deadlock on just INSERT...ON DUPLICATE KEY
a.k.a. the reintroduction of
Bug#7975 deadlock without any locking, simple select and update
a.k.a. Bug#7975 deadlock without any locking, simple select and update
Bug#7975 was reintroduced when the storage engine API was made
pluggable in MySQL 5.1. Instead of looking at thd->lex directly, we
rely on handler::extra(). But, we were looking at the wrong extra()
flag, and we were ignoring the TRX_DUP_REPLACE flag in places where we
should obey it.
innodb_replace.test: Add tests for hopefully all affected statement
types, so that bug should never ever resurface. This kind of tests
should have been added when fixing Bug#7975 in MySQL 5.0.3 in the
first place.
rb:806 approved by Sunny Bains
This was an attempt to address problems with the Bug#12612184 fix.
Even with this follow-up fix, crash recovery can be broken.
Let us fix the bug later.
In the ON UPDATE CASCADE clause of FOREIGN KEY constraints, the
calculated update vector was not fully initialized. This bug was
introduced in the InnoDB Plugin when implementing support for
ROW_FORMAT=DYNAMIC.
Additionally, the data type information was not initialized, but
apparently it has never been needed in this case. Nevertheless, it is
not good programming practice to pass uninitialized values around.
calc_row_difference(): Declare the update field uninitialized in
Valgrind. Copy the data type information as well, except when the
field is SQL NULL. In the built-in InnoDB, initialize
ufield->extern_storage = FALSE (an initialization bug that had gone
unnoticed this far). The InnoDB Plugin and later have this flag to
dfield_t and have always initialized it properly.
row_ins_cascade_calc_update_vec(): Reduce the scope of some
pointers. Initialize orig_len. (This caused the bug in InnoDB Plugin
and later.)
row_ins_foreign_check_on_constraint(): Simplify a condition. Declare
the update vector uninitialized.
rb:771 approved by Jimmy Yang
PARENT FOR OTHER ONE
Do not try to lookup key_nr'th key in 'table' because there may not be such
a key there. key_nr is the number of the key in the _child_ table name, not
in the parent table.
Instead just print the fields of the record that are covered by the first key
defined on the parent table.
This bug gets a better fix in MySQL 5.6, which is too risky for 5.1 and 5.5.
Approved by: Jon Olav Hauglid (via IM)
USING MYISAM_USE_MMAP ON WINDOWS
When OPTIMIZE/REPAIR TABLE is switching to new data file,
old data file is removed while memory mapping is still
active.
With 5.1 implementation of nt_share_delete() it is not
permitted to remove mmaped file.
This fix disables memory mapping for mi_repair() operations.
mysql-test/r/myisam.result:
A test case for BUG#11757032.
mysql-test/t/myisam.test:
A test case for BUG#11757032.
storage/myisam/ha_myisam.cc:
mi_repair*() functions family use file I/O even if memory
mapping is available.
Since mixing mmap I/O and file I/O may cause various artifacts,
memory mapping must be disabled.
storage/myisam/mi_delete_all.c:
Clean-up: do not attempt to remap file after truncate, since
there is nothing to map.
Buffer over-run on all platforms, crash on windows, wrong result on other platforms,
when rounding numbers which start with 999999999 and have
precision = 9 or 18 or 27 or 36 ...
mysql-test/r/type_newdecimal.result:
New test cases.
mysql-test/t/type_newdecimal.test:
New test cases.
sql/my_decimal.h:
Add sanity checking code, to catch buffer over/under-run.
strings/decimal.c:
The original initialization of intg1 (add 1 if buf[0] == DIG_MAX)
will set p1 to point outside the buffer, and the loop to copy the original value
while (buf0 < p0)
*(--p1) = *(--p0);
will overwrite memory outside the my_decimal object.