1
0
mirror of https://github.com/MariaDB/server.git synced 2025-10-18 09:50:33 +03:00
Commit Graph

5200 Commits

Author SHA1 Message Date
Marko Mäkelä
006a6dc3cc Bug#11766305 - 59392: Remove thr0loc.c and ibuf_inside() - cleanup
Remove the unused debug constants for the latching order level SYNC_THR_LOCAL.
2011-09-06 10:18:03 +03:00
Marko Mäkelä
9aeb3309af Merge mysql-5.1 to mysql-5.5. 2011-09-06 10:14:45 +03:00
Marko Mäkelä
7f48c174f0 Bug #12950803 62294: BUF_BUDDY_RELOCATE CALLS GETTIMEOFDAY ...
buf_buddy_relocate(): The ut_time_us() function is needed for
statistics, calculating the total time spent on relocating blocks.
Until now, we invoked ut_time_us() every time buf_buddy_relocate() was
called. Fix: Only call ut_time_us() when the block can be relocated.
After this fix, the reported relocated_usec will no longer include the
time for the page_hash lookup and for acquiring the block mutex.

Approved by Sunny Bains on IM
2011-09-06 10:08:52 +03:00
Marko Mäkelä
2f49da3fdc Bug#12547647 UPDATE LOGGING COULD EXCEED LOG PAGE SIZE - take 2
The original fix was accidentally pushed to mysql-5.1 after the 5.1.59
clone-off in bzr revision id
marko.makela@oracle.com-20110829081642-z0w992a0mrc62s6w with thne fix
of Bug#12704861 Corruption after a crash during BLOB update.
It was pushed separately to mysql-5.5 in bzr revision id
marko.makela@oracle.com-20110901184804-2901f6qmuro3jas8.

trx_undo_report_row_operation(): If the page for which the undo log
was too big was empty, commit and start the mini-transaction before
acquiring the rollback segment mutex and freeing the undo page. This
is necessary, because the mini-transaction may be holding lower-order
latches in the levels SYNC_FSP and SYNC_FSP_PAGE.

trx_undo_erase_page_end(): Erase also empty pages, because
trx_undo_report_row_operation() needs to commit the mini-transaction
before freeing the empty page.

rb:756 approved by Sunny Bains
2011-09-06 10:04:21 +03:00
Bjorn Munch
52d9e13ffc Bug #11750417 40942: UNABLE TO INSTALL FEDERATED PLUGIN
Link plugin with a copy of string.o
  Copied test from 5.5 but this was dysfunctional, made it work
  Also tested on Windows
2011-09-05 14:38:20 +02:00
Marko Mäkelä
247ada63af Bug#12547647 UPDATE LOGGING COULD EXCEED LOG PAGE SIZE
This fix was accidentally pushed to mysql-5.1 after the 5.1.59 clone-off in
bzr revision id marko.makela@oracle.com-20110829081642-z0w992a0mrc62s6w
with the fix of Bug#12704861 Corruption after a crash during BLOB update
but not merged to mysql-5.5 and upwards.

In the Barracuda formats, the clustered index record no longer
contains a prefix of off-page columns. Because of this, the undo log
must contain these prefixes, so that purge and multi-versioning will
continue to work. However, this also means that an undo log record can
become too big to fit in an undo log page. (It is a limitation of the
undo log that undo records cannot span across multiple pages.)

In case the checks for undo log size fail when CREATE TABLE or CREATE
INDEX is executed, we need a fallback that blocks a modification
operation when the undo log record would exceed the maximum size.

trx_undo_free_last_page_func(): Renamed from trx_undo_free_page_in_rollback().
Define the trx_t parameter only in debug builds.

trx_undo_free_last_page(): Wrapper for trx_undo_free_last_page_func().
Pass the trx_t parameter only in debug builds.

trx_undo_truncate_end_func(): Renamed from trx_undo_truncate_end().
Define the trx_t parameter only in debug builds. Rewrite a for(;;) loop
as a while loop for clarity.

trx_undo_truncate_end(): Wrapper for from trx_undo_truncate_end_func().
Pass the trx_t parameter only in debug builds.

trx_undo_erase_page_end(): Return TRUE if the page was non-empty
to begin with. Refuse to erase empty pages.

trx_undo_report_row_operation(): If the page for which the undo log
was too big was empty, free the undo page and return DB_TOO_BIG_RECORD.

rb:749 approved by Inaam Rana
2011-09-01 21:48:04 +03:00
Jimmy Yang
07770e7e8e Fix Bug 12922077 - SEGV IN DICT_SET_CORRUPTED_INDEX_CACHE_ONLY(), DROP TABLE
rb://752 approved by Sunny Bains
2011-08-29 02:44:28 -07:00
Marko Mäkelä
7295f6ea52 Merge mysql-5.5 to working copy. 2011-08-29 11:55:38 +03:00
Jimmy Yang
d7264adc8a Fix Bug #12882177 - INSTRUMENTATION IN PFS_MUTEX_ENTER_NOWAIT_FUNC IS INCORRECT
Use PSI_MUTEX_TRYLOCK instead of PSI_MUTEX_LOCK when acquire mutex with
"no_wait" option
      
Approved by Sunny Bains
2011-08-29 01:29:57 -07:00
Marko Mäkelä
7c20cd1b5d Merge mysql-5.1 to mysql-5.5. 2011-08-29 11:22:43 +03:00
Marko Mäkelä
41f229cd9e Bug#12704861 Corruption after a crash during BLOB update
The fix of Bug#12612184 broke crash recovery. When a record that
contains off-page columns (BLOBs) is updated, we must first write redo
log about the BLOB page writes, and only after that write the redo log
about the B-tree changes. The buggy fix would log the B-tree changes
first, meaning that after recovery, we could end up having a record
that contains a null BLOB pointer.

Because we will be redo logging the writes off the off-page columns
before the B-tree changes, we must make sure that the pages chosen for
the off-page columns are free both before and after the B-tree
changes. In this way, the worst thing that can happen in crash
recovery is that the BLOBs are written to free pages, but the B-tree
changes are not applied. The BLOB pages would correctly remain free in
this case. To achieve this, we must allocate the BLOB pages in the
mini-transaction of the B-tree operation. A further quirk is that BLOB
pages are allocated from the same file segment as leaf pages. Because
of this, we must temporarily "hide" any leaf pages that were freed
during the B-tree operation by "fake allocating" them prior to writing
the BLOBs, and freeing them again before the mtr_commit() of the
B-tree operation, in btr_mark_freed_leaves().

btr_cur_mtr_commit_and_start(): Remove this faulty function that was
introduced in the Bug#12612184 fix. The problem that this function was
trying to address was that when we did mtr_commit() the BLOB writes
before the mtr_commit() of the update, the new BLOB pages could have
overwritten clustered index B-tree leaf pages that were freed during
the update. If recovery applied the redo log of the BLOB writes but
did not see the log of the record update, the index tree would be
corrupted. The correct solution is to make the freed clustered index
pages unavailable to the BLOB allocation. This function is also a
likely culprit of InnoDB hangs that were observed when testing the
Bug#12612184 fix.

btr_mark_freed_leaves(): Mark all freed clustered index leaf pages of
a mini-transaction allocated (nonfree=TRUE) before storing the BLOBs,
or freed (nonfree=FALSE) before committing the mini-transaction.

btr_freed_leaves_validate(): A debug function for checking that all
clustered index leaf pages that have been marked free in the
mini-transaction are consistent (have not been zeroed out).

btr_page_alloc_low(): Refactored from btr_page_alloc(). Return the
number of the allocated page, or FIL_NULL if out of space. Add the
parameter "mtr_t* init_mtr" for specifying the mini-transaction where
the page should be initialized, or if this is a "fake allocation"
(init_mtr=NULL) by btr_mark_freed_leaves(nonfree=TRUE).

btr_page_alloc(): Add the parameter init_mtr, allowing the page to be
initialized and X-latched in a different mini-transaction than the one
that is used for the allocation. Invoke btr_page_alloc_low(). If a
clustered index leaf page was previously freed in mtr, remove it from
the memo of previously freed pages.

btr_page_free(): Assert that the page is a B-tree page and it has been
X-latched by the mini-transaction. If the freed page was a leaf page
of a clustered index, link it by a MTR_MEMO_FREE_CLUST_LEAF marker to
the mini-transaction.

btr_store_big_rec_extern_fields_func(): Add the parameter alloc_mtr,
which is NULL (old behaviour in inserts) and the same as local_mtr in
updates. If alloc_mtr!=NULL, the BLOB pages will be allocated from it
instead of the mini-transaction that is used for writing the BLOBs.

fsp_alloc_from_free_frag(): Refactored from
fsp_alloc_free_page(). Allocate the specified page from a partially
free extent.

fseg_alloc_free_page_low(), fseg_alloc_free_page_general(): Add the
parameter "mtr_t* init_mtr" for specifying the mini-transaction where
the page should be initialized, or NULL if this is a "fake allocation"
that prevents the reuse of a previously freed B-tree page for BLOB
storage. If init_mtr==NULL, try harder to reallocate the specified page
and assert that it succeeded.

fsp_alloc_free_page(): Add the parameter "mtr_t* init_mtr" for
specifying the mini-transaction where the page should be initialized.
Do not allow init_mtr == NULL, because this function is never to be
used for "fake allocations".

mtr_t: Add the operation MTR_MEMO_FREE_CLUST_LEAF and the flag
mtr->freed_clust_leaf for quickly determining if any
MTR_MEMO_FREE_CLUST_LEAF operations have been posted.

row_ins_index_entry_low(): When columns are being made off-page in
insert-by-update, invoke btr_mark_freed_leaves(nonfree=TRUE) and pass
the mini-transaction as the alloc_mtr to
btr_store_big_rec_extern_fields(). Finally, invoke
btr_mark_freed_leaves(nonfree=FALSE) to avoid leaking pages.

row_build(): Correct a comment, and add a debug assertion that a
record that contains NULL BLOB pointers must be a fresh insert.

row_upd_clust_rec(): When columns are being moved off-page, invoke
btr_mark_freed_leaves(nonfree=TRUE) and pass the mini-transaction as
the alloc_mtr to btr_store_big_rec_extern_fields(). Finally, invoke
btr_mark_freed_leaves(nonfree=FALSE) to avoid leaking pages.

buf_reset_check_index_page_at_flush(): Remove. The function
fsp_init_file_page_low() already sets
bpage->check_index_page_at_flush=FALSE.

There is a known issue in tablespace extension. If the request to
allocate a BLOB page leads to the tablespace being extended, crash
recovery could see BLOB writes to pages that are off the tablespace
file bounds. This should trigger an assertion failure in fil_io() at
crash recovery. The safe thing would be to write redo log about the
tablespace extension to the mini-transaction of the BLOB write, not to
the mini-transaction of the record update. However, there is no redo
log record for file extension in the current redo log format.

rb:693 approved by Sunny Bains
2011-08-29 11:16:42 +03:00
Marko Mäkelä
42ff786cf9 Merge mysql-5.1 to mysql-5.5. 2011-08-22 17:12:27 +03:00
Marko Mäkelä
49ee12d03b Bug #11766591 59733: POSSIBLE DEADLOCK WHEN BUFFERED CHANGES ARE TO BE DISCARDED
The fix in revision id marko.makela@oracle.com-20110815091143-h3zbvm0pv8ni3qql
introduced a false UNIV_SYNC_DEBUG alarm. Relax the assertion.
2011-08-22 17:03:07 +03:00
Jimmy Yang
d9c3d35437 In innobase_format_name() we should call innobase_convert_name() with
"!is_index_name" instead of "is_index_name", so the table name in the
error message would not be formated as index name.
2011-08-17 02:39:55 -07:00
Jimmy Yang
7abcb1dd08 Add two tests for "innodb_large_prefix" and "innodb_force_load_corrupted" in
sys_vars test suite.
2011-08-16 20:51:40 -07:00
Jimmy Yang
95fa7fab3b Fix bug #11830883, SUPPORT "CORRUPTED" BIT FOR INNODB TABLES AND INDEXES.
Also addressed issues in bug #11745133, where we could mark a table
corrupted instead of crashing the server when found a corrupted buffer/page
if the table created with innodb_file_per_table on.
2011-08-16 18:07:59 -07:00
Mats Kindahl
cf5e5f837a Merging into mysql-5.5.16-release. 2011-08-15 20:12:11 +02:00
Marko Mäkelä
7f03063418 Merge mysql-5.1 to mysql-5.5. Add a test case. 2011-08-15 12:18:34 +03:00
Marko Mäkelä
669ff03703 Bug #11766591 59733: Possible deadlock when buffered changes are to be
discarded in buf_page_create()

This bug turned out to be a false alarm, a bug in the UNIV_SYNC_DEBUG
diagnostic code. Because of this, the patch was not backported to the
built-in InnoDB in MySQL 5.1. Furthermore, there is no test case for
InnoDB Plugin in MySQL 5.1, because the delete buffering in MySQL 5.5
makes triggering the failure much easier.

When a freed page for which there exist orphaned buffered changes is
allocated and reused for something else, buf_page_create() will discard
the buffered changes by invoking ibuf_merge_or_delete_for_page().
This would violate the InnoDB latching order.

Tweak the latching order as follows. Move SYNC_IBUF_MUTEX below
SYNC_FSP_PAGE, where it logically belongs, and assign new latching
levels for the ibuf->index->lock and the insert buffer B-tree pages:

#define SYNC_IBUF_MUTEX		370	/* ibuf_mutex */
#define SYNC_IBUF_INDEX_TREE	360
#define SYNC_IBUF_TREE_NODE_NEW	359
#define SYNC_IBUF_TREE_NODE	358

btr_block_get(), btr_page_get(): In UNIV_SYNC_DEBUG, add the parameter
"index" for determining the appropriate latching order
(SYNC_IBUF_TREE_NODE or SYNC_TREE_NODE).

btr_page_alloc_for_ibuf(), btr_create(): Use SYNC_IBUF_TREE_NODE_NEW
instead of SYNC_TREE_NODE_NEW for insert buffer pages.

btr_cur_search_to_nth_level(), btr_pcur_restore_position_func(): Use
SYNC_IBUF_TREE_NODE instead of SYNC_TREE_NODE for insert buffer pages.

btr_search_guess_on_hash(): Assert that the index is not an insert buffer tree.

dict_index_add_to_cache(): Use SYNC_IBUF_INDEX_TREE for the insert
buffer tree (ibuf->index->lock).

ibuf0ibuf.c: Use SYNC_IBUF_TREE_NODE or SYNC_IBUF_TREE_NODE_NEW for
all B-tree pages.

ibuf_merge_or_delete_for_page(): Assert that the user page is
BUF_IO_READ fixed. Only in this way it is OK to latch it as
SYNC_IBUF_TREE_NODE instead of the proper SYNC_TREE_NODE (which would
violate the changed latching order).

sync_thread_add_level(): Remove the special tweak for
SYNC_IBUF_MUTEX. Add rules for the added latching levels.

rb:591 approved by Jimmy Yang
2011-08-15 12:11:43 +03:00
Marko Mäkelä
c8163a16fa Merge mysql-5.1-security to mysql-5.5-security. 2011-08-10 15:03:33 +03:00
Marko Mäkelä
f2f4b19678 Bug#12626794 61240: UNUSED FUNCTIONS ... 2011-08-10 14:56:14 +03:00
Marko Mäkelä
79aa9c177b Merge mysql-5.1 to mysql-5.5. 2011-08-10 12:58:22 +03:00
Marko Mäkelä
7645c5ee90 Bug#12835650 VARCHAR maximum length performance impact
row_sel_field_store_in_mysql_format(): Do not pad the unused part of
the buffer reserved for a True VARCHAR column (introduced in 5.0.3).
Add Valgrind instrumentation ensuring that the unused part will be
flagged uninitialized.

row_sel_copy_cached_field_for_mysql(): New function: Copy a field
that is in the MySQL row format, not copying the unused tail of
VARCHAR columns.

row_sel_pop_cached_row_for_mysql(): Invoke
row_sel_copy_cached_field_for_mysql() for copying fields.
When the row is long, copy it field-by-field.

rb:715 approved by Inaam Rana
2011-08-10 12:25:24 +03:00
Marko Mäkelä
5962cadcfa Merge mysql-5.1 to mysql-5.5. 2011-08-08 12:16:15 +03:00
Marko Mäkelä
6f8a80270c Bug#12770537 I_S.TABLES.DATA_LENGTH does not show on-disk size
for compressed InnoDB tables

ha_innodb::info_low(): For calculating data_length or index_length,
use the compressed page size for compressed tables instead of UNIV_PAGE_SIZE.

rb:714 approved by Sunny Bains
2011-08-08 11:22:18 +03:00
Dmitry Lenev
0b5b1dd197 Fix for bug #11754210 - "45777: CHECK TABLE DOESN'T
SHOW ALL PROBLEMS FOR MERGE TABLE COMPLIANCE IN 5.1".

The problem was that CHECK/REPAIR TABLE for a MERGE table which
had several children missing or in wrong engine reported only
issue with the first such table in its result-set. While in 5.0
this statement returned the whole list of problematic tables.

Ability to report problems for all children was lost during
significant refactorings of MERGE code which were done as part
of work on 5.1 and 5.5 releases.

This patch restores status quo ante refactorings by changing
code in such a way that:
1) Failure to open child table due to its absence during CHECK/
   REPAIR TABLE for a MERGE table is not reported immediately
   when its absence is discovered in open_tables(). Instead
   handling/error reporting in such a situation is postponed
   until the moment when children are attached.
2) Code performing attaching of children no longer stops when
   it encounters first problem with one of the children during
   CHECK/REPAIR TABLE. Instead it continues iteration through
   the child list until all problems caused by child absence/
   wrong engine are reported.

Note that even after this change problem with mismatch of
child/parent definition won't be reported if there is also
another child missing, but this is how it was in 5.0 as well.

mysql-test/r/merge.result:
  Added test case for bug #11754210 - "45777: CHECK TABLE DOESN'T
  SHOW ALL PROBLEMS FOR MERGE TABLE COMPLIANCE IN 5.1".
  Adjusted results of existing tests to the fact that CHECK/REPAIR
  TABLE statements now try to report problems about missing table/
  wrong engine for all underlying tables, and to the fact that
  mismatch of parent/child definitions is always reported as an
  error and not a warning.
mysql-test/t/merge.test:
  Added test case for bug #11754210 - "45777: CHECK TABLE DOESN'T
  SHOW ALL PROBLEMS FOR MERGE TABLE COMPLIANCE IN 5.1".
sql/sql_base.cc:
  Changed code responsible for opening tables to ignore the fact
  that underlying tables of a MERGE table are missing, if this
  table is opened for CHECK/REPAIR TABLE.
  The absence of underlying tables in this case is now detected and
  appropriate error is reported at the point when child tables are
  attached. At this point we can produce full list of problematic
  child tables/errors to be returned as part of CHECK/REPAIR TABLE
  result-set.
storage/myisammrg/ha_myisammrg.cc:
  Changed myisammrg_attach_children_callback() to handle new
  situation, when during CHECK/REPAIR TABLE we do not report 
  error about missing child immediately when this fact is 
  discovered during open_tables() but postpone error-reporting
  till the time when children are attached. 
  Also this callback is now responsible for pushing an error
  mentioning problematic child table to the list of errors to 
  be reported by CHECK/REPAIR TABLE statements.
  Finally, since now myrg_attach_children() no longer relies on
  return value from callback to determine the end of the children
  list, callback no longer needs to set my_errno value and can
  be simplified.
  
  Changed myrg_print_wrong_table() to always report a problem
  with child table as an error and not as a warning. This makes
  reporting for different types of issues with child tables
  more consistent and compatible with 5.0 behavior.
storage/myisammrg/myrg_open.c:
  Changed code in myrg_attach_children() not to abort on the
  first problem with a child table when attaching children to
  parent MERGE table during CHECK/REPAIR TABLE statement 
  execution. This allows CHECK/REPAIR TABLE to report problems 
  about absence/wrong engine for all underlying tables as
  part of their result-set.
2011-07-22 16:31:10 +04:00
Inaam Rana
fce189fadb Merge from 5.1 the fix for Bug 12356373 2011-07-19 10:54:59 -04:00
Inaam Rana
aa3d29681e Bug 12356373 - PERFORMANCE REGRESSION FROM 5.1 TO 5.5 : GROUP BY:
The title of the bug is a little confusing. The actual fix is to
reintroduce random readahead inside InnoDB with a dynamic, global
switch innodb_random_read_ahead [default = off].

Approved by: Sunny Bains
rb://696
2011-07-19 10:37:37 -04:00
unknown
438d21189c Null Merge from mysql-5.1 with second fix for Bug#12637786
Bug#12637786 was fixed with rb:692 by marko.  But that fix has a remaining
bug.  It added this assert;
    ut_ad(ind_field->prefix_len);
before a section of code that assumes there is a prefix_len.  

The patch replaced code that explicitly avoided this with a check for
prefix_len.  It turns out that the purge thread can get to that assert
without a prefix_len because it does not use a row_ext_t* .
When UNIV_DEBUG is not defined, the affect of this is that the purge thread
sets the dfield->len to zero and then cannot find the entry in the index to
purge.  So secondary index entries remain unpurged.

This patch does not do the assert.  Instead, it uses
    'if (ind_field->prefix_len) {...}'
around the section of code that assumes a prefix_len.  This is the way the
patch I provided to Marko did it.

The test case is simply modified to do a sleep(10) in order to give the
purge thread a chance to run. Without the code change to row0row.c, this
modified testcase will assert if InnoDB was compiled with UNIV_DEBUG.
I tried to sleep(5), but it did not always assert.
2011-07-08 08:16:23 -05:00
unknown
6cc0f6a22b Bug#12637786 was fixed with rb:692 by marko. But that fix has a remaining
bug.  It added this assert;
    ut_ad(ind_field->prefix_len);
before a section of code that assumes there is a prefix_len.  

The patch replaced code that explicitly avoided this with a check for
prefix_len.  It turns out that the purge thread can get to that assert
without a prefix_len because it does not use a row_ext_t* .
When UNIV_DEBUG is not defined, the affect of this is that the purge thread
sets the dfield->len to zero and then cannot find the entry in the index to
purge.  So secondary index entries remain unpurged.

This patch does not do the assert.  Instead, it uses
    'if (ind_field->prefix_len) {...}'
around the section of code that assumes a prefix_len.  This is the way the
patch I provided to Marko did it.

The test case is simply modified to do a sleep(10) in order to give the
purge thread a chance to run. Without the code change to row0row.c, this
modified testcase will assert if InnoDB was compiled with UNIV_DEBUG.
I tried to sleep(5), but it did not always assert.
2011-07-07 16:29:30 -05:00
Davi Arnaut
71e0ff64f0 Bug#12727287: Maintainer mode compilation fails with gcc 4.6
GCC 4.6 has new -Wunused-but-set-variable flag, which is enabled
by -Wall, that causes GCC to emit a warning whenever a local variable
is assigned to, but otherwise unused (aside from its declaration).

Since the maintainer mode uses -Wall and -Werror, source code which
triggers these warnings will be rejected. That is, these warnings
become hard errors.

The solution is to fix the code which triggers these specific warnings.
In most of the cases, this is a welcome cleanup as code which triggers
this warning is probably dead anyway.

dbug/dbug.c:
  Unused but set.
libmysqld/lib_sql.cc:
  Length is not necessary as the converted error message is always
  null-terminated.
sql/item_func.cc:
  Make get_var_with_binlog private to this compilation unit.
  If a error was raised, do not attempt to evaluate the user
  variable as the statement execution will be interrupted
  anyway.
sql/mysqld.cc:
  Use a void expression to silence the warning. Avoids the use of
  macros that would make the code more unreadable than it already is.
sql/protocol.cc:
  Length is not necessary as the converted error message is always
  null-terminated. Remove unnecessary casts and assignment.
sql/sql_class.h:
  Function is only used in a single compilation unit.
sql/sql_load.cc:
  Only use the variable outside of EMBEDDED_LIBRARY.
storage/innobase/btr/btr0cur.c:
  Do not retrieve field, only the record length is being used.
storage/perfschema/pfs.cc:
  Use a void expression to silence the warning.
tests/mysql_client_test.c:
  Unused but set.
unittest/mysys/lf-t.c:
  Unused but set.
2011-07-07 08:22:43 -03:00
unknown
7d605ec45f Merge from mysql-5.5.14-release 2011-07-06 01:13:50 +02:00
Karen Langford
f6398a86dd Merge from mysql-5.1.58-release 2011-07-06 00:56:51 +02:00
Kent Boortz
e7c781e58b Updated/added copyright headers 2011-07-03 20:08:47 +02:00
Kent Boortz
027b5f1ed4 Updated/added copyright headers 2011-07-03 17:47:37 +02:00
Kent Boortz
68f00a5686 Updated/added copyright headers 2011-06-30 17:37:13 +02:00
Marko Mäkelä
eeb028bbc1 Bug#12637786 Wrong secondary index entries on CHAR and VARCHAR columns
row_build_index_entry(): In innodb_file_format=Barracuda
(ROW_FORMAT=DYNAMIC or ROW_FORMAT=COMPRESSED), a secondary index on a
full column can refer to a field that is stored off-page in the
clustered index record. Take that into account.

rb:692 approved by Jimmy Yang
2011-06-30 13:18:54 +03:00
Marko Mäkelä
e25fb73be2 Bug #12612184 BLOB debug code cleanup: Forgot an #if
around the declaration of trx_assert_recovered().
2011-06-29 16:48:41 +03:00
Marko Mäkelä
0f37ccb30f Bug #12612184 BLOB debug code cleanup:
Refactor the !rec_offs_any_extern relaxation in row_build().

trx_assert_active(trx_id): Assert that the given transaction is active.
(In the 5.1 built-in InnoDB, there is no trx->is_recovered field.)

trx_assert_recovered(trx_id): Assert that the given transaction is
active and has been recovered after a crash.

row_build(): Replace a bunch of code with an assertion that invokes
trx_assert_active() or trx_assert_recovered() and row_get_rec_trx_id().

row_get_trx_id_offset(): Make the function inlined. Remove the unused
parameter rec, and make all parameters const.

row_get_rec_trx_id(), row_get_rec_roll_ptr(): Make all parameters const.

rb:691 approved by Jimmy Yang
2011-06-29 09:57:15 +03:00
Marko Mäkelä
0c54d44fef Bug#12595087 - 61191: Question about page_zip_available (clean up page0zip.c)
page_zip_dir_elems(): New function, refactored from page_zip_dir_size().

page_zip_dir_size(): Use page_zip_dir_elems()

page_zip_dir_start_offs(): New function: Gets an offset to the
compressed page trailer (the dense page directory), including deleted
records (the free list)

page_zip_dir_start_low(page_zip, n_dense): Constness-preserving
wrapper macro for page_zip_dir_start_offs().

page_zip_dir_start(page_zip): Constness-preserving
wrapper macro for page_zip_dir_start_offs().

page_zip_decompress_node_ptrs(), page_zip_decompress_clust(): Replace
a formula with a fully equivalent page_zip_dir_start_low() call.

page_zip_write_rec(), page_zip_parse_write_node_ptr(),
page_zip_write_node_ptr(), page_zip_write_trx_id_and_roll_ptr(),
page_zip_clear_rec(): Replace a formula with an almost equivalent
page_zip_dir_start() call.
It is OK to replace page_dir_get_n_heap(page) with
page_dir_get_n_heap(page_zip->data), because
ut_ad(page_zip_header_cmp(page_zip, page)) or
page_zip_validate(page_zip, page) asserts that the
page headers are identical.

rb:687 approved by Jimmy Yang
2011-06-28 11:57:09 +03:00
Inaam Rana
e60e65052f Bug 12635227 - 61188: DROP TABLE EXTREMELY SLOW
approved by: Marko
rb://681

Coalescing of free buf_page_t descriptors can prove to be one severe
bottleneck in performance of compression. One such workload where it
hurts badly is DROP TABLE. This patch removes buf_page_t allocations
from buf_buddy and uses ut_malloc instead.
In order to further reduce overhead of colaescing we no longer attempt
to coalesce a block if the corresponding free_list is less than 16 in
size.
2011-06-17 16:20:20 -04:00
Vasil Dimov
0dfe86f53f Silence bogus compiler warning introduced in
marko.makela@oracle.com-20110616072721-8bo92ctixq6eqavr
2011-06-16 16:11:43 +03:00
Marko Mäkelä
b0fc27dc0a Bug #61341 buf_LRU_insert_zip_clean can be O(N) on LRU length
The buf_pool->zip_clean list is only needed for debugging, or for
recomputing buf_pool->page_hash when resizing the buffer pool. Buffer
pool resizing was never fully implemented. Remove the resizing code,
and define buf_pool->zip_clean only in debug builds.

buf_pool->zip_clean, buf_LRU_insert_zip_clean(): Enclose in
#if defined UNIV_DEBUG || UNIV_BUF_DEBUG.

buf_chunk_free(), buf_chunk_all_free(), buf_pool_shrink(),
buf_pool_page_hash_rebuild(), buf_pool_resize(): Remove (unreachable code).

rb:671 approved by Inaam Rana
2011-06-16 14:55:46 +03:00
Marko Mäkelä
e4aa6667a1 Bug#12595087 - 61191: Question about page_zip_available
There is an apparent problem with page_zip_clear_rec().
In btr_cur_optimistic_update() we do this:

	page_cur_delete_rec(page_cursor, index, offsets, mtr);
...
	rec = btr_cur_insert_if_possible(cursor, new_entry, 0/*n_ext*/, mtr);
	ut_a(rec); /* <- We calculated above the insert would fit */

The problem is that page_cur_delete_rec() could fill the modification
log while doing page_zip_clear_rec(), requiring recompression for the
btr_cur_insert_if_possible(). In a pathological case, the data could
fail to recompress.

page_zip_clear_rec(): Leave the page modification log alone. Only
clear the necessary fields.

rb:673 approved by Jimmy Yang
2011-06-16 14:22:12 +03:00
Marko Mäkelä
417a267927 Re-enable the debug assertions for Bug#12650861.
Replace UNIV_BLOB_NULL_DEBUG with UNIV_DEBUG||UNIV_BLOB_LIGHT_DEBUG. 
Fix known bogus failures.

btr_cur_optimistic_update(): If rec_offs_any_null_extern(), assert that
the current transaction is an incomplete transaction that is being
rolled back in crash recovery.

row_build(): If rec_offs_any_null_extern(), assert that the transaction
that last updated the record was recovered during crash recovery
(and will soon be rolled back).
2011-06-16 11:51:04 +03:00
Marko Mäkelä
5b4ceba58d Bug#12612184 Race condition after btr_cur_pessimistic_update()
btr_cur_compress_if_useful(), btr_compress(): Add the parameter ibool
adjust. If adjust=TRUE, adjust the cursor position after compressing
the page.

btr_lift_page_up(): Return a pointer to the father page.

BTR_KEEP_POS_FLAG: A new flag for btr_cur_pessimistic_update().

btr_cur_pessimistic_update(): If *big_rec != NULL and flags &
BTR_KEEP_POS_FLAG, keep the cursor positioned on the updated record.
Also, do not release the index tree x-lock if *big_rec != NULL.

btr_cur_mtr_commit_and_start(): Commits and restarts a
mini-transaction so that it will retain an x-lock on index->lock and
the page of the cursor. This is invoked when
btr_cur_pessimistic_update() returns *big_rec != NULL.

In all callers of btr_cur_pessimistic_update() that do not pass
BTR_KEEP_POS_FLAG, assert that *big_rec == NULL.

btr_cur_compress(): Unused function [in the built-in MySQL 5.1], remove.

page_rec_get_nth(): Return the nth record on the page (an inverse
function of page_rec_get_n_recs_before()). Refactored from
page_get_middle_rec().

page_get_middle_rec(): Invoke page_rec_get_nth().

page_cur_insert_rec_zip_reorg(): Make use of the page directory
shortcuts in page_rec_get_nth() instead of scanning the whole list of
records.

row_ins_clust_index_entry_by_modify(): Pass BTR_KEEP_POS_FLAG to
btr_cur_pessimistic_update().

row_ins_index_entry_low(): If row_ins_clust_index_entry_by_modify()
returns a big_rec, invoke btr_cur_mtr_commit_and_start() in order to
commit and start the mini-transaction without releasing the x-locks on
index->lock and the cursor page, and write the big_rec. Releasing the
page latch in mtr_commit() caused a race condition.

row_upd_clust_rec(): Pass BTR_KEEP_POS_FLAG to
btr_cur_pessimistic_update(). If it returns a big_rec, invoke
btr_cur_mtr_commit_and_start() in order to commit and start the
mini-transaction without releasing the x-locks on index->lock and the
cursor page, and write the big_rec. Releasing the page latch in
mtr_commit() caused a race condition.

sync_thread_add_level(): Add the parameter ibool relock. When TRUE,
bypass the latching order rules.

rw_lock_add_debug_info(): For nested X-lock requests, pass relock=TRUE
to sync_thread_add_level().

rb:678 approved by Jimmy Yang
2011-06-16 10:27:21 +03:00
Marko Mäkelä
a862937699 Introduce UNIV_BLOB_NULL_DEBUG for temporarily hiding Bug#12650861.
Some ut_a(!rec_offs_any_null_extern()) assertion failures are indicating
genuine BLOB bugs, others are bogus failures when rolling back incomplete
transactions at crash recovery. This needs more work, and until I get a
chance to work on it, other testing must not be disrupted by this.
2011-06-15 10:16:59 +03:00
Marko Mäkelä
98d527d3cb Merge a fix from mysql-5.5 to mysql-5.1:
revno 2995.37.209
revision id marko.makela@oracle.com-20110518120508-qhn7vz814vn77v5k
parent marko.makela@oracle.com-20110517121555-lmple24qzxqkzep4
timestamp: Wed 2011-05-18 15:05:08 +0300
message:
  Fix a bogus UNIV_SYNC_DEBUG failure in the fix of Bug #59641
  or Oracle Bug #11766513.

  trx_undo_free_prepared(): Do not acquire or release trx->rseg->mutex.
  This code is invoked in the single-threaded part of shutdown, therefore
  a mutex is not needed.
2011-06-14 08:40:32 +03:00
Marko Mäkelä
4412b5dab6 Disable a debug assertion that was added to track down Bug#12612184.
row_build(): The record may contain null BLOB pointers when the server
is rolling back an insert that was interrupted by a server crash.
2011-06-09 21:50:41 +03:00
Marko Mäkelä
6348b7375a BLOB instrumentation for Bug#12612184 Race condition in row_upd_clust_rec()
If UNIV_DEBUG or UNIV_BLOB_LIGHT_DEBUG is enabled, add
!rec_offs_any_null_extern() assertions, ensuring that records do not
contain null pointers to externally stored columns in inappropriate
places.

btr_cur_optimistic_update(): Assert !rec_offs_any_null_extern().
Incomplete records must never be updated or deleted. This assertion
will cover also the pessimistic route.

row_build(): Assert !rec_offs_any_null_extern(). Search tuples must
never be built from incomplete index entries.

row_rec_to_index_entry(): Assert !rec_offs_any_null_extern() unless
ROW_COPY_DATA is requested. ROW_COPY_DATA is used for
multi-versioning, and therefore it might be valid to copy the most
recent (uncommitted) version while it contains a null pointer to
off-page columns.

row_vers_build_for_consistent_read(),
row_vers_build_for_semi_consistent_read(): Assert !rec_offs_any_null_extern()
on all versions except the most recent one.

trx_undo_prev_version_build(): Assert !rec_offs_any_null_extern() on
the previous version.

rb:682 approved by Sunny Bains
2011-06-09 13:31:15 +03:00