Also addressed issues in bug #11745133, where we could mark a table
corrupted instead of crashing the server when found a corrupted buffer/page
if the table created with innodb_file_per_table on.
The innoDB global variable srv_lower_case_table_names is set to the value of lower_case_table_names declared in mysqld.h server in ha_innodb.cc. Since this variable can change at runtime, it is reset for each handler call to ::create, ::open, ::rename_table & ::delete_table.
But it is possible for tables to be implicitly opened before an explicit handler call is made when an engine is first started or restarted. I was able to reproduce that with the testcase in this patch on a version of InnoDB from 2 weeks ago. It seemed like the change buffer entries for the secondary key was getting put into pages after the restart. (But I am not sure, I did not write down the call stack while it was reproducing.) In the current code, the implicit open, which is actually a call to dict_load_foreigns(), does not occur with this testcase.
The change is to replace srv_lower_case_table_names by an interface function in innodb.cc that retrieves the server global variable when it is needed.
Remove the slot_no member of struct thr_local_struct.
enum srv_thread_type: Remove unused thread types.
srv_get_thread_type(): Unused function, remove.
thr_local_get_slot_no(), thr_local_set_slot_no(): Remove.
srv_thread_type_validate(), srv_slot_get_type(): New functions, for debugging.
srv_table_reserve_slot(): Return the srv_slot_t* directly. Do not create
thread-local storage.
srv_suspend_thread(): Get the srv_slot_t* as parameter. Return void;
the caller knows slot->event already.
srv_thread_has_reserved_slot(), srv_release_threads(): Assert
srv_thread_type_validate(type).
srv_init(): Use mem_zalloc() instead of mem_alloc(). Replace
srv_table_get_nth_slot(), because it now asserts that the kernel_mutex
is being held.
srv_master_thread(), srv_purge_thread(): Remember the slot from
srv_table_reserve_slot().
rb:629 approved by Inaam Rana
Bug #11766501: Multiple RBS break the get rseg with mininum trx_t::no code during purge
Bug# 59291 changes:
Main problem is that truncating the UNDO log at the completion of every
trx_purge() call is expensive as the number of rollback segments is increased.
We truncate after a configurable amount of pages. The innodb_purge_batch_size
parameter is used to control when InnoDB does the actual truncate. The truncate
is done once after 128 (or TRX_SYS_N_RSEGS iterations). In other words we
truncate after purge 128 * innodb_purge_batch_size. The smaller the batch
size the quicker we truncate.
Introduce a new parameter that allows how many rollback segments to use for
storing REDO information. This is really step 1 in allowing complete control
to the user over rollback space management.
New parameters:
i) innodb_rollback_segments = number of rollback_segments to use
(default is now 128) dynamic parameter, can be changed anytime.
Currently there is little benefit in changing it from the default.
Optimisations in the patch.
i. Change the O(n) behaviour of trx_rseg_get_on_id() to O(log n)
Backported from 5.6. Refactor some of the binary heap code.
Create a new include/ut0bh.ic file.
ii. Avoid truncating the rollback segments after every purge.
Related changes that were moved to a separate patch:
i. Purge should not do any flushing, only wait for space to be free so that
it only does purging of records unless it is held up by a long running
transaction that is preventing it from progressing.
ii. Give the purge thread preference over transactions when acquiring the
rseg->mutex during commit. This to avoid purge blocking unnecessarily
when getting the next rollback segment to purge.
Bug #11766501 changes:
Add the rseg to the min binary heap under the cover of the kernel mutex and
the binary heap mutex. This ensures the ordering of the min binary heap.
The two changes have to be committed together because they share the same
that fixes both issues.
rb://567 Approved by: Inaam Rana.
"rows examined" estimates". This change implements "innodb_stats_method"
with options of "nulls_equal", "nulls_unequal" and "null_ignored".
rb://553 approved by Marko
Check whether the master and purge thread are active after creating them. Do
not proceed until both threads have started. We do this by checking whether a
slot has been reserved by both the respective threads.
Add srv_thread_has_reserved_slot() returns slot no or ULINT_UNDEFINED.
rb://536 Approved by Jimmy
InnoDB does not attempt to handle lower_case_table_names == 2 when looking
up foreign table names and referenced table name. It turned that server
variable into a boolean and ignored the possibility of it being '2'.
The setting lower_case_table_names == 2 means that it should be stored and
displayed in mixed case as given, but compared internally in lower case.
Normally the server deals with this since it stores table names. But
InnoDB stores referential constraints for the server, so it needs to keep
track of both lower case and given names.
This solution creates two table name pointers for each foreign and referenced
table name. One to display the name, and one to look it up. Both pointers
point to the same allocated string unless this setting is 2. So the overhead
added is not too much.
Two functions are created in dict0mem.c to populate the ..._lookup versions
of these pointers. Both dict_mem_foreign_table_name_lookup_set() and
dict_mem_referenced_table_name_lookup_set() are called 5 times each.
Add new function os_cond_wait_timed(). Change the os_thread_sleep() calls
to timed conditional waits. Signal the background threads during the shutdown
phase so that we avoid waiting for the sleep to timeout thus saving some time.
rb://439 -- Approved by Jimmy Yang
This patch was originally developed by Vladislav Vaintroub.
The main changes are:
* Use TryEnterCriticalSection in os_fast_mutex_trylock().
* Use lightweight condition variables on Vista or later Windows;
but fall back to events on older Windows, such as XP.
This patch also fixes the following bugs:
bug# 52102 InnoDB Plugin shows performance drop compared to InnoDB
on Windows
bug# 53204 os_fastmutex_trylock is implemented incorrectly on Windows
rb://363 approved by Inaam Rana
and innodb_file_format_max two system variables. And this also fixes
bug #53654 after 2nd shutdown innodb_file_format_check attains strange
values.
rb://366 approved by Marko
in the code but they have nothing to do with the kernel mutex split code.
Some subsequent commits use the new functions. This patch has been tested
with: ./mtr --suite=innodb with UNIV_DEBUG and UNIV_SYNC_DEBUG enabled.
All tests were successful.
and also applying 5.1-ss6355
Detailed revision comments:
r6324 | jyang | 2009-12-17 06:54:24 +0200 (Thu, 17 Dec 2009) | 8 lines
branches/5.1: Fix bug #47814 - Diagnostics are frequently not
printed after a long lock wait in InnoDB. Separate out the
lock wait timeout check thread from monitor information
printing thread.
rb://200 Approved by Marko.
r6349 | marko | 2009-12-22 11:09:54 +0200 (Tue, 22 Dec 2009) | 3 lines
branches/5.1: lock_print_info_summary(): Remove a reference to
innobase_mysql_end_print_arbitrary_thd() that should have been
removed in r6347 when removing the function.
r6350 | marko | 2009-12-22 11:11:09 +0200 (Tue, 22 Dec 2009) | 1 line
branches/5.1: Remove an obsolete declaration of LOCK_thread_count.
-------------------------------------------------------------
revno: 2877
committer: Davi Arnaut <Davi.Arnaut@Sun.COM>
branch nick: 35164-6.0
timestamp: Wed 2008-10-15 19:53:18 -0300
message:
Bug#35164: Large number of invalid pthread_attr_setschedparam calls
Bug#37536: Thread scheduling causes performance degradation at low thread count
Bug#12702: Long queries take 100% of CPU and freeze other applications under Windows
The problem is that although having threads with different priorities
yields marginal improvements [1] in some platforms [2], relying on some
statically defined priorities (QUERY_PRIOR and WAIT_PRIOR) to play well
(or to work at all) with different scheduling practices and disciplines
is, at best, a shot in the dark as the meaning of priority values may
change depending on the scheduling policy set for the process.
Another problem is that increasing priorities can hurt other concurrent
(running on the same hardware) applications (such as AMP) by causing
starvation problems as MySQL threads will successively preempt lower
priority processes. This can be evidenced by Bug#12702.
The solution is to not change the threads priorities and rely on the
system scheduler to perform its job. This also enables a system admin
to increase or decrease the scheduling priority of the MySQL process,
if intended.
Furthermore, the internal wrappers and code for changing the priority
of threads is being removed as they are now unused and ancient.
1. Due to unintentional side effects. On Solaris this could artificially
help benchmarks as calling the priority changing syscall millions of
times is more beneficial than the actual setting of the priority.
2. Where it actually works. It has never worked on Linux as the default
scheduling policy SCHED_OTHER only accepts the static priority 0.
bzr branch mysql-5.1-performance-version mysql-trunk # Summit
cd mysql-trunk
bzr merge mysql-5.1-innodb_plugin # which is 5.1 + Innodb plugin
bzr rm innobase # remove the builtin
Next step: build, test fixes.
BUG#42101 - Race condition in innodb_commit_concurrency
Detailed revision comments:
r4994 | marko | 2009-05-14 15:04:55 +0300 (Thu, 14 May 2009) | 18 lines
branches/5.1: Prevent a race condition in innobase_commit() by ensuring
that innodb_commit_concurrency>0 remains constant at run time. (Bug #42101)
srv_commit_concurrency: Make this a static variable in ha_innodb.cc.
innobase_commit_concurrency_validate(): Check that innodb_commit_concurrency
is not changed from or to 0 at run time. This is needed, because
innobase_commit() assumes that innodb_commit_concurrency>0 remains constant.
Without this limitation, the checks for innodb_commit_concurrency>0
in innobase_commit() should be removed and that function would have to
acquire and release commit_cond_m at least twice per invocation.
Normally, innodb_commit_concurrency=0, and introducing the mutex operations
would mean significant overhead.
innodb_bug42101.test, innodb_bug42101-nonzero.test: Test cases.
rb://123 approved by Heikki Tuuri
1) BUG#43660 - SHOW INDEXES/ANALYZE does NOT update cardinality
for indexes of InnoDB table
Detailed revision comments:
r4699 | vasil | 2009-04-09 14:01:52 +0300 (Thu, 09 Apr 2009) | 15 lines
branches/5.1:
Fix Bug#43660 SHOW INDEXES/ANALYZE does NOT update cardinality for indexes
of InnoDB table
by replacing the PRNG that is used to pick random pages with a better
one.
This is based on r4670 but also adds a new configuration option and
enables the fix only if this option is changed. Please skip the present
revision when merging.
Approved by: Heikki (via email)
1) Buffer Pool size now defaults to 1GB and a minimum of 64MB
2) Log files are 3 by default and each 128MB in size
3) Removed innodb-file-io-threads config variable since no longer used
4) Set read io and write io threads to 8 by default for better default performance
5) Set log buffer size to 16 MB by default and minimum to 2MB
6) Set additional memory buffer pool to 8 MB and minimum to 2MB
7) Set max dirty percent to 75% and decreased to 99% to never allow a completely dirty buffer pool
8) Increased io capacity to 200 for a good default
Detailed description of changes:
r2902 | vasil | 2008-10-28 12:10:25 +0200 (Tue, 28 Oct 2008) | 10 lines
branches/5.1:
Fix Bug#38189 innodb_stats_on_metadata missing
Make the variable innodb_stats_on_metadata visible to the users and
also settable at runtime. Previously it was only "visible" as a command
line startup option to mysqld.
Approved by: Marko (https://svn.innodb.com/rb/r/36)
parameter innodb_thread_concurrency_timer_based is used to
get this new feature (it is set by default). The new feature
is only available on platforms where atomic instructions are
available.
Bug#36600 and Bug#36793:
Bug #36600 SHOW STATUS takes a lot of CPU in buf_get_latched_pages_number
Fix by removing the Innodb_buffer_pool_pages_latched variable from SHOW
STATUS output in non-UNIV_DEBUG compilation.
Bug #36793 rpl_innodb_bug28430 fails on Solaris
This is a back port from branches/zip. This code has been tested on a
big-endian machine too.
------------------------------------------------------------------------
r2361 | sunny | 2008-03-12 09:08:09 +0200 (Wed, 12 Mar 2008) | 3 lines
Changed paths:
M /branches/5.1/include/srv0srv.h
M /branches/5.1/os/os0file.c
M /branches/5.1/srv/srv0srv.c
M /branches/5.1/srv/srv0start.c
branches/5.1: Remove the innodb_flush_method fdatasync option since it was
not being used and there was a potential it could mislead users.
------------------------------------------------------------------------
r2367 | marko | 2008-03-17 10:23:03 +0200 (Mon, 17 Mar 2008) | 5 lines
Changed paths:
M /branches/5.1/handler/ha_innodb.cc
branches/5.1: ha_innobase::check_if_incompatible_data(): Check
HA_CREATE_USED_ROW_FORMAT before comparing row_type. Previously,
the comparison was incorrectly guarded by the presence of an
AUTO_INCREMENT attribute.
------------------------------------------------------------------------
r2374 | vasil | 2008-03-18 09:35:30 +0200 (Tue, 18 Mar 2008) | 11 lines
Changed paths:
M /branches/5.1/dict/dict0dict.c
A /branches/5.1/mysql-test/innodb_bug35220.result
A /branches/5.1/mysql-test/innodb_bug35220.test
branches/5.1:
Fix Bug#35220 ALTER TABLE too picky on reserved word "foreign".
In ALTER TABLE, change the internal parser to search for
``FOREIGN[[:space:]]'' instead of only ``FOREIGN'' when parsing
ALTER TABLE ... DROP FOREIGN KEY ...; otherwise it could be mistaken
with ALTER TABLE ... DROP foreign_col;
Approved by: Heikki
------------------------------------------------------------------------
r2379 | vasil | 2008-03-19 18:48:00 +0200 (Wed, 19 Mar 2008) | 10 lines
Changed paths:
M /branches/5.1/os/os0file.c
branches/5.1:
Fix Bug#34823:
fsync() occasionally returns ENOLCK and causes InnoDB to restart mysqld
Create a wrapper to fsync(2) that retries the operation if the error is
ENOLCK. Use that wrapper instead of fsync(2).
Approved by: Heikki
------------------------------------------------------------------------
r2380 | sunny | 2008-03-21 05:03:56 +0200 (Fri, 21 Mar 2008) | 9 lines
Changed paths:
M /branches/5.1/include/trx0undo.h
M /branches/5.1/trx/trx0trx.c
M /branches/5.1/trx/trx0undo.c
branches/5.1: Fix for Bug# 35352. We've added a heuristic that checks
the size of the UNDO slots cache lists (insert and upate). If either of
cached lists has more than 500 entries then we add any UNDO slots that are
freed, to the common free list instead of the cache list, this is to avoid
the case where all the free slots end up in only one of the lists on startup
after a crash.
Tested with test case for 26590 and passes all mysql-test(s).
------------------------------------------------------------------------
r2383 | vasil | 2008-03-26 09:35:22 +0200 (Wed, 26 Mar 2008) | 4 lines
Changed paths:
M /branches/5.1/include/row0mysql.h
branches/5.1:
Fix typo in comment.
------------------------------------------------------------------------
r2384 | vasil | 2008-03-26 18:26:54 +0200 (Wed, 26 Mar 2008) | 20 lines
Changed paths:
A /branches/5.1/mysql-test/innodb_bug34300.result
A /branches/5.1/mysql-test/innodb_bug34300.test
M /branches/5.1/row/row0sel.c
branches/5.1:
Fix Bug#34300 Tinyblob & tinytext fields currupted after export/import and alter in 5.1
Copy the BLOB fields, that are stored internally, to a safe place
(prebuilt->blob_heap) when converting a row from InnoDB format to
MySQL format in row_sel_store_mysql_rec().
The bug was introduced in:
------------------------------------------------------------------------
r587 | osku | 2006-05-23 15:35:58 +0300 (Tue, 23 May 2006) | 3 lines
Optimize BLOB selects by using prebuilt->blob_heap directly instead of first
reading BLOB data to a temporary heap and then copying it to
prebuilt->blob_heap.
------------------------------------------------------------------------
Approved by: Heikki
------------------------------------------------------------------------
r2386 | vasil | 2008-03-27 07:45:02 +0200 (Thu, 27 Mar 2008) | 22 lines
Changed paths:
M /branches/5.1/mysql-test/innodb.result
branches/5.1:
Merge change from MySQL (this fixes the failing innodb test):
ChangeSet@1.1810.3601.4, 2008-02-07 02:33:21+04:00, gshchepa@host.loc +9 -0
Fixed bug#30059.
Server handles truncation for assignment of too-long values
into CHAR/VARCHAR/TEXT columns in a different ways when the
truncated characters are spaces:
1. CHAR(N) columns silently ignore end-space truncation;
2. TEXT columns post a truncation warning/error in the
non-strict/strict mode.
3. VARCHAR columns always post a truncation note in
any mode.
Space truncation processing has been synchronised over
CHAR/VARCHAR/TEXT columns: current behavior of VARCHAR
columns has been propagated as standard.
Binary-encoded string/BLOB columns are not affected.
------------------------------------------------------------------------
r2387 | vasil | 2008-03-27 08:49:05 +0200 (Thu, 27 Mar 2008) | 8 lines
Changed paths:
M /branches/5.1/row/row0sel.c
branches/5.1:
Check whether *trx->mysql_query_str is != NULL in addition to
trx->mysql_query_str. This adds more safety.
This may or may not fix Bug#35226 RBR event crashes slave.
------------------------------------------------------------------------
Add innodb_stats_on_metadata option, which enables gathering
index statistics when processing metadata commands such as
SHOW TABLE STATUS. Default behavior of the server does not
change (this option is enabled by default).
Fixed BUG#19542 "InnoDB doesn't increase the Handler_read_prev couter".
Fixed BUG#19609 "Case sensitivity of innodb_data_file_path gives stupid error".
Fixed BUG#19727 "InnoDB crashed server and crashed tables are ot recoverable".
Also:
* Remove remnants of the obsolete concept of memoryfixing tables and indexes.
* Remove unused dict_table_LRU_trim().
* Remove unused 'trx' parameter from dict_table_get_on_id_low(),
dict_table_get(), dict_table_get_and_increment_handle_count().
* Add a normal linked list implementation.
* Add a work queue implementation.
* Add 'level' parameter to mutex_create() and rw_lock_create().
Remove mutex_set_level() and rw_lock_set_level().
* Rename SYNC_LEVEL_NONE to SYNC_LEVEL_VARYING.
* Add support for bound ids in InnoDB's parser.
* Define UNIV_BTR_DEBUG for enabling consistency checks of
FIL_PAGE_NEXT and FIL_PAGE_PREV when accessing sibling
pages of B-tree indexes.
btr_validate_level(): Check the validity of the doubly linked
list formed by FIL_PAGE_NEXT and FIL_PAGE_PREV.
* Adapt InnoDB to the new tablename to filename encoding in MySQL 5.1.
ut_print_name(), ut_print_name1(): Add parameter 'table_id' for
distinguishing names of tables from other identifiers.
New: innobase_convert_from_table_id(), innobase_convert_from_id(),
innobase_convert_from_filename(), innobase_get_charset.
dict_accept(), dict_scan_id(), dict_scan_col(), dict_scan_table_name(),
dict_skip_word(), dict_create_foreign_constraints_low(): Add
parameter 'cs' so that isspace() can be replaced with my_isspace(),
whose operation depends on the connection character set.
dict_scan_id(): Convert identifier to UTF-8.
dict_str_starts_with_keyword(): New extern function, to replace
dict_accept() in row_search_for_mysql().
mysql_get_identifier_quote_char(): Replaced with innobase_print_identifier().
ha_innobase::create(): Remove the thd->convert_strin() call. Pass the
statement to InnoDB in the connection character set and let InnoDB
convert the identifier to UTF-8.
* Add max_row_size to dict_table_t.
* btr0cur.c
btr_copy_externally_stored_field(): Only set the 'offset' variable
when needed.
* buf0buf.c
buf_page_io_complete(): Write to the error log if the page number or
the space id o the disk do not match those in memory. Also write to
the error log if a page was read from the doublewrite buffer. The
doublewrite buffer should be only read by the lower-level function
fil_io() at database startup.
* dict0dict.c
dict_scan_table_name(): Remove fallback to differently encoded name
when the table is not found. The encoding is handled at a higher level.
* ha_innodb.cc
Increment statistic counter in ha_innobase::index_prev() (bug 19542).
Add innobase_convert_string wrapper function and a new file
ha_prototypes.h.
innobase_print_identifier(): Remove TODO comment before calling
get_quote_char_for_identifier(). That function apparently assumes
the identifier to be encoded in UTF-8.
* ibuf0ibuf.c|h
ibuf_count_get(), ibuf_counts[], ibuf_count_inited(): Define these
only #ifdef UNIV_IBUF_DEBUG. Previously, when compiled without
UNIV_IBUF_DEBUG, invoking ibuf_count_get() would crash InnoDB.
The function is only being called #ifdef UNIV_IBUF_DEBUG.
* innodb.result
Adjust the results for changes in the foreign key error messages.
* mem0mem.c|h
New: mem_heap_dup(), mem_heap_printf(), mem_heap_cat().
* os0file.c
Check the page trailers also after writing to disk. This improves
chances of diagnosing bug 18886.
os_file_check_page_trailers(): New function for checking that the
two copies of the LSN stamped on the page match.
os_aio_simulated_handle(): Call os_file_check_page_trailers()
before and after os_file_write().
* row0mysql.c
Move trx_commit_for_mysql(trx) calls before calls to
row_mysql_unlock_data_dictionary(trx) (bug 19727).
* row0sel.c
row_fetch_print(): Handle SQL NULL values without crashing.
row_sel_store_mysql_rec(): Remove useless call to rec_get_nth_field
when handling an externally stored column.
Fetch externally stored fields when using InnoDB's internal SQL
parser.
Optimize BLOB selects by using prebuilt->blob_heap directly instead
of first reading BLOB data to a temporary heap and then copying it
to prebuilt->blob_heap.
* srv0srv.c
srv_master_thread(): Remove unreachable code.
* srv0start.c
srv_parse_data_file_paths_and_sizes(): Accept lower-case 'm' and
'g' as abbreviations of megabyte and gigabyte (bug 19609).
srv_parse_megabytes(): New fuction.
* ut0dbg.c|h
Implement InnoDB assertions (ut_a and ut_error) with abort() when
the code is compiled with GCC 3 or later on other platforms than
Windows or Netware. Also disable the variable ut_dbg_stop_threads
and the function ut_dbg_stop_thread() i this case, unless
UNIV_SYC_DEBUG is defined. This should allow the compiler to
generate more compact code for assertions.
* ut0list.c|h
Add ib_list_create_heap().