CHAR(n) in ROW_FORMAT=REDUNDANT tables is always fixed-length
(n*mbmaxlen bytes), but in the temporary file it is variable-length
(n*mbminlen to n*mbmaxlen bytes) for variable-length character sets,
such as UTF-8.
The temporary file format used during index creation and online ALTER
TABLE is based on ROW_FORMAT=COMPACT. Thus, it should use the
variable-length encoding even if the base table is in
ROW_FORMAT=REDUNDNAT.
dtype_get_fixed_size_low(): Replace an assertion-like check with a
debug assertion.
rec_init_offsets_comp_ordinary(), rec_convert_dtuple_to_rec_comp():
Make this an inline function. Replace 'ulint extra' with 'bool temp'.
rec_get_converted_size_comp_prefix_low(): Renamed from
rec_get_converted_size_comp_prefix(), and made inline. Add the
parameter 'bool temp'. If temp=true, do not add REC_N_NEW_EXTRA_BYTES.
rec_get_converted_size_comp_prefix(): Remove the comment about
dict_table_is_comp(). This function is only to be called for other
than ROW_FORMAT=REDUNDANT records.
rec_get_converted_size_temp(): New function for computing temporary
file record size. Omit REC_N_NEW_EXTRA_BYTES from the sizes.
rec_init_offsets_temp(), rec_convert_dtuple_to_temp(): New functions,
for operating on temporary file records.
rb:1559 approved by Jimmy Yang
When master and slave have different schemas, in particular different
AUTO_INCREMENT columns, INSERT_ID events logged for a given table on
master may be applied to a different table on slave on SBR, e.g.:
master has one table (t1) with one auto-inc column and another table
(t2) without auto-inc column, on slave t1 does not have auto-inc
column (despite having the same columns) and t2 has a auto-inc
column. The INSERT_ID that is intended for t1, since t1 on slave
doesn't have auto-inc column is used on t2, causing consistency
problems.
To fix this incorrect behaviour, auto-inc interval allocation via
INSERT_ID is made effectively terminated at the end of top-level
statements on slave and binlog replay.
Problem description: Incorrect key file. Key file is corrupted,
while reading the keys from the file. The problem here is that
keyseg->start (which should point to the beginning of a field)
is pointing beyond total record length.
Fix: If keyseg->start is greater than total record length then
return error.
btr_lift_page_up() writes wrong page number (different by -1) for upper than father page.
But in almost all of the cases, the father page should be root page, no upper
pages. It is very rare path.
In addition the leaf page should not be lifted unless the father page is root.
Because the branch pages should not become the leaf pages.
rb://1336 approved by Marko Makela.
btr_lift_page_up() writes wrong page number (different by -1) for upper than father page.
But in almost all of the cases, the father page should be root page, no upper
pages. It is very rare path.
In addition the leaf page should not be lifted unless the father page is root.
Because the branch pages should not become the leaf pages.
rb://1336 approved by Marko Makela.
Problem description: Corrupt key file for the table. Size of the
key is greater than the maximum specified size. This results in
the overflow of the key buffer while reading the key from key
file.
Fix: If size of key is greater than the maximum size it returns
an error before writing it into the key buffer. Gives error as
corrupt file but no stack overflow.
When a new primary key is added to an InnoDB table, then the following
steps are taken by InnoDB plugin:
. let t1 be the original table.
. a temporary table t1@00231 will be created by cloning t1.
. all data will be copied from t1 to t1@00231.
. rename t1 to t1@00232.
. rename t1@00231 to t1.
. drop t1@00232.
The rename and drop operations involve file operations. But file operations
cannot be rolled back. So in row_merge_rename_tables(), just after doing data
dictionary update and before doing any file operations, generate redo logs
for file operations and commit the transaction. This will ensure that any
crash after this commit, the table is still recoverable by moving .ibd and
.frm files. Manual recovery is required.
During recovery, the rename file operation redo logs are processed.
Previously this was being ignored.
rb://1460 approved by Marko Makela.
TABLE DATA IF DUMPS MYSQL DATABA
Problem: If mysqldump is run without --events (or with --skip-events)
it will not dump the mysql.event table's data. This behaviour is inconsistent
with that of --routines option, which does not affect the dumping of
mysql.proc table. According to the Manual, --events (--skip-events) defines,
if the Event Scheduler events for the dumped databases should be included
in the mysqldump output and this has nothing to do with the mysql.event table
itself.
Solution: A warning has been added when mysqldump is used without --events
(or with --skip-events) and a separate patch with the behavioral change
will be prepared for 5.6/trunk.
TABLE DATA IF DUMPS MYSQL DATABA
Problem: If mysqldump is run without --events (or with --skip-events)
it will not dump the mysql.event table's data. This behaviour is inconsistent
with that of --routines option, which does not affect the dumping of
mysql.proc table. According to the Manual, --events (--skip-events) defines,
if the Event Scheduler events for the dumped databases should be included
in the mysqldump output and this has nothing to do with the mysql.event table
itself.
Solution: A warning has been added when mysqldump is used without --events
(or with --skip-events) and a separate patch with the behavioral change
will be prepared for 5.6/trunk.
THREAD POOLING STRESS TEST
PROBLEM:
Connection stress tests which consists of concurrent
kill connections interleaved with mysql ping queries
cause the mysqld server which uses thread pool scheduler
to crash.
FIX:
Killing a connection involves shutdown and close of client
socket and this can cause EPOLLHUP(or EPOLLERR) events to be
to be queued and handled after disarming and cleanup of
of the connection object (THD) is being done.We disarm the
the connection by modifying the epoll mask to zero which
ensure no events come and release the ownership of waiting
thread that collect events and then do the cleanup of THD.
object.As per the linux kernel epoll source code (
http://lxr.linux.no/linux+*/fs/eventpoll.c#L1771), EPOLLHUP
(or EPOLLERR) can't be masked even if we set EPOLL mask
to zero. So we disarm the connection and thus prevent
execution of any query processing handler/queueing to
client ctx. queue by removing the client fd from the epoll
set via EPOLL_CTL_DEL. Also there is a race condition which
involve the following threads:
1) Thread X executing KILL CONNECTION Y and is in THD::awake
and using mysys_var (holding LOCK_thd_data).
2) Thread Y in tp_process_event executing and is being killed.
3) Thread Z receives KILL flag internally and possible call
the tp_thd_cleanup function which set thread session variable
and changing mysys_var.
The fix for the above race is to set thread session variable
under LOCK_thd_data.
We also do not call THD::awake if we found the thread in the
thread list that is to be killed but it's KILL_CONNECTION flag
set thus avoiding any possible concurrent cleanup. This patch
is approved by Mikael Ronstrom via email review.
The patch "mysql-chain-certs.patch" needs to be adapted
to code changes in "vio/viosslfactories.c" which were
done in MySQL 5.5.
Then, the patch can be re-enabled in the spec file.
When a new primary key is added to an InnoDB table, then the following
steps are taken by InnoDB plugin:
. let t1 be the original table.
. a temporary table t1@00231 will be created by cloning t1.
. all data will be copied from t1 to t1@00231.
. rename t1 to t1@00232.
. rename t1@00231 to t1.
. drop t1@00232.
The rename and drop operations involve file operations. But file operations
cannot be rolled back. So in row_merge_rename_tables(), just after doing data
dictionary update and before doing any file operations, generate redo logs
for file operations and commit the transaction. This will ensure that any
crash after this commit, the table is still recoverable by moving .ibd and
.frm files. Manual recovery is required.
During recovery, the rename file operation redo logs are processed.
Previously this was being ignored.
rb://1460 approved by Marko Makela.
Analysis
---------
my_stat() calls stat() and if the stat() call fails we try to set
the variable my_errno which is actually a thread specific data .
We try to get the address of this thread specific data using
my_pthread_getspecifc(),but for the purge thread we have not defined
any thread specific data so it returns null and when dereferencing
null we get a segmentation fault.
init_available_charsets() seen in the core stack is invoked
through pthread_once() .pthread_once is used for one time
initialization.Since free_charsets() is called before innodb plugin
shutdown ,purge thread calls init_avaliable_charsets() which leads
to the crash.
Fix
---
Call free_charsets() after the innodb plugin shutdown,since purge
threads are still using the charsets.
Analysis
---------
my_stat() calls stat() and if the stat() call fails we try to set
the variable my_errno which is actually a thread specific data .
We try to get the address of this thread specific data using
my_pthread_getspecifc(),but for the purge thread we have not defined
any thread specific data so it returns null and when dereferencing
null we get a segmentation fault.
init_available_charsets() seen in the core stack is invoked
through pthread_once() .pthread_once is used for one time
initialization.Since free_charsets() is called before innodb plugin
shutdown ,purge thread calls init_avaliable_charsets() which leads
to the crash.
Fix
---
Call free_charsets() after the innodb plugin shutdown,since purge
threads are still using the charsets.
A change to "vio/viosslfactories.c" in August, 2012,
broke a patch which is to be applied during the build
of ULN RPMs.
The patch file is
"packaging/rpm-uln/mysql-chain-certs.patch"
This change bypasses the problem by not trying to apply
the patch.
This is a regression and must be fixed, not bypassed.
VARIABLES
Analysis:
-------------
After executing the query, new value of the user defined
variables are set in the function "select_dumpvar::send_data".
"select_dumpvar::send_data" first calls function
"Item_func_set_user_var::save_item_result()". This function
checks the nullness of the Item_field passed as parameter
to it and saves it. The nullness of item is stored with
arg[0]'s null_value flag. Then "select_dumpvar::send_data" calls
"Item_func_set_user_var::update()" which notices null
result that was saved and calls "Item_func_set_user_var::
update_hash". But here null_value is not set and args[0]
is different from that given to function "Item_func_set_user_var::
set_item_result()". This causes "Item_func_set_user_var::
update_hash" function to believe that its getting non-null value.
"user_var_entry::length" set to 0 and hence "user_var_entry::value"
is made to point to extra_area allocated in "user_var_entry".
And "Item_func_set_user_var::update_hash" tries to write
at memory beyond extra_area for result type DECIMAL. Because of
this invalid write issue is reported by Valgrind.
Before this bug was introduced, we avoided this problem by
creating "Item_func_set_user_var" object with the same
Item_field as arg[0] and as parameter to
Item_func_set_user_var::save_item_result(). But now
they are refering to different args[0]. Because of this
null_value flag set in parameter Item_field in function
"Item_func_set_user_var::save_item_result()" is not
reflected in "Item_func_set_user_var" object.
Fix:
------------
This issue is reported on versions 5.5.24. Issue does not exists
in 5.5.23, 5.1, 5.6 and trunk.
This issue was introduced by
revid:georgi.kodinov@oracle.com-20120309130449-82e3bs5v3et1x0ef (fix for
bug #12408412), which was pushed into 5.5 and later releases. This patch
has later been reversed in 5.6 and trunk by
revid:norvald.ryeng@oracle.com-20121010135242-xj34gg73h04hrmyh (fix for
bug #14664077). Backported this patch in 5.5 also to fix this issue.
sql/item_func.cc:
here unsigned value is converted to signed value.
sql/item_func.h:
last_insert_id() gives an auto_incremented value which can be
positive only,so defined it as a unsigned longlong sets the
unsigned_flag to 1.
Brief description: After insert some rows to MEMORY table with HASH key some
rows can't be deleted in one step.
Problem Analysis/solution: info->current_ptr will have the information about the
current hash pointer from where we can traverse to the list to get all the
remaining tuples.
In hp_delete_key we are updating info->current_ptr with the last_pos based on
the flag parameter(which is the keydef and last index are same). As part of the
fix we are making it to zero only when the code flow reaches to the end of the
function hp_delete_key() it means that the next record which has to get deleted
will be at the starting of the list so, that in the next call to
read record(heap_rnext()) will take line number 100 path instead of 102 path,
please see the below code in file hp_rnext.c, function heap_rnext().
99 else if (!info->current_ptr) /* Deleted or first call */
100 pos= hp_search(info, keyinfo, info->lastkey, 0);
101 else
102 pos= hp_search(info, keyinfo, info->lastkey, 1);
with that change the hp_search() will update the info->current_ptr with the
record which needs to be deleted.
storage/heap/hp_delete.c:
In heap_delete_key() function we are making info->current_ptr to 0 if
flag is enabled.
PROBLEM
-------
optimize on partiton will recreate the whole table
instead of just partition.
ANALYSIS
--------
At present innodb doesn't support optimize option ,so we do a rebuild of the
whole table and then call analyze() on the table.Presently for any optimize()
option (on table or partition) we display the following info to the user
"Table does not support optimize, doing recreate + analyze instead".
FIX
---
It was decided for GA versions(5.1 and 5.5) whenever the user tries to
optimize a partition(s) we will will display the following info the user
"Table does not support optimize on partitions.
All partitions will be rebuilt and analyzed."
Earlier partitions were not analyzed.Now all partitions will be analyzed.
If the user wants to optimize the whole table ,we will display the
previous info to the user. i.e
"Table does not support optimize, doing recreate + analyze instead"
For 5.6+ versions we will raise a new bug to support optimize() options
in innodb.
PROBLEM
-------
optimize on partiton will recreate the whole table
instead of just partition.
ANALYSIS
--------
At present innodb doesn't support optimize option ,so we do a rebuild of the
whole table and then call analyze() on the table.Presently for any optimize()
option (on table or partition) we display the following info to the user
"Table does not support optimize, doing recreate + analyze instead".
FIX
---
It was decided for GA versions(5.1 and 5.5) whenever the user tries to
optimize a partition(s) we will will display the following info the user
"Table does not support optimize on partitions.
All partitions will be rebuilt and analyzed."
Earlier partitions were not analyzed.Now all partitions will be analyzed.
If the user wants to optimize the whole table ,we will display the
previous info to the user. i.e
"Table does not support optimize, doing recreate + analyze instead"
For 5.6+ versions we will raise a new bug to support optimize() options
in innodb.
- improve the logic that decides when ndbcluster should be enabled and the extra
test suites for MySQL Cluster should be added. Should be consistent and logical now ;)
Problem description:
mysql server crashes when we run repair table on currupted table.
Analysis:
The problem with this bug seem to be key_reflength out of bounds
(186 according to debugger). We read this value from meta-data
segment of .MYI file while doing mi_open().
If you look into _mi_kpointer() you can see that the upper limit
for key_reflength is 7.
Solution:
In mi_open() there is a line like:
if (share->base.keystart > 65535 || share->base.rec_reflength > 8)
we should verify key_reflength here as well.
DISABLE AND ENABLED DURING DDL OPERATION
PROBLEM: Same thread trying to acquire the same mutex
second time leads to hang/server crash.
While [un]installing audit_log plugin
a thread acquires the LOCK_plugin mutex
and after successful initialization tries
to write in mysql.plugin table. It holds
this mutex for a long time. If some how
plugin table is corrupted then a write to
plugin table will throw an error, thread try
to log this error in the audit_log plugin,
doing so it tries to acquire the mutex
again and results is server hang/crash.
SOLUTION: Releasing the LOCK_plugin mutex before
writing in mysql.plugin table. We dont
need to hold this mutex as thread already
acquired a TL_WRITE lock on mysql.plugin
table.