The problem was that the THD::db_access variable was not restored
after a database switch in the stored-routine-execution code.
The fix is to restore THD::db_access in this case.
Unfortunately, this fix requires additional changes, because
prepare_schema_table(), called at the parse stage, checked privileges.
That was wrong according to our design, but the flaw hadn't shown up
so far because it was masked. All privilege checks must be done at the
execution stage in order to be compatible with prepared statements and
stored routines. So, this patch also changes prepare_schema_table() to
move the checks to the execution phase.
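A hedged sketch of the scenario (user, database and routine names are
made up):

  -- assume the current user has different privileges on db1 and db2
  USE db1;
  CALL db2.p1();  -- execution switches the current database to db2
  -- before the fix, THD::db_access kept db2's access rights after
  -- the switch back, so later checks against db1 used wrong rights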
execution breaks replication.
When a stored routine is executed, we switch the current
database to the database in which the routine was created.
When the stored routine finishes, we switch back to the
original database.
The problem was that if the original database no longer existed
after routine execution, we raised an error.
The fix is to report a warning, and switch to the NULL database.
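A minimal sketch of the fixed behaviour (database and routine names
are made up):

  CREATE DATABASE d1;
  CREATE DATABASE d2;
  CREATE PROCEDURE d2.p1() DROP DATABASE d1;
  USE d1;
  CALL d2.p1();       -- d1, the original database, is gone afterwards
  SHOW WARNINGS;      -- now a warning instead of an error
  SELECT DATABASE();  -- NULL: we switched to the NULL database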
The OPTION_STATUS_NO_TRANS_UPDATE bit of thd->options was not restored
at the end of an SF() invocation when SF() modified a non-transactional
table.
As a result of this artifact it was not possible to detect whether
there had been any side effects when the top-level query ended.
If the top-level query itself modified no table and the bit was lost,
there would be no binlogging.
Fixed by preserving the bit inside the thd->no_trans_update struct.
The struct aggregates two bool flags, stmt and all, telling whether
the current statement and the current transaction modified any
non-transactional table; they are reset at the end of the statement
and of the transaction, respectively.
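A sketch of the kind of statement affected (names are made up; t1 is
assumed to use a non-transactional engine):

  CREATE TABLE t1 (a INT) ENGINE=MyISAM;
  CREATE FUNCTION f1() RETURNS INT
  BEGIN
    INSERT INTO t1 VALUES (1);  -- side effect on a non-ta table
    RETURN 1;
  END;
  -- the top-level query modifies no table itself; only the (lost)
  -- bit could tell that f1() had a non-transactional side effect
  SELECT f1();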
TABLE ... WRITE".
Memory and CPU hogging occurred when a connection that had to wait for
a table lock was serviced by a thread that had previously serviced a
connection that was killed (note that connections can reuse threads if
the thread cache is enabled).
One possible scenario which exposed this problem was when a thread
that provided a binlog dump to a replication slave was
implicitly/automatically killed when the same slave reconnected and
started pulling data through a different thread/connection.
The problem also occurred when one killed a particular query in a
connection (using KILL QUERY) and this connection later had to wait
for some table lock.
This problem was caused by the fact that the thread-specific
mysys_var::abort variable, which indicates that waiting operations on
the mysys layer should be aborted (this includes waiting for table
locks), was set by the kill operation but was never reset back. So
this value was "inherited" by the following statements or even by
other connections (which reused the same physical thread). Such a
discrepancy between this variable and the THD::killed flag broke logic
on the SQL layer and caused CPU and memory hogging.
This patch tries to fix this problem by properly resetting this member.
There is no test case associated with this patch since it is hard to
test for memory/CPU hogging conditions in our test suite.
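A rough reproduction outline (connection numbering is illustrative):

  -- connection 1:
  CREATE TABLE t1 (a INT);
  LOCK TABLES t1 WRITE;
  -- connection 2: run a long statement; then, from connection 3:
  KILL QUERY <connection 2 id>;
  -- connection 2, next statement: it has to wait for the table lock,
  -- and before the fix the wait could hog CPU/memory because
  -- mysys_var::abort was still set from the earlier KILL
  SELECT * FROM t1;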
differences in tables
Certain merge tables were wrongly reported as having an incorrect
definition:
- Some fields that are 1 byte long (e.g. TINYINT, CHAR(1)) might
be internally cast (in certain cases) to a different type on the
storage engine layer. (affects 4.1 and up)
- If the tables in a merge (and the MERGE table itself) had a short
VARCHAR column (less than 4 bytes) and at least one (but not all) of
the tables was ALTERed (even to an identical table: ALTER TABLE xxx
ENGINE=yyy), the table definitions went out of sync. (affects 4.1 only)
This is fixed by relaxing the check for underlying table conformance
and by setting the field type to FIELD_TYPE_STRING when a table with a
VARCHAR column shorter than 4 bytes is created.
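An illustrative scenario for the second case (table names are made
up):

  CREATE TABLE t1 (c VARCHAR(2)) ENGINE=MyISAM;
  CREATE TABLE t2 (c VARCHAR(2)) ENGINE=MyISAM;
  CREATE TABLE m1 (c VARCHAR(2)) ENGINE=MERGE UNION=(t1, t2);
  ALTER TABLE t1 ENGINE=MyISAM;  -- "identical" ALTER of one child only
  SELECT * FROM m1;  -- was wrongly rejected: incorrect definition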
fixes).
The legend: on a replication slave, in case trigger creation
was filtered out by application of a replicate-do-table/
replicate-ignore-table rule, the parsed definition of the trigger was
not cleaned up properly. The LEX::sphead member was left around and
leaked memory. Until the actual implementation of support for
replicate-ignore-table rules for triggers by the patch for Bug 24478,
it was never the case that "case SQLCOM_CREATE_TRIGGER"
was not executed once a trigger was parsed,
so the deletion of lex->sphead there worked and the memory did not
leak. The scenario is sketched below.
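Schematically (the filter and all names are illustrative):

  -- slave option: --replicate-ignore-table=db1.t1
  -- statement arriving from the master:
  CREATE TRIGGER db1.trg1 BEFORE INSERT ON db1.t1
    FOR EACH ROW SET @a = 1;
  -- the slave parses the statement, then the filter rejects it;
  -- before the fix the parsed body (LEX::sphead) was never freed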
The fix:
The real cause of the bug is that there is no single place (or even
two places) where we can clean up the main LEX after parsing. And the
reason we cannot have just one or two such places is the asymmetric
behaviour of MYSQLparse in case of success or error.
One of the root causes of this behaviour is the code in Item::Item()
constructor. There, a newly created item adds itself to THD::free_list
- a single-linked list of Items used in a statement. Yuck. This code
is unaware that we may have more than one statement active at a time,
and always assumes that the free_list of the current statement is
located in THD::free_list. One day we need to be able to explicitly
allocate an item in a given Query_arena.
Thus, when parsing a definition of a stored procedure, like
CREATE PROCEDURE p1() BEGIN SELECT a FROM t1; SELECT b FROM t1; END;
we actually need to reset THD::mem_root, THD::free_list and THD::lex
to parse each nested procedure statement (each SELECT above).
The actual reset and restore is implemented in semantic actions
attached to sp_proc_stmt grammar rule.
The problem is that in case of a parsing error inside a nested
statement the Bison-generated parser would abort immediately, without
executing the restore part of the semantic action. This would leave
THD in an in-the-middle-of-parsing state.
This is why we couldn't have a single place where we clean up the LEX
after MYSQLparse: in case of an error we needed to clean up
immediately; in case of success the clean up could have been delayed.
This left the door open for a memory leak.
The following possibilities were considered when working on a fix:
- patch the replication logic to do the clean up. Rejected as it
breaks module boundaries; replication code should not need to know the
gory details of the clean up procedure after CREATE TRIGGER.
- wrap MYSQLparse with a function that would do a clean up.
Rejected as ideally we should fix the problem when it happens, not
adjust for it outside of the problematic code.
- make sure MYSQLparse cleans up after itself by invoking the clean up
functionality in the appropriate places before return. Implemented in
this patch.
- use a %destructor rule for sp_proc_stmt to restore THD - cleaner
than the previous approach, but rejected
because it needs a careful analysis of the side effects, and this
patch is for 5.0, and long term we need to use the next alternative
anyway
- make sure that sp_proc_stmt doesn't juggle with THD - this is a
large work that will affect many modules.
Cleanup: move main_lex and main_mem_root from Statement to its
only two descendants, Prepared_statement and THD. This ensures that
when a Statement instance is created for the purposes of statement
backup, we do not involve the LEX constructor/destructor, which is
fairly expensive.
To verify that the transformation preserves equivalent functionality,
check the respective constructors and destructors of Statement,
Prepared_statement and THD - these members were used only there.
This cleanup is unrelated to the patch.
When an INSERT is done through a view, the table being inserted into
is checked for uniqueness among all of the view's tables. But if the
view contained a self-joined table, an error was thrown even if all
tables were used under different aliases.
The unique_table() function now also checks the tables' aliases when
needed.
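An illustration of an INSERT that was wrongly rejected (names are made
up, and the view is assumed to be insertable into t1):

  CREATE TABLE t1 (a INT);
  CREATE TABLE t2 (b INT);
  CREATE VIEW v1 AS
    SELECT t1.a FROM t1, t2 AS x, t2 AS y WHERE x.b = y.b;
  -- t2 is self-joined under distinct aliases, so t1 is still unique
  -- among the view's tables; this INSERT used to raise an error
  INSERT INTO v1 (a) VALUES (1);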
Problem: DROP TRIGGER was not properly handled in combination
with slave filters, which made replication stop.
Fix: load the table name before checking slave filters when
dropping a trigger.
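Schematically (the filter and names are illustrative):

  -- slave option: --replicate-do-table=db1.t1
  -- on the master (trg1 is a trigger on db1.t1):
  DROP TRIGGER trg1;
  -- before the fix the filter was checked before the trigger's table
  -- name was loaded, so the slave mishandled the statement and
  -- replication stopped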
- The starting time of a query sent during bootstrapping wasn't
initialized, so it defaulted to 0. This value was later used by the
NOW() item and translated to 1970-01-01 11:00:00.
- Marking the time with thd->set_time() before the call to
mysql_parse resolves this issue.
- set_time() was refactored to be part of the
THD::init_for_queries() process.
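For example, a statement executed from a bootstrap file (the
invocation and statement are illustrative):

  -- run via: mysqld --bootstrap < init.sql
  SELECT NOW();
  -- before the fix the uninitialized start time made this evaluate
  -- to 1970-01-01 11:00:00 instead of the current time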
Two problems here:
Problem 1:
While constructing the join columns list, the optimizer proceeds as
follows:
1. Sets the join_using_fields/natural_join members of the right JOIN
operand.
2. Makes a "table reference" (TABLE_LIST) to parent the two tables.
3. Assigns the join_using_fields/is_natural_join members of the
wrapper table using the join_using_fields/natural_join of the
rightmost table.
4. Sets join_using_fields to NULL for the right JOIN operand.
5. Passes the parent table up to the same procedure on the upper
level.
Step 1 overrides the join_using_fields that are set for a nested
join wrapping table in step 4.
Fixed by adding a designated variable, SELECT_LEX::prev_join_using,
to pass the data from step 1 to step 4 without destroying the wrapping
table data.
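The kind of query affected, schematically (names are made up):

  CREATE TABLE t1 (id INT);
  CREATE TABLE t2 (id INT);
  CREATE TABLE t3 (id INT);
  SELECT * FROM t1 JOIN t2 USING (id) JOIN t3 USING (id);
  -- the first JOIN ... USING is wrapped into a parent TABLE_LIST that
  -- becomes an operand of the second one; before the fix the USING
  -- field list could be overwritten while building the wrapper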
Problem 2:
The optimizer checks for ambiguous columns while transforming
NATURAL JOIN/JOIN USING to JOIN ON. While doing that, it made no
distinction between columns that are used in the generated join
condition (where ambiguity can be checked immediately) and the other
columns (where ambiguity can be checked only when resolving references
coming from outside the JOIN construct itself).
Fixed by allowing the non-USING columns to be present in multiple
copies on both sides of the join and by moving the ambiguity check
to the place where unqualified references to the join columns are
resolved (find_field_in_natural_join()).
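In query terms (names are made up):

  CREATE TABLE t1 (id INT, a INT);
  CREATE TABLE t2 (id INT, a INT);
  SELECT id   FROM t1 JOIN t2 USING (id);  -- ok: id is a USING column
  SELECT t1.a FROM t1 JOIN t2 USING (id);  -- ok: qualified reference
  SELECT a    FROM t1 JOIN t2 USING (id);  -- only here is 'a' ambiguous:
  -- the check now happens when this unqualified reference is resolved
  -- in find_field_in_natural_join(), not while building the join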