Added missing DBUG_RETURN statements (in mysqldump.c)
Added missing enums
Fixed a lot of wrong DBUG_PRINT() statements, some of which could cause crashes
Removed usage of %lld and %p in printf strings as these are not portable or produces different results on different systems.
limitation)
Note to the reviewer
====================
Warning: reviewing this patch is somewhat involved.
Due to the nature of several issues all affecting the same area,
fixing separately each issue is not practical, since each fix can not be
implemented and tested independently.
In particular, the issues with
- rule recursion
- nested case statements
- forward jump resolution (backpatch list)
are tightly coupled (see below).
Definitions
===========
The expression
CASE expr
WHEN expr THEN expr
WHEN expr THEN expr
...
END
is a "Simple Case Expression".
The expression
CASE
WHEN expr THEN expr
WHEN expr THEN expr
...
END
is a "Searched Case Expression".
The statement
CASE expr
WHEN expr THEN stmts
WHEN expr THEN stmts
...
END CASE
is a "Simple Case Statement".
The statement
CASE
WHEN expr THEN stmts
WHEN expr THEN stmts
...
END CASE
is a "Searched Case Statement".
A "Left Recursive" rule is like
list:
element
| list element
;
A "Right Recursive" rule is like
list:
element
| element list
;
Left and right recursion produces the same language, the difference only
affects the *order* in which the text is parsed.
In a descendant parser (usually written manually), right recursion works
very well, and is typically implemented with a while loop.
In an ascendant parser (yacc/bison) left recursion works very well,
and is implemented naturally by the parser stack.
In both cases, using the wrong type or recursion is very bad and should be
avoided, as it causes technical issues with the parser implementation.
Before this change
==================
The "Simple Case Expression" and "Searched Case Expression" were both
implemented by the "when_list" and "when_list2" rules, which are left
recursive (ok).
These rules, however, used lex->when_list instead of using the parser stack,
which is more complex that necessary, and potentially dangerous because
of other rules using THD::reset_lex.
The "Simple Case Statement" and "Searched Case Statements" were implemented
by the "sp_case", "sp_whens" and in part by "sp_proc_stmt" rules.
Both cases were right recursive (bad).
The grammar involved was convoluted, and is assumed to be the results of
tweaks to get the code generation to work, but is not what someone would
naturally write.
In addition, using a common rule for both "Simple" and "Searched" case
statements was implemented with sp_head::m_flags |= IN_SIMPLE_CASE,
which is a flag and not a stack, and therefore does not take into account
*nested* case statements. This leads to incorrect generated code, and either
a server crash or an incorrect result.
With regards to the backpatch mechanism, a *different* backpatch list was
created for each jump from "WHEN expr THEN stmt" to "END CASE", which
relied on the grammar to be right recursive.
This is a mis-use of the backpatch list, since this list can resolve
multiple references to the same target at once.
The optimizer algorithm used to detect dead code in the "assembly" SQL
instructions, implemented by sp_head::opt_mark(uint ip), was recursive
in some cases (a conditional jump pointing forward to another conditional
jump).
In case of specially crafted code, like
- a long list of "IF expr THEN stmt END IF"
- a long CASE statement
this would actually cause a server crash with a stack overflow.
In general, having a stack that grows proportionally with user data (the
SQL code given by the client in a CREATE PROCEDURE) is to be avoided.
In debug builds only, creating a SP / SF / Trigger which had a significant
amount of code would spend --literally-- several minutes in sp_head::create,
because of the debug code involved with DBUG_PRINT("info", ("Code %s ...
There are several issues with this code:
- in a CASE with 5 000 WHEN, there are 15 000 instructions generated,
which create a sting representation of the code which is 500 000 bytes
long,
- using a String instead of an io stream causes performances to degrade
to a total server freeze, as time is spent doing realloc of a buffer
always too short,
- Printing a 500 000 long string in the debug log is too verbose,
- Generating this string even when DBUG_PRINT is off is useless,
- Having code that potentially can affect the server behavior, used with
#ifdef / #endif is useful in some cases, but is also a bad practice.
After this change
=================
"Case Expressions" (both simple and searched) have been simplified to
not use LEX::when_list, which has been removed.
Considering all the issues affecting case statements, the grammar for these
has been totally re written.
The existing actions, used to generate "assembly" sp_inst* code, have been
preserved but moved in the new grammar, with the following changes:
a) Bison rules are no longer shared between "Simple" and "Searched" case
statements, because a stack instead of a flag is required to handle them.
Nested statements are handled naturally by the parser stack, which by
definition uses the correct rule in the correct context.
Nested statements of the opposite type (simple vs searched) works correctly.
The flag sp_head::IN_SIMPLE_CASE is no longer used.
This is a step towards resolution of WL#2999, which correctly identified
that temporary parsing flags do not belong to sp_head.
The code in the action is shared by mean of the case_stmt_action_xxx()
helpers.
b) The backpatch mechanism, used to resolve forward jumps in the generated
code, has been changed to:
- create a label for the instruction following 'END CASE',
- register each jump at the end of a "WHEN expr THEN stmt" in a *unique*
backpatch list associated with the 'END CASE' label
- resolve all the forward jumps for this label at once.
In addition, the code involving backpatch has been commented, so that a
reader can now understand by reading matching "Registering" and "Resolving"
comments how the forward jumps are resolved and what target they resolve to,
as this is far from evident when reading the code alone.
The implementation of sp_head::opt_mark() has been revised to avoid
recursive calls from jump instructions, and instead add the jump location
to the list of paths to explore during the flow analysis of the instruction
graph, with a call to sp_head::add_mark_lead().
In addition, the flow analysis will stop if an instruction has already
been marked as reachable, which the previous code failed to do in the
recursive case.
sp_head::opt_mark() is now private, to prevent new calls to this method from
being introduced.
The debug code present in sp_head::create() has been removed.
Considering that SHOW PROCEDURE CODE is also available in debug builds,
and can be used anytime regardless of the trace level, as opposed to
"CREATE PROCEDURE" time and only if the trace was on,
removing the code actually makes debugging easier (usable trace).
Tests have been written to cover the parser overflow (big CASE),
and to cover nested CASE statements.
Bug#21025 (misleading error message when creating functions named 'x', or 'y')
Bug#22619 (Spaces considered harmful)
This change contains a fix to report warnings or errors, and multiple tests
cases.
Before this fix, name collisions between:
- Native functions
- User Defined Functions
- Stored Functions
were not systematically reported, leading to confusing behavior.
I) Native / User Defined Function
Before this fix, is was possible to create a UDF named "foo", with the same
name as a native function "foo", but it was impossible to invoke the UDF,
since the syntax "foo()" always refer to the native function.
After this fix, creating a UDF fails with an error if there is a name
collision with a native function.
II) Native / Stored Function
Before this fix, is was possible to create a SF named "db.foo", with the same
name as a native function "foo", but this was confusing since the syntax
"foo()" would refer to the native function. To refer to the Stored Function,
the user had to use the "db.foo()" syntax.
After this fix, creating a Stored Function reports a warning if there is a
name collision with a native function.
III) User Defined Function / Stored Function
Before this fix, creating a User Defined Function "foo" and a Stored Function
"db.foo" are mutually exclusive operations. Whenever the second function is
created, an error is reported. However, the test suite did not cover this
behavior.
After this fix, the behavior is unchanged, and is now covered by test cases.
Note that the code change in this patch depends on the fix for Bug 21114.
Events: crash with procedure which alters events with function
Post-review CS
This fix also changes the handling of KILL command combined with
subquery. It changes the error message given back to "not supported",
from parse error. The error for CREATE|ALTER EVENT has also been changed
to generate "not supported yet" instead of parse error.
In case of a SP call, the error is "not supported yet". This change
cleans the parser from code which should not belong to there. Still
LEX::expr_allows_subselect is existant because it simplifies the handling
of SQLCOM_HA_READ which forbids subselects.
Use lazy initialization for Query_tables_list::sroutines hash.
This step should significantly decrease amount of memory consumed
by stored routines as we no longer will allocate chunk of memory
required for this HASH for each statement in routine.
Evaluate "NULL IN (SELECT ...)" in a special way: Disable pushed-down
conditions and their "consequences":
= Do full table scans instead of unique_[index_subquery] lookups.
= Change appropriate "ref_or_null" accesses to full table scans in
subquery's joins.
Also cache value of NULL IN (SELECT ...) if the SELECT is not correlated
wrt any upper select.
select OK.
The SQL parser was using Item::name to transfer user defined function attributes
to the user defined function (udf). It was not distinguishing between user defined
function call arguments and stored procedure call arguments. Setting Item::name
was causing Item_ref::print() method to print the argument as quoted identifiers
and caused views that reference aggregate functions as udf call arguments (and
rely on Item::print() for the text of the view to store) to throw an undefined
identifier error.
Overloaded Item_ref::print to print aggregate functions as such when printing
the references to aggregate functions taken out of context by split_sum_func2()
Fixed the parser to properly detect using AS clause in stored procedure arguments
as an error.
Fixed printing the arguments of udf call to print properly the udf attribute.
account predicates that become sargable after reading const tables.
In some cases this resulted in choosing non-optimal execution plans.
Now info of such potentially saragable predicates is saved in
an array and after reading const tables we check whether this
predicates has become saragable.
should fail to create
The problem was that this type of errors was checked during view
creation, which doesn't happen when CREATE VIEW is a statement of
a created stored routine.
The solution is to perform the checks at parse time. The idea of the
fix is that the parser checks if a construction just parsed is allowed
in current circumstances by testing certain flags, and this flags are
reset for VIEWs.
The side effect of this change is that if the user already have
such bogus routines, it will now get a error when trying to do
SHOW CREATE PROCEDURE proc;
(and some other) and when trying to execute such routine he will get
ERROR 1457 (HY000): Failed to load routine test.p5. The table mysql.proc is missing, corrupt, or contains bad data (internal code -6)
However there should be very few such users (if any), and they may
(and should) drop these bogus routines.
Presence of a subquery in the ON expression of a join
should not block merging the view that contains this join.
Before this patch the such views were converted into
into temporary table views.
containing a select statement that uses an aggregating IN subquery.
Added a parameter to the function fix_prepare_information
to restore correctly the having clause for the second execution.
Saved andor structure of the having conditions at the proper moment
before any calls of split_sum_func2 that could modify the having structure
adding new Item_ref objects. (These additions, are produced not with
the statement mem_root, but rather with the execution mem_root.)
make st_select_lex::setup_ref_array() take into account that
Item_sum-descendant objects located within descendant SELECTs
may be added into ref_pointer_array.
handle them.
Problem:
CREATE|ALTER EVENT, HANDLER READ, KILL, Partitioning uses `expr` from the
parser. This rule comes with all the rings and bells including subqueries.
However, these commands are not subquery safe. For this reason there are two
fuse checks in the parser. They were checking by command id. CREATE EVENT
should forbid subquery is the fix for
bug#16394 Events: Crash if schedule contains SELECT
The fix has been incorporated as part of the patch for WL#3337 (Event scheduler
new architecture).
Solution:
A new flag was added to LEX command_forbids_subselect. The fuse checks were
changed. The commands are responsible to set the value to true whenever
they can't handle subselects.
Zero-length variables caused failures when using the length to look
up the name in a hash. Instead, signal that no zero-length name can
ever be found and that to encounter one is a syntax error.
- if there are two character set definitions in the column declaration,
we replace the first one with the second one as we store both in the LEX->charset
slot. Add a separate slot to the LEX structure to store underscore charset.
- convert default values to the column charset of STRING, VARSTRING fields
if necessary as well.