mirror of https://github.com/MariaDB/server.git synced 2025-10-25 18:38:00 +03:00

6 Commits

Author SHA1 Message Date
unknown
476eaae84d Bug#19194 (Right recursion in parser for CASE causes excessive stack usage,
limitation)

Note to the reviewer
====================

Warning: reviewing this patch is somewhat involved.
Several issues affect the same area, and fixing each issue separately is
not practical, since the fixes cannot be implemented and tested
independently.
In particular, the issues with
- rule recursion
- nested case statements
- forward jump resolution (backpatch list)
are tightly coupled (see below).

Definitions
===========

The expression
  CASE expr
  WHEN expr THEN expr
  WHEN expr THEN expr
  ...
  END
is a "Simple Case Expression".

The expression
  CASE
  WHEN expr THEN expr
  WHEN expr THEN expr
  ...
  END
is a "Searched Case Expression".

The statement
  CASE expr
  WHEN expr THEN stmts
  WHEN expr THEN stmts
  ...
  END CASE
is a "Simple Case Statement".

The statement
  CASE
  WHEN expr THEN stmts
  WHEN expr THEN stmts
  ...
  END CASE
is a "Searched Case Statement".

A "Left Recursive" rule is like
  list:
      element
    | list element
    ;

A "Right Recursive" rule is like
  list:
      element
    | element list
    ;

Left and right recursion produce the same language; the difference only
affects the *order* in which the text is parsed.

In a top-down (recursive descent) parser, usually written manually, right
recursion works very well, and is typically implemented with a while loop.
In a bottom-up parser (yacc/bison), left recursion works very well,
and is implemented naturally by the parser stack.
In both cases, using the wrong type of recursion should be avoided, as it
causes technical issues with the parser implementation; with yacc/bison,
a right recursive rule makes the parser stack grow with the length of the
list.
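
The stack behavior described above can be illustrated with a small model.
This is only a sketch, not bison's generated code: it simulates the
shift/reduce steps for the two "list" rules from the Definitions section and
records the deepest parse stack reached. The names Sym,
max_depth_left_recursive and max_depth_right_recursive are made up for the
illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Grammar symbols kept on the simulated parse stack (hypothetical model).
enum class Sym { Element, List };

// list: element | list element      (left recursive)
// Each element can be reduced as soon as it has been shifted.
std::size_t max_depth_left_recursive(std::size_t n)
{
  std::vector<Sym> stack;
  std::size_t max_depth = 0;
  for (std::size_t i = 0; i < n; ++i)
  {
    stack.push_back(Sym::Element);               // shift
    max_depth = std::max(max_depth, stack.size());
    if (stack.size() == 2)
      stack.pop_back();                          // reduce "list element"
    stack.back() = Sym::List;                    // the result is a list
  }
  assert(stack.size() <= 1);                     // input reduced to one list
  return max_depth;
}

// list: element | element list      (right recursive)
// Nothing can be reduced until the last element has been shifted.
std::size_t max_depth_right_recursive(std::size_t n)
{
  std::vector<Sym> stack;
  std::size_t max_depth = 0;
  for (std::size_t i = 0; i < n; ++i)
  {
    stack.push_back(Sym::Element);               // shift
    max_depth = std::max(max_depth, stack.size());
  }
  if (!stack.empty())
    stack.back() = Sym::List;                    // reduce the last element
  while (stack.size() > 1)
  {
    stack.pop_back();                            // reduce "element list"
    stack.back() = Sym::List;
  }
  assert(stack.size() <= 1);
  return max_depth;
}
```

In this model the left recursive rule never needs more than two stack
entries, while the right recursive rule needs one entry per element.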

Before this change
==================

The "Simple Case Expression" and "Searched Case Expression" were both
implemented by the "when_list" and "when_list2" rules, which are left
recursive (ok).

These rules, however, used lex->when_list instead of using the parser stack,
which is more complex than necessary, and potentially dangerous because
of other rules using THD::reset_lex.

The "Simple Case Statement" and "Searched Case Statement" were implemented
by the "sp_case", "sp_whens" and in part by "sp_proc_stmt" rules.
Both cases were right recursive (bad).

The grammar involved was convoluted, and is assumed to be the result of
tweaks to get the code generation to work, but is not what someone would
naturally write.

In addition, using a common rule for both "Simple" and "Searched" case
statements was implemented with sp_head::m_flags |= IN_SIMPLE_CASE,
which is a flag and not a stack, and therefore does not take into account
*nested* case statements. This leads to incorrect generated code, and either
a server crash or an incorrect result.

With regard to the backpatch mechanism, a *different* backpatch list was
created for each jump from "WHEN expr THEN stmt" to "END CASE", which
relied on the grammar being right recursive.
This is a misuse of the backpatch list, since this list can resolve
multiple references to the same target at once.

The optimizer algorithm used to detect dead code in the "assembly" SQL
instructions, implemented by sp_head::opt_mark(uint ip), was recursive
in some cases (a conditional jump pointing forward to another conditional
jump).
In case of specially crafted code, like
- a long list of "IF expr THEN stmt END IF"
- a long CASE statement
this would actually cause a server crash with a stack overflow.
In general, having a stack that grows proportionally with user data (the
SQL code given by the client in a CREATE PROCEDURE) is to be avoided.

In debug builds only, creating a SP / SF / Trigger which had a significant
amount of code would spend --literally-- several minutes in sp_head::create,
because of the debug code involved with DBUG_PRINT("info", ("Code %s ...
There are several issues with this code:
- in a CASE with 5 000 WHEN, there are 15 000 instructions generated,
  which create a string representation of the code which is 500 000 bytes
  long,
- using a String instead of an io stream causes performance to degrade
  to a total server freeze, as time is spent doing realloc of a buffer
  that is always too short,
- Printing a 500 000 byte string in the debug log is too verbose,
- Generating this string even when DBUG_PRINT is off is useless,
- Having code that potentially can affect the server behavior, used with
  #ifdef / #endif is useful in some cases, but is also a bad practice.

After this change
=================

"Case Expressions" (both simple and searched) have been simplified to
not use LEX::when_list, which has been removed.

Considering all the issues affecting case statements, the grammar for these
has been totally rewritten.

The existing actions, used to generate "assembly" sp_inst* code, have been
preserved but moved into the new grammar, with the following changes:

a) Bison rules are no longer shared between "Simple" and "Searched" case
statements, because a stack instead of a flag is required to handle them.
Nested statements are handled naturally by the parser stack, which by
definition uses the correct rule in the correct context.
Nested statements of the opposite type (simple vs searched) work correctly.
The flag sp_head::IN_SIMPLE_CASE is no longer used.
This is a step towards resolution of WL#2999, which correctly identified
that temporary parsing flags do not belong to sp_head.
The code in the actions is shared by means of the case_stmt_action_xxx()
helpers.
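
The difference between a flag and a stack can be sketched with a simplified
model. This is not the server code: the patch relies on the bison parser
stack rather than an explicit stack, and the names CaseKind, Event,
kinds_with_flag and kinds_with_stack are invented here. The model reports
which kind of CASE each WHEN is attributed to.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of the parser events seen while reading CASE statements.
enum class CaseKind { None, Simple, Searched };
enum class Event { BeginSimple, BeginSearched, When, End };

// One flag, in the spirit of IN_SIMPLE_CASE: an inner CASE overwrites it,
// and the outer context is lost when the inner CASE ends.
std::vector<CaseKind> kinds_with_flag(const std::vector<Event> &events)
{
  bool in_simple_case = false;
  std::vector<CaseKind> seen;
  for (Event e : events)
  {
    switch (e)
    {
    case Event::BeginSimple:   in_simple_case = true;  break;
    case Event::BeginSearched: in_simple_case = false; break;
    case Event::End:           in_simple_case = false; break;
    case Event::When:
      seen.push_back(in_simple_case ? CaseKind::Simple : CaseKind::Searched);
      break;
    }
  }
  return seen;
}

// A stack: each CASE pushes its kind, END CASE pops it, so the enclosing
// CASE is visible again after a nested one ends.
std::vector<CaseKind> kinds_with_stack(const std::vector<Event> &events)
{
  std::vector<CaseKind> stack;
  std::vector<CaseKind> seen;
  for (Event e : events)
  {
    switch (e)
    {
    case Event::BeginSimple:   stack.push_back(CaseKind::Simple);   break;
    case Event::BeginSearched: stack.push_back(CaseKind::Searched); break;
    case Event::End:
      assert(!stack.empty());          // END CASE without a matching CASE
      stack.pop_back();
      break;
    case Event::When:
      seen.push_back(stack.empty() ? CaseKind::None : stack.back());
      break;
    }
  }
  return seen;
}
```

For a simple CASE containing a searched CASE followed by another WHEN of the
outer statement, the flag model attributes the last WHEN to the wrong kind,
while the stack model does not.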

b) The backpatch mechanism, used to resolve forward jumps in the generated
code, has been changed to:
- create a label for the instruction following 'END CASE',
- register each jump at the end of a "WHEN expr THEN stmt" in a *unique*
  backpatch list associated with the 'END CASE' label
- resolve all the forward jumps for this label at once.

In addition, the code involving backpatch has been commented, so that a
reader can now understand by reading matching "Registering" and "Resolving"
comments how the forward jumps are resolved and what target they resolve to,
as this is far from evident when reading the code alone.
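
The "Registering" / "Resolving" pairing can be sketched as follows. This is a
minimal model under stated assumptions, not the sp_head implementation: the
types Instr, Label and Program and the function build_case are hypothetical,
and each WHEN body is reduced to a single placeholder instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Marker for a jump whose destination is not known yet (assumption).
constexpr std::size_t UNRESOLVED = static_cast<std::size_t>(-1);

struct Instr
{
  bool is_jump;
  std::size_t dest;                    // meaningful only for jumps
};

// A label owns the single backpatch list of all jumps waiting for it.
struct Label
{
  std::vector<std::size_t> backpatch;
};

class Program
{
public:
  std::size_t emit_stmt()
  {
    code_.push_back(Instr{false, UNRESOLVED});
    return code_.size() - 1;
  }

  // Registering: remember this forward jump in the label's backpatch list.
  std::size_t emit_jump(Label &label)
  {
    code_.push_back(Instr{true, UNRESOLVED});
    label.backpatch.push_back(code_.size() - 1);
    return code_.size() - 1;
  }

  // Resolving: every registered jump now targets the next instruction.
  void resolve_here(Label &label)
  {
    for (std::size_t ip : label.backpatch)
    {
      assert(ip < code_.size());
      code_[ip].dest = code_.size();
    }
    label.backpatch.clear();
  }

  const Instr &at(std::size_t ip) const
  {
    assert(ip < code_.size());
    return code_[ip];
  }

  std::size_t size() const { return code_.size(); }

private:
  std::vector<Instr> code_;
};

// Generates "body; jump END CASE" per WHEN, then the instruction after
// END CASE. All jumps share one label and are resolved in one pass.
Program build_case(std::size_t when_count)
{
  Program p;
  Label end_case;
  for (std::size_t i = 0; i < when_count; ++i)
  {
    p.emit_stmt();                     // placeholder for the WHEN body
    p.emit_jump(end_case);             // Registering
  }
  p.resolve_here(end_case);            // Resolving
  p.emit_stmt();                       // instruction following END CASE
  return p;
}
```

With three WHEN branches, the model emits seven instructions, and the three
jumps all resolve to the last one.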

The implementation of sp_head::opt_mark() has been revised to avoid
recursive calls from jump instructions, and instead add the jump location
to the list of paths to explore during the flow analysis of the instruction
graph, with a call to sp_head::add_mark_lead().
In addition, the flow analysis will stop if an instruction has already
been marked as reachable, which the previous code failed to do in the
recursive case.
sp_head::opt_mark() is now private, to prevent new calls to this method from
being introduced.
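
The non recursive flow analysis can be sketched as a worklist algorithm.
This is an illustration only, not the code of sp_head::opt_mark(): the Instr
type and mark_reachable() are hypothetical, and the "leads" vector merely
plays the role that the text above assigns to sp_head::add_mark_lead().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction model: a plain instruction falls through, Jump
// always goes to dest, JumpIfNot either falls through or goes to dest.
struct Instr
{
  enum Kind { Plain, Jump, JumpIfNot } kind;
  std::size_t dest;                    // used by Jump and JumpIfNot only
};

// Marks every instruction reachable from instruction 0, without recursion.
std::vector<bool> mark_reachable(const std::vector<Instr> &code)
{
  std::vector<bool> marked(code.size(), false);
  std::vector<std::size_t> leads;      // paths still to explore
  if (!code.empty())
    leads.push_back(0);

  while (!leads.empty())
  {
    std::size_t ip = leads.back();
    leads.pop_back();
    // Follow one path; stop at the end of the code, or as soon as an
    // instruction was already marked as reachable.
    while (ip < code.size() && !marked[ip])
    {
      marked[ip] = true;
      const Instr &i = code[ip];
      if (i.kind != Instr::Plain)
        assert(i.dest <= code.size());
      if (i.kind == Instr::Jump)
      {
        ip = i.dest;                   // only one successor
      }
      else
      {
        if (i.kind == Instr::JumpIfNot)
          leads.push_back(i.dest);     // explore the jump target later
        ++ip;                          // continue with the fall through
      }
    }
  }
  return marked;
}
```

The amount of machine stack used does not depend on the number of
instructions; the pending paths are kept in a heap allocated vector instead.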

The debug code present in sp_head::create() has been removed.
Considering that SHOW PROCEDURE CODE is also available in debug builds,
and can be used anytime regardless of the trace level, as opposed to
"CREATE PROCEDURE" time and only if the trace was on,
removing the code actually makes debugging easier (usable trace).

Tests have been written to cover the parser overflow (big CASE),
and to cover nested CASE statements.


mysql-test/r/sp-code.result:
  Test cases for nested CASE statements.
mysql-test/t/sp-code.test:
  Test cases for nested CASE statements.
sql/sp_head.cc:
  Refactored opt_mark() to avoid recursion, clean up.
sql/sp_head.h:
  Refactored opt_mark() to avoid recursion, clean up.
sql/sql_lex.cc:
  Removed when_list.
sql/sql_lex.h:
  Removed when_list.
sql/sql_yacc.yy:
  Minor clean up for case expressions,
  Major rewrite for case statements (Bug#19194).
mysql-test/r/sp_stress_case.result:
  New test for massive CASE statements.
mysql-test/t/sp_stress_case.sh:
  New test for massive CASE statements.
mysql-test/t/sp_stress_case.test:
  New test for massive CASE statements.
2006-11-17 12:14:29 -07:00
unknown
7733615d8d Bug#19207: Final parenthesis omitted for CREATE INDEX in Stored Procedure
The wrong criterion was used to detect the case where the parser had
performed no lookahead.  The bug affected only statements ending in a
one-character token without any optional tail, like CREATE INDEX and
CALL.


mysql-test/r/sp-code.result:
  Add result for bug#19207: Final parenthesis omitted for CREATE INDEX
  in Stored Procedure
mysql-test/t/sp-code.test:
  Add test case for bug#19207: Final parenthesis omitted for CREATE INDEX
  in Stored Procedure
sql/sql_yacc.yy:
  Use (yychar == YYEMPTY) as the criterion for whether lookahead was not
  performed.
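
Why the lookahead matters can be sketched with a small model. This is an
assumption about the mechanism, not the code in sql_yacc.yy: the LexerState
type and statement_text() are hypothetical. In a bison parser,
yychar == YYEMPTY means that no lookahead token has been read; in that case
the statement text ends at the lexer's current position, otherwise it ends
where the lookahead token starts.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical lexer state, as offsets into the query text.
struct LexerState
{
  std::size_t tok_start;               // start of the last token read
  std::size_t cursor;                  // position just after that token
};

// Returns the text of the statement starting at stmt_start.
// lookahead_read models the condition (yychar != YYEMPTY).
std::string statement_text(const std::string &query, std::size_t stmt_start,
                           const LexerState &lex, bool lookahead_read)
{
  // With a lookahead, the last token read belongs to what follows the
  // statement; without one, it is the last token of the statement itself.
  std::size_t end = lookahead_read ? lex.tok_start : lex.cursor;
  assert(end >= stmt_start && end <= query.size());
  return query.substr(stmt_start, end - stmt_start);
}
```

In this model, wrongly assuming that a lookahead was read for a statement
ending in ")" cuts the statement at the start of that token, which drops the
final parenthesis.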
2006-07-07 21:24:54 +04:00
unknown
9161d30861 Post-review fix for BUG#15737 (corrected typo in sp-code.test comment)
mysql-test/t/sp-code.test:
  Corrected typo in comment.
2006-01-30 15:04:00 +01:00
unknown
7ee65fcf85 Fixed BUG#15737: Stored procedure optimizer bug with LEAVE
Second version.
  The problem was that the optimizer didn't work correctly with forward jumps
  to "no-op" hpop and cpop instructions.
  Don't generate "no-op" instructions (hpop 0 and cpop 0), as they aren't
  actually necessary.


mysql-test/r/sp-code.result:
  Updated results for new test case (BUG#15737)
mysql-test/t/sp-code.test:
  New test case (BUG#15737)
sql/sp_head.cc:
  Removed backpatch methods from sp_instr_hpop/cpop, since they're not needed any more.
  Added more documentation to sp_head::optimize()
sql/sp_head.h:
  Removed backpatch and opt_mark methods from sp_instr_hpop/cpop, since they're not needed
  any more.
  Added comments to optimizer methods in sp_instr.
sql/sql_yacc.yy:
  Don't generate "no-op" hpop and cpop instructions for LEAVE, it's not necessary.
  Just generate them when needed.
2006-01-25 15:11:49 +01:00
unknown
e99f14e73b Removed forgotten test line in sp-code.test.
mysql-test/r/sp-code.result:
  Removed forgotten test line.
mysql-test/t/sp-code.test:
  Removed forgotten test line.
2005-11-18 18:05:04 +01:00
unknown
6726a6b8b9 Post-review fixes, mainly fixing all print() methods for sp_instr* classes.
Also added mysql-test files:
 include/is_debug_build.inc
 r/is_debug_build.require
 r/sp-code.result
 t/sp-code.test


sql/sp_head.cc:
  Review fixes:
  - Some minor editorial changes
  - Fixed all print() methods for instructions:
    - reserve() enough space
    - check return value from reserve()
    - use qs_append, with length arg, whenever possible
sql/sp_pcontext.cc:
  Review fixes.
  Also fixed bug in find_cursor().
sql/sp_pcontext.h:
  Changed parameter names (review fix).
sql/sql_parse.cc:
  Moved comment. (Review fix)
mysql-test/include/is_debug_build.inc:
  New BitKeeper file ``mysql-test/include/is_debug_build.inc''
mysql-test/r/is_debug_build.require:
  New BitKeeper file ``mysql-test/r/is_debug_build.require''
mysql-test/r/sp-code.result:
  New BitKeeper file ``mysql-test/r/sp-code.result''
mysql-test/t/sp-code.test:
  New BitKeeper file ``mysql-test/t/sp-code.test''
2005-11-18 16:30:27 +01:00