Post fix for bug#23800.
The Item_field constructor now increases the select_n_where_fields counter.
sql_yacc.yy:
Post fix for bug#23800.
Take into account fields that might be added by subselects.
sql_lex.h:
Post fix for bug#23800.
Added the select_n_where_fields variable to the st_select_lex class.
sql_lex.cc:
Post fix for bug#23800.
Initialization of the select_n_where_fields variable.
created for sorting.
Any outer reference in a subquery was represented by an Item_field object.
If the outer select employs a temporary table all such fields should be
replaced with fields from that temporary table in order to point to the
actual data. This replacement wasn't done and that resulted in a wrong
subquery evaluation and a wrong result of the whole query.
Now any outer field is represented by two objects - Item_field placed in the
outer select and Item_outer_ref in the subquery. Item_field object is
processed as a normal field and the reference to it is saved in the
ref_pointer_array. Thus the Item_outer_ref is always references the correct
field. The original field is substituted for a reference in the
Item_field::fix_outer_field() function.
New function called fix_inner_refs() is added to fix fields referenced from
inner selects and to fix references (Item_ref objects) to these fields.
The new Item_outer_ref class is a descendant of the Item_direct_ref class.
It additionally stores a reference to the original field and designed to
behave more like a field.
Two problems here:
Problem 1:
While constructing the join columns list the optimizer does as follows:
1. Sets the join_using_fields/natural_join members of the right JOIN
operand.
2. Makes a "table reference" (TABLE_LIST) to parent the two tables.
3. Assigns the join_using_fields/is_natural_join of the wrapper table
using join_using_fields/natural_join of the rightmost table
4. Sets join_using_fields to NULL for the right JOIN operand.
5. Passes the parent table up to the same procedure on the upper
level.
Step 1 overrides the the join_using_fields that are set for a nested
join wrapping table in step 4.
Fixed by making a designated variable SELECT_LEX::prev_join_using to
pass the data from step 1 to step 4 without destroying the wrapping
table data.
Problem 2:
The optimizer checks for ambiguous columns while transforming
NATURAL JOIN/JOIN USING to JOIN ON. While doing that there was no
distinction between columns that are used in the generated join
condition (where ambiguity can be checked) and the other columns
(where ambiguity can be checked only when resolving references
coming from outside the JOIN construct itself).
Fixed by allowing the non-USING columns to be present in multiple
copies in both sides of the join and moving the ambiguity check
to the place where unqualified references to the join columns are
resolved (find_field_in_natural_join()).
- Make the code produce correct result: use an array of triggers to turn on/off equalities for each
compared column. Also turn on/off optimizations based on those equalities.
- Make EXPLAIN output show "Full scan on NULL key" for tables for which we switch between
ref/unique_subquery/index_subquery and ALL access.
- index_subquery engine now has HAVING clause when it is needed, and it is
displayed in EXPLAIN EXTENDED
- Fix incorrect presense of "Using index" for index/unique-based subqueries (BUG#22930)
// bk trigger note: this commit refers to BUG#24127
Currently in the ONLY_FULL_GROUP_BY mode no hidden fields are allowed in the
select list. To ensure this each expression in the select list is checked
to be a constant, an aggregate function or to occur in the GROUP BY list.
The last two requirements are wrong and doesn't allow valid expressions like
"MAX(b) - MIN(b)" or "a + 1" in a query with grouping by a.
The correct check implemented by the patch will ensure that:
any field reference in the [sub]expressions of the select list
is under an aggregate function or
is mentioned as member of the group list or
is an outer reference or
is part of the select list element that coincide with a grouping element.
The Item_field objects now can contain the position of the select list
expression which they belong to. The position is saved during the
field's Item_field::fix_fields() call.
The non_agg_fields list for non-aggregated fields is added to the SELECT_LEX
class. The SELECT_LEX::cur_pos_in_select_list now contains the position in the
select list of the expression being currently fixed.
Corrected spelling in copyright text
Makefile.am:
Don't update the files from BitKeeper
Many files:
Removed "MySQL Finland AB & TCX DataKonsult AB" from copyright header
Adjusted year(s) in copyright header
Many files:
Added GPL copyright text
Removed files:
Docs/Support/colspec-fix.pl
Docs/Support/docbook-fixup.pl
Docs/Support/docbook-prefix.pl
Docs/Support/docbook-split
Docs/Support/make-docbook
Docs/Support/make-makefile
Docs/Support/test-make-manual
Docs/Support/test-make-manual-de
Docs/Support/xwf
- Removed not used variables and functions
- Added #ifdef around code that is not used
- Renamed variables and functions to avoid conflicts
- Removed some not used arguments
Fixed some class/struct warnings in ndb
Added define IS_LONGDATA() to simplify code in libmysql.c
I did run gcov on the changes and added 'purecov' comments on almost all lines that was not just variable name changes
Bug#4968 "Stored procedure crash if cursor opened on altered table"
Bug#19733 "Repeated alter, or repeated create/drop, fails"
Bug#19182 "CREATE TABLE bar (m INT) SELECT n FROM foo; doesn't work from
stored procedure."
Bug#6895 "Prepared Statements: ALTER TABLE DROP COLUMN does nothing"
Bug#22060 "ALTER TABLE x AUTO_INCREMENT=y in SP crashes server"
Test cases for bugs 4968, 19733, 6895 will be added in 5.0.
Re-execution of CREATE DATABASE, CREATE TABLE and ALTER TABLE
statements in stored routines or as prepared statements caused
incorrect results (and crashes in versions prior to 5.0.25).
In 5.1 the problem occured only for CREATE DATABASE, CREATE TABLE
SELECT and CREATE TABLE with INDEX/DATA DIRECTOY options).
The problem of bugs 4968, 19733, 19282 and 6895 was that functions
mysql_prepare_table, mysql_create_table and mysql_alter_table were not
re-execution friendly: during their operation they used to modify contents
of LEX (members create_info, alter_info, key_list, create_list),
thus making the LEX unusable for the next execution.
In particular, these functions removed processed columns and keys from
create_list, key_list and drop_list. Search the code in sql_table.cc
for drop_it.remove() and similar patterns to find evidence.
The fix is to supply to these functions a usable copy of each of the
above structures at every re-execution of an SQL statement.
To simplify memory management, LEX::key_list and LEX::create_list
were added to LEX::alter_info, a fresh copy of which is created for
every execution.
The problem of crashing bug 22060 stemmed from the fact that the above
metnioned functions were not only modifying HA_CREATE_INFO structure in
LEX, but also were changing it to point to areas in volatile memory of
the execution memory root.
The patch solves this problem by creating and using an on-stack
copy of HA_CREATE_INFO (note that code in 5.1 already creates and
uses a copy of this structure in mysql_create_table()/alter_table(),
but this approach didn't work well for CREATE TABLE SELECT statement).
Fixed compiler warnings (detected by VC++):
- Removed not used variables
- Added casts
- Fixed wrong assignments to bool
- Fixed wrong calls with bool arguments
- Added missing argument to store(longlong), which caused wrong store method to be called.
limitation)
Note to the reviewer
====================
Warning: reviewing this patch is somewhat involved.
Due to the nature of several issues all affecting the same area,
fixing separately each issue is not practical, since each fix can not be
implemented and tested independently.
In particular, the issues with
- rule recursion
- nested case statements
- forward jump resolution (backpatch list)
are tightly coupled (see below).
Definitions
===========
The expression
CASE expr
WHEN expr THEN expr
WHEN expr THEN expr
...
END
is a "Simple Case Expression".
The expression
CASE
WHEN expr THEN expr
WHEN expr THEN expr
...
END
is a "Searched Case Expression".
The statement
CASE expr
WHEN expr THEN stmts
WHEN expr THEN stmts
...
END CASE
is a "Simple Case Statement".
The statement
CASE
WHEN expr THEN stmts
WHEN expr THEN stmts
...
END CASE
is a "Searched Case Statement".
A "Left Recursive" rule is like
list:
element
| list element
;
A "Right Recursive" rule is like
list:
element
| element list
;
Left and right recursion produces the same language, the difference only
affects the *order* in which the text is parsed.
In a descendant parser (usually written manually), right recursion works
very well, and is typically implemented with a while loop.
In an ascendant parser (yacc/bison) left recursion works very well,
and is implemented naturally by the parser stack.
In both cases, using the wrong type or recursion is very bad and should be
avoided, as it causes technical issues with the parser implementation.
Before this change
==================
The "Simple Case Expression" and "Searched Case Expression" were both
implemented by the "when_list" and "when_list2" rules, which are left
recursive (ok).
These rules, however, used lex->when_list instead of using the parser stack,
which is more complex that necessary, and potentially dangerous because
of other rules using THD::reset_lex.
The "Simple Case Statement" and "Searched Case Statements" were implemented
by the "sp_case", "sp_whens" and in part by "sp_proc_stmt" rules.
Both cases were right recursive (bad).
The grammar involved was convoluted, and is assumed to be the results of
tweaks to get the code generation to work, but is not what someone would
naturally write.
In addition, using a common rule for both "Simple" and "Searched" case
statements was implemented with sp_head::m_flags |= IN_SIMPLE_CASE,
which is a flag and not a stack, and therefore does not take into account
*nested* case statements. This leads to incorrect generated code, and either
a server crash or an incorrect result.
With regards to the backpatch mechanism, a *different* backpatch list was
created for each jump from "WHEN expr THEN stmt" to "END CASE", which
relied on the grammar to be right recursive.
This is a mis-use of the backpatch list, since this list can resolve
multiple references to the same target at once.
The optimizer algorithm used to detect dead code in the "assembly" SQL
instructions, implemented by sp_head::opt_mark(uint ip), was recursive
in some cases (a conditional jump pointing forward to another conditional
jump).
In case of specially crafted code, like
- a long list of "IF expr THEN stmt END IF"
- a long CASE statement
this would actually cause a server crash with a stack overflow.
In general, having a stack that grows proportionally with user data (the
SQL code given by the client in a CREATE PROCEDURE) is to be avoided.
In debug builds only, creating a SP / SF / Trigger which had a significant
amount of code would spend --literally-- several minutes in sp_head::create,
because of the debug code involved with DBUG_PRINT("info", ("Code %s ...
There are several issues with this code:
- in a CASE with 5 000 WHEN, there are 15 000 instructions generated,
which create a sting representation of the code which is 500 000 bytes
long,
- using a String instead of an io stream causes performances to degrade
to a total server freeze, as time is spent doing realloc of a buffer
always too short,
- Printing a 500 000 long string in the debug log is too verbose,
- Generating this string even when DBUG_PRINT is off is useless,
- Having code that potentially can affect the server behavior, used with
#ifdef / #endif is useful in some cases, but is also a bad practice.
After this change
=================
"Case Expressions" (both simple and searched) have been simplified to
not use LEX::when_list, which has been removed.
Considering all the issues affecting case statements, the grammar for these
has been totally re written.
The existing actions, used to generate "assembly" sp_inst* code, have been
preserved but moved in the new grammar, with the following changes:
a) Bison rules are no longer shared between "Simple" and "Searched" case
statements, because a stack instead of a flag is required to handle them.
Nested statements are handled naturally by the parser stack, which by
definition uses the correct rule in the correct context.
Nested statements of the opposite type (simple vs searched) works correctly.
The flag sp_head::IN_SIMPLE_CASE is no longer used.
This is a step towards resolution of WL#2999, which correctly identified
that temporary parsing flags do not belong to sp_head.
The code in the action is shared by mean of the case_stmt_action_xxx()
helpers.
b) The backpatch mechanism, used to resolve forward jumps in the generated
code, has been changed to:
- create a label for the instruction following 'END CASE',
- register each jump at the end of a "WHEN expr THEN stmt" in a *unique*
backpatch list associated with the 'END CASE' label
- resolve all the forward jumps for this label at once.
In addition, the code involving backpatch has been commented, so that a
reader can now understand by reading matching "Registering" and "Resolving"
comments how the forward jumps are resolved and what target they resolve to,
as this is far from evident when reading the code alone.
The implementation of sp_head::opt_mark() has been revised to avoid
recursive calls from jump instructions, and instead add the jump location
to the list of paths to explore during the flow analysis of the instruction
graph, with a call to sp_head::add_mark_lead().
In addition, the flow analysis will stop if an instruction has already
been marked as reachable, which the previous code failed to do in the
recursive case.
sp_head::opt_mark() is now private, to prevent new calls to this method from
being introduced.
The debug code present in sp_head::create() has been removed.
Considering that SHOW PROCEDURE CODE is also available in debug builds,
and can be used anytime regardless of the trace level, as opposed to
"CREATE PROCEDURE" time and only if the trace was on,
removing the code actually makes debugging easier (usable trace).
Tests have been written to cover the parser overflow (big CASE),
and to cover nested CASE statements.
Use lazy initialization for Query_tables_list::sroutines hash.
This step should significantly decrease amount of memory consumed
by stored routines as we no longer will allocate chunk of memory
required for this HASH for each statement in routine.
Evaluate "NULL IN (SELECT ...)" in a special way: Disable pushed-down
conditions and their "consequences":
= Do full table scans instead of unique_[index_subquery] lookups.
= Change appropriate "ref_or_null" accesses to full table scans in
subquery's joins.
Also cache value of NULL IN (SELECT ...) if the SELECT is not correlated
wrt any upper select.
select OK.
The SQL parser was using Item::name to transfer user defined function attributes
to the user defined function (udf). It was not distinguishing between user defined
function call arguments and stored procedure call arguments. Setting Item::name
was causing Item_ref::print() method to print the argument as quoted identifiers
and caused views that reference aggregate functions as udf call arguments (and
rely on Item::print() for the text of the view to store) to throw an undefined
identifier error.
Overloaded Item_ref::print to print aggregate functions as such when printing
the references to aggregate functions taken out of context by split_sum_func2()
Fixed the parser to properly detect using AS clause in stored procedure arguments
as an error.
Fixed printing the arguments of udf call to print properly the udf attribute.
account predicates that become sargable after reading const tables.
In some cases this resulted in choosing non-optimal execution plans.
Now info of such potentially saragable predicates is saved in
an array and after reading const tables we check whether this
predicates has become saragable.
should fail to create
The problem was that this type of errors was checked during view
creation, which doesn't happen when CREATE VIEW is a statement of
a created stored routine.
The solution is to perform the checks at parse time. The idea of the
fix is that the parser checks if a construction just parsed is allowed
in current circumstances by testing certain flags, and this flags are
reset for VIEWs.
The side effect of this change is that if the user already have
such bogus routines, it will now get a error when trying to do
SHOW CREATE PROCEDURE proc;
(and some other) and when trying to execute such routine he will get
ERROR 1457 (HY000): Failed to load routine test.p5. The table mysql.proc is missing, corrupt, or contains bad data (internal code -6)
However there should be very few such users (if any), and they may
(and should) drop these bogus routines.
containing a select statement that uses an aggregating IN subquery.
Added a parameter to the function fix_prepare_information
to restore correctly the having clause for the second execution.
Saved andor structure of the having conditions at the proper moment
before any calls of split_sum_func2 that could modify the having structure
adding new Item_ref objects. (These additions, are produced not with
the statement mem_root, but rather with the execution mem_root.)
make st_select_lex::setup_ref_array() take into account that
Item_sum-descendant objects located within descendant SELECTs
may be added into ref_pointer_array.
- if there are two character set definitions in the column declaration,
we replace the first one with the second one as we store both in the LEX->charset
slot. Add a separate slot to the LEX structure to store underscore charset.
- convert default values to the column charset of STRING, VARSTRING fields
if necessary as well.
When executing ALTER TABLE all the attributes of the view were overwritten.
This is contrary to the user's expectations.
So some of the view attributes are preserved now : namely security and
algorithm. This means that if they are not specified in ALTER VIEW
their values are preserved from CREATE VIEW instead of being defaulted.
can be not replicable.
Now CREATE statements for writing in the binlog are created as follows:
- the beginning of the statement is re-created;
- the rest of the statement is copied from the original query.
The problem appears when there is a version-specific comment (produced by
mysqldump), started in the re-created part of the statement and closed in the
copied part -- there is closing comment-parenthesis, but there is no opening
one.
The proper fix could be to re-create original statement, but we can not
implement it in 5.0. So, for 5.0 the fix is just to cut closing
comment-parenthesis. This technique is also used for SHOW CREATE PROCEDURE
statement (so we are able to reuse existing code).
The problem was that we restored SQL_CACHE, SQL_NO_CACHE flags in SELECT
statement from internal structures based on value set later at runtime, not
the original value set by the user.
The solution is to remember that original value.
Bug#19022 "Memory bug when switching db during trigger execution"
Bug#17199 "Problem when view calls function from another database."
Bug#18444 "Fully qualified stored function names don't work correctly in
SELECT statements"
Documentation note: this patch introduces a change in behaviour of prepared
statements.
This patch adds a few new invariants with regard to how THD::db should
be used. These invariants should be preserved in future:
- one should never refer to THD::db by pointer and always make a deep copy
(strmake, strdup)
- one should never compare two databases by pointer, but use strncmp or
my_strncasecmp
- TABLE_LIST object table->db should be always initialized in the parser or
by creator of the object.
For prepared statements it means that if the current database is changed
after a statement is prepared, the database that was current at prepare
remains active. This also means that you can not prepare a statement that
implicitly refers to the current database if the latter is not set.
This is not documented, and therefore needs documentation. This is NOT a
change in behavior for almost all SQL statements except:
- ALTER TABLE t1 RENAME t2
- OPTIMIZE TABLE t1
- ANALYZE TABLE t1
- TRUNCATE TABLE t1 --
until this patch t1 or t2 could be evaluated at the first execution of
prepared statement.
CURRENT_DATABASE() still works OK and is evaluated at every execution
of prepared statement.
Note, that in stored routines this is not an issue as the default
database is the database of the stored procedure and "use" statement
is prohibited in stored routines.
This patch makes obsolete the use of check_db_used (it was never used in the
old code too) and all other places that check for table->db and assign it
from THD::db if it's NULL, except the parser.
How this patch was created: THD::{db,db_length} were replaced with a
LEX_STRING, THD::db. All the places that refer to THD::{db,db_length} were
manually checked and:
- if the place uses thd->db by pointer, it was fixed to make a deep copy
- if a place compared two db pointers, it was fixed to compare them by value
(via strcmp/my_strcasecmp, whatever was approproate)
Then this intermediate patch was used to write a smaller patch that does the
same thing but without a rename.
TODO in 5.1:
- remove check_db_used
- deploy THD::set_db in mysql_change_db
See also comments to individual files.
The st_lex::which_check_option_applicable() function controls for which
statements WITH CHECK OPTION clause should be taken into account. REPLACE and
REPLACE_SELECT wasn't in the list which results in allowing REPLACE to insert
wrong rows in a such view.
The st_lex::which_check_option_applicable() now includes REPLACE and
REPLACE_SELECT in the list of statements for which WITH CHECK OPTION clause is
applicable.