Merging the following features from sql_yacc.yy to sql_yacc_ora.yy:
- system versioning
- column compression
- table value constructor
- spatial predicate WITHIN
- DELETE_DOMAIN_ID
The code passing positions in the query to constructors of
Rewritable_query_parameter descendants (e.g. Item_splocal)
was not reliable. It used various Lex_input_stream methods:
- get_tok_start()
- get_tok_start_prev()
- get_tok_end()
- get_ptr()
to find positions of the recently scanned tokens.
The challenge was mostly to choose between get_tok_start()
and get_tok_start_prev(), taking into account to the current
grammar (depending if lookahead takes place before
or after we read the positions in every particular rule).
But this approach did not work at all in combination
with token contractions, when MYSQLlex() translates
two tokens into one token ID, for example:
WITH ROLLUP -> WITH_ROLLUP_SYM
As a result, the tokenizer is already one more token ahead.
So in query fragment:
"GROUP BY d, spvar WITH ROLLUP"
get_tok_start() points to "ROLLUP".
get_tok_start_prev() points to "WITH".
As a result, it was "WITH" who was erroneously replaced
to NAME_CONST() instead of "spvar".
This patch modifies the code to do it a different way.
Changes:
1. For keywords and identifiers, the tokenizer now
returns LEX_CTRING pointing directly to the query
fragment. So query positions are now just available using:
- $1.str - for the beginning of a token
- $1.str+$1.length - for the end of a token
2. Identifiers are not allocated on the THD memory root
in the tokenizer any more. Allocation is now done
on later stages, in methods like LEX::create_item_ident().
3. Two LEX_CSTRING based structures were added:
- Lex_ident_cli_st - used to store the "client side"
identifier representation, pointing to the
query fragment. Note, these identifiers
are encoded in @@character_set_client
and can have broken byte sequences.
- Lex_ident_sys_st - used to store the "server side"
identifier representation, pointing to the
THD allocated memory. This representation
guarantees that the identifier was checked
for being well-formed, and is encoded in utf8.
4. To distinguish between two identifier types
in the grammar, two Bison types were added:
<ident_cli> and <ident_sys>
5. All non-reserved keywords were marked as
being of the type <ident_cli>.
All reserved keywords are still of the type NONE.
6. All curly brackets in rules collecting
non-reserved keywords into non-terminal
symbols were removed, e.g.:
Was:
keyword_sp_data_type:
BIT_SYM {}
| BOOLEAN_SYM {}
Now:
keyword_sp_data_type:
BIT_SYM
| BOOLEAN_SYM
This is important NOT to have brackets here!!!!
This is needed to make sure that the underlying
Lex_ident_cli_ststructure correctly passes up to
the calling rule.
6. The code to scan identifiers and keywords
was moved from lex_one_token() into new
Lex_input_stream methods:
scan_ident_sysvar()
scan_ident_start()
scan_ident_middle()
scan_ident_delimited()
This was done to:
- get rid of enormous amount of references to &yylval->lex_str
- and remove a lot of references like lip->xxx
7. The allocating functionality which puts identifiers on the
THD memory root now resides in methods of Lex_ident_sys_st,
and in THD::to_ident_sys_alloc().
get_quoted_token() was removed.
8. Cleanup: check_simple_select() was moved as a method to LEX.
9. Cleanup: Some more functionality was moved from *.yy
to new methods were added to LEX:
make_item_colon_ident_ident()
make_item_func_call_generic()
create_item_qualified_asterisk()
Problems:
1. Unlike Item_field::fix_fields(),
Item_sum_sp::fix_length_and_dec() and Item_func_sp::fix_length_and_dec()
did not run the code which resided in adjust_max_effective_column_length(),
therefore they did not extend max_length for the integer return data types
from the user-specified length to the maximum length according to
the data type capacity.
2. The code in adjust_max_effective_column_length() was not correct
for TEXT data, because Field_blob::max_display_length()
multiplies to mbmaxlen. So TEXT variants were unintentionally
promoted to the next longer data type for multi-byte character
sets: TINYTEXT->TEXT, TEXT->MEDIUMTEXT, MEDIUMTEXT->LONGTEXT.
3. Item_sum_sp::create_table_field_from_handler()
Item_func_sp::create_table_field_from_handler()
erroneously called tmp_table_field_from_field_type(),
which converted VARCHAR(>512) to TEXT variants.
So "CREATE..SELECT spfunc()" erroneously converted
VARCHAR to TEXT. This was wrong, because stored
functions have explicitly declared data types,
which should be preserved.
Solution:
- Removing Type_std_attributes(const Field *)
and using instead Type_std_attributes::set() in combination
with field->type_str_attributes() all around the code, e.g.:
Type_std_attributes::set(field->type_std_attributes())
These two ways of copying attributes from a Field
to an Item duplicated each other, and were slightly
different in how to mix max_length and mbmaxlen.
- Removing adjust_max_effective_column_length() and
fixing Field::type_std_attributes() to do all necessary
type-specific calculations , so no further adjustments
is needed.
Field::type_std_attributes() is now called from all affected methods:
Item_field::fix_fields()
Item_sum_sp::fix_length_and_dec()
Item_func_sp::fix_length_and_dec()
This fixes the problem N1.
- Making Field::type_std_attributes() virtual, to make
sure that type-specific adjustments a properly done
by individual Field_xxx classes. Implementing
Field_blob::type_std_attributes() in the way that
no TEXT promotion is done.
This fixes the problem N2.
- Fixing Item_sum_sp::create_table_field_from_handler()
Item_func_sp::create_table_field_from_handler() to
call create_table_field_from_handler() instead of
tmp_table_field_from_field_type() to avoid
VARCHAR->TEXT conversion on "CREATE..SELECT spfunc()".
- Recording mysql-test/suite/compat/oracle/r/sp-param.result
as "CREATE..SELECT spfunc()" now correctly
preserve the data type as specified in the RETURNS clause.
- Adding new tests
The problem resided in this branch of the "option_value_no_option_type" rule:
| '@' '@' opt_var_ident_type internal_variable_name equal set_expr_or_default
Summary:
1. internal_variable_name initialized tmp.var to trg_new_row_fake_var (0x01).
2. The condition "if (tmp.var == NULL)" did not check
the special case with trg_new_row_fake_var,
so Lex->set_system_variable(&tmp, $3, $6) was
called with tmp.var pointing to trg_new_row_fake_var,
which created a sys_var instance pointing to 0x01 instead of
a real system variable.
3. Later, at the trigger invocation time, this method was called:
sys_var::do_deprecated_warning (this=0x1, thd=0x7ffe6c000a98)
Notice, "this" is equal to trg_new_row_fake_var (0x01)
Solution:
The old implementation with separate rules
internal_variable_name (in sql_yacc.yy and sql_yacc_ora.yy) and
internal_variable_name_directly_assignable (in sql_yacc_ora.yy only)
was too complex and hard to follow.
Rewriting the code in a more straightforward way.
1. Changing LEX::set_system_variable()
from:
bool set_system_variable(struct sys_var_with_base *, enum_var_type, Item *);
to:
bool set_system_variable(enum_var_type, sys_var *, const LEX_CSTRING *, Item *);
2. Adding new methods in LEX, which operate with variable names:
bool set_trigger_field(const LEX_CSTRING *, const LEX_CSTRING *, Item *);
bool set_system_variable(enum_var_type var_type, const LEX_CSTRING *name,
Item *val);
bool set_system_variable(THD *thd, enum_var_type var_type,
const LEX_CSTRING *name1,
const LEX_CSTRING *name2,
Item *val);
bool set_default_system_variable(enum_var_type var_type,
const LEX_CSTRING *name,
Item *val);
bool set_variable(const LEX_CSTRING *name, Item *item);
3. Changing the grammar to call the new methods directly
in option_value_no_option_type,
Removing rules internal_variable_name and
internal_variable_name_directly_assignable.
4. Removing "struct sys_var_with_base" and trg_new_row_fake_var.
Good side effect:
- The code in /sql reduced from 314 to 183 lines.
- MDEV-15615 Unexpected syntax error instead of "Unknown system variable" ...
was also fixed automatically
Backporting from bb-10.2-compatibility to bb-10.2-ext
Version: 2018-01-26
- CREATE PACKAGE [BODY] statements are now
entirely written to mysql.proc with type='PACKAGE' and type='PACKAGE BODY'.
- CREATE PACKAGE BODY now supports IF NOT EXISTS
- DROP PACKAGE BODY now supports IF EXISTS
- CREATE OR REPLACE PACKAGE [BODY] is now supported
- CREATE PACKAGE [BODY] now support the DEFINER clause:
CREATE DEFINER user@host PACKAGE pkg ... END;
CREATE DEFINER user@host PACKAGE BODY pkg ... END;
- CREATE PACKAGE [BODY] now supports SQL SECURITY and COMMENT clauses, e.g.:
CREATE PACKAGE p1 SQL SECURITY INVOKER COMMENT "comment" AS ... END;
- Package routines are now created from the package CREATE PACKAGE BODY
statement and don't produce individual records in mysql.proc.
- CREATE PACKAGE BODY now supports package-wide variables.
Package variables can be read and set inside package routines.
Package variables are stored in a separate sp_rcontext,
which is cached in THD on the first packate routine call.
- CREATE PACKAGE BODY now supports the initialization section.
- All public routines (i.e. declared in CREATE PACKAGE)
must have implementations in CREATE PACKAGE BODY
- Only public package routines are available outside of the package
- {CREATE|DROP} PACKAGE [BODY] now respects CREATE ROUTINE and ALTER ROUTINE
privileges
- "GRANT EXECUTE ON PACKAGE BODY pkg" is now supported
- SHOW CREATE PACKAGE [BODY] is now supported
- SHOW PACKAGE [BODY] STATUS is now supported
- CREATE and DROP for PACKAGE [BODY] now works for non-current databases
- mysqldump now supports packages
- "SHOW {PROCEDURE|FUNCTION) CODE pkg.routine" now works for package routines
- "SHOW PACKAGE BODY CODE pkg" now works (the package initialization section)
- A new package body level MDL was added
- Recursive calls for package procedures are now possible
- Routine forward declarations in CREATE PACKATE BODY are now supported.
- Package body variables now work as SP OUT parameters
- Package body variables now work as SELECT INTO targets
- Package body variables now support ROW, %ROWTYPE, %TYPE
- CREATE PACKAGE [BODY] statements are now
entirely written to mysql.proc with type='PACKAGE' and type='PACKAGE BODY'.
- CREATE PACKAGE BODY now supports IF NOT EXISTS
- DROP PACKAGE BODY now supports IF EXISTS
- CREATE OR REPLACE PACKAGE [BODY] is now supported
- CREATE PACKAGE [BODY] now support the DEFINER clause:
CREATE DEFINER user@host PACKAGE pkg ... END;
CREATE DEFINER user@host PACKAGE BODY pkg ... END;
- CREATE PACKAGE [BODY] now supports SQL SECURITY and COMMENT clauses, e.g.:
CREATE PACKAGE p1 SQL SECURITY INVOKER COMMENT "comment" AS ... END;
- Package routines are now created from the package CREATE PACKAGE BODY
statement and don't produce individual records in mysql.proc.
- CREATE PACKAGE BODY now supports package-wide variables.
Package variables can be read and set inside package routines.
Package variables are stored in a separate sp_rcontext,
which is cached in THD on the first packate routine call.
- CREATE PACKAGE BODY now supports the initialization section.
- All public routines (i.e. declared in CREATE PACKAGE)
must have implementations in CREATE PACKAGE BODY
- Only public package routines are available outside of the package
- {CREATE|DROP} PACKAGE [BODY] now respects CREATE ROUTINE and ALTER ROUTINE
privileges
- "GRANT EXECUTE ON PACKAGE BODY pkg" is now supported
- SHOW CREATE PACKAGE [BODY] is now supported
- SHOW PACKAGE [BODY] STATUS is now supported
- CREATE and DROP for PACKAGE [BODY] now works for non-current databases
- mysqldump now supports packages
- "SHOW {PROCEDURE|FUNCTION) CODE pkg.routine" now works for package routines
- "SHOW PACKAGE BODY CODE pkg" now works (the package initialization section)
- A new package body level MDL was added
- Recursive calls for package procedures are now possible
- Routine forward declarations in CREATE PACKATE BODY are now supported.
- Package body variables now work as SP OUT parameters
- Package body variables now work as SELECT INTO targets
- Package body variables now support ROW, %ROWTYPE, %TYPE
Standard compatible behavior for UPDATE: all assignments in SET
are executed "simultaneously", not left-to-right. And `SET a=b,b=a`
will swap the values.
This is a joint patch fixing the following problems:
MDEV-12875 Wrong VIEW column data type for COALESCE(int_column)
MDEV-12886 Different default for INT and BIGINT column in a VIEW for a SELECT with ROLLUP
MDEV-12916 Wrong column data type for an INT field of a cursor-anchored ROW variable
All above problem happened because the global function ::create_tmp_field()
called the top-level Item::create_tmp_field(), which made some tranformation
for INT-result data types. For example, INT(11) became BIGINT(11), because 11
is a corner case and it's not known if it fits or does not fit into INT range,
so Item::create_tmp_field() converted it to BIGINT(11) for safety.
The main idea of this patch is to avoid such tranformations.
1. Fixing Item::create_tmp_field() not to have a special case for INT_RESULT.
Item::create_tmp_field() is changed not to have a special case
for INT_RESULT (which earlier made a decision based on Item's max_length).
It now calls tmp_table_field_from_field_type() for INT_RESULT,
therefore preserves the original data type (e.g. INT, YEAR) without
conversion to BIGINT.
This change is valid, because a number of recent fixes
(e.g. in Item_func_int, Item_hybrid_func, Item_int, Item_splocal)
guarantee that item->type_handler() now properly returns
type_handler_long vs type_handler_longlong. So no adjustment by length
is needed any more for Items returning INT_RESULT.
After this change, Item::create_tmp_field() calls
tmp_table_field_from_field_type() for all XXX_RESULT, except REAL_RESULT.
2. Fixing Item::create_tmp_field() not to have a special case for REAL_RESULT.
Note, the reason for a special case for REAL_RESULT is to have a special
constructor for Field_double(), forcing Field_real::not_fixed to be set
to true.
Taking into account that only Item_sum descendants actually need a special
constructor call Field_double(not_fixed=true), not too loose precision
when mixing individual rows to the aggregate result:
- renaming Item::create_tmp_field() to Item_sum::create_tmp_field().
- changing Item::create_tmp_field() just to call
tmp_table_field_from_field_type() for all XXX_RESULT types.
A special case for REAL_RESULT in Item::create_tmp_field() is now gone.
Item::create_tmp_field() is now symmetric for all XXX_RESULT types,
and now just calls tmp_table_field_from_field_type().
3. Fixing Item_func::create_field_for_create_select() not to have
a special case for STRING_RESULT.
After changes #1 and #2, the code in
Item_func::create_field_for_create_select(), testing result_type(),
becomes useless, because: now Item::create_tmp_field() and
tmp_table_field_from_field_type() do exactly the same thing for all
XXX_RESULT types for Item_func descendants:
a. It calls tmp_table_field_from_field_type for STRING_RESULT directly.
b. For other XXX_RESULT, it goes through Item::create_tmp_field(),
which calls the global function ::create_tmp_field(),
which calls item->create_tmp_field() for FUNC_ITEM,
which calls tmp_table_field_from_field_type() again.
So removing the virtual implementation of
Item_func::create_field_for_create_select().
The inherited Item::create_field_for_create_select() now perfectly
does the job, as it also calls tmp_table_field_from_field_type()
for FUNC_ITEM, independently from XXX_RESULT type.
4. Taking into account #1 and #2, as well as some recent changes,
removing virtual implementations:
- Item_hybrid_func::create_tmp_field()
- Item_hybrid_func::create_field_for_create_select()
- Item_int_func::create_tmp_field()
- Item_int_func::create_field_for_create_select()
- Item_temporal_func::create_field_for_create_select()
The derived versions from Item now perfectly work.
5. Moving a piece of code from create_tmp_field_from_item()
to a new function create_tmp_field_from_item_finalize(),
to reuse it in two places (see #6).
6. Changing the code responsible for BIT->INT/BIGIN tranformation
(which is called for the cases when the created table, e.g. HEAP,
does not fully support BIT) not to call create_tmp_field_from_item(),
because the latter now calls tmp_table_field_from_field_type() instead
of create_tmp_field() and thefore cannot do BIT transformation.
So rewriting this code using a sequence of these calls:
- item->type_handler_long_or_longlong()
- handler->make_and_init_table_field()
- create_tmp_field_from_item_finalize()
7. Miscelaneous changes:
- Moving type_handler_long_or_longlong() from "protected" to "public",
as it's now needed in the global function create_tmp_field().
8. The above changes fixed MDEV-12875, MDEV-12886, MDEV-12916.
So adding tests for these bugs.
1. Implementing the task according to the description:
a. Adding Type_handler::type_handler_for_tmp_table().
b. Adding Type_handler::type_handler_for_union_table.
c. Adding helper methods Type_handler::varstring_type_handler(const Item*),
Type_handler::blob_type_handler(const Item*)
d. Removing Item::make_string_field() and
Item_func_group_concat::make_string_field().
They are not needed any more.
e. Simplifying Item::tmp_table_field_from_field_type() to just two lines.
f. Renaming Item_type_holder::make_field_by_type() and implementing
virtual Item_type_holder::create_tmp_field() instead.
The new implementation is also as simple as two lines.
g. Adding a new virtual method Type_all_attributes::get_typelib(),
to access to TYPELIB definitions for ENUM and SET columns.
h. Simplifying the code branch for TIME_RESULT, DECIMAL_RESULT, STRING_RESULT
in Item::create_tmp_field(). It's now just one line.
i. Implementing Type_handler_enum::make_table_field() and
Type_handler_set::make_table_field().
2. Code simplification in Field_str constructor calls.
a. Changing the "CHARSET_INFO *cs" argument in constuctors for Field_str
and its descendants to "const DTCollation &collation". This is to
avoid two step initialization:
- setting Field_str::derivation and Field_str::repertoire to the
default values first
- then resetting them using:
set_derivation(item->derivation, item->repertoire).
b. Removing Field::set_derivation()
c. Adding a new constructor DTCollation(CHARSET_INFO *cs),
for the old code compatibility.
3. Changes in test results
As a side effect some test results have changed, because
in the old version Item::make_string_field() converted
TINYBLOB to VARCHAR(255). Now TINYBLOB is preserved.
a. sp-row.result
This query:
CREATE TABLE t1 AS SELECT tinyblob_sp_variable;
Now preserves TINYBLOB as the data type.
Before the patch a VARCHAR(255) was created.
b. gis-debug.result
This is a debug test, to make sure that + and - operators
are commutative and non-commutative correspondingly.
The exact data type is not really important.
(But anyway, it now chooses a better data type that fits the result)
Working features:
CREATE OR REPLACE [TEMPORARY] SEQUENCE [IF NOT EXISTS] name
[ INCREMENT [ BY | = ] increment ]
[ MINVALUE [=] minvalue | NO MINVALUE ]
[ MAXVALUE [=] maxvalue | NO MAXVALUE ]
[ START [ WITH | = ] start ] [ CACHE [=] cache ] [ [ NO ] CYCLE ]
ENGINE=xxx COMMENT=".."
SELECT NEXT VALUE FOR sequence_name;
SELECT NEXTVAL(sequence_name);
SELECT PREVIOUS VALUE FOR sequence_name;
SELECT LASTVAL(sequence_name);
SHOW CREATE SEQUENCE sequence_name;
SHOW CREATE TABLE sequence_name;
CREATE TABLE sequence-structure ... SEQUENCE=1
ALTER TABLE sequence RENAME TO sequence2;
RENAME TABLE sequence TO sequence2;
DROP [TEMPORARY] SEQUENCE [IF EXISTS] sequence_names
Missing features
- SETVAL(value,sequence_name), to be used with replication.
- Check replication, including checking that sequence tables are marked
not transactional.
- Check that a commit happens for NEXT VALUE that changes table data (may
already work)
- ALTER SEQUENCE. ANSI SQL version of setval.
- Share identical sequence entries to not add things twice to table list.
- testing insert/delete/update/truncate/load data
- Run and fix Alibaba sequence tests (part of mysql-test/suite/sql_sequence)
- Write documentation for NEXT VALUE / PREVIOUS_VALUE
- NEXTVAL in DEFAULT
- Ensure that NEXTVAL in DEFAULT uses database from base table
- Two NEXTVAL for same row should give same answer.
- Oracle syntax sequence_table.nextval, without any FOR or FROM.
- Sequence tables are treated as 'not read constant tables' by SELECT; Would
be better if we would have a separate list for sequence tables so that
select doesn't know about them, except if refereed to with FROM.
Other things done:
- Improved output for safemalloc backtrack
- frm_type_enum changed to Table_type
- Removed lex->is_view and replaced with lex->table_type. This allows
use to more easy check if item is view, sequence or table.
- Added table flag HA_CAN_TABLES_WITHOUT_ROLLBACK, needed for handlers
that want's to support sequences
- Added handler calls:
- engine_name(), to simplify getting engine name for partition and sequences
- update_first_row(), to be able to do efficient sequence implementations.
- Made binlog_log_row() global to be able to call it from ha_sequence.cc
- Added handler variable: row_already_logged, to be able to flag that the
changed row is already logging to replication log.
- Added CF_DB_CHANGE and CF_SCHEMA_CHANGE flags to simplify
deny_updates_if_read_only_option()
- Added sp_add_cfetch() to avoid new conflicts in sql_yacc.yy
- Moved code for add_table_options() out from sql_show.cc::show_create_table()
- Added String::append_longlong() and used it in sql_show.cc to simplify code.
- Added extra option to dd_frm_type() and ha_table_exists to indicate if
the table is a sequence. Needed by DROP SQUENCE to not drop a table.
Parse context frames (sp_pcontext) can have holes in variable run-time offsets,
the missing offsets reside on the children contexts in such cases.
Example:
CREATE PROCEDURE p1() AS
x0 INT:=100; -- context 0, position 0, run-time 0
CURSOR cur(
p0 INT, -- context 1, position 0, run-time 1
p1 INT -- context 1, position 1, run-time 2
) IS SELECT p0, p1;
x1 INT:=101; -- context 0, position 1, run-time 3
BEGIN
...
END;
Fixing a few methods to take this into account:
- sp_pcontext::find_variable()
- sp_pcontext::retrieve_field_definitions()
- LEX::sp_variable_declarations_init()
- LEX::sp_variable_declarations_finalize()
- LEX::sp_variable_declarations_rowtype_finalize()
- LEX::sp_variable_declarations_with_ref_finalize()
Adding a convenience method:
sp_pcontext::get_last_context_variable(uint offset_from_the_end);
to access variables from the end, rather than from the beginning.
This helps to loop through the context variable array (m_vars)
on the fragment that does not have any holes.
Additionally, renaming sp_pcontext::find_context_variable() to
sp_pcontext::get_context_variable(). This method simply returns
the variable by its index. So let's rename to avoid assumptions
that some heavy lookup is going on inside.