Problem:
sp_cache erroneously looked up fully qualified SP names (e.g. `DB`.`SP`)
in a case insensitive way. This was wrong, because only the "name"
part is always case insensitive, while the "db" part should be compared
according to lower_case_table_names (case sensitively for 0,
case insensitively for 1 and 2).
Fix:
Adding a "casedn_name" parameter to make_qname() to tell
if the name part should be lower-cased:
`DB1`.`SP` -> "DB1.SP" (when casedn_name=false)
`DB1`.`SP` -> "DB1.sp" (when casedn_name=true)
and using make_qname() with casedn_name=true when creating
sp_cache hash lookup keys.
Details:
As a result, it now works as follows:
- sp_head::m_db is converted to lower case if lower_case_table_names>0
during the sp_name initialization phase. So when make_qname() is called,
sp_head::m_db is already normalized. There are no changes in here.
- The initialization phase of sp_head when creating sp_head::m_qname
now calls make_qname() with casedn_name=true,
so sp_head::m_name gets written to sp_head::m_qname in lower case.
- sp_cache_lookup() now also calls make_qname() with casedn_name=true,
so sp_head::m_name gets written to the temporary lookup key in lower case.
- sp_cache::m_hashtable now uses case sensitive comparison.
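The key construction described above can be sketched as follows. This is a
simplified model, not the server code: std::string replaces LEX_CSTRING and
MEM_ROOT, ascii_lower() and normalize_db() are hypothetical helpers, and
ASCII-only lower-casing stands in for the real identifier collation.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: ASCII-only lower-casing (a simplification).
static std::string ascii_lower(std::string s)
{
  for (char &c : s)
    c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  return s;
}

// Hypothetical helper modelling the sp_name initialization phase:
// the db part is lower-cased only when lower_case_table_names > 0.
std::string normalize_db(const std::string &db, int lower_case_table_names)
{
  return lower_case_table_names > 0 ? ascii_lower(db) : db;
}

// Build "db.name". The db part is taken as is (already normalized);
// the name part is lower-cased only when casedn_name is true.
std::string make_qname(const std::string &db, const std::string &name,
                       bool casedn_name)
{
  assert(!name.empty());  // a routine name is never empty
  return db + "." + (casedn_name ? ascii_lower(name) : name);
}
```

With keys built this way, a byte-wise (case sensitive) hash comparison is
enough: `DB1`.`SP` and `DB1`.`sp` map to the same key, while `DB1`.`SP` and
`db1`.`SP` stay distinct when lower_case_table_names=0.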
Part#1 A non-functional change
Changing the signature of Identifier_chain2::make_qname() from
bool make_qname(MEM_ROOT *mem_root, LEX_CSTRING *dst) const;
to
LEX_CSTRING make_qname(MEM_ROOT *mem_root) const;
Now the result is returned from the function as a LEX_CSTRING rather than
being passed back through an output parameter.
The return value {NULL,0} means "EOM".
This is a prerequisite step that makes it easier to fix and merge
MDEV-33019 The database part is not case sensitive in SP names
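A minimal sketch of the "return by value, {NULL,0} on EOM" convention.
LexCString and FixedArena are hypothetical stand-ins for LEX_CSTRING and
MEM_ROOT, and the free function below only imitates the shape of
Identifier_chain2::make_qname(); it is not the actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for LEX_CSTRING.
struct LexCString { const char *str; std::size_t length; };

// Hypothetical toy allocator that can run out of memory.
struct FixedArena
{
  char buf[32];
  std::size_t used = 0;
  char *alloc(std::size_t n)
  {
    if (n > sizeof(buf) - used)
      return nullptr;              // simulated EOM
    char *p = buf + used;
    used += n;
    return p;
  }
};

// The result is the return value; {nullptr,0} signals EOM to the caller.
LexCString make_qname(FixedArena *root, const char *db, const char *name)
{
  assert(root && db && name);
  const std::size_t db_len = std::strlen(db);
  const std::size_t name_len = std::strlen(name);
  const std::size_t len = db_len + 1 + name_len;
  char *dst = root->alloc(len + 1);
  if (!dst)
    return {nullptr, 0};
  std::memcpy(dst, db, db_len);
  dst[db_len] = '.';
  std::memcpy(dst + db_len + 1, name, name_len + 1);  // includes the '\0'
  return {dst, len};
}
```

A caller then checks the str member instead of a separate bool status.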
The original MDEV-31991 commit comment:
- Moving some of Database_qualified_name methods into a new class
Identifier_chain2.
- Changing the data type of the following variables from
Database_qualified_name to Identifier_chain2:
* q_pkg_proc in LEX::call_statement_start()
* q_pkg_func in LEX::make_item_func_call_generic()
Rationale:
The data type of Database_qualified_name::m_db will be changed
to Lex_ident_db soon. So Database_qualified_name won't be able
to store the `pkg.routine` part of `db.pkg.routine` any more,
because `pkg` must not depend on lower-case-table-names.
This patch adds PACKAGE support with SQL/PSM dialect for sql_mode=DEFAULT:
- CREATE PACKAGE
- DROP PACKAGE
- CREATE PACKAGE BODY
- DROP PACKAGE BODY
- Package function and procedure invocation from outside of the package:
-- using two step identifiers
SELECT pkg.f1();
CALL pkg.p1();
-- using three step identifiers
SELECT db.pkg.f1();
CALL db.pkg.p1();
This is a non-standard MariaDB extension.
However, later this code can be used to implement
the SQL Standard and DB2 dialects of CREATE MODULE.
1. WITHOUT/WITH VALIDATION may be added to EXCHANGE PARTITION or CONVERT TABLE:
alter table tp exchange partition p1 with table t with validation;
alter table tp exchange partition p1 with table t; -- same as with validation
alter table tp exchange partition p1 with table t without validation;
2. Optional THAN keyword for RANGE partitioning. Normally you type:
create table tp (a int primary key) partition by range (a) (
partition p0 values less than (100),
partition p1 values less than maxvalue);
Now you may type (PARTITION keyword is also optional):
create table tp (a int primary key) partition by range (a) (
p0 values less (100),
p1 values less maxvalue);
This is a follow-up patch that removes the explicit use of thd->stmt_arena
for memory allocation and replaces it with a call to the method
THD::active_stmt_arena_to_use().
Additionally, this patch adds an extra DBUG_ASSERT to check that the right
query arena is in use.
This patch is a follow-up for the task
MDEV-23902: MariaDB crash on calling function
to use the correct query arena for a statement. If invocation of
a function is in progress, use its call arena; otherwise use the current
query arena, which can be either a statement or a regular query arena.
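The selection rule can be modelled roughly as below. QueryArena, Thd and the
call_arena member are hypothetical reductions made up for illustration; the
real THD tracks this state differently, so treat this only as a sketch of
the rule stated in the message.

```cpp
#include <cassert>

// Hypothetical, heavily reduced model of Query_arena.
struct QueryArena { const char *label; };

// Hypothetical, heavily reduced model of THD.
struct Thd
{
  QueryArena *stmt_arena;  // current statement or regular query arena
  QueryArena *call_arena;  // assumed: non-null while a function is invoked

  // Use the call arena while a function invocation is in progress,
  // otherwise fall back to the current query arena.
  QueryArena *active_stmt_arena_to_use() const
  {
    QueryArena *arena = call_arena ? call_arena : stmt_arena;
    assert(arena != nullptr);  // plays the role of the extra DBUG_ASSERT
    return arena;
  }
};
```

Routing every allocation through one accessor keeps the choice of arena in
a single place instead of repeating it at each thd->stmt_arena use.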
When parsing statements like (SELECT .. FROM ..) ORDER BY <expr>,
there is a step LEX::add_tail_to_query_expression_body_ext_parens()
which calls LEX::wrap_unit_into_derived(). After that the statement
looks like SELECT * FROM (SELECT .. FROM ..), and parser's
Lex_order_limit_lock structure (ORDER BY <expr>) is assigned to
the new SELECT. But what is missing here is that Items in
Lex_order_limit_lock are left with their original name resolution
contexts, and fix_fields() later resolves the names incorrectly.
For example, when processing
(SELECT * FROM t1 JOIN t2 ON a=b) ORDER BY a
Item_field 'a' in the ORDER BY clause is left with the name resolution
context of the derived table (first_name_resolution_table='t1'), so
it is resolved to 't1.a', which is incorrect.
After LEX::wrap_unit_into_derived() the statement looks like
SELECT * FROM (SELECT * FROM t1 JOIN t2 ON a=b) AS '__2' ORDER BY a,
and the name resolution context for Item_field 'a' in the ORDER BY
must be set to the wrapping SELECT's one.
This commit fixes the issue by changing context for Items in
Lex_order_limit_lock after LEX::wrap_unit_into_derived().
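The idea of the fix, reduced to a toy model: when the ORDER BY items are
moved to the wrapping SELECT, their name resolution context must move too.
All types and the attach_order_to_wrapper() function are hypothetical
illustrations, not the server's Item or st_select_lex classes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical reduced models of the parser structures.
struct NameResolutionContext { std::string first_table; };
struct ItemField { std::string name; NameResolutionContext *context; };

struct Select
{
  NameResolutionContext context;
  std::vector<ItemField *> order_by;
};

// Attach the ORDER BY items to the wrapping SELECT and re-point each
// item at the wrapper's context, so names resolve against the derived
// table rather than against the tables of the wrapped SELECT.
void attach_order_to_wrapper(Select &wrapper,
                             const std::vector<ItemField *> &order)
{
  for (ItemField *item : order)
  {
    assert(item != nullptr);
    item->context = &wrapper.context;  // the step that was missing
    wrapper.order_by.push_back(item);
  }
}
```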
and to related methods and their parameters:
- The return value of Spvar_definition::m_column_type_ref()
- The parameter of Spvar_definition::set_column_type_ref()
- The method Qualified_column_ident::resolve_type_ref()
- The parameter of LEX::sp_variable_declarations_column_type_finalize()
The crash happened with an indexed virtual column whose
value is evaluated using a function that has a different meaning
in sql_mode='' vs sql_mode=ORACLE:
- DECODE()
- LTRIM()
- RTRIM()
- LPAD()
- RPAD()
- REPLACE()
- SUBSTR()
For example:
CREATE TABLE t1 (
b VARCHAR(1),
g CHAR(1) GENERATED ALWAYS AS (SUBSTR(b,0,0)) VIRTUAL,
KEY g(g)
);
So far we had replacement XXX_ORACLE() functions for all mentioned functions,
e.g. SUBSTR_ORACLE() for SUBSTR(). So it was possible to correctly re-parse
SUBSTR_ORACLE() even in sql_mode=''.
But it was not possible to re-parse the MariaDB version of SUBSTR()
after switching to sql_mode=ORACLE. It was erroneously mis-interpreted
as SUBSTR_ORACLE().
As a result, this combination worked fine:
SET sql_mode=ORACLE;
CREATE TABLE t1 ... g CHAR(1) GENERATED ALWAYS AS (SUBSTR(b,0,0)) VIRTUAL, ...;
INSERT ...
FLUSH TABLES;
SET sql_mode='';
INSERT ...
But the other way around it crashed:
SET sql_mode='';
CREATE TABLE t1 ... g CHAR(1) GENERATED ALWAYS AS (SUBSTR(b,0,0)) VIRTUAL, ...;
INSERT ...
FLUSH TABLES;
SET sql_mode=ORACLE;
INSERT ...
At CREATE time, SUBSTR was instantiated as Item_func_substr and printed
in the FRM file as substr(). At re-open time with sql_mode=ORACLE, "substr()"
was erroneously instantiated as Item_func_substr_oracle.
Fix:
The fix proposes a symmetric solution. It provides a way to re-parse reliably
all sql_mode dependent functions to their original CREATE TABLE time meaning,
no matter what the open-time sql_mode is.
We take advantage of the same idea we previously used to resolve sql_mode
dependent data types.
Now all sql_mode dependent functions are printed by SHOW using a schema
qualifier when the current sql_mode differs from the function sql_mode:
SET sql_mode='';
CREATE TABLE t1 ... SUBSTR(a,b,c) ..;
SET sql_mode=ORACLE;
SHOW CREATE TABLE t1; -> mariadb_schema.substr(a,b,c)
SET sql_mode=ORACLE;
CREATE TABLE t2 ... SUBSTR(a,b,c) ..;
SET sql_mode='';
SHOW CREATE TABLE t2; -> oracle_schema.substr(a,b,c)
Old replacement names like substr_oracle() are still understood for
backward compatibility and used in FRM files (for downgrade compatibility),
but they are not printed by SHOW any more.
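The printing rule for SHOW can be sketched like this. SqlMode and
print_func_name() are hypothetical names, and only the two modes discussed
here are modelled.

```cpp
#include <cassert>
#include <string>

// Hypothetical two-value model of the relevant sql_mode settings.
enum class SqlMode { Default, Oracle };

// Print a sql_mode dependent function name. The schema qualifier is
// added only when the current sql_mode differs from the sql_mode the
// function was created under, so the text re-parses to the same meaning.
std::string print_func_name(const std::string &name, SqlMode func_mode,
                            SqlMode current_mode)
{
  assert(!name.empty());
  if (func_mode == current_mode)
    return name;
  const char *schema =
      func_mode == SqlMode::Oracle ? "oracle_schema." : "mariadb_schema.";
  return std::string(schema) + name;
}
```

Because the qualifier names the creation-time dialect explicitly, the
result no longer depends on which sql_mode is active when it is parsed.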
This commit addresses column naming issues with CTEs in the use of prepared
statements and stored procedures. Usage of either prepared statements or
procedures with Common Table Expressions and column renaming may be affected.
There are three related but different issues addressed here.
1) First execution issue. Consider the following
prepare s from "with cte (col1, col2) as (select a as c1, b as c2 from t
order by c1) select col1, col2 from cte";
execute s;
After parsing, items in the select are named (c1,c2), order by (and group by)
resolution is performed, then item names are set to (col1, col2).
When the statement is executed, context analysis is again performed, but
resolution of elements in the order by statement will not be able to find c1,
because it was renamed to col1 and remains this way.
The solution is to save the names of these items during context resolution
before they have been renamed. We can then reset item names back to those after
parsing so first execution can resolve items referred to in order and group by
clauses.
2) Second Execution Issue
When the derived table contains more than one select 'unioned' together we could
reasonably think that dealing with only items in the first select (which
determines names in the resultant table) would be sufficient. This can lead to
a different problem. Consider
prepare st from "with cte (c1,c2) as
(select a as col1, sum(b) as col2 from t1 where a > 0 group by col1
union select a as col3, sum(b) as col4 from t2 where b > 2 group by col3)
select * from cte where c1=1";
When the optimizer (only run during the first execution) pushes the outside
condition "c1=1" into every select in the derived table union, it renames the
items to make the condition valid. In this example, this leaves the first item
in the second select named 'c1'. The second execution will now fail 'group by'
resolution.
Again, the solution is to save the names during context analysis, resetting
before subsequent resolution, but making sure that we save/reset the item
names in all the selects in this union.
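The save/reset scheme for issues 1 and 2 can be illustrated with a toy
model. Item, Select, Unit and both method names are hypothetical; the point
is only that names are saved for every select of the union and restored
before each subsequent resolution.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical reduced models: an item with a name, a select list,
// and a unit holding all selects of a UNION.
struct Item { std::string name; };
using Select = std::vector<Item>;

struct Unit
{
  std::vector<Select> selects;
  std::vector<std::vector<std::string>> saved_names;

  // Remember the post-parse names of the items in every select.
  void save_item_names()
  {
    saved_names.clear();
    for (const Select &sel : selects)
    {
      std::vector<std::string> names;
      for (const Item &item : sel)
        names.push_back(item.name);
      saved_names.push_back(names);
    }
  }

  // Restore the saved names before the next round of name resolution.
  void reset_item_names()
  {
    assert(saved_names.size() == selects.size());
    for (std::size_t i = 0; i < selects.size(); i++)
    {
      assert(saved_names[i].size() == selects[i].size());
      for (std::size_t j = 0; j < selects[i].size(); j++)
        selects[i][j].name = saved_names[i][j];
    }
  }
};
```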
3) Memory Leak
During parsing Item::set_name() is used to allocate memory in the statement
arena. We cannot use this call during statement execution as this represents
a memory leak. We directly set the item list names to those in the column list
of this CTE (also allocated during parsing).
Approved by Igor Babaev <igor@mariadb.com>
Changing the code handling sql_mode-dependent function DECODE():
- removing parser tokens DECODE_MARIADB_SYM and DECODE_ORACLE_SYM
- removing the DECODE() related code from sql_yacc.yy/sql_yacc_ora.yy
- adding handling of DECODE() with help of a new Create_func_func_decode
An "ITERATE innerLoop" did not work properly inside
a WHILE loop, which itself is inside an outer FOR loop:
outerLoop:
FOR
...
innerLoop:
WHILE
...
ITERATE innerLoop;
...
END WHILE;
...
END FOR;
It erroneously generated an integer increment code for the outer FOR loop.
There were two problems:
1. "ITERATE innerLoop" worked like "ITERATE outerLoop"
2. It was always integer increment, even in case of FOR cursor loops.
Background:
- A FOR loop automatically creates a dedicated sp_pcontext stack entry,
to put the iteration and bound variables on it.
- Other loop types (LOOP, WHILE, REPEAT) do not generate a dedicated
stack entry.
The old code erroneously assumed that sp_pcontext::m_for_loop
either describes the innermost loop (in case the inner loop is FOR),
or is empty (in case the inner loop is not FOR).
But in fact, sp_pcontext::m_for_loop is never empty inside a FOR loop:
it describes the closest FOR loop, even if this FOR loop has nested
non-FOR loops inside.
So when we're near the ITERATE statement in the above script,
sp_pcontext::m_for_loop is not empty - it stores information about
the FOR loop labeled as "outerLoop:".
Fix:
- Adding a new member sp_pcontext::Lex_for_loop::m_start_label,
to remember the explicit or the auto-generated label corresponding
to the start of the FOR body. It's used during generation
of "ITERATE loop_label" code to check if "loop_label" belongs
to the current FOR loop pointed to by sp_pcontext::m_for_loop,
or belongs to a non-FOR nested loop.
- Adding LEX methods sp_for_loop_intrange_iterate() and
sp_for_loop_cursor_iterate() to reuse the code between
methods handling:
* ITERATE
* END FOR
- Adding a test for Lex_for_loop::is_for_loop_cursor()
to generate code for either a cursor fetch or an integer increment.
Before this change, it always erroneously generated an integer increment
version.
- Cleanup: Initialize Lex_for_loop_st::m_cursor_offset inside
Lex_for_loop_st::init(), to avoid uninitialized members.
- Cleanup: Removing a redundant method:
Lex_for_loop_st::init(const Lex_for_loop_st &other)
Using Lex_for_loop_st::operator=(const Lex_for_loop_st &other) instead.
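The decision made when generating code for ITERATE can be sketched as
follows. ForLoop, IterateCode and code_for_iterate() are hypothetical
models; start_label and is_cursor only mirror the roles of m_start_label
and is_for_loop_cursor() described above.

```cpp
#include <cassert>
#include <string>

// Hypothetical classification of the code emitted for ITERATE.
enum class IterateCode { JumpOnly, IntIncrement, CursorFetch };

// Hypothetical model of the closest enclosing FOR loop.
struct ForLoop
{
  bool active;              // false when there is no enclosing FOR loop
  std::string start_label;  // models m_start_label
  bool is_cursor;           // models is_for_loop_cursor()
};

// Emit FOR-specific code only if the label names the closest FOR loop.
// A label of a nested LOOP/WHILE/REPEAT gets a plain jump.
IterateCode code_for_iterate(const ForLoop &closest_for,
                             const std::string &label)
{
  assert(!label.empty());
  if (!closest_for.active || closest_for.start_label != label)
    return IterateCode::JumpOnly;
  return closest_for.is_cursor ? IterateCode::CursorFetch
                               : IterateCode::IntIncrement;
}
```

For the script above, "ITERATE innerLoop" falls into the plain jump case,
because the closest FOR loop carries the label "outerLoop".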
The function setup_windows() called at the prepare phase of processing a
select builds a list of all window specifications used in the select. This list
is built on the statement memory and it must be done only once.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
- Removing two copies of the drop_routine.
Adding a shared and much simplified version.
- Removing LEX methods:
bool stmt_drop_function(const DDL_options_st &options,
const Lex_ident_sys_st &db,
const Lex_ident_sys_st &name);
bool stmt_drop_function(const DDL_options_st &options,
const Lex_ident_sys_st &name);
bool stmt_drop_procedure(const DDL_options_st &options,
sp_name *name);
The code inside the methods was very similar.
Adding one method instead:
bool stmt_drop_routine(const Sp_handler *sph,
const DDL_options_st &options,
const Lex_ident_sys_st &db,
const Lex_ident_sys_st &name);
- Adding a new virtual method Sp_handler::sqlcom_drop().
It helped to unify the code inside the new stmt_drop_routine().
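How a virtual sqlcom_drop() lets one function serve both routine kinds can
be sketched as below. The classes, the SqlCommand enum and the DropStmt
struct are hypothetical reductions; the real method takes DDL_options_st
and Lex_ident_sys_st arguments and fills in LEX.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the SQL command codes involved.
enum class SqlCommand { DropProcedure, DropFunction };

// Hypothetical reduced model of Sp_handler and two of its subclasses.
struct SpHandler
{
  virtual ~SpHandler() = default;
  virtual SqlCommand sqlcom_drop() const = 0;
};

struct SpHandlerProcedure : SpHandler
{
  SqlCommand sqlcom_drop() const override { return SqlCommand::DropProcedure; }
};

struct SpHandlerFunction : SpHandler
{
  SqlCommand sqlcom_drop() const override { return SqlCommand::DropFunction; }
};

// Hypothetical record of what the parser would store for the statement.
struct DropStmt
{
  SqlCommand command;
  std::string db;
  std::string name;
  bool if_exists;
};

// One shared entry point: the handler supplies the only part that differs.
DropStmt stmt_drop_routine(const SpHandler &sph, bool if_exists,
                           const std::string &db, const std::string &name)
{
  assert(!name.empty());
  return DropStmt{sph.sqlcom_drop(), db, name, if_exists};
}
```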
Problem:
Under terms of MDEV-27490, we'll update the Unicode version used
to compare identifiers to 14.0.0. Unlike in the old Unicode version,
in the new version a string can grow during lower-casing. We cannot
perform check_db_name() in place any more.
Change summary:
- Allocate memory to store lower-cased identifiers in memory root
- Removing check_db_name(), which performed both in-place lower-casing and
validation at the same time. Splitting it into two separate stages:
* creating a memory-root lower-cased copy of an identifier
(using new MEM_ROOT functions and Query_arena wrapper methods)
* performing validation on a constant string
(using Lex_ident_fs methods)
Implementation details:
- Adding a mysys helper function to allocate lower-cased strings on MEM_ROOT:
lex_string_casedn_root()
and Query_arena wrappers for it:
make_ident_casedn()
make_ident_opt_casedn()
- Adding a Query_arena method to perform both MEM_ROOT lower-casing and
database name validation at the same time:
to_ident_db_internal_with_error()
This method is very close to the old (pre-11.3) check_db_name(),
but performs lower-casing into newly allocated MEM_ROOT
memory (instead of lower-casing the original string in place).
- Adding a Table_ident method which additionally handles derived table names:
to_ident_db_internal_with_error()
- Removing the old check_db_name()
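The two-stage approach can be sketched as follows. Everything here is an
assumption made for illustration: std::string replaces MEM_ROOT allocation,
lower-casing is ASCII-only (so the copy cannot grow, unlike with Unicode
14.0.0), and the validation rules and the length limit are simplified
placeholders rather than the actual Lex_ident_fs checks.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Illustrative limit; an assumption, not taken from the server sources.
const std::size_t kMaxDbNameLength = 64;

// Stage 1: produce a lower-cased copy; the source string is untouched.
std::string casedn_copy(const std::string &src)
{
  std::string dst = src;
  for (char &c : dst)
    c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  return dst;
}

// Stage 2: validate a constant string (simplified placeholder rules).
bool is_valid_db_name(const std::string &name)
{
  return !name.empty() && name.size() <= kMaxDbNameLength &&
         name.back() != ' ';
}

// Both stages combined. Returns true on success and stores the result
// in *dst; returns false and leaves *dst alone on a validation error.
bool to_ident_db(const std::string &src, int lower_case_table_names,
                 std::string *dst)
{
  assert(dst != nullptr);
  const std::string copy =
      lower_case_table_names > 0 ? casedn_copy(src) : src;
  if (!is_valid_db_name(copy))
    return false;
  *dst = copy;
  return true;
}
```

Since validation only reads its argument, it also works on identifiers
that are never lower-cased.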
- Moving some of Database_qualified_name methods into a new class
Identifier_chain2.
- Changing the data type of the following variables from
Database_qualified_name to Identifier_chain2:
* q_pkg_proc in LEX::call_statement_start()
* q_pkg_func in LEX::make_item_func_call_generic()
Rationale:
The data type of Database_qualified_name::m_db will be changed
to Lex_ident_db soon. So Database_qualified_name won't be able
to store the `pkg.routine` part of `db.pkg.routine` any more,
because `pkg` must not depend on lower-case-table-names.
Changing LEX_CSTRING* parameters of LEX::make_sp_name() to Lex_ident_sys_st.
This makes the code clearer because a value of Lex_ident_sys_st has
some guaranteed additional constraints over a base LEX_CSTRING:
- Its LEX_CSTRING::str is not NULL (sql_yacc.yy would abort otherwise)
- Its LEX_CSTRING::str is 0-terminated
- It is a valid utf8 string
- The string pointed by LEX_CSTRING::str was created on THD::mem_root
Also changing "pass by pointer" to "pass by reference",
as these parameters can never be NULL - they are Bison stack variables.