Also expand vcol field index coverings to include indexes covering all
the fields in the expression. The reasoning goes as follows: let f(c1,
c2, ..., cn) be a function on applied to columns c1, c2, ..., cn, if
f(...) is covered by an index, so should vc whose expression is
f(...).
For example, if t.vf = t.c1 + t.c2, and t has three indexes (vf), (c1,
c2), (c1).
Before this change, vf's index covering is a singleton {(vf)}. Let's call
that the "conventional" index covering.
After this change vf's index covering is now {(vf), (c1, c2)}, since
(c1, c2) covers both c1 and c2. Let's call (c1, c2) in this case the
"extra" covering.
With the coverings updated, when an index in the "extra" covering is
chosen for keyread, the vcol also needs to be calculated. In this case
we mark vcol in the table read_set, and ensure it is computed.
With these changes, we see various improvements, including from using
full table scan + filesort to full index scan + filesort when ORDER BY
an indexed vcol (here vc = c + 1 is a vcol and both c and vc are
indexes):
explain select c + 1 from t order by vc;
id select_type table type possible_keys key key_len ref rows Extra
-1 SIMPLE t ALL NULL NULL NULL NULL 10000 Using filesort
+1 SIMPLE t index NULL c 5 NULL 10000 Using index; Using filesort
The substitutions are followed updates to all_fields which include a
copy of the ORDER BY/GROUP BY item pointers, as well as corresponding
updates to ref_pointer_array so that the all_fields and
ref_pointer_array remain in sync.
Another, related change is the recomputation of table index covering
on substitutions. It not only reflects the correct table index
covering after the substitutions, but also improve executions where
the vcol index can be chosen, such as this example (here vc = c + 1
and vc is the only index in the table), from full table scan +
filesort to full index scan:
select vc from t order by c + 1;
We do it in SELECT as well as in single table DELETE/UPDATE.
This patch adds support for SYS_REFCURSOR (a weakly typed cursor)
for both sql_mode=ORACLE and sql_mode=DEFAULT.
Works as a regular stored routine variable, parameter and return value:
- can be passed as an IN parameter to stored functions and procedures
- can be passed as an INOUT and OUT parameter to stored procedures
- can be returned from a stored function
Note, strongly typed REF CURSOR will be added separately.
Note, to maintain dependencies easier, some parts of sql_class.h
and item.h were moved to new header files:
- select_results.h:
class select_result_sink
class select_result
class select_result_interceptor
- sp_cursor.h:
class sp_cursor_statistics
class sp_cursor
- sp_rcontext_handler.h
class Sp_rcontext_handler and its descendants
The implementation consists of the following parts:
- A new class sp_cursor_array deriving from Dynamic_array
- A new class Statement_rcontext which contains data shared
between sub-statements of a compound statement.
It has a member m_statement_cursors of the sp_cursor_array data type,
as well as open cursor counter. THD inherits from Statement_rcontext.
- A new data type handler Type_handler_sys_refcursor in plugins/type_cursor/
It is designed to store uint16 references -
positions of the cursor in THD::m_statement_cursors.
- Type_handler_sys_refcursor suppresses some derived numeric features.
When a SYS_REFCURSOR variable is used as an integer an error is raised.
- A new abstract class sp_instr_fetch_cursor. It's needed to share
the common code between "OPEN cur" (for static cursors) and
"OPER cur FOR stmt" (for SYS_REFCURSORs).
- New sp_instr classes:
* sp_instr_copen_by_ref - OPEN sys_ref_curor FOR stmt;
* sp_instr_cfetch_by_ref - FETCH sys_ref_cursor INTO targets;
* sp_instr_cclose_by_ref - CLOSE sys_ref_cursor;
* sp_instr_destruct_variable - to destruct SYS_REFCURSOR variables when
the execution goes out of the BEGIN..END block
where SYS_REFCURSOR variables are declared.
- New methods in LEX:
* sp_open_cursor_for_stmt - handles "OPEN sys_ref_cursor FOR stmt".
* sp_add_instr_fetch_cursor - "FETCH cur INTO targets" for both
static cursors and SYS_REFCURSORs.
* sp_close - handles "CLOSE cur" both for static cursors and SYS_REFCURSORs.
- Changes in cursor functions to handle both static cursors and SYS_REFCURSORs:
* Item_func_cursor_isopen
* Item_func_cursor_found
* Item_func_cursor_notfound
* Item_func_cursor_rowcount
- A new system variable @@max_open_cursors - to limit the number
of cursors (static and SYS_REFCURSORs) opened at the same time.
Its allowed range is [0-65536], with 50 by default.
- A new virtual method Type_handler::can_return_bool() telling
if calling item->val_bool() is allowed for Items of this data type,
or if otherwise the "Illegal parameter for operation" error should be raised
at fix_fields() time.
- New methods in Sp_rcontext_handler:
* get_cursor()
* get_cursor_by_ref()
- A new class Sp_rcontext_handler_statement to handle top level statement
wide cursors which are shared by all substatements.
- A new virtual method expr_event_handler() in classes Item and Field.
It's needed to close (and make available for a new OPEN)
unused THD::m_statement_cursors elements which do not have any references
any more. It can happen in various moments in time, e.g.
* after evaluation parameters of an SQL routine
* after assigning a cursor expression into a SYS_REFCURSOR variable
* when leaving a BEGIN..END block with SYS_REFCURSOR variables
* after setting OUT/INOUT routine actual parameters from formal
parameters.
* rpl.rpl_system_versioning_partitions updated for MDEV-32188
* innodb.row_size_error_log_warnings_3 changed error for MDEV-33658
(checks are done in a different order)
normalize_cond() translated `WHERE col` into `WHERE col<>0`
But the opetator "not equal to 0" does not necessarily exists
for all data types.
For example, the query:
SELECT * FROM t1 WHERE inet6col;
was translated to:
SELECT * FROM t1 WHERE inet6col<>0;
which further failed with this error:
ERROR : Illegal parameter data types inet6 and bigint for operation '<>'
This patch changes the translation from `col<>0` to `col IS TRUE`.
So now
SELECT * FROM t1 WHERE inet6col;
gets translated to:
SELECT * FROM t1 WHERE inet6col IS TRUE;
Details:
1. Implementing methods:
- Field_longstr::val_bool()
- Field_string::val_bool()
- Item::val_int_from_val_str()
If the input contains bad data,
these methods raise a better error message:
Truncated incorrect BOOLEAN value
Before the change, the error was:
Truncated incorrect DOUBLE value
2. Fixing normalize_cond() to generate Item_func_istrue/Item_func_isfalse
instances instead of Item_func_ne/Item_func_eq
3. Making Item_func_truth sargable, so it uses the range optimizer.
Implementing the following methods:
- get_mm_tree(), get_mm_leaf(), add_key_fields() in Item_func_truth.
- get_func_mm_tree(), for all Item_func_truth descendants.
4. Implementing the method negated_item() for all Item_func_truth
descendants, so the negated item has a chance to be sargable:
For example,
WHERE NOT col IS NOT FALSE -- this notation is not sargable
is now translated to:
WHERE col IS FALSE -- this notation is sargable
it's incorrect to zero out table->triggers->extra_null_bitmap
before a statement, because if insert uses an explicit field list
and omits a field that has no default value, the field should
get NULL implicitly. So extra_null_bitmap should have 1s for all
fields that have no defaults
* create extra_null_bitmap_init and initialize it as above
* copy extra_null_bitmap_init to extra_null_bitmap for inserts
* still zero out extra_null_bitmap for updates/deletes where
all fields definitely have a value
* make not_null_fields_have_null_values() to send
ER_NO_DEFAULT_FOR_FIELD for fields with no default and no value,
otherwise creation of a trigger with an empty body would change the
error message
MDEV-28127 did is_equal() which compared vcol expressions
literally. But another table vcol expression is not equal because of
different table name.
We implement another comparison method is_identical() which respects
different table name in vcol comparison. If any field item points to
table_A and compared field item points to table_B, such items are
treated as equal in (table_A, table_B) comparison. This is done by
cloning table_B expression and renaming any table_B entries to table_A
in it.
* sysvars should be REQUIRED_ARG
* fix a mix of US and UK spelling (use US)
* use consistent naming
* work if VEC_DISTANCE arguments are in the swapped order (const, col)
* work if VEC_DISTANCE argument is NULL/invalid or wrong length
* abort INSERT if the value is invalid or wrong length
* store the "number of neighbors" in a blob in endianness-independent way
* use field->store(longlong, bool) not field->store(double)
* a lot more error checking everywhere
* cleanup after errors
* simplify calling conventions, remove reinterpret_cast's
* todo/XXX comments
* whitespaces
* use float consistently
memory management is still totally PoC quality
the information about index algorithm was stored in two
places inconsistently split between both.
BTREE index could have key->algorithm == HA_KEY_ALG_BTREE, if the user
explicitly specified USING BTREE or HA_KEY_ALG_UNDEF, if not.
RTREE index had key->algorithm == HA_KEY_ALG_RTREE
and always had key->flags & HA_SPATIAL
FULLTEXT index had key->algorithm == HA_KEY_ALG_FULLTEXT
and always had key->flags & HA_FULLTEXT
HASH index had key->algorithm == HA_KEY_ALG_HASH or HA_KEY_ALG_UNDEF
long unique index always had key->algorithm == HA_KEY_ALG_LONG_HASH
In this commit:
All indexes except BTREE and HASH always have key->algorithm
set, HA_SPATIAL and HA_FULLTEXT flags are not used anymore (except
for storage to keep frms backward compatible).
As a side effect ALTER TABLE now detects FULLTEXT index renames correctly
to explicit row_start/row_end columns
In case of adding both system fields of same type (length, unsigned
flag) as old implicit system fields do the rename of implicit system
fields to the ones specified in ALTER, remove SYSTEM_INVISIBLE flag in
that case. Correct PERIOD clause must be specified in ALTER as well.
MDEV-34904 Inplace alter for implicit to explicit versioning is broken
Whether ALTER goes inplace and how it goes inplace depends on
handler_flags which goes from alter_info->flags by this logic:
ha_alter_info->handler_flags|= (alter_info->flags & ~flags_to_remove);
ALTER_VERS_EXPLICIT was not in flags_to_remove and its value (1ULL <<
35) clashed with ALTER_ADD_NON_UNIQUE_NON_PRIM_INDEX.
ALTER_VERS_EXPLICIT must not affect inplace, it is SQL-only so we
remove it from handler_flags.
Adding support for the ROW data type in the stored function RETURNS clause:
- explicit ROW(..members...) for both sql_mode=DEFAULT and sql_mode=ORACLE
CREATE FUNCTION f1() RETURNS ROW(a INT, b VARCHAR(32)) ...
- anchored "ROW TYPE OF [db1.]table1" declarations for sql_mode=DEFAULT
CREATE FUNCTION f1() RETURNS ROW TYPE OF test.t1 ...
- anchored "[db1.]table1%ROWTYPE" declarations for sql_mode=ORACLE
CREATE FUNCTION f1() RETURN test.t1%ROWTYPE ...
Adding support for anchored scalar data types in RETURNS clause:
- "TYPE OF [db1.]table1.column1" for sql_mode=DEFAULT
CREATE FUNCTION f1() RETURNS TYPE OF test.t1.column1;
- "[db1.]table1.column1" for sql_mode=ORACLE
CREATE FUNCTION f1() RETURN test.t1.column1%TYPE;
Details:
- Adding a new sql_mode_t parameter to
sp_head::create()
sp_head::sp_head()
sp_package::create()
sp_package::sp_package()
to guarantee early initialization of sp_head::m_sql_mode.
Before this change, this member was not initialized at all during
CREATE FUNCTION/PROCEDURE/PACKAGE statements, and was not used.
Now it needs to be initialized to write properly the
mysql.proc.returns column, according to the create time sql_mode.
- Code refactoring to make the things simpler and functions smaller:
* Adding a new method
Field_row::row_create_fields(THD *thd, List<Spvar_definition> *list)
to make a Virtual_tmp_table with Fields for ROW members
from an explicit definition.
* Adding a new method
Field_row::row_create_fields(THD *thd, const Spvar_definition &def)
to make a Virtual_tmp_table with Fields for ROW members
from an explicit or a table anchored definition.
* Adding a new method
Item_args::add_array_of_item_field(THD *thd, const Virtual_tmp_table &vtable)
to create and array of Item_field corresponding to all Field instances
in a Virtual_tmp_table
* Removing Item_field_row::row_create_items(). It was decomposed
into the new methods described above.
* Moving the code from the loop body in sp_rcontext::init_var_items()
into a separate method Spvar_definition::make_item_field_row(),
to make the code clearer (smaller functions).
make_item_field_row() itself uses the new methods described above.
- Changing the data type of sp_head::m_return_field_def
from Column_definition to Spvar_definition.
So now it supports not only SQL column field types,
but also explicit ROW and anchored ROW data types,
as well as anchored column types.
- Adding a new Column_definition parameter to sp_head::create_result_field().
Before this patch, create_result_field() took the definition only
from m_return_field_def. Now it's also called with a local Column_definition
variable which contains the explicit definition resolved from an
anchored defition.
- Modifying sql_yacc.yy to support the new grammar.
Adding new helper methods:
* sf_return_fill_definition_row()
* sf_return_fill_definition_rowtype_of()
* sf_return_fill_definition_type_of()
- Fixing tests in:
* Virtual_tmp_table::setup_field_pointers() in sql_select.cc
* Send_field::normalize() in field.h
* store_column_type()
to prevent calling Type_handler_row::field_type(),
which is implemented a DBUG_ASSERT(0).
Before this patch the affected methods and functions were called only
for scalar data types. Now ROW is also possible.
- Adding a new virtual method Field::cols()
- Overriding methods:
Item_func_sp::cols()
Item_func_sp::element_index()
Item_func_sp::check_cols()
Item_func_sp::bring_value()
to support the ROW data type.
- Extending the rule sp_return_type to support
* explicit ROW and anchored ROW data types
* anchored scalar data types
- Overriding Field_row::sql_type() to print
the data type of an explicit ROW.
Changing the return type of the following functions:
- CURRENT_TIMESTAMP, CURRENT_TIMESTAMP(), NOW()
- SYSDATE()
- FROM_UNIXTIME()
from DATETIME to TIMESTAMP.
Note, the old function NOW() returning DATETIME is still available
as LOCALTIMESTAMP or LOCALTIMESTAMP(), e.g.:
SELECT
LOCALTIMESTAMP, -- DATETIME
CURRENT_TIMESTAMP; -- TIMESTAMP
The change in the functions return data type fixes some problems
that occurred near a DST change:
- Problem #1
INSERT INTO t1 (timestamp_field) VALUES (CURRENT_TIMESTAMP);
INSERT INTO t1 (timestamp_field) VALUES (COALESCE(CURRENT_TIMESTAMP));
could result into two different values inserted.
- Problem #2
INSERT INTO t1 (timestamp_field) VALUES (FROM_UNIXTIME(1288477526));
INSERT INTO t1 (timestamp_field) VALUES (FROM_UNIXTIME(1288477526+3600));
could result into two equal TIMESTAMP values near a DST change.
Additional changes:
- FROM_UNIXTIME(0) now returns SQL NULL instead of '1970-01-01 00:00:00'
(assuming time_zone='+00:00')
- UNIX_TIMESTAMP('1970-01-01 00:00:00') now returns SQL NULL instead of 0
(assuming time_zone='+00:00'
These additional changes are needed for consistency with TIMESTAMP fields,
which cannot store '1970-01-01 00:00:00 +00:00'
Field_blob::store() has special code for GROUP_CONCAT temporary table
(to store blob values in Blob_mem_storage - this prevents them
from being freed/overwritten when a next row is read).
Field_geom and Field_blob_compressed inherit from Field_blob but they
have their own ::store() method without this special Blob_mem_storage
support.
Considering that non-grouping CONCAT() of such fields converts
them to plain BLOB, let's do the same for GROUP_CONCAT. To do it,
Item_func_group_concat::setup will signal that it's creating
a temporary table for GROUP_CONCAT, and Field_blog::make_new_field()
override will create base Field_blob when under group concat.
Field_string::val_int(), Field_string::val_real(), Field_string::val_decimal()
passed the whole buffer of field_length bytes to data type conversion routines.
This made conversion routines to print redundant trailing spaces in case of warnings.
Adding a method Field_string::to_lex_cstring() and using it inside
val_int(), val_real(), val_decimal(), val_str().
After this change conversion routines get the same value with what val_str() returns,
and no redundant trailing spaces are displayed.
MDEV-32188 make TIMESTAMP use whole 32-bit unsigned range
- Added --update-history option to mariadb-dump to change 2038
row_end timestamp to 2106.
- Updated ALTER TABLE ... to convert old row_end timestamps to
2106 timestamp for tables created before MariaDB 11.4.0.
- Fixed bug in CHECK TABLE where we wrongly suggested to USE REPAIR
TABLE when ALTER TABLE...FORCE is needed.
- mariadb-check printed table names that where used with REPAIR TABLE but
did not print table names used with ALTER TABLE or with name repair.
Fixed by always printing a table that is fixed if --silent is not
used.
- Added TABLE::vers_fix_old_timestamp() that will change max-timestamp
for versioned tables when replication from a pre-11.4.0 server.
A few test cases changed. This is caused by:
- CHECK TABLE now prints 'Please do ALTER TABLE... instead of
'Please do REPAIR TABLE' when there is a problem with the information
in the .frm file (for example a very old frm file).
- mariadb-check now prints repaired table names.
- mariadb-check also now prints nicer error message in case ALTER TABLE
is needed to repair a table.
MDEV-32188 make TIMESTAMP use whole 32-bit unsigned range
- Changed usage of timeval to my_timeval as the timeval parts on windows
are 32-bit long, which causes some compiler issues on windows.
Fixing the problem that an operation involving a mix of
two or more GEOMETRY operands did not preserve their SRIDs.
Now SRIDs are preserved by hybrid functions, subqueries, TVCs, UNIONs, VIEWs.
Incompatible change:
An attempt to mix two different SRIDs now raises an error.
Details:
- Adding a new class Type_extra_attributes. It's a generic
container which can store very specific data type attributes.
For now it can store one uint32 and one const pointer attribute
(for GEOMETRY's SRID and for ENUM/SET TYPELIB respectively).
In the future it can grow as needed.
Type_extra_attributes will also be reused soon to store "const Type_zone*"
pointers for the TIMESTAMP's "WITH TIME ZONE 'tz'" attribute
(a timestamp data type with a fixed time zone independent from @@time_zone).
The time zone attribute will be stored in exactly the same way like
a TYPELIB pointer is stored by ENUM/SET.
- Removing Column_definition_attributes members "interval" and "srid".
Deriving Column_definition_attributes from the generic attribute container
Type_extra_attributes instead.
- Adding a new class Type_typelib_attributes, to store
the TYPELIB of the ENUM and SET data types. Deriving Field_enum from it.
Removing the member Field_enum::typelib.
- Adding a new class Type_geom_attributes, to store
the GEOMETRY related attributes. Deriving Field_geom from it.
Removing the member Field_geom::srid.
- Removing virtual methods:
Field::get_typelib()
Type_all_attributes::get_typelib() and
Type_all_attributes::set_typelib()
They were very specific to TYPELIB.
Adding more generic virtual methods instead:
* Field::type_extra_attributes() - to get extra attributes
* Type_all_attributes::type_extra_attributes() - to get extra attributes
* Type_all_attributes::type_extra_attributes_addr() - to set extra attributes
- Removing Item_type_holder::enum_set_typelib. Deriving Item_type_holder
from the generic attribute container Type_extra_attributes instead.
This makes it possible for UNION to preserve SRID
(in addition to preserving TYPELIB).
- Deriving Item_hybrid_func from Type_extra_attributes.
This makes it possible for hybrid functions (e.g. CASE, COALESCE,
LEAST, GREATEST etc) to preserve SRID.
- Deriving Item_singlerow_subselect from Type_extra_attributes and
overriding methods:
* Item_cache::type_extra_attributes()
* subselect_single_select_engine::fix_length_and_dec()
* Item_singlerow_subselect::type_extra_attributes()
* Item_singlerow_subselect::type_extra_attributes_addr()
This is needed to preserve SRID in subqueries and TVCs
- Cleanup: fixing the data type of members
* Binlog_type_info::m_enum_typelib
* Binlog_type_info::m_set_typelib
from "TYPELIB *" to "const TYPELIB *"
This patch also fixes:
MDEV-33050 Build-in schemas like oracle_schema are accent insensitive
MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0
MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0
MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0
MDEV-33088 Cannot create triggers in the database `MYSQL`
MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0
MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0
MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0
MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS
MDEV-33120 System log table names are case insensitive with lower-cast-table-names=0
- Removing the virtual function strnncoll() from MY_COLLATION_HANDLER
- Adding a wrapper function CHARSET_INFO::streq(), to compare
two strings for equality. For now it calls strnncoll() internally.
In the future it will turn into a virtual function.
- Adding new accent sensitive case insensitive collations:
- utf8mb4_general1400_as_ci
- utf8mb3_general1400_as_ci
They implement accent sensitive case insensitive comparison.
The weight of a character is equal to the code point of its
upper case variant. These collations use Unicode-14.0.0 casefolding data.
The result of
my_charset_utf8mb3_general1400_as_ci.strcoll()
is very close to the former
my_charset_utf8mb3_general_ci.strcasecmp()
There is only a difference in a couple dozen rare characters, because:
- the switch from "tolower" to "toupper" comparison, to make
utf8mb3_general1400_as_ci closer to utf8mb3_general_ci
- the switch from Unicode-3.0.0 to Unicode-14.0.0
This difference should be tolarable. See the list of affected
characters in the MDEV description.
Note, utf8mb4_general1400_as_ci correctly handles non-BMP characters!
Unlike utf8mb4_general_ci, it does not treat all BMP characters
as equal.
- Adding classes representing names of the file based database objects:
Lex_ident_db
Lex_ident_table
Lex_ident_trigger
Their comparison collation depends on the underlying
file system case sensitivity and on --lower-case-table-names
and can be either my_charset_bin or my_charset_utf8mb3_general1400_as_ci.
- Adding classes representing names of other database objects,
whose names have case insensitive comparison style,
using my_charset_utf8mb3_general1400_as_ci:
Lex_ident_column
Lex_ident_sys_var
Lex_ident_user_var
Lex_ident_sp_var
Lex_ident_ps
Lex_ident_i_s_table
Lex_ident_window
Lex_ident_func
Lex_ident_partition
Lex_ident_with_element
Lex_ident_rpl_filter
Lex_ident_master_info
Lex_ident_host
Lex_ident_locale
Lex_ident_plugin
Lex_ident_engine
Lex_ident_server
Lex_ident_savepoint
Lex_ident_charset
engine_option_value::Name
- All the mentioned Lex_ident_xxx classes implement a method streq():
if (ident1.streq(ident2))
do_equal();
This method works as a wrapper for CHARSET_INFO::streq().
- Changing a lot of "LEX_CSTRING name" to "Lex_ident_xxx name"
in class members and in function/method parameters.
- Replacing all calls like
system_charset_info->coll->strcasecmp(ident1, ident2)
to
ident1.streq(ident2)
- Taking advantage of the c++11 user defined literal operator
for LEX_CSTRING (see m_strings.h) and Lex_ident_xxx (see lex_ident.h)
data types. Use example:
const Lex_ident_column primary_key_name= "PRIMARY"_Lex_ident_column;
is now a shorter version of:
const Lex_ident_column primary_key_name=
Lex_ident_column({STRING_WITH_LEN("PRIMARY")});