1. Disable opt_hint_timeout.test for embedded server. Make sure
the test does not crash even when started for embedded server.
Disable view-protocol since hints are not supported inside views.
2. Hints are designed to behave like regular /* ... */ comments:
`SELECT /*+ " */ 1;` -- a valid SQL query
However, the mysql client program waits for the closing doublequote
character.
Another problem is observed when there is no space character between
closing `*/` of the hint and the following `*`:
`SELECT /*+ some_hints(...) */* FROM t1;`
In this case the client treats `/*` as a comment section opening and
waits for the closing `*/` sequence.
This commit fixes all of these issues
It places a limit N (a timeout value in milliseconds) on how long
a statement is permitted to execute before the server terminates it.
Syntax:
SELECT /*+ MAX_EXECUTION_TIME(milliseconds) */ ...
Only top-level SELECT statements support the hint.
Valgrind is single threaded and only changes threads as part of
system calls or waits.
Some busy loops were identified and fixed where the server assumes
that some other thread will change the state, which will not happen
with valgrind.
Based on patch by Monty. Original patch introduced VALGRIND_YIELD,
which emits pthread_yield() only in valgrind builds. However it was
agreed that it is a good idea to emit yield() unconditionally, such
that other affected schedulers (like SCHED_FIFO) benefit from this
change. Also avoid pthread_yield() in favour of standard
std::this_thread::yield().
The main purpose of this allow one to use the --read-only
option to ensure that no one can issue a query that can
block replication.
The --read-only option can now take 4 different values:
0 No read only (as before).
1 Blocks changes for users without the 'READ ONLY ADMIN'
privilege (as before).
2 Blocks in addition LOCK TABLES and SELECT IN SHARE MODE
for not 'READ ONLY ADMIN' users.
3 Blocks in addition 'READ_ONLY_ADMIN' users for all the
previous statements.
read_only is changed to an enum and one can use the following
names for the lock levels:
OFF, ON, NO_LOCK, NO_LOCK_NO_ADMIN
Too keep things compatible with older versions config files, one can
still use values FALSE and TRUE, which are mapped to OFF and ON.
The main visible changes are:
- 'show variables like "read_only"' now returns a string
instead of a number.
- Error messages related to read_only violations now contains
the current value off readonly.
Other things:
- is_read_only_ctx() renamed to check_read_only_with_error()
- Moved TL_READ_SKIP_LOCKED to it's logical place
Reviewed by: Sergei Golubchik <serg@mariadb.org>
REPAIR of InnoDB tables was logging ALTER TABLE and REPAIR to ddl log.
ALTER TABLE contained the new tableid and REPAIR, wrongly, contained the
old rowid.
Now only REPAIR is logged
ddl.log changes:
REPAIR TABLE and OPTIMIZE TABLE that are done through ALTER TABLE will
now contain the old and new table id. If not done through ALTER TABLE,
only the current rowid will be shown (as before).
MDEV-36563 Assertion `!mysql_bin_log.is_open()' failed in
THD::mark_tmp_table_as_free_for_reuse
The purpose of this commit is to ensure that creation and changes of
temporary tables are properly and predicable logged to the binary
log. It also fixes some bugs where ROW logging was used in MIXED mode,
when STATEMENT would be a better (and expected) choice.
In this comment STATEMENT stands for logging to binary log in
STATEMENT format, MIXED stands for MIXED binlog format and ROW for ROW
binlog format.
New rules for logging of temporary tables
- CREATE of temporary tables are now by default binlogged only if
STATEMENT binlog format is used. If it is binlogged, 1 is stored in
TABLE_SHARE->table_creation_was_logged. The user can change this
behavior by setting create_temporary_table_binlog_formats to
MIXED,STATEMENT in which case the create is logged in statement
format also in MIXED mode (as before).
- Changes to temporary tables are only binlogged if and only if
the CREATE was logged. The logging happens under STATEMENT or MIXED.
If binlog_format=ROW, temporary table changes are not binlogged. A
temporary table that are changed under ROW are marked as 'not up to
date in binlog' and no future row changes are logged. Any usage of
this temporary table will force row logging of other tables in any
future statements using the temporary table to be row logged.
- DROP TEMPORARY is binlogged only of the CREATE was binlogged.
Changes done:
- Row logging is forced for any statement using temporary tables that
are not up to date in the binary log.
(Before the row logging was forced if the user has a temporary table)
- If there is any changes to the temporary table that is not binlogged,
the table is marked as not up to date.
- TABLE_SHARE->table_creation_was_logged has a new definition for
temporary tables:
0 Table creating was not logged to binary log
1 Table creating was logged to binary log and table is up to date.
2 Table creating was logged to binary log but some changes where
not logged to binary log.
Table is not up to date in binary log is defined as value 0 or 2.
- If a multi-table-update or multi-table-delete fails then
all updated temporary tables are marked as not up to date.
- Enforce row logging if the query is using temporary tables
that are not up to date.
Before row logging was enforced if the user had any
temporary tables.
- When dropping temporary tables use IF EXISTS. This ensures
that slave will not stop if it had crashed and lost the
temporary tables.
- Remove comment and version from DROP /*!4000 TEMPORARY.. generated when
a connection closes that has open temporary tables. Added 'generated by
server' at the end of the DROP.
Bugs fixed:
- When using temporary tables with commands that forced row based,
like INSERT INTO temporary_table VALUES (UUID()), this was never
logged which causes the temporary table to be inconsistent on
master and slave.
- Used binlog format is now clearly defined. It is now only depending
on the current binlog_format and the tables used.
Before it was depending on the user had ANY temporary tables and
the state of 'current_stmt_binlog_format' set by previous queries.
This also caused temporary tables to be logged to binary log in
some cases.
- CREATE TABLE t1 LIKE not_logged_temporary_table caused replication
to stop.
- Rename of not binlogged temporary tables where binlogged to binary log
which caused replication to stop.
Changes in behavior:
- By default create_temporary_table_binlog_formats=STATEMENT, which
means that CREATE TEMPORARY is not logged to binary log under MIXED
binary logging. This can be changed by setting
create_temporary_table_binlog_formats to MIXED,STATEMENT.
- Using temporary tables that was not logged to the binary log will
cause any query using them for updating other tables to be logged in
ROW format. Before all queries was logged in ROW format if the user had
any temporary tables, even if they were not used by the query.
- Generated DROP TEMPORARY TABLE is now always using IF EXISTS and
has a "generated by server" comment in the binary log.
The consequences of the above is that manipulations of a lot of rows
through temporary tables will by default be be slower in mixed mode.
For example:
BEGIN;
CREATE TEMPORARY TABLE tmp AS SELECT a, b, c FROM
large_table1 JOIN large_table2 ON ...;
INSERT INTO other_table SELECT b, c FROM tmp WHERE a <100;
DROP TEMPORARY TABLE tmp;
COMMIT;
By default this will create a huge entry in the binary log, compared
to just a few hundred bytes in statement mode. However the change in
this commit will make usage of temporary tables more reliable and
predicable and is thus worth it. Using statement mode or
create_temporary_table_binlog_formats can be used to avoid this issue.
This is needed to make it easy for users to automatically ignore long
char and varchars when using ANALYZE TABLE PERSISTENT.
These fields can cause problems as they will consume
'CHARACTERS * MAX_CHARACTER_LENGTH * 2 * number_of_rows' space on disk
during analyze, which can easily be much bigger than the analyzed table.
This commit adds a new user variable, analyze_max_length, default value 4G.
Any field that is bigger than this in bytes, will be ignored by
ANALYZE TABLE PERSISTENT unless it is specified in FOR COLUMNS().
While doing this patch, I noticed that we do not skip GEOMETRY columns from
ANALYZE TABLE, like we do with BLOB. This should be fixed when merging
to the 'main' branch. At the same time we should add a resonable default
value for analyze_max_length, probably 1024, like we have for
max_sort_length.
When the SELECT sub-statement executes a stored function that is defined
to modify a non-transactional table, like
delimiter |;
create function f_ia(arg int)
returns integer
begin
insert into ti_pk set a=1;
insert into ta set a=1;
insert into ti_pk set a=arg;
return 1;
end |
delimiter ;|
any modified records that the function has succeeded
on must be binlogged as a "side effect" of CREATE-SELECT.
It is expected that a failing CREATE-SELECT like
--error ER_DUP_ENTRY
set statement binlog_format = ROW for create table t_y (a int) engine=aria select f_ia(1 /* err in Innodb after Aria stmt is done */) as a;
leaves upon itself the following state:
include/show_binlog_events.inc
Log_name Pos Event_type Server_id End_log_pos Info
master-bin.000001 # Gtid # # BEGIN GTID #-#-#
master-bin.000001 # Table_map # #
table_id: # (test. ta)
master-bin.000001 # Write_rows_v1 # #
table_id: # flags: STMT_END_F
master-bin.000001 # Query # # COMMIT
select * from ta;
a
1
select count(*) = 0 from ti_pk;
true
However it's not so for the binlog part.
The reason is that prior to MDEV-34150 fixes the CREATE-SELECT's
errored phase leaves the binlog caches intact (the file:pos from 10.11 c06c36218a)
to defer their reset to the rollback phase of the top-level
/* the statement cache gets binlogged */
where the side-effect changes gets binlogged.
MDEV-34150 fixes harmed (+#4 line) the statement cache in particular
in the error phase (file:pos are from 395db6f1d5 the current 11.8 )
/* The caches incl the statement cache are gone */
/* 'cos of MDEV-34150 */
+#4 0x00005d75f9b6a92e in THD::binlog_remove_rows_events (this=0x52c000240288) at log.cc:579
Apparently it should not have been there, as proper emptying (either
with reset for the transactional cache or flush and then reset for the
statement cache) is (must be) always done via binlog_rollback of the
top-level statement.
To observe the above requirement the case is fixed with the removal of
thd->binlog_remove_rows_events() and its definition.
Tested with rpl.rpl_create_select_row.
Reviewed-by Brandon Nesterenko.
Added capability to create a trigger associated with several trigger
events. For this goal, the syntax of the CREATE TRIGGER statement
was extended to support the syntax structure { event [ OR ... ] }
for the `trigger_event` clause. Since one trigger will be able to
handle several events it should be provided a way to determine what
kind of event is handled on execution of a trigger. For this goal
support of the clauses INSERTING, UPDATING , DELETING was added by
this patch. These clauses can be used inside a trigger body to detect
what kind of trigger action is currently processed using the following
boilerplate:
IF INSERTING THEN ...
ELSIF UPDATING THEN ...
ELSIF DELETING THEN ...
In case one of the clauses INSERTING, UPDATING, DELETING specified in
a trigger's body not matched with a trigger event type, the error
ER_INCOMPATIBLE_EVENT_FLAG is emitted.
After this patch be pushed, one Trigger object will be associated with
several trigger events. It means that the array
Table_triggers_list::triggers
can contain several pointers to the same Trigger object in array members
corresponding to different events. Moreover, support of several trigger
events for the same trigger requires that the data members `next` and
`action_order` of the Trigger class be converted to arrays to store
relating information per trigger event base.
Ability to specify the same trigger for different event types results in
necessity to handle invalid cases on execution of the multi-event
trigger, when the OLD or NEW qualifiers doesn't match a current event
type against that the trigger is run. The clause OLD should produces
the NULL value for INSERT event, whereas the clause NEW should produce
the NULL value for DELETE event.
This patch adds support for SYS_REFCURSOR (a weakly typed cursor)
for both sql_mode=ORACLE and sql_mode=DEFAULT.
Works as a regular stored routine variable, parameter and return value:
- can be passed as an IN parameter to stored functions and procedures
- can be passed as an INOUT and OUT parameter to stored procedures
- can be returned from a stored function
Note, strongly typed REF CURSOR will be added separately.
Note, to maintain dependencies easier, some parts of sql_class.h
and item.h were moved to new header files:
- select_results.h:
class select_result_sink
class select_result
class select_result_interceptor
- sp_cursor.h:
class sp_cursor_statistics
class sp_cursor
- sp_rcontext_handler.h
class Sp_rcontext_handler and its descendants
The implementation consists of the following parts:
- A new class sp_cursor_array deriving from Dynamic_array
- A new class Statement_rcontext which contains data shared
between sub-statements of a compound statement.
It has a member m_statement_cursors of the sp_cursor_array data type,
as well as open cursor counter. THD inherits from Statement_rcontext.
- A new data type handler Type_handler_sys_refcursor in plugins/type_cursor/
It is designed to store uint16 references -
positions of the cursor in THD::m_statement_cursors.
- Type_handler_sys_refcursor suppresses some derived numeric features.
When a SYS_REFCURSOR variable is used as an integer an error is raised.
- A new abstract class sp_instr_fetch_cursor. It's needed to share
the common code between "OPEN cur" (for static cursors) and
"OPER cur FOR stmt" (for SYS_REFCURSORs).
- New sp_instr classes:
* sp_instr_copen_by_ref - OPEN sys_ref_curor FOR stmt;
* sp_instr_cfetch_by_ref - FETCH sys_ref_cursor INTO targets;
* sp_instr_cclose_by_ref - CLOSE sys_ref_cursor;
* sp_instr_destruct_variable - to destruct SYS_REFCURSOR variables when
the execution goes out of the BEGIN..END block
where SYS_REFCURSOR variables are declared.
- New methods in LEX:
* sp_open_cursor_for_stmt - handles "OPEN sys_ref_cursor FOR stmt".
* sp_add_instr_fetch_cursor - "FETCH cur INTO targets" for both
static cursors and SYS_REFCURSORs.
* sp_close - handles "CLOSE cur" both for static cursors and SYS_REFCURSORs.
- Changes in cursor functions to handle both static cursors and SYS_REFCURSORs:
* Item_func_cursor_isopen
* Item_func_cursor_found
* Item_func_cursor_notfound
* Item_func_cursor_rowcount
- A new system variable @@max_open_cursors - to limit the number
of cursors (static and SYS_REFCURSORs) opened at the same time.
Its allowed range is [0-65536], with 50 by default.
- A new virtual method Type_handler::can_return_bool() telling
if calling item->val_bool() is allowed for Items of this data type,
or if otherwise the "Illegal parameter for operation" error should be raised
at fix_fields() time.
- New methods in Sp_rcontext_handler:
* get_cursor()
* get_cursor_by_ref()
- A new class Sp_rcontext_handler_statement to handle top level statement
wide cursors which are shared by all substatements.
- A new virtual method expr_event_handler() in classes Item and Field.
It's needed to close (and make available for a new OPEN)
unused THD::m_statement_cursors elements which do not have any references
any more. It can happen in various moments in time, e.g.
* after evaluation parameters of an SQL routine
* after assigning a cursor expression into a SYS_REFCURSOR variable
* when leaving a BEGIN..END block with SYS_REFCURSOR variables
* after setting OUT/INOUT routine actual parameters from formal
parameters.
wait_for_prior_commit() can be called multiple times per event group,
only do my_error() the first time the call fails.
Remove redundant set_overwrite_status() calls.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Reviewed-by: Monty <monty@mariadb.org>
It was not possible to use a package body variable as a
fetch target:
CREATE PACKAGE BODY pkg AS
vc INT := 0;
FUNCTION f1 RETURN INT AS
CURSOR cur IS SELECT 1 AS c FROM DUAL;
BEGIN
OPEN cur;
FETCH cur INTO vc; -- this returned "Undeclared variable: vc" error.
CLOSE cur;
RETURN vc;
END;
END;
FETCH assumed that all fetch targets reside of the same sp_rcontext
instance with the cursor. This patch fixes the problem.
Now a cursor and its fetch target can reside in different sp_rcontext
instances.
Details:
- Adding a helper class sp_rcontext_addr
(a combination of Sp_rcontext_handler pointer and an offset in the rcontext)
- Adding a new class sp_fetch_target deriving from sp_rcontext_addr.
Fetch targets in "FETCH cur INTO target1, target2 ..." are now collected
into this structure instead of sp_variable.
sp_variable cannot be used any more to store fetch targets,
because it does not have a pointer to Sp_rcontext_handler
(it only has the current rcontext offset).
- Removing members sp_instr_set members m_rcontext_handler and m_offset.
Deriving sp_instr_set from sp_rcontext_addr instead.
- Renaming sp_instr_cfetch member "List<sp_variable> m_varlist"
to "List<sp_fetch_target> m_fetch_target_list".
- Fixing LEX::sp_add_cfetch() to return the pointer to the
created sp_fetch_target instance (instead of returning bool).
This helps to make the grammar in sql_yacc.c simpler
- Renaming LEX::sp_add_cfetch() to LEX::sp_add_instr_cfetch(),
as `if(sp_add_cfetch())` changed its meaning to the opposite,
to avoid automatic wrong merge from earlier versions.
- Chaning the "List<sp_variable> *vars" parameter to sp_cursor::fetch
to have the data type "List<sp_fetch_target> *".
- Changing the data type of "List<sp_variable> &vars" in
sp_cursor::Select_fetch_into_spvars::send_data_to_variable_list()
to "List<sp_fetch_target> &".
- Adding THD helper methods get_rcontext() and get_variable().
- Moving the code from sql_yacc.yy into a new LEX method
LEX::make_fetch_target().
- Simplifying the grammar in sql_yacc.yy using the new LEX method.
Changing the data type of the bison rule sp_fetch_list from "void"
to "List<sp_fetch_target> *".
* rpl.rpl_system_versioning_partitions updated for MDEV-32188
* innodb.row_size_error_log_warnings_3 changed error for MDEV-33658
(checks are done in a different order)
Reintroduces delete_while_scanning optimization for multi_delete.
Reverse some test changes from the initial feature devlopment now
that we delete-on-the-fly once again.
We now allow multitable queries with order by and limit, such as:
delete t1.*, t2.* from t1, t2 order by t1.id desc limit 3;
To predict what rows will be deleted, run the equivalent select:
select t1.*, t2.* from t1, t2 order by t1.id desc limit 3;
Additionally, index hints are now supported with single table delete statements:
delete from t2 use index(xid) order by (id) limit 2;
This approach changes the multi_delete SELECT result interceptor to use a temporary
table to collect row ids pertaining to the rows that will be deleted, rather than
directly deleting rows from the target table(s). Row ids are collected during
send_data, then read during send_eof to delete target rows. In the event that the
temporary table created in memory is not big enough for all matching rows, it is
converted to an aria table.
Other changes:
- Deleting from a sequence now affects zero rows instead of emitting an error
Limitations:
- The federated connector does not create implicit row ids, so we to use a key
when conditionally deleting. See the change in federated_maybe_16324629.test
This patch includes a few changes to make the code easier to maintain:
- Renamed SQL_I_List::link_in_list to SQL_I_List::insert. link_in_list was
ambiguous as it could refer to a link or it could refer to a node
- Remove field_name local variable in multi_update::initialize_tables because
it is not used when creating the temporary tables
- multi_update changes:
- Move temp table callocs to init, a more natural location for them, and moved
tables_to_update to const member variable so we don't recompute it.
- Filter out jtbm tables and tables not in the update map, pushing those that
will be updated into an update_targets container. This simplifies checks and
loops in initialize_tables.
Implementation of this task adds ability to raise the signal with
SQLSTATE '02TRG' from a BEFORE INSERT/UPDATE/DELETE trigger and handles
this signal as an indicator meaning 'to throw away the current row'
on processing the INSERT/UPDATE/DELETE statement. The signal with
SQLSTATE '02TRG' has special meaning only in case it is raised inside
BEFORE triggers, for AFTER trigger's this value of SQLSTATE isn't treated
in any special way. In according with SQL standard, the SQLSTATE class '02'
means NO DATA and sql_errno for this class is set to value
ER_SIGNAL_NOT_FOUND by current implementation of MariaDB server.
Implementation of this task assigns the value ER_SIGNAL_SKIP_ROW_FROM_TRIGGER
to sql_errno in Diagnostics_area in case the signal is raised from a trigger
and SQLSTATE has value '02TRG'.
To catch signal with SQLTSATE '02TRG' and handle it in special way, the methods
Table_triggers_list::process_triggers
select_insert::store_values
select_create::store_values
Rows_log_event::process_triggers
and the overloaded function
fill_record_n_invoke_before_triggers
were extended with extra out parameter for returning the flag whether
to skip the current values being processed by INSERT/UPDATE/DELETE
statement. This extra parameter is passed as nullptr in case of AFTER trigger
and BEFORE trigger this parameter points to a variable to store a marker
whether to skip the current record or store it by calling write_record().
mysql_prepare_create_table: Extract a Key initialization part that
relates to length calculation and long unique index designation.
append_system_key_parts call also moves there.
Move this initialization before the duplicate elimination.
Extract WITHOUT OVERPLAPS check into a separate function. It had to be moved
earlier in the code to preserve the order of the error checks, as in the tests.
MDEV-28127 did is_equal() which compared vcol expressions
literally. But another table vcol expression is not equal because of
different table name.
We implement another comparison method is_identical() which respects
different table name in vcol comparison. If any field item points to
table_A and compared field item points to table_B, such items are
treated as equal in (table_A, table_B) comparison. This is done by
cloning table_B expression and renaming any table_B entries to table_A
in it.
The problems were that:
1) resources was freed "asimetric" normal execution in send_eof,
in case of error in destructor.
2) destructor was not called in case of SP for result objects.
(so if the last SP execution ended with error resorces was not
freeded on reinit before execution (cleanup() called before next
execution) and destructor also was not called due to lack of
delete call for the object)
Result cleanup() renamed to reset_for_next_ps_execution() to better
reflect function().
All result method revised and freeing resources made "symetric".
Destructor of result object called for SP.
Added skipped invalidation in case of error in insert.
Removed misleading naming of reset(thd) (could be mixed with
with reset()).
This commit updates default memory allocations size used with MEM_ROOT
objects to minimize the number of calls to malloc().
Changes:
- Updated MEM_ROOT block sizes in sql_const.h
- Updated MALLOC_OVERHEAD to also take into account the extra memory
allocated by my_malloc()
- Updated init_alloc_root() to only take MALLOC_OVERHEAD into account as
buffer size, not MALLOC_OVERHEAD + sizeof(USED_MEM).
- Reset mem_root->first_block_usage if and only if first block was used.
- Increase MEM_ROOT buffers sized used by my_load_defaults, plugin_init,
Create_tmp_table, allocate_table_share, TABLE and TABLE_SHARE.
This decreases number of malloc calls during queries.
- Use a small buffer for THD->main_mem_root in THD::THD. This avoids
multiple malloc() call for new connections.
I tried the above changes on a complex select query with 12 tables.
The following shows the number of extra allocations that where used
to increase the size of the MEM_ROOT buffers.
Original code:
- Connection to MariaDB: 9 allocations
- First query run: 146 allocations
- Second query run: 24 allocations
Max memory allocated for thd when using with heap table: 61,262,408
Max memory allocated for thd when using Aria tmp table: 419,464
After changes:
Connection to MariaDB: 0 allocations
- First run: 25 allocations
- Second run: 7 allocations
Max memory allocated for thd when using with heap table: 61,347,424
Max memory allocated for thd when using Aria table: 529,168
The new code uses slightly more memory, but avoids memory fragmentation
and is slightly faster thanks to much fewer calls to malloc().
Reviewed-by: Sergei Golubchik <serg@mariadb.org>
Added Query_time (total time spent running queries) to status_variables.
Other things:
- Added SHOW_MICROSECOND_STATUS type that shows an ulonglong variable
in microseconds converted to a double (in seconds).
- Changed Busy_time and Cpu_time to use SHOW_MICROSECOND_STATUS, which
simplified the code and avoids some double divisions for each query.
Reviewed-by: Sergei Golubchik <serg@mariadb.org>