1
0
mirror of https://github.com/MariaDB/server.git synced 2025-07-27 18:02:13 +03:00
Commit Graph

7543 Commits

Author SHA1 Message Date
ea2bc974dc Merge 10.1 into 10.2 2020-07-01 12:03:55 +03:00
2b8b7394a1 MDEV-22222: Assertion `state() == s_executing || state() == s_preparing || state() == s_prepared || state() == s_must_abort || state() == s_aborting || state() == s_cert_failed || state() == s_must_replay' failed in wsrep::transaction::before_rollback
LOCK TABLE will do implicit commit, we need to properly handle transaction after commit.
2020-06-28 23:07:41 +02:00
5a7794d3a8 MDEV-21910 Deadlock between BF abort and manual KILL command
When high priority replication slave applier encounters lock conflict in innodb,
it will force the conflicting lock holder transaction (victim) to rollback.
This is a must in multi-master sychronous replication model to avoid cluster lock-up.
This high priority victim abort (aka "brute force" (BF) abort), is started
from innodb lock manager while holding the victim's transaction's (trx) mutex.
Depending on the execution state of the victim transaction, it may happen that the
BF abort will call for THD::awake() to wake up the victim transaction for the rollback.
Now, if BF abort requires THD::awake() to be called, then the applier thread executed
locking protocol of: victim trx mutex -> victim THD::LOCK_thd_data

If, at the same time another DBMS super user issues KILL command to abort the same victim,
it will execute locking protocol of: victim THD::LOCK_thd_data  -> victim trx mutex.
These two locking protocol acquire mutexes in opposite order, hence unresolvable mutex locking
deadlock may occur.

The fix in this commit adds THD::wsrep_aborter flag to synchronize who can kill the victim
This flag is set both when BF is called for from innodb and by KILL command.
Either path of victim killing will bail out if victim's wsrep_killed is already
set to avoid mutex conflicts with the other aborter execution. THD::wsrep_aborter
records the aborter THD's ID. This is needed to preserve the right to kill
the victim from different locations for the same aborter thread.
It is also good error logging, to see who is reponsible for the abort.

A new test case was added in galera.galera_bf_kill_debug.test for scenario where
wsrep applier thread and manual KILL command try to kill same idle victim
2020-06-26 09:56:23 +03:00
f1838434b8 MDEV-22706: Assertion `!current' failed in PROFILING::start_new_query
Analysis:
========
When "Profiling" is enabled, server collects the resource usage of each
statement that gets executed in current session. Profiling doesn't support
nested statements. In order to ensure this behavior when profiling is enabled
for a statement, there should not be any other active query which is being
profiled. This active query information is stored in 'current' variable. When
a nested query arrives it finds 'current' being not NULL and server aborts.

When 'init_connect' and 'init_slave' system variables are set they contain a
set of statements to be executed. "execute_init_command" is the function call
which invokes "dispatch_command" for each statement provided in
'init_connect', 'init_slave' system variables. "execute_init_command" invokes
"start_new_query" and it passes the statement list to "dispatch_command". This
"dispatch_command" intern invokes "start_new_query" which leads to nesting of
queries. Hence '!current' assert is triggered.

Fix:
===
Remove profiling from "execute_init_command" as it will be done within
"dispatch_command" execution.
2020-06-25 13:03:34 +05:30
ca38b6e427 Merge 10.3 into 10.4 2020-05-26 11:54:55 +03:00
ecc7f305dd Merge 10.2 into 10.3 2020-05-25 19:41:58 +03:00
be647ff14d Fixed deadlock with LOCK TABLES and ALTER TABLE
MDEV-21398 Deadlock (server hang) or assertion failure in
Diagnostics_area::set_error_status upon ALTER under lock

This failure could only happen if one locked the same table
multiple times and then did an ALTER TABLE on the table.

Major change is to change all instances of
table->m_needs_reopen= true;
to
table->mark_table_for_reopen();

The main fix for the problem was to ensure that we mark all
instances of the table in the locked_table_list and when we
reopen the tables, we first close all tables before reopening
and locking them.

Other things:
- Don't call thd->locked_tables_list.reopen_tables if there
  are no tables marked for reopen. (performance)
2020-05-23 14:58:33 +03:00
2c3c851d2c Merge 10.3 into 10.4 2020-05-05 20:33:10 +03:00
7fb73ed143 Merge branch '10.2' into 10.3 2020-05-04 16:47:11 +02:00
ca091e6372 Merge branch '10.1' into 10.2 2020-05-02 08:44:17 +02:00
23c6fb3e62 Merge branch '5.5' into 10.1 2020-04-30 17:36:41 +02:00
6bb28e0bc5 Bug#29915479 RUNNING COM_REGISTER_SLAVE WITHOUT COM_BINLOG_DUMP CAN RESULTS IN SERVER EXIT
in fact, in MariaDB it cannot, but it can show spurious slaves
in SHOW SLAVE HOSTS.

slave was registered in COM_REGISTER_SLAVE and un-registered after
COM_BINLOG_DUMP. If there was no COM_BINLOG_DUMP, it would never
unregister.
2020-04-30 10:13:18 +02:00
c238e9b96a MDEV-20685: compile fixes for Solaris/OSX/AIX
sig_return: Solaris/OSX returns different function ptr
Move defination to my_alarm.h as its the only use.

prevents compile warnings (copied from 10.3 branch)

mysys/my_sync.c:136:19: error: 'cur_dir_name' defined but not used [-Werror=unused-const-variable=]
  136 | static const char cur_dir_name[]= {FN_CURLIB, 0};
      |                   ^~~~~~~~~~~~

fix compile error (DEPRECATED) leaked from ssl headers.

In file included from /export/home/dan/mariadb-server-10.4/sql/sys_vars.cc:37:
/export/home/dan/mariadb-server-10.4/sql/sys_vars.ic:69: error: "DEPRECATED" redefined [-Werror]
   69 | #define DEPRECATED(X) X
      |
In file included from /export/home/dan/mariadb-server-10.4/include/violite.h:150,
                 from /export/home/dan/mariadb-server-10.4/sql/sql_class.h:38,
                 from /export/home/dan/mariadb-server-10.4/sql/sys_vars.cc:36:
/usr/include/openssl/ssl.h:2356: note: this is the location of the previous definition
 2356 | # define DEPRECATED __attribute__((deprecated))
      |

Avoid Werror condition on non-Linux:

plugin/server_audit/server_audit.c:2267:7: error: variable 'db_len_off' set but not used [-Werror=unused-but-set-variable]
 2267 |   int db_len_off;
      |       ^~~~~~~~~~
plugin/server_audit/server_audit.c:2266:7: error: variable 'db_off' set but not used [-Werror=unused-but-set-variable]
 2266 |   int db_off;
      |       ^~~~~~

auth_gssapi fix include path for Solaris

Consistent with the upstream packaged patch:
https://github.com/OpenIndiana/oi-userland/blob/oi/hipster/components/database/mariadb-103/patches/06-gssapi.h.patch

compile warnings on Solaris

[ 91%] Building C object plugin/server_audit/CMakeFiles/server_audit.dir/server_audit.c.o
/plugin/server_audit/server_audit.c: In function 'auditing_v8':
/plugin/server_audit/server_audit.c:2194:20: error: unused variable 'db_len_off' [-Werror=unused-variable]
 2194 |   static const int db_len_off= 128;
      |                    ^~~~~~~~~~
/plugin/server_audit/server_audit.c:2193:20: error: unused variable 'db_off' [-Werror=unused-variable]
 2193 |   static const int db_off= 120;
      |                    ^~~~~~
/plugin/server_audit/server_audit.c:2192:20: error: unused variable 'cmd_off' [-Werror=unused-variable]
 2192 |   static const int cmd_off= 4432;
      |                    ^~~~~~~
At top level:
/plugin/server_audit/server_audit.c:2192:20: error: 'cmd_off' defined but not used [-Werror=unused-const-variable=]
/plugin/server_audit/server_audit.c:2193:20: error: 'db_off' defined but not used [-Werror=unused-const-variable=]
 2193 |   static const int db_off= 120;
      |                    ^~~~~~
/plugin/server_audit/server_audit.c:2194:20: error: 'db_len_off' defined but not used [-Werror=unused-const-variable=]
 2194 |   static const int db_len_off= 128;
      |                    ^~~~~~~~~~
cc1: all warnings being treated as errors

tested on:
$ uname -a
SunOS openindiana 5.11 illumos-b97b1727bc i86pc i386 i86pc
2020-04-29 12:02:47 +03:00
2e12d471ea Merge 10.2 into 10.3 2020-04-27 14:24:41 +03:00
c06845d6f0 Merge 10.1 into 10.2 2020-04-27 13:28:13 +03:00
6be05ceb05 MDEV-22203: WSREP_ON is unnecessarily expensive to evaluate
This is a backport of the applicable part of
commit 93475aff8d and
commit 2c39f69d34
from 10.4.

Before 10.4 and Galera 4, WSREP_ON is a macro that points to
a global Boolean variable, so it is not that expensive to
evaluate, but we will add an unlikely() hint around it.

WSREP_ON_NEW: Remove. This macro was introduced in
commit c863159c32
when reverting WSREP_ON to its previous definition.

We replace some use of WSREP_ON with WSREP(thd), like it was done
in 93475aff8d. Note: the macro
WSREP() in 10.1 is equivalent to WSREP_NNULL() in 10.4.

Item_func_rand::seed_random(): Avoid invoking current_thd
when WSREP is not enabled.
2020-04-27 09:40:51 +03:00
93475aff8d MDEV-22203: WSREP_ON is unnecessarily expensive to evaluate
Replaced WSREP_ON macro by single global variable WSREP_ON
that is then updated at server statup and on wsrep_on and
wsrep_provider update functions.
2020-04-24 13:12:46 +03:00
edc3899d97 MDEV-22051: Protocol::end_statement(): Assertion `0' failed on Galera node upon DDL attempt with conflicting lock
If FTWRL is issued, DDL statements should report error back to user before
TOI is started.
2020-04-08 16:42:18 +03:00
e2f1f88fa6 Merge 10.3 into 10.4 2020-03-30 14:50:23 +03:00
1a9b6c4c7f Merge 10.2 into 10.3 2020-03-30 11:12:56 +03:00
ba679ae52f Fix for rpl_start_stop_slave failure
One may access freed THD members after LOCK_thd_kill is released.

With original code it can happen when killing wsrep-disabled thread on a
wsrep-enabled server. With 91ab42a8 it is happening on a wsrep-disabled
server.
2020-03-26 19:19:41 +04:00
5918b17004 MDEV-21473 conflicts with async slave BF aborting (#1475)
If async slave thread (slave SQL handler), becomes a BF victim, it may occasionally happen that rollbacker thread is used to carry out the rollback instead of the async slave thread.
This can happen, if async slave thread has flagged "idle" state when BF thread tries to figure out how to kill the victim.
The issue was possible to test by using a galera cluster as slave for external master, and issuing high load of conflicting writes through async replication and directly against galera cluster nodes.
However, a deterministic mtr test for the "conflict window" has not yet been worked on.

The fix, in this patch makes sure that async slave thread state is never set to IDLE. This prevents the rollbacker thread to intervene.
The wsrep_query_state change was refactored to happen by dedicated function to make controlling the idle state change in one place.
2020-03-24 11:01:42 +02:00
646d1ec83a Merge branch '10.3' into 10.4 2020-02-11 14:40:35 +01:00
58b70dc136 Merge branch '10.2' into 10.3 2020-02-10 20:34:16 +01:00
b08579aa28 MDEV-16308 : protocol messed up sporadically
Context involves semicolon batching, and the error starts 10.2
No reproducible examples were made yet, but TCP trace suggests
multiple packets that are "squeezed" together (e.g overlong OK packet
that has a trailer which is belongs to another packet)

Remove thd->get_stmt_da()->set_skip_flush() when processing a batch.
skip_flush stems from the COM_MULTI code, which was checked in during
10.2 (and is never used)

The fix is confirmed to work, when evaluated by bug reporter (one of them)

We never reproduced it locally, with multiple tries
thus the root cause analysis is still missing.
2020-02-10 12:44:53 +01:00
87a61355e8 Merge 10.3 into 10.4
The MDEV-17062 fix in commit c4195305b2
was omitted.
2020-01-20 15:49:48 +02:00
b04429434a Merge branch '10.1' into 10.2
# Conflicts:
#	sql/sp_head.cc
#	sql/sql_select.cc
#	sql/sql_trigger.cc
2020-01-17 00:24:17 +03:00
5e5ae51b73 MDEV-21341: Fix UBSAN failures: Issue Six
(Variant #2 of the patch, which keeps the sp_head object inside the
MEM_ROOT that sp_head object owns)
(10.3 requires extra work due to sp_package, will commit a separate
patch for it)

sp_head::operator new() and operator delete() were dereferencing sp_head*
pointers to memory that didn't hold a valid sp_head object (it was
not created/already destroyed).
This caused UBSan to crash when looking up type information.

Fixed by providing static sp_head::create() and sp_head::destroy() methods.
2020-01-14 18:15:32 +03:00
d531b4ee3a MDEV-21341: Fix UBSAN failures: Issue Six
(Variant #2 of the patch, which keeps the sp_head object inside the
MEM_ROOT that sp_head object owns)
(10.3 version of the fix, with handling for class sp_package)

sp_head::operator new() and operator delete() were dereferencing sp_head*
pointers to memory that didn't hold a valid sp_head object (it was
not created/already destroyed).
This caused UBSan to crash when looking up type information.

Fixed by providing static sp_head::create() and sp_head::destroy() methods.
2020-01-12 22:15:55 +03:00
33f55789d3 MDEV-18727 improve DML operation of System Versioning (10.4)
UPDATE, DELETE: replace linear search of current/historical records
with vers_setup_conds().

Additional DML cases in view.test
2019-11-25 16:01:43 +03:00
0076dce2c8 MDEV-18727 improve DML operation of System Versioning
MDEV-18957 UPDATE with LIMIT clause is wrong for versioned partitioned tables

UPDATE, DELETE: replace linear search of current/historical records
with vers_setup_conds().

Additional DML cases in view.test
2019-11-22 14:29:03 +03:00
ec40980ddd Merge 10.3 into 10.4 2019-11-01 15:23:18 +02:00
55b2281a5d Merge branch '10.2' into 10.3 2019-10-31 10:58:06 +01:00
42ada91542 cleanup: RAII helper for swapping of thd->security_ctx 2019-10-28 08:17:56 +01:00
67687d06bf Simplify TABLE::decide_logging_format()
- Use local variables table and share to simplify code
- Use sql_command_flags to detect what kind of command was used
- Added CF_DELETES_DATA to simplify detecton of delete commands
- Removed duplicate error in create_table_from_items().
2019-10-20 11:52:29 +03:00
de2186dd2f MDEV-20074: Lost connection on update trigger
Instead of checking lex->sql_command which does not corect in case of triggers
mark tables for insert.
2019-10-17 17:32:14 +02:00
7e44c455f4 Merge remote-tracking branch 'origin/10.2' into 10.3 2019-10-01 09:37:40 +04:00
46b785262b Fix -Wunused for CMAKE_BUILD_TYPE=RelWithDebInfo
For release builds, do not declare unused variables.

unpack_row(): Omit a debug-only variable from WSREP diagnostic message.

create_wsrep_THD(): Fix -Wmaybe-uninitialized for the PSI_thread_key.
2019-09-30 12:49:53 +03:00
c016ea660e Merge 10.2 into 10.3 2019-09-23 10:25:34 +03:00
efefafd02f fix for thread getting stuck after BF ABORT (#1362)
- Fixes a situation in which a thread gets BF aborted and does not send the reply back to
  the client, even though the connection is still alive. That caused
  both sides to hang waiting for the next message. Now we explicitly
  check that the connection is still alive.
- MTR test for the above
- Replaced thd->killed assignments to thd->reset_kill_query where applicable.
2019-09-17 10:58:20 +03:00
509d103810 MDEV-20561 Galera node shutdown fails in non-Primary (#1386)
Command COM_SHUTDOWN was rejected in non-Primary because
server_command_flags[COM_SHUTDOWN] had value CF_NO_COM_MULTI
instead of CF_SKIP_WSREP_CHECK.

As a fix removed assignment
server_command_flags[CF_NO_COM_MULTI]= CF_NO_COM_MULTI
which overwrote server_command_flags[COM_SHUTDOWN].
2019-09-15 11:24:57 +03:00
40beeb1402 MDEV-20561 Galera node shutdown fails in non-Primary (#1386)
Command COM_SHUTDOWN was rejected in non-Primary because
server_command_flags[COM_SHUTDOWN] had value CF_NO_COM_MULTI
instead of CF_SKIP_WSREP_CHECK.

As a fix removed assignment
server_command_flags[CF_NO_COM_MULTI]= CF_NO_COM_MULTI
which overwrote server_command_flags[COM_SHUTDOWN].
2019-09-13 09:18:11 +03:00
db4a27ab73 Merge 10.3 into 10.4 2019-08-31 06:53:45 +03:00
d1ef02e959 MDEV-20265 Unknown column in field list
This patch corrects the fix of the patch for mdev-19421 that resolved
the problem of parsing some embedded join expressions such as
  t1 join t2 left join t3 on t2.a=t3.a on t1.a=t2.a.
Yet the patch contained a bug that prevented proper context analysis
of the queries where such expressions were used together with comma
separated table references in from clauses.
2019-08-30 14:09:43 -07:00
f42a23178e MDEV-20425: Fix -Wimplicit-fallthrough
With --skip-debug-assert, DBUG_ASSERT(false) will allow execution to
continue. Hence, we will need /* fall through */ after them.

Some DBUG_ASSERT(0) were replaced by break; when the switch () statement
was followed by DBUG_ASSERT(0).
2019-08-30 14:11:59 +03:00
32ec5fb979 Merge 10.2 into 10.3 2019-08-21 15:23:45 +03:00
48c67038b9 Merge 10.1 into 10.2
For MDEV-15955, the fix in create_tmp_field_from_item() would cause a
compilation error. After a discussion with Alexander Barkov, the fix
was omitted and only the test case was kept.

In 10.3 and later, MDEV-15955 is fixed properly by overriding
create_tmp_field() in Item_func_user_var.
2019-08-20 09:15:28 +03:00
a02dd7e614 Merge 5.5 into 10.1 2019-08-20 07:31:44 +03:00
e746f451d5 MDEV-20265 Unknown column in field list
This patch corrects the fix of the patch for mdev-19421 that resolved
the problem of parsing some embedded join expressions such as
  t1 join t2 left join t3 on t2.a=t3.a on t1.a=t2.a.
Yet the patch contained a bug that prevented proper context analysis
of the queries where such expressions were used together with comma
separated table references in from clauses.
2019-08-19 14:20:41 -07:00
65d48b4a7b Merge 10.2 to 10.3 2019-08-13 19:28:51 +03:00