mirror of https://github.com/MariaDB/server.git synced 2025-10-15 11:08:40 +03:00
Commit Graph

402 Commits

Author SHA1 Message Date
Marko Mäkelä
5c3ff5cb93 Merge 10.3 into 10.4 2019-04-02 11:04:54 +03:00
Sergei Golubchik
4e1d3f83b7 Merge branch '10.2' into 10.3 2019-03-29 19:41:41 +01:00
Sergei Golubchik
f2a0c758da Merge branch '10.1' into 10.2 2019-03-29 10:58:20 +01:00
Marko Mäkelä
d0116e10a5 Revert MDEV-18464 and MDEV-12009
This reverts commit 21b2fada7a
and commit 81d71ee6b2.

The MDEV-18464 change introduces a few data race issues. Contrary to
the documentation, the field trx_t::victim is not always being protected
by lock_sys_t::mutex and trx_t::mutex. Most importantly, it seems
that KILL QUERY could wrongly avoid acquiring both mutexes when
invoking lock_trx_handle_wait_low(), in case another thread had
already set trx->victim=true.

We also revert MDEV-12009, because it should depend on the MDEV-18464
fix being present.
2019-03-28 12:39:50 +02:00
Jan Lindström
81d71ee6b2 MDEV-12009: Allow to force kill user threads/query which are flagged as high priority by Galera
As noted on kill_one_thread, SUPER should be able to kill even
system threads, i.e. threads/queries flagged as high priority or
the wsrep applier thread. A normal user should not be able to kill
threads/queries flagged as high priority (BF) or the wsrep applier
thread.
2019-03-28 08:43:44 +02:00
Sergei Golubchik
f97d879bf8 cmake: re-enable -Werror in the maintainer mode
now we can afford it. Fix -Werror errors. Note:
* old gcc is bad at detecting uninit variables, disable it.
* time_t is int or long, cast it for printf's
2019-03-27 22:51:37 +01:00
Marko Mäkelä
117291db8b Merge 10.2 into 10.3 2019-03-19 16:04:59 +02:00
sysprg
26432e49d3 MDEV-17262: mysql crashed on galera while node rejoined cluster (#895)
This patch contains a fix for the MDEV-17262/17243 issues and
new mtr test.

These issues (MDEV-17262/17243) have two causes:

1) After an intermediate commit, a transaction loses its status
of "transaction that is registered in MySQL for the 2pc coordinator"
(in InnoDB), because since version 10.2 the write_row() function
(located in ha_innodb.cc) does not call
trx_register_for_2pc(m_prebuilt->trx) during the processing
of split transactions. It is necessary to restore this call inside
write_row() when an intermediate commit was made (for a split
transaction).

Similarly, we need to set the flag of the started transaction
(m_prebuilt->sql_stat_start) after an intermediate commit.

The table->file->extra(HA_EXTRA_FAKE_START_STMT) call made from the
wsrep_load_data_split() function (located in sql_load.cc)
will also do this, but it will be too late. As a result, the call
to the wsrep_append_keys() function from the InnoDB engine may be
lost, or the function may be called with an invalid transaction identifier.

2) If a transaction with the LOAD DATA statement is divided into
logical mini-transactions (of 10K rows each) and the binlog is rotated,
then in rare cases, due to the wsrep handler re-registration at the
boundary of the split, the last portion of data may be lost. Since
splitting LOAD DATA into mini-transactions is a technical detail,
I believe that we should not allow these mini-transactions to fall
into separate binlogs. Therefore, it is necessary to prohibit
binlog rotation in the middle of processing a LOAD DATA statement.

https://jira.mariadb.org/browse/MDEV-17262 and
https://jira.mariadb.org/browse/MDEV-17243
2019-03-18 07:39:51 +02:00
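The split described in MDEV-17262 above can be pictured with a minimal sketch. It only models cutting one load into batches of 10K rows with an intermediate commit per batch; `split_into_batches`, `load_data` and the `commit` callback are hypothetical names for illustration and are not the server's API.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List

BATCH_ROWS = 10_000  # split size quoted in the commit message


def split_into_batches(rows: Iterable[str], size: int = BATCH_ROWS) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def load_data(rows: Iterable[str], commit: Callable[[List[str]], None]) -> int:
    """Hypothetical loader: one intermediate commit per batch.

    `commit` stands in for the per-batch commit. In the scenario the
    commit message describes, all batches of one statement would have
    to end up in the same binlog file.
    """
    total = 0
    for batch in split_into_batches(rows):
        commit(batch)
        total += len(batch)
    return total


if __name__ == "__main__":
    sizes: List[int] = []
    loaded = load_data((f"row{i}" for i in range(25_000)),
                       lambda batch: sizes.append(len(batch)))
    print(loaded, sizes)
```

The last batch is usually shorter than the others, which is the "last portion of data" the message refers to.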
Sergei Golubchik
b64fde8f38 Merge branch '10.2' into 10.3 2019-03-17 13:06:41 +01:00
Teemu Ollakka
1ef50a34ec 10.4 wsrep group commit fixes (#1224)
* MDEV-16509 Improve wsrep commit performance with binlog disabled

Release commit order critical section early after trx_commit_low() if
binlog is not transaction coordinator. In order to avoid two phase commit,
binlog_hton is not registered for THD during IO_CACHE population.

Implemented a test which verifies that the transactions release
commit order early.

This optimization will change behavior during recovery as the commit
is not two phase when binlog is off. Fixed and recorded wsrep-recover-v25
and wsrep-recover to match the behavior.

* MDEV-18730 Ordering for wsrep binlog group commit

Previously out of order execution was allowed for wsrep commits.
Established proper ordering by populating wait_for_commit
for every wsrep THD and making group commit leader to wait for
prior commits before proceeding to trx_group_commit_leader().

* MDEV-18730 Added a test case to verify correct commit ordering

* MDEV-16509, MDEV-18730 Review fixes

Use WSREP_EMULATE_BINLOG() macro to decide if the binlog_hton
should be registered. Whitespace/syntax fixes and cleanups.

* MDEV-16509 Require binlog for galera_var_innodb_disallow_writes test

If the commit to InnoDB is done in one phase, the native InnoDB behavior
is that the transaction is committed in memory before it is persisted to
disk. This means that innodb_disallow_writes=ON may not prevent a
transaction from becoming visible to other readers before the commit is
completely over. On the other hand, if the commit is two phase (as it is
with binlog), the transaction will be blocked in the prepare phase.

Fixed the test to use binlog, which enforces two phase commit, which
in turn makes commit to block before the changes become visible to
other connections. This guarantees that the test produces expected
result.
2019-03-15 07:09:13 +02:00
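The ordering idea from MDEV-18730 above (each committer waits for the prior commit before proceeding) can be sketched with plain threads. This is an illustration under stated assumptions only: `ordered_commit` and its events are hypothetical and do not mirror the server's wait_for_commit implementation.

```python
import threading
from typing import List


def ordered_commit(n: int) -> List[int]:
    """Run n committers concurrently, but let them commit in sequence order."""
    committed: List[int] = []
    done = [threading.Event() for _ in range(n)]

    def worker(seq: int) -> None:
        if seq > 0:
            done[seq - 1].wait()  # wait for the prior commit to finish
        committed.append(seq)     # the "commit" itself
        done[seq].set()           # wake up the next committer

    # Start the threads in reverse order to show that start order
    # does not decide the commit order.
    threads = [threading.Thread(target=worker, args=(s,)) for s in reversed(range(n))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return committed


if __name__ == "__main__":
    print(ordered_commit(5))
```

Without the wait on the prior event, the order of `committed` would depend on thread scheduling, which corresponds to the out-of-order execution the message says was previously allowed.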
Jan Lindström
d0ebb155fe MDEV-18577: Indexes problem on import dump SQL
The problem was that we skipped background persistent statistics calculation
on applier nodes if the thread is marked as high priority (a.k.a. BF).
However, on applier nodes all DDL which is replicated will be executed
as high priority, i.e. BF.

Fixed by allowing background persistent statistics calculation on
applier nodes even when the thread is marked as BF. This could lead to
BF lock waits, but queries on that node need those statistics.
2019-03-13 10:18:12 +02:00
Marko Mäkelä
2a791c53ad Merge 10.3 into 10.4 2019-03-06 09:00:52 +02:00
Julius Goryavsky
50b3632fa4 MDEV-9519: Data corruption will happen on the Galera cluster size change
If we have a 2+ node cluster which is replicating from an async master
and the binlog_format is set to STATEMENT and multi-row inserts are executed
on a table with an auto_increment column such that values are automatically
generated by MySQL, then the server node generates wrong auto_increment
values, which are different from what was generated on the async master.

In the title of the MDEV-9519 it was proposed to ban start slave on a Galera
if master binlog_format = statement and wsrep_auto_increment_control = 1,
but the problem can be solved without such a restriction.

The causes and fixes:

1. We need to improve processing of changing the auto-increment values
after changing the cluster size.

2. If wsrep_auto_increment_control is switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting for the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain their initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on the different nodes in some scenarios.

3. If wsrep_auto_increment_control is switched off during operation of the node,
then we must restore the original values of the auto_increment_increment and
auto_increment_offset global variables, as the user has set them. To make this
possible, we need to add "shadow copies" of these variables (which store
the latest values set by the user).

https://jira.mariadb.org/browse/MDEV-9519
2019-02-26 08:09:04 +02:00
Julius Goryavsky
2c734c980e MDEV-9519: Data corruption will happen on the Galera cluster size change
If we have a 2+ node cluster which is replicating from an async master
and the binlog_format is set to STATEMENT and multi-row inserts are executed
on a table with an auto_increment column such that values are automatically
generated by MySQL, then the server node generates wrong auto_increment
values, which are different from what was generated on the async master.

In the title of the MDEV-9519 it was proposed to ban start slave on a Galera
if master binlog_format = statement and wsrep_auto_increment_control = 1,
but the problem can be solved without such a restriction.

The causes and fixes:

1. We need to improve processing of changing the auto-increment values
after changing the cluster size.

2. If wsrep_auto_increment_control is switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting for the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain their initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on the different nodes in some scenarios.

3. If wsrep_auto_increment_control is switched off during operation of the node,
then we must restore the original values of the auto_increment_increment and
auto_increment_offset global variables, as the user has set them. To make this
possible, we need to add "shadow copies" of these variables (which store
the latest values set by the user).

https://jira.mariadb.org/browse/MDEV-9519
2019-02-26 07:45:11 +02:00
Julius Goryavsky
243f829c1c MDEV-9519: Data corruption will happen on the Galera cluster size change
If we have a 2+ node cluster which is replicating from an async master
and the binlog_format is set to STATEMENT and multi-row inserts are executed
on a table with an auto_increment column such that values are automatically
generated by MySQL, then the server node generates wrong auto_increment
values, which are different from what was generated on the async master.

In the title of the MDEV-9519 it was proposed to ban start slave on a Galera
if master binlog_format = statement and wsrep_auto_increment_control = 1,
but the problem can be solved without such a restriction.

The causes and fixes:

1. We need to improve processing of changing the auto-increment values
after changing the cluster size.

2. If wsrep_auto_increment_control is switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting for the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain their initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on the different nodes in some scenarios.

3. If wsrep_auto_increment_control is switched off during operation of the node,
then we must restore the original values of the auto_increment_increment and
auto_increment_offset global variables, as the user has set them. To make this
possible, we need to add "shadow copies" of these variables (which store
the latest values set by the user).

https://jira.mariadb.org/browse/MDEV-9519
2019-02-25 11:19:07 +02:00
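Points 2 and 3 of the MDEV-9519 message above (apply the change immediately, and keep "shadow copies" of the user's values) can be sketched as follows. This is a simplified model, not the patch itself: the class and method names are hypothetical, and it assumes that with the control enabled the increment follows the cluster size and the offset follows the node's index plus one.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AutoIncrementState:
    # "Shadow copies": the latest values set by the user.
    user_increment: int = 1
    user_offset: int = 1
    # Effective values used when generating keys.
    increment: int = 1
    offset: int = 1
    control: bool = False

    def apply(self, cluster_size: int, node_index: int) -> None:
        """Recompute the effective values from the current settings."""
        if self.control:
            # Assumption: increment = cluster size, offset = node index + 1.
            self.increment = cluster_size
            self.offset = node_index + 1
        else:
            # Control is off: restore what the user configured.
            self.increment = self.user_increment
            self.offset = self.user_offset

    def set_control(self, on: bool, cluster_size: int, node_index: int) -> None:
        """Toggle the control and apply it at once, not at the next view change."""
        self.control = on
        self.apply(cluster_size, node_index)

    def on_view_change(self, cluster_size: int, node_index: int) -> None:
        """Cluster size changed: recompute while the control is active."""
        self.apply(cluster_size, node_index)


def generated_values(state: AutoIncrementState, count: int) -> List[int]:
    """First `count` auto-increment values: offset + k * increment."""
    return [state.offset + k * state.increment for k in range(count)]


if __name__ == "__main__":
    state = AutoIncrementState(user_increment=5, user_offset=2, increment=5, offset=2)
    state.set_control(True, cluster_size=3, node_index=1)
    print(generated_values(state, 3))
    state.set_control(False, cluster_size=3, node_index=1)
    print(generated_values(state, 3))
```

With this layout, nodes at different indexes produce interleaved, non-overlapping values while the control is on, and the user's own settings come back unchanged when it is switched off.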
mkaruza
3e64e7f24c WSREP debug log levels support
Global variable wsrep_debug now can be used to filter wsrep-lib messages based on debug level provided.
Type of wsrep_debug is now set to be unsigned int, so tests and configuration files changed accordingly.
2019-02-13 18:47:27 +01:00
Brave Galera Crew
36a2a185fe Galera4 2019-01-23 15:30:00 +04:00
Alexey Botchkov
cc18a5db9b MDEV-5313 Improving audit API.
json_locate_key() implemented.
get rid of 'key_len' argument in functions.
2019-01-18 03:18:02 +04:00
Alexey Botchkov
294d9bf248 MDEV-5313 Improving audit api.
JSON api implementations and tests pushed.
sql_acl.cc fixed with the new function names.
2019-01-17 03:52:52 +04:00
Alexey Botchkov
b1527ef51c MDEV-5313 Improving audit api.
Service added to handle json.
2018-12-12 01:49:39 +04:00
Marko Mäkelä
074c684099 Merge 10.3 into 10.4 2018-11-06 16:24:16 +02:00
Marko Mäkelä
df563e0c03 Merge 10.2 into 10.3
main.derived_cond_pushdown: Move all 10.3 tests to the end,
trim trailing white space, and add an "End of 10.3 tests" marker.
Add --sorted_result to tests where the ordering is not deterministic.

main.win_percentile: Add --sorted_result to tests where the
ordering is no longer deterministic.
2018-11-06 09:40:39 +02:00
Marko Mäkelä
32062cc61c Merge 10.1 into 10.2 2018-11-06 08:41:48 +02:00
Marko Mäkelä
d63e198061 Merge 10.0 into 10.1 2018-11-05 12:15:17 +02:00
Marko Mäkelä
f0cb21ea2e Remove dead code is_thd_killed() 2018-11-02 12:42:01 +02:00
Sergei Golubchik
7c40996cc8 MDEV-12321 authentication plugin: SET PASSWORD support
Support SET PASSWORD for authentication plugins.

Authentication plugin API is extended with two optional methods:
* hash_password() is used to compute a password hash (or digest)
  from the plain-text password. This digest will be stored in mysql.user
  table
* preprocess_hash() is used to convert this digest into some memory
  representation that can be later used to authenticate a user.
  Built-in plugins convert the hash from hexadecimal or base64 to binary,
  to avoid doing it on every authentication attempt.

Note a change in behavior: when loading privileges (on startup or on
FLUSH PRIVILEGES) an account with an unknown plugin was loaded with a
warning (e.g. "Plugin 'foo' is not loaded"). But such an account could
not be used for authentication until the plugin is installed. Now an
account like that will not be loaded at all (with a warning, still).
Indeed, without plugin's preprocess_hash() method the server cannot know
how to load an account. Thus, if a new authentication plugin is
installed at run time, one might need FLUSH PRIVILEGES to activate all
existing accounts that were using this new plugin.
2018-10-31 16:06:16 +01:00
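The two optional methods described in MDEV-12321 above split the work into "compute a storable digest once" and "convert it to an in-memory form once". A minimal sketch of that split follows. It is not the plugin API: the Python functions only borrow the method names, and the SHA-256 hex digest is an assumption chosen for illustration, not the scheme of any real plugin.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Compute the digest that would be stored in the user table (hex text)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def preprocess_hash(stored: str) -> bytes:
    """Convert the stored hex digest to binary once, at account load time."""
    return bytes.fromhex(stored)


def authenticate(password: str, binary_digest: bytes) -> bool:
    """Check a login attempt against the preprocessed (binary) digest."""
    candidate = hashlib.sha256(password.encode("utf-8")).digest()
    return hmac.compare_digest(candidate, binary_digest)


if __name__ == "__main__":
    stored = hash_password("secret")      # done on SET PASSWORD
    in_memory = preprocess_hash(stored)   # done when privileges are loaded
    print(len(stored), len(in_memory))
    print(authenticate("secret", in_memory), authenticate("wrong", in_memory))
```

Doing the hex-to-binary conversion at load time means each authentication attempt compares raw bytes directly, which is the saving the message mentions. It also shows why an account cannot be loaded without the plugin: only the plugin knows how to interpret the stored text.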
Sergei Golubchik
0e388d43a7 cleanup: add 'const' to password validation API 2018-10-31 16:06:16 +01:00
Sergei Golubchik
44f6f44593 Merge branch '10.0' into 10.1 2018-10-30 15:10:01 +01:00
Thirunarayanan Balathandayuthapani
1dacd5f299 MDEV-12547: InnoDB FULLTEXT index has too strict innodb_ft_result_cache_limit max limit
- Backported the MYSQL_SYSVAR_SIZE_T to 10.0
- The parameter innodb_ft_result_cache_limit was only 32 bits wide
also on 64-bit systems. Make it size_t, so that it will be 64 bits
on 64-bit systems.
- Added a test case that shows how the innodb_ft_result_cache_limit variable
behaves on 32-bit and 64-bit systems.
2018-10-16 13:02:50 +05:30
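The width issue behind MDEV-12547 above comes down to the largest value an unsigned integer of a given width can hold. The sketch below compares a 32-bit limit with the platform's size_t. `clamp_to_width` is a hypothetical helper for illustration; it is not how the server validates the variable.

```python
import ctypes


def max_unsigned(bits: int) -> int:
    """Largest value representable in an unsigned integer of `bits` bits."""
    return (1 << bits) - 1


UINT32_MAX = max_unsigned(32)
SIZE_T_BITS = ctypes.sizeof(ctypes.c_size_t) * 8  # 32 or 64 on common platforms
SIZE_T_MAX = max_unsigned(SIZE_T_BITS)


def clamp_to_width(value: int, bits: int) -> int:
    """Hypothetical helper: limit a requested setting to what the type can hold."""
    return min(max(value, 0), max_unsigned(bits))


if __name__ == "__main__":
    requested = 5_000_000_000  # about 5 GB, more than 32 bits can express
    print("32-bit limit:", clamp_to_width(requested, 32))
    print("size_t limit:", clamp_to_width(requested, SIZE_T_BITS))
```

On a 64-bit build the second value passes through unchanged, while a 32-bit wide variable cannot go past 4294967295 regardless of the platform.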
Marko Mäkelä
2a955c7a83 Merge 10.3 into 10.4 2018-10-10 10:36:51 +03:00
Marko Mäkelä
43ee6915fa Merge 10.2 into 10.3 2018-10-09 09:11:30 +03:00
Vladislav Vaintroub
8c2360dee8 MDEV-17373 Windows: application verifier stop "Attempt to use an unknown SOCKET" 2018-10-05 16:48:51 +01:00
Marko Mäkelä
444c380ceb Merge 10.3 into 10.4 2018-10-05 08:09:49 +03:00
Sergei Golubchik
57e0da50bb Merge branch '10.2' into 10.3 2018-09-28 16:37:06 +02:00
Oleksandr Byelkin
28f08d3753 Merge branch '10.1' into 10.2 2018-09-14 08:47:22 +02:00
Sergei Golubchik
a6246cab16 fix failures of innodb_plugin tests in --embedded
Post-fix for 7e8ed15b95

Also, apply the same innodb fix to xtradb.
2018-09-04 09:19:50 +02:00
Alexander Barkov
e61568ee93 Merge remote-tracking branch 'origin/10.3' into 10.4 2018-07-03 14:02:05 +04:00
Sergei Golubchik
36e59752e7 Merge branch '10.2' into 10.3 2018-06-30 16:39:20 +02:00
Sergei Golubchik
b942aa34c1 Merge branch '10.1' into 10.2 2018-06-21 23:47:39 +02:00
Vicențiu Ciorbaru
6e55236c0a Merge branch '10.0-galera' into 10.1 2018-06-12 19:39:37 +03:00
Sergei Golubchik
ced6638773 mysys: ME_ERROR_LOG_ONLY flag 2018-06-04 12:32:23 +02:00
Michael Widenius
70c1110a29 Optimize performance schema likely/unlikely
Performance schema likely/unlikely assume that performance schema
is enabled by default, which causes a performance degradation for
default installations that don't have performance schema enabled.

Fixed by changing the likely/unlikely in PS to assume it's
not enabled. This can be changed by compiling with
-DPSI_ON_BY_DEFAULT

Other changes:
- Added psi_likely/psi_unlikely, which depend on
  PSI_ON_BY_DEFAULT. psi_likely() is assumed to be true
  if PS is enabled.
- Added likely/unlikely to some PS interface code.
- Moved pfs_enabled to mysys (was initialized but not used before)
- Added "if (pfs_likely(pfs_enabled))" around calls to PS to avoid
  an extra call if PS is not enabled.
- Moved checking flag_global_instrumention before other flags
  to speed up the case when PS is not enabled.
2018-05-07 00:07:33 +03:00
Daniel Black
ccd566af20 MDEV-8743: mysqld port/socket - FD_CLOEXEC if no SOCK_CLOEXEC
In MDEV-8743, the port/socket of mysqld was changed to set FD_CLOEXEC.
The existing mysql_socket_socket function already set that with
SOCK_CLOEXEC when the socket was created. So here we move the fcntl
functionality to the mysql_socket_socket as port/socket are the only
callers.

Preprocessor checks of SOCK_CLOEXEC cannot be done, as it is defined as 0 if not
there and otherwise as SOCK_CLOEXEC (being the value of the enum in bits/socket_type.h).
Preprocessor logic for arithmetic and non-arithmetic defines is
hard/nonportable/ugly to read. As such we just check in my_global.h
and define HAVE_SOCK_CLOEXEC if we have it.

There was a disparity in behaviour between defined(WITH_WSREP) and
not, depending on the OS, so the WITH_WSREP condition was removed
from the fcntl call.

All sockets are now marked SOCK_CLOEXEC/FD_CLOEXEC.

strace of mysqld with SOCK_CLOEXEC:

socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_TCP) = 10
write(2, "180419 14:52:40 [Note] Server socket created on IP: '127.0.0.1'.\n", 65) = 65
setsockopt(10, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(10, {sa_family=AF_INET, sin_port=htons(16020), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
listen(10, 150)                         = 0
socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_TCP) = 11
write(2, "180419 14:52:40 [Note] Server socket created on IP: '127.0.0.1'.\n", 65) = 65
setsockopt(11, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(11, {sa_family=AF_INET, sin_port=htons(16021), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
listen(11, 150)                         = 0
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0) = 12
unlink("/home/dan/repos/build-mariadb-server-10.0/mysql-test/var/tmp/mysqld.1.sock") = -1 ENOENT (No such file or directory)
setsockopt(12, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
umask(000)                              = 006
bind(12, {sa_family=AF_UNIX, sun_path="/home/dan/repos/build-mariadb-server-10.0/mysql-test/var/tmp/mysqld.1.sock"}, 110) = 0
umask(006)                              = 000
listen(12, 150)                         = 0
2018-04-19 14:34:46 +10:00
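The approach in MDEV-8743 above (use SOCK_CLOEXEC at creation when the platform has it, otherwise set FD_CLOEXEC with fcntl) can be sketched on a POSIX system as follows. The helper names are hypothetical. Note one difference from C: Python already creates sockets as non-inheritable, so the demo clears the flag first to imitate a platform without SOCK_CLOEXEC.

```python
import fcntl
import socket
from typing import Tuple


def is_cloexec(fd: int) -> bool:
    """True if the descriptor is closed automatically on exec()."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


def set_cloexec(fd: int) -> None:
    """Fallback path: set FD_CLOEXEC with fcntl after the socket exists."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


def create_cloexec_socket() -> socket.socket:
    """Create a TCP socket that is close-on-exec on either kind of platform."""
    sock_cloexec = getattr(socket, "SOCK_CLOEXEC", 0)  # 0 where it is missing
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM | sock_cloexec)
    if not sock_cloexec:
        set_cloexec(sock.fileno())
    return sock


def demo() -> Tuple[bool, bool]:
    """Clear the flag to imitate a plain socket(), then apply the fallback."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.set_inheritable(True)       # clears FD_CLOEXEC
        before = is_cloexec(sock.fileno())
        set_cloexec(sock.fileno())
        after = is_cloexec(sock.fileno())
        return before, after
    finally:
        sock.close()


if __name__ == "__main__":
    print(demo())
```

Setting the flag at creation time, where available, avoids the short window between socket() and fcntl() in which a fork and exec in another thread could inherit the descriptor.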
luz.paz
3dd01669b4 Misc. typos
Found via `codespell -i 3 -w --skip="./debian/po" -I ../mariadb-server-word-whitelist.txt  ./cmake/ ./debian/ ./Docs/ ./include/ ./man/ ./plugin/ ./strings/`
2018-04-05 15:26:57 +04:00
Teemu Ollakka
33aad1d273 MDEV-15505 Fixes to compilation without -DWITH_WSREP:BOOL=ON
Removed including wsrep_api.h from service_wsrep.h. This caused
various kinds of collisions with definitions when wsrep is
not supposed to be built in. Defined functions wsrep_xid_seqno()
and wsrep_xid_uuid() in wsrep_dummy.cc. Replaced wsrep_seqno_t
with long long where wsrep_api.h is not included.

Removed wsrep_xid_seqno() macro from wsrep_mysqld.h and made
wsrep code using wsrep_xid_seqno() in handler.cc to be compiled
in only if WITH_WSREP is ON.

Included wsrep_api.h for mariabackup if WITH_WSREP is ON.
2018-03-21 12:02:09 +02:00
Teemu Ollakka
b125ae0a84 MDEV-15505 New wsrep XID format for backwards compatibility
A new wsrep XID format was added to keep the XID implementation
backwards compatible. Original version always reads XID seqno
part in host byte order, the new version in little endian byte
order. Wsrep XID will always be written in the new format.

Included wsrep_api.h from service_wsrep.h for wsrep type definitions.
Removed redundant wsrep XID code from mariabackup and included
service_wsrep.h in order to use
2018-03-12 14:51:49 +02:00
Teemu Ollakka
dd74b94823 MDEV-15505 Fix wsrep XID seqno byte order
The problem is that the seqno part of the wsrep XID is always
stored in host byte order. This may cause issues when a physical
backup is restored on a host with a different architecture: the
seqno part of the XID may have an incorrect value.

In order to fix this, wsrep XID seqno is always written into
XID data buffer in little endian byte order using int8store()
and read from data buffer using sint8korr(). For backwards
compatibility the seqno is read from TRX_SYS page in host
byte order during upgrade.

This patch implements byte ordering in wsrep_xid_init(),
wsrep_xid_seqno(), and exposes functions to read wsrep
XID uuid and seqno in wsrep_service_st. Backwards compatibility
for upgrade is provided in trx_rseg_init_wsrep_xid().
2018-03-12 14:46:20 +02:00
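The byte order problem from MDEV-15505 above can be reproduced with the standard struct module. The sketch writes a signed 64-bit seqno in little endian order, as the message describes for int8store() and sint8korr(), and then shows what a reader using the opposite byte order would see. The Python function names are hypothetical stand-ins, not the server functions.

```python
import struct


def store_seqno(seqno: int) -> bytes:
    """Write a signed 64-bit seqno in little endian order (8 bytes)."""
    return struct.pack("<q", seqno)


def read_seqno(buf: bytes) -> int:
    """Read a seqno that was written in little endian order."""
    return struct.unpack("<q", buf)[0]


def read_seqno_host_order(buf: bytes) -> int:
    """Old behaviour: interpret the bytes in the byte order of this host."""
    return struct.unpack("=q", buf)[0]


def read_seqno_big_endian(buf: bytes) -> int:
    """What a big endian host would get from a host-order read."""
    return struct.unpack(">q", buf)[0]


if __name__ == "__main__":
    buf = store_seqno(1)
    print(buf.hex())                   # least significant byte first
    print(read_seqno(buf))             # same value on every architecture
    print(read_seqno_big_endian(buf))  # misread value on a big endian host
```

A fixed on-disk byte order makes the stored value independent of the machine that wrote it, while the host-order reader is still needed once for data written before the change.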
Vladislav Vaintroub
6c279ad6a7 MDEV-15091 : Windows, 64bit: reenable and fix warning C4267 (conversion from 'size_t' to 'type', possible loss of data)
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. change local/member variables to size_t
when appropriate.

This fix excludes rocksdb, spider, sphinx and connect for now.
2018-02-06 12:55:58 +00:00
Monty
a7e352b54d Changed database, tablename and alias to be LEX_CSTRING
This was done in, among other things:
- thd->db and thd->db_length
- TABLE_LIST tablename, db, alias and schema_name
- Audit plugin database name
- lex->db
- All db and table names in Alter_table_ctx
- st_select_lex db

Other things:
- Changed a lot of functions to take const LEX_CSTRING* as argument
  for db, table_name and alias. See init_one_table() as an example.
- Changed some function arguments from LEX_CSTRING to const LEX_CSTRING
- Changed some lists from LEX_STRING to LEX_CSTRING
- threads_mysql.result changed because processlist_db wasn't always
  correctly updated
- New append_identifier() function that takes LEX_CSTRING* as arguments
- Added new element tmp_buff to Alter_table_ctx to separate temp name
  handling from temporary space
- Ensure we store the length after my_casedn_str() of table/db names
- Removed the unused version of rename_table_in_stat_tables()
- Changed Natural_join_column::table_name and db_name() to never return
  NULL (used for print)
- thd->get_db() now returns db as a printable string (thd->db.str or "")
2018-01-30 21:33:55 +02:00
Sergei Golubchik
8f102b584d Merge branch 'github/10.3' into bb-10.3-temporal 2018-01-17 00:45:02 +01:00