Initial support
tested against OpenSSL 1.0.1, 1.0.2, 1.1.0, Yassl and LibreSSL
Not yet working on Windows with native SChannel support, due to wrong cipher
mapping; the latter requires the CONC-241 fixes to be pushed.
Please note that OpenSSL 0.9.8 and OpenSSL 1.1.0 will not work: even if
the build succeeds, test cases will fail with various errors, especially
when using different TLS libraries or versions for client and server.
Other things
- Ensure that ut_d() is set to EXPR if ut_ad() is DEBUG_ASSERT().
If not, we will get a crash in purge_sys_t::~purge_sys_t(), as
this ut_ad() code expects that the ut_d() code has been executed.
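A minimal sketch of the pairing this relies on; the build guard is a
hypothetical name, and DEBUG_ASSERT() is the macro named above, not the
exact InnoDB headers:

  /* Illustrative only: whichever build mode makes ut_ad() evaluate its
     argument must also make ut_d() execute its argument, otherwise
     purge_sys_t::~purge_sys_t() asserts on state that ut_d() never set up. */
  #include <cassert>
  #define DEBUG_ASSERT(X) assert(X)

  #ifdef WITH_DEBUG_ASSERTIONS              /* hypothetical build guard */
  # define ut_d(EXPR)   EXPR                /* debug-only statements run... */
  # define ut_ad(EXPR)  DEBUG_ASSERT(EXPR)  /* ...so they can be asserted on */
  #else
  # define ut_d(EXPR)                       /* both compiled out together */
  # define ut_ad(EXPR)
  #endif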
Benefits of this patch:
- Removed a lot of calls to strlen(), especially for field_string
- Strings generated by parser are now const strings, less chance of
accidentally changing a string
- Removed a lot of calls with LEX_STRING as parameter (changed to pointer)
- More uniform code
- Item::name_length was not kept up to date. Now fixed
- Several bugs found and fixed (access to null pointers,
access to freed memory, wrong arguments to printf-like functions)
- Removed a lot of casts from (const char*) to (char*)
Changes:
- This caused some ABI changes
- lex_string_set now uses LEX_CSTRING
- Some functions now take const char* instead of char*
- Create_field::change and after changed to LEX_CSTRING
- handler::connect_string, comment and engine_name() changed to LEX_CSTRING
- Checked printf() related calls to find bugs. Found and fixed several
errors in old code.
- A lot of changes from LEX_STRING to LEX_CSTRING, especially related to
parsing and events.
- Some changes from LEX_STRING and LEX_STRING & to LEX_CSTRING*
- Some changes for char* to const char*
- Added printf argument checking for my_snprintf()
- Introduced null_clex_str, star_clex_string, temp_lex_str to simplify
code
- Added item_empty_name and item_used_name to be able to distinguish between
items that were given an empty name and items that were not given a name.
This is used in sql_yacc.yy to know when to give an item a name.
- select table_name."*" is no longer the same as table_name.*
- Removed unused function Item::rename()
- Added comparison of item->name_length before some calls to
my_strcasecmp() to speed up comparison (see the sketch below this list)
- Moved Item_sp_variable::make_field() from item.h to item.cc
- Some minimal code changes to avoid copying to const char *
- Fixed wrong error message in wsrep_mysql_parse()
- Fixed wrong code in find_field_in_natural_join() where real_item() was
set when it shouldn't have been
- ER_ERROR_ON_RENAME was used with extra arguments.
- Removed some (wrong) ER_OUTOFMEMORY, as alloc_root will already
give the error.
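A sketch of the name_length comparison mentioned above; the struct and
function names are illustrative, and plain strcasecmp() stands in for the
server's my_strcasecmp():

  #include <strings.h>
  #include <cstddef>

  struct lex_cstring_sketch { const char *str; size_t length; };

  // Cheap length check first; the case-insensitive compare only runs
  // when both names already have the same length.
  static inline bool names_equal(const lex_cstring_sketch &a,
                                 const lex_cstring_sketch &b)
  {
    return a.length == b.length && strcasecmp(a.str, b.str) == 0;
  }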
TODO:
- Check possible unsafe casts in plugin/auth_examples/qa_auth_interface.c
- Change code to not modify LEX_CSTRING for database name
(as part of lower_case_table_names)
Intermediate commit.
Move the discovery of mysql.gtid_slave_pos* tables into the SQL thread.
This avoids doing things like opening tables and scanning the mysql
schema for tables inside of the START SLAVE statement, which might
interact badly with existing transaction or table locks.
(Even though START SLAVE is documented to implicitly commit any active
transactions, this appears not to be the case in current code).
Table discovery fits naturally in the SQL thread init code, next to
the loading of mysql.gtid_slave_pos state.
Intermediate commit.
Implement auto-creation of mysql.gtid_slave_pos* tables with needed engines,
if listed in --gtid-pos-auto-engines.
Uses an asynchronous approach to minimise locking overhead.
The list of available tables is extended with a flag. Extra entries are
added for --gtid-pos-auto-engines tables that do not exist yet, marked as
not existing but ready for auto-creation.
If record_gtid() needs a table marked for auto-creation, it sends a request
to the slave background thread to create the table, and continues to use an
existing table for the current and immediately coming transactions.
As soon as the slave background thread has made the new table available, it
will be used for all subsequent relevant transactions in record_gtid().
This asynchronous approach also avoids a lot of complex issues around trying
to do DDL in the middle of an on-going transaction.
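A rough sketch of the table-selection idea described above; all names are
illustrative, and the real logic lives in record_gtid() and the slave
background thread handling:

  #include <cstring>

  struct gtid_pos_table_sketch
  {
    const char *engine;
    bool table_exists;        // false for --gtid-pos-auto-engines placeholders
    bool create_requested;
    gtid_pos_table_sketch *next;
  };

  // Called with the engine wanted for the current transaction.  If the
  // matching table does not exist yet, ask the background thread to
  // create it and keep using an already existing table for now.
  gtid_pos_table_sketch *
  pick_gtid_pos_table(gtid_pos_table_sketch *list, const char *wanted,
                      void (*request_create)(gtid_pos_table_sketch *))
  {
    gtid_pos_table_sketch *fallback= nullptr;
    for (gtid_pos_table_sketch *t= list; t; t= t->next)
    {
      if (t->table_exists)
      {
        if (0 == std::strcmp(t->engine, wanted))
          return t;                     // preferred table already available
        if (!fallback)
          fallback= t;                  // remember some usable table
      }
      else if (0 == std::strcmp(t->engine, wanted) && !t->create_requested)
      {
        request_create(t);              // asynchronous CREATE TABLE request
        t->create_requested= true;
      }
    }
    return fallback;                    // used until the new table appears
  }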
When master_use_gtid=no, the IO thread loads the slave GTID state from
the master during connect. This races with the SQL thread when
gtid_ignore_duplicates=1. If an event is in the relay log from before
the new connect and has not been applied yet, moving the slave
position causes the SQL thread to think that event should be skipped
due to gtid_ignore_duplicates=1.
This patch simply disables gtid_ignore_duplicates when not using GTID,
which seems to be what one would expect.
- Before this patch, during startup all slave threads were started without
any check that they had started properly.
- If one did a START SLAVE, STOP SLAVE or CHANGE MASTER as the first command
to the server, there was a chance that the server could access structures
that were not properly initialized, which could lead to crashes in
Log_event::read_log_event
- Fixed by waiting for slave threads to start up properly also during
server startup, like we do with START SLAVE.
Protection added to reopen_file() and new_file_impl().
Without this we could get an assert in fn_format() as name == 0,
because the file was closed and name reset at the same time as
new_file_impl() was called.
is starting.
This is needed because if we kill the START SLAVE thread too early during
shutdown, the IO_THREAD or SQL_THREAD will not have time to properly
initialize its replication or THD structures, and clean_up() will try to
delete master_info structures that are still in use.
The reason for this is that STOP SLAVE takes LOCK_active_mi over the
whole operation, while some slave operations also need LOCK_active_mi,
which causes deadlocks.
Fixed by introducing object counting for Master_info and not taking
LOCK_active_mi over STOP SLAVE or even stop_all_slaves()
(a sketch of the counting scheme follows after the list below).
Another benefit of this approach is that it allows:
- Multiple threads can run SHOW SLAVE STATUS at the same time
- START/STOP/RESET SLAVE or SHOW SLAVE STATUS on a slave will not block other slaves
- Simpler interface for handling get_master_info()
- Added some missing unlock of 'log_lock' in error conditions
- Moved rpl_parallel_inactivate_pool(&global_rpl_thread_pool) to end
of stop_slave() to not have to use LOCK_active_mi inside
terminate_slave_threads()
- Changed argument for remove_master_info() to Master_info, as we always
have this available
- Fixed core dump when doing FLUSH TABLES WITH READ LOCK and parallel
replication. Problem was that waiting for pause_for_ftwrl was not done
when deleting rpt->current_owner after a force_abort.
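A stripped-down sketch of the use counting mentioned above; std::mutex
stands in for LOCK_active_mi and all names and bodies are illustrative,
not the server's actual get_master_info() code:

  #include <mutex>

  struct master_info_sketch
  {
    int users= 0;             // threads currently holding a reference
    bool removed= false;      // marked for deletion by STOP/RESET
  };

  std::mutex lock_active_mi;  // stand-in for LOCK_active_mi

  // Pin the object under the global mutex, then run the slow slave
  // operation without holding that mutex.
  master_info_sketch *get_master_info(master_info_sketch *mi)
  {
    std::lock_guard<std::mutex> guard(lock_active_mi);
    mi->users++;
    return mi;
  }

  void release_master_info(master_info_sketch *mi)
  {
    std::lock_guard<std::mutex> guard(lock_active_mi);
    if (--mi->users == 0 && mi->removed)
      delete mi;              // last user frees the removed object
  }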
Description:
============
If you have a relay log index file that has ended up with
some relay log files that do not exist, then RESET SLAVE
ALL is not enough to get back to a clean state.
Analysis:
=========
In the bug scenario slave server is in stopped state and
some of the relay logs got deleted but the relay log index
file is not updated.
During slave server restart replication initialization fails
as some of the required relay logs are missing. User
executes RESET SLAVE/RESET SLAVE ALL command to start a
clean slave. As per the documentation RESET SLAVE command
clears the master info and relay log info repositories,
deletes all the relay log files, and starts a new relay log
file. But in a scenario where the slave server's
Relay_log_info object is not initialized, the slave will not
purge the existing relay logs. Hence the index file still
remains in a bad state. Users will not be able to start
the slave unless these files are cleared.
Fix:
===
RESET SLAVE/RESET SLAVE ALL commands should do the cleanup
even in a scenario where Relay_log_info object
initialization failed.
Backported a flag named 'error_on_rli_init_info' which is
required to identify slave's Relay_log_info object
initialization failure. This flag exists in MySQL-5.6
onwards as part of BUG#14021292 fix.
During RESET SLAVE/RESET SLAVE ALL execution this flag
indicates the Relay_log_info initialization failure.
In such a case open the relay log index/relay log files
and do the required clean up.
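A sketch of that cleanup path; apart from error_on_rli_init_info the
names below are illustrative, not the exact server functions:

  // Illustrative only: on RESET SLAVE [ALL], if Relay_log_info
  // initialization failed earlier, open the relay log index and relay
  // logs explicitly so that they can still be purged.
  struct relay_log_info_sketch { bool error_on_rli_init_info; };

  int reset_slave_cleanup(relay_log_info_sketch *rli,
                          int (*open_index_and_logs)(),
                          int (*purge_relay_logs)())
  {
    if (rli->error_on_rli_init_info && open_index_and_logs())
      return 1;                   // cannot even open the files: give up
    return purge_relay_logs();    // delete relay logs, start a fresh one
  }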
Make the slave SQL thread always output to the error log the message "Slave
SQL thread exiting, replication stopped in ..." whenever it previously
outputted "Slave SQL thread initialized, starting replication ...".
Before this patch, it was somewhat inconsistent in which cases the message
would be output and in which not, depending on the exact time and cause of
the condition that caused the SQL thread to stop.
The SQL thread keeps track of the position in the current relay log from
which to read the next event. This position is not normally used, but a
certain interaction with the IO thread can cause the SQL thread to re-open
the relay log and seek to the stored position.
In parallel replication, there were a couple of places where the position
was not updated. This created a race where a re-open of the relay log could
seek to the wrong position and start re-reading and processing events
already handled once, causing various kinds of problems.
Fix this by moving the position update into a single place in
apply_event_and_update_pos(), which should ensure that the position is
always updated in the parallel replication case.
This problem was found from the testcase of MDEV-10863, but it is logically
a separate problem.
This has no functional changes, but it helps avoid merge problems from 10.0
to 10.1. In 10.0, code that checks for parallel replication uses
opt_slave_parallel_threads > 0, but this check needs to be
mi->using_parallel() in 10.1. By using the same check in 10.0 (with
unchanged semantics), merge problems to 10.1 are avoided.
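As a sketch, the 10.0 helper would be little more than the old check
wrapped in a method; the spelling assumes the 10.1 name, and the global
below stands in for the real server variable:

  unsigned long opt_slave_parallel_threads= 0;  // stand-in for the server global

  struct Master_info_sketch
  {
    // Same semantics as the old "opt_slave_parallel_threads > 0" check,
    // spelled the same way as in 10.1 to keep merges clean.
    bool using_parallel() const { return opt_slave_parallel_threads > 0; }
  };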
This occurred when the SQL thread (but not the IO thread) stops while
GTID and parallel replication are used with multiple domain ids in the
GTID position, and is restarted.
In this case, the SQL thread needs to start some way back in the relay log,
applying or skipping events within each replication domain as
appropriate.
The SQL thread starts at the beginning of an old relay log file, and
this position may be in the middle of an event group. The bug was that
such a partial event group could be re-applied, causing replication
corruption.
This patch fixes the issue, by making sure to skip any initial events
that were part of an earlier (already applied) event group.
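A sketch of the skip rule; the names are illustrative, and the real code
also has to deal with format description and rotate events:

  // Illustrative only: when the SQL thread restarts from the beginning of
  // an old relay log file, everything up to the first event-group boundary
  // belongs to a group that was already applied and must be skipped.
  enum event_kind_sketch { GROUP_BOUNDARY, OTHER_EVENT };

  bool skip_initial_event(bool *in_leading_partial_group,
                          event_kind_sketch kind)
  {
    if (!*in_leading_partial_group)
      return false;                    // normal processing resumed
    if (kind == GROUP_BOUNDARY)        // e.g. a GTID event starts a new group
    {
      *in_leading_partial_group= false;
      return false;                    // apply from here on
    }
    return true;                       // still inside the half-applied group
  }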
cherry-pick from 5.7:
commit 6b24763
Author: Manish Kumar <manish.4.kumar@oracle.com>
Date: Tue Mar 27 13:10:42 2012 +0530
BUG#12977988 - ON STOP SLAVE: ERROR READING PACKET FROM SERVER: LOST CONNECTION
TO MYSQL SERVER
BUG#11761457 - ERROR 2013 + "ERROR READING RELAY LOG EVENT" ON STOP SLAVE
Minor review comments/changes:
- A bunch of style-fixes.
- Change macros to static inline functions.
- Update check_event_type() with compressed event types.
- Small .result file update.
Add some event types for the compressed events; they are:
QUERY_COMPRESSED_EVENT,
WRITE_ROWS_COMPRESSED_EVENT_V1,
UPDATE_ROWS_COMPRESSED_EVENT_V1,
DELETE_ROWS_COMPRESSED_EVENT_V1,
WRITE_ROWS_COMPRESSED_EVENT,
UPDATE_ROWS_COMPRESSED_EVENT,
DELETE_ROWS_COMPRESSED_EVENT.
These events inherit from the corresponding uncompressed events. One of their
constructors and the write function are overridden to handle compression and
decompression; everything else stays the same.
On the slave, the IO thread uncompresses and converts the events when it
receives them from the master, so the SQL and worker threads can stay
unchanged.
zlib is currently used as the compression algorithm; other algorithms may be
supported in the future.
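A minimal sketch of the zlib step; buffer handling in the server is more
involved, this only shows the library calls used on the event body:

  #include <zlib.h>
  #include <vector>

  // Compress an event body with zlib; an empty result means "write the
  // event uncompressed instead".
  std::vector<unsigned char> compress_event_body(const unsigned char *src,
                                                 uLong src_len)
  {
    uLongf dst_len= compressBound(src_len);
    std::vector<unsigned char> dst(dst_len);
    if (compress(dst.data(), &dst_len, src, src_len) != Z_OK)
      return {};
    dst.resize(dst_len);
    return dst;
  }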
Merge feature into 10.2 from feature branch.
Delayed replication adds an option
CHANGE MASTER TO master_delay=<seconds>
Replication will then delay applying events by that many
seconds. This creates a replication slave that reflects the state of
the master some time in the past.
Feature is ported from MySQL source tree.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Problem:
When using the delayed slave feature, and the SQL thread is delaying,
and the user issues STOP SLAVE, the event we were waiting for was executed.
It should not have been executed.
Fix:
Check the return value from the delay function,
slave.cc:slave_sleep(). If the return value is 1, it means the thread
has been stopped, in this case we don't execute the statement.
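A self-contained sketch of that decision; the real code sleeps inside
slave_sleep() rather than busy-polling, and the names here are illustrative:

  #include <ctime>

  // Returns true when the delay has fully elapsed and the event may be
  // applied; false when STOP SLAVE arrived while we were still waiting,
  // in which case the event must not be executed.
  bool wait_out_master_delay(time_t event_time, long master_delay,
                             bool (*sql_thread_stopped)())
  {
    const time_t apply_at= event_time + master_delay;
    while (std::time(nullptr) < apply_at)
    {
      if (sql_thread_stopped())
        return false;
    }
    return true;
  }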
Also, refactored the test case for delayed slave a little: added the
test script include/rpl_assert.inc, which asserts that a condition holds
and prints a message if not. Made rpl_delayed_slave.test use this. The
advantage is that the test file is much easier to read and maintain,
because it is clear what is an assertion and what is not, and also the
expected result can be found in the test file, you don't have to compare
it to the result file.
Manually merged into MariaDB from MySQL commit
fd2b210383358fe7697f201e19ac9779879ba72a
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
The original MySQL patch left some refactoring todo's, possibly
because of known conflicts with other parallel development (like
info-repository feature perhaps).
This patch fixes those todos/refactorings.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>