1
0
mirror of https://github.com/MariaDB/server.git synced 2025-07-27 18:02:13 +03:00

BUG#40337 Fsyncing master and relay log to disk after every event is too slow

NOTE: Backporting the patch to next-mr.
      
The fix proposed in BUG#35542 and BUG#31665 introduces a performance issue
when fsyncing the master.info, relay.info and relay-log.bin* after #th events.
Although such solution has been proposed to reduce the probability of corrupted
files due to a slave-crash, the performance penalty introduced by it has
made the approach impractical for highly intensive workloads.
      
In a nutshell, the option --syn-relay-log proposed in BUG#35542 and BUG#31665
simultaneously fsyncs master.info, relay-log.info and relay-log.bin* and
this is the main source of performance issues.
      
This patch introduces new options that give more control to the user on
what should be fsynced and how often:
      
   1) (--sync-master-info, integer) which syncs the master.info after #th event;
   2) (--sync-relay-log, integer) which syncs the relay-log.bin* after #th
   events.
   3) (--sync-relay-log-info, integer) which syncs the relay.info after #th
   transactions.
      
   To provide both performance and increased reliability, we recommend the following
   setup:
      
   1) --sync-master-info = 0 eventually the operating system will fsync it;
   2) --sync-relay-log = 0 eventually the operating system will fsync it;
   3) --sync-relay-log-info = 1 fsyncs it after every transaction;
      
Notice, that the previous setup does not reduce the probability of
corrupted master.info and relay-log.bin*. To overcome the issue, this patch also
introduces a recovery mechanism that right after restart throws away relay-log.bin*
retrieved from a master and updates the master.info based on the relay.info:
      
      
   4) (--relay-log-recovery, boolean) which enables a recovery mechanism that
   throws away relay-log.bin* after a crash.
      
However, it can only recover the incorrect binlog file and position in master.info,
if other informations (host, port password, etc) are corrupted or incorrect,
then this recovery mechanism will fail to work.
This commit is contained in:
Alfranio Correia
2009-09-29 15:40:52 +01:00
parent 4e0cb6dbb7
commit a48ff22004
15 changed files with 364 additions and 31 deletions

View File

@ -1769,6 +1769,16 @@ static sys_var_const sys_relay_log_info_file(&vars, "relay_log_info_file",
(uchar*) &relay_log_info_file);
static sys_var_bool_ptr sys_relay_log_purge(&vars, "relay_log_purge",
&relay_log_purge);
static sys_var_bool_ptr sys_relay_log_recovery(&vars, "relay_log_recovery",
&relay_log_recovery);
static sys_var_uint_ptr sys_sync_binlog_period(&vars, "sync_binlog",
&sync_binlog_period);
static sys_var_uint_ptr sys_sync_relaylog_period(&vars, "sync_relay_log",
&sync_relaylog_period);
static sys_var_uint_ptr sys_sync_relayloginfo_period(&vars, "sync_relay_log_info",
&sync_relayloginfo_period);
static sys_var_uint_ptr sys_sync_masterinfo_period(&vars, "sync_master_info",
&sync_masterinfo_period);
static sys_var_const sys_relay_log_space_limit(&vars,
"relay_log_space_limit",
OPT_GLOBAL, SHOW_LONGLONG,
@ -1784,8 +1794,6 @@ static sys_var_const sys_slave_skip_errors(&vars, "slave_skip_errors",
(uchar*) slave_skip_error_names);
static sys_var_long_ptr sys_slave_trans_retries(&vars, "slave_transaction_retries",
&slave_trans_retries);
static sys_var_int_ptr sys_sync_binlog_period(&vars, "sync_binlog", &sync_binlog_period);
static sys_var_int_ptr sys_sync_relaylog_period(&vars, "sync_relay_log", &sync_relaylog_period);
static sys_var_slave_skip_counter sys_slave_skip_counter(&vars, "sql_slave_skip_counter");