mirror of
https://github.com/MariaDB/server.git
synced 2025-07-30 16:24:05 +03:00
Docs/manual.texi
Updates for BACKUP TABLE/RESTORE TABLE. Added Replication FAQ. Cleaned up TODO list, removing a duplicate and features already implemented. Updated changelog for 3.23.25.

sql/sql_lex.h: Re-added backup_dir to Lex, which disappeared while resolving conflicts.
@@ -334,6 +334,8 @@ MySQL language reference
* CHECK TABLE:: @code{CHECK TABLE} syntax
* ANALYZE TABLE:: @code{ANALYZE TABLE} syntax
* REPAIR TABLE:: @code{REPAIR TABLE} syntax
* BACKUP TABLE:: @code{BACKUP TABLE} syntax
* RESTORE TABLE:: @code{RESTORE TABLE} syntax
* DELETE:: @code{DELETE} syntax
* SELECT:: @code{SELECT} syntax
* JOIN:: @code{JOIN} syntax
@@ -506,6 +508,7 @@ Replication in MySQL
* Replication Features:: Replication Features
* Replication Options:: Replication Options in my.cnf
* Replication SQL:: SQL Commands related to replication
* Replication FAQ:: Frequently asked questions about replication

Getting maximum performance from MySQL
@@ -11861,6 +11864,8 @@ to restart @code{mysqld} with @code{--skip-grant-tables} to be able to run
* CHECK TABLE:: @code{CHECK TABLE} syntax
* ANALYZE TABLE:: @code{ANALYZE TABLE} syntax
* REPAIR TABLE:: @code{REPAIR TABLE} syntax
* BACKUP TABLE:: @code{BACKUP TABLE} syntax
* RESTORE TABLE:: @code{RESTORE TABLE} syntax
* DELETE:: @code{DELETE} syntax
* SELECT:: @code{SELECT} syntax
* JOIN:: @code{JOIN} syntax
@@ -17173,8 +17178,65 @@ The different check types stand for the following:
@item @code{EXTENDED} @tab Do a full key lookup for all keys for each row. This ensures that the table is 100% consistent, but will take a long time!
@end multitable

@findex BACKUP TABLE
@node BACKUP TABLE, RESTORE TABLE, CHECK TABLE, Reference
@section @code{BACKUP TABLE} syntax

@example
BACKUP TABLE tbl_name[,tbl_name...] TO '/path/to/backup/directory'
@end example

Copies to the backup directory the minimum set of table files needed to
restore the table. Currently this works only for @code{MyISAM} tables.
For a @code{MyISAM} table, it copies the @code{.frm} (definition) and
@code{.MYD} (data) files; the index file can be rebuilt from those two.

During the backup, a read lock is held on each table, one at a time, as
it is being backed up. If you want to back up several tables as a
snapshot, you must first issue @code{LOCK TABLES}, obtaining a read
lock on each table in the group.
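
For example, a consistent two-table snapshot can be taken like this
(the table names and target directory here are illustrative):

@example
LOCK TABLES archive READ, payment READ;
BACKUP TABLE archive, payment TO '/var/backup/mysql';
UNLOCK TABLES;
@end example

Without the @code{LOCK TABLES}, each table would still be backed up
consistently on its own, but updates could slip in between the two
table copies.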

The command returns a table with the following columns:

@multitable @columnfractions .35 .65
@item @strong{Column} @tab @strong{Value}
@item Table @tab Table name
@item Op @tab Always ``backup''
@item Msg_type @tab One of @code{status}, @code{error}, @code{info} or @code{warning}.
@item Msg_text @tab The message.
@end multitable

@findex RESTORE TABLE
@node RESTORE TABLE, ANALYZE TABLE, BACKUP TABLE, Reference
@section @code{RESTORE TABLE} syntax

@example
RESTORE TABLE tbl_name[,tbl_name...] FROM '/path/to/backup/directory'
@end example

Restores the table(s) from a backup made with @code{BACKUP TABLE}.
Existing tables are not overwritten; if you try to restore over an
existing table, you will get an error. A restore takes longer than a
backup because the indexes must be rebuilt; the more keys you have, the
longer it will take. Like @code{BACKUP TABLE}, this currently works
only for @code{MyISAM} tables.
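
For example, assuming tables @code{archive} and @code{payment} were
previously backed up to @file{/var/backup/mysql} (illustrative names),
they can be brought back with:

@example
RESTORE TABLE archive, payment FROM '/var/backup/mysql';
@end example

If either table still exists in the database, drop or rename it first,
or the restore will fail with an error.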

The command returns a table with the following columns:

@multitable @columnfractions .35 .65
@item @strong{Column} @tab @strong{Value}
@item Table @tab Table name
@item Op @tab Always ``restore''
@item Msg_type @tab One of @code{status}, @code{error}, @code{info} or @code{warning}.
@item Msg_text @tab The message.
@end multitable

@findex ANALYZE TABLE
-@node ANALYZE TABLE, REPAIR TABLE, CHECK TABLE, Reference
+@node ANALYZE TABLE, REPAIR TABLE, RESTORE TABLE, Reference
@section @code{ANALYZE TABLE} syntax

@example
@@ -23545,6 +23607,7 @@ tables}.
* Replication Features:: Replication Features
* Replication Options:: Replication Options in my.cnf
* Replication SQL:: SQL Commands related to replication
* Replication FAQ:: Frequently Asked Questions about replication
@end menu

@node Replication Intro, Replication Implementation, Replication, Replication
@@ -23818,7 +23881,7 @@ to the binary log

@end multitable

-@node Replication SQL, , Replication Options, Replication
+@node Replication SQL, Replication FAQ, Replication Options, Replication
@section SQL commands related to replication

Replication can be controlled through the SQL interface. Below is the
@@ -23887,6 +23950,228 @@ command line. (Slave)

@end multitable

@node Replication FAQ, , Replication SQL, Replication
@section Replication FAQ

@strong{Q}: Why do I sometimes see more than one @code{Binlog_Dump} thread on
the master after I have restarted the slave?

@strong{A}: @code{Binlog_Dump} is a continuous process that is handled by the
server in the following way:

@itemize
@item
Catch up on the updates.
@item
Once there are no more updates left, go into @code{pthread_cond_wait()},
from which we can be woken up either by an update or a kill.
@item
On wake-up, check the reason; if we are not supposed to die, continue
the @code{Binlog_Dump} loop.
@item
If there is some fatal error, such as detecting a dead client,
terminate the loop.
@end itemize

So if the slave thread stops on the slave, the corresponding
@code{Binlog_Dump} thread on the master will not notice it until after
at least one update to the master (or a kill), which is needed to wake
it up from @code{pthread_cond_wait()}. In the meantime, the slave
could have opened another connection, resulting in another
@code{Binlog_Dump} thread.

Once we add a @code{server_id} variable for each server that
participates in replication, we will fix the @code{Binlog_Dump} thread
to kill all the zombies from the same slave on reconnect.

@strong{Q}: What issues should I be aware of when setting up two-way
replication?

@strong{A}: @strong{MySQL} replication currently does not support any
locking protocol between master and slave to guarantee the atomicity of
a distributed (cross-server) update. In other words, it is possible
for client A to make an update to co-master 1, and in the meantime,
before it propagates to co-master 2, client B could make an update to
co-master 2 that makes the update of client A work differently than
it did on co-master 1. Thus, when the update of client A makes it
to co-master 2, it will produce tables that are different from
what you have on co-master 1, even after all the updates from co-master
2 have also propagated. So you should not chain two servers in a
two-way replication relationship unless you are sure that your updates
can safely happen in any order, or unless you take care of mis-ordered
updates somehow in the client code.

Until we implement the @code{server_id} variable, you cannot have more
than two servers in a co-master replication relationship, and you must
run @code{mysqld} without @code{log-slave-updates} (the default) to avoid
infinite update loops.

You must also realize that two-way replication actually does not improve
performance very much, if at all, as far as updates are concerned. Each
server needs to do the same number of updates as a single server would.
The only difference is that there will be a little less lock contention,
because the updates originating on the other server will be serialized
in one slave thread. Even this benefit might be offset by network
delays.

@strong{Q}: How can I use replication to improve the performance of my
system?

@strong{A}: You should set up one server as the master and direct all
writes to it, then configure as many slaves as you have the money and
rack space for, distributing the reads among the master and the slaves.

@strong{Q}: What should I do to prepare my client code to use
performance-enhancing replication?

@strong{A}:
If the part of your code that is responsible for database access has
been properly abstracted/modularized, converting it to run with a
replicated setup should be very smooth and easy; just change the
implementation of your database access to read from some slave or the
master, and to always write to the master. If your code does not have
this level of abstraction, setting up a replicated system will give you
an opportunity and a motivation to clean it up. You should start by
creating a wrapper library/module with the following functions:

@itemize
@item
@code{safe_writer_connect()}
@item
@code{safe_reader_connect()}
@item
@code{safe_reader_query()}
@item
@code{safe_writer_query()}
@end itemize

@code{safe_} means that the function will take care of handling all
the error conditions.

You should then convert your client code to use the wrapper library.
It may be a painful and scary process at first, but it will pay off in
the long run. All applications that follow the above pattern will be
able to take advantage of a one-master/many-slaves solution. The
code will be a lot easier to maintain, and adding troubleshooting
options will be trivial; you will just need to modify one or two
functions, for example, to log how long each query took, or which
query, among your many thousands, gave you an error. If you have
written a lot of code already, you may want to automate the conversion
task by using Monty's @code{replace} utility, which comes with the
standard distribution of @strong{MySQL}, or just write your own Perl
script. Hopefully, your code follows some recognizable pattern. If
not, then you are probably better off rewriting it anyway, or at least
going through it and manually beating it into a pattern.

Note that, of course, you can use different names for the functions.
What is important is having a unified interface for connecting for
reads, connecting for writes, doing a read, and doing a write.

@strong{Q}: When and how much can @strong{MySQL} replication improve
the performance of my system?

@strong{A}: @strong{MySQL} replication is most beneficial for a system
with frequent reads and infrequent writes. In theory, by using a
one-master/many-slaves setup you can scale by adding more slaves until
you either run out of network bandwidth, or your update load grows to
the point that the master cannot handle it.

In order to determine how many slaves you can add before the extra
benefit begins to level out, and how much you can improve the
performance of your site, you need to know your query patterns and
empirically (by benchmarking) determine the relationship between the
throughput on reads (reads per second, or @code{max_reads}) and on
writes (@code{max_writes}) on a typical master and a typical slave.
The example below shows a rather simplified calculation of what you can
get with replication for an imagined system.

Let's say our system load consists of 10% writes and 90% reads, and we
have determined that @code{max_reads} = 1200 - 2 * @code{max_writes};
in other words, our system can do 1200 reads per second with no writes,
our average write is twice as slow as our average read, and the
relationship is linear. Let us suppose that our master and slave are of
the same capacity, and that we have N slaves and 1 master. Then we have
for each server (master or slave):

@code{reads = 1200 - 2 * writes} (from benchmarks)

@code{reads = 9 * writes / (N + 1)} (reads are split, but writes go
to all servers)

@code{9 * writes/(N+1) + 2 * writes = 1200}

@code{writes = 1200/(2 + 9/(N+1))}

So if N = 0, which means we have no replication, our system can handle
1200/11, about 109, writes per second (which means we will also have 9
times as many reads, due to the nature of our application).

If N = 1, we can get up to 184 writes per second.

If N = 8, we get up to 400.

If N = 17, 480 writes.

Eventually, as N approaches infinity (and our budget negative
infinity), we can get very close to 600 writes per second, increasing
system throughput about 5.5 times. However, with only 8 servers we
have already increased it almost 4 times.
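
Collecting these numbers in one place, the sustainable write rate as a
function of the number of slaves N follows directly from the last
equation (values rounded down):

@example
writes(N)  = 1200 / (2 + 9/(N+1))

writes(0)  = 1200 / 11   = 109 writes/second
writes(1)  = 1200 / 6.5  = 184 writes/second
writes(8)  = 1200 / 3    = 400 writes/second
writes(17) = 1200 / 2.5  = 480 writes/second
@end example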

Note that our computations assumed infinite network bandwidth and
neglected several other factors that could turn out to be significant
on your system. In many cases, you may not be able to make a
computation similar to the one above that will accurately predict what
will happen on your system if you add N replication slaves. However,
answering the following questions should help you decide whether, and
how much, replication will improve the performance of your system:

@itemize
@item
What is the read/write ratio on your system?
@item
How much more write load can one server handle if you reduce the reads?
@item
How many slaves do you have bandwidth for on your network?
@end itemize

@strong{Q}: How can I use replication to provide redundancy/high
availability?

@strong{A}: With the currently available features, you would have to
set up a master and a slave (or several slaves), and write a script
that will monitor the master to see whether it is up, and will instruct
your applications and the slaves to change to a new master in case of
failure. Some suggestions:

@itemize
@item
To tell a slave to change its master, use the @code{CHANGE MASTER TO}
command.
@item
A good way to keep your applications informed of where the master is
is to have a dynamic DNS entry for the master. With @strong{bind} you
can use @code{nsupdate} to update your DNS dynamically.
@item
You should run your slaves with the @code{log-bin} option and without
@code{log-slave-updates}. This way the slave will be ready to become a
master as soon as you issue @code{STOP SLAVE}; @code{FLUSH MASTER}, and
@code{CHANGE MASTER TO} on the other slaves. It will also help you
catch spurious updates that may happen because of misconfiguration of
the slave (ideally, you want to configure access rights so that no
client can update the slave, except for the slave thread) combined with
bugs in your client programs (they should never update the slave
directly).
@end itemize
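
The promotion sequence described in the last item can be written out
roughly as follows (the host name is illustrative, and the full option
list of @code{CHANGE MASTER TO} is given in the replication SQL
section):

@example
# On the slave that is being promoted to master:
STOP SLAVE;
FLUSH MASTER;

# On each of the remaining slaves:
CHANGE MASTER TO MASTER_HOST='new-master.example.com';
@end example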

We are currently working on integrating an automatic master election
system into @strong{MySQL}, but until it is ready, you will have to
create your own monitoring tools.

@cindex Performance
@cindex Optimization
@node Performance, MySQL Benchmarks, Replication, Top
@@ -36291,6 +36576,14 @@ though, so 3.23 is not released as a stable version yet.
@appendixsubsec Changes in release 3.23.25
@itemize @bullet
@item
Fixed wrong time for @code{Connect} state of the slave thread in
@code{processlist}
@item
Added logging to @code{--log} on the slave of successful connects to
the master
@item
Added @code{BACKUP TABLE} and @code{RESTORE TABLE}
@item
@code{HEAP} tables didn't use keys properly. (Bug from 3.23.23)
@item
Added better support for @code{MERGE} tables (keys, mapping, creation,
@@ -36421,7 +36714,7 @@ Fixed @code{INSERT INTO bdb_table ... SELECT} to work with BDB tables.
@code{CHECK TABLE} now updates key statistics for the table.
@item
@code{ANALYZE TABLE} will now only update tables that have been changed
-since thee last @code{ANALYZE}. Note that this is a new feature and tables
+since the last @code{ANALYZE}. Note that this is a new feature and tables
will not be marked to be analyzed until they are updated in any way with
3.23.23 or newer. For older tables, you have to do @code{CHECK TABLE}
to update the key distribution.
@@ -40748,10 +41041,8 @@ show columns from t2;
@item
Implement function: @code{get_changed_tables(timeout,table1,table2,...)}
@item
Atomic updates; This includes a language that one can even use for
a set of stored procedures.
@item
-@code{update items,month set items.price=month.price where items.id=month.id;}
+Atomic multi-table updates, eg @code{update items,month set
+items.price=month.price where items.id=month.id;};
@item
Change reading through tables to use memmap when possible. Now only
compressed tables use memmap.
@@ -40835,8 +41126,6 @@ Use @code{NULL} instead.
@item
Add full support for @code{JOIN} with parentheses.
@item
Reuse threads for systems with a lot of connections.
@item
As an alternative to one thread per connection, manage a pool of
threads to handle the queries.
@item
@@ -126,6 +126,7 @@ typedef struct st_lex {
  HA_CHECK_OPT check_opt; // check/repair options
  HA_CREATE_INFO create_info;
  LEX_MASTER_INFO mi; // used by CHANGE MASTER
+ char* backup_dir; // used by RESTORE/BACKUP
  ulong thread_id,type;
  ulong options;
  enum_sql_command sql_command;