mirror of https://github.com/postgres/postgres.git
synced 2025-05-21 15:54:08 +03:00
Improvements to the backup & restore documentation.
This commit is contained in:
parent e3391133ae
commit 2ff4e44043
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -1,5 +1,5 @@
 <!--
-$PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.38 2004/03/09 16:57:46 neilc Exp $
+$PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.39 2004/04/22 07:02:35 neilc Exp $
 -->
 <chapter id="backup">
 <title>Backup and Restore</title>
@@ -30,7 +30,7 @@ $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.38 2004/03/09 16:57:46 neilc Exp
 commands that, when fed back to the server, will recreate the
 database in the same state as it was at the time of the dump.
 <productname>PostgreSQL</> provides the utility program
-<application>pg_dump</> for this purpose. The basic usage of this
+<xref linkend="app-pgdump"> for this purpose. The basic usage of this
 command is:
 <synopsis>
 pg_dump <replaceable class="parameter">dbname</replaceable> > <replaceable class="parameter">outfile</replaceable>
@@ -126,10 +126,11 @@ psql <replaceable class="parameter">dbname</replaceable> < <replaceable class
 </para>
 
 <para>
-Once restored, it is wise to run <command>ANALYZE</> on each
-database so the optimizer has useful statistics. You
-can also run <command>vacuumdb -a -z</> to <command>ANALYZE</> all
-databases.
+Once restored, it is wise to run <xref linkend="sql-analyze"
+endterm="sql-analyze-title"> on each database so the optimizer has
+useful statistics. You can also run <command>vacuumdb -a -z</> to
+<command>VACUUM ANALYZE</> all databases; this is equivalent to
+running <command>VACUUM ANALYZE</command> manually.
 </para>
 
 <para>
@@ -153,13 +154,11 @@ pg_dump -h <replaceable>host1</> <replaceable>dbname</> | psql -h <replaceable>h
 </para>
 </important>
 
-<tip>
 <para>
-Restore performance can be improved by increasing the
-configuration parameter <xref
-linkend="guc-maintenance-work-mem">.
+For advice on how to load large amounts of data into
+<productname>PostgreSQL</productname> efficiently, refer to <xref
+linkend="populate">.
 </para>
-</tip>
 </sect2>
 
 <sect2 id="backup-dump-all">
@@ -167,12 +166,11 @@ pg_dump -h <replaceable>host1</> <replaceable>dbname</> | psql -h <replaceable>h
 
 <para>
 The above mechanism is cumbersome and inappropriate when backing
-up an entire database cluster. For this reason the
-<application>pg_dumpall</> program is provided.
+up an entire database cluster. For this reason the <xref
+linkend="app-pg-dumpall"> program is provided.
 <application>pg_dumpall</> backs up each database in a given
-cluster, and also preserves cluster-wide data such as
-users and groups. The call sequence for
-<application>pg_dumpall</> is simply
+cluster, and also preserves cluster-wide data such as users and
+groups. The basic usage of this command is:
 <synopsis>
 pg_dumpall > <replaceable>outfile</>
 </synopsis>
@@ -195,7 +193,7 @@ psql template1 < <replaceable class="parameter">infile</replaceable>
 Since <productname>PostgreSQL</productname> allows tables larger
 than the maximum file size on your system, it can be problematic
 to dump such a table to a file, since the resulting file will likely
-be larger than the maximum size allowed by your system. As
+be larger than the maximum size allowed by your system. Since
 <application>pg_dump</> can write to the standard output, you can
 just use standard Unix tools to work around this possible problem.
 </para>
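The "standard Unix tools" workaround that this hunk refers to can be sketched as follows. This is a stand-in demonstration with no PostgreSQL server assumed: `seq` generates a large stream in place of real `pg_dump` output, and the real commands appear only in comments.

```shell
# Sketch of working around file-size limits with standard Unix tools.
# `seq` stands in for `pg_dump dbname`, which writes to standard output.
cd "$(mktemp -d)"
seq 1 100000 > original.sql                 # stand-in for pg_dump output

# Compressed dump:          pg_dump dbname | gzip > dump.gz
gzip -c original.sql > dump.gz

# Dump split into chunks:   pg_dump dbname | split -b 100k - chunk_
split -b 100k original.sql chunk_

# Restore side:             cat chunk_* | psql dbname
cat chunk_* > recombined.sql
cmp -s original.sql recombined.sql && echo "split/recombine ok"
gunzip -c dump.gz | cmp -s - original.sql && echo "gzip round trip ok"
```

The point is that because `pg_dump` writes a single stream to standard output, any stream filter (`gzip`, `split`, `ssh`, and so on) can sit between the dump and the file system.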
@@ -274,7 +272,7 @@ pg_dump -Fc <replaceable class="parameter">dbname</replaceable> > <replaceable c
 For reasons of backward compatibility, <application>pg_dump</>
 does not dump large objects by default.<indexterm><primary>large
 object</primary><secondary>backup</secondary></indexterm> To dump
-large objects you must use either the custom or the TAR output
+large objects you must use either the custom or the tar output
 format, and use the <option>-b</> option in
 <application>pg_dump</>. See the reference pages for details. The
 directory <filename>contrib/pg_dumplo</> of the
@@ -315,11 +313,12 @@ tar -cf backup.tar /usr/local/pgsql/data
 <para>
 The database server <emphasis>must</> be shut down in order to
 get a usable backup. Half-way measures such as disallowing all
-connections will not work as there is always some buffering
-going on. Information about stopping the server can be
-found in <xref linkend="postmaster-shutdown">. Needless to say
-that you also need to shut down the server before restoring the
-data.
+connections will <emphasis>not</emphasis> work
+(<command>tar</command> and similar tools do not take an atomic
+snapshot of the state of the filesystem at a point in
+time). Information about stopping the server can be found in
+<xref linkend="postmaster-shutdown">. Needless to say that you
+also need to shut down the server before restoring the data.
 </para>
 </listitem>
 
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -1,5 +1,5 @@
 <!--
-$PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.43 2004/03/25 18:57:57 tgl Exp $
+$PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.44 2004/04/22 07:02:36 neilc Exp $
 -->
 
 <chapter id="performance-tips">
@@ -28,8 +28,8 @@ $PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.43 2004/03/25 18:57:57 tgl Exp
 plan</firstterm> for each query it is given. Choosing the right
 plan to match the query structure and the properties of the data
 is absolutely critical for good performance. You can use the
-<command>EXPLAIN</command> command to see what query plan the system
-creates for any query.
+<xref linkend="sql-explain" endterm="sql-explain-title"> command
+to see what query plan the system creates for any query.
 Plan-reading is an art that deserves an extensive tutorial, which
 this is not; but here is some basic information.
 </para>
@@ -638,30 +638,51 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
 </indexterm>
 
 <para>
-Turn off autocommit and just do one commit at
-the end. (In plain SQL, this means issuing <command>BEGIN</command>
-at the start and <command>COMMIT</command> at the end. Some client
-libraries may do this behind your back, in which case you need to
-make sure the library does it when you want it done.)
-If you allow each insertion to be committed separately,
-<productname>PostgreSQL</productname> is doing a lot of work for each
-row that is added.
-An additional benefit of doing all insertions in one transaction
-is that if the insertion of one row were to fail then the
-insertion of all rows inserted up to that point would be rolled
-back, so you won't be stuck with partially loaded data.
+Turn off autocommit and just do one commit at the end. (In plain
+SQL, this means issuing <command>BEGIN</command> at the start and
+<command>COMMIT</command> at the end. Some client libraries may
+do this behind your back, in which case you need to make sure the
+library does it when you want it done.) If you allow each
+insertion to be committed separately,
+<productname>PostgreSQL</productname> is doing a lot of work for
+each row that is added. An additional benefit of doing all
+insertions in one transaction is that if the insertion of one row
+were to fail then the insertion of all rows inserted up to that
+point would be rolled back, so you won't be stuck with partially
+loaded data.
+</para>
+
+<para>
+If you are issuing a large sequence of <command>INSERT</command>
+commands to bulk load some data, also consider using <xref
+linkend="sql-prepare" endterm="sql-prepare-title"> to create a
+prepared <command>INSERT</command> statement. Since you are
+executing the same command multiple times, it is more efficient to
+prepare the command once and then use <command>EXECUTE</command>
+as many times as required.
 </para>
 </sect2>
 
 <sect2 id="populate-copy-from">
-<title>Use <command>COPY FROM</command></title>
+<title>Use <command>COPY</command></title>
 
 <para>
-Use <command>COPY FROM STDIN</command> to load all the rows in one
-command, instead of using a series of <command>INSERT</command>
-commands. This reduces parsing, planning, etc. overhead a great
-deal. If you do this then it is not necessary to turn off
-autocommit, since it is only one command anyway.
+Use <xref linkend="sql-copy" endterm="sql-copy-title"> to load
+all the rows in one command, instead of using a series of
+<command>INSERT</command> commands. The <command>COPY</command>
+command is optimized for loading large numbers of rows; it is less
+flexible than <command>INSERT</command>, but incurs significantly
+less overhead for large data loads. Since <command>COPY</command>
+is a single command, there is no need to disable autocommit if you
+use this method to populate a table.
+</para>
+
+<para>
+Note that loading a large number of rows using
+<command>COPY</command> is almost always faster than using
+<command>INSERT</command>, even if multiple
+<command>INSERT</command> commands are batched into a single
+transaction.
 </para>
 </sect2>
 
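The bulk-loading advice in this hunk (one transaction, a prepared `INSERT`, or a single `COPY`) can be sketched from the shell side. This is a hedged sketch: `mydb` and the table `items(id, name)` are hypothetical names of mine, and the `psql` invocations appear only as comments since no running server is assumed here.

```shell
# Shell-side sketch of a COPY bulk load. `mydb` and `items(id, name)`
# are hypothetical; the psql commands are comments because no server
# is assumed to be running.
cd "$(mktemp -d)"
printf '1,first\n2,second\n3,third\n' > items.csv

# COPY's default text format expects tab-separated columns:
tr ',' '\t' < items.csv > items.tsv

# One COPY command loads every row, instead of one INSERT per row:
#   psql mydb -c 'COPY items (id, name) FROM STDIN;' < items.tsv
#
# If you must use INSERTs, wrap them in a single transaction and
# prepare the statement once, as the patch suggests:
#   psql mydb <<'SQL'
#   BEGIN;
#   PREPARE bulk_ins (int, text) AS INSERT INTO items VALUES ($1, $2);
#   EXECUTE bulk_ins(1, 'first');
#   EXECUTE bulk_ins(2, 'second');
#   EXECUTE bulk_ins(3, 'third');
#   COMMIT;
#   SQL
grep -c '' items.tsv     # number of rows ready to load in one command
```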
@@ -678,11 +699,12 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
 
 <para>
 If you are augmenting an existing table, you can drop the index,
-load the table, then recreate the index. Of
-course, the database performance for other users may be adversely
-affected during the time that the index is missing. One should also
-think twice before dropping unique indexes, since the error checking
-afforded by the unique constraint will be lost while the index is missing.
+load the table, and then recreate the index. Of course, the
+database performance for other users may be adversely affected
+during the time that the index is missing. One should also think
+twice before dropping unique indexes, since the error checking
+afforded by the unique constraint will be lost while the index is
+missing.
 </para>
 </sect2>
 
@@ -701,16 +723,39 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
 </para>
 </sect2>
 
+<sect2 id="populate-checkpoint-segments">
+<title>Increase <varname>checkpoint_segments</varname></title>
+
+<para>
+Temporarily increasing the <xref
+linkend="guc-checkpoint-segments"> configuration variable can also
+make large data loads faster. This is because loading a large
+amount of data into <productname>PostgreSQL</productname> can
+cause checkpoints to occur more often than the normal checkpoint
+frequency (specified by the <varname>checkpoint_timeout</varname>
+configuration variable). Whenever a checkpoint occurs, all dirty
+pages must be flushed to disk. By increasing
+<varname>checkpoint_segments</varname> temporarily during bulk
+data loads, the number of checkpoints that are required can be
+reduced.
+</para>
+</sect2>
+
 <sect2 id="populate-analyze">
 <title>Run <command>ANALYZE</command> Afterwards</title>
 
 <para>
-It's a good idea to run <command>ANALYZE</command> or <command>VACUUM
-ANALYZE</command> anytime you've added or updated a lot of data,
-including just after initially populating a table. This ensures that
-the planner has up-to-date statistics about the table. With no statistics
-or obsolete statistics, the planner may make poor choices of query plans,
-leading to bad performance on queries that use your table.
+Whenever you have significantly altered the distribution of data
+within a table, running <xref linkend="sql-analyze"
+endterm="sql-analyze-title"> is strongly recommended. This
+includes when bulk loading large amounts of data into
+<productname>PostgreSQL</productname>. Running
+<command>ANALYZE</command> (or <command>VACUUM ANALYZE</command>)
+ensures that the planner has up-to-date statistics about the
+table. With no statistics or obsolete statistics, the planner may
+make poor decisions during query planning, leading to poor
+performance on any tables with inaccurate or nonexistent
+statistics.
 </para>
 </sect2>
 </sect1>
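The `checkpoint_segments` advice added in the last hunk amounts to a temporary `postgresql.conf` change. A sketch with illustrative values (the specific numbers below are mine, not the patch's; tune them for your workload and restore your normal setting once the load finishes):

```
# postgresql.conf -- temporary settings while bulk loading
# (illustrative values only)
checkpoint_segments = 30    # WAL segments between checkpoints; default of this era was 3
checkpoint_timeout = 300    # seconds between forced checkpoints
```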