mirror of
https://github.com/postgres/postgres.git
synced 2025-05-05 09:19:17 +03:00
Add text to "Populating a Database" pointing out that bulk data load into a
table with foreign key constraints eats memory. Per off-line discussion of bug #5480 with its reporter. Also do some minor wordsmithing elsewhere in the same section.
parent d800b036d2
commit 63f591e969
@ -1,4 +1,4 @@
|
|||||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.79 2010/04/28 21:23:29 tgl Exp $ -->
|
<!-- $PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.80 2010/05/29 21:08:04 tgl Exp $ -->
|
||||||
|
|
||||||
<chapter id="performance-tips">
|
<chapter id="performance-tips">
|
||||||
<title>Performance Tips</title>
|
<title>Performance Tips</title>
|
||||||
@ -870,11 +870,11 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
|
|||||||
|
|
||||||
<para>
|
<para>
|
||||||
If you are adding large amounts of data to an existing table,
|
If you are adding large amounts of data to an existing table,
|
||||||
it might be a win to drop the index,
|
it might be a win to drop the indexes,
|
||||||
load the table, and then recreate the index. Of course, the
|
load the table, and then recreate the indexes. Of course, the
|
||||||
database performance for other users might suffer
|
database performance for other users might suffer
|
||||||
during the time the index is missing. One should also think
|
during the time the indexes are missing. One should also think
|
||||||
twice before dropping unique indexes, since the error checking
|
twice before dropping a unique index, since the error checking
|
||||||
afforded by the unique constraint will be lost while the index is
|
afforded by the unique constraint will be lost while the index is
|
||||||
missing.
|
missing.
|
||||||
</para>
|
</para>
|
||||||
@ -890,6 +890,19 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
|
|||||||
the constraints. Again, there is a trade-off between data load
|
the constraints. Again, there is a trade-off between data load
|
||||||
speed and loss of error checking while the constraint is missing.
|
speed and loss of error checking while the constraint is missing.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
|
<para>
|
||||||
|
What's more, when you load data into a table with existing foreign key
|
||||||
|
constraints, each new row requires an entry in the server's list of
|
||||||
|
pending trigger events (since it is the firing of a trigger that checks
|
||||||
|
the row's foreign key constraint). Loading many millions of rows can
|
||||||
|
cause the trigger event queue to overflow available memory, leading to
|
||||||
|
intolerable swapping or even outright failure of the command. Therefore
|
||||||
|
it may be <emphasis>necessary</>, not just desirable, to drop and re-apply
|
||||||
|
foreign keys when loading large amounts of data. If temporarily removing
|
||||||
|
the constraint isn't acceptable, the only other recourse may be to split
|
||||||
|
up the load operation into smaller transactions.
|
||||||
|
</para>
|
||||||
</sect2>
|
</sect2>
|
||||||
|
|
||||||
<sect2 id="populate-work-mem">
|
<sect2 id="populate-work-mem">
|
||||||
@ -930,11 +943,11 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
|
|||||||
When loading large amounts of data into an installation that uses
|
When loading large amounts of data into an installation that uses
|
||||||
WAL archiving or streaming replication, it might be faster to take a
|
WAL archiving or streaming replication, it might be faster to take a
|
||||||
new base backup after the load has completed than to process a large
|
new base backup after the load has completed than to process a large
|
||||||
amount of incremental WAL data. You might want to disable archiving
|
amount of incremental WAL data. To prevent incremental WAL logging
|
||||||
and streaming replication while loading, by setting
|
while loading, disable archiving and streaming replication, by setting
|
||||||
<xref linkend="guc-wal-level"> to <literal>minimal</>,
|
<xref linkend="guc-wal-level"> to <literal>minimal</>,
|
||||||
<xref linkend="guc-archive-mode"> <literal>off</>, and
|
<xref linkend="guc-archive-mode"> to <literal>off</>, and
|
||||||
<xref linkend="guc-max-wal-senders"> to zero).
|
<xref linkend="guc-max-wal-senders"> to zero.
|
||||||
But note that changing these settings requires a server restart.
|
But note that changing these settings requires a server restart.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
@ -1006,7 +1019,8 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
|
|||||||
<application>pg_dump</> dump as quickly as possible, you need to
|
<application>pg_dump</> dump as quickly as possible, you need to
|
||||||
do a few extra things manually. (Note that these points apply while
|
do a few extra things manually. (Note that these points apply while
|
||||||
<emphasis>restoring</> a dump, not while <emphasis>creating</> it.
|
<emphasis>restoring</> a dump, not while <emphasis>creating</> it.
|
||||||
The same points apply when using <application>pg_restore</> to load
|
The same points apply whether loading a text dump with
|
||||||
|
<application>psql</> or using <application>pg_restore</> to load
|
||||||
from a <application>pg_dump</> archive file.)
|
from a <application>pg_dump</> archive file.)
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
@ -1027,10 +1041,11 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
|
|||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
If using WAL archiving or streaming replication, consider disabling
|
If using WAL archiving or streaming replication, consider disabling
|
||||||
them during the restore. To do that, set <varname>archive_mode</> off,
|
them during the restore. To do that, set <varname>archive_mode</>
|
||||||
|
to <literal>off</>,
|
||||||
<varname>wal_level</varname> to <literal>minimal</>, and
|
<varname>wal_level</varname> to <literal>minimal</>, and
|
||||||
<varname>max_wal_senders</> zero before loading the dump script,
|
<varname>max_wal_senders</> to zero before loading the dump.
|
||||||
and afterwards set them back to the right values and take a fresh
|
Afterwards, set them back to the right values and take a fresh
|
||||||
base backup.
|
base backup.
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
@ -1045,9 +1060,13 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
|
|||||||
interrelated the data is, that might seem preferable to manual cleanup,
|
interrelated the data is, that might seem preferable to manual cleanup,
|
||||||
or not. <command>COPY</> commands will run fastest if you use a single
|
or not. <command>COPY</> commands will run fastest if you use a single
|
||||||
transaction and have WAL archiving turned off.
|
transaction and have WAL archiving turned off.
|
||||||
<application>pg_restore</> also has a <option>--jobs</> option
|
</para>
|
||||||
which allows concurrent data loading and index creation, and has
|
</listitem>
|
||||||
the performance advantages of doing COPY in a single transaction.
|
<listitem>
|
||||||
|
<para>
|
||||||
|
If multiple CPUs are available in the database server, consider using
|
||||||
|
<application>pg_restore</>'s <option>--jobs</> option. This
|
||||||
|
allows concurrent data loading and index creation.
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
<listitem>
|
<listitem>
|
||||||
|
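The paragraph added in this commit names splitting the load into smaller transactions as the fallback when foreign keys cannot be dropped. The following is a minimal sketch of that pattern, not part of the patch. It assumes SQLite (via Python's standard `sqlite3` module) as a self-contained stand-in for a PostgreSQL connection, and the `parent`/`child` tables, the `load_in_batches` helper, and the batch size of 1000 are all hypothetical. SQLite does not have PostgreSQL's pending trigger event queue, so this only illustrates the batching structure, not the memory behaviour.

```python
import sqlite3
from itertools import islice


def batched(rows, size):
    """Yield successive lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def load_in_batches(conn, rows, batch_size=1000):
    """Insert rows into the hypothetical `child` table, one transaction per batch.

    Returns the number of transactions committed.
    """
    commits = 0
    for chunk in batched(rows, batch_size):
        # The connection context manager commits on success and rolls back
        # on error, so each batch is checked and committed on its own.
        with conn:
            conn.executemany(
                "INSERT INTO child (id, parent_id) VALUES (?, ?)", chunk
            )
        commits += 1
    return commits


# Stand-in schema: `child.parent_id` carries a foreign key to `parent.id`.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE child ("
    " id INTEGER PRIMARY KEY,"
    " parent_id INTEGER NOT NULL REFERENCES parent(id))"
)
with conn:
    conn.executemany("INSERT INTO parent (id) VALUES (?)", [(i,) for i in range(10)])

# 2500 rows streamed from a generator, loaded 1000 rows per transaction.
rows = ((i, i % 10) for i in range(2500))
n_commits = load_in_batches(conn, rows, batch_size=1000)
n_rows = conn.execute("SELECT COUNT(*) FROM child").fetchone()[0]
```

Feeding the rows from a generator keeps the client from holding the whole data set in memory. Against PostgreSQL the same loop structure would apply with a PostgreSQL driver, with the batch size chosen so that the per-transaction trigger event queue stays within available memory.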