mirror of https://github.com/postgres/postgres.git
Restructure performance tips into a single chapter ('populating a
database' was way too small to make a chapter). Add a section about using JOIN syntax to direct the planner.
parent 85934d1bdb
commit ad4fd77a5f

doc/src/sgml/filelist.sgml
@@ -1,4 +1,4 @@
-<!-- $Header: /cvsroot/pgsql/doc/src/sgml/filelist.sgml,v 1.2 2000/12/14 22:30:56 petere Exp $ -->
+<!-- $Header: /cvsroot/pgsql/doc/src/sgml/filelist.sgml,v 1.3 2000/12/16 02:29:36 tgl Exp $ -->
 <!entity about SYSTEM "about.sgml">
 <!entity history SYSTEM "history.sgml">
@@ -26,11 +26,10 @@
 <!entity inherit SYSTEM "inherit.sgml">
 <!entity manage SYSTEM "manage.sgml">
 <!entity mvcc SYSTEM "mvcc.sgml">
-<!entity plan SYSTEM "plan.sgml">
+<!entity perform SYSTEM "perform.sgml">
 <!entity plperl SYSTEM "plperl.sgml">
 <!entity plsql SYSTEM "plsql.sgml">
 <!entity pltcl SYSTEM "pltcl.sgml">
-<!entity populate SYSTEM "populate.sgml">
 <!entity psql SYSTEM "psql.sgml">
 <!entity query-ug SYSTEM "query-ug.sgml">
 <!entity storage SYSTEM "storage.sgml">

doc/src/sgml/perform.sgml (new file, 440 lines)
@@ -0,0 +1,440 @@
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/perform.sgml,v 1.1 2000/12/16 02:29:36 tgl Exp $
-->

<chapter id="performance-tips">
<title>Performance Tips</title>

<para>
Query performance can be affected by many things. Some of these can
be manipulated by the user, while others are fundamental to the underlying
design of the system. This chapter provides some hints about understanding
and tuning <productname>Postgres</productname> performance.
</para>

<sect1 id="using-explain">
<title>Using <command>EXPLAIN</command></title>

<note>
<title>Author</title>
<para>
Written by Tom Lane, from e-mail dated 2000-03-27.
</para>
</note>

<para>
<productname>Postgres</productname> devises a <firstterm>query
plan</firstterm> for each query it is given. Choosing the right
plan to match the query structure and the properties of the data
is absolutely critical for good performance. You can use the
<command>EXPLAIN</command> command to see what query plan the system
creates for any query. Unfortunately,
plan-reading is an art that deserves a tutorial, and I haven't
had time to write one. Here is some quick & dirty explanation.
</para>

<para>
The numbers that are currently quoted by EXPLAIN are:

<itemizedlist>
<listitem>
<para>
Estimated start-up cost (time expended before output scan can start,
e.g., time to do the sorting in a SORT node).
</para>
</listitem>

<listitem>
<para>
Estimated total cost (if all tuples are retrieved, which they may not
be --- a query with a LIMIT will stop short of paying the total cost,
for example).
</para>
</listitem>

<listitem>
<para>
Estimated number of rows output by this plan node (again, without
regard for any LIMIT).
</para>
</listitem>

<listitem>
<para>
Estimated average width (in bytes) of rows output by this plan
node.
</para>
</listitem>
</itemizedlist>
</para>
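
<para>
(In the examples below, these four numbers appear after each plan node
in the form "cost=<replaceable>startup</replaceable>..<replaceable>total</replaceable> rows=<replaceable>N</replaceable> width=<replaceable>W</replaceable>".)
</para>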

<para>
The costs are measured in units of disk page fetches. (CPU effort
estimates are converted into disk-page units using some
fairly arbitrary fudge-factors. If you want to experiment with these
factors, see the list of run-time configuration parameters in the
<citetitle>Administrator's Guide</citetitle>.)
</para>

<para>
It's important to note that the cost of an upper-level node includes
the cost of all its child nodes. It's also important to realize that
the cost only reflects things that the planner/optimizer cares about.
In particular, the cost does not consider the time spent transmitting
result tuples to the frontend --- which could be a pretty dominant
factor in the true elapsed time, but the planner ignores it because
it cannot change it by altering the plan. (Every correct plan will
output the same tuple set, we trust.)
</para>

<para>
Rows output is a little tricky because it is <emphasis>not</emphasis>
the number of rows
processed/scanned by the query --- it is usually less, reflecting the
estimated selectivity of any WHERE-clause constraints that are being
applied at this node. Ideally the top-level rows estimate will
approximate the number of rows actually returned, updated, or deleted
by the query (again, without considering the effects of LIMIT).
</para>

<para>
Average width is pretty bogus because the thing really doesn't have
any idea of the average length of variable-length columns. I'm thinking
about improving that in the future, but it may not be worth the trouble,
because the width isn't used for very much.
</para>

<para>
Here are some examples (using the regress test database after a
vacuum analyze, and almost-7.0 sources):

<programlisting>
regression=# explain select * from tenk1;
NOTICE:  QUERY PLAN:

Seq Scan on tenk1  (cost=0.00..333.00 rows=10000 width=148)
</programlisting>
</para>

<para>
This is about as straightforward as it gets. If you do

<programlisting>
select * from pg_class where relname = 'tenk1';
</programlisting>

you'll find out that tenk1 has 233 disk
pages and 10000 tuples. So the cost is estimated at 233 block
reads, defined as 1.0 apiece, plus 10000 * cpu_tuple_cost which is
currently 0.01 (try <command>show cpu_tuple_cost</command>).
</para>
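
<para>
(You can check the arithmetic: 233 page reads * 1.0 + 10000 tuples *
0.01 = 333.00, exactly the total cost shown above. The start-up cost
is 0.00 because a sequential scan can begin returning tuples
immediately.)
</para>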

<para>
Now let's modify the query to add a qualification clause:

<programlisting>
regression=# explain select * from tenk1 where unique1 < 1000;
NOTICE:  QUERY PLAN:

Seq Scan on tenk1  (cost=0.00..358.00 rows=1000 width=148)
</programlisting>

The estimate of output rows has gone down because of the WHERE clause.
(This estimate is uncannily accurate because tenk1 is a particularly
simple case --- the unique1 column has 10000 distinct values ranging
from 0 to 9999, so the estimator's linear interpolation between min and
max column values is dead-on.) However, the scan will still have to
visit all 10000 rows, so the cost hasn't decreased; in fact it has gone
up a bit to reflect the extra CPU time spent checking the WHERE
condition.
</para>

<para>
Modify the query to restrict the qualification even more:

<programlisting>
regression=# explain select * from tenk1 where unique1 < 100;
NOTICE:  QUERY PLAN:

Index Scan using tenk1_unique1 on tenk1  (cost=0.00..89.35 rows=100 width=148)
</programlisting>

and you will see that if we make the WHERE condition selective
enough, the planner will
eventually decide that an indexscan is cheaper than a sequential scan.
This plan will only have to visit 100 tuples because of the index,
so it wins despite the fact that each individual fetch is expensive.
</para>

<para>
Add another condition to the qualification:

<programlisting>
regression=# explain select * from tenk1 where unique1 < 100 and
regression-# stringu1 = 'xxx';
NOTICE:  QUERY PLAN:

Index Scan using tenk1_unique1 on tenk1  (cost=0.00..89.60 rows=1 width=148)
</programlisting>

The added clause "stringu1 = 'xxx'" reduces the output-rows estimate,
but not the cost because we still have to visit the same set of tuples.
</para>

<para>
Let's try joining two tables, using the fields we have been discussing:

<programlisting>
regression=# explain select * from tenk1 t1, tenk2 t2 where t1.unique1 < 100
regression-# and t1.unique2 = t2.unique2;
NOTICE:  QUERY PLAN:

Nested Loop  (cost=0.00..144.07 rows=100 width=296)
  ->  Index Scan using tenk1_unique1 on tenk1 t1
        (cost=0.00..89.35 rows=100 width=148)
  ->  Index Scan using tenk2_unique2 on tenk2 t2
        (cost=0.00..0.53 rows=1 width=148)
</programlisting>
</para>

<para>
In this nested-loop join, the outer scan is the same indexscan we had
in the example before last, and so its cost and row count are the same
because we are applying the "unique1 < 100" WHERE clause at that node.
The "t1.unique2 = t2.unique2" clause isn't relevant yet, so it doesn't
affect the outer scan's row count. For the inner scan, the current
outer-scan tuple's unique2 value is plugged into the inner indexscan
to produce an indexqual like
"t2.unique2 = <replaceable>constant</replaceable>". So we get the
same inner-scan plan and costs that we'd get from, say, "explain select
* from tenk2 where unique2 = 42". The loop node's costs are then set
on the basis of the outer scan's cost, plus one repetition of the
inner scan for each outer tuple (100 * 0.53, here), plus a little CPU
time for join processing.
</para>
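
<para>
(Checking the arithmetic again: 89.35 + 100 * 0.53 = 142.35; the
remaining 1.72 of the 144.07 total is that little bit of CPU time for
the join processing.)
</para>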

<para>
In this example the loop's output row count is the same as the product
of the two scans' row counts, but that's not true in general, because
you can have WHERE clauses that mention both relations and
so can only be applied at the join point, not to either input scan.
For example, if we added "WHERE ... AND t1.hundred < t2.hundred",
that'd decrease the output row count of the join node, but not change
either input scan.
</para>

<para>
One way to look at variant plans is to force the planner to disregard
whatever strategy it thought was the winner, using the enable/disable
flags for each plan type. (This is a crude tool, but useful. See
also <xref linkend="explicit-joins">.)

<programlisting>
regression=# set enable_nestloop = off;
SET VARIABLE
regression=# explain select * from tenk1 t1, tenk2 t2 where t1.unique1 < 100
regression-# and t1.unique2 = t2.unique2;
NOTICE:  QUERY PLAN:

Hash Join  (cost=89.60..574.10 rows=100 width=296)
  ->  Seq Scan on tenk2 t2
        (cost=0.00..333.00 rows=10000 width=148)
  ->  Hash  (cost=89.35..89.35 rows=100 width=148)
        ->  Index Scan using tenk1_unique1 on tenk1 t1
              (cost=0.00..89.35 rows=100 width=148)
</programlisting>

This plan proposes to extract the 100 interesting rows of tenk1
using ye same olde indexscan, stash them into an in-memory hash table,
and then do a sequential scan of tenk2, probing into the hash table
for possible matches of "t1.unique2 = t2.unique2" at each tenk2 tuple.
The cost to read tenk1 and set up the hash table is entirely start-up
cost for the hash join, since we won't get any tuples out until we can
start reading tenk2. The total time estimate for the join also
includes a pretty hefty charge for CPU time to probe the hash table
10000 times. Note, however, that we are NOT charging 10000 times 89.35;
the hash table setup is only done once in this plan type.
</para>
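
<para>
Don't forget to turn the flag back on when you are done experimenting,
or it will stay off for the rest of your session:

<programlisting>
regression=# set enable_nestloop = on;
SET VARIABLE
</programlisting>
</para>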
</sect1>

<sect1 id="explicit-joins">
<title>Controlling the Planner with Explicit JOINs</title>

<para>
Beginning with <productname>Postgres</productname> 7.1 it is possible
to control the query planner to some extent by using explicit JOIN
syntax. To see why this matters, we first need some background.
</para>

<para>
In a simple join query, such as
<programlisting>
SELECT * FROM a,b,c WHERE a.id = b.id AND b.ref = c.id;
</programlisting>
the planner is free to join the given tables in any order. For example,
it could generate a query plan that joins A to B, using the WHERE clause
a.id = b.id, and then joins C to this joined table, using the other
WHERE clause. Or it could join B to C and then join A to that result.
Or it could join A to C and then join them with B --- but that would
be inefficient, since the full Cartesian product of A and C would have
to be formed, there being no applicable WHERE clause to allow optimization
of the join.
(All joins in the <productname>Postgres</productname> executor happen
between two input tables, so it's necessary to build up the result in one
or another of these fashions.) The important point is that these different
join possibilities give semantically equivalent results but may have hugely
different execution costs. Therefore, the planner will explore all of them
to try to find the most efficient query plan.
</para>

<para>
When a query only involves two or three tables, there aren't many join
orders to worry about. But the number of possible join orders grows
exponentially as the number of tables expands. Beyond ten or so input
tables it's no longer practical to do an exhaustive search of all the
possibilities, and even for six or seven tables planning may take an
annoyingly long time. When there are too many input tables, the
<productname>Postgres</productname> planner will switch from exhaustive
search to a <firstterm>genetic</firstterm> probabilistic search
through a limited number of possibilities. (The switchover threshold is
set by the GEQO_THRESHOLD run-time
parameter described in the <citetitle>Administrator's Guide</citetitle>.)
The genetic search takes less time, but it won't
necessarily find the best possible plan.
</para>
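
<para>
If the genetic searcher kicks in sooner than you would like, you can
raise the threshold for your session in the same way as the other
run-time parameters shown above (a sketch --- check the
<citetitle>Administrator's Guide</citetitle> for the parameter's exact
behavior):

<programlisting>
regression=# set geqo_threshold = 12;
SET VARIABLE
</programlisting>
</para>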

<para>
When the query involves outer joins, the planner has much less freedom
than it does for plain (inner) joins. For example, consider
<programlisting>
SELECT * FROM a LEFT JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
</programlisting>
Although this query's restrictions are superficially similar to the
previous example, the semantics are different because a row must be
emitted for each row of A that has no matching row in the join of B and C.
Therefore the planner has no choice of join order here: it must join
B to C and then join A to that result. Accordingly, this query takes
less time to plan than the previous query.
</para>

<para>
In <productname>Postgres</productname> 7.1, the planner treats all
explicit JOIN syntaxes as constraining the join order, even though
it is not logically necessary to make such a constraint for inner
joins. Therefore, although all of these queries give the same result:
<programlisting>
SELECT * FROM a,b,c WHERE a.id = b.id AND b.ref = c.id;
SELECT * FROM a CROSS JOIN b CROSS JOIN c WHERE a.id = b.id AND b.ref = c.id;
SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
</programlisting>
the second and third take less time to plan than the first. This effect
is not worth worrying about for only three tables, but it can be a
lifesaver with many tables.
</para>

<para>
You do not need to constrain the join order completely in order to
cut search time, because it's OK to use JOIN operators in a plain
FROM list. For example,
<programlisting>
SELECT * FROM a CROSS JOIN b, c, d, e WHERE ...;
</programlisting>
forces the planner to join A to B before joining them to other tables,
but doesn't constrain its choices otherwise. In this example, the
number of possible join orders is reduced by a factor of 5.
</para>

<para>
If you have a mix of outer and inner joins in a complex query, you
might not want to constrain the planner's search for a good ordering
of inner joins inside an outer join. You can't do that directly in the
JOIN syntax, but you can get around the syntactic limitation by using
views. For example,
<programlisting>
CREATE VIEW v1 AS SELECT ... FROM a, b, c WHERE ...;
SELECT * FROM d LEFT JOIN v1 ON (...);
</programlisting>
Here, joining D must be the last step in the query plan, but the
planner is free to consider various join orders for A,B,C.
</para>

<para>
Constraining the planner's search in this way is a useful technique
both for reducing planning time and for directing the planner to a
good query plan. If the planner chooses a bad join order by default,
you can force it to choose a better order via JOIN syntax --- assuming
that you know of a better order, that is. Experimentation is recommended.
</para>
</sect1>

<sect1 id="populate">
<title>Populating a Database</title>

<para>
One may need to do a large number of table insertions when first
populating a database. Here are some tips and techniques for making that as
efficient as possible.
</para>

<sect2 id="disable-autocommit">
<title>Disable Auto-commit</title>

<para>
Turn off auto-commit and just do one commit at
the end. Otherwise <productname>Postgres</productname> is doing a
lot of work for each record
added. In general when you are doing bulk inserts, you want
to turn off some of the database features to gain speed.
</para>
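
<para>
In practice this means wrapping the insertions in a single explicit
transaction --- a sketch (the table and values are only illustrative):

<programlisting>
BEGIN;
INSERT INTO mytable VALUES (1, 'one');
INSERT INTO mytable VALUES (2, 'two');
-- ... many more insertions ...
COMMIT;
</programlisting>
</para>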
</sect2>

<sect2 id="populate-copy-from">
<title>Use COPY FROM</title>

<para>
Use <command>COPY FROM STDIN</command> to load all the records in one
command, instead of a series of INSERT commands. This reduces parsing,
planning, etc. overhead a great deal. If you do this then it's not
necessary to fool around with autocommit.
</para>
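
<para>
A sketch of what this looks like (the table and data are illustrative;
columns are separated by tabs, and a line containing just a backslash
and a period ends the input):

<programlisting>
COPY mytable FROM STDIN;
1	one
2	two
\.
</programlisting>
</para>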
</sect2>

<sect2 id="populate-rm-indices">
<title>Remove Indices</title>

<para>
If you are loading a freshly created table, the fastest way is to
create the table, bulk-load with COPY, then create any indexes needed
for the table. Creating an index on pre-existing data is quicker than
updating it incrementally as each record is loaded.
</para>

<para>
If you are augmenting an existing table, you can <command>DROP
INDEX</command>, load the table, then recreate the index. Of
course, the database performance for other users may be adversely
affected during the time that the index is missing.
</para>
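
<para>
A sketch of the drop-and-recreate sequence (the index name, column, and
file path are only illustrative):

<programlisting>
DROP INDEX mytable_id_index;
COPY mytable FROM '/path/to/datafile';
CREATE INDEX mytable_id_index ON mytable (id);
</programlisting>
</para>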
</sect2>
</sect1>

</chapter>

<!-- Keep this comment at the end of the file
Local variables:
mode:sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"./reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:("/usr/lib/sgml/catalog")
sgml-local-ecat-files:nil
End:
-->

doc/src/sgml/Attic/plan.sgml (deleted, 269 lines)
@@ -1,269 +0,0 @@
[The former "Understanding Performance" chapter; its EXPLAIN material
is reproduced, lightly edited, in perform.sgml above.]

doc/src/sgml/Attic/populate.sgml (deleted, 79 lines)
@@ -1,79 +0,0 @@
[The former "Populating a Database" chapter (author note: written by Tom
Lane, from an e-mail message dated 1999-12-05); its content now forms the
final section of perform.sgml above.]

doc/src/sgml/user.sgml
@@ -1,5 +1,5 @@
 <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/Attic/user.sgml,v 1.21 2000/12/14 22:30:56 petere Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/Attic/user.sgml,v 1.22 2000/12/16 02:29:36 tgl Exp $
 -->

 <book id="user">
@@ -57,8 +57,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/Attic/user.sgml,v 1.21 2000/12/14 22:30:56
 &environ;
 &manage;
 &storage;
-&plan;
-&populate;
+&perform;

 <!-- appendices -->