1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-27 12:41:57 +03:00

Enable BUFFERS with EXPLAIN ANALYZE by default

The topic of turning EXPLAIN's BUFFERS option on with the ANALYZE option
has come up a few times over the past few years.  In many ways, doing this
seems like a good idea as it may be more obvious to users why a given
query is running more slowly than they might expect.  Also, from my own
(David's) personal experience, I've seen users posting to the mailing
lists with two identical plans, one slow and one fast asking why their
query is sometimes slow.  In many cases, this is due to additional reads.
Having BUFFERS on by default may help reduce some of these questions, and
if not, make it more obvious to the user before they post, or save a
round-trip to the mailing list when additional I/O effort is the cause of
the slowness.

The general consensus is that we want BUFFERS on by default with
ANALYZE.  However, there were more than zero concerns raised with doing
so.  The primary reason against is the additional verbosity, making it
harder to read large plans.  Another concern was that buffer information
isn't always useful so may not make sense to have it on by default.

It's currently December, so let's commit this to see if anyone comes
forward with a strong objection against making this change.  We have over
half a year remaining in the v18 cycle where we could still easily consider
reverting this if someone were to come forward with a convincing enough
reason as to why doing this is a bad idea.

There were two patches independently submitted to achieve this goal, one
by me and the other by Guillaume.  This commit is a mix of both of these
patches with some additional work done by me to adjust various
additional places in the documentation which include EXPLAIN ANALYZE
output.

Author: Guillaume Lelarge, David Rowley
Reviewed-by: Robert Haas, Greg Sabino Mullane, Michael Christofides
Discussion: https://postgr.es/m/CANNMO++W7MM8T0KyXN3ZheXXt-uLVM3aEtZd+WNfZ=obxffUiA@mail.gmail.com
This commit is contained in:
David Rowley
2024-12-11 22:35:11 +13:00
parent 0f5738202b
commit c2a4078eba
30 changed files with 257 additions and 197 deletions

View File

@ -121,9 +121,10 @@ SELECT 10000000
Seq Scan on tbloom (cost=0.00..213744.00 rows=250 width=24) (actual time=357.059..357.059 rows=0 loops=1)
Filter: ((i2 = 898732) AND (i5 = 123451))
Rows Removed by Filter: 10000000
Buffers: shared hit=63744
Planning Time: 0.346 ms
Execution Time: 357.076 ms
(5 rows)
(6 rows)
</programlisting>
</para>
@ -144,9 +145,10 @@ CREATE INDEX
Seq Scan on tbloom (cost=0.00..213744.00 rows=2 width=24) (actual time=351.016..351.017 rows=0 loops=1)
Filter: ((i2 = 898732) AND (i5 = 123451))
Rows Removed by Filter: 10000000
Buffers: shared hit=63744
Planning Time: 0.138 ms
Execution Time: 351.035 ms
(5 rows)
(6 rows)
</programlisting>
</para>
@ -168,11 +170,13 @@ CREATE INDEX
Recheck Cond: ((i2 = 898732) AND (i5 = 123451))
Rows Removed by Index Recheck: 2300
Heap Blocks: exact=2256
Buffers: shared hit=21864
-&gt; Bitmap Index Scan on bloomidx (cost=0.00..178436.00 rows=1 width=0) (actual time=20.005..20.005 rows=2300 loops=1)
Index Cond: ((i2 = 898732) AND (i5 = 123451))
Buffers: shared hit=19608
Planning Time: 0.099 ms
Execution Time: 22.632 ms
(8 rows)
(10 rows)
</programlisting>
</para>
@ -199,14 +203,18 @@ CREATE INDEX
-------------------------------------------------------------------&zwsp;--------------------------------------------------------
Bitmap Heap Scan on tbloom (cost=9.29..13.30 rows=1 width=24) (actual time=0.032..0.033 rows=0 loops=1)
Recheck Cond: ((i5 = 123451) AND (i2 = 898732))
Buffers: shared read=6
-&gt; BitmapAnd (cost=9.29..9.29 rows=1 width=0) (actual time=0.047..0.047 rows=0 loops=1)
Buffers: shared hit=6
-&gt; Bitmap Index Scan on btreeidx5 (cost=0.00..4.52 rows=11 width=0) (actual time=0.026..0.026 rows=7 loops=1)
Index Cond: (i5 = 123451)
Buffers: shared hit=3
-&gt; Bitmap Index Scan on btreeidx2 (cost=0.00..4.52 rows=11 width=0) (actual time=0.007..0.007 rows=8 loops=1)
Index Cond: (i2 = 898732)
Buffers: shared hit=3
Planning Time: 0.264 ms
Execution Time: 0.047 ms
(9 rows)
(13 rows)
</programlisting>
Although this query runs much faster than with either of the single
indexes, we pay a penalty in index size. Each of the single-column

View File

@ -149,10 +149,11 @@
QUERY PLAN
-------------------------------------------------------------------&zwsp;------------------------------------------
Aggregate (cost=16.27..16.29 rows=1 width=8) (actual time=0.303..0.303 rows=1 loops=1)
Buffers: shared hit=14
-> Seq Scan on pg_class (cost=0.00..15.42 rows=342 width=4) (actual time=0.017..0.111 rows=356 loops=1)
Buffers: shared hit=14
Planning Time: 0.116 ms
Execution Time: 0.365 ms
(4 rows)
</screen>
Given the cost of the plan, it is entirely reasonable that no
<acronym>JIT</acronym> was used; the cost of <acronym>JIT</acronym> would
@ -165,7 +166,9 @@ SET
QUERY PLAN
-------------------------------------------------------------------&zwsp;------------------------------------------
Aggregate (cost=16.27..16.29 rows=1 width=8) (actual time=6.049..6.049 rows=1 loops=1)
Buffers: shared hit=14
-> Seq Scan on pg_class (cost=0.00..15.42 rows=342 width=4) (actual time=0.019..0.052 rows=356 loops=1)
Buffers: shared hit=14
Planning Time: 0.133 ms
JIT:
Functions: 3

View File

@ -39,7 +39,7 @@
<para>
Examples in this section are drawn from the regression test database
after doing a <command>VACUUM ANALYZE</command>, using v17 development sources.
after doing a <command>VACUUM ANALYZE</command>, using v18 development sources.
You should be able to get similar results if you try the examples
yourself, but your estimated costs and row counts might vary slightly
because <command>ANALYZE</command>'s statistics are random samples rather
@ -722,13 +722,19 @@ WHERE t1.unique1 &lt; 10 AND t1.unique2 = t2.unique2;
QUERY PLAN
-------------------------------------------------------------------&zwsp;--------------------------------------------------------------
Nested Loop (cost=4.65..118.50 rows=10 width=488) (actual time=0.017..0.051 rows=10 loops=1)
Buffers: shared hit=36 read=6
-&gt; Bitmap Heap Scan on tenk1 t1 (cost=4.36..39.38 rows=10 width=244) (actual time=0.009..0.017 rows=10 loops=1)
Recheck Cond: (unique1 &lt; 10)
Heap Blocks: exact=10
Buffers: shared hit=3 read=5 written=4
-&gt; Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36 rows=10 width=0) (actual time=0.004..0.004 rows=10 loops=1)
Index Cond: (unique1 &lt; 10)
Buffers: shared hit=2
-&gt; Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.29..7.90 rows=1 width=244) (actual time=0.003..0.003 rows=1 loops=10)
Index Cond: (unique2 = t1.unique2)
Buffers: shared hit=24 read=6
Planning:
Buffers: shared hit=15 dirtied=9
Planning Time: 0.485 ms
Execution Time: 0.073 ms
</screen>
@ -769,16 +775,24 @@ WHERE t1.unique1 &lt; 100 AND t1.unique2 = t2.unique2 ORDER BY t1.fivethous;
Sort (cost=713.05..713.30 rows=100 width=488) (actual time=2.995..3.002 rows=100 loops=1)
Sort Key: t1.fivethous
Sort Method: quicksort Memory: 74kB
Buffers: shared hit=440
-&gt; Hash Join (cost=226.23..709.73 rows=100 width=488) (actual time=0.515..2.920 rows=100 loops=1)
Hash Cond: (t2.unique2 = t1.unique2)
Buffers: shared hit=437
-&gt; Seq Scan on tenk2 t2 (cost=0.00..445.00 rows=10000 width=244) (actual time=0.026..1.790 rows=10000 loops=1)
Buffers: shared hit=345
-&gt; Hash (cost=224.98..224.98 rows=100 width=244) (actual time=0.476..0.477 rows=100 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 35kB
Buffers: shared hit=92
-&gt; Bitmap Heap Scan on tenk1 t1 (cost=5.06..224.98 rows=100 width=244) (actual time=0.030..0.450 rows=100 loops=1)
Recheck Cond: (unique1 &lt; 100)
Heap Blocks: exact=90
Buffers: shared hit=92
-&gt; Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.013..0.013 rows=100 loops=1)
Index Cond: (unique1 &lt; 100)
Buffers: shared hit=2
Planning:
Buffers: shared hit=12
Planning Time: 0.187 ms
Execution Time: 3.036 ms
</screen>
@ -803,6 +817,7 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE ten &lt; 7;
Seq Scan on tenk1 (cost=0.00..470.00 rows=7000 width=244) (actual time=0.030..1.995 rows=7000 loops=1)
Filter: (ten &lt; 7)
Rows Removed by Filter: 3000
Buffers: shared hit=345
Planning Time: 0.102 ms
Execution Time: 2.145 ms
</screen>
@ -826,6 +841,7 @@ EXPLAIN ANALYZE SELECT * FROM polygon_tbl WHERE f1 @&gt; polygon '(0.5,2.0)';
Seq Scan on polygon_tbl (cost=0.00..1.09 rows=1 width=85) (actual time=0.023..0.023 rows=0 loops=1)
Filter: (f1 @&gt; '((0.5,2))'::polygon)
Rows Removed by Filter: 7
Buffers: shared hit=1
Planning Time: 0.039 ms
Execution Time: 0.033 ms
</screen>
@ -845,6 +861,7 @@ EXPLAIN ANALYZE SELECT * FROM polygon_tbl WHERE f1 @&gt; polygon '(0.5,2.0)';
Index Scan using gpolygonind on polygon_tbl (cost=0.13..8.15 rows=1 width=85) (actual time=0.074..0.074 rows=0 loops=1)
Index Cond: (f1 @&gt; '((0.5,2))'::polygon)
Rows Removed by Index Recheck: 1
Buffers: shared hit=1
Planning Time: 0.039 ms
Execution Time: 0.098 ms
</screen>
@ -857,34 +874,31 @@ EXPLAIN ANALYZE SELECT * FROM polygon_tbl WHERE f1 @&gt; polygon '(0.5,2.0)';
</para>
<para>
<command>EXPLAIN</command> has a <literal>BUFFERS</literal> option that can be used with
<literal>ANALYZE</literal> to get even more run time statistics:
<command>EXPLAIN</command> has a <literal>BUFFERS</literal> option which
provides additional detail about I/O operations performed during the
planning and execution of the given query. The buffer numbers displayed
show the count of the non-distinct buffers hit, read, dirtied, and written
for the given node and all of its child nodes. The
<literal>ANALYZE</literal> option implicitly enables the
<literal>BUFFERS</literal> option. If this
is undesired, <literal>BUFFERS</literal> may be explicitly disabled:
<screen>
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM tenk1 WHERE unique1 &lt; 100 AND unique2 &gt; 9000;
EXPLAIN (ANALYZE, BUFFERS OFF) SELECT * FROM tenk1 WHERE unique1 &lt; 100 AND unique2 &gt; 9000;
QUERY PLAN
-------------------------------------------------------------------&zwsp;--------------------------------------------------------------
Bitmap Heap Scan on tenk1 (cost=25.07..60.11 rows=10 width=244) (actual time=0.105..0.114 rows=10 loops=1)
Recheck Cond: ((unique1 &lt; 100) AND (unique2 &gt; 9000))
Heap Blocks: exact=10
Buffers: shared hit=14 read=3
-&gt; BitmapAnd (cost=25.07..25.07 rows=10 width=0) (actual time=0.100..0.101 rows=0 loops=1)
Buffers: shared hit=4 read=3
-&gt; Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.027..0.027 rows=100 loops=1)
Index Cond: (unique1 &lt; 100)
Buffers: shared hit=2
-&gt; Bitmap Index Scan on tenk1_unique2 (cost=0.00..19.78 rows=999 width=0) (actual time=0.070..0.070 rows=999 loops=1)
Index Cond: (unique2 &gt; 9000)
Buffers: shared hit=2 read=3
Planning:
Buffers: shared hit=3
Planning Time: 0.162 ms
Execution Time: 0.143 ms
</screen>
The numbers provided by <literal>BUFFERS</literal> help to identify which parts
of the query are the most I/O-intensive.
</para>
<para>
@ -906,8 +920,10 @@ EXPLAIN ANALYZE UPDATE tenk1 SET hundred = hundred + 1 WHERE unique1 &lt; 100;
-&gt; Bitmap Heap Scan on tenk1 (cost=5.06..225.23 rows=100 width=10) (actual time=0.065..0.141 rows=100 loops=1)
Recheck Cond: (unique1 &lt; 100)
Heap Blocks: exact=90
Buffers: shared hit=4 read=2
-&gt; Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=100 width=0) (actual time=0.031..0.031 rows=100 loops=1)
Index Cond: (unique1 &lt; 100)
Buffers: shared read=2
Planning Time: 0.151 ms
Execution Time: 1.856 ms
@ -1040,10 +1056,12 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE unique1 &lt; 100 AND unique2 &gt; 9000
QUERY PLAN
-------------------------------------------------------------------&zwsp;------------------------------------------------------------
Limit (cost=0.29..14.33 rows=2 width=244) (actual time=0.051..0.071 rows=2 loops=1)
Buffers: shared hit=16
-&gt; Index Scan using tenk1_unique2 on tenk1 (cost=0.29..70.50 rows=10 width=244) (actual time=0.051..0.070 rows=2 loops=1)
Index Cond: (unique2 &gt; 9000)
Filter: (unique1 &lt; 100)
Rows Removed by Filter: 287
Buffers: shared hit=16
Planning Time: 0.077 ms
Execution Time: 0.086 ms
</screen>

View File

@ -489,7 +489,7 @@ SELECT relpages, reltuples FROM pg_class WHERE relname = 't';
condition on the <structfield>a</structfield> column:
<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1;
EXPLAIN (ANALYZE, TIMING OFF, BUFFERS OFF) SELECT * FROM t WHERE a = 1;
QUERY PLAN
-------------------------------------------------------------------&zwsp;------------
Seq Scan on t (cost=0.00..170.00 rows=100 width=8) (actual rows=100 loops=1)
@ -506,7 +506,7 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1;
condition on both columns, combining them with <literal>AND</literal>:
<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
EXPLAIN (ANALYZE, TIMING OFF, BUFFERS OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
QUERY PLAN
-------------------------------------------------------------------&zwsp;----------
Seq Scan on t (cost=0.00..195.00 rows=1 width=8) (actual rows=100 loops=1)
@ -530,7 +530,7 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
<programlisting>
CREATE STATISTICS stts (dependencies) ON a, b FROM t;
ANALYZE t;
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
EXPLAIN (ANALYZE, TIMING OFF, BUFFERS OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
QUERY PLAN
-------------------------------------------------------------------&zwsp;------------
Seq Scan on t (cost=0.00..195.00 rows=100 width=8) (actual rows=100 loops=1)
@ -551,7 +551,7 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
estimated number of rows returned by the HashAggregate node) is very
accurate:
<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a;
EXPLAIN (ANALYZE, TIMING OFF, BUFFERS OFF) SELECT COUNT(*) FROM t GROUP BY a;
QUERY PLAN
-------------------------------------------------------------------&zwsp;----------------------
HashAggregate (cost=195.00..196.00 rows=100 width=12) (actual rows=100 loops=1)
@ -562,7 +562,7 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a;
groups in a query with two columns in <command>GROUP BY</command>, as
in the following example, is off by an order of magnitude:
<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
EXPLAIN (ANALYZE, TIMING OFF, BUFFERS OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
QUERY PLAN
-------------------------------------------------------------------&zwsp;-------------------------
HashAggregate (cost=220.00..230.00 rows=1000 width=16) (actual rows=100 loops=1)
@ -575,7 +575,7 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
DROP STATISTICS stts;
CREATE STATISTICS stts (dependencies, ndistinct) ON a, b FROM t;
ANALYZE t;
EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
EXPLAIN (ANALYZE, TIMING OFF, BUFFERS OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
QUERY PLAN
-------------------------------------------------------------------&zwsp;-------------------------
HashAggregate (cost=220.00..221.00 rows=100 width=16) (actual rows=100 loops=1)
@ -615,7 +615,7 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
DROP STATISTICS stts;
CREATE STATISTICS stts2 (mcv) ON a, b FROM t;
ANALYZE t;
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
EXPLAIN (ANALYZE, TIMING OFF, BUFFERS OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
QUERY PLAN
-------------------------------------------------------------------&zwsp;------------
Seq Scan on t (cost=0.00..195.00 rows=100 width=8) (actual rows=100 loops=1)
@ -672,7 +672,7 @@ SELECT m.* FROM pg_statistic_ext join pg_statistic_ext_data on (oid = stxoid),
to decide which combinations are compatible.
<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 10;
EXPLAIN (ANALYZE, TIMING OFF, BUFFERS OFF) SELECT * FROM t WHERE a = 1 AND b = 10;
QUERY PLAN
-------------------------------------------------------------------&zwsp;--------
Seq Scan on t (cost=0.00..195.00 rows=1 width=8) (actual rows=0 loops=1)
@ -685,7 +685,7 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 10;
consider the following range query for the same table:
<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a &lt;= 49 AND b &gt; 49;
EXPLAIN (ANALYZE, TIMING OFF, BUFFERS OFF) SELECT * FROM t WHERE a &lt;= 49 AND b &gt; 49;
QUERY PLAN
-------------------------------------------------------------------&zwsp;--------
Seq Scan on t (cost=0.00..195.00 rows=1 width=8) (actual rows=0 loops=1)

View File

@ -201,8 +201,10 @@ ROLLBACK;
query processing.
The number of blocks shown for an
upper-level node includes those used by all its child nodes. In text
format, only non-zero values are printed. This parameter defaults to
<literal>FALSE</literal>.
format, only non-zero values are printed. Buffers information is
included by default when <literal>ANALYZE</literal> is used but
otherwise is not included by default, but can be enabled using this
option.
</para>
</listitem>
</varlistentry>
@ -500,11 +502,13 @@ EXPLAIN ANALYZE EXECUTE query(100, 200);
HashAggregate (cost=10.77..10.87 rows=10 width=12) (actual time=0.043..0.044 rows=10 loops=1)
Group Key: foo
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=4
-&gt; Index Scan using test_pkey on test (cost=0.29..10.27 rows=99 width=8) (actual time=0.009..0.025 rows=99 loops=1)
Index Cond: ((id &gt; 100) AND (id &lt; 200))
Buffers: shared hit=4
Planning Time: 0.244 ms
Execution Time: 0.073 ms
(7 rows)
(9 rows)
</programlisting>
</para>