1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-27 12:41:57 +03:00

Invent SERIALIZE option for EXPLAIN.

EXPLAIN (ANALYZE, SERIALIZE) allows collection of statistics about
the volume of data emitted by a query, as well as the time taken
to convert the data to the on-the-wire format.  Previously there
was no way to investigate this without actually sending the data
to the client, in which case network transmission costs might
swamp what you wanted to see.  In particular this feature allows
investigating the costs of de-TOASTing compressed or out-of-line
data during formatting.

Stepan Rutz and Matthias van de Meent,
reviewed by Tomas Vondra and myself

Discussion: https://postgr.es/m/ca0adb0e-fa4e-c37e-1cd7-91170b18cae1@gmx.de
This commit is contained in:
Tom Lane
2024-04-03 17:41:54 -04:00
parent 97ce821e3e
commit 06286709ee
10 changed files with 542 additions and 11 deletions

View File

@ -144,11 +144,11 @@ EXPLAIN SELECT * FROM tenk1;
It's important to understand that the cost of an upper-level node includes
the cost of all its child nodes. It's also important to realize that
the cost only reflects things that the planner cares about.
In particular, the cost does not consider the time spent transmitting
result rows to the client, which could be an important
factor in the real elapsed time; but the planner ignores it because
it cannot change it by altering the plan. (Every correct plan will
output the same row set, we trust.)
In particular, the cost does not consider the time spent to convert
output values to text form or to transmit them to the client, which
could be important factors in the real elapsed time; but the planner
ignores those costs because it cannot change them by altering the
plan. (Every correct plan will output the same row set, we trust.)
</para>
<para>
@ -956,6 +956,17 @@ EXPLAIN UPDATE parent SET f2 = f2 + 1 WHERE f1 = 101;
<command>EXPLAIN ANALYZE</command>.
</para>
<para>
The time shown for the top-level node does not include any time needed
to convert the query's output data into displayable form or to send it
to the client. While <command>EXPLAIN ANALYZE</command> will never
send the data to the client, it can be told to convert the query's
output data to displayable form and measure the time needed for that,
by specifying the <literal>SERIALIZE</literal> option. That time will
be shown separately, and it's also included in the
total <literal>Execution time</literal>.
</para>
</sect2>
<sect2 id="using-explain-caveats">
@ -965,7 +976,8 @@ EXPLAIN UPDATE parent SET f2 = f2 + 1 WHERE f1 = 101;
There are two significant ways in which run times measured by
<command>EXPLAIN ANALYZE</command> can deviate from normal execution of
the same query. First, since no output rows are delivered to the client,
network transmission costs and I/O conversion costs are not included.
network transmission costs are not included. I/O conversion costs are
not included either unless <literal>SERIALIZE</literal> is specified.
Second, the measurement overhead added by <command>EXPLAIN
ANALYZE</command> can be significant, especially on machines with slow
<function>gettimeofday()</function> operating-system calls. You can use the

View File

@ -41,6 +41,7 @@ EXPLAIN [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] <rep
SETTINGS [ <replaceable class="parameter">boolean</replaceable> ]
GENERIC_PLAN [ <replaceable class="parameter">boolean</replaceable> ]
BUFFERS [ <replaceable class="parameter">boolean</replaceable> ]
SERIALIZE [ { NONE | TEXT | BINARY } ]
WAL [ <replaceable class="parameter">boolean</replaceable> ]
TIMING [ <replaceable class="parameter">boolean</replaceable> ]
SUMMARY [ <replaceable class="parameter">boolean</replaceable> ]
@ -206,6 +207,34 @@ ROLLBACK;
</listitem>
</varlistentry>
<varlistentry>
<term><literal>SERIALIZE</literal></term>
<listitem>
<para>
Include information on the cost
of <firstterm>serializing</firstterm> the query's output data, that
is converting it to text or binary format to send to the client.
This can be a significant part of the time required for regular
execution of the query, if the datatype output functions are
expensive or if <acronym>TOAST</acronym>ed values must be fetched
from out-of-line storage. <command>EXPLAIN</command>'s default
behavior, <literal>SERIALIZE NONE</literal>, does not perform these
conversions. If <literal>SERIALIZE TEXT</literal>
or <literal>SERIALIZE BINARY</literal> is specified, the appropriate
conversions are performed, and the time spent doing so is measured
(unless <literal>TIMING OFF</literal> is specified). If
the <literal>BUFFERS</literal> option is also specified, then any
buffer accesses involved in the conversions are counted too.
In no case, however, will <command>EXPLAIN</command> actually send
the resulting data to the client; hence network transmission costs
cannot be investigated this way.
Serialization may only be enabled when <literal>ANALYZE</literal> is
also enabled. If <literal>SERIALIZE</literal> is written without an
argument, <literal>TEXT</literal> is assumed.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>WAL</literal></term>
<listitem>