mirror of
https://github.com/postgres/postgres.git
synced 2025-07-28 23:42:10 +03:00
Explicitly track whether aggregate final functions modify transition state.
Up to now, there's been hard-wired assumptions that normal aggregates'
final functions never modify their transition states, while ordered-set
aggregates' final functions always do. This has always been a bit
limiting, and in particular it's getting in the way of improving the
built-in ordered-set aggregates to allow merging of transition states.
Therefore, let's introduce catalog and CREATE AGGREGATE infrastructure
that lets the finalfn's behavior be declared explicitly.
There are now three possibilities for the finalfn behavior: it's purely
read-only, it trashes the transition state irrecoverably, or it changes
the state in such a way that no more transfn calls are possible but the
state can still be passed to other, compatible finalfns. There are no
examples of this third case today, but we'll shortly make the built-in
OSAs act like that.
This change allows user-defined aggregates to explicitly disclaim support
for use as window functions, and/or to prevent transition state merging,
if their implementations cannot handle that. While it was previously
possible to handle the window case with a run-time error check, there was
not any way to prevent transition state merging, which in retrospect is
something commit 804163bc2
should have provided for. But better late
than never.
In passing, split out pg_aggregate.c's extern function declarations into
a new header file pg_aggregate_fn.h, similarly to what we've done for
some other catalog headers, so that pg_aggregate.h itself can be safe
for frontend files to include. This lets pg_dump use the symbolic
names for relevant constants.
Discussion: https://postgr.es/m/4834.1507849699@sss.pgh.pa.us
This commit is contained in:
@ -486,6 +486,26 @@
|
||||
<entry></entry>
|
||||
<entry>True to pass extra dummy arguments to <structfield>aggmfinalfn</structfield></entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><structfield>aggfinalmodify</structfield></entry>
|
||||
<entry><type>char</type></entry>
|
||||
<entry></entry>
|
||||
<entry>Whether <structfield>aggfinalfn</structfield> modifies the
|
||||
transition state value:
|
||||
<literal>r</literal> if it is read-only,
|
||||
<literal>s</literal> if the <structfield>aggtransfn</structfield>
|
||||
cannot be applied after the <structfield>aggfinalfn</structfield>, or
|
||||
<literal>w</literal> if it writes on the value
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><structfield>aggmfinalmodify</structfield></entry>
|
||||
<entry><type>char</type></entry>
|
||||
<entry></entry>
|
||||
<entry>Like <structfield>aggfinalmodify</structfield>, but for
|
||||
the <structfield>aggmfinalfn</structfield>
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><structfield>aggsortop</structfield></entry>
|
||||
<entry><type>oid</type></entry>
|
||||
|
@ -27,6 +27,7 @@ CREATE AGGREGATE <replaceable class="parameter">name</replaceable> ( [ <replacea
|
||||
[ , SSPACE = <replaceable class="parameter">state_data_size</replaceable> ]
|
||||
[ , FINALFUNC = <replaceable class="parameter">ffunc</replaceable> ]
|
||||
[ , FINALFUNC_EXTRA ]
|
||||
[ , FINALFUNC_MODIFY = { READ_ONLY | SHARABLE | READ_WRITE } ]
|
||||
[ , COMBINEFUNC = <replaceable class="parameter">combinefunc</replaceable> ]
|
||||
[ , SERIALFUNC = <replaceable class="parameter">serialfunc</replaceable> ]
|
||||
[ , DESERIALFUNC = <replaceable class="parameter">deserialfunc</replaceable> ]
|
||||
@ -37,6 +38,7 @@ CREATE AGGREGATE <replaceable class="parameter">name</replaceable> ( [ <replacea
|
||||
[ , MSSPACE = <replaceable class="parameter">mstate_data_size</replaceable> ]
|
||||
[ , MFINALFUNC = <replaceable class="parameter">mffunc</replaceable> ]
|
||||
[ , MFINALFUNC_EXTRA ]
|
||||
[ , MFINALFUNC_MODIFY = { READ_ONLY | SHARABLE | READ_WRITE } ]
|
||||
[ , MINITCOND = <replaceable class="parameter">minitial_condition</replaceable> ]
|
||||
[ , SORTOP = <replaceable class="parameter">sort_operator</replaceable> ]
|
||||
[ , PARALLEL = { SAFE | RESTRICTED | UNSAFE } ]
|
||||
@ -49,6 +51,7 @@ CREATE AGGREGATE <replaceable class="parameter">name</replaceable> ( [ [ <replac
|
||||
[ , SSPACE = <replaceable class="parameter">state_data_size</replaceable> ]
|
||||
[ , FINALFUNC = <replaceable class="parameter">ffunc</replaceable> ]
|
||||
[ , FINALFUNC_EXTRA ]
|
||||
[ , FINALFUNC_MODIFY = { READ_ONLY | SHARABLE | READ_WRITE } ]
|
||||
[ , INITCOND = <replaceable class="parameter">initial_condition</replaceable> ]
|
||||
[ , PARALLEL = { SAFE | RESTRICTED | UNSAFE } ]
|
||||
[ , HYPOTHETICAL ]
|
||||
@ -63,6 +66,7 @@ CREATE AGGREGATE <replaceable class="parameter">name</replaceable> (
|
||||
[ , SSPACE = <replaceable class="parameter">state_data_size</replaceable> ]
|
||||
[ , FINALFUNC = <replaceable class="parameter">ffunc</replaceable> ]
|
||||
[ , FINALFUNC_EXTRA ]
|
||||
[ , FINALFUNC_MODIFY = { READ_ONLY | SHARABLE | READ_WRITE } ]
|
||||
[ , COMBINEFUNC = <replaceable class="parameter">combinefunc</replaceable> ]
|
||||
[ , SERIALFUNC = <replaceable class="parameter">serialfunc</replaceable> ]
|
||||
[ , DESERIALFUNC = <replaceable class="parameter">deserialfunc</replaceable> ]
|
||||
@ -73,6 +77,7 @@ CREATE AGGREGATE <replaceable class="parameter">name</replaceable> (
|
||||
[ , MSSPACE = <replaceable class="parameter">mstate_data_size</replaceable> ]
|
||||
[ , MFINALFUNC = <replaceable class="parameter">mffunc</replaceable> ]
|
||||
[ , MFINALFUNC_EXTRA ]
|
||||
[ , MFINALFUNC_MODIFY = { READ_ONLY | SHARABLE | READ_WRITE } ]
|
||||
[ , MINITCOND = <replaceable class="parameter">minitial_condition</replaceable> ]
|
||||
[ , SORTOP = <replaceable class="parameter">sort_operator</replaceable> ]
|
||||
)
|
||||
@ -197,7 +202,8 @@ CREATE AGGREGATE <replaceable class="parameter">name</replaceable> (
|
||||
as described in <xref linkend="xaggr-moving-aggregates">. This requires
|
||||
specifying the <literal>MSFUNC</>, <literal>MINVFUNC</>,
|
||||
and <literal>MSTYPE</> parameters, and optionally
|
||||
the <literal>MSPACE</>, <literal>MFINALFUNC</>, <literal>MFINALFUNC_EXTRA</>,
|
||||
the <literal>MSPACE</>, <literal>MFINALFUNC</>,
|
||||
<literal>MFINALFUNC_EXTRA</>, <literal>MFINALFUNC_MODIFY</>,
|
||||
and <literal>MINITCOND</> parameters. Except for <literal>MINVFUNC</>,
|
||||
these parameters work like the corresponding simple-aggregate parameters
|
||||
without <literal>M</>; they define a separate implementation of the
|
||||
@ -412,6 +418,21 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term><literal>FINALFUNC_MODIFY</> = { <literal>READ_ONLY</> | <literal>SHARABLE</> | <literal>READ_WRITE</> }</term>
|
||||
<listitem>
|
||||
<para>
|
||||
This option specifies whether the final function is a pure function
|
||||
that does not modify its arguments. <literal>READ_ONLY</> indicates
|
||||
it does not; the other two values indicate that it may change the
|
||||
transition state value. See <xref linkend="sql-createaggregate-notes"
|
||||
endterm="sql-createaggregate-notes-title"> below for more detail. The
|
||||
default is <literal>READ_ONLY</>, except for ordered-set aggregates,
|
||||
for which the default is <literal>READ_WRITE</>.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term><replaceable class="parameter">combinefunc</replaceable></term>
|
||||
<listitem>
|
||||
@ -563,6 +584,16 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term><literal>MFINALFUNC_MODIFY</> = { <literal>READ_ONLY</> | <literal>SHARABLE</> | <literal>READ_WRITE</> }</term>
|
||||
<listitem>
|
||||
<para>
|
||||
This option is like <literal>FINALFUNC_MODIFY</>, but it describes
|
||||
the behavior of the moving-aggregate final function.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term><replaceable class="parameter">minitial_condition</replaceable></term>
|
||||
<listitem>
|
||||
@ -587,12 +618,12 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term><literal>PARALLEL</literal></term>
|
||||
<term><literal>PARALLEL =</> { <literal>SAFE</> | <literal>RESTRICTED</> | <literal>UNSAFE</> }</term>
|
||||
<listitem>
|
||||
<para>
|
||||
The meanings of <literal>PARALLEL SAFE</>, <literal>PARALLEL
|
||||
RESTRICTED</>, and <literal>PARALLEL UNSAFE</> are the same as
|
||||
for <xref linkend="sql-createfunction">. An aggregate will not be
|
||||
in <xref linkend="sql-createfunction">. An aggregate will not be
|
||||
considered for parallelization if it is marked <literal>PARALLEL
|
||||
UNSAFE</> (which is the default!) or <literal>PARALLEL RESTRICTED</>.
|
||||
Note that the parallel-safety markings of the aggregate's support
|
||||
@ -624,8 +655,8 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
|
||||
</para>
|
||||
</refsect1>
|
||||
|
||||
<refsect1>
|
||||
<title>Notes</title>
|
||||
<refsect1 id="sql-createaggregate-notes">
|
||||
<title id="sql-createaggregate-notes-title">Notes</title>
|
||||
|
||||
<para>
|
||||
In parameters that specify support function names, you can write
|
||||
@ -634,6 +665,34 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
|
||||
of the support functions are determined from other parameters.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Ordinarily, Postgres functions are expected to be true functions that
|
||||
do not modify their input values. However, an aggregate transition
|
||||
function, <emphasis>when used in the context of an aggregate</>,
|
||||
is allowed to cheat and modify its transition-state argument in place.
|
||||
This can provide substantial performance benefits compared to making
|
||||
a fresh copy of the transition state each time.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Likewise, while an aggregate final function is normally expected not to
|
||||
modify its input values, sometimes it is impractical to avoid modifying
|
||||
the transition-state argument. Such behavior must be declared using
|
||||
the <literal>FINALFUNC_MODIFY</> parameter. The <literal>READ_WRITE</>
|
||||
value indicates that the final function modifies the transition state in
|
||||
unspecified ways. This value prevents use of the aggregate as a window
|
||||
function, and it also prevents merging of transition states for aggregate
|
||||
calls that share the same input values and transition functions.
|
||||
The <literal>SHARABLE</> value indicates that the transition function
|
||||
cannot be applied after the final function, but multiple final-function
|
||||
calls can be performed on the ending transition state value. This value
|
||||
prevents use of the aggregate as a window function, but it allows merging
|
||||
of transition states. (That is, the optimization of interest here is not
|
||||
applying the same final function repeatedly, but applying different final
|
||||
functions to the same ending transition state value. This is allowed as
|
||||
long as none of the final functions are marked <literal>READ_WRITE</>.)
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If an aggregate supports moving-aggregate mode, it will improve
|
||||
calculation efficiency when the aggregate is used as a window function
|
||||
@ -671,7 +730,8 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
|
||||
Note that whether or not the aggregate supports moving-aggregate
|
||||
mode, <productname>PostgreSQL</productname> can handle a moving frame
|
||||
end without recalculation; this is done by continuing to add new values
|
||||
to the aggregate's state. It is assumed that the final function does
|
||||
to the aggregate's state. This is why use of an aggregate as a window
|
||||
function requires that the final function be read-only: it must
|
||||
not damage the aggregate's state value, so that the aggregation can be
|
||||
continued even after an aggregate result value has been obtained for
|
||||
one set of frame boundaries.
|
||||
|
@ -487,6 +487,13 @@ SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY income) FROM households;
|
||||
C, since their state values aren't definable as any SQL data type.
|
||||
(In the above example, notice that the state value is declared as
|
||||
type <type>internal</> — this is typical.)
|
||||
Also, because the final function performs the sort, it is not possible
|
||||
to continue adding input rows by executing the transition function again
|
||||
later. This means the final function is not <literal>READ_ONLY</>;
|
||||
it must be declared in <xref linkend="sql-createaggregate">
|
||||
as <literal>READ_WRITE</>, or as <literal>SHARABLE</> if it's
|
||||
possible for additional final-function calls to make use of the
|
||||
already-sorted state.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
@ -622,16 +629,15 @@ SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY income) FROM households;
|
||||
<programlisting>
|
||||
if (AggCheckCallContext(fcinfo, NULL))
|
||||
</programlisting>
|
||||
One reason for checking this is that when it is true for a transition
|
||||
function, the first input
|
||||
One reason for checking this is that when it is true, the first input
|
||||
must be a temporary state value and can therefore safely be modified
|
||||
in-place rather than allocating a new copy.
|
||||
See <function>int8inc()</> for an example.
|
||||
(This is the <emphasis>only</>
|
||||
case where it is safe for a function to modify a pass-by-reference input.
|
||||
In particular, final functions for normal aggregates must not
|
||||
modify their inputs in any case, because in some cases they will be
|
||||
re-executed on the same final state value.)
|
||||
(While aggregate transition functions are always allowed to modify
|
||||
the transition value in-place, aggregate final functions are generally
|
||||
discouraged from doing so; if they do so, the behavior must be declared
|
||||
when creating the aggregate. See <xref linkend="sql-createaggregate">
|
||||
for more detail.)
|
||||
</para>
|
||||
|
||||
<para>
|
||||
|
Reference in New Issue
Block a user