Allow aggregate functions to be VARIADIC.

There's no inherent reason why an aggregate function can't be variadic (even VARIADIC ANY) if its transition function can handle the case. Indeed, this patch to add the feature touches none of the planner or executor, and little of the parser; the main missing stuff was DDL and pg_dump support. It is true that variadic aggregates can create the same sort of ambiguity about parameters versus ORDER BY keys that was complained of when we (briefly) had both one- and two-argument forms of string_agg(). However, the policy formed in response to that discussion only said that we'd not create any built-in aggregates with varying numbers of arguments, not that we shouldn't allow users to do it. So the logical extension of that is we can allow users to make variadic aggregates as long as we're wary about shipping any such in core. In passing, this patch allows aggregate function arguments to be named, to the extent of remembering the names in pg_proc and dumping them in pg_dump. You can't yet call an aggregate using named-parameter notation. That seems like a likely future extension, but it'll take some work, and it's not what this patch is really about. Likewise, there's still some work needed to make window functions handle VARIADIC fully, but I left that for another day. initdb forced because of new aggvariadic field in Aggref parse nodes.
2025-07-30 11:03:19 +03:00 · 2013-09-03 17:08:38 -04:00
parent 8b290f3115
commit 0d3f4406df
35 changed files with 448 additions and 116 deletions
--- a/doc/src/sgml/ref/alter_aggregate.sgml
+++ b/doc/src/sgml/ref/alter_aggregate.sgml
@ -21,9 +21,12 @@ PostgreSQL documentation

 <refsynopsisdiv>
 <synopsis>
-ALTER AGGREGATE <replaceable>name</replaceable> ( <replaceable>argtype</replaceable> [ , ... ] ) RENAME TO <replaceable>new_name</replaceable>
-ALTER AGGREGATE <replaceable>name</replaceable> ( <replaceable>argtype</replaceable> [ , ... ] ) OWNER TO <replaceable>new_owner</replaceable>
-ALTER AGGREGATE <replaceable>name</replaceable> ( <replaceable>argtype</replaceable> [ , ... ] ) SET SCHEMA <replaceable>new_schema</replaceable>
+ALTER AGGREGATE <replaceable>name</replaceable> ( [ <replaceable>argmode</replaceable> ] [ <replaceable>arg_name</replaceable> ] <replaceable>arg_data_type</replaceable> [ , ... ] )
+  RENAME TO <replaceable>new_name</replaceable>
+ALTER AGGREGATE <replaceable>name</replaceable> ( [ <replaceable>argmode</replaceable> ] [ <replaceable>arg_name</replaceable> ] <replaceable>arg_data_type</replaceable> [ , ... ] )
+  OWNER TO <replaceable>new_owner</replaceable>
+ALTER AGGREGATE <replaceable>name</replaceable> ( [ <replaceable>argmode</replaceable> ] [ <replaceable>arg_name</replaceable> ] <replaceable>arg_data_type</replaceable> [ , ... ] )
+  SET SCHEMA <replaceable>new_schema</replaceable>
 </synopsis>
 </refsynopsisdiv>

@ -62,12 +65,36 @@ ALTER AGGREGATE <replaceable>name</replaceable> ( <replaceable>argtype</replacea
   </varlistentry>

   <varlistentry>
-    <term><replaceable class="parameter">argtype</replaceable></term>
+    <term><replaceable class="parameter">argmode</replaceable></term>
+
+    <listitem>
+     <para>
+      The mode of an argument: <literal>IN</> or <literal>VARIADIC</>.
+      If omitted, the default is <literal>IN</>.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><replaceable class="parameter">arg_name</replaceable></term>
+
+    <listitem>
+     <para>
+      The name of an argument.
+      Note that <command>ALTER AGGREGATE</command> does not actually pay
+      any attention to argument names, since only the argument data
+      types are needed to determine the aggregate function's identity.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><replaceable class="parameter">arg_data_type</replaceable></term>
    <listitem>
     <para>
      An input data type on which the aggregate function operates.
      To reference a zero-argument aggregate function, write <literal>*</>
-      in place of the list of input data types.
+      in place of the list of argument specifications.
     </para>
    </listitem>
   </varlistentry>
--- a/doc/src/sgml/ref/alter_extension.sgml
+++ b/doc/src/sgml/ref/alter_extension.sgml
@ -30,7 +30,7 @@ ALTER EXTENSION <replaceable class="PARAMETER">name</replaceable> DROP <replacea

 <phrase>where <replaceable class="PARAMETER">member_object</replaceable> is:</phrase>

-  AGGREGATE <replaceable class="PARAMETER">agg_name</replaceable> (<replaceable class="PARAMETER">agg_type</replaceable> [, ...] ) |
+  AGGREGATE <replaceable class="PARAMETER">agg_name</replaceable> ( [ <replaceable class="parameter">argmode</replaceable> ] [ <replaceable class="parameter">argname</replaceable> ] <replaceable class="parameter">agg_type</replaceable> [, ...] ) |
  CAST (<replaceable>source_type</replaceable> AS <replaceable>target_type</replaceable>) |
  COLLATION <replaceable class="PARAMETER">object_name</replaceable> |
  CONVERSION <replaceable class="PARAMETER">object_name</replaceable> |
@ -179,7 +179,7 @@ ALTER EXTENSION <replaceable class="PARAMETER">name</replaceable> DROP <replacea
      <para>
       An input data type on which the aggregate function operates.
       To reference a zero-argument aggregate function, write <literal>*</>
-       in place of the list of input data types.
+       in place of the list of argument specifications.
      </para>
     </listitem>
    </varlistentry>
--- a/doc/src/sgml/ref/comment.sgml
+++ b/doc/src/sgml/ref/comment.sgml
@ -23,7 +23,7 @@ PostgreSQL documentation
 <synopsis>
 COMMENT ON
 {
-  AGGREGATE <replaceable class="PARAMETER">agg_name</replaceable> (<replaceable class="PARAMETER">agg_type</replaceable> [, ...] ) |
+  AGGREGATE <replaceable class="PARAMETER">agg_name</replaceable> ( [ <replaceable class="parameter">argmode</replaceable> ] [ <replaceable class="parameter">argname</replaceable> ] <replaceable class="parameter">agg_type</replaceable> [, ...] ) |
  CAST (<replaceable>source_type</replaceable> AS <replaceable>target_type</replaceable>) |
  COLLATION <replaceable class="PARAMETER">object_name</replaceable> |
  COLUMN <replaceable class="PARAMETER">relation_name</replaceable>.<replaceable class="PARAMETER">column_name</replaceable> |
@ -126,7 +126,7 @@ COMMENT ON
     <para>
      An input data type on which the aggregate function operates.
      To reference a zero-argument aggregate function, write <literal>*</>
-      in place of the list of input data types.
+      in place of the list of argument specifications.
     </para>
    </listitem>
   </varlistentry>
@ -156,7 +156,7 @@ COMMENT ON
      The mode of a function argument: <literal>IN</>, <literal>OUT</>,
      <literal>INOUT</>, or <literal>VARIADIC</>.
      If omitted, the default is <literal>IN</>.
-      Note that <command>COMMENT ON FUNCTION</command> does not actually pay
+      Note that <command>COMMENT</command> does not actually pay
      any attention to <literal>OUT</> arguments, since only the input
      arguments are needed to determine the function's identity.
      So it is sufficient to list the <literal>IN</>, <literal>INOUT</>,
@ -170,7 +170,7 @@ COMMENT ON
    <listitem>
     <para>
      The name of a function argument.
-      Note that <command>COMMENT ON FUNCTION</command> does not actually pay
+      Note that <command>COMMENT</command> does not actually pay
      any attention to argument names, since only the argument data
      types are needed to determine the function's identity.
     </para>
--- a/doc/src/sgml/ref/create_aggregate.sgml
+++ b/doc/src/sgml/ref/create_aggregate.sgml
@ -21,7 +21,7 @@ PostgreSQL documentation

 <refsynopsisdiv>
 <synopsis>
-CREATE AGGREGATE <replaceable class="PARAMETER">name</replaceable> ( <replaceable class="PARAMETER">input_data_type</replaceable> [ , ... ] ) (
+CREATE AGGREGATE <replaceable class="parameter">name</replaceable> ( [ <replaceable class="parameter">argmode</replaceable> ] [ <replaceable class="parameter">arg_name</replaceable> ] <replaceable class="parameter">arg_data_type</replaceable> [ , ... ] ) (
    SFUNC = <replaceable class="PARAMETER">sfunc</replaceable>,
    STYPE = <replaceable class="PARAMETER">state_data_type</replaceable>
    [ , FINALFUNC = <replaceable class="PARAMETER">ffunc</replaceable> ]
@ -118,7 +118,7 @@ CREATE AGGREGATE <replaceable class="PARAMETER">name</replaceable> (
   Note that this behavior is only available when
   <replaceable class="PARAMETER">state_data_type</replaceable>
   is the same as the first
-   <replaceable class="PARAMETER">input_data_type</replaceable>.
+   <replaceable class="PARAMETER">arg_data_type</replaceable>.
   When these types are different, you must supply a nonnull initial
   condition or use a nonstrict transition function.
  </para>
@ -187,12 +187,36 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
   </varlistentry>

   <varlistentry>
-    <term><replaceable class="PARAMETER">input_data_type</replaceable></term>
+    <term><replaceable class="parameter">argmode</replaceable></term>
+
+    <listitem>
+     <para>
+      The mode of an argument: <literal>IN</> or <literal>VARIADIC</>.
+      (Aggregate functions do not support <literal>OUT</> arguments.)
+      If omitted, the default is <literal>IN</>.  Only the last argument
+      can be marked <literal>VARIADIC</>.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><replaceable class="parameter">arg_name</replaceable></term>
+
+    <listitem>
+     <para>
+      The name of an argument.  This is currently only useful for
+      documentation purposes.  If omitted, the argument has no name.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><replaceable class="PARAMETER">arg_data_type</replaceable></term>
    <listitem>
     <para>
      An input data type on which this aggregate function operates.
      To create a zero-argument aggregate function, write <literal>*</>
-      in place of the list of input data types.  (An example of such an
+      in place of the list of argument specifications.  (An example of such an
      aggregate is <function>count(*)</function>.)
     </para>
    </listitem>
@ -205,8 +229,8 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
      In the old syntax for <command>CREATE AGGREGATE</>, the input data type
      is specified by a <literal>basetype</> parameter rather than being
      written next to the aggregate name.  Note that this syntax allows
-      only one input parameter.  To define a zero-argument aggregate function,
-      specify the <literal>basetype</> as
+      only one input parameter.  To define a zero-argument aggregate function
+      with this syntax, specify the <literal>basetype</> as
      <literal>"ANY"</> (not <literal>*</>).
     </para>
    </listitem>
--- a/doc/src/sgml/ref/drop_aggregate.sgml
+++ b/doc/src/sgml/ref/drop_aggregate.sgml
@ -21,7 +21,9 @@ PostgreSQL documentation

 <refsynopsisdiv>
 <synopsis>
-DROP AGGREGATE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable> ( <replaceable class="PARAMETER">argtype</replaceable> [ , ... ] ) [ CASCADE | RESTRICT ]
+DROP AGGREGATE [ IF EXISTS ]
+  <replaceable class="parameter">name</replaceable> ( [ <replaceable class="parameter">argmode</replaceable> ] [ <replaceable class="parameter">arg_name</replaceable> ] <replaceable class="parameter">arg_data_type</replaceable> [ , ... ] )
+  [ CASCADE | RESTRICT ]
 </synopsis>
 </refsynopsisdiv>

@ -60,12 +62,36 @@ DROP AGGREGATE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable> (
   </varlistentry>

   <varlistentry>
-    <term><replaceable class="parameter">argtype</replaceable></term>
+    <term><replaceable class="parameter">argmode</replaceable></term>
+
+    <listitem>
+     <para>
+      The mode of an argument: <literal>IN</> or <literal>VARIADIC</>.
+      If omitted, the default is <literal>IN</>.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><replaceable class="parameter">arg_name</replaceable></term>
+
+    <listitem>
+     <para>
+      The name of an argument.
+      Note that <command>DROP AGGREGATE</command> does not actually pay
+      any attention to argument names, since only the argument data
+      types are needed to determine the aggregate function's identity.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><replaceable class="parameter">arg_data_type</replaceable></term>
    <listitem>
     <para>
      An input data type on which the aggregate function operates.
      To reference a zero-argument aggregate function, write <literal>*</>
-      in place of the list of input data types.
+      in place of the list of argument specifications.
     </para>
    </listitem>
   </varlistentry>
--- a/doc/src/sgml/ref/security_label.sgml
+++ b/doc/src/sgml/ref/security_label.sgml
@ -25,7 +25,7 @@ SECURITY LABEL [ FOR <replaceable class="PARAMETER">provider</replaceable> ] ON
 {
  TABLE <replaceable class="PARAMETER">object_name</replaceable> |
  COLUMN <replaceable class="PARAMETER">table_name</replaceable>.<replaceable class="PARAMETER">column_name</replaceable> |
-  AGGREGATE <replaceable class="PARAMETER">agg_name</replaceable> (<replaceable class="PARAMETER">agg_type</replaceable> [, ...] ) |
+  AGGREGATE <replaceable class="PARAMETER">agg_name</replaceable> ( [ <replaceable class="parameter">argmode</replaceable> ] [ <replaceable class="parameter">argname</replaceable> ] <replaceable class="parameter">agg_type</replaceable> [, ...] ) |
  DATABASE <replaceable class="PARAMETER">object_name</replaceable> |
  DOMAIN <replaceable class="PARAMETER">object_name</replaceable> |
  EVENT TRIGGER <replaceable class="PARAMETER">object_name</replaceable> |
@ -107,12 +107,12 @@ SECURITY LABEL [ FOR <replaceable class="PARAMETER">provider</replaceable> ] ON
   </varlistentry>

   <varlistentry>
-    <term><replaceable class="parameter">arg_type</replaceable></term>
+    <term><replaceable class="parameter">agg_type</replaceable></term>
    <listitem>
     <para>
      An input data type on which the aggregate function operates.
      To reference a zero-argument aggregate function, write <literal>*</>
-      in place of the list of input data types.
+      in place of the list of argument specifications.
     </para>
    </listitem>
   </varlistentry>
@ -125,7 +125,7 @@ SECURITY LABEL [ FOR <replaceable class="PARAMETER">provider</replaceable> ] ON
      The mode of a function argument: <literal>IN</>, <literal>OUT</>,
      <literal>INOUT</>, or <literal>VARIADIC</>.
      If omitted, the default is <literal>IN</>.
-      Note that <command>SECURITY LABEL ON FUNCTION</command> does not actually
+      Note that <command>SECURITY LABEL</command> does not actually
      pay any attention to <literal>OUT</> arguments, since only the input
      arguments are needed to determine the function's identity.
      So it is sufficient to list the <literal>IN</>, <literal>INOUT</>,
@ -140,7 +140,7 @@ SECURITY LABEL [ FOR <replaceable class="PARAMETER">provider</replaceable> ] ON
    <listitem>
     <para>
      The name of a function argument.
-      Note that <command>SECURITY LABEL ON FUNCTION</command> does not actually
+      Note that <command>SECURITY LABEL</command> does not actually
      pay any attention to argument names, since only the argument data
      types are needed to determine the function's identity.
     </para>
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@ -2524,6 +2524,13 @@ SELECT concat_lower_or_upper('Hello', 'World', uppercase := true);
    having numerous parameters that have default values, named or mixed
    notation can save a great deal of writing and reduce chances for error.
   </para>
+
+   <note>
+    <para>
+     Named and mixed call notations can currently be used only with regular
+     functions, not with aggregate functions or window functions.
+    </para>
+   </note>
  </sect2>
 </sect1>

--- a/doc/src/sgml/xaggr.sgml
+++ b/doc/src/sgml/xaggr.sgml
@ -169,6 +169,42 @@ SELECT attrelid::regclass, array_accum(atttypid::regtype)
 </programlisting>
  </para>

+  <para>
+   An aggregate function can be made to accept a varying number of arguments
+   by declaring its last argument as a <literal>VARIADIC</> array, in much
+   the same fashion as for regular functions; see
+   <xref linkend="xfunc-sql-variadic-functions">.  The aggregate's transition
+   function must have the same array type as its last argument.  The
+   transition function typically would also be marked <literal>VARIADIC</>,
+   but this is not strictly required.
+  </para>
+
+  <note>
+   <para>
+    Variadic aggregates are easily misused in connection with
+    the <literal>ORDER BY</> option (see <xref linkend="syntax-aggregates">),
+    since the parser cannot tell whether the wrong number of actual arguments
+    have been given in such a combination.  Keep in mind that everything to
+    the right of <literal>ORDER BY</> is a sort key, not an argument to the
+    aggregate.  For example, in
+<programlisting>
+SELECT myaggregate(a ORDER BY a, b, c) FROM ...
+</programlisting>
+    the parser will see this as a single aggregate function argument and
+    three sort keys.  However, the user might have intended
+<programlisting>
+SELECT myaggregate(a, b, c ORDER BY a) FROM ...
+</programlisting>
+    If <literal>myaggregate</> is variadic, both these calls could be
+    perfectly valid.
+   </para>
+
+   <para>
+    For the same reason, it's wise to think twice before creating aggregate
+    functions with the same names and different numbers of regular arguments.
+   </para>
+  </note>
+
  <para>
   A function written in C can detect that it is being called as an
   aggregate transition or final function by calling