mirror of https://github.com/postgres/postgres.git synced 2025-11-06 07:49:08 +03:00

Implement GROUP BY DISTINCT

With grouping sets, it's possible that some of the grouping sets are
duplicates.  This is especially common with CUBE and ROLLUP clauses. For
example, GROUP BY CUBE (a,b), CUBE (a,c) is equivalent to

  GROUP BY GROUPING SETS (
    (a, b, c),
    (a, b, c),
    (a, b, c),
    (a, b),
    (a, b),
    (a, b),
    (a),
    (a),
    (a),
    (c, a),
    (c, a),
    (c, a),
    (c),
    (b, c),
    (b),
    ()
  )

Some of the grouping sets are calculated multiple times, which is mostly
unnecessary.  This commit implements a new GROUP BY DISTINCT feature, as
defined in the SQL standard, which eliminates the duplicate sets.
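
As an illustration only (this is not the planner code from the commit), the expansion and de-duplication can be sketched in Python. The helper names cube, combine and distinct are hypothetical; the sketch expands CUBE (a, b), CUBE (a, c), whose sixteen combined sets match the list above, and then drops the repeats the way GROUP BY DISTINCT is described to.

```python
from collections import Counter
from itertools import combinations, product


def cube(*cols):
    """Return every subset of cols (what CUBE generates), largest first."""
    return [frozenset(c)
            for r in range(len(cols), -1, -1)
            for c in combinations(cols, r)]


def combine(*clauses):
    """Cross-product the clauses, merging one set from each into a grouping set."""
    return [frozenset().union(*parts) for parts in product(*clauses)]


def distinct(sets):
    """Drop repeated grouping sets, keeping first-occurrence order."""
    return list(dict.fromkeys(sets))


expanded = combine(cube("a", "b"), cube("a", "c"))
counts = Counter(expanded)      # how often each grouping set occurs
deduped = distinct(expanded)    # the sets that remain under DISTINCT
```

Under these assumptions the sixteen combined sets collapse to eight distinct ones, so each remaining set would be computed once.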

Author: Vik Fearing
Reviewed-by: Erik Rijkers, Georgios Kokolatos, Tomas Vondra
Discussion: https://postgr.es/m/bf3805a8-d7d1-ae61-fece-761b7ff41ecc@postgresfriends.org
This commit is contained in:
Tomas Vondra
2021-03-18 17:45:38 +01:00
parent cd91de0d17
commit be45be9c33
18 changed files with 333 additions and 27 deletions


@@ -37,7 +37,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
[ * | <replaceable class="parameter">expression</replaceable> [ [ AS ] <replaceable class="parameter">output_name</replaceable> ] [, ...] ]
[ FROM <replaceable class="parameter">from_item</replaceable> [, ...] ]
[ WHERE <replaceable class="parameter">condition</replaceable> ]
-[ GROUP BY <replaceable class="parameter">grouping_element</replaceable> [, ...] ]
+[ GROUP BY [ ALL | DISTINCT ] <replaceable class="parameter">grouping_element</replaceable> [, ...] ]
[ HAVING <replaceable class="parameter">condition</replaceable> ]
[ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...] ]
[ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] <replaceable class="parameter">select</replaceable> ]
@@ -778,7 +778,7 @@ WHERE <replaceable class="parameter">condition</replaceable>
<para>
The optional <literal>GROUP BY</literal> clause has the general form
<synopsis>
-GROUP BY <replaceable class="parameter">grouping_element</replaceable> [, ...]
+GROUP BY [ ALL | DISTINCT ] <replaceable class="parameter">grouping_element</replaceable> [, ...]
</synopsis>
</para>
@@ -802,7 +802,10 @@ GROUP BY <replaceable class="parameter">grouping_element</replaceable> [, ...]
independent <replaceable>grouping sets</replaceable>. The effect of this is
equivalent to constructing a <literal>UNION ALL</literal> between
subqueries with the individual grouping sets as their
-<literal>GROUP BY</literal> clauses. For further details on the handling
+<literal>GROUP BY</literal> clauses. The optional <literal>DISTINCT</literal>
+clause removes duplicate sets before processing; it does <emphasis>not</emphasis>
+transform the <literal>UNION ALL</literal> into a <literal>UNION DISTINCT</literal>.
+For further details on the handling
of grouping sets see <xref linkend="queries-grouping-sets"/>.
</para>
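
The distinction drawn in the added paragraph can be illustrated with a small Python simulation. This is a sketch under stated assumptions, not PostgreSQL's executor: the sample row, the count(*) aggregate and the helpers group_by and grouping_sets are all hypothetical. It models grouping sets as a UNION ALL of per-set results and shows that de-duplicating the sets can still leave identical output rows, which only a UNION DISTINCT over the rows would merge.

```python
from collections import Counter

# Hypothetical sample data: a single row whose b column is NULL (None).
rows = [{"a": 1, "b": None}]
COLUMNS = ("a", "b")


def group_by(rows, grouping_set):
    """Compute count(*) for one grouping set; ungrouped columns become None."""
    counts = Counter(
        tuple(r[c] if c in grouping_set else None for c in COLUMNS)
        for r in rows
    )
    return [key + (n,) for key, n in counts.items()]


def grouping_sets(rows, sets, distinct=False):
    """UNION ALL of the per-set results; distinct=True dedupes the sets only."""
    if distinct:
        sets = list(dict.fromkeys(sets))
    out = []
    for s in sets:
        out.extend(group_by(rows, s))
    return out


# The set (a) appears twice, as it might after expanding ROLLUP or CUBE.
sets = [frozenset("ab"), frozenset("a"), frozenset("a")]
all_rows = grouping_sets(rows, sets)                      # three result rows
distinct_rows = grouping_sets(rows, sets, distinct=True)  # two result rows
```

In this sketch the sets (a, b) and (a) differ, so both survive de-duplication, yet each yields the row (1, None, 1) because b is NULL in the data. The two identical rows stay in the output, since only the sets were de-duplicated.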