1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-30 11:03:19 +03:00

Add some basic support for window frame clauses to the window-functions

patch.  This includes the ability to force the frame to cover the whole
partition, and the ability to make the frame end exactly on the current row
rather than its last ORDER BY peer.  Supporting any more of the full SQL
frame-clause syntax will require nontrivial hacking on the window aggregate
code, so it'll have to wait for 8.5 or beyond.
This commit is contained in:
Tom Lane
2008-12-31 00:08:39 +00:00
parent 0fb9be7acf
commit 8e8854daa2
24 changed files with 757 additions and 307 deletions

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/advanced.sgml,v 1.55 2008/12/28 18:53:53 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/advanced.sgml,v 1.56 2008/12/31 00:08:32 tgl Exp $ -->
<chapter id="tutorial-advanced">
<title>Advanced Features</title>
@ -429,27 +429,27 @@ SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary
<para>
We already saw that <literal>ORDER BY</> can be omitted if the ordering
of rows is not important. It is also possible to omit <literal>PARTITION
BY</>, in which case the window function is computed over all rows of the
virtual table; that is, there is one partition containing all the rows.
BY</>, in which case there is just one partition containing all the rows.
</para>
<para>
There is another important concept associated with window functions:
for each row, there is a set of rows within its partition called its
<firstterm>window frame</>. When <literal>ORDER BY</> is omitted the
frame is always the same as the partition. If <literal>ORDER BY</> is
supplied, the frame consists of all rows from the start of the partition
up to the current row, plus any following rows that are equal to the
current row according to the <literal>ORDER BY</> clause.
<firstterm>window frame</>. Many (but not all) window functions act only
on the rows of the window frame, rather than of the whole partition.
By default, if <literal>ORDER BY</> is supplied then the frame consists of
all rows from the start of the partition up through the current row, plus
any following rows that are equal to the current row according to the
<literal>ORDER BY</> clause. When <literal>ORDER BY</> is omitted the
default frame consists of all rows in the partition.
<footnote>
<para>
The SQL standard includes options to define the window frame in
other ways, but this definition is the only one currently supported
by <productname>PostgreSQL</productname>.
There are options to define the window frame in other ways, but
this tutorial does not cover them. See
<xref linkend="syntax-window-functions"> for details.
</para>
</footnote>
Many window functions act only on the rows of the window frame, rather
than of the whole partition. Here is an example using <function>sum</>:
Here is an example using <function>sum</>:
</para>
<programlisting>
@ -550,6 +550,13 @@ SELECT sum(salary) OVER w, avg(salary) OVER w
WINDOW w AS (PARTITION BY depname ORDER BY salary DESC);
</programlisting>
</para>
<para>
More details about window functions can be found in
<xref linkend="syntax-window-functions">,
<xref linkend="queries-window">, and the
<xref linkend="sql-select" endterm="sql-select-title"> reference page.
</para>
</sect1>

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.464 2008/12/28 18:53:53 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.465 2008/12/31 00:08:33 tgl Exp $ -->
<chapter id="functions">
<title>Functions and Operators</title>
@ -10160,9 +10160,8 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<para>
<firstterm>Window functions</firstterm> provide the ability to perform
calculations across sets of rows that are related to the current query
row. For information about this feature see
<xref linkend="tutorial-window"> and
<xref linkend="syntax-window-functions">.
row. See <xref linkend="tutorial-window"> for an introduction to this
feature.
</para>
<para>
@ -10392,18 +10391,23 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<para>
Note that <function>first_value</>, <function>last_value</>, and
<function>nth_value</> consider only the rows within the <quote>window
frame</>, that is the rows from the start of the partition through the
last peer of the current row. This is particularly likely to give
unintuitive results for <function>last_value</>.
frame</>, which by default contains the rows from the start of the
partition through the last peer of the current row. This is
likely to give unhelpful results for <function>nth_value</> and
particularly <function>last_value</>. You can redefine the frame as
being the whole partition by adding <literal>ROWS BETWEEN UNBOUNDED
PRECEDING AND UNBOUNDED FOLLOWING</> to the <literal>OVER</> clause.
See <xref linkend="syntax-window-functions"> for more information.
</para>
<para>
When an aggregate function is used as a window function, it aggregates
over the rows within the current row's window frame. To obtain
aggregation over the whole partition, be sure to omit <literal>ORDER BY</>
from the window definition. An aggregate used with <literal>ORDER BY</>
produces a <quote>running sum</> type of behavior, which may or may not
be what's wanted.
aggregation over the whole partition, omit <literal>ORDER BY</> or use
<literal>ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING</>.
An aggregate used with <literal>ORDER BY</> and the default window frame
definition produces a <quote>running sum</> type of behavior, which may or
may not be what's wanted.
</para>
<note>
@ -10416,7 +10420,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported.
default <literal>FROM FIRST</> behavior is supported. (You can achieve
the result of <literal>FROM LAST</> by reversing the <literal>ORDER BY</>
ordering.)
</para>
</note>

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/queries.sgml,v 1.51 2008/12/28 18:53:54 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/queries.sgml,v 1.52 2008/12/31 00:08:35 tgl Exp $ -->
<chapter id="queries">
<title>Queries</title>
@ -993,7 +993,7 @@ SELECT product_id, p.name, (sum(s.units) * (p.price - p.cost)) AS profit
</para>
<para>
Currently, use of window functions always forces sorting, and so the
Currently, window functions always require presorted data, and so the
query output will be ordered according to one or another of the window
functions' <literal>PARTITION BY</>/<literal>ORDER BY</> clauses.
It is not recommendable to rely on this, however. Use an explicit

View File

@ -1,5 +1,5 @@
<!--
$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.114 2008/12/29 18:23:53 tgl Exp $
$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.115 2008/12/31 00:08:35 tgl Exp $
PostgreSQL documentation
-->
@ -583,7 +583,21 @@ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceabl
[ <replaceable class="parameter">existing_window_name</replaceable> ]
[ PARTITION BY <replaceable class="parameter">expression</replaceable> [, ...] ]
[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
[ <replaceable class="parameter">framing_clause</replaceable> ]
</synopsis>
</para>
<para>
If an <replaceable class="parameter">existing_window_name</replaceable>
is specified it must refer to an earlier entry in the <literal>WINDOW</>
list; the new window copies its partitioning clause from that entry,
as well as its ordering clause if any. In this case the new window cannot
specify its own <literal>PARTITION BY</> clause, and it can specify
<literal>ORDER BY</> only if the copied window does not have one.
The framing clause is never copied from the existing window.
</para>
<para>
The elements of the <literal>PARTITION BY</> list are interpreted in
the same fashion as elements of a
<xref linkend="sql-groupby" endterm="sql-groupby-title">, and
@ -597,12 +611,29 @@ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceabl
</para>
<para>
If an <replaceable class="parameter">existing_window_name</replaceable>
is specified it must refer to an earlier entry in the <literal>WINDOW</>
list; the new window copies its partitioning clause from that entry,
as well as its ordering clause if any. In this case the new window cannot
specify its own <literal>PARTITION BY</> clause, and it can specify
<literal>ORDER BY</> only if the copied window does not have one.
The optional <replaceable class="parameter">framing_clause</> defines
the <firstterm>window frame</> for window functions that depend on the
frame (not all do). It can be one of
<synopsis>
RANGE UNBOUNDED PRECEDING
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
ROWS UNBOUNDED PRECEDING
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
</synopsis>
The first two are equivalent and are also the default: they set the
frame to be all rows from the partition start up through the current row's
last peer in the <literal>ORDER BY</> ordering (which means all rows if
there is no <literal>ORDER BY</>). The options
<literal>RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING</> and
<literal>ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING</>
are also equivalent: they always select all rows in the partition.
Lastly, <literal>ROWS UNBOUNDED PRECEDING</> or its verbose equivalent
<literal>ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW</> select
all rows up through the current row (regardless of duplicates).
Beware that this option can produce implementation-dependent results
if the <literal>ORDER BY</> ordering does not order the rows uniquely.
</para>
<para>
@ -623,8 +654,9 @@ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceabl
<para>
Window functions are described in detail in
<xref linkend="tutorial-window"> and
<xref linkend="syntax-window-functions">.
<xref linkend="tutorial-window">,
<xref linkend="syntax-window-functions">, and
<xref linkend="queries-window">.
</para>
</refsect2>
@ -1453,12 +1485,10 @@ SELECT distributors.* WHERE distributors.name = 'Westward';
<title><literal>WINDOW</literal> Clause Restrictions</title>
<para>
The SQL standard provides for an optional <quote>framing clause</>,
introduced by the key word <literal>RANGE</> or <literal>ROWS</>,
in window definitions. <productname>PostgreSQL</productname> does
not yet implement framing clauses, and always follows the
default framing behavior, which is equivalent to the framing clause
<literal>RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW</>.
The SQL standard provides additional options for the window
<replaceable class="parameter">framing_clause</>.
<productname>PostgreSQL</productname> currently supports only the
options listed above.
</para>
</refsect2>

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.127 2008/12/28 18:53:54 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.128 2008/12/31 00:08:35 tgl Exp $ -->
<chapter id="sql-syntax">
<title>SQL Syntax</title>
@ -1591,6 +1591,17 @@ sqrt(2)
[ <replaceable class="parameter">window_name</replaceable> ]
[ PARTITION BY <replaceable class="parameter">expression</replaceable> [, ...] ]
[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
[ <replaceable class="parameter">framing_clause</replaceable> ]
</synopsis>
and the optional <replaceable class="parameter">framing_clause</replaceable>
can be one of
<synopsis>
RANGE UNBOUNDED PRECEDING
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
ROWS UNBOUNDED PRECEDING
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
</synopsis>
Here, <replaceable>expression</replaceable> represents any value
@ -1603,19 +1614,33 @@ sqrt(2)
Named window specifications are usually referenced with just
<literal>OVER</> <replaceable>window_name</replaceable>, but it is
also possible to write a window name inside the parentheses and then
optionally override its ordering clause with <literal>ORDER BY</>.
optionally override its ordering clause and/or framing clause.
This latter syntax follows the same rules as modifying an existing
window name within the <literal>WINDOW</literal> clause; see the
<xref linkend="sql-select" endterm="sql-select-title"> reference
page for details.
</para>
<para>
The default framing option is <literal>RANGE UNBOUNDED PRECEDING</>,
which is the same as <literal>RANGE BETWEEN UNBOUNDED PRECEDING AND
CURRENT ROW</>; it selects rows up through the current row's last
peer in the <literal>ORDER BY</> ordering (which means all rows if
there is no <literal>ORDER BY</>). The options
<literal>RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING</> and
<literal>ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING</>
are also equivalent: they always select all rows in the partition.
Lastly, <literal>ROWS UNBOUNDED PRECEDING</> or its verbose equivalent
<literal>ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW</> select
all rows up through the current row (regardless of duplicates).
Beware that this option can produce implementation-dependent results
if the <literal>ORDER BY</> ordering does not order the rows uniquely.
</para>
<para>
The built-in window functions are described in <xref
linkend="functions-window-table">. Also, any built-in or
user-defined aggregate function can be used as a window function.
Currently, there is no provision for user-defined window functions
other than aggregates.
</para>
<para>