1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-27 12:41:57 +03:00

Support window functions a la SQL:2008.

Hitoshi Harada, with some kibitzing from Heikki and Tom.
This commit is contained in:
Tom Lane
2008-12-28 18:54:01 +00:00
parent 38e9348282
commit 95b07bc7f5
92 changed files with 6720 additions and 321 deletions

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/advanced.sgml,v 1.54 2007/02/01 00:28:16 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/advanced.sgml,v 1.55 2008/12/28 18:53:53 tgl Exp $ -->
<chapter id="tutorial-advanced">
<title>Advanced Features</title>
@ -240,7 +240,7 @@ COMMIT;
<para>
<productname>PostgreSQL</> actually treats every SQL statement as being
executed within a transaction. If you do not issue a <command>BEGIN</>
command,
command,
then each individual statement has an implicit <command>BEGIN</> and
(if successful) <command>COMMIT</> wrapped around it. A group of
statements surrounded by <command>BEGIN</> and <command>COMMIT</>
@ -265,7 +265,7 @@ COMMIT;
with <command>ROLLBACK TO</>. All the transaction's database changes
between defining the savepoint and rolling back to it are discarded, but
changes earlier than the savepoint are kept.
</para>
</para>
<para>
After rolling back to a savepoint, it continues to be defined, so you can
@ -274,7 +274,7 @@ COMMIT;
system can free some resources. Keep in mind that either releasing or
rolling back to a savepoint
will automatically release all savepoints that were defined after it.
</para>
</para>
<para>
All this is happening within the transaction block, so none of it
@ -282,7 +282,7 @@ COMMIT;
transaction block, the committed actions become visible as a unit
to other sessions, while the rolled-back actions never become visible
at all.
</para>
</para>
<para>
Remembering the bank database, suppose we debit $100.00 from Alice's
@ -317,6 +317,242 @@ COMMIT;
</sect1>
<sect1 id="tutorial-window">
<title id="tutorial-window-title">Window Functions</title>
<indexterm zone="tutorial-window">
<primary>window function</primary>
</indexterm>
<para>
A <firstterm>window function</> performs a calculation across a set of
table rows that are somehow related to the current row. This is comparable
to the type of calculation that can be done with an aggregate function.
But unlike regular aggregate functions, use of a window function does not
cause rows to become grouped into a single output row &mdash; the
rows retain their separate identities. Behind the scenes, the window
function is able to access more than just the current row of the query
result.
</para>
<para>
Here is an example that shows how to compare each employee's salary
with the average salary in his or her department:
<programlisting>
SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname) FROM empsalary;
</programlisting>
<screen>
depname | empno | salary | avg
-----------+-------+--------+-----------------------
develop | 11 | 5200 | 5020.0000000000000000
develop | 7 | 4200 | 5020.0000000000000000
develop | 9 | 4500 | 5020.0000000000000000
develop | 8 | 6000 | 5020.0000000000000000
develop | 10 | 5200 | 5020.0000000000000000
personnel | 5 | 3500 | 3700.0000000000000000
personnel | 2 | 3900 | 3700.0000000000000000
sales | 3 | 4800 | 4866.6666666666666667
sales | 1 | 5000 | 4866.6666666666666667
sales | 4 | 4800 | 4866.6666666666666667
(10 rows)
</screen>
The first three output columns come directly from the table
<structname>empsalary</>, and there is one output row for each row in the
table. The fourth column represents an average taken across all the table
rows that have the same <structfield>depname</> value as the current row.
(This actually is the same function as the regular <function>avg</>
aggregate function, but the <literal>OVER</> clause causes it to be
treated as a window function and computed across an appropriate set of
rows.)
</para>
<para>
A window function call always contains an <literal>OVER</> clause
following the window function's name and argument(s). This is what
syntactically distinguishes it from a regular function or aggregate
function. The <literal>OVER</> clause determines exactly how the
rows of the query are split up for processing by the window function.
The <literal>PARTITION BY</> list within <literal>OVER</> specifies
dividing the rows into groups, or partitions, that share the same
values of the <literal>PARTITION BY</> expression(s). For each row,
the window function is computed across the rows that fall into the
same partition as the current row.
</para>
<para>
Although <function>avg</> will produce the same result no matter
what order it processes the partition's rows in, this is not true of all
window functions. When needed, you can control that order using
<literal>ORDER BY</> within <literal>OVER</>. Here is an example:
<programlisting>
SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary DESC) FROM empsalary;
</programlisting>
<screen>
depname | empno | salary | rank
-----------+-------+--------+------
develop | 8 | 6000 | 1
develop | 10 | 5200 | 2
develop | 11 | 5200 | 2
develop | 9 | 4500 | 4
develop | 7 | 4200 | 5
personnel | 2 | 3900 | 1
personnel | 5 | 3500 | 2
sales | 1 | 5000 | 1
sales | 4 | 4800 | 2
sales | 3 | 4800 | 2
(10 rows)
</screen>
As shown here, the <function>rank</> function produces a numerical rank
within the current row's partition for each distinct <literal>ORDER BY</>
value, in the order defined by the <literal>ORDER BY</> clause.
<function>rank</> needs no explicit parameter, because its behavior
is entirely determined by the <literal>OVER</> clause.
</para>
<para>
The rows considered by a window function are those of the <quote>virtual
table</> produced by the query's <literal>FROM</> clause as filtered by its
<literal>WHERE</>, <literal>GROUP BY</>, and <literal>HAVING</> clauses
if any. For example, a row removed because it does not meet the
<literal>WHERE</> condition is not seen by any window function.
A query can contain multiple window functions that slice up the data
in different ways by means of different <literal>OVER</> clauses, but
they all act on the same collection of rows defined by this virtual table.
</para>
<para>
We already saw that <literal>ORDER BY</> can be omitted if the ordering
of rows is not important. It is also possible to omit <literal>PARTITION
BY</>, in which case the window function is computed over all rows of the
virtual table; that is, there is one partition containing all the rows.
</para>
<para>
There is another important concept associated with window functions:
for each row, there is a set of rows within its partition called its
<firstterm>window frame</>. When <literal>ORDER BY</> is omitted the
frame is always the same as the partition. If <literal>ORDER BY</> is
supplied, the frame consists of all rows from the start of the partition
up to the current row, plus any following rows that are equal to the
current row according to the <literal>ORDER BY</> clause.
<footnote>
<para>
The SQL standard includes options to define the window frame in
other ways, but this definition is the only one currently supported
by <productname>PostgreSQL</productname>.
</para>
</footnote>
Many window functions act only on the rows of the window frame, rather
than of the whole partition. Here is an example using <function>sum</>:
</para>
<programlisting>
SELECT salary, sum(salary) OVER () FROM empsalary;
</programlisting>
<screen>
salary | sum
--------+-------
5200 | 47100
5000 | 47100
3500 | 47100
4800 | 47100
3900 | 47100
4200 | 47100
4500 | 47100
4800 | 47100
6000 | 47100
5200 | 47100
(10 rows)
</screen>
<para>
Above, since there is no <literal>ORDER BY</> in the <literal>OVER</>
clause, the window frame is the same as the partition, which for lack of
<literal>PARTITION BY</> is the whole table; in other words each sum is
taken over the whole table and so we get the same result for each output
row. But if we add an <literal>ORDER BY</> clause, we get very different
results:
</para>
<programlisting>
SELECT salary, sum(salary) OVER (ORDER BY salary) FROM empsalary;
</programlisting>
<screen>
salary | sum
--------+-------
3500 | 3500
3900 | 7400
4200 | 11600
4500 | 16100
4800 | 25700
4800 | 25700
5000 | 30700
5200 | 41100
5200 | 41100
6000 | 47100
(10 rows)
</screen>
<para>
Here the sum is taken from the first (lowest) salary up through the
current one, including any duplicates of the current one (notice the
results for the duplicated salaries).
</para>
<para>
Window functions are permitted only in the <literal>SELECT</literal> list
and the <literal>ORDER BY</> clause of the query. They are forbidden
elsewhere, such as in <literal>GROUP BY</>, <literal>HAVING</>
and <literal>WHERE</literal> clauses. This is because they logically
execute after the processing of those clauses. Also, window functions
execute after regular aggregate functions. This means it is valid to
include an aggregate function call in the arguments of a window function,
but not vice versa.
</para>
<para>
If there is a need to filter or group rows after the window calculations
are performed, you can use a sub-select. For example:
<programlisting>
SELECT depname, empno, salary, enroll_date
FROM
(SELECT depname, empno, salary, enroll_date,
rank() OVER (PARTITION BY depname ORDER BY salary DESC, empno) AS pos
FROM empsalary
) AS ss
WHERE pos < 3;
</programlisting>
The above query only shows the rows from the inner query having
<literal>rank</> less than <literal>3</>.
</para>
<para>
When a query involves multiple window functions, it is possible to write
out each one with a separate <literal>OVER</> clause, but this is
duplicative and error-prone if the same windowing behavior is wanted
for several functions. Instead, each windowing behavior can be named
in a <literal>WINDOW</> clause and then referenced in <literal>OVER</>.
For example:
<programlisting>
SELECT sum(salary) OVER w, avg(salary) OVER w
FROM empsalary
WINDOW w AS (PARTITION BY depname ORDER BY salary DESC);
</programlisting>
</para>
</sect1>
<sect1 id="tutorial-inheritance">
<title>Inheritance</title>
@ -391,7 +627,7 @@ CREATE TABLE capitals (
<para>
For example, the following query finds the names of all cities,
including state capitals, that are located at an altitude
including state capitals, that are located at an altitude
over 500 feet:
<programlisting>
@ -455,7 +691,7 @@ SELECT name, altitude
<sect1 id="tutorial-conclusion">
<title>Conclusion</title>
<para>
<productname>PostgreSQL</productname> has many features not
touched upon in this tutorial introduction, which has been

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/errcodes.sgml,v 1.25 2008/10/04 21:56:52 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/errcodes.sgml,v 1.26 2008/12/28 18:53:53 tgl Exp $ -->
<appendix id="errcodes-appendix">
<title><productname>PostgreSQL</productname> Error Codes</title>
@ -378,6 +378,18 @@
<entry>invalid_argument_for_logarithm</entry>
</row>
<row>
<entry><literal>22014</literal></entry>
<entry>INVALID ARGUMENT FOR NTILE FUNCTION</entry>
<entry>invalid_argument_for_ntile_function</entry>
</row>
<row>
<entry><literal>22016</literal></entry>
<entry>INVALID ARGUMENT FOR NTH_VALUE FUNCTION</entry>
<entry>invalid_argument_for_nth_value_function</entry>
</row>
<row>
<entry><literal>2201F</literal></entry>
<entry>INVALID ARGUMENT FOR POWER FUNCTION</entry>
@ -990,6 +1002,12 @@
<entry>grouping_error</entry>
</row>
<row>
<entry><literal>42P20</literal></entry>
<entry>WINDOWING ERROR</entry>
<entry>windowing_error</entry>
</row>
<row>
<entry><literal>42P19</literal></entry>
<entry>INVALID RECURSION</entry>

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.463 2008/12/19 16:25:16 petere Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.464 2008/12/28 18:53:53 tgl Exp $ -->
<chapter id="functions">
<title>Functions and Operators</title>
@ -10149,6 +10149,278 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
</sect1>
<sect1 id="functions-window">
<title>Window Functions</title>
<indexterm zone="functions-window">
<primary>window function</primary>
<secondary>built-in</secondary>
</indexterm>
<para>
<firstterm>Window functions</firstterm> provide the ability to perform
calculations across sets of rows that are related to the current query
row. For information about this feature see
<xref linkend="tutorial-window"> and
<xref linkend="syntax-window-functions">.
</para>
<para>
The built-in window functions are listed in
<xref linkend="functions-window-table">. Note that these functions
<emphasis>must</> be invoked using window function syntax; that is an
<literal>OVER</> clause is required.
</para>
<para>
In addition to these functions, any built-in or user-defined aggregate
function can be used as a window function (see
<xref linkend="functions-aggregate"> for a list of the built-in aggregates).
Aggregate functions act as window functions only when an <literal>OVER</>
clause follows the call; otherwise they act as regular aggregates.
</para>
<table id="functions-window-table">
<title>General-Purpose Window Functions</title>
<tgroup cols="3">
<thead>
<row>
<entry>Function</entry>
<entry>Return Type</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>
<indexterm>
<primary>row_number</primary>
</indexterm>
<function>row_number()</function>
</entry>
<entry>
<type>bigint</type>
</entry>
<entry>number of the current row within its partition, counting from 1</entry>
</row>
<row>
<entry>
<indexterm>
<primary>rank</primary>
</indexterm>
<function>rank()</function>
</entry>
<entry>
<type>bigint</type>
</entry>
<entry>rank of the current row with gaps; same as <function>row_number</> of its first peer</entry>
</row>
<row>
<entry>
<indexterm>
<primary>dense_rank</primary>
</indexterm>
<function>dense_rank()</function>
</entry>
<entry>
<type>bigint</type>
</entry>
<entry>rank of the current row without gaps; this function counts peer groups</entry>
</row>
<row>
<entry>
<indexterm>
<primary>percent_rank</primary>
</indexterm>
<function>percent_rank()</function>
</entry>
<entry>
<type>double precision</type>
</entry>
<entry>relative rank of the current row: (<function>rank</> - 1) / (total rows - 1)</entry>
</row>
<row>
<entry>
<indexterm>
<primary>cume_dist</primary>
</indexterm>
<function>cume_dist()</function>
</entry>
<entry>
<type>double precision</type>
</entry>
<entry>relative rank of the current row: (number of rows preceding or peer with current row) / (total rows)</entry>
</row>
<row>
<entry>
<indexterm>
<primary>ntile</primary>
</indexterm>
<function>ntile(<replaceable class="parameter">num_buckets</replaceable> <type>integer</>)</function>
</entry>
<entry>
<type>integer</type>
</entry>
<entry>integer ranging from 1 to the argument value, dividing the
partition as equally as possible</entry>
</row>
<row>
<entry>
<indexterm>
<primary>lag</primary>
</indexterm>
<function>
lag(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
</function>
</entry>
<entry>
<type>same type as <replaceable class="parameter">value</replaceable></type>
</entry>
<entry>
returns <replaceable class="parameter">value</replaceable> evaluated at
the row that is <replaceable class="parameter">offset</replaceable>
rows before the current row within the partition; if there is no such
row, instead return <replaceable class="parameter">default</replaceable>.
Both <replaceable class="parameter">offset</replaceable> and
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
<replaceable class="parameter">default</replaceable> to null
</entry>
</row>
<row>
<entry>
<indexterm>
<primary>lead</primary>
</indexterm>
<function>
lead(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
</function>
</entry>
<entry>
<type>same type as <replaceable class="parameter">value</replaceable></type>
</entry>
<entry>
returns <replaceable class="parameter">value</replaceable> evaluated at
the row that is <replaceable class="parameter">offset</replaceable>
rows after the current row within the partition; if there is no such
row, instead return <replaceable class="parameter">default</replaceable>.
Both <replaceable class="parameter">offset</replaceable> and
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
<replaceable class="parameter">default</replaceable> to null
</entry>
</row>
<row>
<entry>
<indexterm>
<primary>first_value</primary>
</indexterm>
<function>first_value(<replaceable class="parameter">value</replaceable> <type>any</>)</function>
</entry>
<entry>
<type>same type as <replaceable class="parameter">value</replaceable></type>
</entry>
<entry>
returns <replaceable class="parameter">value</replaceable> evaluated
at the row that is the first row of the window frame
</entry>
</row>
<row>
<entry>
<indexterm>
<primary>last_value</primary>
</indexterm>
<function>last_value(<replaceable class="parameter">value</replaceable> <type>any</>)</function>
</entry>
<entry>
<type>same type as <replaceable class="parameter">value</replaceable></type>
</entry>
<entry>
returns <replaceable class="parameter">value</replaceable> evaluated
at the row that is the last row of the window frame
</entry>
</row>
<row>
<entry>
<indexterm>
<primary>nth_value</primary>
</indexterm>
<function>
nth_value(<replaceable class="parameter">value</replaceable> <type>any</>, <replaceable class="parameter">nth</replaceable> <type>integer</>)
</function>
</entry>
<entry>
<type>same type as <replaceable class="parameter">value</replaceable></type>
</entry>
<entry>
returns <replaceable class="parameter">value</replaceable> evaluated
at the row that is the <replaceable class="parameter">nth</replaceable>
row of the window frame (counting from 1); null if no such row
</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
All of the functions listed in
<xref linkend="functions-window-table"> depend on the sort ordering
specified by the <literal>ORDER BY</> clause of the associated window
definition. Rows that are not distinct in the <literal>ORDER BY</>
ordering are said to be <firstterm>peers</>; the four ranking functions
are defined so that they give the same answer for any two peer rows.
</para>
<para>
Note that <function>first_value</>, <function>last_value</>, and
<function>nth_value</> consider only the rows within the <quote>window
frame</>, that is the rows from the start of the partition through the
last peer of the current row. This is particularly likely to give
unintuitive results for <function>last_value</>.
</para>
<para>
When an aggregate function is used as a window function, it aggregates
over the rows within the current row's window frame. To obtain
aggregation over the whole partition, be sure to omit <literal>ORDER BY</>
from the window definition. An aggregate used with <literal>ORDER BY</>
produces a <quote>running sum</> type of behavior, which may or may not
be what's wanted.
</para>
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
<literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
<function>first_value</>, <function>last_value</>, and
<function>nth_value</>. This is not implemented in
<productname>PostgreSQL</productname>: the behavior is always the
same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported.
</para>
</note>
</sect1>
<sect1 id="functions-subquery">
<title>Subquery Expressions</title>

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/queries.sgml,v 1.50 2008/10/14 00:41:34 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/queries.sgml,v 1.51 2008/12/28 18:53:54 tgl Exp $ -->
<chapter id="queries">
<title>Queries</title>
@ -949,6 +949,57 @@ SELECT product_id, p.name, (sum(s.units) * (p.price - p.cost)) AS profit
5000. Note that the aggregate expressions do not necessarily need
to be the same in all parts of the query.
</para>
<para>
If a query contains aggregate function calls, but no <literal>GROUP BY</>
clause, grouping still occurs: the result is a single group row (or
perhaps no rows at all, if the single row is then eliminated by
<literal>HAVING</>).
The same is true if it contains a <literal>HAVING</> clause, even
without any aggregate function calls or <literal>GROUP BY</> clause.
</para>
</sect2>
<sect2 id="queries-window">
<title>Window Function Processing</>
<indexterm zone="queries-window">
<primary>window function</primary>
<secondary>order of execution</>
</indexterm>
<para>
If the query contains any window functions (see
<xref linkend="tutorial-window"> and
<xref linkend="syntax-window-functions">), these functions are evaluated
after any grouping, aggregation, and <literal>HAVING</> filtering is
performed. That is, if the query uses any aggregates, <literal>GROUP
BY</>, or <literal>HAVING</>, then the rows seen by the window functions
are the group rows instead of the original table rows from
<literal>FROM</>/<literal>WHERE</>.
</para>
<para>
When multiple window functions are used, all the window functions having
syntactically equivalent <literal>PARTITION BY</> and <literal>ORDER BY</>
clauses in their window definitions are guaranteed to be evaluated in a
single pass over the data. Therefore they will see the same sort ordering,
even if the <literal>ORDER BY</> does not uniquely determine an ordering.
However, no guarantees are made about the evaluation of functions having
different <literal>PARTITION BY</> or <literal>ORDER BY</> specifications.
(In such cases a sort step is typically required between the passes of
window function evaluations, and the sort is not guaranteed to preserve
ordering of rows that its <literal>ORDER BY</> sees as equivalent.)
</para>
<para>
Currently, use of window functions always forces sorting, and so the
query output will be ordered according to one or another of the window
functions' <literal>PARTITION BY</>/<literal>ORDER BY</> clauses.
It is not recommendable to rely on this, however. Use an explicit
top-level <literal>ORDER BY</> clause if you want to be sure the
results are sorted in a particular way.
</para>
</sect2>
</sect1>

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/query.sgml,v 1.50 2007/02/01 00:28:17 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/query.sgml,v 1.51 2008/12/28 18:53:54 tgl Exp $ -->
<chapter id="tutorial-sql">
<title>The <acronym>SQL</acronym> Language</title>
@ -621,7 +621,7 @@ SELECT W1.city, W1.temp_lo AS low, W1.temp_hi AS high,
San Francisco | 43 | 57 | San Francisco | 46 | 50
Hayward | 37 | 54 | San Francisco | 46 | 50
(2 rows)
</programlisting>
</programlisting>
Here we have relabeled the weather table as <literal>W1</> and
<literal>W2</> to be able to distinguish the left and right side
@ -651,9 +651,9 @@ SELECT *
<indexterm><primary>min</primary></indexterm>
<indexterm><primary>sum</primary></indexterm>
Like most other relational database products,
Like most other relational database products,
<productname>PostgreSQL</productname> supports
aggregate functions.
<firstterm>aggregate functions</>.
An aggregate function computes a single result from multiple input rows.
For example, there are aggregates to compute the
<function>count</function>, <function>sum</function>,
@ -815,7 +815,7 @@ SELECT city, max(temp_lo)
<para>
You can update existing rows using the
<command>UPDATE</command> command.
<command>UPDATE</command> command.
Suppose you discover the temperature readings are
all off by 2 degrees after November 28. You can correct the
data as follows:

View File

@ -1,5 +1,5 @@
<!--
$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.112 2008/12/01 09:38:08 petere Exp $
$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.113 2008/12/28 18:53:54 tgl Exp $
PostgreSQL documentation
-->
@ -39,6 +39,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
[ WHERE <replaceable class="parameter">condition</replaceable> ]
[ GROUP BY <replaceable class="parameter">expression</replaceable> [, ...] ]
[ HAVING <replaceable class="parameter">condition</replaceable> [, ...] ]
[ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...] ]
[ { UNION | INTERSECT | EXCEPT } [ ALL ] <replaceable class="parameter">select</replaceable> ]
[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
[ LIMIT { <replaceable class="parameter">count</replaceable> | ALL } ]
@ -566,6 +567,67 @@ HAVING <replaceable class="parameter">condition</replaceable>
</para>
</refsect2>
<refsect2 id="SQL-WINDOW">
<title id="sql-window-title"><literal>WINDOW</literal> Clause</title>
<para>
The optional <literal>WINDOW</literal> clause has the general form
<synopsis>
WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...]
</synopsis>
where <replaceable class="parameter">window_name</replaceable> is
a name that can be referenced from subsequent window definitions or
<literal>OVER</> clauses, and
<replaceable class="parameter">window_definition</replaceable> is
<synopsis>
[ <replaceable class="parameter">existing_window_name</replaceable> ]
[ PARTITION BY <replaceable class="parameter">expression</replaceable> [, ...] ]
[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
</synopsis>
The elements of the <literal>PARTITION BY</> list are interpreted in
the same fashion as elements of a
<xref linkend="sql-groupby" endterm="sql-groupby-title">, and
the elements of the <literal>ORDER BY</> list are interpreted in the
same fashion as elements of an
<xref linkend="sql-orderby" endterm="sql-orderby-title">.
The only difference is that these expressions can contain aggregate
function calls, which are not allowed in a regular <literal>GROUP BY</>
clause. They are allowed here because windowing occurs after grouping
and aggregation.
</para>
<para>
If an <replaceable class="parameter">existing_window_name</replaceable>
is specified it must refer to an earlier entry in the <literal>WINDOW</>
list; the new window copies its partitioning clause from that entry,
as well as its ordering clause if any. In this case the new window cannot
specify its own <literal>PARTITION BY</> clause, and it can specify
<literal>ORDER BY</> only if the copied window does not have one.
</para>
<para>
The purpose of a <literal>WINDOW</literal> clause is to specify the
behavior of <firstterm>window functions</> appearing in the query's
<xref linkend="sql-select-list" endterm="sql-select-list-title"> or
<xref linkend="sql-orderby" endterm="sql-orderby-title">. These functions
can reference the <literal>WINDOW</literal> clause entries by name
in their <literal>OVER</> clauses. A <literal>WINDOW</literal> clause
entry does not have to be referenced anywhere, however; if it is not
used in the query it is simply ignored. It is possible to use window
functions without any <literal>WINDOW</literal> clause at all, since
a window function call can specify its window definition directly in
its <literal>OVER</> clause. However, the <literal>WINDOW</literal>
clause saves typing when the same window definition is needed for more
than one window function.
</para>
<para>
Window functions are described in detail in
<xref linkend="tutorial-window"> and
<xref linkend="syntax-window-functions">.
</para>
</refsect2>
<refsect2 id="sql-select-list">
<title id="sql-select-list-title"><command>SELECT</command> List</title>
@ -922,7 +984,7 @@ FETCH { FIRST | NEXT } [ <replaceable class="parameter">count</replaceable> ] {
constants for the offset or fetch count, parentheses will be
necessary in most cases. If the fetch count is omitted, it
defaults to 1.
</para>
</para>
<para>
When using <literal>LIMIT</>, it is a good idea to use an
@ -1387,6 +1449,19 @@ SELECT distributors.* WHERE distributors.name = 'Westward';
</para>
</refsect2>
<refsect2>
<title><literal>WINDOW</literal> Clause Restrictions</title>
<para>
The SQL standard provides for an optional <quote>framing clause</>,
introduced by the key word <literal>RANGE</> or <literal>ROWS</>,
in window definitions. <productname>PostgreSQL</productname> does
not yet implement framing clauses, and always follows the
default framing behavior, which is equivalent to the framing clause
<literal>ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW</>.
</para>
</refsect2>
<refsect2>
<title><literal>LIMIT</literal> and <literal>OFFSET</literal></title>

View File

@ -1,5 +1,5 @@
<!--
$PostgreSQL: pgsql/doc/src/sgml/ref/select_into.sgml,v 1.43 2008/11/14 10:22:47 petere Exp $
$PostgreSQL: pgsql/doc/src/sgml/ref/select_into.sgml,v 1.44 2008/12/28 18:53:54 tgl Exp $
PostgreSQL documentation
-->
@ -29,6 +29,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
[ WHERE <replaceable class="parameter">condition</replaceable> ]
[ GROUP BY <replaceable class="parameter">expression</replaceable> [, ...] ]
[ HAVING <replaceable class="parameter">condition</replaceable> [, ...] ]
[ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...] ]
[ { UNION | INTERSECT | EXCEPT } [ ALL ] <replaceable class="parameter">select</replaceable> ]
[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
[ LIMIT { <replaceable class="parameter">count</replaceable> | ALL } ]

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.126 2008/12/09 20:52:03 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.127 2008/12/28 18:53:54 tgl Exp $ -->
<chapter id="sql-syntax">
<title>SQL Syntax</title>
@ -1201,6 +1201,12 @@ SELECT 3 OPERATOR(pg_catalog.+) 4;
</para>
</listitem>
<listitem>
<para>
A window function call.
</para>
</listitem>
<listitem>
<para>
A type cast.
@ -1445,7 +1451,7 @@ $1.somecolumn
enclosed in parentheses:
<synopsis>
<replaceable>function</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional> )
<replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional> )
</synopsis>
</para>
@ -1480,7 +1486,7 @@ sqrt(2)
<synopsis>
<replaceable>aggregate_name</replaceable> (<replaceable>expression</replaceable> [ , ... ] )
<replaceable>aggregate_name</replaceable> (ALL <replaceable>expression</replaceable> [ , ... ] )
<replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable> [ , ... ] )
<replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable>)
<replaceable>aggregate_name</replaceable> ( * )
</synopsis>
@ -1488,7 +1494,7 @@ sqrt(2)
defined aggregate (possibly qualified with a schema name), and
<replaceable>expression</replaceable> is
any value expression that does not itself contain an aggregate
expression.
expression or a window function call.
</para>
<para>
@ -1550,6 +1556,89 @@ sqrt(2)
</note>
</sect2>
<sect2 id="syntax-window-functions">
<title>Window Function Calls</title>
<indexterm zone="syntax-window-functions">
<primary>window function</primary>
<secondary>invocation</secondary>
</indexterm>
<indexterm zone="syntax-window-functions">
<primary>OVER clause</primary>
</indexterm>
<para>
A <firstterm>window function call</firstterm> represents the application
of an aggregate-like function over some portion of the rows selected
by a query. Unlike regular aggregate function calls, this is not tied
to grouping of the selected rows into a single output row &mdash; each
row remains separate in the query output. However the window function
is able to scan all the rows that would be part of the current row's
group according to the grouping specification (<literal>PARTITION BY</>
list) of the window function call.
The syntax of a window function call is one of the following:
<synopsis>
<replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional>) OVER ( <replaceable class="parameter">window_definition</replaceable> )
<replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional>) OVER <replaceable>window_name</replaceable>
<replaceable>function_name</replaceable> ( * ) OVER ( <replaceable class="parameter">window_definition</replaceable> )
<replaceable>function_name</replaceable> ( * ) OVER <replaceable>window_name</replaceable>
</synopsis>
where <replaceable class="parameter">window_definition</replaceable>
has the syntax
<synopsis>
[ <replaceable class="parameter">window_name</replaceable> ]
[ PARTITION BY <replaceable class="parameter">expression</replaceable> [, ...] ]
[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
</synopsis>
Here, <replaceable>expression</replaceable> represents any value
expression that does not itself contain window function calls.
The <literal>PARTITION BY</> and <literal>ORDER BY</> lists have
essentially the same syntax and semantics as <literal>GROUP BY</>
and <literal>ORDER BY</> clauses of the whole query.
<replaceable>window_name</replaceable> is a reference to a named window
specification defined in the query's <literal>WINDOW</literal> clause.
Named window specifications are usually referenced with just
<literal>OVER</> <replaceable>window_name</replaceable>, but it is
also possible to write a window name inside the parentheses and then
optionally override its ordering clause with <literal>ORDER BY</>.
This latter syntax follows the same rules as modifying an existing
window name within the <literal>WINDOW</literal> clause; see the
<xref linkend="sql-select" endterm="sql-select-title"> reference
page for details.
</para>
<para>
The built-in window functions are described in <xref
linkend="functions-window-table">. Also, any built-in or
user-defined aggregate function can be used as a window function.
Currently, there is no provision for user-defined window functions
other than aggregates.
</para>
<para>
The syntaxes using <literal>*</> are used for calling parameter-less
aggregate functions as window functions, for example
<literal>count(*) OVER (PARTITION BY x ORDER BY y)</>.
<literal>*</> is customarily not used for non-aggregate window functions.
Aggregate window functions, unlike normal aggregate functions, do not
allow <literal>DISTINCT</> to be used within the function argument list.
</para>
<para>
Window function calls are permitted only in the <literal>SELECT</literal>
list and the <literal>ORDER BY</> clause of the query.
</para>
<para>
More information about window functions can be found in
<xref linkend="tutorial-window"> and
<xref linkend="queries-window">.
</para>
</sect2>
<sect2 id="sql-syntax-type-casts">
<title>Type Casts</title>

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/xaggr.sgml,v 1.36 2008/11/20 21:10:44 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/xaggr.sgml,v 1.37 2008/12/28 18:53:54 tgl Exp $ -->
<sect1 id="xaggr">
<title>User-Defined Aggregates</title>
@ -167,10 +167,13 @@ SELECT attrelid::regclass, array_accum(atttypid::regtype)
<para>
A function written in C can detect that it is being called as an
aggregate transition or final function by seeing if it was passed
an <structname>AggState</> node as the function call <quote>context</>,
an <structname>AggState</> or <structname>WindowAggState</> node
as the function call <quote>context</>,
for example by:
<programlisting>
if (fcinfo->context &amp;&amp; IsA(fcinfo->context, AggState))
if (fcinfo-&gt;context &amp;&amp;
(IsA(fcinfo-&gt;context, AggState) ||
IsA(fcinfo-&gt;context, WindowAggState)))
</programlisting>
One reason for checking this is that when it is true, the first input
must be a temporary transition value and can therefore safely be modified