mirror of
https://github.com/postgres/postgres.git
synced 2025-07-27 12:41:57 +03:00
Support window functions a la SQL:2008.
Hitoshi Harada, with some kibitzing from Heikki and Tom.
This commit is contained in:
@ -1,4 +1,4 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/advanced.sgml,v 1.54 2007/02/01 00:28:16 momjian Exp $ -->
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/advanced.sgml,v 1.55 2008/12/28 18:53:53 tgl Exp $ -->
|
||||
|
||||
<chapter id="tutorial-advanced">
|
||||
<title>Advanced Features</title>
|
||||
@ -240,7 +240,7 @@ COMMIT;
|
||||
<para>
|
||||
<productname>PostgreSQL</> actually treats every SQL statement as being
|
||||
executed within a transaction. If you do not issue a <command>BEGIN</>
|
||||
command,
|
||||
command,
|
||||
then each individual statement has an implicit <command>BEGIN</> and
|
||||
(if successful) <command>COMMIT</> wrapped around it. A group of
|
||||
statements surrounded by <command>BEGIN</> and <command>COMMIT</>
|
||||
@ -265,7 +265,7 @@ COMMIT;
|
||||
with <command>ROLLBACK TO</>. All the transaction's database changes
|
||||
between defining the savepoint and rolling back to it are discarded, but
|
||||
changes earlier than the savepoint are kept.
|
||||
</para>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
After rolling back to a savepoint, it continues to be defined, so you can
|
||||
@ -274,7 +274,7 @@ COMMIT;
|
||||
system can free some resources. Keep in mind that either releasing or
|
||||
rolling back to a savepoint
|
||||
will automatically release all savepoints that were defined after it.
|
||||
</para>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
All this is happening within the transaction block, so none of it
|
||||
@ -282,7 +282,7 @@ COMMIT;
|
||||
transaction block, the committed actions become visible as a unit
|
||||
to other sessions, while the rolled-back actions never become visible
|
||||
at all.
|
||||
</para>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Remembering the bank database, suppose we debit $100.00 from Alice's
|
||||
@ -317,6 +317,242 @@ COMMIT;
|
||||
</sect1>
|
||||
|
||||
|
||||
<sect1 id="tutorial-window">
|
||||
<title id="tutorial-window-title">Window Functions</title>
|
||||
|
||||
<indexterm zone="tutorial-window">
|
||||
<primary>window function</primary>
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
A <firstterm>window function</> performs a calculation across a set of
|
||||
table rows that are somehow related to the current row. This is comparable
|
||||
to the type of calculation that can be done with an aggregate function.
|
||||
But unlike regular aggregate functions, use of a window function does not
|
||||
cause rows to become grouped into a single output row — the
|
||||
rows retain their separate identities. Behind the scenes, the window
|
||||
function is able to access more than just the current row of the query
|
||||
result.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Here is an example that shows how to compare each employee's salary
|
||||
with the average salary in his or her department:
|
||||
|
||||
<programlisting>
|
||||
SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname) FROM empsalary;
|
||||
</programlisting>
|
||||
|
||||
<screen>
|
||||
depname | empno | salary | avg
|
||||
-----------+-------+--------+-----------------------
|
||||
develop | 11 | 5200 | 5020.0000000000000000
|
||||
develop | 7 | 4200 | 5020.0000000000000000
|
||||
develop | 9 | 4500 | 5020.0000000000000000
|
||||
develop | 8 | 6000 | 5020.0000000000000000
|
||||
develop | 10 | 5200 | 5020.0000000000000000
|
||||
personnel | 5 | 3500 | 3700.0000000000000000
|
||||
personnel | 2 | 3900 | 3700.0000000000000000
|
||||
sales | 3 | 4800 | 4866.6666666666666667
|
||||
sales | 1 | 5000 | 4866.6666666666666667
|
||||
sales | 4 | 4800 | 4866.6666666666666667
|
||||
(10 rows)
|
||||
</screen>
|
||||
|
||||
The first three output columns come directly from the table
|
||||
<structname>empsalary</>, and there is one output row for each row in the
|
||||
table. The fourth column represents an average taken across all the table
|
||||
rows that have the same <structfield>depname</> value as the current row.
|
||||
(This actually is the same function as the regular <function>avg</>
|
||||
aggregate function, but the <literal>OVER</> clause causes it to be
|
||||
treated as a window function and computed across an appropriate set of
|
||||
rows.)
|
||||
</para>
|
||||
|
||||
<para>
|
||||
A window function call always contains an <literal>OVER</> clause
|
||||
following the window function's name and argument(s). This is what
|
||||
syntactically distinguishes it from a regular function or aggregate
|
||||
function. The <literal>OVER</> clause determines exactly how the
|
||||
rows of the query are split up for processing by the window function.
|
||||
The <literal>PARTITION BY</> list within <literal>OVER</> specifies
|
||||
dividing the rows into groups, or partitions, that share the same
|
||||
values of the <literal>PARTITION BY</> expression(s). For each row,
|
||||
the window function is computed across the rows that fall into the
|
||||
same partition as the current row.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Although <function>avg</> will produce the same result no matter
|
||||
what order it processes the partition's rows in, this is not true of all
|
||||
window functions. When needed, you can control that order using
|
||||
<literal>ORDER BY</> within <literal>OVER</>. Here is an example:
|
||||
|
||||
<programlisting>
|
||||
SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary DESC) FROM empsalary;
|
||||
</programlisting>
|
||||
|
||||
<screen>
|
||||
depname | empno | salary | rank
|
||||
-----------+-------+--------+------
|
||||
develop | 8 | 6000 | 1
|
||||
develop | 10 | 5200 | 2
|
||||
develop | 11 | 5200 | 2
|
||||
develop | 9 | 4500 | 4
|
||||
develop | 7 | 4200 | 5
|
||||
personnel | 2 | 3900 | 1
|
||||
personnel | 5 | 3500 | 2
|
||||
sales | 1 | 5000 | 1
|
||||
sales | 4 | 4800 | 2
|
||||
sales | 3 | 4800 | 2
|
||||
(10 rows)
|
||||
</screen>
|
||||
|
||||
As shown here, the <function>rank</> function produces a numerical rank
|
||||
within the current row's partition for each distinct <literal>ORDER BY</>
|
||||
value, in the order defined by the <literal>ORDER BY</> clause.
|
||||
<function>rank</> needs no explicit parameter, because its behavior
|
||||
is entirely determined by the <literal>OVER</> clause.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The rows considered by a window function are those of the <quote>virtual
|
||||
table</> produced by the query's <literal>FROM</> clause as filtered by its
|
||||
<literal>WHERE</>, <literal>GROUP BY</>, and <literal>HAVING</> clauses
|
||||
if any. For example, a row removed because it does not meet the
|
||||
<literal>WHERE</> condition is not seen by any window function.
|
||||
A query can contain multiple window functions that slice up the data
|
||||
in different ways by means of different <literal>OVER</> clauses, but
|
||||
they all act on the same collection of rows defined by this virtual table.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
We already saw that <literal>ORDER BY</> can be omitted if the ordering
|
||||
of rows is not important. It is also possible to omit <literal>PARTITION
|
||||
BY</>, in which case the window function is computed over all rows of the
|
||||
virtual table; that is, there is one partition containing all the rows.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
There is another important concept associated with window functions:
|
||||
for each row, there is a set of rows within its partition called its
|
||||
<firstterm>window frame</>. When <literal>ORDER BY</> is omitted the
|
||||
frame is always the same as the partition. If <literal>ORDER BY</> is
|
||||
supplied, the frame consists of all rows from the start of the partition
|
||||
up to the current row, plus any following rows that are equal to the
|
||||
current row according to the <literal>ORDER BY</> clause.
|
||||
<footnote>
|
||||
<para>
|
||||
The SQL standard includes options to define the window frame in
|
||||
other ways, but this definition is the only one currently supported
|
||||
by <productname>PostgreSQL</productname>.
|
||||
</para>
|
||||
</footnote>
|
||||
Many window functions act only on the rows of the window frame, rather
|
||||
than of the whole partition. Here is an example using <function>sum</>:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
SELECT salary, sum(salary) OVER () FROM empsalary;
|
||||
</programlisting>
|
||||
|
||||
<screen>
|
||||
salary | sum
|
||||
--------+-------
|
||||
5200 | 47100
|
||||
5000 | 47100
|
||||
3500 | 47100
|
||||
4800 | 47100
|
||||
3900 | 47100
|
||||
4200 | 47100
|
||||
4500 | 47100
|
||||
4800 | 47100
|
||||
6000 | 47100
|
||||
5200 | 47100
|
||||
(10 rows)
|
||||
</screen>
|
||||
|
||||
<para>
|
||||
Above, since there is no <literal>ORDER BY</> in the <literal>OVER</>
|
||||
clause, the window frame is the same as the partition, which for lack of
|
||||
<literal>PARTITION BY</> is the whole table; in other words each sum is
|
||||
taken over the whole table and so we get the same result for each output
|
||||
row. But if we add an <literal>ORDER BY</> clause, we get very different
|
||||
results:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
SELECT salary, sum(salary) OVER (ORDER BY salary) FROM empsalary;
|
||||
</programlisting>
|
||||
|
||||
<screen>
|
||||
salary | sum
|
||||
--------+-------
|
||||
3500 | 3500
|
||||
3900 | 7400
|
||||
4200 | 11600
|
||||
4500 | 16100
|
||||
4800 | 25700
|
||||
4800 | 25700
|
||||
5000 | 30700
|
||||
5200 | 41100
|
||||
5200 | 41100
|
||||
6000 | 47100
|
||||
(10 rows)
|
||||
</screen>
|
||||
|
||||
<para>
|
||||
Here the sum is taken from the first (lowest) salary up through the
|
||||
current one, including any duplicates of the current one (notice the
|
||||
results for the duplicated salaries).
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Window functions are permitted only in the <literal>SELECT</literal> list
|
||||
and the <literal>ORDER BY</> clause of the query. They are forbidden
|
||||
elsewhere, such as in <literal>GROUP BY</>, <literal>HAVING</>
|
||||
and <literal>WHERE</literal> clauses. This is because they logically
|
||||
execute after the processing of those clauses. Also, window functions
|
||||
execute after regular aggregate functions. This means it is valid to
|
||||
include an aggregate function call in the arguments of a window function,
|
||||
but not vice versa.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If there is a need to filter or group rows after the window calculations
|
||||
are performed, you can use a sub-select. For example:
|
||||
|
||||
<programlisting>
|
||||
SELECT depname, empno, salary, enroll_date
|
||||
FROM
|
||||
(SELECT depname, empno, salary, enroll_date,
|
||||
rank() OVER (PARTITION BY depname ORDER BY salary DESC, empno) AS pos
|
||||
FROM empsalary
|
||||
) AS ss
|
||||
WHERE pos < 3;
|
||||
</programlisting>
|
||||
|
||||
The above query only shows the rows from the inner query having
|
||||
<literal>rank</> less than <literal>3</>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
When a query involves multiple window functions, it is possible to write
|
||||
out each one with a separate <literal>OVER</> clause, but this is
|
||||
duplicative and error-prone if the same windowing behavior is wanted
|
||||
for several functions. Instead, each windowing behavior can be named
|
||||
in a <literal>WINDOW</> clause and then referenced in <literal>OVER</>.
|
||||
For example:
|
||||
|
||||
<programlisting>
|
||||
SELECT sum(salary) OVER w, avg(salary) OVER w
|
||||
FROM empsalary
|
||||
WINDOW w AS (PARTITION BY depname ORDER BY salary DESC);
|
||||
</programlisting>
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
|
||||
<sect1 id="tutorial-inheritance">
|
||||
<title>Inheritance</title>
|
||||
|
||||
@ -391,7 +627,7 @@ CREATE TABLE capitals (
|
||||
|
||||
<para>
|
||||
For example, the following query finds the names of all cities,
|
||||
including state capitals, that are located at an altitude
|
||||
including state capitals, that are located at an altitude
|
||||
over 500 feet:
|
||||
|
||||
<programlisting>
|
||||
@ -455,7 +691,7 @@ SELECT name, altitude
|
||||
|
||||
<sect1 id="tutorial-conclusion">
|
||||
<title>Conclusion</title>
|
||||
|
||||
|
||||
<para>
|
||||
<productname>PostgreSQL</productname> has many features not
|
||||
touched upon in this tutorial introduction, which has been
|
||||
|
@ -1,4 +1,4 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/errcodes.sgml,v 1.25 2008/10/04 21:56:52 tgl Exp $ -->
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/errcodes.sgml,v 1.26 2008/12/28 18:53:53 tgl Exp $ -->
|
||||
|
||||
<appendix id="errcodes-appendix">
|
||||
<title><productname>PostgreSQL</productname> Error Codes</title>
|
||||
@ -378,6 +378,18 @@
|
||||
<entry>invalid_argument_for_logarithm</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>22014</literal></entry>
|
||||
<entry>INVALID ARGUMENT FOR NTILE FUNCTION</entry>
|
||||
<entry>invalid_argument_for_ntile_function</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>22016</literal></entry>
|
||||
<entry>INVALID ARGUMENT FOR NTH_VALUE FUNCTION</entry>
|
||||
<entry>invalid_argument_for_nth_value_function</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>2201F</literal></entry>
|
||||
<entry>INVALID ARGUMENT FOR POWER FUNCTION</entry>
|
||||
@ -990,6 +1002,12 @@
|
||||
<entry>grouping_error</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>42P20</literal></entry>
|
||||
<entry>WINDOWING ERROR</entry>
|
||||
<entry>windowing_error</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>42P19</literal></entry>
|
||||
<entry>INVALID RECURSION</entry>
|
||||
|
@ -1,4 +1,4 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.463 2008/12/19 16:25:16 petere Exp $ -->
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.464 2008/12/28 18:53:53 tgl Exp $ -->
|
||||
|
||||
<chapter id="functions">
|
||||
<title>Functions and Operators</title>
|
||||
@ -10149,6 +10149,278 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
|
||||
|
||||
</sect1>
|
||||
|
||||
<sect1 id="functions-window">
|
||||
<title>Window Functions</title>
|
||||
|
||||
<indexterm zone="functions-window">
|
||||
<primary>window function</primary>
|
||||
<secondary>built-in</secondary>
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
<firstterm>Window functions</firstterm> provide the ability to perform
|
||||
calculations across sets of rows that are related to the current query
|
||||
row. For information about this feature see
|
||||
<xref linkend="tutorial-window"> and
|
||||
<xref linkend="syntax-window-functions">.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The built-in window functions are listed in
|
||||
<xref linkend="functions-window-table">. Note that these functions
|
||||
<emphasis>must</> be invoked using window function syntax; that is an
|
||||
<literal>OVER</> clause is required.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
In addition to these functions, any built-in or user-defined aggregate
|
||||
function can be used as a window function (see
|
||||
<xref linkend="functions-aggregate"> for a list of the built-in aggregates).
|
||||
Aggregate functions act as window functions only when an <literal>OVER</>
|
||||
clause follows the call; otherwise they act as regular aggregates.
|
||||
</para>
|
||||
|
||||
<table id="functions-window-table">
|
||||
<title>General-Purpose Window Functions</title>
|
||||
|
||||
<tgroup cols="3">
|
||||
<thead>
|
||||
<row>
|
||||
<entry>Function</entry>
|
||||
<entry>Return Type</entry>
|
||||
<entry>Description</entry>
|
||||
</row>
|
||||
</thead>
|
||||
|
||||
<tbody>
|
||||
<row>
|
||||
<entry>
|
||||
<indexterm>
|
||||
<primary>row_number</primary>
|
||||
</indexterm>
|
||||
<function>row_number()</function>
|
||||
</entry>
|
||||
<entry>
|
||||
<type>bigint</type>
|
||||
</entry>
|
||||
<entry>number of the current row within its partition, counting from 1</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry>
|
||||
<indexterm>
|
||||
<primary>rank</primary>
|
||||
</indexterm>
|
||||
<function>rank()</function>
|
||||
</entry>
|
||||
<entry>
|
||||
<type>bigint</type>
|
||||
</entry>
|
||||
<entry>rank of the current row with gaps; same as <function>row_number</> of its first peer</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry>
|
||||
<indexterm>
|
||||
<primary>dense_rank</primary>
|
||||
</indexterm>
|
||||
<function>dense_rank()</function>
|
||||
</entry>
|
||||
<entry>
|
||||
<type>bigint</type>
|
||||
</entry>
|
||||
<entry>rank of the current row without gaps; this function counts peer groups</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry>
|
||||
<indexterm>
|
||||
<primary>percent_rank</primary>
|
||||
</indexterm>
|
||||
<function>percent_rank()</function>
|
||||
</entry>
|
||||
<entry>
|
||||
<type>double precision</type>
|
||||
</entry>
|
||||
<entry>relative rank of the current row: (<function>rank</> - 1) / (total rows - 1)</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry>
|
||||
<indexterm>
|
||||
<primary>cume_dist</primary>
|
||||
</indexterm>
|
||||
<function>cume_dist()</function>
|
||||
</entry>
|
||||
<entry>
|
||||
<type>double precision</type>
|
||||
</entry>
|
||||
<entry>relative rank of the current row: (number of rows preceding or peer with current row) / (total rows)</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry>
|
||||
<indexterm>
|
||||
<primary>ntile</primary>
|
||||
</indexterm>
|
||||
<function>ntile(<replaceable class="parameter">num_buckets</replaceable> <type>integer</>)</function>
|
||||
</entry>
|
||||
<entry>
|
||||
<type>integer</type>
|
||||
</entry>
|
||||
<entry>integer ranging from 1 to the argument value, dividing the
|
||||
partition as equally as possible</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry>
|
||||
<indexterm>
|
||||
<primary>lag</primary>
|
||||
</indexterm>
|
||||
<function>
|
||||
lag(<replaceable class="parameter">value</replaceable> <type>any</>
|
||||
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
|
||||
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
|
||||
</function>
|
||||
</entry>
|
||||
<entry>
|
||||
<type>same type as <replaceable class="parameter">value</replaceable></type>
|
||||
</entry>
|
||||
<entry>
|
||||
returns <replaceable class="parameter">value</replaceable> evaluated at
|
||||
the row that is <replaceable class="parameter">offset</replaceable>
|
||||
rows before the current row within the partition; if there is no such
|
||||
row, instead return <replaceable class="parameter">default</replaceable>.
|
||||
Both <replaceable class="parameter">offset</replaceable> and
|
||||
<replaceable class="parameter">default</replaceable> are evaluated
|
||||
with respect to the current row. If omitted,
|
||||
<replaceable class="parameter">offset</replaceable> defaults to 1 and
|
||||
<replaceable class="parameter">default</replaceable> to null
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry>
|
||||
<indexterm>
|
||||
<primary>lead</primary>
|
||||
</indexterm>
|
||||
<function>
|
||||
lead(<replaceable class="parameter">value</replaceable> <type>any</>
|
||||
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
|
||||
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
|
||||
</function>
|
||||
</entry>
|
||||
<entry>
|
||||
<type>same type as <replaceable class="parameter">value</replaceable></type>
|
||||
</entry>
|
||||
<entry>
|
||||
returns <replaceable class="parameter">value</replaceable> evaluated at
|
||||
the row that is <replaceable class="parameter">offset</replaceable>
|
||||
rows after the current row within the partition; if there is no such
|
||||
row, instead return <replaceable class="parameter">default</replaceable>.
|
||||
Both <replaceable class="parameter">offset</replaceable> and
|
||||
<replaceable class="parameter">default</replaceable> are evaluated
|
||||
with respect to the current row. If omitted,
|
||||
<replaceable class="parameter">offset</replaceable> defaults to 1 and
|
||||
<replaceable class="parameter">default</replaceable> to null
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry>
|
||||
<indexterm>
|
||||
<primary>first_value</primary>
|
||||
</indexterm>
|
||||
<function>first_value(<replaceable class="parameter">value</replaceable> <type>any</>)</function>
|
||||
</entry>
|
||||
<entry>
|
||||
<type>same type as <replaceable class="parameter">value</replaceable></type>
|
||||
</entry>
|
||||
<entry>
|
||||
returns <replaceable class="parameter">value</replaceable> evaluated
|
||||
at the row that is the first row of the window frame
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry>
|
||||
<indexterm>
|
||||
<primary>last_value</primary>
|
||||
</indexterm>
|
||||
<function>last_value(<replaceable class="parameter">value</replaceable> <type>any</>)</function>
|
||||
</entry>
|
||||
<entry>
|
||||
<type>same type as <replaceable class="parameter">value</replaceable></type>
|
||||
</entry>
|
||||
<entry>
|
||||
returns <replaceable class="parameter">value</replaceable> evaluated
|
||||
at the row that is the last row of the window frame
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry>
|
||||
<indexterm>
|
||||
<primary>nth_value</primary>
|
||||
</indexterm>
|
||||
<function>
|
||||
nth_value(<replaceable class="parameter">value</replaceable> <type>any</>, <replaceable class="parameter">nth</replaceable> <type>integer</>)
|
||||
</function>
|
||||
</entry>
|
||||
<entry>
|
||||
<type>same type as <replaceable class="parameter">value</replaceable></type>
|
||||
</entry>
|
||||
<entry>
|
||||
returns <replaceable class="parameter">value</replaceable> evaluated
|
||||
at the row that is the <replaceable class="parameter">nth</replaceable>
|
||||
row of the window frame (counting from 1); null if no such row
|
||||
</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
|
||||
|
||||
<para>
|
||||
All of the functions listed in
|
||||
<xref linkend="functions-window-table"> depend on the sort ordering
|
||||
specified by the <literal>ORDER BY</> clause of the associated window
|
||||
definition. Rows that are not distinct in the <literal>ORDER BY</>
|
||||
ordering are said to be <firstterm>peers</>; the four ranking functions
|
||||
are defined so that they give the same answer for any two peer rows.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Note that <function>first_value</>, <function>last_value</>, and
|
||||
<function>nth_value</> consider only the rows within the <quote>window
|
||||
frame</>, that is the rows from the start of the partition through the
|
||||
last peer of the current row. This is particularly likely to give
|
||||
unintuitive results for <function>last_value</>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
When an aggregate function is used as a window function, it aggregates
|
||||
over the rows within the current row's window frame. To obtain
|
||||
aggregation over the whole partition, be sure to omit <literal>ORDER BY</>
|
||||
from the window definition. An aggregate used with <literal>ORDER BY</>
|
||||
produces a <quote>running sum</> type of behavior, which may or may not
|
||||
be what's wanted.
|
||||
</para>
|
||||
|
||||
<note>
|
||||
<para>
|
||||
The SQL standard defines a <literal>RESPECT NULLS</> or
|
||||
<literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
|
||||
<function>first_value</>, <function>last_value</>, and
|
||||
<function>nth_value</>. This is not implemented in
|
||||
<productname>PostgreSQL</productname>: the behavior is always the
|
||||
same as the standard's default, namely <literal>RESPECT NULLS</>.
|
||||
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
|
||||
option for <function>nth_value</> is not implemented: only the
|
||||
default <literal>FROM FIRST</> behavior is supported.
|
||||
</para>
|
||||
</note>
|
||||
|
||||
</sect1>
|
||||
|
||||
<sect1 id="functions-subquery">
|
||||
<title>Subquery Expressions</title>
|
||||
|
@ -1,4 +1,4 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/queries.sgml,v 1.50 2008/10/14 00:41:34 tgl Exp $ -->
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/queries.sgml,v 1.51 2008/12/28 18:53:54 tgl Exp $ -->
|
||||
|
||||
<chapter id="queries">
|
||||
<title>Queries</title>
|
||||
@ -949,6 +949,57 @@ SELECT product_id, p.name, (sum(s.units) * (p.price - p.cost)) AS profit
|
||||
5000. Note that the aggregate expressions do not necessarily need
|
||||
to be the same in all parts of the query.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If a query contains aggregate function calls, but no <literal>GROUP BY</>
|
||||
clause, grouping still occurs: the result is a single group row (or
|
||||
perhaps no rows at all, if the single row is then eliminated by
|
||||
<literal>HAVING</>).
|
||||
The same is true if it contains a <literal>HAVING</> clause, even
|
||||
without any aggregate function calls or <literal>GROUP BY</> clause.
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<sect2 id="queries-window">
|
||||
<title>Window Function Processing</>
|
||||
|
||||
<indexterm zone="queries-window">
|
||||
<primary>window function</primary>
|
||||
<secondary>order of execution</>
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
If the query contains any window functions (see
|
||||
<xref linkend="tutorial-window"> and
|
||||
<xref linkend="syntax-window-functions">), these functions are evaluated
|
||||
after any grouping, aggregation, and <literal>HAVING</> filtering is
|
||||
performed. That is, if the query uses any aggregates, <literal>GROUP
|
||||
BY</>, or <literal>HAVING</>, then the rows seen by the window functions
|
||||
are the group rows instead of the original table rows from
|
||||
<literal>FROM</>/<literal>WHERE</>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
When multiple window functions are used, all the window functions having
|
||||
syntactically equivalent <literal>PARTITION BY</> and <literal>ORDER BY</>
|
||||
clauses in their window definitions are guaranteed to be evaluated in a
|
||||
single pass over the data. Therefore they will see the same sort ordering,
|
||||
even if the <literal>ORDER BY</> does not uniquely determine an ordering.
|
||||
However, no guarantees are made about the evaluation of functions having
|
||||
different <literal>PARTITION BY</> or <literal>ORDER BY</> specifications.
|
||||
(In such cases a sort step is typically required between the passes of
|
||||
window function evaluations, and the sort is not guaranteed to preserve
|
||||
ordering of rows that its <literal>ORDER BY</> sees as equivalent.)
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Currently, use of window functions always forces sorting, and so the
|
||||
query output will be ordered according to one or another of the window
|
||||
functions' <literal>PARTITION BY</>/<literal>ORDER BY</> clauses.
|
||||
It is not recommendable to rely on this, however. Use an explicit
|
||||
top-level <literal>ORDER BY</> clause if you want to be sure the
|
||||
results are sorted in a particular way.
|
||||
</para>
|
||||
</sect2>
|
||||
</sect1>
|
||||
|
||||
|
@ -1,4 +1,4 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/query.sgml,v 1.50 2007/02/01 00:28:17 momjian Exp $ -->
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/query.sgml,v 1.51 2008/12/28 18:53:54 tgl Exp $ -->
|
||||
|
||||
<chapter id="tutorial-sql">
|
||||
<title>The <acronym>SQL</acronym> Language</title>
|
||||
@ -621,7 +621,7 @@ SELECT W1.city, W1.temp_lo AS low, W1.temp_hi AS high,
|
||||
San Francisco | 43 | 57 | San Francisco | 46 | 50
|
||||
Hayward | 37 | 54 | San Francisco | 46 | 50
|
||||
(2 rows)
|
||||
</programlisting>
|
||||
</programlisting>
|
||||
|
||||
Here we have relabeled the weather table as <literal>W1</> and
|
||||
<literal>W2</> to be able to distinguish the left and right side
|
||||
@ -651,9 +651,9 @@ SELECT *
|
||||
<indexterm><primary>min</primary></indexterm>
|
||||
<indexterm><primary>sum</primary></indexterm>
|
||||
|
||||
Like most other relational database products,
|
||||
Like most other relational database products,
|
||||
<productname>PostgreSQL</productname> supports
|
||||
aggregate functions.
|
||||
<firstterm>aggregate functions</>.
|
||||
An aggregate function computes a single result from multiple input rows.
|
||||
For example, there are aggregates to compute the
|
||||
<function>count</function>, <function>sum</function>,
|
||||
@ -815,7 +815,7 @@ SELECT city, max(temp_lo)
|
||||
|
||||
<para>
|
||||
You can update existing rows using the
|
||||
<command>UPDATE</command> command.
|
||||
<command>UPDATE</command> command.
|
||||
Suppose you discover the temperature readings are
|
||||
all off by 2 degrees after November 28. You can correct the
|
||||
data as follows:
|
||||
|
@ -1,5 +1,5 @@
|
||||
<!--
|
||||
$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.112 2008/12/01 09:38:08 petere Exp $
|
||||
$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.113 2008/12/28 18:53:54 tgl Exp $
|
||||
PostgreSQL documentation
|
||||
-->
|
||||
|
||||
@ -39,6 +39,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
|
||||
[ WHERE <replaceable class="parameter">condition</replaceable> ]
|
||||
[ GROUP BY <replaceable class="parameter">expression</replaceable> [, ...] ]
|
||||
[ HAVING <replaceable class="parameter">condition</replaceable> [, ...] ]
|
||||
[ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...] ]
|
||||
[ { UNION | INTERSECT | EXCEPT } [ ALL ] <replaceable class="parameter">select</replaceable> ]
|
||||
[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
|
||||
[ LIMIT { <replaceable class="parameter">count</replaceable> | ALL } ]
|
||||
@ -566,6 +567,67 @@ HAVING <replaceable class="parameter">condition</replaceable>
|
||||
</para>
|
||||
</refsect2>
|
||||
|
||||
<refsect2 id="SQL-WINDOW">
|
||||
<title id="sql-window-title"><literal>WINDOW</literal> Clause</title>
|
||||
|
||||
<para>
|
||||
The optional <literal>WINDOW</literal> clause has the general form
|
||||
<synopsis>
|
||||
WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...]
|
||||
</synopsis>
|
||||
where <replaceable class="parameter">window_name</replaceable> is
|
||||
a name that can be referenced from subsequent window definitions or
|
||||
<literal>OVER</> clauses, and
|
||||
<replaceable class="parameter">window_definition</replaceable> is
|
||||
<synopsis>
|
||||
[ <replaceable class="parameter">existing_window_name</replaceable> ]
|
||||
[ PARTITION BY <replaceable class="parameter">expression</replaceable> [, ...] ]
|
||||
[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
|
||||
</synopsis>
|
||||
The elements of the <literal>PARTITION BY</> list are interpreted in
|
||||
the same fashion as elements of a
|
||||
<xref linkend="sql-groupby" endterm="sql-groupby-title">, and
|
||||
the elements of the <literal>ORDER BY</> list are interpreted in the
|
||||
same fashion as elements of an
|
||||
<xref linkend="sql-orderby" endterm="sql-orderby-title">.
|
||||
The only difference is that these expressions can contain aggregate
|
||||
function calls, which are not allowed in a regular <literal>GROUP BY</>
|
||||
clause. They are allowed here because windowing occurs after grouping
|
||||
and aggregation.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If an <replaceable class="parameter">existing_window_name</replaceable>
|
||||
is specified it must refer to an earlier entry in the <literal>WINDOW</>
|
||||
list; the new window copies its partitioning clause from that entry,
|
||||
as well as its ordering clause if any. In this case the new window cannot
|
||||
specify its own <literal>PARTITION BY</> clause, and it can specify
|
||||
<literal>ORDER BY</> only if the copied window does not have one.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The purpose of a <literal>WINDOW</literal> clause is to specify the
|
||||
behavior of <firstterm>window functions</> appearing in the query's
|
||||
<xref linkend="sql-select-list" endterm="sql-select-list-title"> or
|
||||
<xref linkend="sql-orderby" endterm="sql-orderby-title">. These functions
|
||||
can reference the <literal>WINDOW</literal> clause entries by name
|
||||
in their <literal>OVER</> clauses. A <literal>WINDOW</literal> clause
|
||||
entry does not have to be referenced anywhere, however; if it is not
|
||||
used in the query it is simply ignored. It is possible to use window
|
||||
functions without any <literal>WINDOW</literal> clause at all, since
|
||||
a window function call can specify its window definition directly in
|
||||
its <literal>OVER</> clause. However, the <literal>WINDOW</literal>
|
||||
clause saves typing when the same window definition is needed for more
|
||||
than one window function.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Window functions are described in detail in
|
||||
<xref linkend="tutorial-window"> and
|
||||
<xref linkend="syntax-window-functions">.
|
||||
</para>
|
||||
</refsect2>
|
||||
|
||||
<refsect2 id="sql-select-list">
|
||||
<title id="sql-select-list-title"><command>SELECT</command> List</title>
|
||||
|
||||
@ -922,7 +984,7 @@ FETCH { FIRST | NEXT } [ <replaceable class="parameter">count</replaceable> ] {
|
||||
constants for the offset or fetch count, parentheses will be
|
||||
necessary in most cases. If the fetch count is omitted, it
|
||||
defaults to 1.
|
||||
</para>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
When using <literal>LIMIT</>, it is a good idea to use an
|
||||
@ -1387,6 +1449,19 @@ SELECT distributors.* WHERE distributors.name = 'Westward';
|
||||
</para>
|
||||
</refsect2>
|
||||
|
||||
<refsect2>
|
||||
<title><literal>WINDOW</literal> Clause Restrictions</title>
|
||||
|
||||
<para>
|
||||
The SQL standard provides for an optional <quote>framing clause</>,
|
||||
introduced by the key word <literal>RANGE</> or <literal>ROWS</>,
|
||||
in window definitions. <productname>PostgreSQL</productname> does
|
||||
not yet implement framing clauses, and always follows the
|
||||
default framing behavior, which is equivalent to the framing clause
|
||||
<literal>ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW</>.
|
||||
</para>
|
||||
</refsect2>
|
||||
|
||||
<refsect2>
|
||||
<title><literal>LIMIT</literal> and <literal>OFFSET</literal></title>
|
||||
|
||||
|
@ -1,5 +1,5 @@
|
||||
<!--
|
||||
$PostgreSQL: pgsql/doc/src/sgml/ref/select_into.sgml,v 1.43 2008/11/14 10:22:47 petere Exp $
|
||||
$PostgreSQL: pgsql/doc/src/sgml/ref/select_into.sgml,v 1.44 2008/12/28 18:53:54 tgl Exp $
|
||||
PostgreSQL documentation
|
||||
-->
|
||||
|
||||
@ -29,6 +29,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
|
||||
[ WHERE <replaceable class="parameter">condition</replaceable> ]
|
||||
[ GROUP BY <replaceable class="parameter">expression</replaceable> [, ...] ]
|
||||
[ HAVING <replaceable class="parameter">condition</replaceable> [, ...] ]
|
||||
[ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...] ]
|
||||
[ { UNION | INTERSECT | EXCEPT } [ ALL ] <replaceable class="parameter">select</replaceable> ]
|
||||
[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
|
||||
[ LIMIT { <replaceable class="parameter">count</replaceable> | ALL } ]
|
||||
|
@ -1,4 +1,4 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.126 2008/12/09 20:52:03 tgl Exp $ -->
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.127 2008/12/28 18:53:54 tgl Exp $ -->
|
||||
|
||||
<chapter id="sql-syntax">
|
||||
<title>SQL Syntax</title>
|
||||
@ -1201,6 +1201,12 @@ SELECT 3 OPERATOR(pg_catalog.+) 4;
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
A window function call.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
A type cast.
|
||||
@ -1445,7 +1451,7 @@ $1.somecolumn
|
||||
enclosed in parentheses:
|
||||
|
||||
<synopsis>
|
||||
<replaceable>function</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional> )
|
||||
<replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional> )
|
||||
</synopsis>
|
||||
</para>
|
||||
|
||||
@ -1480,7 +1486,7 @@ sqrt(2)
|
||||
<synopsis>
|
||||
<replaceable>aggregate_name</replaceable> (<replaceable>expression</replaceable> [ , ... ] )
|
||||
<replaceable>aggregate_name</replaceable> (ALL <replaceable>expression</replaceable> [ , ... ] )
|
||||
<replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable> [ , ... ] )
|
||||
<replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable>)
|
||||
<replaceable>aggregate_name</replaceable> ( * )
|
||||
</synopsis>
|
||||
|
||||
@ -1488,7 +1494,7 @@ sqrt(2)
|
||||
defined aggregate (possibly qualified with a schema name), and
|
||||
<replaceable>expression</replaceable> is
|
||||
any value expression that does not itself contain an aggregate
|
||||
expression.
|
||||
expression or a window function call.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
@ -1550,6 +1556,89 @@ sqrt(2)
|
||||
</note>
|
||||
</sect2>
|
||||
|
||||
<sect2 id="syntax-window-functions">
|
||||
<title>Window Function Calls</title>
|
||||
|
||||
<indexterm zone="syntax-window-functions">
|
||||
<primary>window function</primary>
|
||||
<secondary>invocation</secondary>
|
||||
</indexterm>
|
||||
|
||||
<indexterm zone="syntax-window-functions">
|
||||
<primary>OVER clause</primary>
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
A <firstterm>window function call</firstterm> represents the application
|
||||
of an aggregate-like function over some portion of the rows selected
|
||||
by a query. Unlike regular aggregate function calls, this is not tied
|
||||
to grouping of the selected rows into a single output row — each
|
||||
row remains separate in the query output. However the window function
|
||||
is able to scan all the rows that would be part of the current row's
|
||||
group according to the grouping specification (<literal>PARTITION BY</>
|
||||
list) of the window function call.
|
||||
The syntax of a window function call is one of the following:
|
||||
|
||||
<synopsis>
|
||||
<replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional>) OVER ( <replaceable class="parameter">window_definition</replaceable> )
|
||||
<replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional>) OVER <replaceable>window_name</replaceable>
|
||||
<replaceable>function_name</replaceable> ( * ) OVER ( <replaceable class="parameter">window_definition</replaceable> )
|
||||
<replaceable>function_name</replaceable> ( * ) OVER <replaceable>window_name</replaceable>
|
||||
</synopsis>
|
||||
where <replaceable class="parameter">window_definition</replaceable>
|
||||
has the syntax
|
||||
<synopsis>
|
||||
[ <replaceable class="parameter">window_name</replaceable> ]
|
||||
[ PARTITION BY <replaceable class="parameter">expression</replaceable> [, ...] ]
|
||||
[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
|
||||
</synopsis>
|
||||
|
||||
Here, <replaceable>expression</replaceable> represents any value
|
||||
expression that does not itself contain window function calls.
|
||||
The <literal>PARTITION BY</> and <literal>ORDER BY</> lists have
|
||||
essentially the same syntax and semantics as <literal>GROUP BY</>
|
||||
and <literal>ORDER BY</> clauses of the whole query.
|
||||
<replaceable>window_name</replaceable> is a reference to a named window
|
||||
specification defined in the query's <literal>WINDOW</literal> clause.
|
||||
Named window specifications are usually referenced with just
|
||||
<literal>OVER</> <replaceable>window_name</replaceable>, but it is
|
||||
also possible to write a window name inside the parentheses and then
|
||||
optionally override its ordering clause with <literal>ORDER BY</>.
|
||||
This latter syntax follows the same rules as modifying an existing
|
||||
window name within the <literal>WINDOW</literal> clause; see the
|
||||
<xref linkend="sql-select" endterm="sql-select-title"> reference
|
||||
page for details.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The built-in window functions are described in <xref
|
||||
linkend="functions-window-table">. Also, any built-in or
|
||||
user-defined aggregate function can be used as a window function.
|
||||
Currently, there is no provision for user-defined window functions
|
||||
other than aggregates.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The syntaxes using <literal>*</> are used for calling parameter-less
|
||||
aggregate functions as window functions, for example
|
||||
<literal>count(*) OVER (PARTITION BY x ORDER BY y)</>.
|
||||
<literal>*</> is customarily not used for non-aggregate window functions.
|
||||
Aggregate window functions, unlike normal aggregate functions, do not
|
||||
allow <literal>DISTINCT</> to be used within the function argument list.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Window function calls are permitted only in the <literal>SELECT</literal>
|
||||
list and the <literal>ORDER BY</> clause of the query.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
More information about window functions can be found in
|
||||
<xref linkend="tutorial-window"> and
|
||||
<xref linkend="queries-window">.
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<sect2 id="sql-syntax-type-casts">
|
||||
<title>Type Casts</title>
|
||||
|
||||
|
@ -1,4 +1,4 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/xaggr.sgml,v 1.36 2008/11/20 21:10:44 tgl Exp $ -->
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/xaggr.sgml,v 1.37 2008/12/28 18:53:54 tgl Exp $ -->
|
||||
|
||||
<sect1 id="xaggr">
|
||||
<title>User-Defined Aggregates</title>
|
||||
@ -167,10 +167,13 @@ SELECT attrelid::regclass, array_accum(atttypid::regtype)
|
||||
<para>
|
||||
A function written in C can detect that it is being called as an
|
||||
aggregate transition or final function by seeing if it was passed
|
||||
an <structname>AggState</> node as the function call <quote>context</>,
|
||||
an <structname>AggState</> or <structname>WindowAggState</> node
|
||||
as the function call <quote>context</>,
|
||||
for example by:
|
||||
<programlisting>
|
||||
if (fcinfo->context && IsA(fcinfo->context, AggState))
|
||||
if (fcinfo->context &&
|
||||
(IsA(fcinfo->context, AggState) ||
|
||||
IsA(fcinfo->context, WindowAggState)))
|
||||
</programlisting>
|
||||
One reason for checking this is that when it is true, the first input
|
||||
must be a temporary transition value and can therefore safely be modified
|
||||
|
Reference in New Issue
Block a user