Support window functions a la SQL:2008.

Hitoshi Harada, with some kibitzing from Heikki and Tom.
2025-07-27 12:41:57 +03:00 · 2008-12-28 18:54:01 +00:00
parent 38e9348282
commit 95b07bc7f5
92 changed files with 6720 additions and 321 deletions
--- a/doc/src/sgml/advanced.sgml
+++ b/doc/src/sgml/advanced.sgml
@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/advanced.sgml,v 1.54 2007/02/01 00:28:16 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/advanced.sgml,v 1.55 2008/12/28 18:53:53 tgl Exp $ -->

 <chapter id="tutorial-advanced">
  <title>Advanced Features</title>
@ -240,7 +240,7 @@ COMMIT;
   <para>
    <productname>PostgreSQL</> actually treats every SQL statement as being
    executed within a transaction.  If you do not issue a <command>BEGIN</>
-    command, 
+    command,
    then each individual statement has an implicit <command>BEGIN</> and
    (if successful) <command>COMMIT</> wrapped around it.  A group of
    statements surrounded by <command>BEGIN</> and <command>COMMIT</>
@ -265,7 +265,7 @@ COMMIT;
    with <command>ROLLBACK TO</>.  All the transaction's database changes
    between defining the savepoint and rolling back to it are discarded, but
    changes earlier than the savepoint are kept.
-   </para> 
+   </para>

   <para>
    After rolling back to a savepoint, it continues to be defined, so you can
@ -274,7 +274,7 @@ COMMIT;
    system can free some resources.  Keep in mind that either releasing or
    rolling back to a savepoint
    will automatically release all savepoints that were defined after it.
-   </para> 
+   </para>

   <para>
    All this is happening within the transaction block, so none of it
@ -282,7 +282,7 @@ COMMIT;
    transaction block, the committed actions become visible as a unit
    to other sessions, while the rolled-back actions never become visible
    at all.
-   </para> 
+   </para>

   <para>
    Remembering the bank database, suppose we debit $100.00 from Alice's
@ -317,6 +317,242 @@ COMMIT;
  </sect1>


+  <sect1 id="tutorial-window">
+   <title id="tutorial-window-title">Window Functions</title>
+
+   <indexterm zone="tutorial-window">
+    <primary>window function</primary>
+   </indexterm>
+
+   <para>
+    A <firstterm>window function</> performs a calculation across a set of
+    table rows that are somehow related to the current row.  This is comparable
+    to the type of calculation that can be done with an aggregate function.
+    But unlike regular aggregate functions, use of a window function does not
+    cause rows to become grouped into a single output row &mdash; the
+    rows retain their separate identities.  Behind the scenes, the window
+    function is able to access more than just the current row of the query
+    result.
+   </para>
+
+   <para>
+    Here is an example that shows how to compare each employee's salary
+    with the average salary in his or her department:
+
+<programlisting>
+SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname) FROM empsalary;
+</programlisting>
+
+<screen>
+  depname  | empno | salary |          avg          
+-----------+-------+--------+-----------------------
+ develop   |    11 |   5200 | 5020.0000000000000000
+ develop   |     7 |   4200 | 5020.0000000000000000
+ develop   |     9 |   4500 | 5020.0000000000000000
+ develop   |     8 |   6000 | 5020.0000000000000000
+ develop   |    10 |   5200 | 5020.0000000000000000
+ personnel |     5 |   3500 | 3700.0000000000000000
+ personnel |     2 |   3900 | 3700.0000000000000000
+ sales     |     3 |   4800 | 4866.6666666666666667
+ sales     |     1 |   5000 | 4866.6666666666666667
+ sales     |     4 |   4800 | 4866.6666666666666667
+(10 rows)
+</screen>
+
+    The first three output columns come directly from the table
+    <structname>empsalary</>, and there is one output row for each row in the
+    table.  The fourth column represents an average taken across all the table
+    rows that have the same <structfield>depname</> value as the current row.
+    (This actually is the same function as the regular <function>avg</>
+    aggregate function, but the <literal>OVER</> clause causes it to be
+    treated as a window function and computed across an appropriate set of
+    rows.)
+   </para>
+
+   <para>
+    A window function call always contains an <literal>OVER</> clause
+    following the window function's name and argument(s).  This is what
+    syntactically distinguishes it from a regular function or aggregate
+    function.  The <literal>OVER</> clause determines exactly how the
+    rows of the query are split up for processing by the window function.
+    The <literal>PARTITION BY</> list within <literal>OVER</> specifies
+    dividing the rows into groups, or partitions, that share the same
+    values of the <literal>PARTITION BY</> expression(s).  For each row,
+    the window function is computed across the rows that fall into the
+    same partition as the current row.
+   </para>
+
+   <para>
+    Although <function>avg</> will produce the same result no matter
+    what order it processes the partition's rows in, this is not true of all
+    window functions.  When needed, you can control that order using
+    <literal>ORDER BY</> within <literal>OVER</>.  Here is an example:
+
+<programlisting>
+SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary DESC) FROM empsalary;
+</programlisting>
+
+<screen>
+  depname  | empno | salary | rank 
+-----------+-------+--------+------
+ develop   |     8 |   6000 |    1
+ develop   |    10 |   5200 |    2
+ develop   |    11 |   5200 |    2
+ develop   |     9 |   4500 |    4
+ develop   |     7 |   4200 |    5
+ personnel |     2 |   3900 |    1
+ personnel |     5 |   3500 |    2
+ sales     |     1 |   5000 |    1
+ sales     |     4 |   4800 |    2
+ sales     |     3 |   4800 |    2
+(10 rows)
+</screen>
+
+    As shown here, the <function>rank</> function produces a numerical rank
+    within the current row's partition for each distinct <literal>ORDER BY</>
+    value, in the order defined by the <literal>ORDER BY</> clause.
+    <function>rank</> needs no explicit parameter, because its behavior
+    is entirely determined by the <literal>OVER</> clause.
+   </para>
+
+   <para>
+    The rows considered by a window function are those of the <quote>virtual
+    table</> produced by the query's <literal>FROM</> clause as filtered by its
+    <literal>WHERE</>, <literal>GROUP BY</>, and <literal>HAVING</> clauses
+    if any.  For example, a row removed because it does not meet the
+    <literal>WHERE</> condition is not seen by any window function.
+    A query can contain multiple window functions that slice up the data
+    in different ways by means of different <literal>OVER</> clauses, but
+    they all act on the same collection of rows defined by this virtual table.
+   </para>
+
+   <para>
+    We already saw that <literal>ORDER BY</> can be omitted if the ordering
+    of rows is not important.  It is also possible to omit <literal>PARTITION
+    BY</>, in which case the window function is computed over all rows of the
+    virtual table; that is, there is one partition containing all the rows.
+   </para>
+
+   <para>
+    There is another important concept associated with window functions:
+    for each row, there is a set of rows within its partition called its
+    <firstterm>window frame</>.  When <literal>ORDER BY</> is omitted the
+    frame is always the same as the partition.  If <literal>ORDER BY</> is
+    supplied, the frame consists of all rows from the start of the partition
+    up to the current row, plus any following rows that are equal to the
+    current row according to the <literal>ORDER BY</> clause.
+     <footnote>
+      <para>
+       The SQL standard includes options to define the window frame in
+       other ways, but this definition is the only one currently supported
+       by <productname>PostgreSQL</productname>.
+      </para>
+     </footnote>
+    Many window functions act only on the rows of the window frame, rather
+    than of the whole partition.  Here is an example using <function>sum</>:
+   </para>
+
+<programlisting>
+SELECT salary, sum(salary) OVER () FROM empsalary;
+</programlisting>
+
+<screen>
+ salary |  sum  
+--------+-------
+   5200 | 47100
+   5000 | 47100
+   3500 | 47100
+   4800 | 47100
+   3900 | 47100
+   4200 | 47100
+   4500 | 47100
+   4800 | 47100
+   6000 | 47100
+   5200 | 47100
+(10 rows)
+</screen>
+
+   <para>
+    Above, since there is no <literal>ORDER BY</> in the <literal>OVER</>
+    clause, the window frame is the same as the partition, which for lack of
+    <literal>PARTITION BY</> is the whole table; in other words each sum is
+    taken over the whole table and so we get the same result for each output
+    row.  But if we add an <literal>ORDER BY</> clause, we get very different
+    results:
+   </para>
+
+<programlisting>
+SELECT salary, sum(salary) OVER (ORDER BY salary) FROM empsalary;
+</programlisting>
+
+<screen>
+ salary |  sum  
+--------+-------
+   3500 |  3500
+   3900 |  7400
+   4200 | 11600
+   4500 | 16100
+   4800 | 25700
+   4800 | 25700
+   5000 | 30700
+   5200 | 41100
+   5200 | 41100
+   6000 | 47100
+(10 rows)
+</screen>
+
+   <para>
+    Here the sum is taken from the first (lowest) salary up through the
+    current one, including any duplicates of the current one (notice the
+    results for the duplicated salaries).
+   </para>
+
+   <para>
+    Window functions are permitted only in the <literal>SELECT</literal> list
+    and the <literal>ORDER BY</> clause of the query. They are forbidden
+    elsewhere, such as in <literal>GROUP BY</>, <literal>HAVING</>
+    and <literal>WHERE</literal> clauses.  This is because they logically
+    execute after the processing of those clauses.  Also, window functions
+    execute after regular aggregate functions.  This means it is valid to
+    include an aggregate function call in the arguments of a window function,
+    but not vice versa.
+   </para>
+
+   <para>
+    If there is a need to filter or group rows after the window calculations
+    are performed, you can use a sub-select.  For example:
+
+<programlisting>
+SELECT depname, empno, salary, enroll_date
+FROM
+  (SELECT depname, empno, salary, enroll_date,
+          rank() OVER (PARTITION BY depname ORDER BY salary DESC, empno) AS pos
+     FROM empsalary
+  ) AS ss
+WHERE pos < 3;
+</programlisting>
+
+    The above query only shows the rows from the inner query having
+    <literal>rank</> less than <literal>3</>.
+   </para>
+
+   <para>
+    When a query involves multiple window functions, it is possible to write
+    out each one with a separate <literal>OVER</> clause, but this is
+    duplicative and error-prone if the same windowing behavior is wanted
+    for several functions.  Instead, each windowing behavior can be named
+    in a <literal>WINDOW</> clause and then referenced in <literal>OVER</>.
+    For example:
+
+<programlisting>
+SELECT sum(salary) OVER w, avg(salary) OVER w
+  FROM empsalary
+  WINDOW w AS (PARTITION BY depname ORDER BY salary DESC);
+</programlisting>
+   </para>
+  </sect1>
+
+
  <sect1 id="tutorial-inheritance">
   <title>Inheritance</title>

@ -391,7 +627,7 @@ CREATE TABLE capitals (

   <para>
    For example, the  following  query finds the  names  of  all  cities,
-    including  state capitals, that are located at an altitude 
+    including  state capitals, that are located at an altitude
    over 500 feet:

 <programlisting>
@ -455,7 +691,7 @@ SELECT name, altitude

  <sect1 id="tutorial-conclusion">
   <title>Conclusion</title>
- 
+
   <para>
    <productname>PostgreSQL</productname> has many features not
    touched upon in this tutorial introduction, which has been
--- a/doc/src/sgml/errcodes.sgml
+++ b/doc/src/sgml/errcodes.sgml
@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/errcodes.sgml,v 1.25 2008/10/04 21:56:52 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/errcodes.sgml,v 1.26 2008/12/28 18:53:53 tgl Exp $ -->

 <appendix id="errcodes-appendix">
 <title><productname>PostgreSQL</productname> Error Codes</title>
@ -378,6 +378,18 @@
 <entry>invalid_argument_for_logarithm</entry>
 </row>

+<row>
+<entry><literal>22014</literal></entry>
+<entry>INVALID ARGUMENT FOR NTILE FUNCTION</entry>
+<entry>invalid_argument_for_ntile_function</entry>
+</row>
+
+<row>
+<entry><literal>22016</literal></entry>
+<entry>INVALID ARGUMENT FOR NTH_VALUE FUNCTION</entry>
+<entry>invalid_argument_for_nth_value_function</entry>
+</row>
+
 <row>
 <entry><literal>2201F</literal></entry>
 <entry>INVALID ARGUMENT FOR POWER FUNCTION</entry>
@ -990,6 +1002,12 @@
 <entry>grouping_error</entry>
 </row>

+<row>
+<entry><literal>42P20</literal></entry>
+<entry>WINDOWING ERROR</entry>
+<entry>windowing_error</entry>
+</row>
+
 <row>
 <entry><literal>42P19</literal></entry>
 <entry>INVALID RECURSION</entry>
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.463 2008/12/19 16:25:16 petere Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.464 2008/12/28 18:53:53 tgl Exp $ -->

 <chapter id="functions">
  <title>Functions and Operators</title>
@ -10149,6 +10149,278 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;

 </sect1>

+ <sect1 id="functions-window">
+  <title>Window Functions</title>
+
+  <indexterm zone="functions-window">
+   <primary>window function</primary>
+   <secondary>built-in</secondary>
+  </indexterm>
+
+  <para>
+   <firstterm>Window functions</firstterm> provide the ability to perform
+   calculations across sets of rows that are related to the current query
+   row.  For information about this feature see
+   <xref linkend="tutorial-window"> and
+   <xref linkend="syntax-window-functions">.
+  </para>
+
+  <para>
+   The built-in window functions are listed in
+   <xref linkend="functions-window-table">.  Note that these functions
+   <emphasis>must</> be invoked using window function syntax; that is an
+   <literal>OVER</> clause is required.
+  </para>
+
+  <para>
+   In addition to these functions, any built-in or user-defined aggregate
+   function can be used as a window function (see
+   <xref linkend="functions-aggregate"> for a list of the built-in aggregates).
+   Aggregate functions act as window functions only when an <literal>OVER</>
+   clause follows the call; otherwise they act as regular aggregates.
+  </para>
+
+  <table id="functions-window-table">
+   <title>General-Purpose Window Functions</title>
+
+   <tgroup cols="3">
+    <thead>
+     <row>
+      <entry>Function</entry>
+      <entry>Return Type</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry>
+       <indexterm>
+        <primary>row_number</primary>
+       </indexterm>
+       <function>row_number()</function>
+      </entry>
+      <entry>
+       <type>bigint</type>
+      </entry>
+      <entry>number of the current row within its partition, counting from 1</entry>
+     </row>
+
+     <row>
+      <entry>
+       <indexterm>
+        <primary>rank</primary>
+       </indexterm>
+       <function>rank()</function>
+      </entry>
+      <entry>
+       <type>bigint</type>
+      </entry>
+      <entry>rank of the current row with gaps; same as <function>row_number</> of its first peer</entry>
+     </row>
+
+     <row>
+      <entry>
+       <indexterm>
+        <primary>dense_rank</primary>
+       </indexterm>
+       <function>dense_rank()</function>
+      </entry>
+      <entry>
+       <type>bigint</type>
+      </entry>
+      <entry>rank of the current row without gaps; this function counts peer groups</entry>
+     </row>
+
+     <row>
+      <entry>
+       <indexterm>
+        <primary>percent_rank</primary>
+       </indexterm>
+       <function>percent_rank()</function>
+      </entry>
+      <entry>
+       <type>double precision</type>
+      </entry>
+      <entry>relative rank of the current row: (<function>rank</> - 1) / (total rows - 1)</entry>
+     </row>
+
+     <row>
+      <entry>
+       <indexterm>
+        <primary>cume_dist</primary>
+       </indexterm>
+       <function>cume_dist()</function>
+      </entry>
+      <entry>
+       <type>double precision</type>
+      </entry>
+      <entry>relative rank of the current row: (number of rows preceding or peer with current row) / (total rows)</entry>
+     </row>
+
+     <row>
+      <entry>
+       <indexterm>
+        <primary>ntile</primary>
+       </indexterm>
+       <function>ntile(<replaceable class="parameter">num_buckets</replaceable> <type>integer</>)</function>
+      </entry>
+      <entry>
+       <type>integer</type>
+      </entry>
+      <entry>integer ranging from 1 to the argument value, dividing the
+       partition as equally as possible</entry>
+     </row>
+
+     <row>
+      <entry>
+       <indexterm>
+        <primary>lag</primary>
+       </indexterm>
+       <function>
+         lag(<replaceable class="parameter">value</replaceable> <type>any</>
+             [, <replaceable class="parameter">offset</replaceable> <type>integer</>
+             [, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+       </function>
+      </entry>
+      <entry>
+       <type>same type as <replaceable class="parameter">value</replaceable></type>
+      </entry>
+      <entry>
+       returns <replaceable class="parameter">value</replaceable> evaluated at
+       the row that is <replaceable class="parameter">offset</replaceable>
+       rows before the current row within the partition; if there is no such
+       row, instead return <replaceable class="parameter">default</replaceable>.
+       Both <replaceable class="parameter">offset</replaceable> and
+       <replaceable class="parameter">default</replaceable> are evaluated
+       with respect to the current row.  If omitted,
+       <replaceable class="parameter">offset</replaceable> defaults to 1 and
+       <replaceable class="parameter">default</replaceable> to null
+      </entry>
+     </row>
+
+     <row>
+      <entry>
+       <indexterm>
+        <primary>lead</primary>
+       </indexterm>
+       <function>
+         lead(<replaceable class="parameter">value</replaceable> <type>any</>
+              [, <replaceable class="parameter">offset</replaceable> <type>integer</>
+              [, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+       </function>
+      </entry>
+      <entry>
+       <type>same type as <replaceable class="parameter">value</replaceable></type>
+      </entry>
+      <entry>
+       returns <replaceable class="parameter">value</replaceable> evaluated at
+       the row that is <replaceable class="parameter">offset</replaceable>
+       rows after the current row within the partition; if there is no such
+       row, instead return <replaceable class="parameter">default</replaceable>.
+       Both <replaceable class="parameter">offset</replaceable> and
+       <replaceable class="parameter">default</replaceable> are evaluated
+       with respect to the current row.  If omitted,
+       <replaceable class="parameter">offset</replaceable> defaults to 1 and
+       <replaceable class="parameter">default</replaceable> to null
+      </entry>
+     </row>
+
+     <row>
+      <entry>
+       <indexterm>
+        <primary>first_value</primary>
+       </indexterm>
+       <function>first_value(<replaceable class="parameter">value</replaceable> <type>any</>)</function>
+      </entry>
+      <entry>
+       <type>same type as <replaceable class="parameter">value</replaceable></type>
+      </entry>
+      <entry>
+       returns <replaceable class="parameter">value</replaceable> evaluated
+       at the row that is the first row of the window frame
+      </entry>
+     </row>
+
+     <row>
+      <entry>
+       <indexterm>
+        <primary>last_value</primary>
+       </indexterm>
+       <function>last_value(<replaceable class="parameter">value</replaceable> <type>any</>)</function>
+      </entry>
+      <entry>
+       <type>same type as <replaceable class="parameter">value</replaceable></type>
+      </entry>
+      <entry>
+       returns <replaceable class="parameter">value</replaceable> evaluated
+       at the row that is the last row of the window frame
+      </entry>
+     </row>
+
+     <row>
+      <entry>
+       <indexterm>
+        <primary>nth_value</primary>
+       </indexterm>
+       <function>
+         nth_value(<replaceable class="parameter">value</replaceable> <type>any</>, <replaceable class="parameter">nth</replaceable> <type>integer</>)
+       </function>
+      </entry>
+      <entry>
+       <type>same type as <replaceable class="parameter">value</replaceable></type>
+      </entry>
+      <entry>
+       returns <replaceable class="parameter">value</replaceable> evaluated
+       at the row that is the <replaceable class="parameter">nth</replaceable>
+       row of the window frame (counting from 1); null if no such row
+      </entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+
+  <para>
+   All of the functions listed in
+   <xref linkend="functions-window-table"> depend on the sort ordering
+   specified by the <literal>ORDER BY</> clause of the associated window
+   definition.  Rows that are not distinct in the <literal>ORDER BY</>
+   ordering are said to be <firstterm>peers</>; the four ranking functions
+   are defined so that they give the same answer for any two peer rows.
+  </para>
+
+  <para>
+   Note that <function>first_value</>, <function>last_value</>, and
+   <function>nth_value</> consider only the rows within the <quote>window
+   frame</>, that is the rows from the start of the partition through the
+   last peer of the current row.  This is particularly likely to give
+   unintuitive results for <function>last_value</>.
+  </para>
+
+  <para>
+   When an aggregate function is used as a window function, it aggregates
+   over the rows within the current row's window frame.  To obtain
+   aggregation over the whole partition, be sure to omit <literal>ORDER BY</>
+   from the window definition.  An aggregate used with <literal>ORDER BY</>
+   produces a <quote>running sum</> type of behavior, which may or may not
+   be what's wanted.
+  </para>
+
+  <note>
+   <para>
+    The SQL standard defines a <literal>RESPECT NULLS</> or
+    <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
+    <function>first_value</>, <function>last_value</>, and
+    <function>nth_value</>.  This is not implemented in
+    <productname>PostgreSQL</productname>: the behavior is always the
+    same as the standard's default, namely <literal>RESPECT NULLS</>.
+    Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
+    option for <function>nth_value</> is not implemented: only the
+    default <literal>FROM FIRST</> behavior is supported.
+   </para>
+  </note>
+
+ </sect1>

 <sect1 id="functions-subquery">
  <title>Subquery Expressions</title>
--- a/doc/src/sgml/queries.sgml
+++ b/doc/src/sgml/queries.sgml
@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/queries.sgml,v 1.50 2008/10/14 00:41:34 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/queries.sgml,v 1.51 2008/12/28 18:53:54 tgl Exp $ -->

 <chapter id="queries">
 <title>Queries</title>
@ -949,6 +949,57 @@ SELECT product_id, p.name, (sum(s.units) * (p.price - p.cost)) AS profit
    5000.  Note that the aggregate expressions do not necessarily need
    to be the same in all parts of the query.
   </para>
+
+   <para>
+    If a query contains aggregate function calls, but no <literal>GROUP BY</>
+    clause, grouping still occurs: the result is a single group row (or
+    perhaps no rows at all, if the single row is then eliminated by
+    <literal>HAVING</>).
+    The same is true if it contains a <literal>HAVING</> clause, even
+    without any aggregate function calls or <literal>GROUP BY</> clause.
+   </para>
+  </sect2>
+
+  <sect2 id="queries-window">
+   <title>Window Function Processing</>
+
+   <indexterm zone="queries-window">
+    <primary>window function</primary>
+    <secondary>order of execution</>
+   </indexterm>
+
+   <para>
+    If the query contains any window functions (see
+    <xref linkend="tutorial-window"> and
+    <xref linkend="syntax-window-functions">), these functions are evaluated
+    after any grouping, aggregation, and <literal>HAVING</> filtering is
+    performed.  That is, if the query uses any aggregates, <literal>GROUP
+    BY</>, or <literal>HAVING</>, then the rows seen by the window functions
+    are the group rows instead of the original table rows from
+    <literal>FROM</>/<literal>WHERE</>.
+   </para>
+
+   <para>
+    When multiple window functions are used, all the window functions having
+    syntactically equivalent <literal>PARTITION BY</> and <literal>ORDER BY</>
+    clauses in their window definitions are guaranteed to be evaluated in a
+    single pass over the data. Therefore they will see the same sort ordering,
+    even if the <literal>ORDER BY</> does not uniquely determine an ordering.
+    However, no guarantees are made about the evaluation of functions having
+    different <literal>PARTITION BY</> or <literal>ORDER BY</> specifications.
+    (In such cases a sort step is typically required between the passes of
+    window function evaluations, and the sort is not guaranteed to preserve
+    ordering of rows that its <literal>ORDER BY</> sees as equivalent.)
+   </para>
+
+   <para>
+    Currently, use of window functions always forces sorting, and so the
+    query output will be ordered according to one or another of the window
+    functions' <literal>PARTITION BY</>/<literal>ORDER BY</> clauses.
+    It is not recommendable to rely on this, however.  Use an explicit
+    top-level <literal>ORDER BY</> clause if you want to be sure the
+    results are sorted in a particular way.
+   </para>
  </sect2>
 </sect1>

--- a/doc/src/sgml/query.sgml
+++ b/doc/src/sgml/query.sgml
@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/query.sgml,v 1.50 2007/02/01 00:28:17 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/query.sgml,v 1.51 2008/12/28 18:53:54 tgl Exp $ -->

 <chapter id="tutorial-sql">
  <title>The <acronym>SQL</acronym> Language</title>
@ -621,7 +621,7 @@ SELECT W1.city, W1.temp_lo AS low, W1.temp_hi AS high,
 San Francisco |  43 |   57 | San Francisco |  46 |   50
 Hayward       |  37 |   54 | San Francisco |  46 |   50
 (2 rows)
-</programlisting>     
+</programlisting>

    Here we have relabeled the weather table as <literal>W1</> and
    <literal>W2</> to be able to distinguish the left and right side
@ -651,9 +651,9 @@ SELECT *
    <indexterm><primary>min</primary></indexterm>
    <indexterm><primary>sum</primary></indexterm>

-    Like  most  other relational database products, 
+    Like  most  other relational database products,
    <productname>PostgreSQL</productname> supports
-    aggregate functions.
+    <firstterm>aggregate functions</>.
    An aggregate function computes a single result from multiple input rows.
    For example, there are aggregates to compute the
    <function>count</function>, <function>sum</function>,
@ -815,7 +815,7 @@ SELECT city, max(temp_lo)

   <para>
    You can update existing rows using the
-    <command>UPDATE</command> command. 
+    <command>UPDATE</command> command.
    Suppose you discover the temperature readings are
    all off by 2 degrees after November 28.  You can correct the
    data as follows:
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@ -1,5 +1,5 @@
 <!--
-$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.112 2008/12/01 09:38:08 petere Exp $
+$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.113 2008/12/28 18:53:54 tgl Exp $
 PostgreSQL documentation
 -->

@ -39,6 +39,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
    [ WHERE <replaceable class="parameter">condition</replaceable> ]
    [ GROUP BY <replaceable class="parameter">expression</replaceable> [, ...] ]
    [ HAVING <replaceable class="parameter">condition</replaceable> [, ...] ]
+    [ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...] ]
    [ { UNION | INTERSECT | EXCEPT } [ ALL ] <replaceable class="parameter">select</replaceable> ]
    [ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
    [ LIMIT { <replaceable class="parameter">count</replaceable> | ALL } ]
@ -566,6 +567,67 @@ HAVING <replaceable class="parameter">condition</replaceable>
   </para>
  </refsect2>

+  <refsect2 id="SQL-WINDOW">
+   <title id="sql-window-title"><literal>WINDOW</literal> Clause</title>
+
+   <para>
+    The optional <literal>WINDOW</literal> clause has the general form
+<synopsis>
+WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...]
+</synopsis>
+    where <replaceable class="parameter">window_name</replaceable> is
+    a name that can be referenced from subsequent window definitions or
+    <literal>OVER</> clauses, and
+    <replaceable class="parameter">window_definition</replaceable> is
+<synopsis>
+[ <replaceable class="parameter">existing_window_name</replaceable> ]
+[ PARTITION BY <replaceable class="parameter">expression</replaceable> [, ...] ]
+[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
+</synopsis>
+    The elements of the <literal>PARTITION BY</> list are interpreted in
+    the same fashion as elements of a
+    <xref linkend="sql-groupby" endterm="sql-groupby-title">, and
+    the elements of the <literal>ORDER BY</> list are interpreted in the
+    same fashion as elements of an
+    <xref linkend="sql-orderby" endterm="sql-orderby-title">.
+    The only difference is that these expressions can contain aggregate
+    function calls, which are not allowed in a regular <literal>GROUP BY</>
+    clause.  They are allowed here because windowing occurs after grouping
+    and aggregation.
+   </para>
+
+   <para>
+    If an <replaceable class="parameter">existing_window_name</replaceable>
+    is specified it must refer to an earlier entry in the <literal>WINDOW</>
+    list; the new window copies its partitioning clause from that entry,
+    as well as its ordering clause if any.  In this case the new window cannot
+    specify its own <literal>PARTITION BY</> clause, and it can specify
+    <literal>ORDER BY</> only if the copied window does not have one.
+   </para>
+
+   <para>
+    The purpose of a <literal>WINDOW</literal> clause is to specify the
+    behavior of <firstterm>window functions</> appearing in the query's
+    <xref linkend="sql-select-list" endterm="sql-select-list-title"> or
+    <xref linkend="sql-orderby" endterm="sql-orderby-title">.  These functions
+    can reference the <literal>WINDOW</literal> clause entries by name
+    in their <literal>OVER</> clauses.  A <literal>WINDOW</literal> clause
+    entry does not have to be referenced anywhere, however; if it is not
+    used in the query it is simply ignored.  It is possible to use window
+    functions without any <literal>WINDOW</literal> clause at all, since
+    a window function call can specify its window definition directly in
+    its <literal>OVER</> clause.  However, the <literal>WINDOW</literal>
+    clause saves typing when the same window definition is needed for more
+    than one window function.
+   </para>
+
+   <para>
+    Window functions are described in detail in
+    <xref linkend="tutorial-window"> and
+    <xref linkend="syntax-window-functions">.
+   </para>
+  </refsect2>
+
  <refsect2 id="sql-select-list">
   <title id="sql-select-list-title"><command>SELECT</command> List</title>

@ -922,7 +984,7 @@ FETCH { FIRST | NEXT } [ <replaceable class="parameter">count</replaceable> ] {
    constants for the offset or fetch count, parentheses will be
    necessary in most cases.  If the fetch count is omitted, it
    defaults to 1.
-   </para>    
+   </para>

   <para>
    When using <literal>LIMIT</>, it is a good idea to use an
@ -1387,6 +1449,19 @@ SELECT distributors.* WHERE distributors.name = 'Westward';
   </para>
  </refsect2>

+  <refsect2>
+   <title><literal>WINDOW</literal> Clause Restrictions</title>
+
+   <para>
+    The SQL standard provides for an optional <quote>framing clause</>,
+    introduced by the key word <literal>RANGE</> or <literal>ROWS</>,
+    in window definitions.  <productname>PostgreSQL</productname> does
+    not yet implement framing clauses, and always follows the
+    default framing behavior, which is equivalent to the framing clause
+    <literal>ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW</>.
+   </para>
+  </refsect2>
+
  <refsect2>
   <title><literal>LIMIT</literal> and <literal>OFFSET</literal></title>

--- a/doc/src/sgml/ref/select_into.sgml
+++ b/doc/src/sgml/ref/select_into.sgml
@ -1,5 +1,5 @@
 <!--
-$PostgreSQL: pgsql/doc/src/sgml/ref/select_into.sgml,v 1.43 2008/11/14 10:22:47 petere Exp $
+$PostgreSQL: pgsql/doc/src/sgml/ref/select_into.sgml,v 1.44 2008/12/28 18:53:54 tgl Exp $
 PostgreSQL documentation
 -->

@ -29,6 +29,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
    [ WHERE <replaceable class="parameter">condition</replaceable> ]
    [ GROUP BY <replaceable class="parameter">expression</replaceable> [, ...] ]
    [ HAVING <replaceable class="parameter">condition</replaceable> [, ...] ]
+    [ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...] ]
    [ { UNION | INTERSECT | EXCEPT } [ ALL ] <replaceable class="parameter">select</replaceable> ]
    [ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
    [ LIMIT { <replaceable class="parameter">count</replaceable> | ALL } ]
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.126 2008/12/09 20:52:03 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.127 2008/12/28 18:53:54 tgl Exp $ -->

 <chapter id="sql-syntax">
 <title>SQL Syntax</title>
@ -1201,6 +1201,12 @@ SELECT 3 OPERATOR(pg_catalog.+) 4;
     </para>
    </listitem>

+    <listitem>
+     <para>
+      A window function call.
+     </para>
+    </listitem>
+
    <listitem>
     <para>
      A type cast.
@ -1445,7 +1451,7 @@ $1.somecolumn
    enclosed in parentheses:

 <synopsis>
-<replaceable>function</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional> )
+<replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional> )
 </synopsis>
   </para>

@ -1480,7 +1486,7 @@ sqrt(2)
 <synopsis>
 <replaceable>aggregate_name</replaceable> (<replaceable>expression</replaceable> [ , ... ] )
 <replaceable>aggregate_name</replaceable> (ALL <replaceable>expression</replaceable> [ , ... ] )
-<replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable> [ , ... ] )
+<replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable>)
 <replaceable>aggregate_name</replaceable> ( * )
 </synopsis>

@ -1488,7 +1494,7 @@ sqrt(2)
    defined aggregate (possibly qualified with a schema name), and
    <replaceable>expression</replaceable> is
    any value expression that does not itself contain an aggregate
-    expression.
+    expression or a window function call.
   </para>

   <para>
@ -1550,6 +1556,89 @@ sqrt(2)
   </note>
  </sect2>

+  <sect2 id="syntax-window-functions">
+   <title>Window Function Calls</title>
+
+   <indexterm zone="syntax-window-functions">
+    <primary>window function</primary>
+    <secondary>invocation</secondary>
+   </indexterm>
+
+   <indexterm zone="syntax-window-functions">
+    <primary>OVER clause</primary>
+   </indexterm>
+
+   <para>
+    A <firstterm>window function call</firstterm> represents the application
+    of an aggregate-like function over some portion of the rows selected
+    by a query.  Unlike regular aggregate function calls, this is not tied
+    to grouping of the selected rows into a single output row &mdash; each
+    row remains separate in the query output.  However the window function
+    is able to scan all the rows that would be part of the current row's
+    group according to the grouping specification (<literal>PARTITION BY</>
+    list) of the window function call.
+    The syntax of a window function call is one of the following:
+
+<synopsis>
+<replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional>) OVER ( <replaceable class="parameter">window_definition</replaceable> )
+<replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional>) OVER <replaceable>window_name</replaceable>
+<replaceable>function_name</replaceable> ( * ) OVER ( <replaceable class="parameter">window_definition</replaceable> )
+<replaceable>function_name</replaceable> ( * ) OVER <replaceable>window_name</replaceable>
+</synopsis>
+    where <replaceable class="parameter">window_definition</replaceable>
+    has the syntax
+<synopsis>
+[ <replaceable class="parameter">window_name</replaceable> ]
+[ PARTITION BY <replaceable class="parameter">expression</replaceable> [, ...] ]
+[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
+</synopsis>
+
+    Here, <replaceable>expression</replaceable> represents any value
+    expression that does not itself contain window function calls.
+    The <literal>PARTITION BY</> and <literal>ORDER BY</> lists have
+    essentially the same syntax and semantics as <literal>GROUP BY</>
+    and <literal>ORDER BY</> clauses of the whole query.
+    <replaceable>window_name</replaceable> is a reference to a named window
+    specification defined in the query's <literal>WINDOW</literal> clause.
+    Named window specifications are usually referenced with just
+    <literal>OVER</> <replaceable>window_name</replaceable>, but it is
+    also possible to write a window name inside the parentheses and then
+    optionally override its ordering clause with <literal>ORDER BY</>.
+    This latter syntax follows the same rules as modifying an existing
+    window name within the <literal>WINDOW</literal> clause; see the
+    <xref linkend="sql-select" endterm="sql-select-title"> reference
+    page for details.
+   </para>
+
+   <para>
+    The built-in window functions are described in <xref
+    linkend="functions-window-table">.  Also, any built-in or
+    user-defined aggregate function can be used as a window function.
+    Currently, there is no provision for user-defined window functions
+    other than aggregates.
+   </para>
+
+   <para>
+    The syntaxes using <literal>*</> are used for calling parameter-less
+    aggregate functions as window functions, for example
+    <literal>count(*) OVER (PARTITION BY x ORDER BY y)</>.
+    <literal>*</> is customarily not used for non-aggregate window functions.
+    Aggregate window functions, unlike normal aggregate functions, do not
+    allow <literal>DISTINCT</> to be used within the function argument list.
+   </para>
+
+   <para>
+    Window function calls are permitted only in the <literal>SELECT</literal>
+    list and the <literal>ORDER BY</> clause of the query.
+   </para>
+
+   <para>
+    More information about window functions can be found in
+    <xref linkend="tutorial-window"> and
+    <xref linkend="queries-window">.
+   </para>
+  </sect2>
+
  <sect2 id="sql-syntax-type-casts">
   <title>Type Casts</title>

--- a/doc/src/sgml/xaggr.sgml
+++ b/doc/src/sgml/xaggr.sgml
@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/xaggr.sgml,v 1.36 2008/11/20 21:10:44 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/xaggr.sgml,v 1.37 2008/12/28 18:53:54 tgl Exp $ -->

 <sect1 id="xaggr">
  <title>User-Defined Aggregates</title>
@ -167,10 +167,13 @@ SELECT attrelid::regclass, array_accum(atttypid::regtype)
  <para>
   A function written in C can detect that it is being called as an
   aggregate transition or final function by seeing if it was passed
-   an <structname>AggState</> node as the function call <quote>context</>,
+   an <structname>AggState</> or <structname>WindowAggState</> node
+   as the function call <quote>context</>,
   for example by:
 <programlisting>
-        if (fcinfo->context &amp;&amp; IsA(fcinfo->context, AggState))
+        if (fcinfo-&gt;context &amp;&amp;
+            (IsA(fcinfo-&gt;context, AggState) ||
+             IsA(fcinfo-&gt;context, WindowAggState)))
 </programlisting>
   One reason for checking this is that when it is true, the first input
   must be a temporary transition value and can therefore safely be modified