1
0
mirror of https://github.com/postgres/postgres.git synced 2025-06-20 15:22:23 +03:00

Implement SQL-standard LATERAL subqueries.

This patch implements the standard syntax of LATERAL attached to a
sub-SELECT in FROM, and also allows LATERAL attached to a function in FROM,
since set-returning function calls are expected to be one of the principal
use-cases.

The main change here is a rewrite of the mechanism for keeping track of
which relations are visible for column references while the FROM clause is
being scanned.  The parser "namespace" lists are no longer lists of bare
RTEs, but are lists of ParseNamespaceItem structs, which carry an RTE
pointer as well as some visibility-controlling flags.  Aside from
supporting LATERAL correctly, this lets us get rid of the ancient hacks
that required rechecking subqueries and JOIN/ON and function-in-FROM
expressions for invalid references after they were initially parsed.
Invalid column references are now always correctly detected on sight.

In passing, remove assorted parser error checks that are now dead code by
virtue of our having gotten rid of add_missing_from, as well as some
comments that are obsolete for the same reason.  (It was mainly
add_missing_from that caused so much fudging here in the first place.)

The planner support for this feature is very minimal, and will be improved
in future patches.  It works well enough for testing purposes, though.

catversion bump forced due to new field in RangeTblEntry.
This commit is contained in:
Tom Lane
2012-08-07 19:02:54 -04:00
parent 5078be4804
commit 5ebaaa4944
41 changed files with 1301 additions and 482 deletions

View File

@ -2444,7 +2444,7 @@
</row>
<row>
<entry><token>LATERAL</token></entry>
<entry></entry>
<entry>reserved</entry>
<entry>reserved</entry>
<entry>reserved</entry>
<entry></entry>

View File

@ -590,7 +590,7 @@ SELECT a.* FROM (my_table AS a JOIN your_table AS b ON ...) AS c
<para>
Subqueries specifying a derived table must be enclosed in
parentheses and <emphasis>must</emphasis> be assigned a table
alias name. (See <xref linkend="queries-table-aliases">.) For
alias name (as in <xref linkend="queries-table-aliases">). For
example:
<programlisting>
FROM (SELECT * FROM table1) AS alias_name
@ -697,6 +697,87 @@ SELECT *
expand to.
</para>
</sect3>
<sect3 id="queries-lateral">
<title><literal>LATERAL</> Subqueries</title>
<indexterm zone="queries-lateral">
<primary>LATERAL</>
<secondary>in the FROM clause</>
</indexterm>
<para>
Subqueries and table functions appearing in <literal>FROM</> can be
preceded by the key word <literal>LATERAL</>. This allows them to
reference columns provided by preceding <literal>FROM</> items.
(Without <literal>LATERAL</literal>, each <literal>FROM</> item is
evaluated independently and so cannot cross-reference any other
<literal>FROM</> item.)
A <literal>LATERAL</literal> item can appear at top level in the
<literal>FROM</> list, or within a <literal>JOIN</> tree; in the latter
case it can also refer to any items that are on the left-hand side of a
<literal>JOIN</> that it is on the right-hand side of.
</para>
<para>
When a <literal>FROM</> item contains <literal>LATERAL</literal>
cross-references, evaluation proceeds as follows: for each row of the
<literal>FROM</> item providing the cross-referenced column(s), or
set of rows of multiple <literal>FROM</> items providing the
columns, the <literal>LATERAL</literal> item is evaluated using that
row or row set's values of the columns. The resulting row(s) are
joined as usual with the rows they were computed from. This is
repeated for each row or set of rows from the column source table(s).
</para>
<para>
A trivial example of <literal>LATERAL</literal> is
<programlisting>
SELECT * FROM foo, LATERAL (SELECT * FROM bar WHERE bar.id = foo.bar_id) ss;
</programlisting>
This is not especially useful since it has exactly the same result as
the more conventional
<programlisting>
SELECT * FROM foo, bar WHERE bar.id = foo.bar_id;
</programlisting>
<literal>LATERAL</literal> is primarily useful when the cross-referenced
column is necessary for computing the row(s) to be joined. A common
application is providing an argument value for a set-returning function.
For example, supposing that <function>vertices(polygon)</> returns the
set of vertices of a polygon, we could identify close-together vertices
of polygons stored in a table with:
<programlisting>
SELECT p1.id, p2.id, v1, v2
FROM polygons p1, polygons p2,
LATERAL vertices(p1.poly) v1,
LATERAL vertices(p2.poly) v2
WHERE (v1 &lt;-&gt; v2) &lt; 10 AND p1.id != p2.id;
</programlisting>
This query could also be written
<programlisting>
SELECT p1.id, p2.id, v1, v2
FROM polygons p1 CROSS JOIN LATERAL vertices(p1.poly) v1,
polygons p2 CROSS JOIN LATERAL vertices(p2.poly) v2
WHERE (v1 &lt;-&gt; v2) &lt; 10 AND p1.id != p2.id;
</programlisting>
or in several other equivalent formulations.
</para>
<para>
It is often particularly handy to <literal>LEFT JOIN</> to a
<literal>LATERAL</literal> subquery, so that source rows will appear in
the result even if the <literal>LATERAL</literal> subquery produces no
rows for them. For example, if <function>get_product_names()</> returns
the names of products made by a manufacturer, but some manufacturers in
our table currently produce no products, we could find out which ones
those are like this:
<programlisting>
SELECT m.name
FROM manufacturers m LEFT JOIN LATERAL get_product_names(m.id) pname ON true
WHERE pname IS NULL;
</programlisting>
</para>
</sect3>
</sect2>
<sect2 id="queries-where">

View File

@ -50,10 +50,10 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
<phrase>where <replaceable class="parameter">from_item</replaceable> can be one of:</phrase>
[ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ]
( <replaceable class="parameter">select</replaceable> ) [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ]
[ LATERAL ] ( <replaceable class="parameter">select</replaceable> ) [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ]
<replaceable class="parameter">with_query_name</replaceable> [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ]
<replaceable class="parameter">function_name</replaceable> ( [ <replaceable class="parameter">argument</replaceable> [, ...] ] ) [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] | <replaceable class="parameter">column_definition</replaceable> [, ...] ) ]
<replaceable class="parameter">function_name</replaceable> ( [ <replaceable class="parameter">argument</replaceable> [, ...] ] ) AS ( <replaceable class="parameter">column_definition</replaceable> [, ...] )
[ LATERAL ] <replaceable class="parameter">function_name</replaceable> ( [ <replaceable class="parameter">argument</replaceable> [, ...] ] ) [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] | <replaceable class="parameter">column_definition</replaceable> [, ...] ) ]
[ LATERAL ] <replaceable class="parameter">function_name</replaceable> ( [ <replaceable class="parameter">argument</replaceable> [, ...] ] ) AS ( <replaceable class="parameter">column_definition</replaceable> [, ...] )
<replaceable class="parameter">from_item</replaceable> [ NATURAL ] <replaceable class="parameter">join_type</replaceable> <replaceable class="parameter">from_item</replaceable> [ ON <replaceable class="parameter">join_condition</replaceable> | USING ( <replaceable class="parameter">join_column</replaceable> [, ...] ) ]
<phrase>and <replaceable class="parameter">with_query</replaceable> is:</phrase>
@ -284,8 +284,8 @@ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
The <literal>FROM</literal> clause specifies one or more source
tables for the <command>SELECT</command>. If multiple sources are
specified, the result is the Cartesian product (cross join) of all
the sources. But usually qualification conditions
are added to restrict the returned rows to a small subset of the
the sources. But usually qualification conditions are added (via
<literal>WHERE</>) to restrict the returned rows to a small subset of the
Cartesian product.
</para>
@ -414,17 +414,18 @@ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
</para>
<para>
A <literal>JOIN</literal> clause combines two
<literal>FROM</> items. Use parentheses if necessary to
determine the order of nesting. In the absence of parentheses,
<literal>JOIN</literal>s nest left-to-right. In any case
<literal>JOIN</literal> binds more tightly than the commas
separating <literal>FROM</> items.
A <literal>JOIN</literal> clause combines two <literal>FROM</>
items, which for convenience we will refer to as <quote>tables</>,
though in reality they can be any type of <literal>FROM</> item.
Use parentheses if necessary to determine the order of nesting.
In the absence of parentheses, <literal>JOIN</literal>s nest
left-to-right. In any case <literal>JOIN</literal> binds more
tightly than the commas separating <literal>FROM</>-list items.
</para>
<para><literal>CROSS JOIN</> and <literal>INNER JOIN</literal>
produce a simple Cartesian product, the same result as you get from
listing the two items at the top level of <literal>FROM</>,
listing the two tables at the top level of <literal>FROM</>,
but restricted by the join condition (if any).
<literal>CROSS JOIN</> is equivalent to <literal>INNER JOIN ON
(TRUE)</>, that is, no rows are removed by qualification.
@ -449,7 +450,7 @@ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
joined rows, plus one row for each unmatched right-hand row
(extended with nulls on the left). This is just a notational
convenience, since you could convert it to a <literal>LEFT
OUTER JOIN</> by switching the left and right inputs.
OUTER JOIN</> by switching the left and right tables.
</para>
<para><literal>FULL OUTER JOIN</> returns all the joined rows, plus
@ -495,6 +496,47 @@ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>LATERAL</literal></term>
<listitem>
<para>The <literal>LATERAL</literal> key word can precede a
sub-<command>SELECT</command> or function-call <literal>FROM</>
item. This allows the sub-<command>SELECT</command> or function
expression to refer to columns of <literal>FROM</> items that appear
before it in the <literal>FROM</> list. (Without
<literal>LATERAL</literal>, each <literal>FROM</> item is evaluated
independently and so cannot cross-reference any other
<literal>FROM</> item.) A <literal>LATERAL</literal> item can
appear at top level in the <literal>FROM</> list, or within a
<literal>JOIN</> tree; in the latter case it can also refer to any
items that are on the left-hand side of a <literal>JOIN</> that it is
on the right-hand side of.
</para>
<para>
When a <literal>FROM</> item contains <literal>LATERAL</literal>
cross-references, evaluation proceeds as follows: for each row of the
<literal>FROM</> item providing the cross-referenced column(s), or
set of rows of multiple <literal>FROM</> items providing the
columns, the <literal>LATERAL</literal> item is evaluated using that
row or row set's values of the columns. The resulting row(s) are
joined as usual with the rows they were computed from. This is
repeated for each row or set of rows from the column source table(s).
</para>
<para>
The column source table(s) must be <literal>INNER</> or
<literal>LEFT</> joined to the <literal>LATERAL</literal> item, else
there would not be a well-defined set of rows from which to compute
each set of rows for the <literal>LATERAL</literal> item. Thus,
although a construct such as <literal><replaceable>X</> RIGHT JOIN
LATERAL <replaceable>Y</></literal> is syntactically valid, it is
not actually allowed for <replaceable>Y</> to reference
<replaceable>X</>.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
</refsect2>
@ -1532,6 +1574,26 @@ SELECT distance, employee_name FROM employee_recursive;
else the query will loop indefinitely. (See <xref linkend="queries-with">
for more examples.)
</para>
<para>
This example uses <literal>LATERAL</> to apply a set-returning function
<function>get_product_names()</> for each row of the
<structname>manufacturers</> table:
<programlisting>
SELECT m.name AS mname, pname
FROM manufacturers m, LATERAL get_product_names(m.id) pname;
</programlisting>
Manufacturers not currently having any products would not appear in the
result, since it is an inner join. If we wished to include the names of
such manufacturers in the result, we could do:
<programlisting>
SELECT m.name AS mname, pname
FROM manufacturers m LEFT JOIN LATERAL get_product_names(m.id) pname ON true;
</programlisting>
</para>
</refsect1>
<refsect1>
@ -1611,6 +1673,20 @@ SELECT distributors.* WHERE distributors.name = 'Westward';
</para>
</refsect2>
<refsect2>
<title>Function Calls in <literal>FROM</literal></title>
<para>
<productname>PostgreSQL</productname> allows a function call to be
written directly as a member of the <literal>FROM</> list. In the SQL
standard it would be necessary to wrap such a function call in a
sub-<command>SELECT</command>; that is, the syntax
<literal>FROM <replaceable>func</>(...) <replaceable>alias</></literal>
is approximately equivalent to
<literal>FROM (SELECT <replaceable>func</>(...)) <replaceable>alias</></literal>.
</para>
</refsect2>
<refsect2>
<title>Namespace Available to <literal>GROUP BY</literal> and <literal>ORDER BY</literal></title>