mirror of
https://github.com/postgres/postgres.git
synced 2025-07-30 11:03:19 +03:00
SEARCH and CYCLE clauses
This adds the SQL standard feature that adds the SEARCH and CYCLE clauses to recursive queries to be able to do produce breadth- or depth-first search orders and detect cycles. These clauses can be rewritten into queries using existing syntax, and that is what this patch does in the rewriter. Reviewed-by: Vik Fearing <vik@postgresfriends.org> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/db80ceee-6f97-9b4a-8ee8-3ba0c58e5be2@2ndquadrant.com
This commit is contained in:
@ -2218,6 +2218,39 @@ SELECT * FROM search_tree <emphasis>ORDER BY depth</emphasis>;
|
||||
in any case.
|
||||
</para>
|
||||
</tip>
|
||||
|
||||
<para>
|
||||
There is built-in syntax to compute a depth- or breadth-first sort column.
|
||||
For example:
|
||||
|
||||
<programlisting>
|
||||
WITH RECURSIVE search_tree(id, link, data) AS (
|
||||
SELECT t.id, t.link, t.data
|
||||
FROM tree t
|
||||
UNION ALL
|
||||
SELECT t.id, t.link, t.data
|
||||
FROM tree t, search_tree st
|
||||
WHERE t.id = st.link
|
||||
) <emphasis>SEARCH DEPTH FIRST BY id SET ordercol</emphasis>
|
||||
SELECT * FROM search_tree ORDER BY ordercol;
|
||||
|
||||
WITH RECURSIVE search_tree(id, link, data) AS (
|
||||
SELECT t.id, t.link, t.data
|
||||
FROM tree t
|
||||
UNION ALL
|
||||
SELECT t.id, t.link, t.data
|
||||
FROM tree t, search_tree st
|
||||
WHERE t.id = st.link
|
||||
) <emphasis>SEARCH BREADTH FIRST BY id SET ordercol</emphasis>
|
||||
SELECT * FROM search_tree ORDER BY ordercol;
|
||||
</programlisting>
|
||||
This syntax is internally expanded to something similar to the above
|
||||
hand-written forms. The <literal>SEARCH</literal> clause specifies whether
|
||||
depth- or breadth first search is wanted, the list of columns to track for
|
||||
sorting, and a column name that will contain the result data that can be
|
||||
used for sorting. That column will implicitly be added to the output rows
|
||||
of the CTE.
|
||||
</para>
|
||||
</sect3>
|
||||
|
||||
<sect3 id="queries-with-cycle">
|
||||
@ -2305,10 +2338,39 @@ SELECT * FROM search_graph;
|
||||
</para>
|
||||
</tip>
|
||||
|
||||
<para>
|
||||
There is built-in syntax to simplify cycle detection. The above query can
|
||||
also be written like this:
|
||||
<programlisting>
|
||||
WITH RECURSIVE search_graph(id, link, data, depth) AS (
|
||||
SELECT g.id, g.link, g.data, 1
|
||||
FROM graph g
|
||||
UNION ALL
|
||||
SELECT g.id, g.link, g.data, sg.depth + 1
|
||||
FROM graph g, search_graph sg
|
||||
WHERE g.id = sg.link
|
||||
) <emphasis>CYCLE id SET is_cycle TO true DEFAULT false USING path</emphasis>
|
||||
SELECT * FROM search_graph;
|
||||
</programlisting>
|
||||
and it will be internally rewritten to the above form. The
|
||||
<literal>CYCLE</literal> clause specifies first the list of columns to
|
||||
track for cycle detection, then a column name that will show whether a
|
||||
cycle has been detected, then two values to use in that column for the yes
|
||||
and no cases, and finally the name of another column that will track the
|
||||
path. The cycle and path columns will implicitly be added to the output
|
||||
rows of the CTE.
|
||||
</para>
|
||||
|
||||
<tip>
|
||||
<para>
|
||||
The cycle path column is computed in the same way as the depth-first
|
||||
ordering column show in the previous section.
|
||||
ordering column show in the previous section. A query can have both a
|
||||
<literal>SEARCH</literal> and a <literal>CYCLE</literal> clause, but a
|
||||
depth-first search specification and a cycle detection specification would
|
||||
create redundant computations, so it's more efficient to just use the
|
||||
<literal>CYCLE</literal> clause and order by the path column. If
|
||||
breadth-first ordering is wanted, then specifying both
|
||||
<literal>SEARCH</literal> and <literal>CYCLE</literal> can be useful.
|
||||
</para>
|
||||
</tip>
|
||||
|
||||
|
@ -73,6 +73,8 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
|
||||
<phrase>and <replaceable class="parameter">with_query</replaceable> is:</phrase>
|
||||
|
||||
<replaceable class="parameter">with_query_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ] AS [ [ NOT ] MATERIALIZED ] ( <replaceable class="parameter">select</replaceable> | <replaceable class="parameter">values</replaceable> | <replaceable class="parameter">insert</replaceable> | <replaceable class="parameter">update</replaceable> | <replaceable class="parameter">delete</replaceable> )
|
||||
[ SEARCH { BREADTH | DEPTH } FIRST BY <replaceable>column_name</replaceable> [, ...] SET <replaceable>search_seq_col_name</replaceable> ]
|
||||
[ CYCLE <replaceable>column_name</replaceable> [, ...] SET <replaceable>cycle_mark_col_name</replaceable> TO <replaceable>cycle_mark_value</replaceable> DEFAULT <replaceable>cycle_mark_default</replaceable> USING <replaceable>cycle_path_col_name</replaceable> ]
|
||||
|
||||
TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
|
||||
</synopsis>
|
||||
@ -276,6 +278,48 @@ TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ]
|
||||
queries that do not use recursion or forward references.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The optional <literal>SEARCH</literal> clause computes a <firstterm>search
|
||||
sequence column</firstterm> that can be used for ordering the results of a
|
||||
recursive query in either breadth-first or depth-first order. The
|
||||
supplied column name list specifies the row key that is to be used for
|
||||
keeping track of visited rows. A column named
|
||||
<replaceable>search_seq_col_name</replaceable> will be added to the result
|
||||
column list of the <literal>WITH</literal> query. This column can be
|
||||
ordered by in the outer query to achieve the respective ordering. See
|
||||
<xref linkend="queries-with-search"/> for examples.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The optional <literal>CYCLE</literal> clause is used to detect cycles in
|
||||
recursive queries. The supplied column name list specifies the row key
|
||||
that is to be used for keeping track of visited rows. A column named
|
||||
<replaceable>cycle_mark_col_name</replaceable> will be added to the result
|
||||
column list of the <literal>WITH</literal> query. This column will be set
|
||||
to <replaceable>cycle_mark_value</replaceable> when a cycle has been
|
||||
detected, else to <replaceable>cycle_mark_default</replaceable>.
|
||||
Furthermore, processing of the recursive union will stop when a cycle has
|
||||
been detected. <replaceable>cycle_mark_value</replaceable> and
|
||||
<replaceable>cycle_mark_default</replaceable> must be constants and they
|
||||
must be coercible to a common data type, and the data type must have an
|
||||
inequality operator. (The SQL standard requires that they be character
|
||||
strings, but PostgreSQL does not require that.) Furthermore, a column
|
||||
named <replaceable>cycle_path_col_name</replaceable> will be added to the
|
||||
result column list of the <literal>WITH</literal> query. This column is
|
||||
used internally for tracking visited rows. See <xref
|
||||
linkend="queries-with-cycle"/> for examples.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Both the <literal>SEARCH</literal> and the <literal>CYCLE</literal> clause
|
||||
are only valid for recursive <literal>WITH</literal> queries. The
|
||||
<replaceable>with_query</replaceable> must be a <literal>UNION</literal>
|
||||
(or <literal>UNION ALL</literal>) of two <literal>SELECT</literal> (or
|
||||
equivalent) commands (no nested <literal>UNION</literal>s). If both
|
||||
clauses are used, the column added by the <literal>SEARCH</literal> clause
|
||||
appears before the columns added by the <literal>CYCLE</literal> clause.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The primary query and the <literal>WITH</literal> queries are all
|
||||
(notionally) executed at the same time. This implies that the effects of
|
||||
|
Reference in New Issue
Block a user