mirror of
				https://github.com/postgres/postgres.git
				synced 2025-10-25 13:17:41 +03:00 
			
		
		
		
	Rip out table expression section from SQL syntax chapter and develop it
into new chapter on query (SELECT) syntax. In the end this should become a narrative and example-filled counterpart to the SELECT reference page.
This commit is contained in:
		| @@ -1,4 +1,4 @@ | ||||
| <!-- $Header: /cvsroot/pgsql/doc/src/sgml/filelist.sgml,v 1.4 2001/01/06 11:58:56 petere Exp $ --> | ||||
| <!-- $Header: /cvsroot/pgsql/doc/src/sgml/filelist.sgml,v 1.5 2001/01/22 23:34:32 petere Exp $ --> | ||||
|  | ||||
| <!entity about      SYSTEM "about.sgml"> | ||||
| <!entity history    SYSTEM "history.sgml"> | ||||
| @@ -31,7 +31,7 @@ | ||||
| <!entity plsql      SYSTEM "plsql.sgml"> | ||||
| <!entity pltcl      SYSTEM "pltcl.sgml"> | ||||
| <!entity psql       SYSTEM "psql.sgml"> | ||||
| <!entity query-ug   SYSTEM "query-ug.sgml"> | ||||
| <!entity queries    SYSTEM "queries.sgml"> | ||||
| <!entity storage    SYSTEM "storage.sgml"> | ||||
| <!entity syntax     SYSTEM "syntax.sgml"> | ||||
| <!entity typeconv   SYSTEM "typeconv.sgml"> | ||||
|   | ||||
							
								
								
									
										819
									
								
								doc/src/sgml/queries.sgml
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
										819
									
								
								doc/src/sgml/queries.sgml
									
									
									
									
									
										Normal file
									
								
							| @@ -0,0 +1,819 @@ | ||||
| <!-- $Header: /cvsroot/pgsql/doc/src/sgml/queries.sgml,v 1.1 2001/01/22 23:34:33 petere Exp $ --> | ||||
|  | ||||
| <chapter id="queries"> | ||||
|  <title>Queries</title> | ||||
|  | ||||
|  <para> | ||||
|   A <firstterm>query</firstterm> is the process of or the command to | ||||
|   retrieve data from a database.  In SQL the <command>SELECT</command> | ||||
|   command is used to specify queries.  The general syntax of the | ||||
|   <command>SELECT</command> command is | ||||
| <synopsis> | ||||
| SELECT <replaceable>select_list</replaceable> FROM <replaceable>table_expression</replaceable> <optional><replaceable>sort_specification</replaceable></optional> | ||||
| </synopsis> | ||||
|   The following sections describe the details of the select list, the | ||||
|   table expression, and the sort specification.  The simplest kind of | ||||
|   query has the form | ||||
| <programlisting> | ||||
| SELECT * FROM table1; | ||||
| </programlisting> | ||||
|   Assuming that there is a table called table1, this command would | ||||
|   retrieve all rows and all columns from table1.  (The method of | ||||
|   retrieval depends on the client application.  For example, the | ||||
|   <application>psql</application> program will display an ASCII-art | ||||
|   table on the screen, client libraries will offer functions to | ||||
|   retrieve individual rows and columns.)  The select list | ||||
|   specification <literal>*</literal> means all columns that the table | ||||
|   expression happens to provide.  A select list can also select a | ||||
|   subset of the available columns or even make calculations on the | ||||
|   columns before retrieving them; see <xref | ||||
|   linkend="queries-select-lists">.  For example, if table1 has columns | ||||
|   named a, b, and c (and perhaps others) you can make the following | ||||
|   query: | ||||
| <programlisting> | ||||
| SELECT a, b + c FROM table1; | ||||
| </programlisting> | ||||
|   (assuming that b and c are of a numeric data type). | ||||
|  </para> | ||||
|  | ||||
|  <para> | ||||
|   <literal>FROM table1</literal> is a particularly simple kind of | ||||
|   table expression.  In general, table expressions can be complex | ||||
|   constructs of base tables, joins, and subqueries.  But you can also | ||||
|   omit the table expression entirely and use the SELECT command as a | ||||
|   calculator: | ||||
| <programlisting> | ||||
| SELECT 3 * 4; | ||||
| </programlisting> | ||||
|   This is more useful if the expressions in the select list return | ||||
|   varying results.  For example, you could call a function this way. | ||||
| <programlisting> | ||||
| SELECT random(); | ||||
| </programlisting> | ||||
|  </para> | ||||
|  | ||||
|  <sect1 id="queries-table-expressions"> | ||||
|   <title>Table Expressions</title> | ||||
|  | ||||
|   <para> | ||||
|    A <firstterm>table expression</firstterm> specifies a table.  The | ||||
|    table expression contains a FROM clause that is optionally followed | ||||
|    by WHERE, GROUP BY, and HAVING clauses.  Trivial table expressions | ||||
|    simply refer to a table on disk, a so-called base table, but more | ||||
|    complex expressions can be used to modify or combine base tables in | ||||
|    various ways. | ||||
|   </para> | ||||
|  | ||||
|   <para> | ||||
|    The WHERE, GROUP BY, and HAVING clauses in the table expression | ||||
|    specify a pipeline of successive transformations performed on the | ||||
|    table derived in the FROM clause.  The final transformed table that | ||||
|    is derived provides the input rows used to derive output rows as | ||||
|    specified by the select list of derived column value expressions. | ||||
|   </para> | ||||
| 	 | ||||
|   <sect2 id="queries-from"> | ||||
|    <title>FROM clause</title> | ||||
|   | ||||
|    <para> | ||||
|     The FROM clause derives a table from one or more other tables | ||||
|     given in a comma-separated table reference list. | ||||
| <synopsis> | ||||
| FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_reference</replaceable> <optional>, ...</optional></optional> | ||||
| </synopsis> | ||||
|  | ||||
|     A table reference may be a table name or a derived table such as a | ||||
|     subquery, a table join, or complex combinations of these.  If more | ||||
|     than one table reference is listed in the FROM clause they are | ||||
|     CROSS JOINed (see below) to form the derived table that may then | ||||
|     be subject to transformations by the WHERE, GROUP BY, and HAVING | ||||
|     clauses and is finally the result of the overall table expression. | ||||
|    </para> | ||||
|  | ||||
|    <para> | ||||
|     If a table reference is a simple table name and it is the | ||||
|     supertable in a table inheritance hierarchy, rows of the table | ||||
|     include rows from all of its subtable successors unless the | ||||
|     keyword ONLY precedes the table name. | ||||
|    </para> | ||||
|  | ||||
|    <sect3 id="queries-join"> | ||||
|     <title>Joined Tables</title> | ||||
|  | ||||
|     <para> | ||||
|      A joined table is a table derived from two other (real or | ||||
|      derived) tables according to the rules of the particular join | ||||
|      type.  INNER, OUTER, NATURAL, and CROSS JOIN are supported. | ||||
|     </para> | ||||
|  | ||||
|     <variablelist> | ||||
|      <title>Join Types</title> | ||||
|  | ||||
|      <varlistentry> | ||||
|       <term>CROSS JOIN</term> | ||||
|  | ||||
|       <listitem> | ||||
| <synopsis> | ||||
| <replaceable>T1</replaceable> CROSS JOIN <replaceable>T2</replaceable> | ||||
| </synopsis> | ||||
|  | ||||
|        <para> | ||||
|         For each combination of rows from | ||||
|         <replaceable>T1</replaceable> and | ||||
|         <replaceable>T2</replaceable> the derived table will contain a | ||||
|         row consisting of all columns in <replaceable>T1</replaceable> | ||||
|         followed by all columns in <replaceable>T2</replaceable>.  If | ||||
|         the tables have have N and M rows respectively, the joined | ||||
|         table will have N * M rows.  A cross join is essentially an | ||||
|         <literal>INNER JOIN ON TRUE</literal>. | ||||
|        </para> | ||||
|  | ||||
|        <tip> | ||||
|         <para> | ||||
|          <literal>FROM <replaceable>T1</replaceable> CROSS JOIN | ||||
|          <replaceable>T2</replaceable></literal> is equivalent to | ||||
|          <literal>FROM <replaceable>T1</replaceable>, | ||||
|          <replaceable>T2</replaceable></literal>. | ||||
|         </para> | ||||
|        </tip> | ||||
|       </listitem> | ||||
|      </varlistentry> | ||||
|  | ||||
|      <varlistentry> | ||||
|       <term>Qualified JOINs</term> | ||||
|       <listitem> | ||||
|  | ||||
| <synopsis> | ||||
| <replaceable>T1</replaceable> { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable> ON <replaceable>boolean expression</replaceable> | ||||
| <replaceable>T1</replaceable> { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable> USING ( <replaceable>join column list</replaceable> ) | ||||
| </synopsis> | ||||
|          | ||||
|        <para> | ||||
|         The words <token>INNER</token> and <token>OUTER</token> are | ||||
|         optional for all JOINs.  <token>INNER</token> is the default; | ||||
|         <token>LEFT</token>, <token>RIGHT</token>, and | ||||
|         <token>FULL</token> are for OUTER JOINs only. | ||||
|        </para> | ||||
|  | ||||
|        <para> | ||||
|         The <firstterm>join condition</firstterm> is specified in the | ||||
|         ON or USING clause.  (The meaning of the join condition | ||||
|         depends on the particular join type; see below.)  The ON | ||||
|         clause takes a Boolean value expression of the same kind as is | ||||
|         used in a WHERE clause.  The USING clause takes a | ||||
|         comma-separated list of column names, which the joined tables | ||||
|         must have in common, and joins the tables on the equality of | ||||
|         those columns as a set, resulting in a joined table having one | ||||
|         column for each common column listed and all of the other | ||||
|         columns from both tables.  Thus, <literal>USING (a, b, | ||||
|         c)</literal> is equivalent to <literal>ON (t1.a = t2.a AND | ||||
|         t1.b = t2.b AND t1.c = t2.c)</literal> with the exception that | ||||
|         if ON is used there will be two columns a, b, and c in the | ||||
|         result, whereas with USING there will be only one of each. | ||||
|        </para> | ||||
|  | ||||
|        <variablelist> | ||||
|         <varlistentry> | ||||
|          <term>INNER JOIN</term> | ||||
|  | ||||
|          <listitem> | ||||
|           <para> | ||||
|            For each row R1 of T1, the joined table has a row for each | ||||
|            row in T2 that satisfies the join condition with R1. | ||||
|           </para> | ||||
|          </listitem> | ||||
|         </varlistentry> | ||||
|  | ||||
|         <varlistentry> | ||||
|          <term>LEFT OUTER JOIN</term> | ||||
|  | ||||
|          <listitem> | ||||
|           <para> | ||||
|            First, an INNER JOIN is performed.  Then, for a row in T1 | ||||
|            that does not satisfy the join condition with any row in | ||||
|            T2, a joined row is returned with NULL values in columns of | ||||
|            T2.  Thus, the joined table unconditionally has a row for each | ||||
|            row in T1. | ||||
|           </para> | ||||
|          </listitem> | ||||
|         </varlistentry> | ||||
|           | ||||
|         <varlistentry> | ||||
|          <term>RIGHT OUTER JOIN</term> | ||||
|  | ||||
|          <listitem> | ||||
|           <para> | ||||
|            This is like a left join, only that the result table will | ||||
|            unconditionally have a row for each row in T2. | ||||
|           </para> | ||||
|          </listitem> | ||||
|         </varlistentry> | ||||
|           | ||||
|         <varlistentry> | ||||
|          <term>FULL OUTER JOIN</term> | ||||
|  | ||||
|          <listitem> | ||||
|           <para> | ||||
|            First, an INNER JOIN is performed.  Then, for each row in | ||||
|            T1 that does not satisfy the join condition with any row in | ||||
|            T2, a joined row is returned with null values in columns of | ||||
|            T2.  Also, for each row of T2 that does not satisfy the | ||||
|            join condition with any row in T1, a joined row with null | ||||
|            values in the columns of T1 is returned. | ||||
|           </para> | ||||
|          </listitem> | ||||
|         </varlistentry> | ||||
|        </variablelist> | ||||
|       </listitem> | ||||
|      </varlistentry> | ||||
|        | ||||
|      <varlistentry> | ||||
|       <term>NATURAL JOIN</term> | ||||
|  | ||||
|       <listitem> | ||||
| <synopsis> | ||||
| <replaceable>T1</replaceable> NATURAL { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> JOIN <replaceable>T2</replaceable> | ||||
| </synopsis> | ||||
|        <para> | ||||
|         A natural join creates a joined table where every pair of matching | ||||
|         column names between the two tables are merged into one column. The | ||||
|         join specification is effectively a USING clause containing all the | ||||
|         common column names and is otherwise like a Qualified JOIN. | ||||
|        </para> | ||||
|       </listitem> | ||||
|      </varlistentry> | ||||
|     </variablelist> | ||||
|  | ||||
|     <para> | ||||
|      Joins of all types can be chained together or nested where either | ||||
|      or both of <replaceable>T1</replaceable> and | ||||
|      <replaceable>T2</replaceable> may be JOINed tables.  Parenthesis | ||||
|      can be used around JOIN clauses to control the join order which | ||||
|      are otherwise left to right. | ||||
|     </para> | ||||
|    </sect3> | ||||
|  | ||||
|    <sect3 id="queries-subqueries"> | ||||
|     <title>Subqueries</title> | ||||
|  | ||||
|     <para> | ||||
|      Subqueries specifying a derived table must be enclosed in | ||||
|      parenthesis and <emphasis>must</emphasis> be named using an AS | ||||
|      clause.  (See <xref linkend="queries-table-aliases">.) | ||||
|     </para> | ||||
|  | ||||
| <programlisting> | ||||
| FROM (SELECT * FROM table1) AS alias_name | ||||
| </programlisting> | ||||
|  | ||||
|     <para> | ||||
|      This example is equivalent to <literal>FROM table1 AS | ||||
|      alias_name</literal>.  Many subqueries can be written as table | ||||
|      joins instead. | ||||
|     </para> | ||||
|    </sect3> | ||||
|  | ||||
|    <sect3 id="queries-table-aliases"> | ||||
|     <title>Table and Column Aliases</title> | ||||
|  | ||||
|     <para> | ||||
|      A temporary name can be given to tables and complex table | ||||
|      references to be used for references to the derived table in | ||||
|      further processing.  This is called a <firstterm>table | ||||
|      alias</firstterm>. | ||||
| <synopsis> | ||||
| FROM <replaceable>table_reference</replaceable> AS <replaceable>alias</replaceable> | ||||
| </synopsis> | ||||
|      Here, <replaceable>alias</replaceable> can be any regular | ||||
|      identifier.  The alias becomes the new name of the table | ||||
|      reference for the current query -- it is no longer possible to | ||||
|      refer to the table by the original name (if the table reference | ||||
|      was an ordinary base table).  Thus | ||||
| <programlisting> | ||||
| SELECT * FROM my_table AS m WHERE my_table.a > 5; | ||||
| </programlisting> | ||||
|      is not valid SQL syntax.  What will happen instead, as a | ||||
|      <productname>Postgres</productname> extension, is that an implicit | ||||
|      table reference is added to the FROM clause, so the query is | ||||
|      processed as if it was written as | ||||
| <programlisting> | ||||
| SELECT * FROM my_table AS m, my_table WHERE my_table.a > 5; | ||||
| </programlisting> | ||||
|      Table aliases are mainly for notational convenience, but it is | ||||
|      necessary to use them when joining a table to itself, e.g., | ||||
| <programlisting> | ||||
| SELECT * FROM my_table AS a CROSS JOIN my_table AS b ... | ||||
| </programlisting> | ||||
|      Additionally, an alias is required if the table reference is a | ||||
|      subquery. | ||||
|     </para> | ||||
|  | ||||
|     <para> | ||||
|      Parenthesis are used to resolve ambiguities.  The following | ||||
|      statement will assign the alias <literal>b</literal> to the | ||||
|      result of the join, unlike the previous example: | ||||
| <programlisting> | ||||
| SELECT * FROM (my_table AS a CROSS JOIN my_table) AS b ... | ||||
| </programlisting> | ||||
|     </para> | ||||
|  | ||||
|     <para> | ||||
| <synopsis> | ||||
| FROM <replaceable>table_reference</replaceable> <replaceable>alias</replaceable> | ||||
| </synopsis> | ||||
|      This form is equivalent the previously treated one; the | ||||
|      <token>AS</token> key word is noise. | ||||
|     </para> | ||||
|  | ||||
|     <para> | ||||
| <synopsis> | ||||
| FROM <replaceable>table_reference</replaceable> <optional>AS</optional> <replaceable>alias</replaceable> ( <replaceable>column1</replaceable> <optional>, <replaceable>column2</replaceable> <optional>, ...</optional></optional> ) | ||||
| </synopsis> | ||||
|      In addition to renaming the table as described above, the columns | ||||
|      of the table are also given temporary names.  If less column | ||||
|      aliases are specified than the actual table has columns, the last | ||||
|      columns are not renamed.  This syntax is especially useful for | ||||
|      self-joins or subqueries. | ||||
|     </para> | ||||
|    </sect3> | ||||
|  | ||||
|    <sect3 id="queries-table-expression-examples"> | ||||
|     <title>Examples</title> | ||||
|  | ||||
|     <para> | ||||
| <programlisting> | ||||
| FROM T1 INNER JOIN T2 USING (C) | ||||
| FROM T1 LEFT OUTER JOIN T2 USING (C) | ||||
| FROM (T1 RIGHT OUTER JOIN T2 ON (T1C1=T2C1)) AS DT1 | ||||
| FROM (T1 FULL OUTER JOIN T2 USING (C)) AS DT1 (DT1C1, DT1C2) | ||||
|  | ||||
| FROM T1 NATURAL INNER JOIN T2 | ||||
| FROM T1 NATURAL LEFT OUTER JOIN T2 | ||||
| FROM T1 NATURAL RIGHT OUTER JOIN T2 | ||||
| FROM T1 NATURAL FULL OUTER JOIN T2 | ||||
|  | ||||
| FROM (SELECT * FROM T1) DT1 CROSS JOIN T2, T3 | ||||
| FROM (SELECT * FROM T1) DT1, T2, T3 | ||||
| </programlisting> | ||||
|  | ||||
|      Above are some examples of joined tables and complex derived | ||||
|      tables.  Notice how the AS clause renames or names a derived | ||||
|      table and how the optional comma-separated list of column names | ||||
|      that follows gives names or renames the columns.  The last two | ||||
|      FROM clauses produce the same derived table from T1, T2, and T3. | ||||
|      The AS keyword was omitted in naming the subquery as DT1.  The | ||||
|      keywords OUTER and INNER are noise that can be omitted also. | ||||
|     </para> | ||||
|    </sect3> | ||||
|  | ||||
|   </sect2> | ||||
|  | ||||
|   <sect2 id="queries-where"> | ||||
|    <title>WHERE clause</title> | ||||
|  | ||||
|    <para> | ||||
|     The syntax of the WHERE clause is | ||||
| <synopsis> | ||||
| WHERE <replaceable>search condition</replaceable> | ||||
| </synopsis> | ||||
|     where <replaceable>search condition</replaceable> is any value | ||||
|     expression as defined in <xref linkend="sql-expressions"> that | ||||
|     returns a value of type <type>boolean</type>. | ||||
|    </para> | ||||
|  | ||||
|    <para> | ||||
|     After the processing of the FROM clause is done, each row of the | ||||
|     derived table is checked against the search condition.  If the | ||||
|     result of the condition is true, the row is kept in the output | ||||
|     table, otherwise (that is, if the result is false or NULL) it is | ||||
|     discarded.  The search condition typically references at least some | ||||
|     column in the table generated in the FROM clause; this is not | ||||
|     required, but otherwise the WHERE clause will be fairly useless. | ||||
|    </para> | ||||
|  | ||||
|    <note> | ||||
|     <para> | ||||
|      Before the implementation of the JOIN syntax, it was necessary to | ||||
|      put the join condition of an inner join in the WHERE clause.  For | ||||
|      example, these table expressions are equivalent: | ||||
| <programlisting> | ||||
| FROM a, b WHERE a.id = b.id AND b.val > 5 | ||||
| </programlisting> | ||||
|      and | ||||
| <programlisting> | ||||
| FROM a INNER JOIN b ON (a.id = b.id) WHERE b.val > 5 | ||||
| </programlisting> | ||||
|      or perhaps even | ||||
| <programlisting> | ||||
| FROM a NATURAL JOIN b WHERE b.val > 5 | ||||
| </programlisting> | ||||
|      Which one of these you use is mainly a matter of style.  The JOIN | ||||
|      syntax in the FROM clause is probably not as portable to other | ||||
|      products.  For outer joins there is no choice in any case:  they | ||||
|      must be done in the FROM clause. | ||||
|     </para> | ||||
|    </note> | ||||
|  | ||||
| <programlisting> | ||||
| FROM FDT WHERE | ||||
|     C1 > 5 | ||||
|  | ||||
| FROM FDT WHERE | ||||
|     C1 IN (1, 2, 3) | ||||
| FROM FDT WHERE | ||||
|     C1 IN (SELECT C1 FROM T2) | ||||
| FROM FDT WHERE | ||||
|     C1 IN (SELECT C3 FROM T2 WHERE C2 = FDT.C1 + 10) | ||||
|  | ||||
| FROM FDT WHERE | ||||
|     C1 BETWEEN (SELECT C3 FROM T2 WHERE C2 = FDT.C1 + 10) AND 100 | ||||
|  | ||||
| FROM FDT WHERE | ||||
|     EXISTS (SELECT C1 FROM T2 WHERE C2 > FDT.C1) | ||||
| </programlisting> | ||||
|  | ||||
|    <para> | ||||
|     In the examples above, FDT is the table derived in the FROM | ||||
|     clause. Rows that do not meet the search condition of the where | ||||
|     clause are eliminated from FDT. Notice the use of scalar | ||||
|     subqueries as value expressions (C2 assumed UNIQUE).  Just like | ||||
|     any other query, the subqueries can employ complex table | ||||
|     expressions.  Notice how FDT is referenced in the subqueries. | ||||
|     Qualifying C1 as FDT.C1 is only necessary if C1 is the name of a | ||||
|     column in the derived input table of the subquery.  Qualifying the | ||||
|     column name adds clarity even when it is not needed.  The column | ||||
|     naming scope of an outer query extends into its inner queries. | ||||
|    </para> | ||||
|   </sect2> | ||||
|  | ||||
|  | ||||
|   <sect2 id="queries-group"> | ||||
|    <title>GROUP BY and HAVING clauses</title>	    | ||||
|  | ||||
|    <para> | ||||
|     After passing the WHERE filter, the derived input table may be | ||||
|     subject to grouping, using the GROUP BY clause, and elimination of | ||||
|     group rows using the HAVING clause. | ||||
|    </para> | ||||
|  | ||||
| <synopsis> | ||||
| SELECT <replaceable>select_list</replaceable> FROM ... <optional>WHERE ...</optional> GROUP BY <replaceable>grouping_column_reference</replaceable> <optional>, <replaceable>grouping_column_reference</replaceable></optional>... | ||||
| </synopsis> | ||||
|  | ||||
|    <para> | ||||
|     The GROUP BY clause is used to group together rows in a table that | ||||
|     share the same values in all the columns listed. The order in | ||||
|     which the columns are listed does not matter (as opposed to an | ||||
|     ORDER BY clause).  The purpose is to reduce each group of rows | ||||
|     sharing common values into one group row that is representative of | ||||
|     all rows in the group.  This is done to eliminate redundancy in | ||||
|     the output and/or obtain aggregates that apply to these groups. | ||||
|    </para> | ||||
| 	   | ||||
|    <para> | ||||
|     Once a table is grouped, columns that are not included in the | ||||
|     grouping cannot be referenced, except in aggregate expressions, | ||||
|     since a specific value in those columns is ambiguous - which row | ||||
|     in the group should it come from?  The grouped-by columns can be | ||||
|     referenced in select list column expressions since they have a | ||||
|     known constant value per group.  Aggregate functions on the | ||||
|     ungrouped columns provide values that span the rows of a group, | ||||
|     not of the whole table.  For instance, a | ||||
|     <function>sum(sales)</function> on a grouped table by product code | ||||
|     gives the total sales for each product, not the total sales on all | ||||
|     products.  The aggregates of the ungrouped columns are | ||||
|     representative of the group, whereas their individual values may | ||||
|     not be. | ||||
|    </para> | ||||
|  | ||||
|    <para> | ||||
|     Example: | ||||
| <programlisting> | ||||
| SELECT pid, p.name, (sum(s.units) * p.price) AS sales | ||||
|   FROM products p LEFT JOIN sales s USING ( pid ) | ||||
|   GROUP BY pid, p.name, p.price; | ||||
| </programlisting> | ||||
|     In this example, the columns pid, p.name, and p.price must be in | ||||
|     the GROUP BY clause since they are referenced in the query select | ||||
|     list.  The column s.units does not have to be in the GROUP BY list | ||||
|     since it is only used in an aggregate expression | ||||
|     (<function>sum()</function>), which represents the group of sales | ||||
|     of a product.  For each product, a summary row is returned about | ||||
|     all sales of the product. | ||||
|    </para> | ||||
|  | ||||
|    <para> | ||||
|     In strict SQL, GROUP BY can only group by columns of the source | ||||
|     table but Postgres extends this to also allow GROUP BY to group by | ||||
|     select columns in the query select list.  Grouping by value | ||||
|     expressions instead of simple column names is also allowed. | ||||
|    </para> | ||||
|  | ||||
|    <para> | ||||
| <synopsis> | ||||
| SELECT <replaceable>select_list</replaceable> FROM ... <optional>WHERE ...</optional> GROUP BY ... HAVING <replaceable>boolean_expression</replaceable> | ||||
| </synopsis> | ||||
|     If a table has been grouped using a GROUP BY clause, but then only | ||||
|     certain groups are of interest, the HAVING clause can be used, | ||||
|     much like a WHERE clause, to eliminate groups from a grouped | ||||
|     table.  For some queries, Postgres allows a HAVING clause to be | ||||
|     used without a GROUP BY and then it acts just like another WHERE | ||||
|     clause, but the point in using HAVING that way is not clear. Since | ||||
|     HAVING operates on groups, only grouped columns can be listed in | ||||
|     the HAVING clause.  If selection based on some ungrouped column is | ||||
|     desired, it should be expressed in the WHERE clause. | ||||
|    </para> | ||||
|  | ||||
|    <para> | ||||
|     Example: | ||||
| <programlisting> | ||||
| SELECT pid    AS "Products", | ||||
|        p.name AS "Over 5000", | ||||
|        (sum(s.units) * (p.price - p.cost)) AS "Past Month Profit" | ||||
|   FROM products p LEFT JOIN sales s USING ( pid ) | ||||
|   WHERE p.date > CURRENT_DATE - INTERVAL '4 weeks' | ||||
|   GROUP BY pid, p.name, p.price, p.cost | ||||
|     HAVING p.price > 5000; | ||||
| </programlisting> | ||||
|     In the example above, the WHERE clause is selecting rows by a | ||||
|     column that is not grouped, while the HAVING clause | ||||
|     is selecting groups with a price greater than 5000. | ||||
|    </para> | ||||
|   </sect2> | ||||
|  </sect1> | ||||
|  | ||||
|  | ||||
|  <sect1 id="queries-select-lists"> | ||||
|   <title>Select Lists</title> | ||||
|  | ||||
|   <para> | ||||
|    The table expression in the <command>SELECT</command> command | ||||
|    constructs an intermediate virtual table by possibly combining | ||||
|    tables, views, eliminating rows, grouping, etc.  This table is | ||||
|    finally passed on to processing by the select list.  The select | ||||
|    list determines which <emphasis>columns</emphasis> of the | ||||
|    intermediate table are retained.  The simplest kind of select list | ||||
|    is <literal>*</literal> which retains all columns that the table | ||||
|    expression produces.  Otherwise, a select list is a comma-separated | ||||
|    list of value expressions (as defined in <xref | ||||
|    linkend="sql-expressions">).  For instance, it could be a list of | ||||
|    column names: | ||||
| <programlisting> | ||||
| SELECT a, b, c FROM ... | ||||
| </programlisting> | ||||
|    The columns names a, b, and c are either the actual names of the | ||||
|    columns of table referenced in the FROM clause, or the aliases | ||||
|    given to them as explained in <xref linkend="queries-table-aliases">. | ||||
|    The name space available in the select list is the same as in the | ||||
|    WHERE clause (unless grouping is used, in which case it is the same | ||||
|    as in the HAVING clause).  If more than one table has a column of | ||||
|    the same name, the table name must also be given, as in | ||||
| <programlisting> | ||||
| SELECT tbl1.a, tbl2.b, tbl1.c FROM ... | ||||
| </programlisting> | ||||
|    (see also <xref linkend="queries-where">). | ||||
|   </para> | ||||
|  | ||||
|   <para> | ||||
|    If an arbitrary value expression is used in the select list, it | ||||
|    conceptually adds a new virtual column to the returned table.  The | ||||
|    value expression is effectively evaluated once for each retrieved | ||||
|    row with real values substituted for any column references.  But | ||||
|    the expressions in the select list do not have to reference any | ||||
|    columns in the table expression of the FROM clause; they can be | ||||
|    constant arithmetic expressions as well, for instance. | ||||
|   </para> | ||||
|  | ||||
|   <sect2 id="queries-column-labels"> | ||||
|    <title>Column Labels</title> | ||||
|  | ||||
|    <para> | ||||
|     The entries in the select list can be assigned names for further | ||||
|     processing.  The <quote>further processing</quote> in this case is | ||||
|     an optional sort specification and the client application (e.g., | ||||
|     column headers for display).  For example: | ||||
| <programlisting> | ||||
| SELECT a AS value, b + c AS sum FROM ... | ||||
| </programlisting> | ||||
|     The AS key word can in fact be omitted. | ||||
|    </para> | ||||
|  | ||||
|    <para> | ||||
|     If no name is chosen, the system assigns a default.  For simple | ||||
|     column references, this is the name of the column.  For function | ||||
|     calls, this is the name of the function.  For complex expressions, | ||||
|     the system will generate a generic name. | ||||
|    </para> | ||||
|  | ||||
|    <note> | ||||
|     <para> | ||||
|      The naming of output columns here is different from that done in | ||||
|      the FROM clause (see <xref linkend="queries-table-aliases">).  This | ||||
|      pipeline will in fact allow you to rename the same column twice, | ||||
|      but the name chosen in the select list is the one that will be | ||||
|      passed on. | ||||
|     </para> | ||||
|    </note> | ||||
|   </sect2> | ||||
|  | ||||
|   <sect2 id="queries-distinct"> | ||||
|    <title>DISTINCT</title> | ||||
|  | ||||
|    <para> | ||||
|     After the select list has been processed, the result table may | ||||
|     optionally be subject to the elimination of duplicates.  The | ||||
|     <token>DISTINCT</token> key word is written directly after the | ||||
|     <token>SELECT</token> to enable this: | ||||
| <synopsis> | ||||
| SELECT DISTINCT <replaceable>select_list</replaceable> ... | ||||
| </synopsis> | ||||
|     (Instead of <token>DISTINCT</token> the word <token>ALL</token> | ||||
|     can be used to select the default behavior of retaining all rows.) | ||||
|    </para> | ||||
|  | ||||
|    <para> | ||||
|     Obviously, two rows are considered distinct if they differ in at | ||||
|     least one column value.  NULLs are considered equal in this | ||||
|     consideration. | ||||
|    </para> | ||||
|  | ||||
|    <para> | ||||
|     Alternatively, an arbitrary expression can determine what rows are | ||||
|     to be considered distinct: | ||||
| <synopsis> | ||||
| SELECT DISTINCT ON (<replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ...</optional>) <replaceable>select_list</replaceable> ... | ||||
| </synopsis> | ||||
|     Here <replaceable>expression</replaceable> is an arbitrary value | ||||
|     expression that is evaluated for all rows.  A set of rows for | ||||
|     which all the expressions is equal are considered duplicates and | ||||
|     only the first row is kept in the output.  Note that the | ||||
|     <quote>first row</quote> of a set is unpredictable unless the | ||||
|     query is sorted. | ||||
|    </para> | ||||
|  | ||||
|    <para> | ||||
|     The DISTINCT ON clause is not part of the SQL standard and is | ||||
|     sometimes considered bad style because of the indeterminate nature | ||||
|     of its results.  With judicious use of GROUP BY and subselects in | ||||
|     FROM the construct can be avoided, but it is very often the much | ||||
|     more convenient alternative. | ||||
|    </para> | ||||
|   </sect2> | ||||
|  </sect1> | ||||
|  | ||||
|  <sect1 id="queries-union"> | ||||
|   <title>Combining Queries</title> | ||||
|  | ||||
|   <para> | ||||
|    The results of two queries can be combined using the set operations | ||||
|    union, intersection, and difference.  The syntax is | ||||
| <synopsis> | ||||
| <replaceable>query1</replaceable> UNION <optional>ALL</optional> <replaceable>query2</replaceable> | ||||
| <replaceable>query1</replaceable> INTERSECT <optional>ALL</optional> <replaceable>query2</replaceable> | ||||
| <replaceable>query1</replaceable> EXCEPT <optional>ALL</optional> <replaceable>query2</replaceable> | ||||
| </synopsis> | ||||
|    <replaceable>query1</replaceable> and | ||||
|    <replaceable>query2</replaceable> are queries that can use any of | ||||
|    the features discussed up to this point.  Set operations can also | ||||
|    be nested and chained, for example | ||||
| <synopsis> | ||||
| <replaceable>query1</replaceable> UNION <replaceable>query2</replaceable> UNION <replaceable>query3</replaceable> | ||||
| </synopsis> | ||||
|    which really says | ||||
| <synopsis> | ||||
| (<replaceable>query1</replaceable> UNION <replaceable>query2</replaceable>) UNION <replaceable>query3</replaceable> | ||||
| </synopsis> | ||||
|   </para> | ||||
|  | ||||
|   <para> | ||||
|    <command>UNION</command> effectively appends the result of | ||||
|    <replaceable>query2</replaceable> to the result of | ||||
|    <replaceable>query1</replaceable> (although there is no guarantee | ||||
|    that this is the order in which the rows are actually returned) and | ||||
|    eliminates all duplicate rows, in the sense of DISTINCT, unless ALL | ||||
|    is specified. | ||||
|   </para> | ||||
|  | ||||
|   <para> | ||||
|    <command>INTERSECT</command> returns all rows that are both in the | ||||
|    result of <replaceable>query1</replaceable> and in the result of | ||||
|    <replaceable>query2</replaceable>.  Duplicate rows are eliminated | ||||
|    unless ALL is specified. | ||||
|   </para> | ||||
|  | ||||
|   <para> | ||||
|    <command>EXCEPT</command> returns all rows that are in the result | ||||
|    of <replaceable>query1</replaceable> but not in the result of | ||||
|    <replaceable>query2</replaceable>.  Again, duplicates are | ||||
|    eliminated unless ALL is specified. | ||||
|   </para> | ||||
|  | ||||
|   <para> | ||||
|    In order to calculate the union, intersection, or difference of two | ||||
|    queries, the two queries must be <quote>union compatible</quote>, | ||||
|    which means that they both return the same number of columns, and | ||||
|    that the corresponding columns have compatible data types, as | ||||
|    described in <xref linkend="typeconv-union-case">. | ||||
|   </para> | ||||
|  </sect1> | ||||
|  | ||||
|  | ||||
|  <sect1 id="queries-order"> | ||||
|   <title>Sorting Rows</title> | ||||
| 	    | ||||
|   <para> | ||||
|    After a query has produced an output table (after the select list | ||||
|    has been processed) it can optionally be sorted.  If sorting is not | ||||
|    chosen, the rows will be returned in random order.  The actual | ||||
|    order in that case will depend on the scan and join plan types and | ||||
|    the order on disk, but it must not be relied on.  A particular | ||||
|    ordering can only be guaranteed if the sort step is explicitly | ||||
|    chosen. | ||||
|   </para> | ||||
|  | ||||
|   <para> | ||||
|    The ORDER BY clause specifies the sort order: | ||||
| <synopsis> | ||||
| SELECT <replaceable>select_list</replaceable> FROM <replaceable>table_expression</replaceable> ORDER BY <replaceable>column1</replaceable> <optional>ASC | DESC</optional> <optional>, <replaceable>column2</replaceable> <optional>ASC | DESC</optional> ...</optional> | ||||
| </synopsis> | ||||
|    <replaceable>column1</replaceable>, etc., refer to select list | ||||
|    columns:  It can either be the name of a column (either the | ||||
|    explicit column label or default name, as explained in <xref | ||||
|    linkend="queries-column-labels">) or the number of a column.  Some | ||||
|    examples: | ||||
| <programlisting> | ||||
| SELECT a, b FROM table1 ORDER BY a; | ||||
| SELECT a + b AS sum, c FROM table1 ORDER BY sum; | ||||
| SELECT a, sum(b) FROM table1 GROUP BY a ORDER BY 1; | ||||
| </programlisting> | ||||
|   </para> | ||||
|  | ||||
|   <para> | ||||
|    As an extension to the SQL standard, Postgres also allows ordering | ||||
|    by arbitrary expressions: | ||||
| <programlisting> | ||||
| SELECT a, b FROM table1 ORDER BY a + b; | ||||
| </programlisting> | ||||
|    References to column names in the FROM clause that are renamed in | ||||
|    the select list are also allowed: | ||||
| <programlisting> | ||||
| SELECT a AS b FROM table1 ORDER BY a; | ||||
| </programlisting> | ||||
|    But this does not work in queries involving UNION, INTERSECT, or | ||||
|    EXCEPT, and is not portable. | ||||
|   </para> | ||||
|  | ||||
|   <para> | ||||
|    Each column specification may be followed by an optional ASC or | ||||
|    DESC to set the sort direction.  ASC is default.  Ascending order | ||||
|    puts smaller values first, where <quote>smaller</quote> is defined | ||||
|    in terms of the <literal><</literal> operator.  Similarly, | ||||
|    descending order is determined with the <literal>></literal> | ||||
|    operator. | ||||
|   </para> | ||||
|  | ||||
|   <para> | ||||
|    If more than one sort column is specified the later entries are | ||||
|    used to sort the rows that are equal under the order imposed by the | ||||
|    earlier sort specifications. | ||||
|   </para> | ||||
|  </sect1> | ||||
|  | ||||
|  <sect1 id="queries-limit"> | ||||
|   <title>LIMIT and OFFSET</title> | ||||
|  | ||||
| <synopsis> | ||||
| SELECT <replaceable>select_list</replaceable> FROM <replaceable>table_expression</replaceable> <optional>ORDER BY <replaceable>sort_spec</replaceable></optional> <optional>LIMIT { <replaceable>number</replaceable> | ALL }</optional> <optional>OFFSET <replaceable>number</replaceable></optional> | ||||
| </synopsis> | ||||
|  | ||||
|   <para> | ||||
|    LIMIT allows you to retrieve just a portion of the rows that are | ||||
|    generated by the rest of the query.  If a limit count is given, no | ||||
|    more than that many rows will be returned.  If an offset is given, | ||||
|    that many rows will be skipped before starting to return rows. | ||||
|   </para> | ||||
|  | ||||
|   <para> | ||||
|    When using LIMIT, it is a good idea to use an ORDER BY clause that | ||||
|    constrains the result rows into a unique order.  Otherwise you will | ||||
|    get an unpredictable subset of the query's rows---you may be asking | ||||
|    for the tenth through twentieth rows, but tenth through twentieth | ||||
|    in what ordering?  The ordering is unknown, unless you specified | ||||
|    ORDER BY. | ||||
|   </para> | ||||
|  | ||||
|   <para> | ||||
|    The query optimizer takes LIMIT into account when generating a | ||||
|    query plan, so you are very likely to get different plans (yielding | ||||
|    different row orders) depending on what you give for LIMIT and | ||||
|    OFFSET.  Thus, using different LIMIT/OFFSET values to select | ||||
|    different subsets of a query result <emphasis>will give | ||||
|    inconsistent results</emphasis> unless you enforce a predictable | ||||
|    result ordering with ORDER BY.  This is not a bug; it is an | ||||
|    inherent consequence of the fact that SQL does not promise to | ||||
|    deliver the results of a query in any particular order unless ORDER | ||||
|    BY is used to constrain the order. | ||||
|   </para> | ||||
|  </sect1> | ||||
|  | ||||
| </chapter> | ||||
| @@ -1,5 +1,5 @@ | ||||
| <!-- | ||||
| $Header: /cvsroot/pgsql/doc/src/sgml/syntax.sgml,v 1.36 2001/01/21 22:02:01 petere Exp $ | ||||
| $Header: /cvsroot/pgsql/doc/src/sgml/syntax.sgml,v 1.37 2001/01/22 23:34:33 petere Exp $ | ||||
| --> | ||||
|  | ||||
| <chapter id="sql-syntax"> | ||||
| @@ -743,7 +743,7 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> ) | ||||
|   </para> | ||||
|  | ||||
|   <para> | ||||
|    In addition to this list, there are a number of contructs that can | ||||
|    In addition to this list, there are a number of constructs that can | ||||
|    be classified as an expression but do not follow any general syntax | ||||
|    rules.  These generally have the semantics of a function or | ||||
|    operator and are explained in the appropriate location in <xref | ||||
| @@ -763,15 +763,15 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> ) | ||||
|    <para> | ||||
|     A column can be referenced in the form: | ||||
| <synopsis> | ||||
| <replaceable>corelation</replaceable>.<replaceable>columnname</replaceable> `['<replaceable>subscript</replaceable>`]' | ||||
| <replaceable>correlation</replaceable>.<replaceable>columnname</replaceable> `['<replaceable>subscript</replaceable>`]' | ||||
| </synopsis> | ||||
|  | ||||
|     <replaceable>corelation</replaceable> is either the name of a | ||||
|     <replaceable>correlation</replaceable> is either the name of a | ||||
|     table, an alias for a table defined by means of a FROM clause, or | ||||
|     the keyword <literal>NEW</literal> or <literal>OLD</literal>. | ||||
|     (NEW and OLD can only appear in the action portion of a rule, | ||||
|     while other corelation names can be used in any SQL statement.) | ||||
|     The corelation name can be omitted if the column name is unique | ||||
|     while other correlation names can be used in any SQL statement.) | ||||
|     The correlation name can be omitted if the column name is unique | ||||
|     across all the tables being used in the current query.  If | ||||
|     <replaceable>column</replaceable> is of an array type, then the | ||||
|     optional <replaceable>subscript</replaceable> selects a specific | ||||
| @@ -895,8 +895,8 @@ sqrt(2) | ||||
|     The precedence and associativity of the operators is hard-wired | ||||
|     into the parser.  Most operators have the same precedence and are | ||||
|     left-associative.  This may lead to non-intuitive behavior; for | ||||
|     example the boolean operators "<" and ">" have a different | ||||
|     precedence than the boolean operators "<=" and ">=".  Also, | ||||
|     example the Boolean operators "<" and ">" have a different | ||||
|     precedence than the Boolean operators "<=" and ">=".  Also, | ||||
|     you will sometimes need to add parentheses when using combinations | ||||
|     of binary and unary operators.  For instance | ||||
| <programlisting> | ||||
| @@ -917,7 +917,7 @@ SELECT (5 &) ~ 6; | ||||
|     <tgroup cols="2"> | ||||
|      <thead> | ||||
|       <row> | ||||
|        <entry>OperatorElement</entry> | ||||
|        <entry>Operator/Element</entry> | ||||
|        <entry>Associativity</entry> | ||||
|        <entry>Description</entry> | ||||
|       </row> | ||||
| @@ -1057,526 +1057,6 @@ SELECT (5 &) ~ 6; | ||||
|    </para> | ||||
|   </sect1> | ||||
|  | ||||
|  | ||||
|  <sect1 id="sql-table-expressions"> | ||||
|   <title>Table Expressions</title> | ||||
|  | ||||
|   <para> | ||||
|    A <firstterm>table expression</firstterm> specifies a table.  The | ||||
|    table expression contains a FROM clause that is optionally followed | ||||
|    by WHERE, GROUP BY, and HAVING clauses.  Trivial table expressions | ||||
|    simply refer to a table on disk, a so-called base table, but more | ||||
|    complex expressions can be used to modify or combine base tables in | ||||
|    various ways. | ||||
|   </para> | ||||
|  | ||||
|   <para> | ||||
|    The general syntax of the <command>SELECT</command> command is | ||||
| <synopsis> | ||||
| SELECT <replaceable>select_list</replaceable> <replaceable>table_expression</replaceable> | ||||
| </synopsis> | ||||
|  | ||||
|    The <replaceable>select_list</replaceable> is a comma separated | ||||
|    list of <replaceable>value expressions</replaceable> as defined in | ||||
|    <xref linkend="sql-expressions"> that specify the derived columns | ||||
|    of the query output table.  Column names in the derived table that | ||||
|    is the result of the <replaceable>table_expression</replaceable> | ||||
|    can be used in the <replaceable>value expression</replaceable>s of | ||||
|    the <replaceable>select_list</replaceable>. | ||||
|   </para> | ||||
|  | ||||
|   <para> | ||||
|    The WHERE, GROUP BY, and HAVING clauses in the table expression | ||||
|    specify a pipeline of successive transformations performed on the | ||||
|    table derived in the FROM clause.  The final transformed table that | ||||
|    is derived provides the input rows used to derive output rows as | ||||
|    specified by the select list of derived column value expressions. | ||||
|   </para> | ||||
| 	 | ||||
|   <sect2> | ||||
|    <title>FROM clause</title> | ||||
|   | ||||
|    <para> | ||||
|     The FROM clause derives a table from one or more other tables | ||||
|     given in a comma-separated table reference list. | ||||
| <synopsis> | ||||
| FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_reference</replaceable> <optional>, ...</optional></optional> | ||||
| </synopsis> | ||||
|  | ||||
|     A table reference may be a table name or a derived table such as a | ||||
|     subquery, a table join, or complex combinations of these.  If more | ||||
|     than one table reference is listed in the FROM clause they are | ||||
|     CROSS JOINed (see below) to form the derived table that may then | ||||
|     be subject to transformations by the WHERE, GROUP BY, and HAVING | ||||
|     clauses and is finally the result of the overall table expression. | ||||
|    </para> | ||||
|  | ||||
|    <para> | ||||
|     If a table reference is a simple table name and it is the | ||||
|     supertable in a table inheritance hierarchy, rows of the table | ||||
|     include rows from all of its subtable successors unless the | ||||
|     keyword ONLY precedes the table name. | ||||
|    </para> | ||||
|  | ||||
|    <sect3> | ||||
|     <title>Joined Tables</title> | ||||
|  | ||||
|     <para> | ||||
|      A joined table is a table derived from two other (real or | ||||
|      derived) tables according to the rules of the particular join | ||||
|      type.  INNER, OUTER, NATURAL, and CROSS JOIN are supported. | ||||
|     </para> | ||||
|  | ||||
|     <variablelist> | ||||
|      <title>Join Types</title> | ||||
|  | ||||
|      <varlistentry> | ||||
|       <term>CROSS JOIN</term> | ||||
|  | ||||
|       <listitem> | ||||
| <synopsis> | ||||
| <replaceable>T1</replaceable> CROSS JOIN <replaceable>T2</replaceable> | ||||
| </synopsis> | ||||
|  | ||||
|        <para> | ||||
|         For each combination of rows from | ||||
|         <replaceable>T1</replaceable> and | ||||
|         <replaceable>T2</replaceable> the derived table will contain a | ||||
|         row consisting of all columns in <replaceable>T1</replaceable> | ||||
|         followed by all columns in <replaceable>T2</replaceable>.  If | ||||
|         the tables have have N and M rows respectively, the joined | ||||
|         table will have N * M rows.  A cross join is essentially an | ||||
|         <literal>INNER JOIN ON TRUE</literal>. | ||||
|        </para> | ||||
|  | ||||
|        <tip> | ||||
|         <para> | ||||
|          <literal>FROM <replaceable>T1</replaceable> CROSS JOIN | ||||
|          <replaceable>T2</replaceable></literal> is equivalent to | ||||
|          <literal>FROM <replaceable>T1</replaceable>, | ||||
|          <replaceable>T2</replaceable></literal>. | ||||
|         </para> | ||||
|        </tip> | ||||
|       </listitem> | ||||
|      </varlistentry> | ||||
|  | ||||
|      <varlistentry> | ||||
|       <term>Qualified JOINs</term> | ||||
|       <listitem> | ||||
|  | ||||
| <synopsis> | ||||
| <replaceable>T1</replaceable> { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable> ON <replaceable>boolean expression</replaceable> | ||||
| <replaceable>T1</replaceable> { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable> USING ( <replaceable>join column list</replaceable> ) | ||||
| </synopsis> | ||||
|          | ||||
|        <para> | ||||
|         The words <token>INNER</token> and <token>OUTER</token> are | ||||
|         optional for all JOINs.  <token>INNER</token> is the default; | ||||
|         <token>LEFT</token>, <token>RIGHT</token>, and | ||||
|         <token>FULL</token> are for OUTER JOINs only. | ||||
|        </para> | ||||
|  | ||||
|        <para> | ||||
|         The <firstterm>join condition</firstterm> is specified in the | ||||
|         ON or USING clause.  (The meaning of the join condition | ||||
|         depends on the particular join type; see below.)  The ON | ||||
|         clause takes a boolean value expression of the same kind as is | ||||
|         used in a WHERE clause.  The USING clause takes a | ||||
|         comma-separated list of column names, which the joined tables | ||||
|         must have in common, and joins the tables on the equality of | ||||
|         those columns as a set, resulting in a joined table having one | ||||
|         column for each common column listed and all of the other | ||||
|         columns from both tables.  Thus, <literal>USING (a, b, | ||||
|         c)</literal> is equivalent to <literal>ON (t1.a = t2.a AND | ||||
|         t1.b = t2.b AND t1.c = t2.c)</literal> with the exception that | ||||
|         if ON is used there will be two columns a, b, and c in the | ||||
|         result, whereas with USING there will be only one of each. | ||||
|        </para> | ||||
|  | ||||
|        <variablelist> | ||||
|         <varlistentry> | ||||
|          <term>INNER JOIN</term> | ||||
|  | ||||
|          <listitem> | ||||
|           <para> | ||||
|            For each row R1 of T1, the joined table has a row for each | ||||
|            row in T2 that satisfies the join condition with R1. | ||||
|           </para> | ||||
|          </listitem> | ||||
|         </varlistentry> | ||||
|  | ||||
|         <varlistentry> | ||||
|          <term>LEFT OUTER JOIN</term> | ||||
|  | ||||
|          <listitem> | ||||
|           <para> | ||||
|            First, an INNER JOIN is performed.  Then, for a row in T1 | ||||
|            that does not satisfy the join condition with any row in | ||||
|            T2, a joined row is returned with NULL values in columns of | ||||
|            T2.  Thus, the joined table unconditionally has a row for each | ||||
|            row in T1. | ||||
|           </para> | ||||
|          </listitem> | ||||
|         </varlistentry> | ||||
|           | ||||
|         <varlistentry> | ||||
|          <term>RIGHT OUTER JOIN</term> | ||||
|  | ||||
|          <listitem> | ||||
|           <para> | ||||
|            This is like a left join, only that the result table will | ||||
|            unconditionally have a row for each row in T2. | ||||
|           </para> | ||||
|          </listitem> | ||||
|         </varlistentry> | ||||
|           | ||||
|         <varlistentry> | ||||
|          <term>FULL OUTER JOIN</term> | ||||
|  | ||||
|          <listitem> | ||||
|           <para> | ||||
|            First, an INNER JOIN is performed.  Then, for each row in | ||||
|            T1 that does not satisfy the join condition with any row in | ||||
|            T2, a joined row is returned with null values in columns of | ||||
|            T2.  Also, for each row of T2 that does not satisfy the | ||||
|            join condition with any row in T1, a joined row with null | ||||
|            values in the columns of T1 is returned. | ||||
|           </para> | ||||
|          </listitem> | ||||
|         </varlistentry> | ||||
|        </variablelist> | ||||
|       </listitem> | ||||
|      </varlistentry> | ||||
|        | ||||
|      <varlistentry> | ||||
|       <term>NATURAL JOIN</term> | ||||
|  | ||||
|       <listitem> | ||||
| <synopsis> | ||||
| <replaceable>T1</replaceable> NATURAL { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> JOIN <replaceable>T2</replaceable> | ||||
| </synopsis> | ||||
|        <para> | ||||
|         A natural join creates a joined table where every pair of matching | ||||
|         column names between the two tables are merged into one column. The | ||||
|         join specification is effectively a USING clause containing all the | ||||
|         common column names and is otherwise like a Qualified JOIN. | ||||
|        </para> | ||||
|       </listitem> | ||||
|      </varlistentry> | ||||
|     </variablelist> | ||||
|  | ||||
|     <para> | ||||
|      Joins of all types can be chained together or nested where either | ||||
|      or both of <replaceable>T1</replaceable> and | ||||
|      <replaceable>T2</replaceable> may be JOINed tables.  Parenthesis | ||||
|      can be used around JOIN clauses to control the join order which | ||||
|      are otherwise left to right. | ||||
|     </para> | ||||
|    </sect3> | ||||
|  | ||||
|    <sect3 id="sql-subqueries"> | ||||
|     <title>Subqueries</title> | ||||
|  | ||||
|     <para> | ||||
|      Subqueries specifying a derived table must be enclosed in | ||||
|      parenthesis and <emphasis>must</emphasis> be named using an AS | ||||
|      clause.  (See <xref linkend="sql-table-aliases">.) | ||||
|     </para> | ||||
|  | ||||
| <programlisting> | ||||
| FROM (SELECT * FROM table1) AS alias_name | ||||
| </programlisting> | ||||
|  | ||||
|     <para> | ||||
|      This example is equivalent to <literal>FROM table1 AS | ||||
|      alias_name</literal>.  Many subquieries can be written as table | ||||
|      joins instead. | ||||
|     </para> | ||||
|    </sect3> | ||||
|  | ||||
|    <sect3 id="sql-table-aliases"> | ||||
|     <title>Table and Column Aliases</title> | ||||
|  | ||||
|     <para> | ||||
|      A temporary name can be given to tables and complex table | ||||
|      references to be used for references to the derived table in | ||||
|      further processing.  This is called a <firstterm>table | ||||
|      alias</firstterm>. | ||||
| <synopsis> | ||||
| FROM <replaceable>table_reference</replaceable> AS <replaceable>alias</replaceable> | ||||
| </synopsis> | ||||
|      Here, <replaceable>alias</replaceable> can be any regular | ||||
|      identifier.  The alias becomes the new name of the table | ||||
|      reference for the current query -- it is no longer possible to | ||||
|      refer to the table by the original name (if the table reference | ||||
|      was an ordinary base table).  Thus | ||||
| <programlisting> | ||||
| SELECT * FROM my_table AS m WHERE my_table.a > 5; | ||||
| </programlisting> | ||||
|      is not valid SQL syntax.  What will happen instead, as a | ||||
|      <productname>Postgres</productname> extension, is that an implict | ||||
|      table reference is added to the FROM clause, so the query is | ||||
|      processed as if it was written as | ||||
| <programlisting> | ||||
| SELECT * FROM my_table AS m, my_table WHERE my_table.a > 5; | ||||
| </programlisting> | ||||
|      Table aliases are mainly for notational convenience, but it is | ||||
|      necessary to use them when joining a table to itself, e.g., | ||||
| <programlisting> | ||||
| SELECT * FROM my_table AS a CROSS JOIN my_table AS b ... | ||||
| </programlisting> | ||||
|      Additionally, an alias is required if the table reference is a | ||||
|      subquery. | ||||
|     </para> | ||||
|  | ||||
|     <para> | ||||
|      Parenthesis are used to resolve ambiguities.  The following | ||||
|      statement will assign the alias <literal>b</literal> to the | ||||
|      result of the join, unlike the previous example: | ||||
| <programlisting> | ||||
| SELECT * FROM (my_table AS a CROSS JOIN my_table) AS b ... | ||||
| </programlisting> | ||||
|     </para> | ||||
|  | ||||
|     <para> | ||||
| <synopsis> | ||||
| FROM <replaceable>table_reference</replaceable> <replaceable>alias</replaceable> | ||||
| </synopsis> | ||||
|      This form is equivalent the previously treated one; the | ||||
|      <token>AS</token> key word is noise. | ||||
|     </para> | ||||
|  | ||||
|     <para> | ||||
| <synopsis> | ||||
| FROM <replaceable>table_reference</replaceable> <optional>AS</optional> <replaceable>alias</replaceable> ( <replaceable>column1</replaceable> <optional>, <replaceable>column2</replaceable> <optional>, ...</optional></optional> ) | ||||
| </synopsis> | ||||
|      In addition to renaming the table as described above, the columns | ||||
|      of the table are also given temporary names.  If less column | ||||
|      aliases are specified than the actual table has columns, the last | ||||
|      columns are not renamed.  This syntax is especially useful for | ||||
|      self-joins or subqueries. | ||||
|     </para> | ||||
|    </sect3> | ||||
|  | ||||
|    <sect3> | ||||
|     <title>Examples</title> | ||||
|  | ||||
|     <para> | ||||
| <programlisting> | ||||
| FROM T1 INNER JOIN T2 USING (C) | ||||
| FROM T1 LEFT OUTER JOIN T2 USING (C) | ||||
| FROM (T1 RIGHT OUTER JOIN T2 ON (T1C1=T2C1)) AS DT1 | ||||
| FROM (T1 FULL OUTER JOIN T2 USING (C)) AS DT1 (DT1C1, DT1C2) | ||||
|  | ||||
| FROM T1 NATURAL INNER JOIN T2 | ||||
| FROM T1 NATURAL LEFT OUTER JOIN T2 | ||||
| FROM T1 NATURAL RIGHT OUTER JOIN T2 | ||||
| FROM T1 NATURAL FULL OUTER JOIN T2 | ||||
|  | ||||
| FROM (SELECT * FROM T1) DT1 CROSS JOIN T2, T3 | ||||
| FROM (SELECT * FROM T1) DT1, T2, T3 | ||||
| </programlisting> | ||||
|  | ||||
|      Above are some examples of joined tables and complex derived | ||||
|      tables.  Notice how the AS clause renames or names a derived | ||||
|      table and how the optional comma-separated list of column names | ||||
|      that follows gives names or renames the columns.  The last two | ||||
|      FROM clauses produce the same derived table from T1, T2, and T3. | ||||
|      The AS keyword was omitted in naming the subquery as DT1.  The | ||||
|      keywords OUTER and INNER are noise that can be omitted also. | ||||
|     </para> | ||||
|    </sect3> | ||||
|  | ||||
|   </sect2> | ||||
|  | ||||
|   <sect2> | ||||
|    <title>WHERE clause</title> | ||||
|  | ||||
|    <para> | ||||
|     The syntax of the WHERE clause is | ||||
| <synopsis> | ||||
| WHERE <replaceable>search condition</replaceable> | ||||
| </synopsis> | ||||
|     where <replaceable>search condition</replaceable> is any value | ||||
|     expression as defined in <xref linkend="sql-expressions"> that | ||||
|     returns a value of type <type>boolean</type>. | ||||
|    </para> | ||||
|  | ||||
|    <para> | ||||
|     After the processing of the FROM clause is done, each row of the | ||||
|     derived table is checked against the search condition.  If the | ||||
|     result of the condition is true, the row is kept in the output | ||||
|     table, otherwise (that is, if the result is false or NULL) it is | ||||
|     discared.  The search condition typically references at least some | ||||
|     column in the table generated in the FROM clause; this is not | ||||
|     required, but otherwise the WHERE clause will be fairly useless. | ||||
|    </para> | ||||
|  | ||||
|    <note> | ||||
|     <para> | ||||
|      Before the implementation of the JOIN syntax, it was necessary to | ||||
|      put the join condition of an inner join in the WHERE clause.  For | ||||
|      example, these table expressions are equivalent: | ||||
| <programlisting> | ||||
| FROM a, b WHERE a.id = b.id AND b.val > 5 | ||||
| </programlisting> | ||||
|      and | ||||
| <programlisting> | ||||
| FROM a INNER JOIN b ON (a.id = b.id) WHERE b.val > 5 | ||||
| </programlisting> | ||||
|      or perhaps even | ||||
| <programlisting> | ||||
| FROM a NATURAL JOIN b WHERE b.val > 5 | ||||
| </programlisting> | ||||
|      Which one of these you use is mainly a matter of style.  The JOIN | ||||
|      syntax in the FROM clause is probably not as portable to other | ||||
|      products.  For outer joins there is no choice in any case:  they | ||||
|      must be done in the FROM clause. | ||||
|     </para> | ||||
|    </note> | ||||
|  | ||||
| <programlisting> | ||||
| FROM FDT WHERE | ||||
|     C1 > 5 | ||||
|  | ||||
| FROM FDT WHERE | ||||
|     C1 IN (1, 2, 3) | ||||
| FROM FDT WHERE | ||||
|     C1 IN (SELECT C1 FROM T2) | ||||
| FROM FDT WHERE | ||||
|     C1 IN (SELECT C3 FROM T2 WHERE C2 = FDT.C1 + 10) | ||||
|  | ||||
| FROM FDT WHERE | ||||
|     C1 BETWEEN (SELECT C3 FROM T2 WHERE C2 = FDT.C1 + 10) AND 100 | ||||
|  | ||||
| FROM FDT WHERE | ||||
|     EXISTS (SELECT C1 FROM T2 WHERE C2 > FDT.C1) | ||||
| </programlisting> | ||||
|  | ||||
|    <para> | ||||
|     In the examples above, FDT is the table derived in the FROM | ||||
|     clause. Rows that do not meet the search condition of the where | ||||
|     clause are eliminated from FDT. Notice the use of scalar | ||||
|     subqueries as value expressions (C2 assumed UNIQUE).  Just like | ||||
|     any other query, the subqueries can employ complex table | ||||
|     expressions.  Notice how FDT is referenced in the subqueries. | ||||
|     Qualifying C1 as FDT.C1 is only necessary if C1 is the name of a | ||||
|     column in the derived input table of the subquery.  Qualifying the | ||||
|     column name adds clarity even when it is not needed.  The column | ||||
|     naming scope of an outer query extends into its inner queries. | ||||
|    </para> | ||||
|   </sect2> | ||||
|  | ||||
| <!-- This is confusing as heck.  Make it simpler. --> | ||||
|  | ||||
| <![IGNORE[ | ||||
|  | ||||
|   <sect2> | ||||
|    <title>GROUP BY and HAVING clauses</title>	    | ||||
|  | ||||
|    <para> | ||||
|     After passing the WHERE filter, the derived input table may be | ||||
|     subject to grouping, using the GROUP BY clause, and elimination of | ||||
|     group rows using the HAVING clause.  (The HAVING clause can also | ||||
|     be used without GROUP BY, but then it is equivalent to the WHERE | ||||
|     clause.) | ||||
|    </para> | ||||
|  | ||||
|    <para> | ||||
|     In standard SQL, the GROUP BY clause takes a list of column names, | ||||
|     that specify a subrow, from the derived input table produced by | ||||
|     the previous WHERE or FROM clause and partitions the table into | ||||
|     groups with duplicate subrows such that within a column of the | ||||
|     subrow, no column value is distinct from other column values. The | ||||
|     resulting derived input table is a special type of table, called a | ||||
|     grouped table, which still contains all columns but only | ||||
|     references to columns of the grouped subrow, and group aggregates, | ||||
|     derived from any of the columns, may appear in derived column | ||||
|     value expressions in the query select list.  When deriving an | ||||
|     output table from a query using a grouped input table, each output | ||||
|     row is derived from a corresponding group/partition of the grouped | ||||
|     table. Aggregates computed in a derived output column are | ||||
|     aggregates on the current partition/group of the grouped input | ||||
|     table being processed.  Only one output table row results per | ||||
|     group/partition of the grouped input table. | ||||
|    </para> | ||||
|  | ||||
|    <para> | ||||
|     Postgres has extended the GROUP BY clause to allow some | ||||
|     non-standard, but useful behavior. Derived output columns, given | ||||
|     names using an AS clause in the query select list, may appear in | ||||
|     the GROUP BY clause in combination with, or instead of, the input | ||||
|     table column names. Tables may also be grouped by arbitrary | ||||
|     expressions. If output table column names appear in the GROUP BY | ||||
|     list, then the input table is augmented with additional columns of | ||||
|     the output table columns listed in the GROUP BY clause. The value | ||||
|     for each row in the additional columns is computed from the value | ||||
|     expression that defines the output column in the query select | ||||
|     list. The augmented input table is grouped by the column names | ||||
|     listed in the GROUP BY clause. The resulting grouped augmented | ||||
|     input table is then treated according standard SQL GROUP BY | ||||
|     semantics. Only the columns of the unaugmented input table in the | ||||
|     grouped subrow (if any), and group aggregates, derived from any of | ||||
|     the columns of the unaugmented input table, may be referenced in | ||||
|     the value expressions of the derived output columns of the | ||||
|     query. Output columns derived with an aggregate expression cannot | ||||
|     be named in the GROUP BY clause. | ||||
|    </para> | ||||
|  | ||||
|    <para> | ||||
|     A HAVING clause may optionally follow a GROUP BY clause.  The | ||||
|     HAVING clause selects or eliminates, depending on which | ||||
|     perspective is taken, groups from the grouped table derived in the | ||||
|     GROUP BY clause that precedes it.  The search condition is the | ||||
|     same type of expression allowed in a WHERE clause and may | ||||
|     reference any of the input table column names in the grouped | ||||
|     subrow, but may not reference any others or any named output | ||||
|     columns.  When the search condition results in TRUE the group is | ||||
|     retained, otherwise the group is eliminated. | ||||
|    </para> | ||||
|   </sect2> | ||||
|  | ||||
|    <sect2> | ||||
|     <title>ORDER BY and LIMIT clauses</title> | ||||
| 	    | ||||
|     <para> | ||||
|      ORDER BY and LIMIT clauses are not clauses of a table expression. | ||||
|      They are optional clauses that may follow a query expression and | ||||
|      are discussed here because they are commonly used with the | ||||
|      clauses above. | ||||
|     </para> | ||||
|  | ||||
|     <para> | ||||
|      ORDER BY takes a comma-separated list of columns and performs a | ||||
|      cascaded ordering of the table by the columns listed, in the | ||||
|      order listed.  The keyword DESC or ASC may follow any column name | ||||
|      or expression in the list to specify descending or ascending | ||||
|      ordering, respectively. Ascending order is the default.  The | ||||
|      ORDER BY clause conforms to the SQL standard but is extended in | ||||
|      Postgres.  Postgres allows ORDER BY to reference both output | ||||
|      table columns, as named in the select list using the AS clause, | ||||
|      and input table columns, as given by the table derived in the | ||||
|      FROM clause and other previous clauses. Postgres also extends | ||||
|      ORDER BY to allow ordering by arbitrary expressions. If used in a | ||||
|      query with a GROUP BY clause, the ORDER BY clause can only | ||||
|      reference output table column names and grouped input table | ||||
|      columns. | ||||
|     </para> | ||||
|  | ||||
|     <para> | ||||
|      LIMIT is not a standard SQL clause.  LIMIT is a Postgres | ||||
|      extension that limits the number of rows that will be returned | ||||
|      from a query.  The rows returned by a query using the LIMIT | ||||
|      clause are random if no ORDER BY clause is specified.  A LIMIT | ||||
|      clause may optionally be followed by an OFFSET clause which | ||||
|      specifies a number of rows to be skipped in the output table | ||||
|      before returning the number of rows specified in the LIMIT | ||||
|      clause. | ||||
|     </para> | ||||
|    </sect2> | ||||
| ]]> | ||||
|   </sect1> | ||||
|  | ||||
| </chapter> | ||||
|  | ||||
| <!-- Keep this comment at the end of the file | ||||
|   | ||||
| @@ -1,5 +1,5 @@ | ||||
| <!-- | ||||
| $Header: /cvsroot/pgsql/doc/src/sgml/Attic/user.sgml,v 1.24 2001/01/13 23:58:55 petere Exp $ | ||||
| $Header: /cvsroot/pgsql/doc/src/sgml/Attic/user.sgml,v 1.25 2001/01/22 23:34:33 petere Exp $ | ||||
| --> | ||||
|  | ||||
| <book id="user"> | ||||
| @@ -44,6 +44,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/Attic/user.sgml,v 1.24 2001/01/13 23:58:55 | ||||
|  | ||||
|  &intro; | ||||
|  &syntax; | ||||
|  &queries; | ||||
|  &datatype; | ||||
|  &func; | ||||
|  &typeconv; | ||||
|   | ||||
		Reference in New Issue
	
	Block a user