1
0
mirror of https://github.com/postgres/postgres.git synced 2025-11-06 07:49:08 +03:00
Commit Graph

203 Commits

Author SHA1 Message Date
Daniel Gustafsson
a3e92622f1 doc: Add missing semicolon in example
One of the examples on the SELECT page was missing a semicolon from
a listing which has the look and feel of being a psql session. This
adds the missing semicolon and also removes the newline between the
query and results to match the other examples nearby.

Backpatch to all supported branches to avoid backpatching issues on
this page.

Reported-by: tim.needham2@gmail.com
Discussion: https://postgr.es/m/169965004097.225187.12941375915673151540@wrigleys.postgresql.org
Backpatch-through: v12
2023-11-13 14:13:03 +01:00
Bruce Momjian
17cd48de3c doc: mention GROUP BY columns can reference target col numbers
Reported-by: hape <postgres-hape@gmx.de>

Discussion: https://postgr.es/m/168871536004.379168.9352636188330923805@wrigleys.postgresql.org

Backpatch-through: 11
2023-09-26 17:31:06 -04:00
Peter Eisentraut
37a6d81c45 doc: Clean up title case use 2023-06-23 14:14:57 +02:00
Bruce Momjian
3e337b585a doc: split out the NATURAL/CROSS JOIN in SELECT syntax
This allows the syntax to be more accurate about what clauses are
supported.  Also switch an example query to use the ANSI join syntax.

Reported-by: Joel Jacobson

Discussion: https://postgr.es/m/67b71d3e-0c22-44df-a223-351f14418319@www.fastmail.com

Backpatch-through: 11
2022-08-31 21:46:14 -04:00
Dean Rasheed
bcedd8f5fc Make subquery aliases optional in the FROM clause.
This allows aliases for sub-SELECTs and VALUES clauses in the FROM
clause to be omitted.

This is an extension of the SQL standard, supported by some other
database systems, and so eases the transition from such systems, as
well as removing the minor inconvenience caused by requiring these
aliases.

Patch by me, reviewed by Tom Lane.

Discussion: https://postgr.es/m/CAEZATCUCGCf82=hxd9N5n6xGHPyYpQnxW8HneeH+uP7yNALkWA@mail.gmail.com
2022-07-20 09:29:42 +01:00
Tom Lane
836af9756b Remove trailing whitespace from *.sgml files.
Historically we've been lax about this, but seeing that we're not
lax in C files, there doesn't seem to be a good reason to be so
in the documentation.  Remove the existing occurrences (mostly
though not entirely in copied-n-pasted psql output), and modify
.gitattributes so that "git diff --check" will warn about future
cases.

While at it, add *.pm to the set of extensions .gitattributes
knows about, and remove some obsolete entries for files that
we don't have in the tree anymore.

Per followup discussion of commit 5a892c9b1.

Discussion: https://postgr.es/m/E1nfcV1-000kOR-E5@gemulon.postgresql.org
2022-04-20 11:04:49 -04:00
David Rowley
7bdd489d3d Doc: use "an SQL" consistently rather than "a SQL"
Similarly to what was done in 04539e73f, we standardized on SQL being
pronounced "es-que-ell" rather than "sequel" in our documentation.

Two inconsistencies have crept in during the v15 cycle.  The others
existed before but were missed in 04539e73f due to none of the searches
accounting for "SQL" being wrapped in tags.

As with 04539e73f, we don't touch code comments here in order to not
create unnecessary back-patching pain.

Discussion: https://postgr.es/m/CAApHDvpML27UqFXnrYO1MJddsKVMQoiZisPvsAGhKE_tsKXquw%40mail.gmail.com
2022-04-20 15:17:56 +12:00
Alvaro Herrera
c6bc655ee2 Error out if SKIP LOCKED and WITH TIES are both specified
Both bugs #16676[1] and #17141[2] illustrate that the combination of
SKIP LOCKED and FETCH FIRST WITH TIES break expectations when it comes
to rows returned to other sessions accessing the same row.  Since this
situation is detectable from the syntax and hard to fix otherwise,
forbid for now, with the potential to fix in the future.

[1] https://postgr.es/m/16676-fd62c3c835880da6@postgresql.org
[2] https://postgr.es/m/17141-913d78b9675aac8e@postgresql.org

Backpatch-through: 13, where WITH TIES was introduced
Author: David Christensen <david.christensen@crunchydata.com>
Discussion: https://postgr.es/m/CAOxo6XLPccCKru3xPMaYDpa+AXyPeWFs+SskrrL+HKwDjJnLhg@mail.gmail.com
2021-10-01 18:29:18 -03:00
Peter Eisentraut
055fee7eb4 Allow an alias to be attached to a JOIN ... USING
This allows something like

    SELECT ... FROM t1 JOIN t2 USING (a, b, c) AS x

where x has the columns a, b, c and unlike a regular alias it does not
hide the range variables of the tables being joined t1 and t2.

Per SQL:2016 feature F404 "Range variable for common column names".

Reviewed-by: Vik Fearing <vik.fearing@2ndquadrant.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/454638cf-d563-ab76-a585-2564428062af@2ndquadrant.com
2021-03-31 17:10:50 +02:00
Tomas Vondra
be45be9c33 Implement GROUP BY DISTINCT
With grouping sets, it's possible that some of the grouping sets are
duplicate.  This is especially common with CUBE and ROLLUP clauses. For
example GROUP BY CUBE (a,b), CUBE (b,c) is equivalent to

  GROUP BY GROUPING SETS (
    (a, b, c),
    (a, b, c),
    (a, b, c),
    (a, b),
    (a, b),
    (a, b),
    (a),
    (a),
    (a),
    (c, a),
    (c, a),
    (c, a),
    (c),
    (b, c),
    (b),
    ()
  )

Some of the grouping sets are calculated multiple times, which is mostly
unnecessary.  This commit implements a new GROUP BY DISTINCT feature, as
defined in the SQL standard, which eliminates the duplicate sets.

Author: Vik Fearing
Reviewed-by: Erik Rijkers, Georgios Kokolatos, Tomas Vondra
Discussion: https://postgr.es/m/bf3805a8-d7d1-ae61-fece-761b7ff41ecc@postgresfriends.org
2021-03-18 18:22:18 +01:00
Peter Eisentraut
f4adc41c4f Enhanced cycle mark values
Per SQL:202x draft, in the CYCLE clause of a recursive query, the
cycle mark values can be of type boolean and can be omitted, in which
case they default to TRUE and FALSE.

Reviewed-by: Vik Fearing <vik@postgresfriends.org>
Discussion: https://www.postgresql.org/message-id/flat/db80ceee-6f97-9b4a-8ee8-3ba0c58e5be2@2ndquadrant.com
2021-02-27 08:13:24 +01:00
Peter Eisentraut
3696a600e2 SEARCH and CYCLE clauses
This adds the SQL standard feature that adds the SEARCH and CYCLE
clauses to recursive queries to be able to do produce breadth- or
depth-first search orders and detect cycles.  These clauses can be
rewritten into queries using existing syntax, and that is what this
patch does in the rewriter.

Reviewed-by: Vik Fearing <vik@postgresfriends.org>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/db80ceee-6f97-9b4a-8ee8-3ba0c58e5be2@2ndquadrant.com
2021-02-01 14:32:51 +01:00
Tom Lane
662affcfe9 Doc: improve documentation for UNNEST().
Per a user question, spell out that UNNEST() returns array elements
in storage order; also provide an example to clarify the behavior for
multi-dimensional arrays.

While here, also clarify the SELECT reference page's description of
WITH ORDINALITY.  These details were already given in 7.2.1.4, but
a reference page should not omit details.

Back-patch to v13; there's not room in the table in older versions.

Discussion: https://postgr.es/m/FF1FB31F-0507-4F18-9559-2DE6E07E3B43@gmail.com
2021-01-27 12:50:22 -05:00
Michael Paquier
8a17f44c1e doc: Remove more notes about compatibilities with past versions
This is a follow-up of the work done in fa42c2e, that did not include
all the fixes previously agreed on.  The contents removed here can be
confusing to the reader as they refer to rather old server versions.

Author: Stephen Frost, Tom Lane, Heikki Linnakangas, Yaroslav Schekin
Discussion: https://postgr.es/m/CAB8KJ=jYHgnxLLZSNJz7gBTck4TxomngCmGkw3nEMSNF0yL6wA@mail.gmail.com
Discussion: https://postgr.es/m/1599765595731-0.post@n3.nabble.com
2020-12-01 16:32:26 +09:00
Heikki Linnakangas
c5f42daa60 Misc documentation fixes.
- Misc grammar and punctuation fixes.

- Stylistic cleanup: use spaces between function arguments and JSON fields
  in examples. For example "foo(a,b)" -> "foo(a, b)". Add semicolon after
  last END in a few PL/pgSQL examples that were missing them.

- Make sentence that talked about "..." and ".." operators more clear,
  by avoiding to end the sentence with "..". That makes it look the same
  as "..."

- Fix syntax description for HAVING: HAVING conditions cannot be repeated

Patch by Justin Pryzby, per Yaroslav Schekin's report. Backpatch to all
supported versions, to the extent that the patch applies easily.

Discussion: https://www.postgresql.org/message-id/20201005191922.GE17626%40telsasoft.com
2020-10-19 19:28:54 +03:00
Peter Eisentraut
9081bddbd7 Improve <xref> vs. <command> formatting in the documentation
SQL commands are generally marked up as <command>, except when a link
to a reference page is used using <xref>.  But the latter doesn't
create monospace markup, so this looks strange especially when a
paragraph contains a mix of links and non-links.

We considered putting <command> in the <refentrytitle> on the target
side, but that creates some formatting side effects elsewhere.
Generally, it seems safer to solve this on the link source side.

We can't put the <xref> inside the <command>; the DTD doesn't allow
this.  DocBook 5 would allow the <command> to have the linkend
attribute itself, but we are not there yet.

So to solve this for now, convert the <xref>s to <link> plus
<command>.  This gives the correct look and also gives some more
flexibility what we can put into the link text (e.g., subcommands or
other clauses).  In the future, these could then be converted to
DocBook 5 style.

I haven't converted absolutely all xrefs to SQL command reference
pages, only those where we care about the appearance of the link text
or where it was otherwise appropriate to make the appearance match a
bit better.  Also in some cases, the links where repetitive, so in
those cases the links where just removed and replaced by a plain
<command>.  In cases where we just want the link and don't
specifically care about the generated link text (typically phrased
"for further information see <xref ...>") the xref is kept.

Reported-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/87o8pco34z.fsf@wibble.ilmari.org
2020-10-03 16:40:02 +02:00
Peter Eisentraut
a02b8bdd98 doc: Fix man page whitespace issues
Whitespace between tags is significant, and in some cases it creates
extra vertical space in man pages.  The fix is either to remove some
newlines or in some cases to reword slightly to avoid the awkward
markup layout.
2020-06-07 14:54:28 +02:00
Tom Lane
60c90c16c1 Doc: fix "Unresolved ID reference" warnings, clean up man page cross-refs.
Use xreflabel attributes instead of endterm attributes to control the
appearance of links to subsections of SQL command reference pages.
This is simpler, it matches what we do elsewhere (e.g. for GUC variables),
and it doesn't draw "Unresolved ID reference" warnings from the PDF
toolchain.

Fix some places where the text was absolutely dependent on an <xref>
rendering exactly so, by using a <link> around the required text
instead.  At least one of those spots had already been turned into
bad grammar by subsequent changes, and the whole idea is just too
fragile for my taste.  <xref> does NOT have fixed output, don't write
as if it does.

Consistently include a page-level link in cross-man-page references,
because otherwise they are useless/nonsensical in man-page output.
Likewise, be consistent about mentioning "below" or "above" in same-page
references; we were doing that in about 90% of the cases, but now it's
100%.

Also get rid of another nonfunctional-in-PDF idea, of making
cross-references to functions by sticking ID tags on <row> constructs.
We can put the IDs on <indexterm>s instead --- which is probably not any
more sensible in abstract terms, but it works where the other doesn't.
(There is talk of attaching cross-reference IDs to most or all of
the docs' function descriptions, but for now I just fixed the two
that exist.)

Discussion: https://postgr.es/m/14480.1589154358@sss.pgh.pa.us
2020-05-11 14:15:55 -04:00
Michael Paquier
8128b0c152 Fix collection of typos and grammar mistakes in the tree, volume 2
This fixes some comments and documentation new as of Postgres 13, and is
a follow-up of the work done in dd0f37e.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20200408165653.GF2228@telsasoft.com
2020-04-14 14:45:43 +09:00
Alvaro Herrera
357889eb17 Support FETCH FIRST WITH TIES
WITH TIES is an option to the FETCH FIRST N ROWS clause (the SQL
standard's spelling of LIMIT), where you additionally get rows that
compare equal to the last of those N rows by the columns in the
mandatory ORDER BY clause.

There was a proposal by Andrew Gierth to implement this functionality in
a more powerful way that would yield more features, but the other patch
had not been finished at this time, so we decided to use this one for
now in the spirit of incremental development.

Author: Surafel Temesgen <surafel3000@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com>
Discussion: https://postgr.es/m/CALAY4q9ky7rD_A4vf=FVQvCGngm3LOes-ky0J6euMrg=_Se+ag@mail.gmail.com
Discussion: https://postgr.es/m/87o8wvz253.fsf@news-spur.riddles.org.uk
2020-04-07 16:22:13 -04:00
Tom Lane
5b805886ca Doc: clarify use of RECURSIVE in WITH.
Apparently some people misinterpreted the syntax as being that
RECURSIVE is a prefix of individual WITH queries.  It's a modifier
for the WITH clause as a whole, so state that more clearly.

Discussion: https://postgr.es/m/ca53c6ce-a0c6-b14a-a8e3-162f0b2cc119@a-kretschmer.de
2019-11-19 14:43:37 -05:00
Tom Lane
608b167f9f Allow user control of CTE materialization, and change the default behavior.
Historically we've always materialized the full output of a CTE query,
treating WITH as an optimization fence (so that, for example, restrictions
from the outer query cannot be pushed into it).  This is appropriate when
the CTE query is INSERT/UPDATE/DELETE, or is recursive; but when the CTE
query is non-recursive and side-effect-free, there's no hazard of changing
the query results by pushing restrictions down.

Another argument for materialization is that it can avoid duplicate
computation of an expensive WITH query --- but that only applies if
the WITH query is called more than once in the outer query.  Even then
it could still be a net loss, if each call has restrictions that
would allow just a small part of the WITH query to be computed.

Hence, let's change the behavior for WITH queries that are non-recursive
and side-effect-free.  By default, we will inline them into the outer
query (removing the optimization fence) if they are called just once.
If they are called more than once, we will keep the old behavior by
default, but the user can override this and force inlining by specifying
NOT MATERIALIZED.  Lastly, the user can force the old behavior by
specifying MATERIALIZED; this would mainly be useful when the query had
deliberately been employing WITH as an optimization fence to prevent a
poor choice of plan.

Andreas Karlsson, Andrew Gierth, David Fetter

Discussion: https://postgr.es/m/87sh48ffhb.fsf@news-spur.riddles.org.uk
2019-02-16 16:11:12 -05:00
Tom Lane
e0cd0ea4f9 Doc: update documentation for requirement of ORDER BY in GROUPS mode.
Commit ff4f88916 adjusted the code to enforce the SQL spec's requirement
that a window using GROUPS mode must have an ORDER BY clause.  But I missed
that the documentation explicitly said you didn't have to have one.

Also minor wordsmithing in the window-function section of select.sgml.

Per Masahiko Sawada, though I didn't use his patch.
2018-07-12 11:10:24 -04:00
Andrew Gierth
1da162e1f5 Fix SQL:2008 FETCH FIRST syntax to allow parameters.
OFFSET <x> ROWS FETCH FIRST <y> ROWS ONLY syntax is supposed to accept
<simple value specification>, which includes parameters as well as
literals. When this syntax was added all those years ago, it was done
inconsistently, with <x> and <y> being different subsets of the
standard syntax.

Rectify that by making <x> and <y> accept the same thing, and allowing
either a (signed) numeric literal or a c_expr there, which allows for
parameters, variables, and parenthesized arbitrary expressions.

Per bug #15200 from Lukas Eder.

Backpatch all the way, since this has been broken from the start.

Discussion: https://postgr.es/m/877enz476l.fsf@news-spur.riddles.org.uk
Discussion: http://postgr.es/m/152647780335.27204.16895288237122418685@wrigleys.postgresql.org
2018-05-21 17:27:08 +01:00
Tom Lane
0a459cec96 Support all SQL:2011 options for window frame clauses.
This patch adds the ability to use "RANGE offset PRECEDING/FOLLOWING"
frame boundaries in window functions.  We'd punted on that back in the
original patch to add window functions, because it was not clear how to
do it in a reasonably data-type-extensible fashion.  That problem is
resolved here by adding the ability for btree operator classes to provide
an "in_range" support function that defines how to add or subtract the
RANGE offset value.  Factoring it this way also allows the operator class
to avoid overflow problems near the ends of the datatype's range, if it
wishes to expend effort on that.  (In the committed patch, the integer
opclasses handle that issue, but it did not seem worth the trouble to
avoid overflow failures for datetime types.)

The patch includes in_range support for the integer_ops opfamily
(int2/int4/int8) as well as the standard datetime types.  Support for
other numeric types has been requested, but that seems like suitable
material for a follow-on patch.

In addition, the patch adds GROUPS mode which counts the offset in
ORDER-BY peer groups rather than rows, and it adds the frame_exclusion
options specified by SQL:2011.  As far as I can see, we are now fully
up to spec on window framing options.

Existing behaviors remain unchanged, except that I changed the errcode
for a couple of existing error reports to meet the SQL spec's expectation
that negative "offset" values should be reported as SQLSTATE 22013.

Internally and in relevant parts of the documentation, we now consistently
use the terminology "offset PRECEDING/FOLLOWING" rather than "value
PRECEDING/FOLLOWING", since the term "value" is confusingly vague.

Oliver Ford, reviewed and whacked around some by me

Discussion: https://postgr.es/m/CAGMVOdu9sivPAxbNN0X+q19Sfv9edEPv=HibOJhB14TJv_RCQg@mail.gmail.com
2018-02-07 00:06:56 -05:00
Peter Eisentraut
3c49c6facb Convert documentation to DocBook XML
Since some preparation work had already been done, the only source
changes left were changing empty-element tags like <xref linkend="foo">
to <xref linkend="foo"/>, and changing the DOCTYPE.

The source files are still named *.sgml, but they are actually XML files
now.  Renaming could be considered later.

In the build system, the intermediate step to convert from SGML to XML
is removed.  Everything is build straight from the source files again.
The OpenSP (or the old SP) package is no longer needed.

The documentation toolchain instructions are updated and are much
simpler now.

Peter Eisentraut, Alexander Lakhin, Jürgen Purtz
2017-11-23 09:44:28 -05:00
Peter Eisentraut
1ff01b3902 Convert SGML IDs to lower case
IDs in SGML are case insensitive, and we have accumulated a mix of upper
and lower case IDs, including different variants of the same ID.  In
XML, these will be case sensitive, so we need to fix up those
differences.  Going to all lower case seems most straightforward, and
the current build process already makes all anchors and lower case
anyway during the SGML->XML conversion, so this doesn't create any
difference in the output right now.  A future XML-only build process
would, however, maintain any mixed case ID spellings in the output, so
that is another reason to clean this up beforehand.

Author: Alexander Lakhin <exclusion@gmail.com>
2017-10-20 19:26:10 -04:00
Peter Eisentraut
c29c578908 Don't use SGML empty tags
For DocBook XML compatibility, don't use SGML empty tags (</>) anymore,
replace by the full tag name.  Add a warning option to catch future
occurrences.

Alexander Lakhin, Jürgen Purtz
2017-10-17 15:10:33 -04:00
Tom Lane
ed3dc224e5 Doc: clarify description of degenerate NATURAL joins.
Claiming that NATURAL JOIN is equivalent to CROSS JOIN when there are
no common column names is only strictly correct if it's an inner join;
you can't say e.g. CROSS LEFT JOIN.  Better to explain it as meaning
JOIN ON TRUE, instead.  Per a suggestion from David Johnston.

Discussion: https://postgr.es/m/CAKFQuwb+mYszQhDS9f_dqRrk1=Pe-S6D=XMkAXcDf4ykKPmgKQ@mail.gmail.com
2017-07-20 12:41:26 -04:00
Simon Riggs
f66472428a Correct TABLESAMPLE docs
Revert to original use of word “sample”, though with clarification,
per Tom Lane.

Discussion: 29052.1471015383@sss.pgh.pa.us
2016-09-09 11:19:21 +01:00
Simon Riggs
6e75559ea9 Correct TABLESAMPLE docs
Original wording was correct but not the intended meaning.

Reported by Patrik Wenger
2016-08-12 10:34:43 +01:00
Peter Eisentraut
5676da2d01 Documentation spell checking and markup improvements 2016-07-28 22:46:15 -04:00
Tom Lane
9118d03a8c When appropriate, postpone SELECT output expressions till after ORDER BY.
It is frequently useful for volatile, set-returning, or expensive functions
in a SELECT's targetlist to be postponed till after ORDER BY and LIMIT are
done.  Otherwise, the functions might be executed for every row of the
table despite the presence of LIMIT, and/or be executed in an unexpected
order.  For example, in
	SELECT x, nextval('seq') FROM tab ORDER BY x LIMIT 10;
it's probably desirable that the nextval() values are ordered the same
as x, and that nextval() is not run more than 10 times.

In the past, Postgres was inconsistent in this area: you would get the
desirable behavior if the ordering were performed via an indexscan, but
not if it had to be done by an explicit sort step.  Getting the desired
behavior reliably required contortions like
	SELECT x, nextval('seq')
	  FROM (SELECT x FROM tab ORDER BY x) ss LIMIT 10;

This patch conditionally postpones evaluation of pure-output target
expressions (that is, those that are not used as DISTINCT, ORDER BY, or
GROUP BY columns) so that they effectively occur after sorting, even if an
explicit sort step is necessary.  Volatile expressions and set-returning
expressions are always postponed, so as to provide consistent semantics.
Expensive expressions (costing more than 10 times typical operator cost,
which by default would include any user-defined function) are postponed
if there is a LIMIT or if there are expressions that must be postponed.

We could be more aggressive and postpone any nontrivial expression, but
there are costs associated with doing so: it requires an extra Result plan
node which adds some overhead, and postponement changes the volume of data
going through the sort step, perhaps for the worse.  Since we tend not to
have very good estimates of the output width of nontrivial expressions,
it's hard to have much confidence in our ability to predict whether
postponement would increase or decrease the cost of the sort; therefore
this patch doesn't attempt to make decisions conditionally on that.
Between these factors and a general desire not to change query behavior
when there's not a demonstrable benefit, it seems best to be conservative
about applying postponement.  We might tweak the decision rules in the
future, though.

Konstantin Knizhnik, heavily rewritten by me
2016-03-11 12:27:50 -05:00
Tom Lane
dd7a8f66ed Redesign tablesample method API, and do extensive code review.
The original implementation of TABLESAMPLE modeled the tablesample method
API on index access methods, which wasn't a good choice because, without
specialized DDL commands, there's no way to build an extension that can
implement a TSM.  (Raw inserts into system catalogs are not an acceptable
thing to do, because we can't undo them during DROP EXTENSION, nor will
pg_upgrade behave sanely.)  Instead adopt an API more like procedural
language handlers or foreign data wrappers, wherein the only SQL-level
support object needed is a single handler function identified by having
a special return type.  This lets us get rid of the supporting catalog
altogether, so that no custom DDL support is needed for the feature.

Adjust the API so that it can support non-constant tablesample arguments
(the original coding assumed we could evaluate the argument expressions at
ExecInitSampleScan time, which is undesirable even if it weren't outright
unsafe), and discourage sampling methods from looking at invisible tuples.
Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable
within and across queries, as required by the SQL standard, and deal more
honestly with methods that can't support that requirement.

Make a full code-review pass over the tablesample additions, and fix
assorted bugs, omissions, infelicities, and cosmetic issues (such as
failure to put the added code stanzas in a consistent ordering).
Improve EXPLAIN's output of tablesample plans, too.

Back-patch to 9.5 so that we don't have to support the original API
in production.
2015-07-25 14:39:00 -04:00
Andres Freund
f3d3118532 Support GROUPING SETS, CUBE and ROLLUP.
This SQL standard functionality allows to aggregate data by different
GROUP BY clauses at once. Each grouping set returns rows with columns
grouped by in other sets set to NULL.

This could previously be achieved by doing each grouping as a separate
query, conjoined by UNION ALLs. Besides being considerably more concise,
grouping sets will in many cases be faster, requiring only one scan over
the underlying data.

The current implementation of grouping sets only supports using sorting
for input. Individual sets that share a sort order are computed in one
pass. If there are sets that don't share a sort order, additional sort &
aggregation steps are performed. These additional passes are sourced by
the previous sort step; thus avoiding repeated scans of the source data.

The code is structured in a way that adding support for purely using
hash aggregation or a mix of hashing and sorting is possible. Sorting
was chosen to be supported first, as it is the most generic method of
implementation.

Instead of, as in an earlier versions of the patch, representing the
chain of sort and aggregation steps as full blown planner and executor
nodes, all but the first sort are performed inside the aggregation node
itself. This avoids the need to do some unusual gymnastics to handle
having to return aggregated and non-aggregated tuples from underlying
nodes, as well as having to shut down underlying nodes early to limit
memory usage.  The optimizer still builds Sort/Agg node to describe each
phase, but they're not part of the plan tree, but instead additional
data for the aggregation node. They're a convenient and preexisting way
to describe aggregation and sorting.  The first (and possibly only) sort
step is still performed as a separate execution step. That retains
similarity with existing group by plans, makes rescans fairly simple,
avoids very deep plans (leading to slow explains) and easily allows to
avoid the sorting step if the underlying data is sorted by other means.

A somewhat ugly side of this patch is having to deal with a grammar
ambiguity between the new CUBE keyword and the cube extension/functions
named cube (and rollup). To avoid breaking existing deployments of the
cube extension it has not been renamed, neither has cube been made a
reserved keyword. Instead precedence hacking is used to make GROUP BY
cube(..) refer to the CUBE grouping sets feature, and not the function
cube(). To actually group by a function cube(), unlikely as that might
be, the function name has to be quoted.

Needs a catversion bump because stored rules may change.

Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund
Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas
    Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule
Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:46:31 +02:00
Simon Riggs
f6d208d6e5 TABLESAMPLE, SQL Standard and extensible
Add a TABLESAMPLE clause to SELECT statements that allows
user to specify random BERNOULLI sampling or block level
SYSTEM sampling. Implementation allows for extensible
sampling functions to be written, using a standard API.
Basic version follows SQLStandard exactly. Usable
concrete use cases for the sampling API follow in later
commits.

Petr Jelinek

Reviewed by Michael Paquier and Simon Riggs
2015-05-15 14:37:10 -04:00
Tom Lane
f6caf5acf1 Fix incorrect markup in documentation of window frame clauses.
You're required to write either RANGE or ROWS to start a frame clause,
but the documentation incorrectly implied this is optional.  Noted by
David Johnston.
2015-03-31 20:02:40 -04:00
Tom Lane
0ce627d465 Document evaluation-order considerations for aggregate functions.
The SELECT reference page didn't really address the question of when
aggregate function evaluation occurs, nor did the "expression evaluation
rules" documentation mention that CASE can't be used to control whether
an aggregate gets evaluated or not.  Improve that.

Per discussion of bug #11661.  Original text by Marti Raudsepp and Michael
Paquier, rewritten significantly by me.
2014-11-14 17:19:56 -05:00
Alvaro Herrera
35fed51626 Tweak row-level locking documentation
Move the meat of locking levels to mvcc.sgml, leaving only a link to it
in the SELECT reference page.

Michael Paquier, with some tweaks by Álvaro
2014-11-13 14:45:55 -03:00
Alvaro Herrera
df630b0dd5 Implement SKIP LOCKED for row-level locks
This clause changes the behavior of SELECT locking clauses in the
presence of locked rows: instead of causing a process to block waiting
for the locks held by other processes (or raise an error, with NOWAIT),
SKIP LOCKED makes the new reader skip over such rows.  While this is not
appropriate behavior for general purposes, there are some cases in which
it is useful, such as queue-like tables.

Catalog version bumped because this patch changes the representation of
stored rules.

Reviewed by Craig Ringer (based on a previous attempt at an
implementation by Simon Riggs, who also provided input on the syntax
used in the current patch), David Rowley, and Álvaro Herrera.

Author: Thomas Munro
2014-10-07 17:23:34 -03:00
Bruce Momjian
1f4d1074c5 Clarify documentation about "peer" rows in window functions
Peer rows are matching rows when ORDER BY is specified.

Report by arnaud.mouronval@gmail.com, David G Johnston
2014-09-05 18:59:41 -04:00
Kevin Grittner
05258761bf doc: Various typo/grammar fixes
Errors detected using Topy (https://github.com/intgr/topy), all
changes verified by hand and some manual tweaks added.

Marti Raudsepp

Individual changes backpatched, where applicable, as far as 9.0.
2014-08-30 10:52:36 -05:00
Bruce Momjian
3624acd342 docs: small adjustements to recent SELECT and pg_upgrade improvements 2014-03-08 11:26:47 -05:00
Bruce Momjian
b0cb40f93a docs: improve TABLE command by showing supported clauses
Initial patch by Colin 't Hart
2014-03-07 20:56:16 -05:00
Peter Eisentraut
bb4eefe7bf doc: Improve DocBook XML validity
DocBook XML is superficially compatible with DocBook SGML but has a
slightly stricter DTD that we have been violating in a few cases.
Although XSLT doesn't care whether the document is valid, the style
sheets don't necessarily process invalid documents correctly, so we need
to work toward fixing this.

This first commit moves the indexterms in refentry elements to an
allowed position.  It has no impact on the output.
2014-02-23 21:31:08 -05:00
Bruce Momjian
8824b38909 docs: specify FOR UPDATE/SHARE incompatibilities
Document that FOR UPDATE/SHARE are incompatible with GROUP BY, DISTINCT,
HAVING and window functions.

Michael Paquier
2014-01-31 16:37:25 -05:00
Tom Lane
1b4f7f93b4 Allow empty target list in SELECT.
This fixes a problem noted as a followup to bug #8648: if a query has a
semantically-empty target list, e.g. SELECT * FROM zero_column_table,
ruleutils.c will dump it as a syntactically-empty target list, which was
not allowed.  There doesn't seem to be any reliable way to fix this by
hacking ruleutils (note in particular that the originally zero-column table
might since have had columns added to it); and even if we had such a fix,
it would do nothing for existing dump files that might contain bad syntax.
The best bet seems to be to relax the syntactic restriction.

Also, add parse-analysis errors for SELECT DISTINCT with no columns (after
*-expansion) and RETURNING with no columns.  These cases previously
produced unexpected behavior because the parsed Query looked like it had
no DISTINCT or RETURNING clause, respectively.  If anyone ever offers
a plausible use-case for this, we could work a bit harder on making the
situation distinguishable.

Arguably this is a bug fix that should be back-patched, but I'm worried
that there may be client apps or PLs that expect "SELECT ;" to throw a
syntax error.  The issue doesn't seem important enough to risk changing
behavior in minor releases.
2013-12-14 20:23:26 -05:00
Noah Misch
53685d7981 Rename TABLE() to ROWS FROM().
SQL-standard TABLE() is a subset of UNNEST(); they deal with arrays and
other collection types.  This feature, however, deals with set-returning
functions.  Use a different syntax for this feature to keep open the
possibility of implementing the standard TABLE().
2013-12-10 09:34:37 -05:00
Tom Lane
784e762e88 Support multi-argument UNNEST(), and TABLE() syntax for multiple functions.
This patch adds the ability to write TABLE( function1(), function2(), ...)
as a single FROM-clause entry.  The result is the concatenation of the
first row from each function, followed by the second row from each
function, etc; with NULLs inserted if any function produces fewer rows than
others.  This is believed to be a much more useful behavior than what
Postgres currently does with multiple SRFs in a SELECT list.

This syntax also provides a reasonable way to combine use of column
definition lists with WITH ORDINALITY: put the column definition list
inside TABLE(), where it's clear that it doesn't control the ordinality
column as well.

Also implement SQL-compliant multiple-argument UNNEST(), by turning
UNNEST(a,b,c) into TABLE(unnest(a), unnest(b), unnest(c)).

The SQL standard specifies TABLE() with only a single function, not
multiple functions, and it seems to require an implicit UNNEST() which is
not what this patch does.  There may be something wrong with that reading
of the spec, though, because if it's right then the spec's TABLE() is just
a pointless alternative spelling of UNNEST().  After further review of
that, we might choose to adopt a different syntax for what this patch does,
but in any case this functionality seems clearly worthwhile.

Andrew Gierth, reviewed by Zoltán Böszörményi and Heikki Linnakangas, and
significantly revised by me
2013-11-21 19:37:20 -05:00
Peter Eisentraut
001e114b8d Fix whitespace issues found by git diff --check, add gitattributes
Set per file type attributes in .gitattributes to fine-tune whitespace
checks.  With the associated cleanups, the tree is now clean for git
2013-11-10 14:48:29 -05:00