1
0
mirror of https://github.com/postgres/postgres.git synced 2025-05-17 06:41:24 +03:00

1399 Commits

Author SHA1 Message Date
Peter Eisentraut
303f4d1012 Assorted message fixes and improvements 2014-09-05 01:25:27 -04:00
Alvaro Herrera
1c9701cfe5 Fix FOR UPDATE NOWAIT on updated tuple chains
If SELECT FOR UPDATE NOWAIT tries to lock a tuple that is concurrently
being updated, it might fail to honor its NOWAIT specification and block
instead of raising an error.

Fix by adding a no-wait flag to EvalPlanQualFetch which it can pass down
to heap_lock_tuple; also use it in EvalPlanQualFetch itself to avoid
blocking while waiting for a concurrent transaction.

Authors: Craig Ringer and Thomas Munro, tweaked by Álvaro
http://www.postgresql.org/message-id/51FB6703.9090801@2ndquadrant.com

Per Thomas Munro in the course of his SKIP LOCKED feature submission,
who also provided one of the isolation test specs.

Backpatch to 9.4, because that's as far back as it applies without
conflicts (although the bug goes all the way back).  To that branch also
backpatch Thomas Munro's new NOWAIT test cases, committed in master by
Heikki as commit 9ee16b49f0aac819bd4823d9b94485ef608b34e8 .
2014-08-27 19:15:18 -04:00
Robert Haas
1d41739e5a Don't require sort support functions to provide a comparator.
This could be useful for datatypes like text, where we might want
to optimize for some collations but not others.  However, this patch
doesn't introduce any new sortsupport functions that work this way;
it merely revises the code so that future patches may do so.

Patch by me.  Review by Peter Geoghegan.
2014-08-06 16:06:06 -04:00
Kevin Grittner
49d1e03d64 Fix typo in C comment. 2014-08-05 14:17:21 -05:00
Tom Lane
d685814835 Fix bug with whole-row references to append subplans.
ExecEvalWholeRowVar incorrectly supposed that it could "bless" the source
TupleTableSlot just once per query.  But if the input is coming from an
Append (or, perhaps, other cases?) more than one slot might be returned
over the query run.  This led to "record type has not been registered"
errors when a composite datum was extracted from a non-blessed slot.

This bug has been there a long time; I guess it escaped notice because when
dealing with subqueries the planner tends to expand whole-row Vars into
RowExprs, which don't have the same problem.  It is possible to trigger
the problem in all active branches, though, as illustrated by the added
regression test.
2014-07-11 19:12:35 -04:00
Tom Lane
6f5034eda0 Redesign API presented by nodeAgg.c for ordered-set and similar aggregates.
The previous design exposed the input and output ExprContexts of the
Agg plan node, but work on grouping sets has suggested that we'll regret
doing that.  Instead provide more narrowly-defined APIs that can be
implemented in multiple ways, namely a way to get a short-term memory
context and a way to register an aggregate shutdown callback.

Back-patch to 9.4 where the bad APIs were introduced, since we don't
want third-party code using these APIs and then having to change in 9.5.

Andrew Gierth
2014-07-03 18:25:33 -04:00
Tom Lane
45b0f35723 Avoid leaking memory while evaluating arguments for a table function.
ExecMakeTableFunctionResult evaluated the arguments for a function-in-FROM
in the query-lifespan memory context.  This is insignificant in simple
cases where the function relation is scanned only once; but if the function
is in a sub-SELECT or is on the inside of a nested loop, any memory
consumed during argument evaluation can add up quickly.  (The potential for
trouble here had been foreseen long ago, per existing comments; but we'd
not previously seen a complaint from the field about it.)  To fix, create
an additional temporary context just for this purpose.

Per an example from MauMau.  Back-patch to all active branches.
2014-06-19 22:14:26 -04:00
Tom Lane
8f889b1083 Implement UPDATE tab SET (col1,col2,...) = (SELECT ...), ...
This SQL-standard feature allows a sub-SELECT yielding multiple columns
(but only one row) to be used to compute the new values of several columns
to be updated.  While the same results can be had with an independent
sub-SELECT per column, such a workaround can require a great deal of
duplicated computation.

The standard actually says that the source for a multi-column assignment
could be any row-valued expression.  The implementation used here is
tightly tied to our existing sub-SELECT support and can't handle other
cases; the Bison grammar would have some issues with them too.  However,
I don't feel too bad about this since other cases can be converted into
sub-SELECTs.  For instance, "SET (a,b,c) = row_valued_function(x)" could
be written "SET (a,b,c) = (SELECT * FROM row_valued_function(x))".
2014-06-18 13:22:34 -04:00
Jeff Davis
35c0cd3b05 Improve comment for tricky aspect of index-only scans.
Index-only scans avoid taking a lock on the VM buffer, which would
cause a lot of contention. To be correct, that requires some intricate
assumptions that weren't completely documented in the previous
comment.

Reviewed by Robert Haas.
2014-05-06 19:27:43 -07:00
Bruce Momjian
0a78320057 pgindent run for 9.4
This includes removing tabs after periods in C comments, which was
applied to back branches, so this change should not effect backpatching.
2014-05-06 12:12:18 -04:00
Tom Lane
3f8c8e3c61 Fix failure to detoast fields in composite elements of structured types.
If we have an array of records stored on disk, the individual record fields
cannot contain out-of-line TOAST pointers: the tuptoaster.c mechanisms are
only prepared to deal with TOAST pointers appearing in top-level fields of
a stored row.  The same applies for ranges over composite types, nested
composites, etc.  However, the existing code only took care of expanding
sub-field TOAST pointers for the case of nested composites, not for other
structured types containing composites.  For example, given a command such
as

UPDATE tab SET arraycol = ARRAY[(ROW(x,42)::mycompositetype] ...

where x is a direct reference to a field of an on-disk tuple, if that field
is long enough to be toasted out-of-line then the TOAST pointer would be
inserted as-is into the array column.  If the source record for x is later
deleted, the array field value would become a dangling pointer, leading
to errors along the line of "missing chunk number 0 for toast value ..."
when the value is referenced.  A reproducible test case for this was
provided by Jan Pecek, but it seems likely that some of the "missing chunk
number" reports we've heard in the past were caused by similar issues.

Code-wise, the problem is that PG_DETOAST_DATUM() is not adequate to
produce a self-contained Datum value if the Datum is of composite type.
Seen in this light, the problem is not just confined to arrays and ranges,
but could also affect some other places where detoasting is done in that
way, for example form_index_tuple().

I tried teaching the array code to apply toast_flatten_tuple_attribute()
along with PG_DETOAST_DATUM() when the array element type is composite,
but this was messy and imposed extra cache lookup costs whether or not any
TOAST pointers were present, indeed sometimes when the array element type
isn't even composite (since sometimes it takes a typcache lookup to find
that out).  The idea of extending that approach to all the places that
currently use PG_DETOAST_DATUM() wasn't attractive at all.

This patch instead solves the problem by decreeing that composite Datum
values must not contain any out-of-line TOAST pointers in the first place;
that is, we expand out-of-line fields at the point of constructing a
composite Datum, not at the point where we're about to insert it into a
larger tuple.  This rule is applied only to true composite Datums, not
to tuples that are being passed around the system as tuples, so it's not
as invasive as it might sound at first.  With this approach, the amount
of code that has to be touched for a full solution is greatly reduced,
and added cache lookup costs are avoided except when there actually is
a TOAST pointer that needs to be inlined.

The main drawback of this approach is that we might sometimes dereference
a TOAST pointer that will never actually be used by the query, imposing a
rather large cost that wasn't there before.  On the other side of the coin,
if the field value is used multiple times then we'll come out ahead by
avoiding repeat detoastings.  Experimentation suggests that common SQL
coding patterns are unaffected either way, though.  Applications that are
very negatively affected could be advised to modify their code to not fetch
columns they won't be using.

In future, we might consider reverting this solution in favor of detoasting
only at the point where data is about to be stored to disk, using some
method that can drill down into multiple levels of nested structured types.
That will require defining new APIs for structured types, though, so it
doesn't seem feasible as a back-patchable fix.

Note that this patch changes HeapTupleGetDatum() from a macro to a function
call; this means that any third-party code using that macro will not get
protection against creating TOAST-pointer-containing Datums until it's
recompiled.  The same applies to any uses of PG_RETURN_HEAPTUPLEHEADER().
It seems likely that this is not a big problem in practice: most of the
tuple-returning functions in core and contrib produce outputs that could
not possibly be toasted anyway, and the same probably holds for third-party
extensions.

This bug has existed since TOAST was invented, so back-patch to all
supported branches.
2014-05-01 15:19:06 -04:00
Tom Lane
f0fedfe82c Allow polymorphic aggregates to have non-polymorphic state data types.
Before 9.4, such an aggregate couldn't be declared, because its final
function would have to have polymorphic result type but no polymorphic
argument, which CREATE FUNCTION would quite properly reject.  The
ordered-set-aggregate patch found a workaround: allow the final function
to be declared as accepting additional dummy arguments that have types
matching the aggregate's regular input arguments.  However, we failed
to notice that this problem applies just as much to regular aggregates,
despite the fact that we had a built-in regular aggregate array_agg()
that was known to be undeclarable in SQL because its final function
had an illegal signature.  So what we should have done, and what this
patch does, is to decouple the extra-dummy-arguments behavior from
ordered-set aggregates and make it generally available for all aggregate
declarations.  We have to put this into 9.4 rather than waiting till
later because it slightly alters the rules for declaring ordered-set
aggregates.

The patch turned out a bit bigger than I'd hoped because it proved
necessary to record the extra-arguments option in a new pg_aggregate
column.  I'd thought we could just look at the final function's pronargs
at runtime, but that didn't work well for variadic final functions.
It's probably just as well though, because it simplifies life for pg_dump
to record the option explicitly.

While at it, fix array_agg() to have a valid final-function signature,
and add an opr_sanity test to notice future deviations from polymorphic
consistency.  I also marked the percentile_cont() aggregates as not
needing extra arguments, since they don't.
2014-04-23 19:17:41 -04:00
Tom Lane
e0c91a7ff0 Improve some O(N^2) behavior in window function evaluation.
Repositioning the tuplestore seek pointer in window_gettupleslot() turns
out to be a very significant expense when the window frame is sizable and
the frame end can move.  To fix, introduce a tuplestore function for
skipping an arbitrary number of tuples in one call, parallel to the one we
introduced for tuplesort objects in commit 8d65da1f.  This reduces the cost
of window_gettupleslot() to O(1) if the tuplestore has not spilled to disk.
As in the previous commit, I didn't try to do any real optimization of
tuplestore_skiptuples for the case where the tuplestore has spilled to
disk.  There is probably no practical way to get the cost to less than O(N)
anyway, but perhaps someone can think of something later.

Also fix PersistHoldablePortal() to make use of this API now that we have
it.

Based on a suggestion by Dean Rasheed, though this turns out not to look
much like his patch.
2014-04-13 13:59:17 -04:00
Tom Lane
a9d9acbf21 Create infrastructure for moving-aggregate optimization.
Until now, when executing an aggregate function as a window function
within a window with moving frame start (that is, any frame start mode
except UNBOUNDED PRECEDING), we had to recalculate the aggregate from
scratch each time the frame head moved.  This patch allows an aggregate
definition to include an alternate "moving aggregate" implementation
that includes an inverse transition function for removing rows from
the aggregate's running state.  As long as this can be done successfully,
runtime is proportional to the total number of input rows, rather than
to the number of input rows times the average frame length.

This commit includes the core infrastructure, documentation, and regression
tests using user-defined aggregates.  Follow-on commits will update some
of the built-in aggregates to use this feature.

David Rowley and Florian Pflug, reviewed by Dean Rasheed; additional
hacking by me
2014-04-12 12:03:30 -04:00
Noah Misch
7cbe57c34d Offer triggers on foreign tables.
This covers all the SQL-standard trigger types supported for regular
tables; it does not cover constraint triggers.  The approach for
acquiring the old row mirrors that for view INSTEAD OF triggers.  For
AFTER ROW triggers, we spool the foreign tuples to a tuplestore.

This changes the FDW API contract; when deciding which columns to
populate in the slot returned from data modification callbacks, writable
FDWs will need to check for AFTER ROW triggers in addition to checking
for a RETURNING clause.

In support of the feature addition, refactor the TriggerFlags bits and
the assembly of old tuples in ModifyTable.

Ronan Dunklau, reviewed by KaiGai Kohei; some additional hacking by me.
2014-03-23 02:16:34 -04:00
Alvaro Herrera
f88d4cfc9d Setup error context callback for transaction lock waits
With this in place, a session blocking behind another one because of
tuple locks will get a context line mentioning the relation name, tuple
TID, and operation being done on tuple.  For example:

LOG:  process 11367 still waiting for ShareLock on transaction 717 after 1000.108 ms
DETAIL:  Process holding the lock: 11366. Wait queue: 11367.
CONTEXT:  while updating tuple (0,2) in relation "foo"
STATEMENT:  UPDATE foo SET value = 3;

Most usefully, the new line is displayed by log entries due to
log_lock_waits, although of course it will be printed by any other log
message as well.

Author: Christian Kruse, some tweaks by Álvaro Herrera
Reviewed-by: Amit Kapila, Andres Freund, Tom Lane, Robert Haas
2014-03-19 15:10:36 -03:00
Tom Lane
bf4052faa1 Don't reject ROW_MARK_REFERENCE rowmarks for materialized views.
We should allow this so that matviews can be referenced in UPDATE/DELETE
statements in READ COMMITTED isolation level.  The requirement for that
is that a re-fetch by TID will see the same row version the query saw
earlier, which is true of matviews, so there's no reason for the
restriction.  Per bug #9398.

Michael Paquier, after a suggestion by me
2014-03-06 11:37:02 -05:00
Robert Haas
b89e151054 Introduce logical decoding.
This feature, building on previous commits, allows the write-ahead log
stream to be decoded into a series of logical changes; that is,
inserts, updates, and deletes and the transactions which contain them.
It is capable of handling decoding even across changes to the schema
of the effected tables.  The output format is controlled by a
so-called "output plugin"; an example is included.  To make use of
this in a real replication system, the output plugin will need to be
modified to produce output in the format appropriate to that system,
and to perform filtering.

Currently, information can be extracted from the logical decoding
system only via SQL; future commits will add the ability to stream
changes via walsender.

Andres Freund, with review and other contributions from many other
people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Gheogegan,
Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Abhijit
Menon-Sen, Michael Paquier, Simon Riggs, Craig Ringer, and Steve
Singer.
2014-03-03 16:32:18 -05:00
Tom Lane
0def2573c5 Fix *-qualification of named parameters in SQL-language functions.
Given a composite-type parameter named x, "$1.*" worked fine, but "x.*"
not so much.  This has been broken since named parameter references were
added in commit 9bff0780cf5be2193a5bad0d3df2dbe143085264, so patch back
to 9.2.  Per bug #9085 from Hardy Falk.
2014-02-03 14:47:17 -05:00
Robert Haas
2bb1f14b89 Make bitmap heap scans show exact/lossy block info in EXPLAIN ANALYZE.
Etsuro Fujita
2014-01-13 14:42:16 -05:00
Tom Lane
080b7db72e Fix "cannot accept a set" error when only some arms of a CASE return a set.
In commit c1352052ef1d4eeb2eb1d822a207ddc2d106cb13, I implemented an
optimization that assumed that a function's argument expressions would
either always return a set (ie multiple rows), or always not.  This is
wrong however: we allow CASE expressions in which some arms return a set
of some type and others just return a scalar of that type.  There may be
other examples as well.  To fix, replace the run-time test of whether an
argument returned a set with a static precheck (expression_returns_set).
This adds a little bit of query startup overhead, but it seems barely
measurable.

Per bug #8228 from David Johnston.  This has been broken since 8.0,
so patch all supported branches.
2014-01-08 20:18:58 -05:00
Tom Lane
e6336b8b57 Save a few cycles in advance_transition_function().
Keep a pre-initialized FunctionCallInfoData in AggStatePerAggData, and
re-use that at each row instead of doing InitFunctionCallInfoData each
time.  This saves only half a dozen assignments and maybe some stack
manipulation, and yet that seems to be good for a percent or two of the
overall query run time for simple aggregates such as count(*).  The cost
is that the FunctionCallInfoData (which is about a kilobyte, on 64-bit
machines) stays allocated for the duration of the query instead of being
short-lived stack data.  But we're already paying an equivalent space cost
for each regular FuncExpr or OpExpr node, so I don't feel bad about paying
it for aggregate functions.  The code seems a little cleaner this way too,
since the number of things passed to advance_transition_function decreases.
2014-01-08 13:58:37 -05:00
Bruce Momjian
7e04792a1c Update copyright for 2014
Update all files in head, and files COPYRIGHT and legal.sgml in all back
branches.
2014-01-07 16:05:30 -05:00
Tom Lane
8d65da1f01 Support ordered-set (WITHIN GROUP) aggregates.
This patch introduces generic support for ordered-set and hypothetical-set
aggregate functions, as well as implementations of the instances defined in
SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(),
percent_rank(), cume_dist()).  We also added mode() though it is not in the
spec, as well as versions of percentile_cont() and percentile_disc() that
can compute multiple percentile values in one pass over the data.

Unlike the original submission, this patch puts full control of the sorting
process in the hands of the aggregate's support functions.  To allow the
support functions to find out how they're supposed to sort, a new API
function AggGetAggref() is added to nodeAgg.c.  This allows retrieval of
the aggregate call's Aggref node, which may have other uses beyond the
immediate need.  There is also support for ordered-set aggregates to
install cleanup callback functions, so that they can be sure that
infrastructure such as tuplesort objects gets cleaned up.

In passing, make some fixes in the recently-added support for variadic
aggregates, and make some editorial adjustments in the recent FILTER
additions for aggregates.  Also, simplify use of IsBinaryCoercible() by
allowing it to succeed whenever the target type is ANY or ANYELEMENT.
It was inconsistent that it dealt with other polymorphic target types
but not these.

Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing,
and rather heavily editorialized upon by Tom Lane
2013-12-23 16:11:35 -05:00
Tom Lane
784e762e88 Support multi-argument UNNEST(), and TABLE() syntax for multiple functions.
This patch adds the ability to write TABLE( function1(), function2(), ...)
as a single FROM-clause entry.  The result is the concatenation of the
first row from each function, followed by the second row from each
function, etc; with NULLs inserted if any function produces fewer rows than
others.  This is believed to be a much more useful behavior than what
Postgres currently does with multiple SRFs in a SELECT list.

This syntax also provides a reasonable way to combine use of column
definition lists with WITH ORDINALITY: put the column definition list
inside TABLE(), where it's clear that it doesn't control the ordinality
column as well.

Also implement SQL-compliant multiple-argument UNNEST(), by turning
UNNEST(a,b,c) into TABLE(unnest(a), unnest(b), unnest(c)).

The SQL standard specifies TABLE() with only a single function, not
multiple functions, and it seems to require an implicit UNNEST() which is
not what this patch does.  There may be something wrong with that reading
of the spec, though, because if it's right then the spec's TABLE() is just
a pointless alternative spelling of UNNEST().  After further review of
that, we might choose to adopt a different syntax for what this patch does,
but in any case this functionality seems clearly worthwhile.

Andrew Gierth, reviewed by Zoltán Böszörményi and Heikki Linnakangas, and
significantly revised by me
2013-11-21 19:37:20 -05:00
Tom Lane
c28b289bf3 Prevent display of dropped columns in row constraint violation messages.
ExecBuildSlotValueDescription() printed "null" for each dropped column in
a row being complained of by ExecConstraints().  This has some sanity in
terms of the underlying implementation, but is of course pretty surprising
to users.  To fix, we must pass the target relation's descriptor to
ExecBuildSlotValueDescription(), because the slot descriptor it had been
using doesn't get labeled with attisdropped markers.

Per bug #8408 from Maxim Boguk.  Back-patch to 9.2 where the feature of
printing row values in NOT NULL and CHECK constraint violation messages
was introduced.

Michael Paquier and Tom Lane
2013-11-07 14:41:36 -05:00
Tom Lane
b006f4ddb9 Prevent memory leaks from accumulating across printtup() calls.
Historically, printtup() has assumed that it could prevent memory leakage
by pfree'ing the string result of each output function and manually
managing detoasting of toasted values.  This amounts to assuming that
datatype output functions never leak any memory internally; an assumption
we've already decided to be bogus elsewhere, for example in COPY OUT.
range_out in particular is known to leak multiple kilobytes per call, as
noted in bug #8573 from Godfried Vanluffelen.  While we could go in and fix
that leak, it wouldn't be very notationally convenient, and in any case
there have been and undoubtedly will again be other leaks in other output
functions.  So what seems like the best solution is to run the output
functions in a temporary memory context that can be reset after each row,
as we're doing in COPY OUT.  Some quick experimentation suggests this is
actually a tad faster than the retail pfree's anyway.

This patch fixes all the variants of printtup, except for debugtup()
which is used in standalone mode.  It doesn't seem worth worrying
about query-lifespan leaks in standalone mode, and fixing that case
would be a bit tedious since debugtup() doesn't currently have any
startup or shutdown functions.

While at it, remove manual detoast management from several other
output-function call sites that had copied it from printtup().  This
doesn't make a lot of difference right now, but in view of recent
discussions about supporting "non-flattened" Datums, we're going to
want that code gone eventually anyway.

Back-patch to 9.2 where range_out was introduced.  We might eventually
decide to back-patch this further, but in the absence of known major
leaks in older output functions, I'll refrain for now.
2013-11-03 11:33:05 -05:00
Kevin Grittner
be420fa02e Fix subquery reference to non-populated MV in CMV.
A subquery reference to a matview should be allowed by CREATE
MATERIALIZED VIEW WITH NO DATA, just like a direct reference is.

Per bug report from Laurent Sartran.

Backpatch to 9.3.
2013-11-02 18:38:17 -05:00
Robert Haas
ba3d39c969 Don't allow system columns in CHECK constraints, except tableoid.
Previously, arbitray system columns could be mentioned in table
constraints, but they were not correctly checked at runtime, because
the values weren't actually set correctly in the tuple.  Since it
seems easy enough to initialize the table OID properly, do that,
and continue allowing that column, but disallow the rest unless and
until someone figures out a way to make them work properly.

No back-patch, because this doesn't seem important enough to take the
risk of destabilizing the back branches.  In fact, this will pose a
dump-and-reload hazard for those upgrading from previous versions:
constraints that were accepted before but were not correctly enforced
will now either be enforced correctly or not accepted at all.  Either
could result in restore failures, but in practice I think very few
users will notice the difference, since the use case is pretty
marginal anyway and few users will be relying on features that have
not historically worked.

Amit Kapila, reviewed by Rushabh Lathia, with doc changes by me.
2013-09-23 13:31:22 -04:00
Tom Lane
0d3f4406df Allow aggregate functions to be VARIADIC.
There's no inherent reason why an aggregate function can't be variadic
(even VARIADIC ANY) if its transition function can handle the case.
Indeed, this patch to add the feature touches none of the planner or
executor, and little of the parser; the main missing stuff was DDL and
pg_dump support.

It is true that variadic aggregates can create the same sort of ambiguity
about parameters versus ORDER BY keys that was complained of when we
(briefly) had both one- and two-argument forms of string_agg().  However,
the policy formed in response to that discussion only said that we'd not
create any built-in aggregates with varying numbers of arguments, not that
we shouldn't allow users to do it.  So the logical extension of that is
we can allow users to make variadic aggregates as long as we're wary about
shipping any such in core.

In passing, this patch allows aggregate function arguments to be named, to
the extent of remembering the names in pg_proc and dumping them in pg_dump.
You can't yet call an aggregate using named-parameter notation.  That seems
like a likely future extension, but it'll take some work, and it's not what
this patch is really about.  Likewise, there's still some work needed to
make window functions handle VARIADIC fully, but I left that for another
day.

initdb forced because of new aggvariadic field in Aggref parse nodes.
2013-09-03 17:08:46 -04:00
Tom Lane
8e2b71d2d0 Reset the binary heap in MergeAppend rescans.
Failing to do so can cause queries to return wrong data, error out or crash.
This requires adding a new binaryheap_reset() method to binaryheap.c,
but that probably should have been there anyway.

Per bug #8410 from Terje Elde.  Diagnosis and patch by Andres Freund.
2013-08-30 19:15:21 -04:00
Peter Eisentraut
32f7c0ae17 Improve error message when view is not updatable
Avoid using the term "updatable" in confusing ways.  Suggest a trigger
first, before a rule.
2013-08-14 23:02:59 -04:00
Greg Stark
c62736cc37 Add SQL Standard WITH ORDINALITY support for UNNEST (and any other SRF)
Author: Andrew Gierth, David Fetter
Reviewers: Dean Rasheed, Jeevan Chalke, Stephen Frost
2013-07-29 16:38:01 +01:00
Tom Lane
3d13623d75 Prevent leakage of SPI tuple tables during subtransaction abort.
plpgsql often just remembers SPI-result tuple tables in local variables,
and has no mechanism for freeing them if an ereport(ERROR) causes an escape
out of the execution function whose local variable it is.  In the original
coding, that wasn't a problem because the tuple table would be cleaned up
when the function's SPI context went away during transaction abort.
However, once plpgsql grew the ability to trap exceptions, repeated
trapping of errors within a function could result in significant
intra-function-call memory leakage, as illustrated in bug #8279 from
Chad Wagner.

We could fix this locally in plpgsql with a bunch of PG_TRY/PG_CATCH
coding, but that would be tedious, probably slow, and prone to bugs of
omission; moreover it would do nothing for similar risks elsewhere.
What seems like a better plan is to make SPI itself responsible for
freeing tuple tables at subtransaction abort.  This patch attacks the
problem that way, keeping a list of live tuple tables within each SPI
function context.  Currently, such freeing is automatic for tuple tables
made within the failed subtransaction.  We might later add a SPI call to
mark a tuple table as not to be freed this way, allowing callers to opt
out; but until someone exhibits a clear use-case for such behavior, it
doesn't seem worth bothering.

A very useful side-effect of this change is that SPI_freetuptable() can
now defend itself against bad calls, such as duplicate free requests;
this should make things more robust in many places.  (In particular,
this reduces the risks involved if a third-party extension contains
now-redundant SPI_freetuptable() calls in error cleanup code.)

Even though the leakage problem is of long standing, it seems imprudent
to back-patch this into stable branches, since it does represent an API
semantics change for SPI users.  We'll patch this in 9.3, but live with
the leakage in older branches.
2013-07-25 16:46:14 -04:00
Robert Haas
765ad89be3 Use InvalidSnapshot, now SnapshotNow, as the default snapshot.
As far as I can determine, there's no code in the core distribution
that fails to explicitly set the snapshot of a scan or executor
state.  If there is any such code, this will probably cause it to
seg fault; friendlier suggestions were discussed on pgsql-hackers,
but there was no consensus that anything more than this was
needed.

This is another step towards the hoped-for complete removal of
SnapshotNow.
2013-07-23 10:58:32 -04:00
Robert Haas
0518eceec3 Adjust HeapTupleSatisfies* routines to take a HeapTuple.
Previously, these functions took a HeapTupleHeader, but upcoming
patches for logical replication will introduce new a new snapshot
type under which the tuple's TID will be used to lookup (CMIN, CMAX)
for visibility determination purposes.  This makes that information
available.  Code churn is minimal since HeapTupleSatisfiesVisibility
took the HeapTuple anyway, and deferenced it before calling the
satisfies function.

Independently of logical replication, this allows t_tableOid and
t_self to be cross-checked via assertions in tqual.c.  This seems
like a useful way to make sure that all callers are setting these
values properly, which has been previously put forward as
desirable.

Andres Freund, reviewed by Álvaro Herrera
2013-07-22 13:38:44 -04:00
Stephen Frost
4cbe3ac3e8 WITH CHECK OPTION support for auto-updatable VIEWs
For simple views which are automatically updatable, this patch allows
the user to specify what level of checking should be done on records
being inserted or updated.  For 'LOCAL CHECK', new tuples are validated
against the conditionals of the view they are being inserted into, while
for 'CASCADED CHECK' the new tuples are validated against the
conditionals for all views involved (from the top down).

This option is part of the SQL specification.

Dean Rasheed, reviewed by Pavel Stehule
2013-07-18 17:10:16 -04:00
Noah Misch
b560ec1b0d Implement the FILTER clause for aggregate function calls.
This is SQL-standard with a few extensions, namely support for
subqueries and outer references in clause expressions.

catversion bump due to change in Aggref and WindowFunc.

David Fetter, reviewed by Dean Rasheed.
2013-07-16 20:15:36 -04:00
Kevin Grittner
cc1965a99b Add support for REFRESH MATERIALIZED VIEW CONCURRENTLY.
This allows reads to continue without any blocking while a REFRESH
runs.  The new data appears atomically as part of transaction
commit.

Review questioned the Assert that a matview was not a system
relation.  This will be addressed separately.

Reviewed by Hitoshi Harada, Robert Haas, Andres Freund.
Merged after review with security patch f3ab5d4.
2013-07-16 12:55:44 -05:00
Noah Misch
448fee2e23 Make comments reflect that omission of SPI_gettypmod() is intentional. 2013-07-12 18:07:46 -04:00
Robert Haas
568d4138c6 Use an MVCC snapshot, rather than SnapshotNow, for catalog scans.
SnapshotNow scans have the undesirable property that, in the face of
concurrent updates, the scan can fail to see either the old or the new
versions of the row.  In many cases, we work around this by requiring
DDL operations to hold AccessExclusiveLock on the object being
modified; in some cases, the existing locking is inadequate and random
failures occur as a result.  This commit doesn't change anything
related to locking, but will hopefully pave the way to allowing lock
strength reductions in the future.

The major issue has held us back from making this change in the past
is that taking an MVCC snapshot is significantly more expensive than
using a static special snapshot such as SnapshotNow.  However, testing
of various worst-case scenarios reveals that this problem is not
severe except under fairly extreme workloads.  To mitigate those
problems, we avoid retaking the MVCC snapshot for each new scan;
instead, we take a new snapshot only when invalidation messages have
been processed.  The catcache machinery already requires that
invalidation messages be sent before releasing the related heavyweight
lock; else other backends might rely on locally-cached data rather
than scanning the catalog at all.  Thus, making snapshot reuse
dependent on the same guarantees shouldn't break anything that wasn't
already subtly broken.

Patch by me.  Review by Michael Paquier and Andres Freund.
2013-07-02 09:47:01 -04:00
Tom Lane
dc3eb56383 Improve updatability checking for views and foreign tables.
Extend the FDW API (which we already changed for 9.3) so that an FDW can
report whether specific foreign tables are insertable/updatable/deletable.
The default assumption continues to be that they're updatable if the
relevant executor callback function is supplied by the FDW, but finer
granularity is now possible.  As a test case, add an "updatable" option to
contrib/postgres_fdw.

This patch also fixes the information_schema views, which previously did
not think that foreign tables were ever updatable, and fixes
view_is_auto_updatable() so that a view on a foreign table can be
auto-updatable.

initdb forced due to changes in information_schema views and the functions
they rely on.  This is a bit unfortunate to do post-beta1, but if we don't
change this now then we'll have another API break for FDWs when we do
change it.

Dean Rasheed, somewhat editorialized on by Tom Lane
2013-06-12 17:53:33 -04:00
Bruce Momjian
9af4159fce pgindent run for release 9.3
This is the first run of the Perl-based pgindent script.  Also update
pgindent instructions.
2013-05-29 16:58:43 -04:00
Tom Lane
904af8db8a Fix handling of strict non-set functions with NULLs in set-valued inputs.
In a construct like "select plain_function(set_returning_function(...))",
the plain function is applied to each output row of the SRF successively.
If some of the SRF outputs are NULL, and the plain function is strict,
you'd expect to get NULL results for such rows ... but what actually
happened was that such rows were omitted entirely from the result set.
This was due to confusion of this case with what should happen for nested
set-returning functions; a strict SRF is indeed supposed to yield an empty
set for null input.  Per bug #8150 from Erwin Brandstetter.

Although this has been broken forever, we're not back-patching because
of the possibility that some apps out there expect the incorrect behavior.
This change should be listed as a possible incompatibility in the 9.3
release notes.
2013-05-12 13:08:12 -04:00
Tom Lane
f8db76e875 Editorialize a bit on new ProcessUtility() API.
Choose a saner ordering of parameters (adding a new input param after
the output params seemed a bit random), update the function's header
comment to match reality (cmon folks, is this really that hard?),
get rid of useless and sloppily-defined distinction between
PROCESS_UTILITY_SUBCOMMAND and PROCESS_UTILITY_GENERATED.
2013-04-28 00:18:45 -04:00
Tom Lane
5194024d72 Incidental cleanup of matviews code.
Move checking for unscannable matviews into ExecOpenScanRelation, which is
a better place for it first because the open relation is already available
(saving a relcache lookup cycle), and second because this eliminates the
problem of telling the difference between rangetable entries that will or
will not be scanned by the query.  In particular we can get rid of the
not-terribly-well-thought-out-or-implemented isResultRel field that the
initial matviews patch added to RangeTblEntry.

Also get rid of entirely unnecessary scannability check in the rewriter,
and a bogus decision about whether RefreshMatViewStmt requires a parse-time
snapshot.

catversion bump due to removal of a RangeTblEntry field, which changes
stored rules.
2013-04-27 17:48:57 -04:00
Kevin Grittner
63e20041a2 Fix assertion failure for REFRESH MATERIALIZED VIEW in PL.
This was due to incomplete implementation of rowcount reporting
for RMV, which was due to initial waffling on whether it should
be provided.  It seems unlikely to be a useful or universally
available  number as more sophisticated techniques for maintaining
matviews are added, so remove the partial support rather than
completing it.

Per report of Jeevan Chalke, but with a different fix
2013-04-24 08:39:06 -05:00
Peter Eisentraut
cc26ea9fe2 Clean up references to SQL92
In most cases, these were just references to the SQL standard in
general.  In a few cases, a contrast was made between SQL92 and later
standards -- those have been kept unchanged.
2013-04-20 11:04:41 -04:00
Tom Lane
6e481ebff6 Improve error message when an FDW doesn't support WHERE CURRENT OF.
If an FDW fails to take special measures with a CurrentOfExpr, we will
end up trying to execute it as an ordinary qual, which was being treated
as a purely internal failure condition.  Provide a more user-oriented
error message for such cases.
2013-04-19 16:14:56 -04:00
Robert Haas
f8a54e936b sepgsql: Enforce db_procedure:{execute} permission.
To do this, we add an additional object access hook type,
OAT_FUNCTION_EXECUTE.

KaiGai Kohei
2013-04-12 08:58:01 -04:00