This patch introduces two additional lock modes for tuples: "SELECT FOR
KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each
other, in contrast with already existing "SELECT FOR SHARE" and "SELECT
FOR UPDATE". UPDATE commands that do not modify the values stored in
the columns that are part of the key of the tuple now grab a SELECT FOR
NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently
with tuple locks of the FOR KEY SHARE variety.
Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this
means the concurrency improvement applies to them, which is the whole
point of this patch.
The added tuple lock semantics require some rejiggering of the multixact
module, so that the locking level that each transaction is holding can
be stored alongside its Xid. Also, multixacts now need to persist
across server restarts and crashes, because they can now represent not
only tuple locks, but also tuple updates. This means we need more
careful tracking of lifetime of pg_multixact SLRU files; since they now
persist longer, we require more infrastructure to figure out when they
can be removed. pg_upgrade also needs to be careful to copy
pg_multixact files over from the old server to the new, or at least part
of multixact.c state, depending on the versions of the old and new
servers.
Tuple time qualification rules (HeapTupleSatisfies routines) need to be
careful not to consider tuples with the "is multi" infomask bit set as
being only locked; they might need to look up MultiXact values (i.e.
possibly do pg_multixact I/O) to find out the Xid that updated a tuple,
whereas they previously were assured to only use information readily
available from the tuple header. This is considered acceptable, because
the extra I/O would involve cases that would previously cause some
commands to block waiting for concurrent transactions to finish.
Another important change is the fact that locking tuples that have
previously been updated causes the future versions to be marked as
locked, too; this is essential for correctness of foreign key checks.
This causes additional WAL-logging, also (there was previously a single
WAL record for a locked tuple; now there are as many as updated copies
of the tuple there exist.)
With all this in place, contention related to tuples being checked by
foreign key rules should be much reduced.
As a bonus, the old behavior that a subtransaction grabbing a stronger
tuple lock than the parent (sub)transaction held on a given tuple and
later aborting caused the weaker lock to be lost, has been fixed.
Many new spec files were added for isolation tester framework, to ensure
overall behavior is sane. There's probably room for several more tests.
There were several reviewers of this patch; in particular, Noah Misch
and Andres Freund spent considerable time in it. Original idea for the
patch came from Simon Riggs, after a problem report by Joel Jacobson.
Most code is from me, with contributions from Marti Raudsepp, Alexander
Shulgin, Noah Misch and Andres Freund.
This patch was discussed in several pgsql-hackers threads; the most
important start at the following message-ids:
AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com1290721684-sup-3951@alvh.no-ip.org1294953201-sup-2099@alvh.no-ip.org1320343602-sup-2290@alvh.no-ip.org1339690386-sup-8927@alvh.no-ip.org4FE5FF020200002500048A3D@gw.wicourts.gov4FEAB90A0200002500048B7D@gw.wicourts.gov
Extracted from a larger patch by Dimitri Fontaine. It is hoped that
this will provide infrastructure for enriching the new event trigger
functionality, but it seems possibly useful for other purposes as
well.
If we had not been holding buffer pin continuously since the tuple was
initially fetched by the UPDATE or DELETE query, it would be possible for
VACUUM or a page-prune operation to move the tuple while we're trying to
copy it. This would result in a garbage "old" tuple value being passed to
an AFTER ROW UPDATE or AFTER ROW DELETE trigger. The preconditions for
this are somewhat improbable, and the timing constraints are very tight;
so it's not so surprising that this hasn't been reported from the field,
even though the bug has been there a long time.
Problem found by Andres Freund. Back-patch to all active branches.
This prevents surprising behavior when a FOR EACH ROW trigger
BEFORE UPDATE or BEFORE DELETE directly or indirectly updates or
deletes the the old row. Prior to this patch the requested action
on the row could be silently ignored while all triggered actions
based on the occurence of the requested action could be committed.
One example of how this could happen is if the BEFORE DELETE
trigger for a "parent" row deleted "children" which had trigger
functions to update summary or status data on the parent.
This also prevents similar surprising problems if the query has a
volatile function which updates a target row while it is already
being updated.
There are related issues present in FOR UPDATE cursors and READ
COMMITTED queries which are not handled by this patch. These
issues need further evalution to determine what change, if any, is
needed.
Where the new error messages are generated, in most cases the best
fix will be to move code from the BEFORE trigger to an AFTER
trigger. Where this is not feasible, the trigger can avoid the
error by re-issuing the triggering statement and returning NULL.
Documentation changes will be submitted in a separate patch.
Kevin Grittner and Tom Lane with input from Florian Pflug and
Robert Haas, based on problems encountered during conversion of
Wisconsin Circuit Court trigger logic to plpgsql triggers.
This reduces unnecessary exposure of other headers through htup.h, which
is very widely included by many files.
I have chosen to move the function prototypes to the new file as well,
because that means htup.h no longer needs to include tupdesc.h. In
itself this doesn't have much effect in indirect inclusion of tupdesc.h
throughout the tree, because it's also required by execnodes.h; but it's
something to explore in the future, and it seemed best to do the htup.h
change now while I'm busy with it.
Formerly we relied on checking after-the-fact to see if an expression
contained aggregates, window functions, or sub-selects when it shouldn't.
This is grotty, easily forgotten (indeed, we had forgotten to teach
DefineIndex about rejecting window functions), and none too efficient
since it requires extra traversals of the parse tree. To improve matters,
define an enum type that classifies all SQL sub-expressions, store it in
ParseState to show what kind of expression we are currently parsing, and
make transformAggregateCall, transformWindowFuncCall, and transformSubLink
check the expression type and throw error if the type indicates the
construct is disallowed. This allows removal of a large number of ad-hoc
checks scattered around the code base. The enum type is sufficiently
fine-grained that we can still produce error messages of at least the
same specificity as before.
Bringing these error checks together revealed that we'd been none too
consistent about phrasing of the error messages, so standardize the wording
a bit.
Also, rewrite checking of aggregate arguments so that it requires only one
traversal of the arguments, rather than up to three as before.
In passing, clean up some more comments left over from add_missing_from
support, and annotate some tests that I think are dead code now that that's
gone. (I didn't risk actually removing said dead code, though.)
The code was setting it true for other constraints, which is
bogus. Doing so caused bogus catalog entries for such constraints, and
in particular caused an error to be raised when trying to drop a
constraint of types other than CHECK from a table that has children,
such as reported in bug #6712.
In 9.2, additionally ignore connoinherit=true for other constraint
types, to avoid having to force initdb; existing databases might already
contain bogus catalog entries.
Includes a catversion bump (in HEAD only).
Bug report from Miroslav Šulc
Analysis from Amit Kapila and Noah Misch; Amit also contributed the patch.
Commit 3855968f328918b6cd1401dd11d109d471a54d40 added syntax, pg_dump,
psql support, and documentation, but the triggers didn't actually fire.
With this commit, they now do. This is still a pretty basic facility
overall because event triggers do not get a whole lot of information
about what the user is trying to do unless you write them in C; and
there's still no option to fire them anywhere except at the very
beginning of the execution sequence, but it's better than nothing,
and a good building block for future work.
Along the way, add a regression test for ALTER LARGE OBJECT, since
testing of event triggers reveals that we haven't got one.
Dimitri Fontaine and Robert Haas
The latter was already the dominant use, and it's preferable because
in C the convention is that intXX means XX bits. Therefore, allowing
mixed use of int2, int4, int8, int16, int32 is obviously confusing.
Remove the typedefs for int2 and int4 for now. They don't seem to be
widely used outside of the PostgreSQL source tree, and the few uses
can probably be cleaned up by the time this ships.
During an update of a PK row, we can skip firing the RI trigger if any old
key value is NULL, because then the row could not have had any matching
rows in the FK table. Conversely, during an update of an FK row, the
outcome is determined if any new key value is NULL. In either case it
becomes unnecessary to compare individual key values.
This patch was inspired by discussion of Vik Reykja's patch to use IS NOT
DISTINCT semantics for the key comparisons. In the event there is no need
for that and so this patch looks nothing like his, but he should still get
credit for having re-opened consideration of the trigger skip logic.
Previously we followed the SQL92 wording, "MATCH <unspecified>", but since
SQL99 there's been a less awkward way to refer to the default style.
In addition to the code changes, pg_constraint.confmatchtype now stores
this match style as 's' (SIMPLE) rather than 'u' (UNSPECIFIED). This
doesn't affect pg_dump or psql because they use pg_get_constraintdef()
to reconstruct foreign key definitions. But other client-side code might
examine that column directly, so this change will have to be marked as
an incompatibility in the 9.3 release notes.
The original syntax wasn't universally loved, and it didn't allow its
usage in CREATE TABLE, only ALTER TABLE. It now works everywhere, and
it also allows using ALTER TABLE ONLY to add an uninherited CHECK
constraint, per discussion.
The pg_constraint column has accordingly been renamed connoinherit.
This commit partly reverts some of the changes in
61d81bd28dbec65a6b144e0cd3d0bfe25913c3ac, particularly some pg_dump and
psql bits, because now pg_get_constraintdef includes the necessary NO
INHERIT within the constraint definition.
Author: Nikhil Sontakke
Some tweaks by me
This allows loadable modules to get control at drop time, perhaps for the
purpose of performing additional security checks or to log the event.
The initial purpose of this code is to support sepgsql, but other
applications should be possible as well.
KaiGai Kohei, reviewed by me.
This check was overlooked when we added function execute permissions to the
system years ago. For an ordinary trigger function it's not a big deal,
since trigger functions execute with the permissions of the table owner,
so they couldn't do anything the user issuing the CREATE TRIGGER couldn't
have done anyway. However, if a trigger function is SECURITY DEFINER,
that is not the case. The lack of checking would allow another user to
install it on his own table and then invoke it with, essentially, forged
input data; which the trigger function is unlikely to realize, so it might
do something undesirable, for instance insert false entries in an audit log
table.
Reported by Dinesh Kumar, patch by Robert Haas
Security: CVE-2012-0866
This reports the depth level of triggers currently in execution, or zero
if not called from inside a trigger.
No catversion bump in this patch, but you have to initdb if you want
access to the new function.
Author: Kevin Grittner
This makes them enforceable only on the parent table, not on children
tables. This is useful in various situations, per discussion involving
people bitten by the restrictive behavior introduced in 8.4.
Message-Id:
8762mp93iw.fsf@comcast.netCAFaPBrSMMpubkGf4zcRL_YL-AERUbYF_-ZNNYfb3CVwwEqc9TQ@mail.gmail.com
Authors: Nikhil Sontakke, Alex Hunsaker
Reviewed by Robert Haas and myself
Previously, renaming a table, sequence, view, index, foreign table,
column, or trigger checked permissions before locking the object, which
meant that if permissions were revoked during the lock wait, we would
still allow the operation. Similarly, if the original object is dropped
and a new one with the same name is created, the operation will be allowed
if we had permissions on the old object; the permissions on the new
object don't matter. All this is now fixed.
Along the way, attempting to rename a trigger on a foreign table now gives
the same error message as trying to create one there in the first place
(i.e. that it's not a table or view) rather than simply stating that no
trigger by that name exists.
Patch by me; review by Noah Misch.
In the previous coding, callers were faced with an awkward choice:
look up the name, do permissions checks, and then lock the table; or
look up the name, lock the table, and then do permissions checks.
The first choice was wrong because the results of the name lookup
and permissions checks might be out-of-date by the time the table
lock was acquired, while the second allowed a user with no privileges
to interfere with access to a table by users who do have privileges
(e.g. if a malicious backend queues up for an AccessExclusiveLock on
a table on which AccessShareLock is already held, further attempts
to access the table will be blocked until the AccessExclusiveLock
is obtained and the malicious backend's transaction rolls back).
To fix, allow callers of RangeVarGetRelid() to pass a callback which
gets executed after performing the name lookup but before acquiring
the relation lock. If the name lookup is retried (because
invalidation messages are received), the callback will be re-executed
as well, so we get the best of both worlds. RangeVarGetRelid() is
renamed to RangeVarGetRelidExtended(); callers not wishing to supply
a callback can continue to invoke it as RangeVarGetRelid(), which is
now a macro. Since the only one caller that uses nowait = true now
passes a callback anyway, the RangeVarGetRelid() macro defaults nowait
as well. The callback can also be used for supplemental locking - for
example, REINDEX INDEX needs to acquire the table lock before the index
lock to reduce deadlock possibilities.
There's a lot more work to be done here to fix all the cases where this
can be a problem, but this commit provides the general infrastructure
and fixes the following specific cases: REINDEX INDEX, REINDEX TABLE,
LOCK TABLE, and and DROP TABLE/INDEX/SEQUENCE/VIEW/FOREIGN TABLE.
Per discussion with Noah Misch and Alvaro Herrera.
This gets rid of an impressive amount of duplicative code, with only
minimal behavior changes. DROP FOREIGN DATA WRAPPER now requires object
ownership rather than superuser privileges, matching the documentation
we already have. We also eliminate the historical warning about dropping
a built-in function as unuseful. All operations are now performed in the
same order for all object types handled by dropcmds.c.
KaiGai Kohei, with minor revisions by me
Turns out that use of ShareUpdateExclusiveLock or ShareRowExclusiveLock
to protect DDL changes had gotten copied into several places that were
not touched by either of Simon's original patches for the feature, and
thus neither he nor I thought to revert them. (Indeed, it appears that
two of these uses were committed *after* the reversion, which just goes
to show that git merging is no panacea.) Change these places to use
AccessExclusiveLock again. If we ever manage to resurrect that feature,
we're going to have to think a bit harder about how to keep lock level
usage in sync for DDL operations that aren't within the AlterTable
infrastructure.
Two of these bugs are only in HEAD, but one is in the 9.1 branch too.
Alvaro found one of them, I found the other two.
This commit changes index-only scans so that data is read directly from the
index tuple without first generating a faux heap tuple. The only immediate
benefit is that indexes on system columns (such as OID) can be used in
index-only scans, but this is necessary infrastructure if we are ever to
support index-only scans on expression indexes. The executor is now ready
for that, though the planner still needs substantial work to recognize
the possibility.
To do this, Vars in index-only plan nodes have to refer to index columns
not heap columns. I introduced a new special varno, INDEX_VAR, to mark
such Vars to avoid confusion. (In passing, this commit renames the two
existing special varnos to OUTER_VAR and INNER_VAR.) This allows
ruleutils.c to handle them with logic similar to what we use for subplan
reference Vars.
Since index-only scans are now fundamentally different from regular
indexscans so far as their expression subtrees are concerned, I also chose
to change them to have their own plan node type (and hence, their own
executor source file).
This provides information about the numbers of tuples that were visited
but not returned by table scans, as well as the numbers of join tuples
that were considered and discarded within a join plan node.
There is still some discussion going on about the best way to report counts
for outer-join situations, but I think most of what's in the patch would
not change if we revise that, so I'm going to go ahead and commit it as-is.
Documentation changes to follow (they weren't in the submitted patch
either).
Marko Tiikkaja, reviewed by Marc Cousin, somewhat revised by Tom
Due to tuple-slot mismanagement, evaluation of WHEN conditions for AFTER
ROW UPDATE triggers could crash if there had been a BEFORE ROW trigger
fired for the same update. Fix by not trying to overload the use of
estate->es_trig_tuple_slot. Per report from Yoran Heling.
Back-patch to 9.0, when trigger WHEN conditions were introduced.
There may be some other places where we should use errdetail_internal,
but they'll have to be evaluated case-by-case. This commit just hits
a bunch of places where invoking gettext is obviously a waste of cycles.
Regular aggregate functions in combination with, or within the arguments
of, window functions are OK per spec; they have the semantics that the
aggregate output rows are computed and then we run the window functions
over that row set. (Thus, this combination is not really useful unless
there's a GROUP BY so that more than one aggregate output row is possible.)
The case without GROUP BY could fail, as recently reported by Jeff Davis,
because sloppy construction of the Agg node's targetlist resulted in extra
references to possibly-ungrouped Vars appearing outside the aggregate
function calls themselves. See the added regression test case for an
example.
Fixing this requires modifying the API of flatten_tlist and its underlying
function pull_var_clause. I chose to make pull_var_clause's API for
aggregates identical to what it was already doing for placeholders, since
the useful behaviors turn out to be the same (error, report node as-is, or
recurse into it). I also tightened the error checking in this area a bit:
if it was ever valid to see an uplevel Var, Aggref, or PlaceHolderVar here,
that was a long time ago, so complain instead of ignoring them.
Backpatch into 9.1. The failure exists in 8.4 and 9.0 as well, but seeing
that it only occurs in a basically-useless corner case, it doesn't seem
worth the risks of changing a function API in a minor release. There might
be third-party code using pull_var_clause.
In the previous coding, we would look up a relation in RangeVarGetRelid,
lock the resulting OID, and then AcceptInvalidationMessages(). While
this was sufficient to ensure that we noticed any changes to the
relation definition before building the relcache entry, it didn't
handle the possibility that the name we looked up no longer referenced
the same OID. This was particularly problematic in the case where a
table had been dropped and recreated: we'd latch on to the entry for
the old relation and fail later on. Now, we acquire the relation lock
inside RangeVarGetRelid, and retry the name lookup if we notice that
invalidation messages have been processed meanwhile. Many operations
that would previously have failed with an error in the presence of
concurrent DDL will now succeed.
There is a good deal of work remaining to be done here: many callers
of RangeVarGetRelid still pass NoLock for one reason or another. In
addition, nothing in this patch guards against the possibility that
the meaning of an unqualified name might change due to the creation
of a relation in a schema earlier in the user's search path than the
one where it was previously found. Furthermore, there's nothing at
all here to guard against similar race conditions for non-relations.
For all that, it's a start.
Noah Misch and Robert Haas
The initial commit of the ALTER TABLE ADD FOREIGN KEY NOT VALID feature
failed to support labeling such constraints as deferrable. The best fix
for this seems to be to fold NOT VALID into ConstraintAttributeSpec.
That's a bit more general than the documented syntax, but it allows
better-targeted syntax error messages.
In addition, do some mostly-but-not-entirely-cosmetic code review for
the whole NOT VALID patch.
Since collation is effectively an argument, not a property of the function,
FmgrInfo is really the wrong place for it; and this becomes critical in
cases where a cached FmgrInfo is used for varying purposes that might need
different collation settings. Fix by passing it in FunctionCallInfoData
instead. In particular this allows a clean fix for bug #5970 (record_cmp
not working). This requires touching a bit more code than the original
method, but nobody ever thought that collations would not be an invasive
patch...
Previous patches took care of assorted places that call transformExpr from
outside the main parser, but I overlooked the fact that some places use
transformWhereClause as a shortcut for transformExpr + coerce_to_boolean.
In particular this broke collation-sensitive index WHERE clauses, as per
report from Thom Brown. Trigger WHEN and rule WHERE clauses too.
I'm not forcing initdb for this fix, but any affected indexes, triggers,
or rules will need to be dropped and recreated.
The originally committed patch for modifying CTEs didn't interact well
with EXPLAIN, as noted by myself, and also had corner-case problems with
triggers, as noted by Dean Rasheed. Those problems show it is really not
practical for ExecutorEnd to call any user-defined code; so split the
cleanup duties out into a new function ExecutorFinish, which must be called
between the last ExecutorRun call and ExecutorEnd. Some Asserts have been
added to these functions to help verify correct usage.
It is no longer necessary for callers of the executor to call
AfterTriggerBeginQuery/AfterTriggerEndQuery for themselves, as this is now
done by ExecutorStart/ExecutorFinish respectively. If you really need to
suppress that and do it for yourself, pass EXEC_FLAG_SKIP_TRIGGERS to
ExecutorStart.
Also, refactor portal commit processing to allow for the possibility that
PortalDrop will invoke user-defined code. I think this is not actually
necessary just yet, since the portal-execution-strategy logic forces any
non-pure-SELECT query to be run to completion before we will consider
committing. But it seems like good future-proofing.
ExecUpdate checked for whether ExecBRUpdateTriggers had returned a new
tuple value by seeing if the returned tuple was pointer-equal to the old
one. But the "old one" was in estate->es_junkFilter's result slot, which
would be scribbled on if we had done an EvalPlanQual update in response to
a concurrent update of the target tuple; therefore we were comparing a
dangling pointer to a live one. Given the right set of circumstances we
could get a false match, resulting in not forcing the tuple to be stored in
the slot we thought it was stored in. In the case reported by Maxim Boguk
in bug #5798, this led to "cannot extract system attribute from virtual
tuple" failures when trying to do "RETURNING ctid". I believe there is a
very-low-probability chance of more serious errors, such as generating
incorrect index entries based on the original rather than the
trigger-modified version of the row.
In HEAD, change all of ExecBRInsertTriggers, ExecIRInsertTriggers,
ExecBRUpdateTriggers, and ExecIRUpdateTriggers so that they continue to
have similar APIs. In the back branches I just changed
ExecBRUpdateTriggers, since there is no bug in the ExecBRInsertTriggers
case.
FK constraints that are marked NOT VALID may later be VALIDATED, which uses an
ShareUpdateExclusiveLock on constraint table and RowShareLock on referenced
table. Significantly reduces lock strength and duration when adding FKs.
New state visible from psql.
Simon Riggs, with reviews from Marko Tiikkaja and Robert Haas
After a SQL object is created, we provide an opportunity for security
or logging plugins to get control; for example, a security label provider
could use this to assign an initial security label to newly created
objects. The basic infrastructure is (hopefully) reusable for other types
of events that might require similar treatment.
KaiGai Kohei, with minor adjustments.
This patch adds the SQL-standard concept of an INSTEAD OF trigger, which
is fired instead of performing a physical insert/update/delete. The
trigger function is passed the entire old and/or new rows of the view,
and must figure out what to do to the underlying tables to implement
the update. So this feature can be used to implement updatable views
using trigger programming style rather than rule hacking.
In passing, this patch corrects the names of some columns in the
information_schema.triggers view. It seems the SQL committee renamed
them somewhere between SQL:99 and SQL:2003.
Dean Rasheed, reviewed by Bernd Helmle; some additional hacking by me.
transaction snapshots, i.e. a snapshot registered at the beginning of
a transaction. Change variable naming and comments to reflect this reality
in preparation for a future, truly serializable mode, e.g.
Serializable Snapshot Isolation (SSI).
For the moment transaction snapshots are still used to implement
SERIALIZABLE, but hopefully not for too much longer. Patch by Kevin
Grittner and Dan Ports with review and some minor wording changes by me.
afterTriggerInvokeEvents failed to adjust events->tailfree when truncating
the last chunk of an event list. This could result in the data being
"de-truncated" by afterTriggerRestoreEventList during a subsequent
subtransaction abort. Even that wouldn't kill us, because the re-added data
would just be events marked DONE --- unless the data had been partially
overwritten by new events. Then we might crash, or in any case misbehave
(perhaps fire triggers twice, or fire triggers with the wrong event data).
Per bug #5622 from Thue Janus Kristensen.
Back-patch to 8.4 where the current trigger list representation was introduced.
- Rename TSParserGetPrsid to get_ts_parser_oid.
- Rename TSDictionaryGetDictid to get_ts_dict_oid.
- Rename TSTemplateGetTmplid to get_ts_template_oid.
- Rename TSConfigGetCfgid to get_ts_config_oid.
- Rename FindConversionByName to get_conversion_oid.
- Rename GetConstraintName to get_constraint_oid.
- Add new functions get_opclass_oid, get_opfamily_oid, get_rewrite_oid,
get_rewrite_oid_without_relid, get_trigger_oid, and get_cast_oid.
The name of each function matches the corresponding catalog.
Thanks to KaiGai Kohei for the review.
Avoid hard-coding lockmode used for many altering DDL commands, allowing easier
future changes of lock levels. Implementation of initial analysis on DDL
sub-commands, so that many lock levels are now at ShareUpdateExclusiveLock or
ShareRowExclusiveLock, allowing certain DDL not to block reads/writes.
First of number of planned changes in this area; additional docs required
when full project complete.
The purpose of this change is to eliminate the need for every caller
of SearchSysCache, SearchSysCacheCopy, SearchSysCacheExists,
GetSysCacheOid, and SearchSysCacheList to know the maximum number
of allowable keys for a syscache entry (currently 4). This will
make it far easier to increase the maximum number of keys in a
future release should we choose to do so, and it makes the code
shorter, too.
Design and review by Tom Lane.
AbortTransaction or AbortSubTransaction, when trying to clean up after an
error that prevented (sub)transaction start from completing:
* access to TopTransactionResourceOwner that might not exist
* assert failure in AtEOXact_GUC, if AtStart_GUC not called yet
* assert failure or core dump in AfterTriggerEndSubXact, if
AfterTriggerBeginSubXact not called yet
Per testing by injecting elog(ERROR) at successive steps in StartTransaction
and StartSubTransaction. It's not clear whether all of these cases could
really occur in the field, but at least one of them is easily exposed by
simple stress testing, as per my accidental discovery yesterday.
pg_constraint before searching pg_trigger. This allows saner handling of
corner cases; in particular we now say "constraint is not deferrable"
rather than "constraint does not exist" when the command is applied to
a constraint that's inherently non-deferrable. Per a gripe several months
ago from hubert depesz lubaczewski.
To make this work without breaking user-defined constraint triggers,
we have to add entries for them to pg_constraint. However, in return
we can remove the pgconstrname column from pg_constraint, which represents
a fairly sizable space savings. I also replaced the tgisconstraint column
with tgisinternal; the old meaning of tgisconstraint can now be had by
testing for nonzero tgconstraint, while there is no other way to get
the old meaning of nonzero tgconstraint, namely that the trigger was
internally generated rather than being user-created.
In passing, fix an old misstatement in the docs and comments, namely that
pg_trigger.tgdeferrable is exactly redundant with pg_constraint.condeferrable.
Actually, we mark RI action triggers as nondeferrable even when they belong to
a nominally deferrable FK constraint. The SET CONSTRAINTS code now relies on
that instead of hard-coding a list of exception OIDs.