The changed routines are mostly those that can be directly called by
ProcessUtilitySlow; the intention is to make the affected object
information more precise, in support for future event trigger changes.
Originally it was envisioned that the OID of the affected object would
be enough, and in most cases that is correct, but upon actually
implementing the event trigger changes it turned out that ObjectAddress
is more widely useful.
Additionally, some command execution routines grew an output argument
that's an object address which provides further info about the executed
command. To wit:
* for ALTER DOMAIN / ADD CONSTRAINT, it corresponds to the address of
the new constraint
* for ALTER OBJECT / SET SCHEMA, it corresponds to the address of the
schema that originally contained the object.
* for ALTER EXTENSION {ADD, DROP} OBJECT, it corresponds to the address
of the object added to or dropped from the extension.
There's no user-visible change in this commit, and no functional change
either.
Discussion: 20150218213255.GC6717@tamriel.snowman.net
Reviewed-By: Stephen Frost, Andres Freund
Previously, we cached domain constraints for the life of a query, or
really for the life of the FmgrInfo struct that was used to invoke
domain_in() or domain_check(). But plpgsql (and probably other places)
are set up to cache such FmgrInfos for the whole lifespan of a session,
which meant they could be enforcing really stale sets of constraints.
On the other hand, searching pg_constraint once per query gets kind of
expensive too: testing says that as much as half the runtime of a
trivial query such as "SELECT 0::domaintype" went into that.
To fix this, delegate the responsibility for tracking a domain's
constraints to the typcache, which has the infrastructure needed to
detect syscache invalidation events that signal possible changes.
This not only removes unnecessary repeat reads of pg_constraint,
but ensures that we never apply stale constraint data: whatever we
use is the current data according to syscache rules.
Unfortunately, the current configuration of the system catalogs means
we have to flush cached domain-constraint data whenever either pg_type
or pg_constraint changes, which happens rather a lot (eg, creation or
deletion of a temp table will do it). It might be worth rearranging
things to split pg_constraint into two catalogs, of which the domain
constraint one would probably be very low-traffic. That's a job for
another patch though, and in any case this patch should improve matters
materially even with that handicap.
This patch makes use of the recently-added memory context reset callback
feature to manage the lifespan of domain constraint caches, so that we
don't risk deleting a cache that might be in the midst of evaluation.
Although this is a bug fix as well as a performance improvement, no
back-patch. There haven't been many if any field complaints about
stale domain constraint checks, so it doesn't seem worth taking the
risk of modifying data structures as basic as MemoryContexts in back
branches.
The way that columns are added to a view is by calling
AlterTableInternal with special subtype AT_AddColumnToView; but that
subtype is changed to AT_AddColumnRecurse by ATPrepAddColumn. This has
no visible effect in the current code, since views cannot have
inheritance children (thus the recursion step is a no-op) and adding a
column to a view is executed identically to doing it to a table; but it
does make a difference for future event trigger code keeping track of
commands, because the current situation leads to confusing the case with
a normal ALTER TABLE ADD COLUMN.
Fix the problem by passing a flag to ATPrepAddColumn to prevent it from
changing the command subtype. The event trigger code can then properly
ignore the subcommand. (We could remove the call to ATPrepAddColumn,
since views are never typed, and there is never a need for recursion,
which are the two conditions that are checked by ATPrepAddColumn; but it
seems more future-proof to keep the call in place.)
transformExpr() has for many years had provisions to do nothing when
applied to an already-transformed expression tree. However, this was
always ugly and of dubious reliability, so we'd be much better off without
it. The primary historical reason for it was that gram.y sometimes
returned multiple links to the same subexpression, which is no longer true
as of my BETWEEN fixes. We'd also grown some lazy hacks in CREATE TABLE
LIKE (failing to distinguish between raw and already-transformed index
specifications) and one or two other places.
This patch removes the need for and support for re-transforming already
transformed expressions. The index case is dealt with by adding a flag
to struct IndexStmt to indicate that it's already been transformed;
which has some benefit anyway in that tablecmds.c can now Assert that
transformation has happened rather than just assuming. The other main
reason was some rather sloppy code for array type coercion, which can
be fixed (and its performance improved too) by refactoring.
I did leave transformJoinUsingClause() still constructing expressions
containing untransformed operator nodes being applied to Vars, so that
transformExpr() still has to allow Var inputs. But that's a much narrower,
and safer, special case than before, since Vars will never appear in a raw
parse tree, and they don't have any substructure to worry about.
In passing fix some oversights in the patch that added CREATE INDEX
IF NOT EXISTS (missing processing of IndexStmt.if_not_exists). These
appear relatively harmless, but still sloppy coding practice.
This works by keeping a per-subtransaction record of the ins/upd/del
counters before the truncate, and then resetting them; this record is
useful to return to the previous state in case the truncate is rolled
back, either in a subtransaction or whole transaction. The state is
propagated upwards as subtransactions commit.
When the per-table data is sent to the stats collector, a flag indicates
to reset the live/dead counters to zero as well.
Catalog version bumped due to the change in pgstat format.
Author: Alexander Shulgin
Discussion: 1007.1207238291@sss.pgh.pa.us
Discussion: 548F7D38.2000401@BlueTreble.com
Reviewed-by: Álvaro Herrera, Jim Nasby
Apart from enabling comments on domain constraints, this enables a
future project to replicate object dropping to remote servers: with the
current mechanism there's no way to distinguish between the two types of
constraints, so there's no way to know what to drop.
Also added support for the domain constraint comments in psql's \dd and
pg_dump.
Catalog version bumped due to the change in ObjectType enum.
As with NOT NULL constraints, we consider that such constraints are merely
reports of constraints that are being enforced by the remote server (or
other underlying storage mechanism). Their only real use is to allow
planner optimizations, for example in constraint-exclusion checks. Thus,
the code changes here amount to little more than removal of the error that
was formerly thrown for applying CHECK to a foreign table.
(In passing, do a bit of cleanup of the ALTER FOREIGN TABLE reference page,
which had accumulated some weird decisions about ordering etc.)
Shigeru Hanada and Etsuro Fujita, reviewed by Kyotaro Horiguchi and
Ashutosh Bapat.
Generate a table_rewrite event when ALTER TABLE
attempts to rewrite a table. Provide helper
functions to identify table and reason.
Intended use case is to help assess or to react
to schema changes that might hold exclusive locks
for long periods.
Dimitri Fontaine, triggering an edit by Simon Riggs
Reviewed in detail by Michael Paquier
As pointed out by Robert, we should really have named pg_rowsecurity
pg_policy, as the objects stored in that catalog are policies. This
patch fixes that and updates the column names to start with 'pol' to
match the new catalog name.
The security consideration for COPY with row level security, also
pointed out by Robert, has also been addressed by remembering and
re-checking the OID of the relation initially referenced during COPY
processing, to make sure it hasn't changed under us by the time we
finish planning out the query which has been built.
Robert and Alvaro also commented on missing OCLASS and OBJECT entries
for POLICY (formerly ROWSECURITY or POLICY, depending) in various
places. This patch fixes that too, which also happens to add the
ability to COMMENT on policies.
In passing, attempt to improve the consistency of messages, comments,
and documentation as well. This removes various incarnations of
'row-security', 'row-level security', 'Row-security', etc, in favor
of 'policy', 'row level security' or 'row_security' as appropriate.
Happy Thanksgiving!
Buildfarm members with CLOBBER_CACHE_ALWAYS advised us that commit
85b506bbfc2937 was mistaken in setting the relpersistence value of the
index directly in the relcache entry, within reindex_index. The reason
for the failure is that an invalidation message that comes after mucking
with the relcache entry directly, but before writing it to the catalogs,
would cause the entry to become rebuilt in place from catalogs with the
old contents, losing the update.
Fix by passing the correct persistence value to
RelationSetNewRelfilenode instead; this routine also writes the updated
tuple to pg_class, avoiding the problem. Suggested by Tom Lane.
This removes ATChangeIndexesPersistence() introduced by f41872d0c1239d36
which was too ugly to live for long. Instead, the correct persistence
marking is passed all the way down to reindex_index, so that the
transient relation built to contain the index relfilenode can
get marked correctly right from the start.
Author: Fabrízio de Royes Mello
Review and editorialization by Michael Paquier
and Álvaro Herrera
xlog.c is huge, this makes it a little bit smaller, which is nice. Functions
related to putting together the WAL record are in xloginsert.c, and the
lower level stuff for managing WAL buffers and such are in xlog.c.
Also move the definition of XLogRecord to a separate header file. This
causes churn in the #includes of all the files that write WAL records, and
redo routines, but it avoids pulling in xlog.h into most places.
Reviewed by Michael Paquier, Alvaro Herrera, Andres Freund and Amit Kapila.
The new header contains many prototypes for functions in ruleutils.c
that are not exposed to the SQL level.
Reviewed by Andres Freund and Michael Paquier.
Buildfarm member tick identified an issue where the policies in the
relcache for a relation were were being replaced underneath a running
query, leading to segfaults while processing the policies to be added
to a query. Similar to how TupleDesc RuleLocks are handled, add in a
equalRSDesc() function to check if the policies have actually changed
and, if not, swap back the rsdesc field (using the original instead of
the temporairly built one; the whole structure is swapped and then
specific fields swapped back). This now passes a CLOBBER_CACHE_ALWAYS
for me and should resolve the buildfarm error.
In addition to addressing this, add a new chapter in Data Definition
under Privileges which explains row security and provides examples of
its usage, change \d to always list policies (even if row security is
disabled- but note that it is disabled, or enabled with no policies),
rework check_role_for_policy (it really didn't need the entire policy,
but it did need to be using has_privs_of_role()), and change the field
in pg_class to relrowsecurity from relhasrowsecurity, based on
Heikki's suggestion. Also from Heikki, only issue SET ROW_SECURITY in
pg_restore when talking to a 9.5+ server, list Bypass RLS in \du, and
document --enable-row-security options for pg_dump and pg_restore.
Lastly, fix a number of minor whitespace and typo issues from Heikki,
Dimitri, add a missing #include, per Peter E, fix a few minor
variable-assigned-but-not-used and resource leak issues from Coverity
and add tab completion for role attribute bypassrls as well.
Building on the updatable security-barrier views work, add the
ability to define policies on tables to limit the set of rows
which are returned from a query and which are allowed to be added
to a table. Expressions defined by the policy for filtering are
added to the security barrier quals of the query, while expressions
defined to check records being added to a table are added to the
with-check options of the query.
New top-level commands are CREATE/ALTER/DROP POLICY and are
controlled by the table owner. Row Security is able to be enabled
and disabled by the owner on a per-table basis using
ALTER TABLE .. ENABLE/DISABLE ROW SECURITY.
Per discussion, ROW SECURITY is disabled on tables by default and
must be enabled for policies on the table to be used. If no
policies exist on a table with ROW SECURITY enabled, a default-deny
policy is used and no records will be visible.
By default, row security is applied at all times except for the
table owner and the superuser. A new GUC, row_security, is added
which can be set to ON, OFF, or FORCE. When set to FORCE, row
security will be applied even for the table owner and superusers.
When set to OFF, row security will be disabled when allowed and an
error will be thrown if the user does not have rights to bypass row
security.
Per discussion, pg_dump sets row_security = OFF by default to ensure
that exports and backups will have all data in the table or will
error if there are insufficient privileges to bypass row security.
A new option has been added to pg_dump, --enable-row-security, to
ask pg_dump to export with row security enabled.
A new role capability, BYPASSRLS, which can only be set by the
superuser, is added to allow other users to be able to bypass row
security using row_security = OFF.
Many thanks to the various individuals who have helped with the
design, particularly Robert Haas for his feedback.
Authors include Craig Ringer, KaiGai Kohei, Adam Brightwell, Dean
Rasheed, with additional changes and rework by me.
Reviewers have included all of the above, Greg Smith,
Jeff McCormick, and Robert Haas.
Add a succint comment explaining why it's correct to change the
persistence in this way. Also s/loggedness/persistence/ because native
speakers didn't like the latter term.
Fabrízio and Álvaro
This enables changing permanent (logged) tables to unlogged and
vice-versa.
(Docs for ALTER TABLE / SET TABLESPACE got shuffled in an order that
hopefully makes more sense than the original.)
Author: Fabrízio de Royes Mello
Reviewed by: Christoph Berg, Andres Freund, Thom Brown
Some tweaking by Álvaro Herrera
As 'ALTER TABLESPACE .. MOVE ALL' really didn't change the tablespace
but instead changed objects inside tablespaces, it made sense to
rework the syntax and supporting functions to operate under the
'ALTER (TABLE|INDEX|MATERIALIZED VIEW)' syntax and to be in
tablecmds.c.
Pointed out by Alvaro, who also suggested the new syntax.
Back-patch to 9.4.
Such cases are disallowed by the SQL spec, and even if we wanted to allow
them, the semantics seem ambiguous: how should the FK columns be matched up
with the columns of a unique index? (The matching could be significant in
the presence of opclasses with different notions of equality, so this issue
isn't just academic.) However, our code did not previously reject such
cases, but instead would either fail to match to any unique index, or
generate a bizarre opclass-lookup error because of sloppy thinking in the
index-matching code.
David Rowley
log_newpage is used by many indexams, in addition to heap, but for
historical reasons it's always been part of the heapam rmgr. Starting with
9.3, we have another WAL record type for logging an image of a page,
XLOG_FPI. Simplify things by moving log_newpage and log_newpage_buffer to
xlog.c, and switch to using the XLOG_FPI record type.
Bump the WAL version number because the code to replay the old HEAP_NEWPAGE
records is removed.
The syntax doesn't let you specify "WITH OIDS" for foreign tables, but it
was still possible with default_with_oids=true. But the rest of the system,
including pg_dump, isn't prepared to handle foreign tables with OIDs
properly.
Backpatch down to 9.1, where foreign tables were introduced. It's possible
that there are databases out there that already have foreign tables with
OIDs. There isn't much we can do about that, but at least we can prevent
them from being created in the future.
Patch by Etsuro Fujita, reviewed by Hadi Moshayedi.
Commit a73018392636ce832b09b5c31f6ad1f18a4643ea created rather a mess by
putting dependencies on backend-only include files into include/common.
We really shouldn't do that. To clean it up:
* Move TABLESPACE_VERSION_DIRECTORY back to its longtime home in
catalog/catalog.h. We won't consider this symbol part of the FE/BE API.
* Push enum ForkNumber from relfilenode.h into relpath.h. We'll consider
relpath.h as the source of truth for fork numbers, since relpath.c was
already partially serving that function, and anyway relfilenode.h was
kind of a random place for that enum.
* So, relfilenode.h now includes relpath.h rather than vice-versa. This
direction of dependency is fine. (That allows most, but not quite all,
of the existing explicit #includes of relpath.h to go away again.)
* Push forkname_to_number from catalog.c to relpath.c, just to centralize
fork number stuff a bit better.
* Push GetDatabasePath from catalog.c to relpath.c; it was rather odd
that the previous commit didn't keep this together with relpath().
* To avoid needing relfilenode.h in common/, redefine the underlying
function (now called GetRelationPath) as taking separate OID arguments,
and make the APIs using RelFileNode or RelFileNodeBackend into macro
wrappers. (The macros have a potential multiple-eval risk, but none of
the existing call sites have an issue with that; one of them had such a
risk already anyway.)
* Fix failure to follow the directions when "init" fork type was added;
specifically, the errhint in forkname_to_number wasn't updated, and neither
was the SGML documentation for pg_relation_size().
* Fix tablespace-path-too-long check in CreateTableSpace() to account for
fork-name component of maximum-length pathnames. This requires putting
FORKNAMECHARS into a header file, but it was rather useless (and
actually unreferenced) where it was.
The last couple of items are potentially back-patchable bug fixes,
if anyone is sufficiently excited about them; but personally I'm not.
Per a gripe from Christoph Berg about how include/common wasn't
self-contained.
Views which are marked as security_barrier must have their quals
applied before any user-defined quals are called, to prevent
user-defined functions from being able to see rows which the
security barrier view is intended to prevent them from seeing.
Remove the restriction on security barrier views being automatically
updatable by adding a new securityQuals list to the RTE structure
which keeps track of the quals from security barrier views at each
level, independently of the user-supplied quals. When RTEs are
later discovered which have securityQuals populated, they are turned
into subquery RTEs which are marked as security_barrier to prevent
any user-supplied quals being pushed down (modulo LEAKPROOF quals).
Dean Rasheed, reviewed by Craig Ringer, Simon Riggs, KaiGai Kohei
VALIDATE CONSTRAINT
CLUSTER ON
SET WITHOUT CLUSTER
ALTER COLUMN SET STATISTICS
ALTER COLUMN SET ()
ALTER COLUMN RESET ()
All other sub-commands use AccessExclusiveLock
Simon Riggs and Noah Misch
Reviews by Robert Haas and Andres Freund
This covers all the SQL-standard trigger types supported for regular
tables; it does not cover constraint triggers. The approach for
acquiring the old row mirrors that for view INSTEAD OF triggers. For
AFTER ROW triggers, we spool the foreign tuples to a tuplestore.
This changes the FDW API contract; when deciding which columns to
populate in the slot returned from data modification callbacks, writable
FDWs will need to check for AFTER ROW triggers in addition to checking
for a RETURNING clause.
In support of the feature addition, refactor the TriggerFlags bits and
the assembly of old tuples in ModifyTable.
Ronan Dunklau, reviewed by KaiGai Kohei; some additional hacking by me.
If the name lookups come to different conclusions due to concurrent
activity, we might perform some parts of the DDL on a different table
than other parts. At least in the case of CREATE INDEX, this can be
used to cause the permissions checks to be performed against a
different table than the index creation, allowing for a privilege
escalation attack.
This changes the calling convention for DefineIndex, CreateTrigger,
transformIndexStmt, transformAlterTableStmt, CheckIndexCompatible
(in 9.2 and newer), and AlterTable (in 9.1 and older). In addition,
CheckRelationOwnership is removed in 9.2 and newer and the calling
convention is changed in older branches. A field has also been added
to the Constraint node (FkConstraint in 8.4). Third-party code calling
these functions or using the Constraint node will require updating.
Report by Andres Freund. Patch by Robert Haas and Andres Freund,
reviewed by Tom Lane.
Security: CVE-2014-0062
Some cases were still reporting errors and aborting, instead of a NOTICE
that the object was being skipped. This makes it more difficult to
cleanly handle pg_dump --clean, so change that to instead skip missing
objects properly.
Per bug #7873 reported by Dave Rolsky; apparently this affects a large
number of users.
Authors: Pavel Stehule and Dean Rasheed. Some tweaks by Álvaro Herrera
We've always allowed CREATE TABLE to create tables in the database's default
tablespace without checking for CREATE permissions on that tablespace.
Unfortunately, the original implementation of ALTER TABLE ... SET TABLESPACE
didn't pick up on that exception.
This changes ALTER TABLE ... SET TABLESPACE to allow the database's default
tablespace without checking for CREATE rights on that tablespace, just as
CREATE TABLE works today. Users could always do this through a series of
commands (CREATE TABLE ... AS SELECT * FROM ...; DROP TABLE ...; etc), so
let's fix the oversight in SET TABLESPACE's original implementation.
When this reloption is set and wal_level=logical is configured,
we'll record the CIDs stamped by inserts, updates, and deletes to
the table just as we would for an actual catalog table. This will
allow logical decoding to use historical MVCC snapshots to access
such tables just as they access ordinary catalog tables.
Replication solutions built around the logical decoding machinery
will likely need to set this operation for their configuration
tables; it might also be needed by extensions which perform table
access in their output functions.
Andres Freund, reviewed by myself and others.
Although user-defined relations can't be directly created in
pg_catalog, it's possible for them to end up there, because you can
create them in some other schema and then use ALTER TABLE .. SET SCHEMA
to move them there. Previously, such relations couldn't afterwards
be manipulated, because IsSystemRelation()/IsSystemClass() rejected
all attempts to modify objects in the pg_catalog schema, regardless
of their origin. With this patch, they now reject only those
objects in pg_catalog which were created at initdb-time, allowing
most operations on user-created tables in pg_catalog to proceed
normally.
This patch also adds new functions IsCatalogRelation() and
IsCatalogClass(), which is similar to IsSystemRelation() and
IsSystemClass() but with a slightly narrower definition: only TOAST
tables of system catalogs are included, rather than *all* TOAST tables.
This is currently used only for making decisions about when
invalidation messages need to be sent, but upcoming logical decoding
patches will find other uses for this information.
Andres Freund, with some modifications by me.
This patch adds the ability to write TABLE( function1(), function2(), ...)
as a single FROM-clause entry. The result is the concatenation of the
first row from each function, followed by the second row from each
function, etc; with NULLs inserted if any function produces fewer rows than
others. This is believed to be a much more useful behavior than what
Postgres currently does with multiple SRFs in a SELECT list.
This syntax also provides a reasonable way to combine use of column
definition lists with WITH ORDINALITY: put the column definition list
inside TABLE(), where it's clear that it doesn't control the ordinality
column as well.
Also implement SQL-compliant multiple-argument UNNEST(), by turning
UNNEST(a,b,c) into TABLE(unnest(a), unnest(b), unnest(c)).
The SQL standard specifies TABLE() with only a single function, not
multiple functions, and it seems to require an implicit UNNEST() which is
not what this patch does. There may be something wrong with that reading
of the spec, though, because if it's right then the spec's TABLE() is just
a pointless alternative spelling of UNNEST(). After further review of
that, we might choose to adopt a different syntax for what this patch does,
but in any case this functionality seems clearly worthwhile.
Andrew Gierth, reviewed by Zoltán Böszörményi and Heikki Linnakangas, and
significantly revised by me
Pending patches for logical replication will use this to determine
which columns of a tuple ought to be considered as its candidate key.
Andres Freund, with minor, mostly cosmetic adjustments by me
Previously, unless all columns were auto-updateable, we wouldn't
inserts, updates, or deletes, or at least not without a rule or trigger;
now, we'll allow inserts and updates that target only the auto-updateable
columns, and deletes even if there are no auto-updateable columns at
all provided the view definition is otherwise suitable.
Dean Rasheed, reviewed by Marko Tiikkaja
Previously, arbitray system columns could be mentioned in table
constraints, but they were not correctly checked at runtime, because
the values weren't actually set correctly in the tuple. Since it
seems easy enough to initialize the table OID properly, do that,
and continue allowing that column, but disallow the rest unless and
until someone figures out a way to make them work properly.
No back-patch, because this doesn't seem important enough to take the
risk of destabilizing the back branches. In fact, this will pose a
dump-and-reload hazard for those upgrading from previous versions:
constraints that were accepted before but were not correctly enforced
will now either be enforced correctly or not accepted at all. Either
could result in restore failures, but in practice I think very few
users will notice the difference, since the use case is pretty
marginal anyway and few users will be relying on features that have
not historically worked.
Amit Kapila, reviewed by Rushabh Lathia, with doc changes by me.
For simple views which are automatically updatable, this patch allows
the user to specify what level of checking should be done on records
being inserted or updated. For 'LOCAL CHECK', new tuples are validated
against the conditionals of the view they are being inserted into, while
for 'CASCADED CHECK' the new tuples are validated against the
conditionals for all views involved (from the top down).
This option is part of the SQL specification.
Dean Rasheed, reviewed by Pavel Stehule
This allows reads to continue without any blocking while a REFRESH
runs. The new data appears atomically as part of transaction
commit.
Review questioned the Assert that a matview was not a system
relation. This will be addressed separately.
Reviewed by Hitoshi Harada, Robert Haas, Andres Freund.
Merged after review with security patch f3ab5d4.