This behavior wasn't documented, but it should be because it's user-visible
in triggers and other functions executed on the remote server.
Per question from Adam Fuchs.
Back-patch to 9.3 where postgres_fdw was added.
The table-rewriting forms of ALTER TABLE are MVCC-unsafe, in much the same
way as TRUNCATE, because they replace all rows of the table with newly-made
rows with a new xmin. (Ideally, concurrent transactions with old snapshots
would continue to see the old table contents, but the data is not there
anymore --- and if it were there, it would be inconsistent with the table's
updated rowtype, so there would be serious implementation problems to fix.)
This was nowhere documented though, and the problem was only documented for
TRUNCATE in a note in the TRUNCATE reference page. Create a new "Caveats"
section in the MVCC chapter that can be home to this and other limitations
on serializable consistency.
In passing, fix a mistaken statement that VACUUM and CLUSTER would reclaim
space occupied by a dropped column. They don't reconstruct existing tuples
so they couldn't do that.
Back-patch to all supported branches.
Although initdb has long discouraged use of a filesystem mount-point
directory as a PG data directory, this point was covered nowhere in the
user-facing documentation. Also, with the popularity of pg_upgrade,
we really need to recommend that the PG user own not only the data
directory but its parent directory too. (Without a writable parent
directory, operations such as "mv data data.old" fail immediately.
pg_upgrade itself doesn't do that, but wrapper scripts for it often do.)
Hence, adjust the "Creating a Database Cluster" section to address
these points. I also took the liberty of wordsmithing the discussion
of NFS a bit.
These considerations aren't by any means new, so back-patch to all
supported branches.
While postgres' use of SSL renegotiation is a good idea in theory, it
turned out to not work well in practice. The specification and openssl's
implementation of it have lead to several security issues. Postgres' use
of renegotiation also had its share of bugs.
Additionally OpenSSL has a bunch of bugs around renegotiation, reported
and open for years, that regularly lead to connections breaking with
obscure error messages. We tried increasingly complex workarounds to get
around these bugs, but we didn't find anything complete.
Since these connection breakages often lead to hard to debug problems,
e.g. spuriously failing base backups and significant latency spikes when
synchronous replication is used, we have decided to change the default
setting for ssl renegotiation to 0 (disabled) in the released
backbranches and remove it entirely in 9.5 and master..
Author: Michael Paquier, with changes by me
Discussion: 20150624144148.GQ4797@alap3.anarazel.de
Backpatch: 9.0-9.4; 9.5 and master get a different patch
The documentation implied that there was seldom any reason to use the
array_append, array_prepend, and array_cat functions directly. But that's
not really true, because they can help make it clear which case is meant,
which the || operator can't do since it's overloaded to represent all three
cases. Add some discussion and examples illustrating the potentially
confusing behavior that can ensue if the parser misinterprets what was
meant.
Per a complaint from Michael Herold. Back-patch to 9.2, which is where ||
started to behave this way.
Tom fixed another one of these in commit 7f32dbcd, but there was another
almost identical one in libpq docs. Per his comment:
HP's web server has apparently become case-sensitive sometime recently.
Per bug #13479 from Daniel Abraham. Corrected link identified by Alvaro.
The .backup file name can be passed to pg_archivecleanup even if
it includes the extension which is specified in -x option.
However, previously the document incorrectly warned a user
not to do that.
Back-patch to 9.2 where pg_archivecleanup's -x option and
the warning were added.
This adjusts commit 82233ce7ea so that the
postmaster does not exit until all its child processes have exited, even
if the 5-second timeout elapses and we have to send SIGKILL. There is no
great value in having the postmaster process quit sooner, and doing so can
mislead onlookers into thinking that the cluster is fully terminated when
actually some child processes still survive.
This effect might explain recent test failures on buildfarm member hamster,
wherein we failed to restart a cluster just after shutting it down with
"pg_ctl stop -m immediate".
I also did a bit of code review/beautification, including fixing a faulty
use of the Max() macro on a volatile expression.
Back-patch to 9.4. In older branches, the postmaster never waited for
children to exit during immediate shutdowns, and changing that would be
too much of a behavioral change.
- Correct the name of directory which those catalog columns allow to be shrunk.
- Correct the name of symbol which is used as the value of pg_class.relminmxid
when the relation is not a table.
- Fix "ID ID" typo.
Backpatch to 9.3 where those cataog columns were introduced.
Also sneak entries for commits 97ff2a564 et al into the sections for
the previous releases in the relevant branches. Those fixes did go out
in the previous releases, but missed getting documented.
This has been the predominant outcome. When the output of decrypting
with a wrong key coincidentally resembled an OpenPGP packet header,
pgcrypto could instead report "Corrupt data", "Not text data" or
"Unsupported compression algorithm". The distinct "Corrupt data"
message added no value. The latter two error messages misled when the
decrypted payload also exhibited fundamental integrity problems. Worse,
error message variance in other systems has enabled cryptologic attacks;
see RFC 4880 section "14. Security Considerations". Whether these
pgcrypto behaviors are likewise exploitable is unknown.
In passing, document that pgcrypto does not resist side-channel attacks.
Back-patch to 9.0 (all supported versions).
Security: CVE-2015-3167
Analysis by Noah Misch shows that the 25% threshold set by commit
53bb309d2d is lower than any other,
similar autovac threshold. While we don't know exactly what value
will be optimal for all users, it is better to err a little on the
high side than on the low side. A higher value increases the risk
that users might exhaust the available space and start seeing errors
before autovacuum can clean things up sufficiently, but a user who
hits that problem can compensate for it by reducing
autovacuum_multixact_freeze_max_age to a value dependent on their
average multixact size. On the flip side, if the emergency cap
imposed by that patch kicks in too early, the user will experience
excessive wraparound scanning and will be unable to mitigate that
problem by configuration. The new value will hopefully reduce the
risk of such bad experiences while still providing enough headroom
to avoid multixact member exhaustion for most users.
Along the way, adjust the documentation to reflect the effects of
commit 04e6d3b877, which taught
autovacuum to run for multixact wraparound even when autovacuum
is configured off.
As discussed, the default setting of include_realm=0 can be dangerous in
multi-realm environments because it is then impossible to differentiate
users with the same username but who are from two different realms.
Recommend include_realm=1 and note that the default setting may change
in a future version of PostgreSQL and therefore users may wish to
explicitly set include_realm to avoid issues while upgrading.
The logic introduced in commit b69bf30b9b
and repaired in commits 669c7d20e6 and
7be47c56af helps to ensure that we don't
overwrite old multixact member information while it is still needed,
but a user who creates many large multixacts can still exhaust the
member space (and thus start getting errors) while autovacuum stands
idly by.
To fix this, progressively ramp down the effective value (but not the
actual contents) of autovacuum_multixact_freeze_max_age as member space
utilization increases. This makes autovacuum more aggressive and also
reduces the threshold for a manual VACUUM to perform a full-table scan.
This patch leaves unsolved the problem of ensuring that emergency
autovacuums are triggered even when autovacuum=off. We'll need to fix
that via a separate patch.
Thomas Munro and Robert Haas
psql was already accepting conninfo strings as the first parameter in
\connect, but the way it worked wasn't sane; some of the other
parameters would get the previous connection's values, causing it to
connect to a completely unexpected server or, more likely, not finding
any server at all because of completely wrong combinations of
parameters.
Fix by explicitely checking for a conninfo-looking parameter in the
dbname position; if one is found, use its complete specification rather
than mix with the other arguments. Also, change tab-completion to not
try to complete conninfo/URI-looking "dbnames" and document that
conninfos are accepted as first argument.
There was a weak consensus to backpatch this, because while the behavior
of using the dbname as a conninfo is nowhere documented for \connect, it
is reasonable to expect that it works because it does work in many other
contexts. Therefore this is backpatched all the way back to 9.0.
To implement this, routines previously private to libpq have been
duplicated so that psql can decide what looks like a conninfo/URI
string. In back branches, just duplicate the same code all the way back
to 9.2, where URIs where introduced; 9.0 and 9.1 have a simpler version.
In master, the routines are moved to src/common and renamed.
Author: David Fetter, Andrew Dunstan. Some editorialization by me
(probably earning a Gierth's "Sloppy" badge in the process.)
Reviewers: Andrew Gierth, Erik Rijkers, Pavel Stěhule, Stephen Frost,
Robert Haas, Andrew Dunstan.
You're required to write either RANGE or ROWS to start a frame clause,
but the documentation incorrectly implied this is optional. Noted by
David Johnston.
The SGML docs claimed that 1-byte integers could be sent or received with
the "isint" options, but no such behavior has ever been implemented in
pqGetInt() or pqPutInt(). The in-code documentation header for PQfn() was
even less in tune with reality, and the code itself used parameter names
matching neither the SGML docs nor its libpq-fe.h declaration. Do a bit
of additional wordsmithing on the SGML docs while at it.
Since the business about 1-byte integers is a clear documentation bug,
back-patch to all supported branches.
Since 9.1, we've provided extensions with a way to denote
"configuration" tables- tables created by an extension which the user
may modify. By marking these as "configuration" tables, the extension
is asking for the data in these tables to be pg_dump'd (tables which
are not marked in this way are assumed to be entirely handled during
CREATE EXTENSION and are not included at all in a pg_dump).
Unfortunately, pg_dump neglected to consider foreign key relationships
between extension configuration tables and therefore could end up
trying to reload the data in an order which would cause FK violations.
This patch teaches pg_dump about these dependencies, so that the data
dumped out is done so in the best order possible. Note that there's no
way to handle circular dependencies, but those have yet to be seen in
the wild.
The release notes for this should include a caution to users that
existing pg_dump-based backups may be invalid due to this issue. The
data is all there, but restoring from it will require extracting the
data for the configuration tables and then loading them in the correct
order by hand.
Discussed initially back in bug #6738, more recently brought up by
Gilles Darold, who provided an initial patch which was further reworked
by Michael Paquier. Further modifications and documentation updates
by me.
Back-patch to 9.1 where we added the concept of extension configuration
tables.
If libpq output buffer is full, pqSendSome() function tries to drain any
incoming data. This avoids deadlock, if the server e.g. sends a lot of
NOTICE messages, and blocks until we read them. However, pqSendSome() only
did that in blocking mode. In non-blocking mode, the deadlock could still
happen.
To fix, take a two-pronged approach:
1. Change the documentation to instruct that when PQflush() returns 1, you
should wait for both read- and write-ready, and call PQconsumeInput() if it
becomes read-ready. That fixes the deadlock, but applications are not going
to change overnight.
2. In pqSendSome(), drain the input buffer before returning 1. This
alleviates the problem for applications that only wait for write-ready. In
particular, a slow but steady stream of NOTICE messages during COPY FROM
STDIN will no longer cause a deadlock. The risk remains that the server
attempts to send a large burst of data and fills its output buffer, and at
the same time the client also sends enough data to fill its output buffer.
The application will deadlock if it goes to sleep, waiting for the socket
to become write-ready, before the server's data arrives. In practice,
NOTICE messages and such that the server might be sending are usually
short, so it's highly unlikely that the server would fill its output buffer
so quickly.
Backpatch to all supported versions.
In investigating yesterday's crash report from Hugo Osvaldo Barrera, I only
looked back as far as commit f3aec2c7f5 where the breakage occurred
(which is why I thought the IPv4-in-IPv6 business was undocumented). But
actually the logic dates back to commit 3c9bb8886d and was simply
broken by erroneous refactoring in the later commit. A bit of archives
excavation shows that we added the whole business in response to a report
that some 2003-era Linux kernels would report IPv4 connections as having
IPv4-in-IPv6 addresses. The fact that we've had no complaints since 9.0
seems to be sufficient confirmation that no modern kernels do that, so
let's just rip it all out rather than trying to fix it.
Do this in the back branches too, thus essentially deciding that our
effective behavior since 9.0 is correct. If there are any platforms on
which the kernel reports IPv4-in-IPv6 addresses as such, yesterday's fix
would have made for a subtle and potentially security-sensitive change in
the effective meaning of IPv4 pg_hba.conf entries, which does not seem like
a good thing to do in minor releases. So let's let the post-9.0 behavior
stand, and change the documentation to match it.
In passing, I failed to resist the temptation to wordsmith the description
of pg_hba.conf IPv4 and IPv6 address entries a bit. A lot of this text
hasn't been touched since we were IPv4-only.
When ecpg was rewritten to the new protocol version not all variable types
were corrected. This patch rewrites the code for these types to fix that. It
also fixes the documentation to correctly tell the status of array handling.
When beginning streaming replication, the client usually issues the
IDENTIFY_SYSTEM command, which used to return the current WAL insert
position. That's not suitable for the intended purpose of that field,
however. pg_receivexlog uses it to start replication from the reported
point, but if it hasn't been flushed to disk yet, it will fail. Change
IDENTIFY_SYSTEM to report the flush position instead.
Backpatch to 9.1 and above. 9.0 doesn't report any WAL position.
The previous wording claimed that the file was always in /etc, but of
course this varies with the installation layout. Write instead that it
can be found via `pg_config --sysconfdir`. Even though this is still
somewhat incorrect because it doesn't account of moved installations, it
at least conveys that the location depends on the installation.
"ECHO all" is ignored for interactive input, and has been for a very long
time, though possibly not for as long as the documentation has claimed the
opposite. Fix that, and also note that empty lines aren't echoed, which
while dubious is another longstanding behavior (it's embedded in our
regression test files for one thing). Per bug #12721 from Hans Ginzel.
In HEAD, also improve the code comments in this area, and suppress an
unnecessary fflush(stdout) when we're not echoing. That would likely
be safe to back-patch, but I'll not risk it mere hours before a release
wrap.
We've been trying to support \u0000 in JSON values since commit
78ed8e03c6, and have introduced increasingly worse hacks to try to
make it work, such as commit 0ad1a81632. However, it fundamentally
can't work in the way envisioned, because the stored representation looks
the same as for \\u0000 which is not the same thing at all. It's also
entirely bogus to output \u0000 when de-escaped output is called for.
The right way to do this would be to store an actual 0x00 byte, and then
throw error only if asked to produce de-escaped textual output. However,
getting to that point seems likely to take considerable work and may well
never be practical in the 9.4.x series.
To preserve our options for better behavior while getting rid of the nasty
side-effects of 0ad1a81632, revert that commit in toto and instead
throw error if \u0000 is used in a context where it needs to be de-escaped.
(These are the same contexts where non-ASCII Unicode escapes throw error
if the database encoding isn't UTF8, so this behavior is by no means
without precedent.)
In passing, make both the \u0000 case and the non-ASCII Unicode case report
ERRCODE_UNTRANSLATABLE_CHARACTER / "unsupported Unicode escape sequence"
rather than claiming there's something wrong with the input syntax.
Back-patch to 9.4, where we have to do something because 0ad1a81632
broke things for many cases having nothing to do with \u0000. 9.3 also has
bogus behavior, but only for that specific escape value, so given the lack
of field complaints it seems better to leave 9.3 alone.
At one point in the development of this feature, it was claimed that
allowing negative values would be useful to compensate for timezone
differences between master and slave servers. That was based on a mistaken
assumption that commit timestamps are recorded in local time; but of course
they're in UTC. Nor is a negative apply delay likely to be a sane way of
coping with server clock skew. However, the committed patch still treated
negative delays as doing something, and the timezone misapprehension
survived in the user documentation as well.
If recovery_min_apply_delay were a proper GUC we'd just set the minimum
allowed value to be zero; but for the moment it seems better to treat
negative settings as if they were zero.
In passing do some extra wordsmithing on the parameter's documentation,
including correcting a second misstatement that the parameter affects
processing of Restore Point records.
Issue noted by Michael Paquier, who also provided the code patch; doc
changes by me. Back-patch to 9.4 where the feature was introduced.
Use the phraseology "ISO 8601 week-numbering year" in place of just
"ISO year", and make related adjustments to other terminology.
The point of this change is that it seems some people see "ISO year"
and think "standard year", whereupon they're surprised when constructs
like to_char(..., "IYYY-MM-DD") produce nonsensical results. Perhaps
hanging a few more adjectives on it will discourage them from jumping
to false conclusions. I put in an explicit warning against that
specific usage, too, though the main point is to discourage people
who haven't read this far down the page.
In passing fix some nearby markup and terminology inconsistencies.