There's a project policy against using plain "char buf[BLCKSZ]" local
or static variables as page buffers; preferred style is to palloc or
malloc each buffer to ensure it is MAXALIGN'd. However, that policy's
been ignored in an increasing number of places. We've apparently got
away with it so far, probably because (a) relatively few people use
platforms on which misalignment causes core dumps and/or (b) the
variables chance to be sufficiently aligned anyway. But this is not
something to rely on. Moreover, even if we don't get a core dump,
we might be paying a lot of cycles for misaligned accesses.
To fix, invent new union types PGAlignedBlock and PGAlignedXLogBlock
that the compiler must allocate with sufficient alignment, and use
those in place of plain char arrays.
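
A minimal sketch of the idea, assuming the default 8192-byte BLCKSZ and a C11 compiler (for _Alignof). Giving the char array union siblings of type double and 64-bit integer forces the compiler to align the whole object at least as strictly as those types. The member names and the helpers is_aligned() and local_buffer_is_aligned() are illustrative only and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLCKSZ 8192             /* assumed default block size */

/* The extra members exist only to force the union's alignment. */
typedef union PGAlignedBlock
{
    char        data[BLCKSZ];
    double      force_align_d;
    int64_t     force_align_i64;
} PGAlignedBlock;

/* Returns 1 if ptr is a multiple of "align" bytes, else 0. */
int
is_aligned(const void *ptr, size_t align)
{
    assert(align != 0);
    return ((uintptr_t) ptr % align) == 0;
}

/* A local page buffer declared with the union type instead of char[]. */
int
local_buffer_is_aligned(void)
{
    PGAlignedBlock buf;

    buf.data[0] = 0;            /* touch the buffer so it is really used */
    return is_aligned(buf.data, _Alignof(double)) &&
           is_aligned(buf.data, _Alignof(int64_t));
}
```

Code that previously passed a plain "char buf[BLCKSZ]" would pass buf.data instead; the buffer's contents and size are unchanged, only its guaranteed alignment differs.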
I used these types even for variables where there's no risk of a
misaligned access, since ensuring proper alignment should make
kernel data transfers faster. I also changed some places where
we had been palloc'ing short-lived buffers, for coding style
uniformity and to save palloc/pfree overhead.
Since this seems to be a live portability hazard (despite the lack
of field reports), back-patch to all supported versions.
Patch by me; thanks to Michael Paquier for review.
Discussion: https://postgr.es/m/1535618100.1286.3.camel@credativ.de
Commit 244142d32a only tested for the pg_controldata output of primary
servers, but standby servers have different "Database cluster state"
output, so check for that too.
Diagnosed-by: Michael Paquier
Discussion: https://postgr.es/m/20180810164240.GM13638@paquier.xyz
Backpatch-through: 9.3
Previously pg_upgrade checked for the pid file and started/stopped the
server to force a clean shutdown. However, "pg_ctl -m immediate"
removes the pid file but doesn't do a clean shutdown, so check
pg_controldata for a clean shutdown too.
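
A sketch of what such a check might look like, not the actual pg_upgrade code. It assumes pg_controldata is run with untranslated (C locale) messages, and that the two clean states are printed as "shut down" for a primary and "shut down in recovery" for a standby; the function name line_reports_clean_shutdown() is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Given one line of pg_controldata output, report whether it is the
 * "Database cluster state" line and names a cleanly shut down cluster.
 */
bool
line_reports_clean_shutdown(const char *line)
{
    static const char label[] = "Database cluster state:";
    const char *p;
    size_t      len;

    assert(line != NULL);

    if (strncmp(line, label, sizeof(label) - 1) != 0)
        return false;           /* some other line of the output */

    p = line + sizeof(label) - 1;
    while (*p == ' ' || *p == '\t')
        p++;                    /* skip padding after the label */

    len = strcspn(p, "\r\n");   /* ignore a trailing line ending */

    return (len == strlen("shut down") &&
            strncmp(p, "shut down", len) == 0) ||
           (len == strlen("shut down in recovery") &&
            strncmp(p, "shut down in recovery", len) == 0);
}
```

A caller would feed it each line read from the pg_controldata pipe and refuse to proceed if no line returns true. States such as "in production" or "shutting down" are rejected because the comparison is on the full state string, not a prefix.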
Diagnosed-by: Vimalraj A
Discussion: https://postgr.es/m/CAFKBAK5e4Q-oTUuPPJ56EU_d2Rzodq6GWKS3ncAk3xo7hAsOZg@mail.gmail.com
Backpatch-through: 9.3
I blew the dust off a Bourne shell (file date 1996, yea verily) and
tried to run test.sh with it. It mostly worked, but I found that the
temp-directory creation code introduced by commit be76a6d39 was not
compatible, for a couple of reasons: this shell thinks "set -e" should
force an exit if a command within backticks fails, and it also thinks code
within braces should be executed by a sub-shell, meaning that variable
settings don't propagate back up to the parent shell. In view of Victor
Wagner's report that Solaris is still using pre-POSIX shells, seems like
we oughta make this case work. It's not like the code is any less
idiomatic this way; the prior coding technique appeared nowhere else.
(There is a remaining bash-ism here, which is that $RANDOM doesn't do
what the code hopes in non-bash shells. But the use of $$ elsewhere in
that path should be enough to ensure uniqueness and some amount of
randomness, so I think it's okay as-is.)
Back-patch to all supported branches, as the previous commit was.
Discussion: https://postgr.es/m/20180720153820.69e9ae6c@fafnir.local.vm
Double-quote $PGDATA in "find" commands introduced by commit da9b580d8,
in case that path contains spaces or other special characters.
Adjust a few other places so that quoting is done more consistently.
None of the others are actual bugs AFAICS, but it's confusing to readers
if the same thing is done differently in different places.
Noted by Tels.
Discussion: https://postgr.es/m/c96303c04c360bbedaa04f90f515745b.squirrel@sm.webmail.pair.com
Most of test.sh uses traditional backtick syntax for command substitution,
but commit da9b580d8 introduced two uses of $(...) syntax, which is not
recognized by very old shells. Bring those into line with the rest.
Victor Wagner
Discussion: https://postgr.es/m/20180720153820.69e9ae6c@fafnir.local.vm
A collection of typos I happened to spot while reading code, as well as
grepping for common mistakes.
Backpatch to all supported versions, as applicable, to avoid conflicts
when backporting other commits in the future.
Msys2's uname -s outputs a string beginning MSYS rather than MINGW as is
output by Msys. Allow either in pg_upgrade's test.sh.
Backpatch to all live branches.
This reverts the backend sides of commit 1fde38beaa.
I have, at least for now, left the pg_verify_checksums tool in place, as
this tool can be very valuable without the rest of the patch as well,
and since it's a read-only tool that only runs when the cluster is down
it should be a lot safer.
Everything of use to frontend code should now appear in the _d.h files,
and making this change frees us from needing to worry about whether the
catalog header files proper are frontend-safe.
Remove src/interfaces/ecpg/ecpglib/pg_type.h entirely, as the previous
commit reduced it to a confusingly-named wrapper around pg_type_d.h.
In passing, make test_rls_hooks.c follow project convention of including
our own files with #include "" not <>.
Discussion: https://postgr.es/m/23690.1523031777@sss.pgh.pa.us
Allow the cluster to be optionally init'd with read access for the
group.
This means a relatively non-privileged user can perform a backup of the
cluster without requiring write privileges, which enhances security.
The mode of PGDATA is used to determine whether group permissions are
enabled for directory and file creates. This method was chosen as it's
simple and works well for the various utilities that write into PGDATA.
Changing the mode of PGDATA manually will not automatically change the
mode of all the files contained therein. If the user would like to
enable group access on an existing cluster then changing the mode of all
the existing files will be required. Note that pg_upgrade will
automatically change the mode of all migrated files if the new cluster
is init'd with the -g option.
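
The rule "the mode of PGDATA decides the create modes" can be sketched roughly as follows. This is an illustration under assumptions, not the committed code: the group-enabled values 0750 and 0640 and the function name set_create_modes_from_datadir() are assumptions of this sketch, while the variable names and the 0700/0600 defaults come from the companion file_perm commit.

```c
#include <sys/types.h>

/* Defaults: access for the owner only. */
int pg_dir_create_mode = 0700;
int pg_file_create_mode = 0600;

/*
 * Choose create modes from the mode bits of the data directory.
 * If the directory grants the group read and search permission,
 * new files and directories get group read access too.
 */
void
set_create_modes_from_datadir(mode_t datadir_mode)
{
    const mode_t group_dir_bits = 0750;     /* assumed group-enabled mode */

    if ((datadir_mode & group_dir_bits) == group_dir_bits)
    {
        pg_dir_create_mode = 0750;
        pg_file_create_mode = 0640;
    }
    else
    {
        pg_dir_create_mode = 0700;
        pg_file_create_mode = 0600;
    }
}
```

A utility would stat() the data directory once at startup and pass st_mode to this function; nothing else needs to be configured, which is the simplicity the paragraph above refers to.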
Tests are included for the backend and all the utilities which operate
on the PG data directory to ensure that the correct mode is set based on
the data directory permissions.
Author: David Steele <david@pgmasters.net>
Reviewed-By: Michael Paquier, with discussion amongst many others.
Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net
Consolidate directory and file create permissions for tools which work
with the PG data directory by adding a new module (common/file_perm.c)
that contains variables (pg_file_create_mode, pg_dir_create_mode) and
constants to initialize them (0600 for files and 0700 for directories).
Convert mkdir() calls in the backend to MakePGDirectory() if the
original call used default permissions (always the case for regular PG
directories).
Add tests to make sure permissions in PGDATA are set correctly by the
tools which modify the PG data directory.
Authors: David Steele <david@pgmasters.net>,
Adam Brightwell <adam.brightwell@crunchydata.com>
Reviewed-By: Michael Paquier, with discussion amongst many others.
Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net
This makes it possible to turn checksums on (and off) in a live cluster,
without the previous need for dump/reload or logical replication.
Enabling checksums starts a background process in the form of a
launcher/worker combination that goes through the entire database and
recalculates checksums on each and every page. Only when all pages have
been checksummed are they fully enabled in the cluster. Any failure of
the process will revert to checksums off and the process has to be
restarted.
This adds a new WAL record that indicates the state of checksums, so
the process works across replicated clusters.
Authors: Magnus Hagander and Daniel Gustafsson
Review: Tomas Vondra, Michael Banck, Heikki Linnakangas, Andrey Borodin
We were being careless in some places about the order of -L switches in
link command lines, such that -L switches referring to external directories
could come before those referring to directories within the build tree.
This made it possible to accidentally link a system-supplied library, for
example /usr/lib/libpq.so, in place of the one built in the build tree.
Hilarity ensued, the more so the older the system-supplied library is.
To fix, break LDFLAGS into two parts, a sub-variable LDFLAGS_INTERNAL
and the main LDFLAGS variable, both of which are "recursively expanded"
so that they can be incrementally adjusted by different makefiles.
Establish a policy that -L switches for directories in the build tree
must always be added to LDFLAGS_INTERNAL, while -L switches for external
directories must always be added to LDFLAGS. This is sufficient to
ensure a safe search order. For simplicity, we typically also put -l
switches for the respective libraries into those same variables.
(Traditional make usage would have us put -l switches into LIBS, but
cleaning that up is a project for another day, as there's no clear
need for it.)
This turns out to also require separating SHLIB_LINK into two variables,
SHLIB_LINK and SHLIB_LINK_INTERNAL, with a similar rule about which
switches go into which variable. And likewise for PG_LIBS.
Although this change might appear to affect external users of pgxs.mk,
I think it doesn't; they shouldn't have any need to touch the _INTERNAL
variables.
In passing, tweak src/common/Makefile so that the value of CPPFLAGS
recorded in pg_config lacks "-DFRONTEND" and the recorded value of
LDFLAGS lacks "-L../../../src/common". Both of those things are
mistakes, apparently introduced during prior code rearrangements,
as old versions of pg_config don't print them. In general we don't
want anything that's specific to the src/common subdirectory to
appear in those outputs.
This is certainly a bug fix, but in view of the lack of field
complaints, I'm unsure whether it's worth the risk of back-patching.
In any case it seems wise to see what the buildfarm makes of it first.
Discussion: https://postgr.es/m/25214.1522604295@sss.pgh.pa.us
This makes the client programs behave as documented regardless of the
connect-time search_path and regardless of user-created objects. Today,
a malicious user with CREATE permission on a search_path schema can take
control of certain of these clients' queries and invoke arbitrary SQL
functions under the client identity, often a superuser. This is
exploitable in the default configuration, where all users have CREATE
privilege on schema "public".
This changes behavior of user-defined code stored in the database, like
pg_index.indexprs and pg_extension_config_dump(). If they reach code
bearing unqualified names, "does not exist" or "no schema has been
selected to create in" errors might appear. Users may fix such errors
by schema-qualifying affected names. After upgrading, consider watching
server logs for these errors.
The --table arguments of src/bin/scripts clients have been lax; for
example, "vacuumdb -Zt pg_am\;CHECKPOINT" performed a checkpoint. That
now fails, but for now, "vacuumdb -Zt 'pg_am(amname);CHECKPOINT'" still
performs a checkpoint.
Back-patch to 9.3 (all supported versions).
Reviewed by Tom Lane, though this fix strategy was not his first choice.
Reported by Arseniy Sharoglazov.
Security: CVE-2018-1058
Commit b3f840120 changed pg_upgrade so that it'd actually drop and
re-create the template1 and postgres databases in the new cluster.
That works fine, serially. With the -j option it's not so fine, because
other per-database jobs might be launched while the template1 database is
dropped. Since they attempt to connect there to start up, kaboom.
This is the cause of the intermittent failures buildfarm member jacana
has been showing for the last month; evidently it is the only BF member
configured to run the pg_upgrade test with parallelism enabled.
Fix by processing template1 separately before we get into the parallel
sub-job launch loop. (We could alternatively have made the postgres DB
be the special case, but it seems likely that template1 will contain
less stuff and so we lose less parallelism with this choice.)
This oversight led to data corruption in matviews, manifesting as
"could not access status of transaction" before our most recent releases,
and "found xmin from before relfrozenxid" errors since then.
The proximate cause of the problem seems to have been confusion between
the task of preserving dropped-column status and the task of preserving
frozenxid status. Those are required for distinct sets of relkinds,
and the reasoning was entirely undocumented in the source code. In hopes
of forestalling future errors of the same kind, try to improve the
commentary in this area.
In passing, also improve the remarkably unhelpful comments around
pg_upgrade's set_frozenxids(). That's not actually buggy AFAICS,
but good luck figuring out what it does from the old comments.
Per report from Claudio Freire. It appears that bug #14852 from Alexey
Ermakov is an earlier report of the same issue, and there may be other
cases that we failed to identify at the time.
Patch by me based on analysis by Andres Freund. The bug dates back
to the introduction of matviews, so back-patch to all supported branches.
Discussion: https://postgr.es/m/CAGTBQpbrY9CdRGGhyBZ9yqY4jWaGC85rUF4X+R7d-aim=mBNsw@mail.gmail.com
Discussion: https://postgr.es/m/20171013115320.28049.86457@wrigleys.postgresql.org
pg_upgrade has always attempted to ensure that the transient dump files
it creates are inaccessible except to the owner. However, refactoring
in commit 76a7650c4 broke that for the file containing "pg_dumpall -g"
output; since then, that file was protected according to the process's
default umask. Since that file may contain role passwords (hopefully
encrypted, but passwords nonetheless), this is a particularly unfortunate
oversight. Prudent users of pg_upgrade on multiuser systems would
probably run it under a umask tight enough that the issue is moot, but
perhaps some users are depending only on pg_upgrade's umask changes to
protect their data.
To fix this in a future-proof way, let's just tighten the umask at
process start. There are no files pg_upgrade needs to write at a
weaker security level; and if there were, transiently relaxing the
umask around where they're created would be a safer approach.
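
A minimal sketch of the approach on a POSIX system. The helper names tighten_umask() and effective_mode() are illustrative; the only real API used is umask().

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Mode a newly created file ends up with under a given umask. */
mode_t
effective_mode(mode_t requested, mode_t mask)
{
    return requested & ~mask;
}

/*
 * Deny all group and other access for every file created from now on.
 * Returns the previous umask in case a caller wants to restore it.
 */
mode_t
tighten_umask(void)
{
    return umask(S_IRWXG | S_IRWXO);
}
```

Calling tighten_umask() once at the top of main() means a file opened with a requested mode of 0666 is created as 0600, regardless of which code path creates it. That is what makes this fix robust against later refactoring.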
Report and patch by Tom Lane; the idea for the fix is due to Noah Misch.
Back-patch to all supported branches.
Security: CVE-2018-1053
This patch rearranges the division of labor between pg_dump and pg_dumpall
so that pg_dump itself handles all properties attached to a single
database. Notably, a database's ACL (GRANT/REVOKE status) and local GUC
settings established by ALTER DATABASE SET and ALTER ROLE IN DATABASE SET
can be dumped and restored by pg_dump. This is a long-requested
improvement.
"pg_dumpall -g" will now produce only role- and tablespace-related output,
nothing about individual databases. The total output of a regular
pg_dumpall run remains the same.
pg_dump (or pg_restore) will restore database-level properties only when
creating the target database with --create. This applies not only to
ACLs and GUCs but to the other database properties it already handled,
that is database comments and security labels. This is more consistent
and useful, but does represent an incompatibility in the behavior seen
without --create.
(This change makes the proposed patch to have pg_dump use "COMMENT ON
DATABASE CURRENT_DATABASE" unnecessary, since there is no case where
the command is issued that we won't know the true name of the database.
We might still want that patch as a feature in its own right, but pg_dump
no longer needs it.)
pg_dumpall with --clean will now drop and recreate the "postgres" and
"template1" databases in the target cluster, allowing their locale and
encoding settings to be changed if necessary, and providing a cleaner
way to set nondefault tablespaces for them than we had before. This
means that such a script must now always be started in the "postgres"
database; the order of drops and reconnects will not work otherwise.
Without --clean, the script will not adjust any database-level properties
of those two databases (including their comments, ACLs, and security
labels, which it formerly would try to set).
Another minor incompatibility is that the CREATE DATABASE commands in a
pg_dumpall script will now always specify locale and encoding settings.
Formerly those would be omitted if they matched the cluster's default.
While that behavior had some usefulness in some migration scenarios,
it also posed a significant hazard of unwanted locale/encoding changes.
To migrate to another locale/encoding, it's now necessary to use pg_dump
without --create to restore into a database with the desired settings.
Commit 4bd371f6f's hack to emit "SET default_transaction_read_only = off"
is gone: we now dodge that problem by the expedient of not issuing ALTER
DATABASE SET commands until after reconnecting to the target database.
Therefore, such settings won't apply during the restore session.
In passing, improve some shaky grammar in the docs, and add a note pointing
out that pg_dumpall's output can't be expected to load without any errors.
(Someday we might want to fix that, but this is not that patch.)
Haribabu Kommi, reviewed at various times by Andreas Karlsson,
Vaishnavi Prabakaran, and Robert Haas; further hacking by me.
Discussion: https://postgr.es/m/CAJrrPGcUurV0eWTeXODwsOYFN=Ekq36t1s0YnFYUNzsmRfdAyA@mail.gmail.com
Previously an inaccurate but harmless error was generated when running
--check on a live server before reporting the servers as compatible.
The fix is to split error reporting and exit control in the exec_prog()
API.
Reported-by: Daniel Westermann
Backpatch-through: 10
Commit 9be95ef15 failed to cure all of the redundancy here: we were
actually calling get_major_server_version() three times for each
of the old and new data directories. While that's not enormously
expensive, it's still sloppy.
A. Akenteva
Discussion: https://postgr.es/m/f9266a85d918a3cf3a386b5148aee666@postgrespro.ru
Somebody messed up a refactoring here. As it stood, we'd check pg_ctl's
--version output twice for each cluster. Worse, the first check for the
new cluster's version happened before we'd done any validate_exec checks
there, breaking the check ordering the code intended.
A. Akenteva
Discussion: https://postgr.es/m/f9266a85d918a3cf3a386b5148aee666@postgrespro.ru
The lower-case spellings "true" and "false" are C and C++ standard and
are used in most parts of the PostgreSQL sources. The upper-case
spellings TRUE and FALSE are only used in some files/modules. So
standardize on the standard spellings.
The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so
those are left as is when using those APIs.
In code comments, we use the lower-case spelling for the C concepts and
keep the upper-case spelling for the SQL concepts.
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
configure computed PG_VERSION_NUM incorrectly. (Coulda sworn I tested
that logic back when, but it had an obvious thinko.)
pg_upgrade had not been taught about the new dispensation with just
one part in the major version number.
Both things accidentally failed to fail with 10.0, but with 10.1 we
got the wrong results.
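
To illustrate the one-part major version rule, here is a sketch, not the actual pg_upgrade or configure code; the function name major_version_num() is hypothetical. Before version 10 the major version has two parts (9.6), so both go into the number. From 10 on, only the first part is the major version and the second part is a minor release.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Convert a version string such as "9.6.3" or "10.1" into a
 * PG_VERSION_NUM-style major version number, or -1 if unparsable.
 */
int
major_version_num(const char *version_str)
{
    int         v1 = 0;
    int         v2 = 0;

    assert(version_str != NULL);

    if (sscanf(version_str, "%d.%d", &v1, &v2) < 1)
        return -1;

    if (v1 < 10)
        return v1 * 10000 + v2 * 100;   /* old style, e.g. 9.6 */

    return v1 * 10000;                  /* new style, e.g. 10 */
}
```

Applying the old two-part formula to "10.1" would give 100100 instead of 100000, while "10.0" gives 100000 either way. That is one plausible reading of why the mistake went unnoticed until the first minor release.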
Per buildfarm.
This reverts commit f41e56c76e.
The build farm client would run the pg_upgrade tests twice, once as part
of the existing pg_upgrade check run and once as part of picking up all
TAP tests by looking for "t" directories. Since the pg_upgrade tests
are pretty slow, we will need a better solution or possibly a build farm
client change before we can proceed with this.
For performance reasons a larger segment size than the default 16MB
can be useful. A larger segment size has two main benefits: first,
in setups using archiving, it makes it easier to write scripts that
can keep up with higher amounts of WAL; second, the WAL has to be
written and synced to disk less frequently.
But at the same time a large segment size is disadvantageous for
smaller databases. So far the segment size had to be configured at
compile time, often making it unrealistic to choose one fitting a
particular load. Therefore change it to an initdb-time setting.
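
The archiving benefit comes down to simple arithmetic: the same volume of WAL is split into fewer files, so an archive script is invoked less often. A small sketch, with the hypothetical helper segments_needed() and example sizes that are not recommendations from this commit:

```c
#include <assert.h>
#include <stdint.h>

/* Number of WAL segment files needed to hold wal_bytes, rounded up. */
uint64_t
segments_needed(uint64_t wal_bytes, uint64_t seg_size)
{
    assert(seg_size > 0);
    return (wal_bytes + seg_size - 1) / seg_size;
}
```

For example, 1GB of WAL occupies 64 segments at the default 16MB but only 16 segments at 64MB. The flip side for small databases is that each segment is a larger unit of disk space.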
This includes a breaking change to the xlogreader.h API, which now
requires the current segment size to be configured. For that and
similar reasons a number of binaries had to be taught how to recognize
the current segment size.
Author: Beena Emerson, editorialized by Andres Freund
Reviewed-By: Andres Freund, David Steele, Kuntal Ghosh, Michael
Paquier, Peter Eisentraut, Robert Haas, Tushar Ahuja
Discussion: https://postgr.es/m/CAOG9ApEAcQ--1ieKbhFzXSQPw_YLmepaa4hNdnY5+ZULpt81Mw@mail.gmail.com
The plan is to convert the current pg_upgrade test to the TAP
framework. This commit just puts a basic TAP test in place so that we
can see how the build farm behaves, since the build farm client has some
special knowledge of the pg_upgrade tests.
Author: Michael Paquier <michael.paquier@gmail.com>
Remove code meant for upgrading to a particular version of PostgreSQL
9.0. Since pg_upgrade only supports upgrading to the current major
version, this code is no longer useful.