Previously, in unlucky cases, it was possible for pg_rewind to remove
certain WAL segments from the rewound demoted primary. In particular
this happens if those files have been marked for archival (i.e., their
.ready files were created) but not yet archived; the newly promoted node
no longer has such files because of them having been recycled, but they
are likely critical for recovery in the demoted node. If pg_rewind
removes them, recovery is not possible anymore.
Fix this by maintaining a hash table of files in this situation in the
scan that looks for a checkpoint, which the decide_file_actions phase
can consult so that it knows to preserve them.
Backpatch to 14. The problem also exists in 13, but that branch was not
blessed with commit eb00f1d4bf, so this patch is difficult to apply
there. Users of older releases will just have to continue to be extra
careful when rewinding.
Co-authored-by: Полина Бунгина (Polina Bungina) <bungina@gmail.com>
Co-authored-by: Alexander Kukushkin <cyberdemn@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Discussion: https://postgr.es/m/CAAtGL4AhzmBRsEsaDdz7065T+k+BscNadfTqP1NcPmsqwA5HBw@mail.gmail.com
The macOS Finder application creates .DS_Store files in directories
when opened, which creates problems for serverside utilities which
expect all files to be PostgreSQL specific files. Skip these files
when encountered in pg_checksums, pg_rewind and pg_basebackup.
This was extracted from a larger patchset for skipping hidden files
and system files, where the concencus was to just skip these. Since
this is equally likely to happen in every version, backpatch to all
supported versions.
Reported-by: Mark Guertin <markguertin@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Tobias Bussmann <t.bussmann@gmx.net>
Discussion: https://postgr.es/m/E258CE50-AB0E-455D-8AAD-BB4FE8F882FB@gmail.com
Backpatch-through: v12
Exclusive-mode backups have been deprecated since 9.6 (when
non-exclusive backups were introduced) due to the issues
they can cause should the system crash while one is running and
generally because non-exclusive provides a much better interface.
Further, exclusive backup mode wasn't really being tested (nor was most
of the related code- like being able to log in just to stop an exclusive
backup and the bits of the state machine related to that) and having to
possibly deal with an exclusive backup and the backup_label file
existing during pg_basebackup, pg_rewind, etc, added other complexities
that we are better off without.
This patch removes the exclusive backup mode, the various special cases
for dealing with it, and greatly simplifies the online backup code and
documentation.
Authors: David Steele, Nathan Bossart
Reviewed-by: Chapman Flack
Discussion: https://postgr.es/m/ac7339ca-3718-3c93-929f-99e725d1172c@pgmasters.nethttps://postgr.es/m/CAHg+QDfiM+WU61tF6=nPZocMZvHDzCK47Kneyb0ZRULYzV5sKQ@mail.gmail.com
The patch needs test cases, reorganization, and cfbot testing.
Technically reverts commits 5c31afc49d..e35b2bad1a (exclusive/inclusive)
and 08db7c63f3..ccbe34139b.
Reported-by: Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/E1ktAAG-0002V2-VB@gemulon.postgresql.org
This adds a key management system that stores (currently) two data
encryption keys of length 128, 192, or 256 bits. The data keys are
AES256 encrypted using a key encryption key, and validated via GCM
cipher mode. A command to obtain the key encryption key must be
specified at initdb time, and will be run at every database server
start. New parameters allow a file descriptor open to the terminal to
be passed. pg_upgrade support has also been added.
Discussion: https://postgr.es/m/CA+fd4k7q5o6Nc_AaX6BcYM9yqTbC6_pnH-6nSD=54Zp6NBQTCQ@mail.gmail.com
Discussion: https://postgr.es/m/20201202213814.GG20285@momjian.us
Author: Masahiko Sawada, me, Stephen Frost
It's entirely possible to see WAL for a relation that doesn't exist in
the target anymore. That happens when the relation was dropped later.
The refactoring in commit eb00f1d4b broke that case, by sanity-checking
the file type in the target before checking the flag forwhether it
exists there at all.
I noticed this during manual testing. Modify the 001_basic.pl test so
that it covers this case.
Deciding what to do with each file is now a separate step after all the
necessary information has been gathered. It is more clear that way.
Previously, the decision-making was divided between process_source_file()
and process_target_file(), and it was a bit hard to piece together what
the overall rules were.
Reviewed-by: Kyotaro Horiguchi, Soumyadeep Chakraborty
Discussion: https://www.postgresql.org/message-id/0c5b3783-af52-3ee5-f8fa-6e794061f70d%40iki.fi
This is duplicative of an lstat that was just done by the calling
function (traverse_datadir), besides which we weren't really doing
anything with the results. There's not much point in checking to
see if someone removed the file since the previous lstat, since the
FILE_ACTION_REMOVE code would have to deal with missing-file cases
anyway. Moreover, the "exists = false" assignment was a dead store;
nothing was done with that value later.
A syscall saved is a syscall earned, so back-patch to 9.5
where this code was introduced.
Discussion: https://postgr.es/m/1221796.1599329320@sss.pgh.pa.us
Originally, the names assigned to SLRUs had no purpose other than
being shmem lookup keys, so not a lot of thought went into them.
As of v13, though, we're exposing them in the pg_stat_slru view and
the pg_stat_reset_slru function, so it seems advisable to take a bit
more care. Rename them to names based on the associated on-disk
storage directories (which fortunately we *did* think about, to some
extent; since those are also visible to DBAs, consistency seems like
a good thing). Also rename the associated LWLocks, since those names
are likewise user-exposed now as wait event names.
For the most part I only touched symbols used in the respective modules'
SimpleLruInit() calls, not the names of other related objects. This
renaming could have been taken further, and maybe someday we will do so.
But for now it seems undesirable to change the names of any globally
visible functions or structs, so some inconsistency is unavoidable.
(But I *did* terminate "oldserxid" with prejudice, as I found that
name both unreadable and not descriptive of the SLRU's contents.)
Table 27.12 needs re-alphabetization now, but I'll leave that till
after the other LWLock renamings I have in mind.
Discussion: https://postgr.es/m/28683.1589405363@sss.pgh.pa.us
If there is already a backup_manifest file in the database cluster,
it belongs to the past backup that was used to start this server.
It is not correct for the backup being taken now. So this commit
changes pg_basebackup so that it always skips such backup_manifest
file. The backup_manifest file for the current backup will be injected
separately if users want it.
Author: Fujii Masao
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/78f76a3d-1a28-a97d-0394-5c96985dd1c0@oss.nttdata.com
An instance of PostgreSQL crashing with a bad timing could leave behind
temporary pg_internal.init files, potentially causing failures when
verifying checksums. As the same exclusion lists are used between
pg_rewind, pg_checksums and basebackup.c, all those tools are extended
with prefix checks to keep everything in sync, with dedicated checks
added for pg_internal.init.
Backpatch down to 11, where pg_checksums (pg_verify_checksums in 11) and
checksum verification for base backups have been introduced.
Reported-by: Michael Banck
Author: Michael Paquier
Reviewed-by: Kyotaro Horiguchi, David Steele
Discussion: https://postgr.es/m/62031974fd8e941dd8351fbc8c7eff60d59c5338.camel@credativ.de
Backpatch-through: 11
This is still using the 2.0 version of pg_bsd_indent.
I thought it would be good to commit this separately,
so as to document the differences between 2.0 and 2.1 behavior.
Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us
The existence of these files became rather confusing with the
introduction of a widely-known logging.h header in commit cc8d41511.
(Indeed, there's already some duplicative #includes here, perhaps
betraying such confusion.) The only thing left in them, after that
commit, is a progress-reporting function that's neither general-purpose
nor tied in any way to other logging infrastructure. Hence, let's just
move that function to pg_rewind.c, and get rid of the separate files.
Discussion: https://postgr.es/m/3971.1557787914@sss.pgh.pa.us
This unifies the various ad hoc logging (message printing, error
printing) systems used throughout the command-line programs.
Features:
- Program name is automatically prefixed.
- Message string does not end with newline. This removes a common
source of inconsistencies and omissions.
- Additionally, a final newline is automatically stripped, simplifying
use of PQerrorMessage() etc., another common source of mistakes.
- I converted error message strings to use %m where possible.
- As a result of the above several points, more translatable message
strings can be shared between different components and between
frontends and backend, without gratuitous punctuation or whitespace
differences.
- There is support for setting a "log level". This is not meant to be
user-facing, but can be used internally to implement debug or
verbose modes.
- Lazy argument evaluation, so no significant overhead if logging at
some level is disabled.
- Some color in the messages, similar to gcc and clang. Set
PG_COLOR=auto to try it out. Some colors are predefined, but can be
customized by setting PG_COLORS.
- Common files (common/, fe_utils/, etc.) can handle logging much more
simply by just using one API without worrying too much about the
context of the calling program, requiring callbacks, or having to
pass "progname" around everywhere.
- Some programs called setvbuf() to make sure that stderr is
unbuffered, even on Windows. But not all programs did that. This
is now done centrally.
Soft goals:
- Reduces vertical space use and visual complexity of error reporting
in the source code.
- Encourages more deliberate classification of messages. For example,
in some cases it wasn't clear without analyzing the surrounding code
whether a message was meant as an error or just an info.
- Concepts and terms are vaguely aligned with popular logging
frameworks such as log4j and Python logging.
This is all just about printing stuff out. Nothing affects program
flow (e.g., fatal exits). The uses are just too varied to do that.
Some existing code had wrappers that do some kind of print-and-exit,
and I adapted those.
I tried to keep the output mostly the same, but there is a lot of
historical baggage to unwind and special cases to consider, and I
might not always have succeeded. One significant change is that
pg_rewind used to write all error messages to stdout. That is now
changed to stderr.
Reviewed-by: Donald Dong <xdong@csumb.edu>
Reviewed-by: Arthur Zakirov <a.zakirov@postgrespro.ru>
Discussion: https://www.postgresql.org/message-id/flat/6a609b43-4f57-7348-6480-bd022f924310@2ndquadrant.com
This fixes an issue introduced by 266b6ac, which has added filters to
exclude file patterns on the target and source data directories to
reduce the number of files transferred. Filters get applied to both
the target and source data files, and include pg_internal.init which is
present for each database once relations are created on it. However, if
the target differed from the source with at least one new database with
relations, the rewind would fail due to the exclusion filters applied on
the target files, causing pg_internal.init to still be present on the
target database folder, while its contents should have been completely
removed so as there is nothing remaining inside at the time of the
folder deletion.
Applying exclusion filters on the source files is fine, because this way
the amount of data copied from the source to the target is reduced. And
actually, not applying the filters on the target is what pg_rewind
should do, because this causes such files to be automatically removed
during the rewind on the target. Exclusion filters apply to paths which
are removed or recreated automatically at startup, so removing all those
files on the target during the rewind is a win.
The existing set of TAP tests already stresses the rewind of databases,
but it did not include any tables on those newly-created databases.
Creating extra tables in this case is enough to reproduce the failure,
so the existing tests are extended to close the gap.
Reported-by: Mithun Cy
Author: Michael Paquier
Discussion: https://postgr.es/m/CADq3xVYt6_pO7ZzmjOqPgY9HWsL=kLd-_tNyMtdfjKqEALDyTA@mail.gmail.com
Backpatch-through: 11
Historically we forbade datatype-specific comparison functions from
returning INT_MIN, so that it would be safe to invert the sort order
just by negating the comparison result. However, this was never
really safe for comparison functions that directly return the result
of memcmp(), strcmp(), etc, as POSIX doesn't place any such restriction
on those library functions. Buildfarm results show that at least on
recent Linux on s390x, memcmp() actually does return INT_MIN sometimes,
causing sort failures.
The agreed-on answer is to remove this restriction and fix relevant
call sites to not make such an assumption; code such as "res = -res"
should be replaced by "INVERT_COMPARE_RESULT(res)". The same is needed
in a few places that just directly negated the result of memcmp or
strcmp.
To help find places having this problem, I've also added a compile option
to nbtcompare.c that causes some of the commonly used comparators to
return INT_MIN/INT_MAX instead of their usual -1/+1. It'd likely be
a good idea to have at least one buildfarm member running with
"-DSTRESS_SORT_INT_MIN". That's far from a complete test of course,
but it should help to prevent fresh introductions of such bugs.
This is a longstanding portability hazard, so back-patch to all supported
branches.
Discussion: https://postgr.es/m/20180928185215.ffoq2xrq5d3pafna@alap3.anarazel.de
Fix one untranslatable string concatenation in pg_rewind.
Fix one message in pg_verify_checksums to use a style use elsewhere
and avoid plural issues.
Fix one gratuitous abbreviation in psql.
In commit 9c0a0de4c, I'd failed to notice that catalog/catalog.h
should also be considered a frontend-unsafe header, because it includes
(and needs) the full form of pg_class.h, not to mention relcache.h.
However, various frontend code was depending on it to get
TABLESPACE_VERSION_DIRECTORY, so refactoring of some sort is called for.
The cleanest answer seems to be to move TABLESPACE_VERSION_DIRECTORY,
as well as the OIDCHARS symbol, to common/relpath.h. Do that, and mop up
inclusions as necessary. (I found that quite a few current users of
catalog/catalog.h don't seem to need it at all anymore, apparently as a
result of the refactorings that created common/relpath.[hc]. And
initdb.c needed it only as a route to pg_class_d.h.)
Discussion: https://postgr.es/m/6629.1523294509@sss.pgh.pa.us
Everything of use to frontend code should now appear in the _d.h files,
and making this change frees us from needing to worry about whether the
catalog header files proper are frontend-safe.
Remove src/interfaces/ecpg/ecpglib/pg_type.h entirely, as the previous
commit reduced it to a confusingly-named wrapper around pg_type_d.h.
In passing, make test_rls_hooks.c follow project convention of including
our own files with #include "" not <>.
Discussion: https://postgr.es/m/23690.1523031777@sss.pgh.pa.us
The target cluster that was rewound needs to perform recovery from
the checkpoint created at failover, which leads it to remove or recreate
some files and directories that may have been copied from the source
cluster. So pg_rewind can skip synchronizing such files and directories,
and which reduces the amount of data transferred during a rewind
without changing the usefulness of the operation.
Author: Michael Paquier
Reviewed-by: Anastasia Lubennikova, Stephen Frost and me
Discussion: https://postgr.es/m/20180205071022.GA17337@paquier.xyz
pg_rewind checks whether each file is a relation data file, from its path.
Previously this check logic had the bug which made pg_rewind fail to
recognize any relation data files in tablespaces. Which also caused
an assertion failure in pg_rewind.
Back-patch to 9.5 where pg_rewind was added.
Author: Takayuki Tsunakawa
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F8D6C7A@G01JPEXMBYT05
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
c.h #includes a number of core libc header files, such as <stdio.h>.
There's no point in re-including these after having read postgres.h,
postgres_fe.h, or c.h; so remove code that did so.
While at it, also fix some places that were ignoring our standard pattern
of "include postgres[_fe].h, then system header files, then other Postgres
header files". While there's not any great magic in doing it that way
rather than system headers last, it's silly to have just a few files
deviating from the general pattern. (But I didn't attempt to enforce this
globally, only in files I was touching anyway.)
I'd be the first to say that this is mostly compulsive neatnik-ism,
but over time it might save enough compile cycles to be useful.
"xlog" is not a particularly clear abbreviation for "write-ahead log",
and it sometimes confuses users into believe that the contents of the
"pg_xlog" directory are not critical data, leading to unpleasant
consequences. So, rename the directory to "pg_wal".
This patch modifies pg_upgrade and pg_basebackup to understand both
the old and new directory layouts; the former is necessary given the
purpose of the tool, while the latter merely avoids an unnecessary
backward-compatibility break.
We may wish to consider renaming other programs, switches, and
functions which still use the old "xlog" naming to also refer to
"wal". However, that's still under discussion, so let's do just this
much for now.
Discussion: CAB7nPqTeC-8+zux8_-4ZD46V7YPwooeFxgndfsq5Rg8ibLVm1A@mail.gmail.com
Michael Paquier
pg_xlog is often a symlink, typically to a different filesystem. Don't
get confused and comlain about by that, and just always pretend that it's a
normal directory, even if it's really a symlink.
Also add a test case for this.
Backpatch to 9.5.
* Remove invalid option character "N" from the third argument (valid option
string) of getopt_long().
* Use pg_free() or pfree() to free the memory allocated by pg_malloc() or
palloc() instead of always using free().
* Assume problem is no disk space if write() fails but doesn't set errno.
* Fix several typos.
Patch by me. Review by Michael Paquier.
Update comments and function names to use the terms "source" and "target"
consistently. Some places were calling them remote and local instead, which
was confusing.
Fix incorrect comment in extractPageInfo on database creation record - it
was wrong on what happens for databases created in the target that don't
exist in source.
Bugs all spotted by Coverity, including wrong realloc() size request
and memory leaks. Cosmetic improvements by me.
The usage of the global variable "filemap" here is still pretty awful,
but at least I got rid of the gratuitous aliasing in several routines
(which was helping to annoy Coverity, as well as being a bug risk).
Earlier versions of this tool were available (and still are) on github.
Thanks to Michael Paquier, Alvaro Herrera, Peter Eisentraut, Amit Kapila,
and Satoshi Nagayasu for review.