Buildfarm experience shows that this function can fail with ENOENT
if some other process unlinks a file between when we read the directory
entry and when we try to stat() it. The problem is old but we had
not noticed it until 085b6b667 added regression test coverage.
To fix, just ignore ENOENT failures. There is one other case that
this might hide: a symlink that points to nowhere. That seems okay
though, at least better than erroring.
Back-patch to v10 where this function was added, since the regression
test cases were too.
Discussion: https://postgr.es/m/20200308173103.GC1357@telsasoft.com
Experimentation shows that it's not hard at all to drive the
old implementation of "ltree ~ lquery" match to stack overflow,
so throw in a check_stack_depth() call, as I just did in HEAD.
I wasn't able to make it take a long time, because all the
pathological cases I tried hit stack overflow first; but
I bet there are some others that do take a long time, so add
CHECK_FOR_INTERRUPTS() too.
Discussion: https://postgr.es/m/CAP_rww=waX2Oo6q+MbMSiZ9ktdj6eaJj0cQzNu=Ry2cCDij5fw@mail.gmail.com
This reverts commit 2aa6e33, that added a fast path to skip
anti-wraparound and non-aggressive autovacuum jobs (these have no sense
as anti-wraparound implies aggressive). With a cluster using a high
amount of relations with a portion of them being heavily updated, this
could cause autovacuum to lock down, with autovacuum workers attempting
repeatedly those jobs on the same relations for the same database, that
just kept being skipped. This lock down can be solved with a manual
VACUUM FREEZE.
Justin King has reported one environment where the issue happened, and
Julien Rouhaud and I have been able to reproduce it in a second
environment. With a very aggressive autovacuum_freeze_max_age,
triggering those jobs with pgbench is a matter of minutes, and hitting
the lock down is a lot harder (my local tests failed to do that).
Note that anti-wraparound and non-aggressive jobs can only be triggered
on a subset of shared catalogs:
- pg_auth_members
- pg_authid
- pg_database
- pg_replication_origin
- pg_shseclabel
- pg_subscription
- pg_tablespace
While the lock down was possible down to v12, the root cause of those
jobs is a much older issue, which needs more analysis.
Bonus thanks to Andres Freund for the discussion.
Reported-by: Justin King
Discussion: https://postgr.es/m/CAE39h22zPLrkH17GrkDgAYL3kbjvySYD1io+rtnAUFnaJJVS4g@mail.gmail.com
Backpatch-through: 12
INCLUDE indexes failed to have their non-key attributes physically
truncated away in certain rare cases. This led to physically larger
pivot tuples that contained useless non-key attribute values. The
impact on users should be negligible, but this is still clearly a
regression (Postgres 11 supports INCLUDE indexes, and yet was not
affected).
The bug appeared in commit dd299df8, which introduced "true" suffix
truncation of key attributes.
Discussion: https://postgr.es/m/CAH2-Wz=E8pkV9ivRSFHtv812H5ckf8s1-yhx61_WrJbKccGcrQ@mail.gmail.com
Backpatch: 12-, where "true" suffix truncation was introduced.
GetLocaleInfoEx() can fail on strings that setlocale() was perfectly
happy with. A common way for that to happen is if the locale string
is actually a Unix-style string, say "et_EE.UTF-8". In that case,
what's after the dot is an encoding name, not a Windows codepage number;
blindly treating it as a codepage number led to failure, with a fairly
silly error message. Hence, check to see if what's after the dot is
all digits, and if not, treat it as a literal encoding name rather than
a codepage number. This will do the right thing with many Unix-style
locale strings, and produce a more sensible error message otherwise.
Somewhat independently of that, treat a zero (CP_ACP) result from
GetLocaleInfoEx() as meaning that we must use UTF-8 encoding.
Back-patch to all supported branches.
Juan José Santamaría Flecha
Discussion: https://postgr.es/m/24905.1585445371@sss.pgh.pa.us
The documentation says that the max length is 255 bytes, but
code inspection says it's actually 255 characters; and relevant
lengths are stored as uint16 so that that works.
These uint16 fields could be overflowed by excessively long input,
producing strange results. Complain for invalid input.
Likewise check for out-of-range values of the repeat counts in lquery.
(We don't try too hard on that one, notably not bothering to detect
if atoi's result has overflowed.)
Also detect length overflow in ltree_concat.
In passing, be more consistent about whether "syntax error" messages
include the type name. Also, clarify the documentation about what
the size limit is.
This has been broken for a long time, so back-patch to all supported
branches.
Nikita Glukhov, reviewed by Benjie Gillam and Tomas Vondra
Discussion: https://postgr.es/m/CAP_rww=waX2Oo6q+MbMSiZ9ktdj6eaJj0cQzNu=Ry2cCDij5fw@mail.gmail.com
In 9.4 I added support to use a historical snapshot in
ScanPgRelation(), while adding logical decoding. Unfortunately a
conflict with the concurrent removal of SnapshotNow was incorrectly
resolved, leading to an unregistered snapshot being used.
It is not correct to use an unregistered (or non-active) snapshot for
anything non-trivial, because catalog invalidations can cause the
snapshot to be invalidated.
Luckily it seems unlikely to actively cause problems in practice, as
ScanPgRelation() requires that we already have a lock on the relation,
we only look for a single row, and we don't appear to rely on the
result's tid to be correct. It however is clearly wrong and potential
negative consequences would likely be hard to find. So it seems worth
backpatching the fix, even without a concrete hazard.
Discussion: https://postgr.es/m/20200229052459.wzhqnbhrriezg4v2@alap3.anarazel.de
Backpatch: 9.5-
plpgsql_xact_cb ought to treat events XACT_EVENT_PARALLEL_COMMIT and
XACT_EVENT_PARALLEL_ABORT like XACT_EVENT_COMMIT and XACT_EVENT_ABORT
respectively, since its goal is to do process-local cleanup. This
oversight caused plpgsql's end-of-transaction cleanup to not get done
in parallel workers. Since a parallel worker will exit just after the
transaction cleanup, the effects of this are limited. I couldn't find
any case in the core code with user-visible effects, but perhaps there
are some in extensions. In any case it's wrong, so let's fix it before
it bites us not after.
In passing, add some comments around the handling of expression
evaluation resources in DO blocks. There's no live bug there, but it's
quite unobvious what's happening; at least I thought so. This isn't
related to the other issue, except that I found both things while poking
at expression-evaluation performance.
Back-patch the plpgsql_xact_cb fix to 9.5 where those event types
were introduced, and the DO-block commentary to v11 where DO blocks
gained the ability to issue COMMIT/ROLLBACK.
Discussion: https://postgr.es/m/10353.1585247879@sss.pgh.pa.us
When SaveSlotToPath() is called with elevel=LOG, the early exits didn't
release the slot's io_in_progress_lock.
This could result in a walsender being stuck on the lock forever. A
possible way to get into this situation is if the offending code paths
are triggered in a low disk space situation.
Author: Pavan Deolasee <pavan.deolasee@2ndquadrant.com>
Reported-by: Craig Ringer <craig@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/56a138c5-de61-f553-7e8f-6789296de785%402ndquadrant.com
Now that we require C99, we can depend on __VA_ARGS__ to work, and
revising ereport() to use it has several significant benefits:
* The extra parentheses around the auxiliary function calls are now
optional. Aside from being a bit less ugly, this removes a common
gotcha for new contributors, because in some cases the compiler errors
you got from forgetting them were unintelligible.
* The auxiliary function calls are now evaluated as a comma expression
list rather than as extra arguments to errfinish(). This means that
compilers can be expected to warn about no-op expressions in the list,
allowing detection of several other common mistakes such as forgetting
to add errmsg(...) when converting an elog() call to ereport().
* Unlike the situation with extra function arguments, comma expressions
are guaranteed to be evaluated left-to-right, so this removes platform
dependency in the order of the auxiliary function calls. While that
dependency hasn't caused us big problems in the past, this change does
allow dropping some rather shaky assumptions around errcontext() domain
handling.
There's no intention to make wholesale changes of existing ereport
calls, but as proof-of-concept this patch removes the extra parens
from a couple of calls in postgres.c.
While new code can be written either way, code intended to be
back-patched will need to use extra parens for awhile yet. It seems
worth back-patching this change into v12, so as to reduce the window
where we have to be careful about that by one year. Hence, this patch
is careful to preserve ABI compatibility; a followup HEAD-only patch
will make some additional simplifications.
Andres Freund and Tom Lane
Discussion: https://postgr.es/m/CA+fd4k6N8EjNvZpM8nme+y+05mz-SM8Z_BgkixzkA34R+ej0Kw@mail.gmail.com
src/port/getopt_long.c failed on such an argument, always seeing it
as an unrecognized switch. This is unhelpful; better is to treat such
an item as a non-switch argument. That behavior is what we find in
GNU's getopt_long(); it's what src/port/getopt.c does; and it is
required by POSIX for getopt(), which getopt_long() ought to be
generally a superset of. Moreover, it's expected by ecpg, which
intends an argument of "-" to mean "read from stdin". So fix it.
Also add some documentation about ecpg's behavior in this area, since
that was miserably underdocumented. I had to reverse-engineer it
from the code.
Per bug #16304 from James Gray. Back-patch to all supported branches,
since this has been broken forever.
Discussion: https://postgr.es/m/16304-c662b00a1322db7f@postgresql.org
This reverts commit cb2fd7eac285b1b0a24eeb2b8ed4456b66c5a09f. Per
numerous buildfarm members, it was incompatible with parallel query, and
a test case assumed LP64. Back-patch to 9.5 (all supported versions).
Discussion: https://postgr.es/m/20200321224920.GB1763544@rfd.leadboat.com
Until now, only selected bulk operations (e.g. COPY) did this. If a
given relfilenode received both a WAL-skipping COPY and a WAL-logged
operation (e.g. INSERT), recovery could lose tuples from the COPY. See
src/backend/access/transam/README section "Skipping WAL for New
RelFileNode" for the new coding rules. Maintainers of table access
methods should examine that section.
To maintain data durability, just before commit, we choose between an
fsync of the relfilenode and copying its contents to WAL. A new GUC,
wal_skip_threshold, guides that choice. If this change slows a workload
that creates small, permanent relfilenodes under wal_level=minimal, try
adjusting wal_skip_threshold. Users setting a timeout on COMMIT may
need to adjust that timeout, and log_min_duration_statement analysis
will reflect time consumption moving to COMMIT from commands like COPY.
Internally, this requires a reliable determination of whether
RollbackAndReleaseCurrentSubTransaction() would unlink a relation's
current relfilenode. Introduce rd_firstRelfilenodeSubid. Amend the
specification of rd_createSubid such that the field is zero when a new
rel has an old rd_node. Make relcache.c retain entries for certain
dropped relations until end of transaction.
Back-patch to 9.5 (all supported versions). This introduces a new WAL
record type, XLOG_GIST_ASSIGN_LSN, without bumping XLOG_PAGE_MAGIC. As
always, update standby systems before master systems. This changes
sizeof(RelationData) and sizeof(IndexStmt), breaking binary
compatibility for affected extensions. (The most recent commit to
affect the same class of extensions was
089e4d405d0f3b94c74a2c6a54357a84a681754b.)
Kyotaro Horiguchi, reviewed (in earlier, similar versions) by Robert
Haas. Heikki Linnakangas and Michael Paquier implemented earlier
designs that materially clarified the problem. Reviewed, in earlier
designs, by Andrew Dunstan, Andres Freund, Alvaro Herrera, Tom Lane,
Fujii Masao, and Simon Riggs. Reported by Martijn van Oosterhout.
Discussion: https://postgr.es/m/20150702220524.GA9392@svana.org
The function assumed forkNum=MAIN_FORKNUM and page_std=true, ignoring
the actual arguments. Existing callers passed exactly those values, so
there's no live bug. Back-patch to v12, where the function first
appeared, because another fix needs this.
Discussion: https://postgr.es/m/20191118045434.GA1173436@rfd.leadboat.com
swap_relation_files() calls toast_get_valid_index() to find and lock
this index, just before swapping with the rebuilt TOAST index. The
latter function releases the lock before returning. Potential for
mischief is low; a concurrent session can issue ALTER INDEX ... SET
(fillfactor = ...), which is not alarming. Nonetheless, changing
pg_class.relfilenode without a lock is unconventional. Back-patch to
9.5 (all supported versions), because another fix needs this.
Discussion: https://postgr.es/m/20191226001521.GA1772687@rfd.leadboat.com
Remove an obsolete comment from AtEOXact_cleanup(). Restore formatting
of a comment in struct RelationData, mangled by the pgindent run in
commit 9af4159fce6654aa0e081b00d02bca40b978745c. Back-patch to 9.5 (all
supported versions), because another fix stacks on this.
This patch fixes the error message in get_major_server_version() to be
"could not parse version file", and uses the full file path name, rather
than just the data directory path.
Also, commit 4109bb5de4 added the cause of the failure to the "could
not open" error message, and improved quoting. This patch backpatches
the "could not open" cause to PG 12, where it was first widely used, and
backpatches the quoting fix in that patch to all supported releases.
Reported-by: Tom Lane
Discussion: https://postgr.es/m/87pne2w98h.fsf@wibble.ilmari.org
Author: Dagfinn Ilmari Mannsåker
Backpatch-through: 9.5
A comment about switching indisvalid of the new and old indexes swapped
in REINDEX CONCURRENTLY got this backwards.
Issue introduced by 5dc92b8, the original commit of REINDEX
CONCURRENTLY.
Author: Julien Rouhaud
Discussion: https://postgr.es/m/20200318143340.GA46897@nol
Backpatch-through: 12
This commit corrects the descriptions of RecoveryWalAll and RecoveryWalStream
wait events in the documentation.
Back-patch to v10 where those wait events were added.
Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Atsushi Torikoshi
Discussion: https://postgr.es/m/124997ee-096a-5d09-d8da-2c7a57d0816e@oss.nttdata.com
Commit 9f06d79ef831's replication slot copying failed to
properly reserve the WAL that the slot is expecting to see
during DecodingContextFindStartpoint (to set the confirmed_flush
LSN), so concurrent activity could remove that WAL and cause the
copy process to error out. But it doesn't actually *need* that
WAL anyway: instead of running decode to find confirmed_flush, it
can be copied from the source slot. Fix this by rearranging things
to avoid DecodingContextFindStartpoint() (leaving the target slot's
confirmed_flush_lsn to invalid), and set that up afterwards by copying
from the target slot's value.
Also ensure the source slot's confirmed_flush_lsn is valid.
Reported-by: Arseny Sher
Author: Masahiko Sawada, Arseny Sher
Discussion: https://postgr.es/m/871rr3ohbo.fsf@ars-thinkpad
I noticed that we completely failed to document the restriction
that an "anyrange" result type has to be inferred from an "anyrange"
input. The docs also were less clear than they could be about the
relationship between "anyrange" and "anyarray".
It's been like this all along, so back-patch.
If pkg-config is installed and knows about libxml2, use its information
rather than asking xml2-config. Otherwise proceed as before. This
patch allows "configure --with-libxml" to succeed on platforms that
have pkg-config but not xml2-config, which is likely to soon become
a typical situation.
The old mechanism can be forced by setting XML2_CONFIG explicitly
(hence, build processes that were already doing so will certainly
not need adjustment). Also, it's now possible to set XML2_CFLAGS
and XML2_LIBS explicitly to override both programs.
There is a small risk of this breaking existing build processes,
if there are multiple libxml2 installations on the machine and
pkg-config disagrees with xml2-config about which to use. The
only case where that seems really likely is if a builder has tried
to select a non-default xml2-config by putting it early in his PATH
rather than setting XML2_CONFIG. Plan to warn against that in the
minor release notes.
Back-patch to v10; before that we had no pkg-config infrastructure,
and it doesn't seem worth adding it for this.
Hugh McMaster and Tom Lane; Peter Eisentraut also made an earlier
attempt at this, from which I lifted most of the docs changes.
Discussion: https://postgr.es/m/CAN9BcdvfUwc9Yx5015bLH2TOiQ-M+t_NADBSPhMF7dZ=pLa_iw@mail.gmail.com
This extends the fixes made in commit 085b6b667 to other SRFs with the
same bug, namely pg_logdir_ls(), pgrowlocks(), pg_timezone_names(),
pg_ls_dir(), and pg_tablespace_databases().
Also adjust various comments and documentation to warn against
expecting to clean up resources during a ValuePerCall SRF's final
call.
Back-patch to all supported branches, since these functions were
all born broken.
Justin Pryzby, with cosmetic tweaks by me
Discussion: https://postgr.es/m/20200308173103.GC1357@telsasoft.com
It's strange that a directory-listing function does not list all entries
in a directory, so let's at least document it. This involves
pg_ls_logdir
pg_ls_waldir
pg_ls_archive_statusdir
pg_ls_tmpdir
Backpatch as far back as it applies cleanly (and as far as as each
function exists). REL_10_STABLE uses different wording, but hopefully
people are not reading docs so old to write new apps anyway.
Author: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20200305161838.GJ684@telsasoft.com
resolve_polymorphic_tupdesc() and resolve_polymorphic_argtypes() failed to
cover the case of having to resolve anyarray given only an anyrange input.
The bug was masked if anyelement was also used (as either input or
output), which probably helps account for our not having noticed.
While looking at this I noticed that resolve_generic_type() would produce
the wrong answer if asked to make that same resolution. ISTM that
resolve_generic_type() is confusingly defined and overly complex, so
rather than fix it, let's just make funcapi.c do the specific lookups
it requires for itself.
With this change, resolve_generic_type() is not used anywhere, so remove
it in HEAD. In the back branches, leave it alone (complete with bug)
just in case any external code is using it.
While we're here, make some other refactoring adjustments in funcapi.c
with an eye to upcoming future expansion of the set of polymorphic types:
* Simplify quick-exit tests by adding an overall have_polymorphic_result
flag. This is about a wash now but will be a win when there are more
flags.
* Reduce duplication of code between resolve_polymorphic_tupdesc() and
resolve_polymorphic_argtypes().
* Don't bother to validate correct matching of anynonarray or anyenum;
the parser should have done that, and even if it didn't, just doing
"return false" here would lead to a very confusing, off-point error
message. (Really, "return false" in these two functions should only
occur if the call_expr isn't supplied or we can't obtain data type
info from it.)
* For the same reason, throw an elog rather than "return false" if
we fail to resolve a polymorphic type.
The bug's been there since we added anyrange, so back-patch to
all supported branches.
Discussion: https://postgr.es/m/6093.1584202130@sss.pgh.pa.us
This should of course be just "PG_ARGISNULL()".
Also reorder a couple of paras to make the discussion of PG_ARGISNULL
less disjointed.
Back-patch to v10 where the error was introduced.
Laurenz Albe and Tom Lane, per an anonymous docs comment
Discussion: https://postgr.es/m/158399487096.5708.10696365251766477013@wrigleys.postgresql.org
If an index was explicitly set as replica identity index, this setting
was lost when a table was rewritten by ALTER TABLE. Because this
setting is part of pg_index but actually controlled by ALTER
TABLE (not part of CREATE INDEX, say), we have to do some extra work
to restore it.
Based-on-patch-by: Quan Zongliang <quanzongliang@gmail.com>
Reviewed-by: Euler Taveira <euler.taveira@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/c70fcab2-4866-0d9f-1d01-e75e189db342@gmail.com
I forgot that the WAL directory might hold other files besides WAL
segments, notably including new segments still being filled.
That means a blind test for the first file's size being 16MB can
fail. Restrict based on file name length to make it more robust.
Per buildfarm.
The data types that contrib/pageinspect's bt_metap() function were
declared to return as OUT arguments were wrong in some cases. In
particular, the oldest_xact column (a TransactionId/xid field) was
declared integer/int4 within the pageinspect extension's sql file. This
led to errors when an oldest_xact value that exceeded 2^31-1 was
encountered.
We cannot fix the declaration on Postgres 11 or 12. All we can do is
ameliorate the problem. Use "%d" instead of "%u" to format the output
of the oldest_xact value. This makes the C code match the declaration,
suppressing unhelpful error messages that might otherwise make
bt_metap() totally unusable. A bogus negative oldest_xact value will be
displayed instead of raising an error.
This commit addresses the same issue as master branch commit 691e8b2e18,
which actually fixed the problem. Backpatch to the 11 and 12 branches
only, since they are the only branches (other than master) that have
oldest_xact. All of the other problematic columns already display bogus
output for out of range values.
Reported-By: Victor Yegorov
Bug: #16285
Discussion: https://postgr.es/m/20200309223557.aip5n6ewln4ixbbi@alap3.anarazel.de
Backpatch: 11 and 12 only
pg_dump is oblivious to this kind of dependency, so they're lost on
dump/restores (and pg_upgrade). Have pg_dump emit ALTER lines so that
they're preserved. Add some pg_dump tests for the whole thing, also.
Reviewed-by: Tom Lane (offlist)
Reviewed-by: Ibrar Ahmed
Reviewed-by: Ahsan Hadi (who also reviewed commit 899a04f5ed61)
Discussion: https://postgr.es/m/20200217225333.GA30974@alvherre.pgsql
This coding technique is undesirable because (a) it leaks the FD for
the rest of the transaction if the SRF is not run to completion, and
(b) allocated FDs are a scarce resource, but multiple interleaved
uses of the relevant functions could eat many such FDs.
In v11 and later, a query such as "SELECT pg_ls_waldir() LIMIT 1"
yields a warning about the leaked FD, and the only reason there's
no warning in earlier branches is that fd.c didn't whine about such
leaks before commit 9cb7db3f0. Even disregarding the warning, it
wouldn't be too hard to run a backend out of FDs with careless use
of these SQL functions.
Hence, rewrite the function so that it reads the directory within
a single call, returning the results as a tuplestore rather than
via value-per-call mode.
There are half a dozen other built-in SRFs with similar problems,
but let's fix this one to start with, just to see if the buildfarm
finds anything wrong with the code.
In passing, fix bogus error report for stat() failure: it was
whining about the directory when it should be fingering the
individual file. Doubtless a copy-and-paste error.
Back-patch to v10 where this function was added.
Justin Pryzby, with cosmetic tweaks and test cases by me
Discussion: https://postgr.es/m/20200308173103.GC1357@telsasoft.com
If the command is attempted for an extension that the object already
depends on, silently do nothing.
In particular, this means that if a database containing multiple such
entries is dumped, the restore will silently do the right thing and
record just the first one. (At least, in a world where pg_dump does
dump such entries -- which it doesn't currently, but it will.)
Backpatch to 9.6, where this kind of dependency was introduced.
Reviewed-by: Ibrar Ahmed, Tom Lane (offlist)
Discussion: https://postgr.es/m/20200217225333.GA30974@alvherre.pgsql
Such indexes can only be duplicated leftovers of a previously failed
REINDEX CONCURRENTLY command, and a valid equivalent is guaranteed to
exist. As toast indexes can only be dropped if invalid, reindexing
these would lead to useless duplicated indexes that can't be dropped
anymore, except if the parent relation is dropped.
Thanks to Justin Pryzby for reminding that this problem was reported
long ago during the review of the original patch of REINDEX
CONCURRENTLY, but the issue was never addressed.
Reported-by: Sergei Kornilov, Justin Pryzby
Author: Julien Rouhaud
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/36712441546604286%40sas1-890ba5c2334a.qloud-c.yandex.net
Discussion: https://postgr.es/m/20200216190835.GA21832@telsasoft.com
Backpatch-through: 12
Previously, event triggers were restored just after regular triggers
(and FK constraints, which are basically triggers). This is risky
since an event trigger, once installed, could interfere with subsequent
restore commands. Worse, because event triggers don't have any
particular dependencies on any post-data objects, a parallel restore
would consider them eligible to be restored the moment the post-data
phase starts, allowing them to also interfere with restoration of a
whole bunch of objects that would have been restored before them in
a serial restore. There's no way to completely remove the risk of a
misguided event trigger breaking the restore, since if nothing else
it could break other event triggers. But we can certainly push them
to later in the process to minimize the hazard.
To fix, tweak the RestorePass mechanism introduced by commit 3eb9a5e7c
so that event triggers are handled as part of the post-ACL processing
pass (renaming the "REFRESH" pass to "POST_ACL" to reflect its more
general use). This will cause them to restore after everything except
matview refreshes, which seems OK since matview refreshes really ought
to run in the post-restore state of the database. In a parallel
restore, event triggers and matview refreshes might be intermixed,
but that seems all right as well.
Also update the code and comments in pg_dump_sort.c so that its idea
of how things are sorted agrees with what actually happens due to
the RestorePass mechanism. This is mostly cosmetic: it'll affect the
order of objects in a dump's TOC, but not the actual restore order.
But not changing that would be quite confusing to somebody reading
the code.
Back-patch to all supported branches.
Fabrízio de Royes Mello, tweaked a bit by me
Discussion: https://postgr.es/m/CAFcNs+ow1hmFox8P--3GSdtwz-S3Binb6ZmoP6Vk+Xg=K6eZNA@mail.gmail.com
Previously "waiting" could appear twice via PS in case of lock conflict
in hot standby mode. Specifically this issue happend when the delay
in WAL application determined by max_standby_archive_delay and
max_standby_streaming_delay had passed but it took more than 500 msec
to cancel all the conflicting transactions. Especially we can observe this
easily by setting those delay parameters to -1.
The cause of this issue was that WaitOnLock() and
ResolveRecoveryConflictWithVirtualXIDs() added "waiting" to
the process title in that case. This commit prevents
ResolveRecoveryConflictWithVirtualXIDs() from reporting waiting
in case of lock conflict, to fix the bug.
Back-patch to all back branches.
Author: Masahiko Sawada
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/CA+fd4k4mXWTwfQLS3RPwGr4xnfAEs1ysFfgYHvmmoUgv6Zxvmg@mail.gmail.com
At the end of recovery, standby mode is turned off to re-fetch the last
valid record from archive or pg_wal. Previously, if recovery target was
reached and standby mode was turned off while the current WAL source
was stream, recovery could try to retrieve WAL file containing the last
valid record unexpectedly from stream even though not in standby mode.
This caused an assertion failure. That is, the assertion test confirms that
WAL file should not be retrieved from stream if standby mode is not true.
This commit moves back the current WAL source to archive if it's stream
even though not in standby mode, to avoid that assertion failure.
This issue doesn't cause the server to crash when built with assertion
disabled. In this case, the attempt to retrieve WAL file from stream not
in standby mode just fails. And then recovery tries to retrieve WAL file
from archive or pg_wal.
Back-patch to all supported branches.
Author: Kyotaro Horiguchi
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/20200227.124830.2197604521555566121.horikyota.ntt@gmail.com
This addresses a couple of issues in the documentation:
- Description of PG_COLOR was missing for some tools (pg_archivecleanup
and pg_test_fsync), while the other descriptions had grammar mistakes.
- pgbench supports more environment variables: PGUSER, PGHOST and
PGPORT.
- vacuumlo, oid2name and pgbench support coloring (HEAD only)
Author: Michael Paquier
Reviewed-by: Fabien Coelho, Daniel Gustafsson, Juan José Santamaría
Flecha
Discussion: https://postgr.es/m/20200304075418.GJ2593@paquier.xyz
Backpatch-through: 12
When canceling a REINDEX CONCURRENTLY operation after swapping is done,
a drop of the parent table would leave behind old indexes. This is a
consequence of 68ac9cf, which fixed the case of pg_depend bloat when
repeating REINDEX CONCURRENTLY on the same relation.
In order to take care of the problem without breaking the previous fix,
this uses a different strategy, possible even with the exiting set of
routines to handle dependency changes. The dependencies of/on the
new index are additionally switched to the old one, allowing an old
invalid index remaining around because of a cancellation or a failure to
use the dependency links of the concurrently-created index. This
ensures that dropping any objects the old invalid index depends on also
drops the old index automatically.
Reported-by: Julien Rouhaud
Author: Michael Paquier
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/20200227080735.l32fqcauy73lon7o@nol
Backpatch-through: 12