contrib/lo's lo_manage() thought it could use
trigdata->tg_trigger->tgname in its error message about
not being called as a trigger. That naturally led to a core dump.
unique_key_recheck() figured it could Assert that fcinfo->context
is a TriggerData node in advance of having checked that it's
being called as a trigger. That's harmless in production builds,
and perhaps not that easy to reach in any case, but it's logically
wrong.
The first of these per bug #16340 from William Crowell;
the second from manual inspection of other CALLED_AS_TRIGGER
call sites.
Back-patch the lo.c change to all supported branches, the
other to v10 where the thinko crept in.
Discussion: https://postgr.es/m/16340-591c7449dc7c8c47@postgresql.org
A join that was added in commit 9b2009c4cf that did not use the INNER
keyword but the existing query used it. It was cleaner to remove the
existing INNER keyword.
Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/a1ffbfda-59d2-5732-e5fb-3df8582b6434@2ndquadrant.com
Backpatch-through: 9.5
Experimentation shows that it's not hard at all to drive the
old implementation of "ltree ~ lquery" match to stack overflow,
so throw in a check_stack_depth() call, as I just did in HEAD.
I wasn't able to make it take a long time, because all the
pathological cases I tried hit stack overflow first; but
I bet there are some others that do take a long time, so add
CHECK_FOR_INTERRUPTS() too.
Discussion: https://postgr.es/m/CAP_rww=waX2Oo6q+MbMSiZ9ktdj6eaJj0cQzNu=Ry2cCDij5fw@mail.gmail.com
GetLocaleInfoEx() can fail on strings that setlocale() was perfectly
happy with. A common way for that to happen is if the locale string
is actually a Unix-style string, say "et_EE.UTF-8". In that case,
what's after the dot is an encoding name, not a Windows codepage number;
blindly treating it as a codepage number led to failure, with a fairly
silly error message. Hence, check to see if what's after the dot is
all digits, and if not, treat it as a literal encoding name rather than
a codepage number. This will do the right thing with many Unix-style
locale strings, and produce a more sensible error message otherwise.
Somewhat independently of that, treat a zero (CP_ACP) result from
GetLocaleInfoEx() as meaning that we must use UTF-8 encoding.
Back-patch to all supported branches.
Juan José Santamaría Flecha
Discussion: https://postgr.es/m/24905.1585445371@sss.pgh.pa.us
The documentation says that the max length is 255 bytes, but
code inspection says it's actually 255 characters; and relevant
lengths are stored as uint16 so that that works.
These uint16 fields could be overflowed by excessively long input,
producing strange results. Complain for invalid input.
Likewise check for out-of-range values of the repeat counts in lquery.
(We don't try too hard on that one, notably not bothering to detect
if atoi's result has overflowed.)
Also detect length overflow in ltree_concat.
In passing, be more consistent about whether "syntax error" messages
include the type name. Also, clarify the documentation about what
the size limit is.
This has been broken for a long time, so back-patch to all supported
branches.
Nikita Glukhov, reviewed by Benjie Gillam and Tomas Vondra
Discussion: https://postgr.es/m/CAP_rww=waX2Oo6q+MbMSiZ9ktdj6eaJj0cQzNu=Ry2cCDij5fw@mail.gmail.com
In 9.4 I added support to use a historical snapshot in
ScanPgRelation(), while adding logical decoding. Unfortunately a
conflict with the concurrent removal of SnapshotNow was incorrectly
resolved, leading to an unregistered snapshot being used.
It is not correct to use an unregistered (or non-active) snapshot for
anything non-trivial, because catalog invalidations can cause the
snapshot to be invalidated.
Luckily it seems unlikely to actively cause problems in practice, as
ScanPgRelation() requires that we already have a lock on the relation,
we only look for a single row, and we don't appear to rely on the
result's tid to be correct. It however is clearly wrong and potential
negative consequences would likely be hard to find. So it seems worth
backpatching the fix, even without a concrete hazard.
Discussion: https://postgr.es/m/20200229052459.wzhqnbhrriezg4v2@alap3.anarazel.de
Backpatch: 9.5-
plpgsql_xact_cb ought to treat events XACT_EVENT_PARALLEL_COMMIT and
XACT_EVENT_PARALLEL_ABORT like XACT_EVENT_COMMIT and XACT_EVENT_ABORT
respectively, since its goal is to do process-local cleanup. This
oversight caused plpgsql's end-of-transaction cleanup to not get done
in parallel workers. Since a parallel worker will exit just after the
transaction cleanup, the effects of this are limited. I couldn't find
any case in the core code with user-visible effects, but perhaps there
are some in extensions. In any case it's wrong, so let's fix it before
it bites us not after.
In passing, add some comments around the handling of expression
evaluation resources in DO blocks. There's no live bug there, but it's
quite unobvious what's happening; at least I thought so. This isn't
related to the other issue, except that I found both things while poking
at expression-evaluation performance.
Back-patch the plpgsql_xact_cb fix to 9.5 where those event types
were introduced, and the DO-block commentary to v11 where DO blocks
gained the ability to issue COMMIT/ROLLBACK.
Discussion: https://postgr.es/m/10353.1585247879@sss.pgh.pa.us
When SaveSlotToPath() is called with elevel=LOG, the early exits didn't
release the slot's io_in_progress_lock.
This could result in a walsender being stuck on the lock forever. A
possible way to get into this situation is if the offending code paths
are triggered in a low disk space situation.
Author: Pavan Deolasee <pavan.deolasee@2ndquadrant.com>
Reported-by: Craig Ringer <craig@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/56a138c5-de61-f553-7e8f-6789296de785%402ndquadrant.com
src/port/getopt_long.c failed on such an argument, always seeing it
as an unrecognized switch. This is unhelpful; better is to treat such
an item as a non-switch argument. That behavior is what we find in
GNU's getopt_long(); it's what src/port/getopt.c does; and it is
required by POSIX for getopt(), which getopt_long() ought to be
generally a superset of. Moreover, it's expected by ecpg, which
intends an argument of "-" to mean "read from stdin". So fix it.
Also add some documentation about ecpg's behavior in this area, since
that was miserably underdocumented. I had to reverse-engineer it
from the code.
Per bug #16304 from James Gray. Back-patch to all supported branches,
since this has been broken forever.
Discussion: https://postgr.es/m/16304-c662b00a1322db7f@postgresql.org
This reverts commit cb2fd7eac285b1b0a24eeb2b8ed4456b66c5a09f. Per
numerous buildfarm members, it was incompatible with parallel query, and
a test case assumed LP64. Back-patch to 9.5 (all supported versions).
Discussion: https://postgr.es/m/20200321224920.GB1763544@rfd.leadboat.com
Until now, only selected bulk operations (e.g. COPY) did this. If a
given relfilenode received both a WAL-skipping COPY and a WAL-logged
operation (e.g. INSERT), recovery could lose tuples from the COPY. See
src/backend/access/transam/README section "Skipping WAL for New
RelFileNode" for the new coding rules. Maintainers of table access
methods should examine that section.
To maintain data durability, just before commit, we choose between an
fsync of the relfilenode and copying its contents to WAL. A new GUC,
wal_skip_threshold, guides that choice. If this change slows a workload
that creates small, permanent relfilenodes under wal_level=minimal, try
adjusting wal_skip_threshold. Users setting a timeout on COMMIT may
need to adjust that timeout, and log_min_duration_statement analysis
will reflect time consumption moving to COMMIT from commands like COPY.
Internally, this requires a reliable determination of whether
RollbackAndReleaseCurrentSubTransaction() would unlink a relation's
current relfilenode. Introduce rd_firstRelfilenodeSubid. Amend the
specification of rd_createSubid such that the field is zero when a new
rel has an old rd_node. Make relcache.c retain entries for certain
dropped relations until end of transaction.
Back-patch to 9.5 (all supported versions). This introduces a new WAL
record type, XLOG_GIST_ASSIGN_LSN, without bumping XLOG_PAGE_MAGIC. As
always, update standby systems before master systems. This changes
sizeof(RelationData) and sizeof(IndexStmt), breaking binary
compatibility for affected extensions. (The most recent commit to
affect the same class of extensions was
089e4d405d0f3b94c74a2c6a54357a84a681754b.)
Kyotaro Horiguchi, reviewed (in earlier, similar versions) by Robert
Haas. Heikki Linnakangas and Michael Paquier implemented earlier
designs that materially clarified the problem. Reviewed, in earlier
designs, by Andrew Dunstan, Andres Freund, Alvaro Herrera, Tom Lane,
Fujii Masao, and Simon Riggs. Reported by Martijn van Oosterhout.
Discussion: https://postgr.es/m/20150702220524.GA9392@svana.org
Back-patch a subset of commit 9155580fd5fc2a0cbb23376dfca7cd21f59c2c7b
to v11, v10, 9.6, and 9.5. Include the latest repairs to this function.
Use a new XLOG_FPI_MULTI value instead of reusing XLOG_FPI. That way,
if an older server reads WAL from this function, that server will PANIC
instead of applying just one page of the record. The next commit adds a
call to this function.
Discussion: https://postgr.es/m/20200304.162919.898938381201316571.horikyota.ntt@gmail.com
swap_relation_files() calls toast_get_valid_index() to find and lock
this index, just before swapping with the rebuilt TOAST index. The
latter function releases the lock before returning. Potential for
mischief is low; a concurrent session can issue ALTER INDEX ... SET
(fillfactor = ...), which is not alarming. Nonetheless, changing
pg_class.relfilenode without a lock is unconventional. Back-patch to
9.5 (all supported versions), because another fix needs this.
Discussion: https://postgr.es/m/20191226001521.GA1772687@rfd.leadboat.com
Remove an obsolete comment from AtEOXact_cleanup(). Restore formatting
of a comment in struct RelationData, mangled by the pgindent run in
commit 9af4159fce6654aa0e081b00d02bca40b978745c. Back-patch to 9.5 (all
supported versions), because another fix stacks on this.
This patch fixes the error message in get_major_server_version() to be
"could not parse version file", and uses the full file path name, rather
than just the data directory path.
Also, commit 4109bb5de4 added the cause of the failure to the "could
not open" error message, and improved quoting. This patch backpatches
the "could not open" cause to PG 12, where it was first widely used, and
backpatches the quoting fix in that patch to all supported releases.
Reported-by: Tom Lane
Discussion: https://postgr.es/m/87pne2w98h.fsf@wibble.ilmari.org
Author: Dagfinn Ilmari Mannsåker
Backpatch-through: 9.5
I noticed that we completely failed to document the restriction
that an "anyrange" result type has to be inferred from an "anyrange"
input. The docs also were less clear than they could be about the
relationship between "anyrange" and "anyarray".
It's been like this all along, so back-patch.
This extends the fixes made in commit 085b6b667 to other SRFs with the
same bug, namely pg_logdir_ls(), pgrowlocks(), pg_timezone_names(),
pg_ls_dir(), and pg_tablespace_databases().
Also adjust various comments and documentation to warn against
expecting to clean up resources during a ValuePerCall SRF's final
call.
Back-patch to all supported branches, since these functions were
all born broken.
Justin Pryzby, with cosmetic tweaks by me
Discussion: https://postgr.es/m/20200308173103.GC1357@telsasoft.com
resolve_polymorphic_tupdesc() and resolve_polymorphic_argtypes() failed to
cover the case of having to resolve anyarray given only an anyrange input.
The bug was masked if anyelement was also used (as either input or
output), which probably helps account for our not having noticed.
While looking at this I noticed that resolve_generic_type() would produce
the wrong answer if asked to make that same resolution. ISTM that
resolve_generic_type() is confusingly defined and overly complex, so
rather than fix it, let's just make funcapi.c do the specific lookups
it requires for itself.
With this change, resolve_generic_type() is not used anywhere, so remove
it in HEAD. In the back branches, leave it alone (complete with bug)
just in case any external code is using it.
While we're here, make some other refactoring adjustments in funcapi.c
with an eye to upcoming future expansion of the set of polymorphic types:
* Simplify quick-exit tests by adding an overall have_polymorphic_result
flag. This is about a wash now but will be a win when there are more
flags.
* Reduce duplication of code between resolve_polymorphic_tupdesc() and
resolve_polymorphic_argtypes().
* Don't bother to validate correct matching of anynonarray or anyenum;
the parser should have done that, and even if it didn't, just doing
"return false" here would lead to a very confusing, off-point error
message. (Really, "return false" in these two functions should only
occur if the call_expr isn't supplied or we can't obtain data type
info from it.)
* For the same reason, throw an elog rather than "return false" if
we fail to resolve a polymorphic type.
The bug's been there since we added anyrange, so back-patch to
all supported branches.
Discussion: https://postgr.es/m/6093.1584202130@sss.pgh.pa.us
If an index was explicitly set as replica identity index, this setting
was lost when a table was rewritten by ALTER TABLE. Because this
setting is part of pg_index but actually controlled by ALTER
TABLE (not part of CREATE INDEX, say), we have to do some extra work
to restore it.
Based-on-patch-by: Quan Zongliang <quanzongliang@gmail.com>
Reviewed-by: Euler Taveira <euler.taveira@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/c70fcab2-4866-0d9f-1d01-e75e189db342@gmail.com
RecordKnownAssignedTransactionIds() should never move
nextXid backwards. Before this commit, that could happen
if some other code path had advanced it without advancing
latestObservedXid.
One consequence is that a well timed XLOG_CHECKPOINT_ONLINE
could cause hot standby feedback messages to get confused
and report an xmin from a future epoch, potentially allowing
vacuum to run too soon on the primary.
Repair, by making sure RecordKnownAssignedTransactionIds()
can only move nextXid forwards.
In release 12 and master, this was already done by commit
2fc7af5e, which consolidated similar code and straightened
out this bug. Back-patch to supported releases before that.
Author: Eka Palamadai <ekanatha@amazon.com>
Discussion: https://postgr.es/m/98BB4805-D0A2-48E1-96F4-15014313EADC@amazon.com
pg_dump is oblivious to this kind of dependency, so they're lost on
dump/restores (and pg_upgrade). Have pg_dump emit ALTER lines so that
they're preserved. Add some pg_dump tests for the whole thing, also.
Reviewed-by: Tom Lane (offlist)
Reviewed-by: Ibrar Ahmed
Reviewed-by: Ahsan Hadi (who also reviewed commit 899a04f5ed61)
Discussion: https://postgr.es/m/20200217225333.GA30974@alvherre.pgsql
If the command is attempted for an extension that the object already
depends on, silently do nothing.
In particular, this means that if a database containing multiple such
entries is dumped, the restore will silently do the right thing and
record just the first one. (At least, in a world where pg_dump does
dump such entries -- which it doesn't currently, but it will.)
Backpatch to 9.6, where this kind of dependency was introduced.
Reviewed-by: Ibrar Ahmed, Tom Lane (offlist)
Discussion: https://postgr.es/m/20200217225333.GA30974@alvherre.pgsql
Previously, event triggers were restored just after regular triggers
(and FK constraints, which are basically triggers). This is risky
since an event trigger, once installed, could interfere with subsequent
restore commands. Worse, because event triggers don't have any
particular dependencies on any post-data objects, a parallel restore
would consider them eligible to be restored the moment the post-data
phase starts, allowing them to also interfere with restoration of a
whole bunch of objects that would have been restored before them in
a serial restore. There's no way to completely remove the risk of a
misguided event trigger breaking the restore, since if nothing else
it could break other event triggers. But we can certainly push them
to later in the process to minimize the hazard.
To fix, tweak the RestorePass mechanism introduced by commit 3eb9a5e7c
so that event triggers are handled as part of the post-ACL processing
pass (renaming the "REFRESH" pass to "POST_ACL" to reflect its more
general use). This will cause them to restore after everything except
matview refreshes, which seems OK since matview refreshes really ought
to run in the post-restore state of the database. In a parallel
restore, event triggers and matview refreshes might be intermixed,
but that seems all right as well.
Also update the code and comments in pg_dump_sort.c so that its idea
of how things are sorted agrees with what actually happens due to
the RestorePass mechanism. This is mostly cosmetic: it'll affect the
order of objects in a dump's TOC, but not the actual restore order.
But not changing that would be quite confusing to somebody reading
the code.
Back-patch to all supported branches.
Fabrízio de Royes Mello, tweaked a bit by me
Discussion: https://postgr.es/m/CAFcNs+ow1hmFox8P--3GSdtwz-S3Binb6ZmoP6Vk+Xg=K6eZNA@mail.gmail.com
Previously "waiting" could appear twice via PS in case of lock conflict
in hot standby mode. Specifically this issue happend when the delay
in WAL application determined by max_standby_archive_delay and
max_standby_streaming_delay had passed but it took more than 500 msec
to cancel all the conflicting transactions. Especially we can observe this
easily by setting those delay parameters to -1.
The cause of this issue was that WaitOnLock() and
ResolveRecoveryConflictWithVirtualXIDs() added "waiting" to
the process title in that case. This commit prevents
ResolveRecoveryConflictWithVirtualXIDs() from reporting waiting
in case of lock conflict, to fix the bug.
Back-patch to all back branches.
Author: Masahiko Sawada
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/CA+fd4k4mXWTwfQLS3RPwGr4xnfAEs1ysFfgYHvmmoUgv6Zxvmg@mail.gmail.com
At the end of recovery, standby mode is turned off to re-fetch the last
valid record from archive or pg_wal. Previously, if recovery target was
reached and standby mode was turned off while the current WAL source
was stream, recovery could try to retrieve WAL file containing the last
valid record unexpectedly from stream even though not in standby mode.
This caused an assertion failure. That is, the assertion test confirms that
WAL file should not be retrieved from stream if standby mode is not true.
This commit moves back the current WAL source to archive if it's stream
even though not in standby mode, to avoid that assertion failure.
This issue doesn't cause the server to crash when built with assertion
disabled. In this case, the attempt to retrieve WAL file from stream not
in standby mode just fails. And then recovery tries to retrieve WAL file
from archive or pg_wal.
Back-patch to all supported branches.
Author: Kyotaro Horiguchi
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/20200227.124830.2197604521555566121.horikyota.ntt@gmail.com
Previously the documentation explains that WAL segment files
start at 000000010000000000000000. But the first WAL segment file
that initdb creates is 000000010000000000000001 not
000000010000000000000000. This change was caused by old
commit 8c843fff2d, but the documentation had not been updated
a long time.
Back-patch to all supported branches.
Author: Fujii Masao
Reviewed-by: David Zhang
Discussion: https://postgr.es/m/CAHGQGwHOmGe2OqGOmp8cOfNVDivq7dbV74L5nUGr+3eVd2CU2Q@mail.gmail.com
The original coding failed to properly quote those arguments, leading to
failures when using quotes in the values used. As the quoting can be
encoding-sensitive, the connection to the backend needs to be taken
before applying the correct quoting.
Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/20200214041004.GB1998@paquier.xyz
Backpatch-through: 9.5
The function hash table keys made by compute_function_hashkey() failed
to distinguish event-trigger call context from regular call context.
This meant that once we'd successfully made a hash entry for an event
trigger (either by validation, or by normal use as an event trigger),
an attempt to call the trigger function as a plain function would
find this hash entry and thereby bypass the you-can't-do-that check in
do_compile(). Thus we'd attempt to execute the function, leading to
strange errors or even crashes, depending on function contents and
server version.
To fix, add an isEventTrigger field to PLpgSQL_func_hashkey,
paralleling the longstanding infrastructure for regular triggers.
This fits into what had been pad space, so there's no risk of an ABI
break, even assuming that any third-party code is looking at these
hash keys. (I considered replacing isTrigger with a PLpgSQL_trigtype
enum field, but felt that that carried some API/ABI risk. Maybe we
should change it in HEAD though.)
Per bug #16266 from Alexander Lakhin. This has been broken since
event triggers were invented, so back-patch to all supported branches.
Discussion: https://postgr.es/m/16266-fcd7f838e97ba5d4@postgresql.org
VACUUM may truncate heap in several batches. The activity report
is logged for each batch, and contains the number of pages in the table
before and after the truncation, and also the elapsed time during
the truncation. Previously the elapsed time reported in each batch was
the total elapsed time since starting the truncation until finishing
each batch. For example, if the truncation was processed dividing into
three batches, the second batch reported the accumulated time elapsed
during both first and second batches. This is strange and confusing
because the number of pages in the table reported together is not
total. Instead, each batch should report the time elapsed during
only that batch.
The cause of this issue was that the resource usage snapshot was
initialized only at the beginning of the truncation and was never
reset later. This commit fixes the issue by changing VACUUM so that
the resource usage snapshot is reset at each batch.
Back-patch to all supported branches.
Reported-by: Tatsuhito Kasahara
Author: Tatsuhito Kasahara
Reviewed-by: Masahiko Sawada, Fujii Masao
Discussion: https://postgr.es/m/CAP0=ZVJsf=NvQuy+QXQZ7B=ZVLoDV_JzsVC1FRsF1G18i3zMGg@mail.gmail.com
Manifested as
ERROR: subtransaction logged without previous top-level txn record
this check forbids legit behaviours like
- First xl_xact_assignment record is beyond reading, i.e. earlier
restart_lsn.
- After restart_lsn there is some change of a subxact.
- After that, there is second xl_xact_assignment (for another subxact)
revealing the relationship between top and first subxact.
Such a transaction won't be streamed anyway because we hadn't seen it in
full. Saying for sure whether xact of some record encountered after
the snapshot was deserialized can be streamed or not requires to know
whether it wrote something before deserialization point --if yes, it
hasn't been seen in full and can't be decoded. Snapshot doesn't have such
info, so there is no easy way to relax the check.
Reported-by: Hsu, John
Diagnosed-by: Arseny Sher
Author: Arseny Sher, Amit Kapila
Reviewed-by: Amit Kapila, Dilip Kumar
Backpatch-through: 9.5
Discussion: https://postgr.es/m/AB5978B2-1772-4FEE-A245-74C91704ECB0@amazon.com
This was unaccountably omitted in the original RLS patch.
The SQL syntax is basically the same as for comments on triggers,
so crib code from dumpTrigger().
Per report from Marc Munro. Back-patch to all supported branches.
Discussion: https://postgr.es/m/1581889298.18009.15.camel@bloodnok.com
The GRANTED BY clause in GRANT/REVOKE ROLE has been there since 2005
but was never documented. I'm not sure now whether that was just an
oversight or was intentional (given the limited capability of the
option). But seeing that pg_dumpall does emit code that uses this
option, it seems like not documenting it at all is a bad idea.
Also, when we upgraded the syntax to allow CURRENT_USER/SESSION_USER
as the privilege recipient, the role form of GRANT was incorrectly
not modified to show that, and REVOKE's docs weren't touched at all.
Although I'm not that excited about GRANTED BY, the other oversight
seems serious enough to justify a back-patch.
Discussion: https://postgr.es/m/3070.1581526786@sss.pgh.pa.us