Usage of ReadNextXLogRecord()'s first_record parameter for error
reporting isn't always correct. For instance, GetWALRecordsInfo()
and GetWalStats() read multiple records, yet first_record is always
passed as the LSN of the first record, so failures while reading
later WAL records get reported with the wrong LSN.
The correct parameter to use for error reports in case of WAL
reading failures is xlogreader->EndRecPtr. This change fixes it.
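For illustration, the reports now point at the reader's current
position, roughly as below (a sketch only; the message wording and
surrounding code are illustrative, not the committed change):
    record = XLogReadRecord(xlogreader, &errormsg);
    if (record == NULL)
        ereport(ERROR,
                (errcode_for_file_access(),
                 errmsg("could not read WAL at %X/%X: %s",
                        LSN_FORMAT_ARGS(xlogreader->EndRecPtr),
                        errormsg ? errormsg : "")));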
While at it, remove an unnecessary Assert in the pg_walinspect code.
Reported-by: Robert Haas
Author: Bharath Rupireddy
Reviewed-by: Robert Haas
Discussion: https://www.postgresql.org/message-id/CA%2BTgmoZAOGzPUifrcZRjFZ2vbtcw3mp-mN6UgEoEcQg6bY3OVg%40mail.gmail.com
Backpatch-through: 15
Previously, we relied on HEAP2_NEW_CID and XACT_INVALIDATION records
to know whether a transaction had modified the catalog, and that
information is not serialized to the snapshot. Therefore, after a
restart, if logical decoding decodes only the commit record of a
transaction that actually modified a catalog, we miss adding its XID
to the snapshot and end up looking at catalogs with the wrong
snapshot.
To fix this problem, this commit changes the snapshot builder so that
it remembers the last-running-xacts list of the decoded RUNNING_XACTS
record after restoring the previously serialized snapshot. Then, we
mark a transaction as containing catalog changes if it is in the list
of initial running transactions and its commit record has
XACT_XINFO_HAS_INVALS. To avoid ABI breakage, we store the array of
initial running transactions in the static variables
InitialRunningXacts and NInitialRunningXacts, instead of storing it
in SnapBuild or ReorderBuffer.
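A sketch of that check (illustrative only; the committed code paths
and surrounding variables may differ, and the array is assumed to be
kept sorted):
    /* running at restart + commit carries invalidations
     * => treat the transaction as having catalog changes */
    if (NInitialRunningXacts > 0 &&
        bsearch(&xid, InitialRunningXacts, NInitialRunningXacts,
                sizeof(TransactionId), xidComparator) != NULL &&
        (parsed->xinfo & XACT_XINFO_HAS_INVALS) != 0)
        ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, buf->origptr);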
This approach can produce false positives: we could end up adding to
the snapshot a transaction that didn't change the catalog, since we
cannot tell whether a transaction has catalog changes just by looking
at its COMMIT record. That record doesn't say which (sub)transaction
has catalog changes, and XACT_XINFO_HAS_INVALS doesn't necessarily
indicate catalog changes. But that isn't a problem, since the
snapshot built during decoding is used only to read system catalogs.
On the master branch, we took a more future-proof approach by writing
catalog-modifying transactions to the serialized snapshot, which
avoids the above false positives. But that cannot be back-patched
because it changes SnapBuild.
Reported-by: Mike Oh
Author: Masahiko Sawada
Reviewed-by: Amit Kapila, Shi yu, Takamichi Osumi, Kyotaro Horiguchi, Bertrand Drouvot, Ahsan Hadi
Backpatch-through: 10
Discussion: https://postgr.es/m/81D0D8B0-E7C4-4999-B616-1E5004DBDCD2%40amazon.com
When inserting into a view referencing a foreign table that has WITH
CHECK OPTION constraints, postgres_fdw in single-insert mode retrieves
the data that was actually inserted on the remote side, so that the
WITH CHECK OPTION constraints are enforced locally against that data.
In batch-insert mode, however, it cannot currently retrieve that data
(except for the row first inserted through the view), so the
constraints are enforced against the data passed from the core
(except for the first-inserted row). This led to incorrect results
when inserting into a view referencing a foreign table for which a
remote BEFORE ROW INSERT trigger changes the rows inserted through
the view so that they violate the view's WITH CHECK OPTION
constraint. Also, such a query caused an assertion failure in
assert-enabled builds.
Fix these by disabling batch insertion when inserting into such a view.
Back-patch to v14 where batch insertion was added.
Discussion: https://postgr.es/m/CAPmGK17LpbTZs4m4a_6THP54UBeK9fHvX8aVVA%2BC6yEZDZwQcg%40mail.gmail.com
We've heard a couple of reports of people having trouble with
multi-gigabyte-sized query-texts files. It occurred to me that on
32-bit platforms, there could be an issue with integer overflow
of calculations associated with the total query text size.
Address that with several changes:
1. Limit pg_stat_statements.max to INT_MAX / 2 not INT_MAX.
The hashtable code will bound it to that anyway unless "long"
is 64 bits. We still need overflow guards on its use, but
this helps.
2. Add a check to prevent extending the query-texts file to
more than MaxAllocHugeSize. If it got that big, qtext_load_file
would certainly fail, so there's not much point in allowing it.
Without this, we'd need to consider whether extent, query_offset,
and related variables should be off_t rather than size_t.
3. Adjust the comparisons in need_gc_qtexts() to be done in 64-bit
arithmetic on all platforms. It appears possible that under duress
those multiplications could overflow 32 bits, yielding a false
conclusion that we need to garbage-collect the texts file, which
could lead to repeatedly garbage-collecting after every hash table
insertion. (A sketch of the idea follows this list.)
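For instance, a comparison along these lines stays safe on 32-bit
builds by widening before multiplying (a sketch only; the variable
names are illustrative, not the actual pg_stat_statements fields):
    /* force 64-bit arithmetic so the product cannot wrap around */
    if ((int64) extent < (int64) mean_query_len * pgss_max * 2)
        return false;       /* texts file not bloated enough to gc */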
Per report from Bruno da Silva. I'm not convinced that these
issues fully explain his problem; there may be some other bug that's
contributing to the query-texts file becoming so large in the first
place. But it did get that big, so #2 is a reasonable defense,
and #3 could explain the reported performance difficulties.
(See also commit 8bbe4cbd9, which addressed some related bugs.
The second Discussion: link is the thread that led up to that.)
This issue is old, and is primarily a problem for old platforms,
so back-patch.
Discussion: https://postgr.es/m/CAB+Nuk93fL1Q9eLOCotvLP07g7RAv4vbdrkm0cVQohDVMpAb9A@mail.gmail.com
Discussion: https://postgr.es/m/5601D354.5000703@BlueTreble.com
Allow for the possible presence of a SQLSTATE code in the expected
warning message, similarly to b998196bb and 19408aae7 (although
here I see no need to allow more than one specific SQLSTATE).
Per gripe from Andrew Dunstan.
Discussion: https://postgr.es/m/c550ac53-5db5-3958-1798-50bae3d9af71@dunslane.net
In commit 7c34555f8, I overlooked the need to configure SSPI
on Windows to allow login as the non-superuser role.
Fix that by adding an auth_extra/--create-role incantation
(which, oddly enough, doesn't actually create the role).
Per buildfarm.
While here, upgrade the mechanism for temporarily setting
$ENV{PGUSER}, as per recommendation from ilmari.
Discussion: https://postgr.es/m/87edy7j1zz.fsf@wibble.ilmari.org
We weren't exercising the session_preload_libraries option in any
meaningful way. auto_explain is a good testbed for doing so, since
it's one of the primary use-cases for session_preload_libraries.
Hence, adjust its TAP test to load the library via
session_preload_libraries not shared_preload_libraries. While at it,
feed test-specific settings to the backend via PGOPTIONS rather than
tediously rewriting postgresql.conf.
Also, since auto_explain has some PGC_SUSET parameters, we can use it
to provide a test case for the permissions-checking bug just fixed
by commit b35617de3.
Back-patch to v15 so that we have coverage for the permissions issue
in that branch too. To do that, I back-patched the refactoring
recently done by commit 550bc0a6c.
Dagfinn Ilmari Mannsåker and Tom Lane
Discussion: https://postgr.es/m/CABwTF4VEpwTHhRQ+q5MiC5ucngN-whN-PdcKeufX7eLSoAfbZA@mail.gmail.com
When postgres_fdw begins an asynchronous data fetch, it submits a
FETCH query using PQsendQuery(). If PQsendQuery() fails and returns 0,
postgres_fdw should report an error. But previously, postgres_fdw
reported an error only when the return value was less than 0, even
though PQsendQuery() never returns values other than 0 and 1.
Therefore postgres_fdw could not handle a failure to send the FETCH
query during an asynchronous data fetch.
This commit fixes postgres_fdw so that it reports an error
when PQsendQuery() returns 0.
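The fixed check amounts to something like the following (a sketch;
postgres_fdw's actual error-reporting call is not spelled out here):
    /* PQsendQuery() returns 1 on success and 0 on failure */
    if (PQsendQuery(conn, sql) == 0)
        ereport(ERROR,
                (errcode(ERRCODE_CONNECTION_FAILURE),
                 errmsg("could not send query: %s",
                        PQerrorMessage(conn))));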
Back-patch to v14, where asynchronous execution support was added to
postgres_fdw.
Author: Fujii Masao
Reviewed-by: Japin Li, Tom Lane
Discussion: https://postgr.es/m/b187a7cf-d4e3-5a32-4d01-8383677797f3@oss.nttdata.com
Detail and hint messages should be full sentences and should end with a
period, but some of the messages newly-introduced in v15 did not follow
that.
Author: Justin Pryzby
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/20220719120948.GF12702@telsasoft.com
Backpatch-through: 15
As $gzip is embedded in postgresql.conf, \ needs to be escaped;
otherwise guc.c will take it as a string escape. Similarly, if "$gzip"
contains spaces, the prior incantation fails. Both of these are common
on Windows.
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/ce1b6eb3-5736-6f38-9775-b7020128b8d8@enterprisedb.com
Backpatch: 15-, where the test was added in 027fa0fd726
The motivation for this is to ensure successful transmission of the
values of constants of regconfig and other reg* types. The remote
will be reading them with search_path = 'pg_catalog', so schema
qualification is necessary when referencing objects in other schemas.
Per bug #17483 from Emmanuel Quincerot. Back-patch to all supported
versions. (There's some other stuff to do here, but it's less
back-patchable.)
Discussion: https://postgr.es/m/1423433.1652722406@sss.pgh.pa.us
Commit a117cebd638dd02e5c2e791c25e43745f233111b used the original userid
for ACL checks located directly in DefineIndex(), but it still adopted
the table owner userid for more ACL checks than intended. That broke
dump/reload of indexes that refer to an operator class, collation, or
exclusion operator in a schema other than "public" or "pg_catalog".
Back-patch to v10 (all supported versions), like the earlier commit.
Nathan Bossart and Noah Misch
Discussion: https://postgr.es/m/f8a4105f076544c180a87ef0c4822352@stmuk.bayern.de
The macro is being applied to a TOAST pointer, not a varlena header.
Therefore the use of VARATT_IS_COMPRESSED() is wrong. We can check
VARATT_EXTERNAL_IS_COMPRESSED(), but then we don't need the length
check that follows.
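In other words, the check should look roughly like this (a sketch;
the variable holding the external datum is illustrative):
    /* check compression on the TOAST pointer, not on a varlena header */
    struct varatt_external toast_pointer;
    bool        compressed;

    VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr);
    compressed = VARATT_EXTERNAL_IS_COMPRESSED(toast_pointer);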
Report and fix by Kyotaro Horiguchi.
Discussion: http://postgr.es/m/20220517.162719.1671558681467343711.horikyota.ntt@gmail.com
Since a117cebd6, some older gcc versions issue "variable may be used
uninitialized in this function" complaints for brin_summarize_range.
Silence that using the same coding pattern as in bt_index_check_internal;
arguably, a117cebd6 had too narrow a view of which compilers might give
trouble.
Nathan Bossart and Tom Lane. Back-patch as the previous commit was.
Discussion: https://postgr.es/m/20220601163537.GA2331988@nathanxps13
Currently, preloaded libraries are expected to request additional
shared memory and LWLocks in _PG_init(). However, it is not unusual
for such requests to depend on MaxBackends, which won't be
initialized at that time. Such requests could also depend on GUCs
that other modules might change. This introduces a new hook where
modules can safely use MaxBackends and GUCs to request additional
shared memory and LWLocks.
Furthermore, this change restricts requests for shared memory and
LWLocks to this hook. Previously, libraries could make requests
until the size of the main shared memory segment was calculated.
Unlike before, we no longer silently ignore requests received at
invalid times. Instead, we now raise a FATAL error if someone tries
to request additional shared memory or LWLocks outside of the hook.
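With this change, an extension asks for its shared resources from the
new shmem_request_hook rather than directly in _PG_init(); roughly as
below (a sketch: the extension name, entry struct, and sizing are made
up for illustration):
    static shmem_request_hook_type prev_shmem_request_hook = NULL;
    static void myext_shmem_request(void);

    void
    _PG_init(void)
    {
        prev_shmem_request_hook = shmem_request_hook;
        shmem_request_hook = myext_shmem_request;
    }

    static void
    myext_shmem_request(void)
    {
        if (prev_shmem_request_hook)
            prev_shmem_request_hook();

        /* MaxBackends is initialized by the time this hook runs */
        RequestAddinShmemSpace(mul_size(MaxBackends, sizeof(MyEntry)));
        RequestNamedLWLockTranche("myext", 1);
    }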
Nathan Bossart and Julien Rouhaud
Discussion: https://postgr.es/m/20220412210112.GA2065815%40nathanxps13
Discussion: https://postgr.es/m/Yn2jE/lmDhKtkUdr@paquier.xyz
Expand the comment about the parallel_commit option to mention that the
default is false.
Also, since the comment about alteration of the keep_connections option,
which was located above the expanded comment, holds true for the
parallel_commit option, rewrite it to reflect this, and move it to after
the expanded comment.
Follow-up for commit 04e706d42.
Discussion: https://postgr.es/m/CAPmGK16Kg2Bf90sqzcZ4YM5cN_G-4h7wFUS01qQpqNB%2B2BG5_w%40mail.gmail.com
The code for unloading a library has been commented-out for over 12
years, ever since commit 602a9ef5a7c60151e10293ae3c4bb3fbb0132d03, and we're
no closer to supporting it now than we were back then.
Nathan Bossart, reviewed by Michael Paquier and by me.
Discussion: http://postgr.es/m/Ynsc9bRL1caUSBSE@paquier.xyz
When a feature enumerates relations and runs functions associated with
all found relations, the feature's user shall not need to trust every
user having permission to create objects. BRIN-specific functionality
in autovacuum neglected to account for this, as did pg_amcheck and
CLUSTER. An attacker having permission to create non-temp objects in at
least one schema could execute arbitrary SQL functions under the
identity of the bootstrap superuser. CREATE INDEX (not a
relation-enumerating operation) and REINDEX protected themselves too
late. This change extends to the non-enumerating amcheck interface.
Back-patch to v10 (all supported versions).
Sergey Shinderuk, reviewed (in earlier versions) by Alexander Lakhin.
Reported by Alexander Lakhin.
Security: CVE-2022-1552
The PXE_CIPHER_INIT error is used to report initialization errors, so
appending a question mark to the error isn't entirely accurate (using
a space before the question mark doubly so).
Discussion: https://postgr.es/m/C89D932C-501E-4473-9750-638CFCD9095E@yesql.se
Report OpenSSL errors during initialization as PXE_CIPHER_INIT, since
that's what they are, rather than as generic unknown errors. This also
converts the last users of the generic error, so it can be removed.
Discussion: http://postgr.es/m/C89D932C-501E-4473-9750-638CFCD9095E@yesql.se
Instability in the test for pg_walinspect revealed that
pg_get_wal_records_info_till_end_of_wal(x) would try to decode all the
records with a start LSN earlier than the flush LSN, even though that
might include a partial record at the end of the range. In that case,
read_local_xlog_page_no_wait() would return NULL when it tried to read
past the flush LSN, which would be interpreted as an error by the
caller. That caused a test failure only on a BF animal that had been
restarted recently, but could be expected to happen in the wild quite
easily depending on the alignment of various parameters.
Fix by using private data in read_local_xlog_page_no_wait() to signal
end-of-wal to the caller, so that it can be properly distinguished
from a real error.
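Roughly, the callback now reports the condition through the reader's
private_data instead of failing outright; a sketch of the approach
(struct and field names here are illustrative):
    typedef struct
    {
        bool        end_of_wal;     /* true once the flush LSN is hit */
    } ReadLocalXLogPageNoWaitPrivate;

    /* in the caller, after XLogReadRecord() returns NULL */
    private_data = (ReadLocalXLogPageNoWaitPrivate *)
        xlogreader->private_data;
    if (private_data->end_of_wal)
        break;                      /* clean end of WAL, not an error */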
Discussion: https://postgr.es/m/Ymd/e5eeZMNAkrXo%40paquier.xyz
Discussion: https://postgr.es/m/111657.1650910309@sss.pgh.pa.us
Authors: Thomas Munro, Bharath Rupireddy.
mark_async_capable_plan(), which is called from create_append_plan()
to determine whether subplans are async-capable, failed to take into
account that the subplan created from a given subpath might include a
gating Result node if the subpath is a SubqueryScanPath or
ForeignPath. This caused a segmentation fault when the subplan
created from a SubqueryScanPath includes the Result node, and caused
ExecAsyncRequest() to throw an error about an unrecognized node type
when the subplan created from a ForeignPath includes the Result node;
in the latter case the Result node was unintentionally considered
async-capable, even though we don't currently support executing
Result nodes asynchronously. Fix by modifying
mark_async_capable_plan() to disable asynchronous execution in such
cases. Also, adjust the code in the ProjectionPath case in
mark_async_capable_plan() for consistency with the other cases, and
adjust/improve the comments there.
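The added guard is essentially of this shape (a sketch; the real
switch arms in mark_async_capable_plan() differ in detail):
    case T_SubqueryScanPath:
        /* a gating Result on top means the plan is not a SubqueryScan
         * and cannot be executed asynchronously */
        if (!IsA(plan, SubqueryScan))
            return false;
        break;
    case T_ForeignPath:
        /* likewise for a foreign scan */
        if (!IsA(plan, ForeignScan))
            return false;
        break;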
is_async_capable_path(), added in commit 27e1f1456 and rewritten into
mark_async_capable_plan() in a later commit, has the same issue,
causing the execution-time error mentioned above; so back-patch to
v14, where the former commit went in.
Per report from Justin Pryzby.
Etsuro Fujita, reviewed by Zhihong Yu and Justin Pryzby.
Discussion: https://postgr.es/m/20220408124338.GK24419%40telsasoft.com
This reverts commits 2c902bb and ccfbd92. Per buildfarm members
kestrel, rorqual and calliphoridae, the assertions checking that a TLI
history file should not exist when created by a WAL receiver have been
failing, and switching to durable_rename() over durable_rename_excl()
would cause the newest TLI history file to overwrite the existing one.
We need to think harder about such cases, so revert the new logic for
now.
Note that all the failures have been reported in the test
025_stuck_on_old_timeline.
Discussion: https://postgr.es/m/511362.1651116498@sss.pgh.pa.us
durable_rename_excl() attempts to avoid overwriting any existing files
by using link() and unlink(), falling back to rename() on some platforms
(e.g., Windows where link() followed by unlink() is not concurrent-safe,
see 909b449). Most callers of durable_rename_excl() use it just in
case there is an existing file, but it happens that none of them ever
expects a target file to exist (WAL segment recycling, creation of
timeline history files, and basic_archive).
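For context, the difference between the two primitives boils down to
the following (durability fsyncs and error handling omitted):
    /* durable_rename_excl(): refuses to clobber, but can leave two links */
    link(oldfile, newfile);     /* creates a second hard link */
    unlink(oldfile);            /* a crash in between leaves both links */

    /* durable_rename(): atomically replaces newfile if it exists */
    rename(oldfile, newfile);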
basic_archive used durable_rename_excl() to avoid overwriting an
archive concurrently created by another server. However, a stat() call
a couple of lines above already avoids overwriting an existing
archive, so note that this change opens a small TOCTOU window in this
module between the stat() call and durable_rename().
Furthermore, as mentioned in the top comment of durable_rename_excl(),
this routine can result in multiple hard links to the same file and
data corruption, with two or more links to the same file in pg_wal/ if
a crash happens before the unlink() call during WAL recycling.
Specifically, this would produce links to the same file for the
current WAL file and the next one, because the half-recycled WAL file
would be re-recycled during crash recovery at a follow-up cluster
restart.
This change replaces all calls to durable_rename_excl() with
durable_rename(). This removes the protection against accidentally
overwriting an existing file, but some platforms are already living
without it, and none of these code paths ever expects an existing
file (a couple of assertions are added to check for that, just in
case).
This is a bug fix, but given the unlikeliness of the problem, which
requires one or more crashes at an exceptionally bad moment, no
backpatch is done. This could be revisited in the future.
Author: Nathan Bossart
Reviewed-by: Robert Haas, Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/20220407182954.GA1231544@nathanxps13
Previously, we allowed this, but such triggers might query the table
being inserted into and act differently if the tuples that have
already been processed and prepared for insertion are not there yet,
so disable batch insertion in such cases.
Back-patch to v14 where batch insert was added.
Discussion: https://postgr.es/m/CAPmGK16_uPqsmgK0-LpLSUk54_BoK13bPrhxhfjSoSTVz414hA%40mail.gmail.com
We have some streaming tests that rely on the size of changes, which
can fail if background activity such as auto-analyze generates
additional changes like invalidation messages. Avoid such failures by
increasing autovacuum_naptime to a reasonably high value (1d).
Author: Dilip Kumar
Backpatch-through: 14
Discussion: https://postgr.es/m/1958043.1650129119@sss.pgh.pa.us
Getting an all-zero page from get_raw_page() is considered a valid
case by the buffer manager. It can happen, for example, when finding
a corrupted page with zero_damaged_pages enabled (using
zero_damaged_pages to look at corrupted pages does happen), or after
a crash when a relation file is extended before any WAL for its new
data is generated (before a vacuum or autovacuum job comes in to do
some cleanup).
However, none of the pageinspect functions, be it for the index AMs
(except hash, which has its own idea of new pages), heap, the FSM, or
the page header, has ever worked with all-zero pages, causing various
crashes when going through the page internals.
This commit changes all the pageinspect functions to cope with
all-zero pages, choosing to return NULL, or no rows for SRFs, when
finding a new page. get_raw_page() still works the same way,
returning a batch of zeros in the bytea of the page retrieved. A hard
error could be used instead, but returning NULL, while a more
invasive change, is useful when scanning relation files in full to
get a batch of results for a single relation in one query. Tests are
added for all the code paths impacted.
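The pattern applied across the pageinspect functions is roughly the
following (a sketch; each function returns NULL or, for SRFs, no
rows):
    /* all-zero (new) page: hand back NULL instead of dissecting it */
    if (PageIsNew(page))
        PG_RETURN_NULL();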
Reported-by: Daria Lepikhova
Author: Michael Paquier
Discussion: https://postgr.es/m/561e187b-3549-c8d5-03f5-525c14e65bd0@postgrespro.ru
Backpatch-through: 10
Get rid of the separate "FATAL" log level, as it was applied
so inconsistently as to be meaningless. This mostly involves
s/pg_log_fatal/pg_log_error/g.
Create a macro pg_fatal() to handle the common use-case of
pg_log_error() immediately followed by exit(1). Various
modules had already invented either this or equivalent macros;
standardize on pg_fatal() and apply it where possible.
Invent the ability to add "detail" and "hint" messages to a
frontend message, much as we have long had in the backend.
Except where rewording was needed to convert existing coding
to detail/hint style, I have (mostly) resisted the temptation
to change existing message wording.
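For reference, the new macro boils down to the following, and the
detail/hint capability is used along these lines (the message text is
made up; the exact macro spellings in common/logging.h may differ
slightly):
    #define pg_fatal(...) do { pg_log_error(__VA_ARGS__); exit(1); } while (0)

    /* a frontend message with detail and hint, backend-style */
    pg_log_error("could not open file \"%s\"", path);
    pg_log_error_detail("The file is expected to contain the control data.");
    pg_log_error_hint("Check that the data directory path is correct.");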
Patch by me. Design and patch reviewed at various stages by
Robert Haas, Kyotaro Horiguchi, Peter Eisentraut and
Daniel Gustafsson.
Discussion: https://postgr.es/m/1363732.1636496441@sss.pgh.pa.us
This commit adds two new columns to pg_stat_statements, called
temp_blk_read_time and temp_blk_write_time. Those columns respectively
show the time spent to read and write temporary file blocks on disk,
whose tracking has been added in efb0ef9. This information is
available when track_io_timing is enabled, like blk_read_time and
blk_write_time.
pg_stat_statements is updated to version 1.10 as a result of the
newly-added columns. Tests for the upgrade path 1.9->1.10 are added.
PGSS_FILE_HEADER is bumped for the new stats file format.
Author: Masahiko Sawada
Reviewed-by: Georgios Kokolatos, Melanie Plageman, Julien Rouhaud,
Ranier Vilela
Discussion: https://postgr.es/m/CAD21AoAJgotTeP83p6HiAGDhs_9Fw9pZ2J=_tYTsiO5Ob-V5GQ@mail.gmail.com
- subscriber stats reset path was untested
- slot stats reset path for all slots was untested
- pg_stat_database.sessions etc. was untested
- pg_stat_reset_shared() was untested, for any kind of shared stats
- pg_stat_reset() was untested
Author: Melanie Plageman <melanieplageman@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
This reverts a sequence of commits, implementing features related to
logical decoding and replication of sequences:
- 0da92dc530c9251735fc70b20cd004d9630a1266
- 80901b32913ffa59bf157a4d88284b2b3a7511d9
- b779d7d8fdae088d70da5ed9fcd8205035676df3
- d5ed9da41d96988d905b49bebb273a9b2d6e2915
- a180c2b34de0989269fdb819bff241a249bf5380
- 75b1521dae1ff1fde17fda2e30e591f2e5d64b6a
- 2d2232933b02d9396113662e44dca5f120d6830e
- 002c9dd97a0c874fd1693a570383e2dd38cd40d5
- 05843b1aa49df2ecc9b97c693b755bd1b6f856a9
The implementation has issues, mostly due to combining transactional
and non-transactional behavior of sequences. It's not clear how this
could be fixed, but it will require reworking a significant part of
the patch.
Discussion: https://postgr.es/m/95345a19-d508-63d1-860a-f5c2f41e8d40@enterprisedb.com
In the stats collector days it was hard to write tests for the stats
system, because delivery of stats messages over UDP was fundamentally
neither synchronous nor guaranteed. Now we can easily force pending
stats updates to be flushed synchronously.
This moves stats.sql into a parallel group; there isn't a reason for
it to run in isolation anymore. And it may shake out some bugs.
Bumps catversion.
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
With stats now being stored in shared memory, the GUC isn't needed
anymore. However, the pg_stat_tmp directory and PG_STAT_TMP_DIR define are
kept, as pg_stat_statements (and some out-of-core extensions) store data in
it.
Docs will be updated in a subsequent commit, together with the other pending
docs updates due to shared memory stats.
Author: Andres Freund <andres@anarazel.de>
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220330233550.eiwsbearu6xhuqwe@alap3.anarazel.de
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
Soon the stats collector will be no more, with statistics instead getting
stored in shared memory. There are a lot of references to the stats collector
in comments. This commit replaces most of these references with "cumulative
statistics system", with the remaining ones getting replaced as part of
subsequent commits.
This is done separately from the - quite large - shared memory statistics
patch to make review easier.
Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
Discussion: https://postgr.es/m/20220308205351.2xcn6k4x5yivcxyd@alap3.anarazel.de
In commit 27e1f1456, create_append_plan() only allowed the subplan
created from a given subpath to be executed asynchronously when it
was an async-capable ForeignPath. To extend coverage, this patch also
handles cases where the given subpath includes other Path types that
can be omitted during plan processing, such as a ProjectionPath
directly atop an async-capable ForeignPath, allowing asynchronous
execution in partitioned-scan/partitioned-join queries with non-Var
tlist expressions and in more UNION queries.
Andrey Lepikhov and Etsuro Fujita, reviewed by Alexander Pyhalov and
Zhihong Yu.
Discussion: https://postgr.es/m/659c37a8-3e71-0ff2-394c-f04428c76f08%40postgrespro.ru