pg_wal_replay_wait() is to be used on standby and specifies waiting for
the specific WAL location to be replayed. This option is useful when
the user makes some data changes on primary and needs a guarantee to see
these changes are on standby.
The queue of waiters is stored in the shared memory as an LSN-ordered pairing
heap, where the waiter with the nearest LSN stays on the top. During
the replay of WAL, waiters whose LSNs have already been replayed are deleted
from the shared memory pairing heap and woken up by setting their latches.
pg_wal_replay_wait() needs to wait without any snapshot held. Otherwise,
the snapshot could prevent the replay of WAL records, implying a kind of
self-deadlock. This is why it is only possible to implement
pg_wal_replay_wait() as a procedure working without an active snapshot,
not a function.
Catversion is bumped.
Discussion: https://postgr.es/m/eb12f9b03851bb2583adab5df9579b4b%40postgrespro.ru
Author: Kartyshov Ivan, Alexander Korotkov
Reviewed-by: Michael Paquier, Peter Eisentraut, Dilip Kumar, Amit Kapila
Reviewed-by: Alexander Lakhin, Bharath Rupireddy, Euler Taveira
Reviewed-by: Heikki Linnakangas, Kyotaro Horiguchi
Commit 9d9b9d46f3 added spinlocks to protect the fields in ProcSignal
flags, but in EmitProcSignalBarrier(), the spinlock was released
twice. With most spinlock implementations, releasing a lock that's not
held is not easy to notice, because most of the time it does nothing,
but if the spinlock was concurrently acquired by another process, it
could lead to more serious issues. Fortunately, with the
--disable-spinlocks emulation implementation, it caused more visible
failures.
In the passing, fix a type in comment and add an assertion that the
procNumber passed to SendProcSignal looks valid.
Discussion: https://www.postgresql.org/message-id/b8ce284c-18a2-4a79-afd3-1991a2e7d246@iki.fi
Move responsibility of generating the cancel key to the backend
process. The cancel key is now generated after forking, and the
backend advertises it in the ProcSignal array. When a cancel request
arrives, the backend handling it scans the ProcSignal array to find
the target pid and cancel key. This is similar to how this previously
worked in the EXEC_BACKEND case with the ShmemBackendArray, just
reusing the ProcSignal array.
One notable change is that we no longer generate cancellation keys for
non-backend processes. We generated them before just to prevent a
malicious user from canceling them; the keys for non-backend processes
were never actually given to anyone. There is now an explicit flag
indicating whether a process has a valid key or not.
I wrote this originally in preparation for supporting longer cancel
keys, but it's a nice cleanup on its own.
Reviewed-by: Jelte Fennema-Nio
Discussion: https://www.postgresql.org/message-id/508d0505-8b7a-4864-a681-e7e5edfe32aa@iki.fi
The documentation for System V IPC parameters provides complicated
formulas to determine the appropriate values for SEMMNI and SEMMNS.
Furthermore, these formulas have often been wrong because folks
forget to update them (e.g., when adding a new auxiliary process).
This commit introduces a new runtime-computed GUC named
num_os_semaphores that reports the number of semaphores needed for
the configured number of allowed connections, worker processes,
etc. This new GUC allows us to simplify the formulas in the
documentation, and it should help prevent future inaccuracies.
Like the other runtime-computed GUCs, users can view it with
"postgres -C" before starting the server, which is useful for
preconfiguring the necessary operating system resources.
Reviewed-by: Tom Lane, Sami Imseih, Andres Freund, Robert Haas
Discussion: https://postgr.es/m/20240517164452.GA1914161%40nathanxps13
clog.c, async.c and predicate.c included some SLRU page numbers still
handled as 4-byte integers, while int64 should be used for this purpose.
These holes have been introduced in 4ed8f0913b, that has introduced
the use of 8-byte integers for SLRU page numbers, still forgot about the
code paths updated by this commit.
Reported-by: Noah Misch
Author: Aleksander Alekseev, Michael Paquier
Discussion: https://postgr.es/m/20240626002747.dc.nmisch@google.com
Backpatch-through: 17
While this doesn't significantly change runtime now, it arranges for
STRATEGY=WAL_LOG to benefit automatically from future optimizations to
the read_stream subsystem. For large tables in the template database,
this does read 16x as many bytes per system call. Platforms with high
per-call overhead, if any, may see an immediate benefit.
Nazir Bilal Yavuz
Discussion: https://postgr.es/m/CAN55FZ0JKL6vk1xQp6rfOXiNFV1u1H0tJDPPGHWoiO3ea2Wc=A@mail.gmail.com
Winsock only signals an FD_CLOSE event once if the other end of the
socket shuts down gracefully. Because each WaitLatchOrSocket() call
constructs and destroys a new event handle every time, with unlucky
timing we can lose it and hang. We get away with this only if the other
end disconnects non-gracefully, because FD_CLOSE is repeatedly signaled
in that case.
To fix this design flaw in our Windows socket support fundamentally,
we'd probably need to rearchitect it so that a single event handle
exists for the lifetime of a socket, or switch to completely different
multiplexing or async I/O APIs. That's going to be a bigger job
and probably wouldn't be back-patchable.
This brute force kludge closes the race by explicitly polling with
MSG_PEEK before sleeping.
Back-patch to all supported releases. This should hopefully clear up
some random build farm and CI hang failures reported over the years. It
might also allow us to try using graceful shutdown in more places again
(reverted in commit 29992a6) to fix instability in the transmission of
FATAL error messages, but that isn't done by this commit.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Tested-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/176008.1715492071%40sss.pgh.pa.us
Since commit 3a9b18b309, roles with privileges of pg_signal_backend
cannot signal autovacuum workers. Many users treated the ability
to signal autovacuum workers as a feature instead of a bug, so we
are reintroducing it via a new predefined role. Having privileges
of this new role, named pg_signal_autovacuum_worker, only permits
signaling autovacuum workers. It does not permit signaling other
types of superuser backends.
Bumps catversion.
Author: Kirill Reshke
Reviewed-by: Anthony Leung, Michael Paquier, Andrey Borodin
Discussion: https://postgr.es/m/CALdSSPhC4GGmbnugHfB9G0%3DfAxjCSug_-rmL9oUh0LTxsyBfsg%40mail.gmail.com
Since commit 5764f611e1, we've been using the ilist.h functions for
handling the linked list. There's no need for 'links' to be the first
element of the struct anymore, except for one call in InitProcess
where we used a straight cast from the 'dlist_node *' to PGPROC *,
without the dlist_container() macro. That was just an oversight in
commit 5764f611e1, fix it.
There no imminent need to move 'links' from being the first field, but
let's be tidy.
Reviewed-by: Aleksander Alekseev, Andres Freund
Discussion: https://www.postgresql.org/message-id/22aa749e-cc1a-424a-b455-21325473a794@iki.fi
e85662df44 made GetRunningTransactionData() calculate the oldest running
transaction id within the current database. This commit optimized this
calculation by performing a cheap transaction id comparison before fetching
the process database id, while the latter could cause extra cache misses.
Reported-by: Noah Misch
Discussion: https://postgr.es/m/20240630231816.bf.nmisch%40google.com
e85662df44 made GetRunningTransactionData() calculate the oldest running
transaction id within the current database. However, because of the typo,
the new code uses oldestRunningXid instead of oldestDatabaseRunningXid
in comparison before updating oldestDatabaseRunningXid. This commit fixes
that issue.
Reported-by: Noah Misch
Discussion: https://postgr.es/m/20240630231816.bf.nmisch%40google.com
Backpatch-through: 17
Both BufFileSize() and BufFileAppend() contained Asserts to ensure the
given BufFile(s) had a valid fileset. A valid fileset isn't required in
either of these functions, so remove the Asserts and adjust the
comments accordingly.
This was noticed while work was being done on a new patch to call
BufFileSize() on a BufFile without a valid fileset. It seems there's
currently no code in the tree which could trigger these Asserts, so no
need to backpatch this, for now.
Reviewed-by: Peter Geoghegan, Matthias van de Meent, Tom Lane
Discussion: https://postgr.es/m/CAApHDvofgZT0VzydhyGH5MMb-XZzNDqqAbzf1eBZV5HDm3%2BosQ%40mail.gmail.com
We have in launch_backend.c:
/*
* The following need to be available to the save/restore_backend_variables
* functions. They are marked NON_EXEC_STATIC in their home modules.
*/
extern slock_t *ShmemLock;
extern slock_t *ProcStructLock;
extern PGPROC *AuxiliaryProcs;
extern PMSignalData *PMSignalState;
extern pg_time_t first_syslogger_file_time;
extern struct bkend *ShmemBackendArray;
extern bool redirection_done;
That comment is not completely true: ShmemLock, ShmemBackendArray, and
redirection_done are not in fact NON_EXEC_STATIC. ShmemLock once was,
but was then needed elsewhere. ShmemBackendArray was static inside
postmaster.c before launch_backend.c was created. redirection_done
was never static.
This patch moves the declaration of ShmemLock and redirection_done to
a header file.
ShmemBackendArray gets a NON_EXEC_STATIC. This doesn't make a
difference, since it only exists if EXEC_BACKEND anyway, but it makes
it consistent.
After that, the comment is now correct.
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
This old CPU architecture hasn't been produced in decades, and
whatever instances might still survive are surely too underpowered
for anyone to consider running Postgres on in production. We'd
nonetheless continued to carry code support for it (largely at my
insistence), because its unique implementation of spinlocks seemed
like a good edge case for our spinlock infrastructure. However,
our last buildfarm animal of this type was retired last year, and
it seems quite unlikely that another will emerge. Without the ability
to run tests, the argument that this is useful test code fails to
hold water. Furthermore, carrying code support for an untestable
architecture has costs not to be ignored. So, remove HPPA-specific
code, in the same vein as commits 718aa43a4 and 92d70b77e.
Discussion: https://postgr.es/m/3351991.1697728588@sss.pgh.pa.us
Commit 5b562644fe added a comment that
SetRelationHasSubclass() callers must hold this lock. When commit
17f206fbc8 extended use of this column to
partitioned indexes, it didn't take the lock. As the latter commit
message mentioned, we currently never reset a partitioned index to
relhassubclass=f. That largely avoids harm from the lock omission. The
cause for fixing this now is to unblock introducing a rule about locks
required to heap_update() a pg_class row. This might cause more
deadlocks. It gives minor user-visible benefits:
- If an ALTER INDEX SET TABLESPACE runs concurrently with ALTER TABLE
ATTACH PARTITION or CREATE PARTITION OF, one transaction blocks
instead of failing with "tuple concurrently updated". (Many cases of
DDL concurrency still fail that way.)
- Match ALTER INDEX ATTACH PARTITION in choosing to lock the index.
While not user-visible today, we'll need this if we ever make something
set the flag to false for a partitioned index, like ANALYZE does today
for tables. Back-patch to v12 (all supported versions), the plan for
the commit relying on the new rule. In back branches, add
LockOrStrongerHeldByMe() instead of adding a LockHeldByMe() parameter.
Reviewed (in an earlier version) by Robert Haas.
Discussion: https://postgr.es/m/20240611024525.9f.nmisch@google.com
We did not recover the subtransaction IDs of prepared transactions
when starting a hot standby from a shutdown checkpoint. As a result,
such subtransactions were considered as aborted, rather than
in-progress. That would lead to hint bits being set incorrectly, and
the subtransactions suddenly becoming visible to old snapshots when
the prepared transaction was committed.
To fix, update pg_subtrans with prepared transactions's subxids when
starting hot standby from a shutdown checkpoint. The snapshots taken
from that state need to be marked as "suboverflowed", so that we also
check the pg_subtrans.
Backport to all supported versions.
Discussion: https://www.postgresql.org/message-id/6b852e98-2d49-4ca1-9e95-db419a2696e0@iki.fi
Commit 210622c6 accidentally zeroed out pages even if they were found in
the buffer pool. It should always lock the page, but it should only
zero pages that were not already valid. Otherwise, concurrent readers
that hold only a pin could see corrupted page contents changing under
their feet.
While here, rename ZeroAndLockBuffer() to match the RBM_ flag name.
Also restore a some useful comments lost by 210622c6's refactoring, and
add some new ones to clarify why we need to use the BM_IO_IN_PROGRESS
infrastructure despite not doing I/O.
Reported-by: Noah Misch <noah@leadboat.com>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> (earlier version)
Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier version)
Discussion: https://postgr.es/m/20240512171658.7e.nmisch@google.com
Discussion: https://postgr.es/m/7ed10231-ce47-03d5-d3f9-4aea0dc7d5a4%40gmail.com
The assertion used at the beginning of mdwritev(), that is not enabled
except by defining -DCHECK_WRITE_VS_EXTEND as mdnblocks() is costly,
forgot about the total number of blocks to write at location specified
by the caller. The calculation is fixed to count for that, and uses
casts to uint64 to ensure a proper check should the number of blocks
overflow.
Using a cast is a suggestion from Tom Lane.
Oversight in 4908c58720.
Author: Xing Guo
Discussion: https://postgr.es/m/CACpMh+BM-VgKeO7suPG-VHTtpzJ+zsbDPwVHu42PLp-iTk0z+A@mail.gmail.com
After further review, we want to move in the direction of always
quoting GUC names in error messages, rather than the previous (PG16)
wildly mixed practice or the intermittent (mid-PG17) idea of doing
this depending on how possibly confusing the GUC name is.
This commit applies appropriate quotes to (almost?) all mentions of
GUC names in error messages. It partially supersedes a243569bf6 and
8d9978a717, which had moved things a bit in the opposite direction
but which then were abandoned in a partial state.
Author: Peter Smith <smithpb2250@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAHut%2BPv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w%40mail.gmail.com
Specifically, it terminates a background worker even if the caller
couldn't terminate the background worker with pg_terminate_backend().
Commit 3a9b18b309 neglected to update
this. Back-patch to v13, which introduced DROP DATABASE FORCE.
Reviewed by Amit Kapila. Reported by Kirill Reshke.
Discussion: https://postgr.es/m/20240429212756.60.nmisch@google.com
GlobalVisTestNonRemovableHorizon() was only used for the implementation of
snapshot_too_old, which was removed in f691f5b80a. As using
GlobalVisTestNonRemovableHorizon() is not particularly efficient, no new uses
for it should be added. Therefore remove.
Discussion: https://postgr.es/m/20240415185720.q4dg4dlcyvvrabz4@awork3.anarazel.de
GetPageWithFreeSpace() callers assume the returned block exists in the
main fork, failing with "could not read block" errors if that doesn't
hold. Make that assumption reliable now. It hadn't been guaranteed,
due to the weak WAL and data ordering of participating components. Most
operations on the fsm fork are not WAL-logged. Relation extension is
not WAL-logged. Hence, an fsm-fork block on disk can reference a
main-fork block that no WAL record has initialized. That could happen
after an OS crash, a replica promote, or a PITR restore. wal_log_hints
makes the trouble easier to hit; a replica promote or PITR ending just
after a relevant fsm-fork FPI_FOR_HINT may yield this broken state. The
v16 RelationAddBlocks() mechanism also makes the trouble easier to hit,
since it bulk-extends even without extension lock waiters. Commit
917dc7d239 stopped trouble around
truncation, but vectors involving PageIsNew() pages remained.
This implementation adds a RelationGetNumberOfBlocks() call when the
cached relation size doesn't confirm a block exists. We've been unable
to identify a benchmark that slows materially, but this may show up as
additional time in lseek(). An alternative without that overhead would
be a new ReadBufferMode such that ReadBufferExtended() returns NULL
after a 0-byte read, with all other errors handled normally. However,
each GetFreeIndexPage() caller would then need code for the return-NULL
case. Back-patch to v14, due to earlier versions not caching relation
size and the absence of a pre-v16 problem report.
Ronan Dunklau. Reported by Ronan Dunklau.
Discussion: https://postgr.es/m/1878547.tdWV9SEqCh%40aivenlaptop
When we determine that a wanted block can't be combined with the current
pending read, it's time to start that read to get it out of the way. An
"if" in that code path should have been a "while", because it might take
more than one go in case of partial reads. This was only broken for
smaller ranges, as the more common case of io_combine_limit-sized ranges
is handled earlier in the code and knows how to loop, hiding the bug for
a while.
Discovered while testing large parallel sequential scans of partially
cached tables. The ramp-up-and-down block allocator for parallel scans
could hit the problem case and skip some blocks near the end that should
have been streamed.
Defect in commit b5a9b18c.
Discussion: https://postgr.es/m/CA%2BhUKG%2Bh8Whpv0YsJqjMVkjYX%2B80fTVc6oi-V%2BzxJvykLpLHYQ%40mail.gmail.com
The BAS_VACUUM ring size has been 256kB since commit d526575f introduced
the mechanism 17 years ago. Commit 1cbbee03 recently made it
configurable but retained the traditional default. The correct default
size has been debated for years, but 256kB is certainly very small.
VACUUM soon needs to write back data it dirtied only 32 blocks ago,
which usually requires flushing the WAL. New experiments in prefetching
pages for VACUUM exacerbated the problem by crashing into dirty data
even sooner. Let's make the default 2MB. That's 1.6% of the default
toy buffer pool size, and 0.2% of 1GB, which would be a considered a
small shared_buffers setting for a real system these days. Users are
still free to set the GUC to a different value.
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20240403221257.md4gfki3z75cdyf6%40awork3.anarazel.de
Discussion: https://postgr.es/m/CA%2BhUKGLY4Q4ZY4f1rvnFtv6%2BPkjNf8MejdPkcju3Qii9DYqqcQ%40mail.gmail.com
While pinning extra buffers to look ahead, users of strategies are in
danger of using too many buffers. For some strategies, that means
"escaping" from the ring, and in others it means forcing dirty data to
disk very frequently with associated WAL flushing. Since external code
has no insight into any of that, allow individual strategy types to
expose a clamp that should be applied when deciding how many buffers to
pin at once.
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_aJXnqsyZt6HwFLnxYEBgE17oypkxbKbT1t1geE_wvH2Q%40mail.gmail.com
The "fast path" for well cached scans that don't do any I/O was
accidentally coded in a way that could only be triggered by pg_prewarm's
usage pattern, which starts out with a higher distance because of the
flags it passes in. We want it to work for streaming sequential scans
too, once that patch is committed. Adjust.
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGKXZALJ%3D6aArUsXRJzBm%3Dqvc4AWp7%3DiJNXJQqpbRLnD_w%40mail.gmail.com
Execute both freezing and pruning of tuples in the same
heap_page_prune() function, now called heap_page_prune_and_freeze(),
and emit a single WAL record containing all changes. That reduces the
overall amount of WAL generated.
This moves the freezing logic from vacuumlazy.c to the
heap_page_prune_and_freeze() function. The main difference in the
coding is that in vacuumlazy.c, we looked at the tuples after the
pruning had already happened, but in heap_page_prune_and_freeze() we
operate on the tuples before pruning. The heap_prepare_freeze_tuple()
function is now invoked after we have determined that a tuple is not
going to be pruned away.
VACUUM no longer needs to loop through the items on the page after
pruning. heap_page_prune_and_freeze() does all the work. It now
returns the list of dead offsets, including existing LP_DEAD items, to
the caller. Similarly it's now responsible for tracking 'all_visible',
'all_frozen', and 'hastup' on the caller's behalf.
Author: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://www.postgresql.org/message-id/20240330055710.kqg6ii2cdojsxgje@liskov
Bug in commit 53c2a97a92: we failed to acquire the correct SLRU bank
lock when iterating to zero-out intermediate pages in predicate.c.
Rewrite the code block so that we follow the locking protocol correctly.
Also update an outdated comment in the same file -- SerialSLRULock
exists no more.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Discussion: https://postgr.es/m/2a25eaf4-a3a4-5fd1-6241-9d7c73142085@gmail.com
Previously, binaryheap didn't support updating a key and removing a
node in an efficient way. For example, in order to remove a node from
the binaryheap, the caller had to pass the node's position within the
array that the binaryheap internally has. Removing a node from the
binaryheap is done in O(log n) but searching for the key's position is
done in O(n).
This commit adds a hash table to binaryheap in order to track the
position of each nodes in the binaryheap. That way, by using newly
added functions such as binaryheap_update_up() etc., both updating a
key and removing a node can be done in O(1) on an average and O(log n)
in worst case. This is known as the indexed binary heap. The caller
can specify to use the indexed binaryheap by passing indexed = true.
The current code does not use the new indexing logic, but it will be
used by an upcoming patch.
Reviewed-by: Vignesh C, Peter Smith, Hayato Kuroda, Ajin Cherian,
Tomas Vondra, Shubham Khanna
Discussion: https://postgr.es/m/CAD21AoDffo37RC-eUuyHJKVEr017V2YYDLyn1xF_00ofptWbkg%40mail.gmail.com
pg_wal_replay_wait() is to be used on standby and specifies waiting for
the specific WAL location to be replayed before starting the transaction.
This option is useful when the user makes some data changes on primary and
needs a guarantee to see these changes on standby.
The queue of waiters is stored in the shared memory array sorted by LSN.
During replay of WAL waiters whose LSNs are already replayed are deleted from
the shared memory array and woken up by setting of their latches.
pg_wal_replay_wait() needs to wait without any snapshot held. Otherwise,
the snapshot could prevent the replay of WAL records implying a kind of
self-deadlock. This is why it is only possible to implement
pg_wal_replay_wait() as a procedure working in a non-atomic context,
not a function.
Catversion is bumped.
Discussion: https://postgr.es/m/eb12f9b03851bb2583adab5df9579b4b%40postgrespro.ru
Author: Kartyshov Ivan, Alexander Korotkov
Reviewed-by: Michael Paquier, Peter Eisentraut, Dilip Kumar, Amit Kapila
Reviewed-by: Alexander Lakhin, Bharath Rupireddy, Euler Taveira
If temp tables have dependencies (such as sequences) then it's
possible for autovacuum's cleanup of orphan temp tables to deadlock
against an incoming backend that's trying to clean out the temp
namespace for its own use. That can happen because RemoveTempRelations'
performDeletion call can visit objects within the namespace in
an order different from the order in which a per-table deletion
will visit them.
To fix, observe that performDeletion will begin by taking an exclusive
lock on the temp namespace (even though it won't actually delete it).
So, if we can get a shared lock on the namespace, we can be sure we're
not running concurrently with RemoveTempRelations, while also not
conflicting with ordinary use of the namespace. This requires
introducing a conditional version of LockDatabaseObject, but that's no
big deal. (It's surprising we've got along without that this long.)
Report and patch by Mikhail Zhilin. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/c43ce028-2bc2-4865-9b89-3f706246eed5@postgrespro.ru
Introduce an abstraction allowing relation data to be accessed as a
stream of buffers, with an implementation that is more efficient than
the equivalent sequence of ReadBuffer() calls.
Client code supplies a callback that can say which block number it wants
next, and then consumes individual buffers one at a time from the
stream. This division puts read_stream.c in control of how far ahead it
can see and allows it to read clusters of neighboring blocks with
StartReadBuffers(). It also issues POSIX_FADV_WILLNEED advice ahead of
time when random access is detected.
Other variants of I/O stream will be proposed in future work (for
example to support recovery, whose LsnReadQueue device in
xlogprefetcher.c is a distant cousin of this code and should eventually
be replaced by this), but this basic API is sufficient for many common
executor usage patterns involving predictable access to a single fork of
a single relation.
Several patches using this API are proposed separately.
This stream concept is loosely based on ideas from Andres Freund on how
we should pave the way for later work on asynchronous I/O.
Author: Thomas Munro <thomas.munro@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi> (contributions)
Author: Melanie Plageman <melanieplageman@gmail.com> (contributions)
Suggested-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Tested-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/CA+hUKGJkOiOCa+mag4BF+zHo7qo=o9CFheB8=g6uT5TUm2gkvA@mail.gmail.com