CopyFrom() needs a range table for formatting certain errors for
constraint violations.
This changes the mechanism of how the range table is passed to the
CopyFrom() executor state. We used to generate the range table and one
entry for the relation manually inside DoCopy(). Now we use
addRangeTableEntryForRelation() to setup the range table and relation
entry for the ParseState, which is then passed down by BeginCopyFrom().
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: Euler Taveira <euler@timbira.com.br>
Coverity complained because bgw.bgw_extra wasn't being filled in by
ApplyLauncherRegister(). The most future-proof fix is to memset the
whole BackgroundWorker struct to zeroes. While at it, let's apply the
same coding rule to other places that set up BackgroundWorker structs;
four out of five had the same or related issues.
When sending a tuple attribute, the previous coding erroneously sent the
length byte before encoding conversion, which would lead to protocol
failures on the receiving side if the length did not match the following
string.
To fix that, use pq_sendcountedtext() for sending tuple attributes,
which takes care of all of that internally. To match the API of
pq_sendcountedtext(), send even text values without a trailing zero byte
and have the receiving end put it in place instead. This matches how
the standard FE/BE protocol behaves.
Reported-by: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
We need to set the origin remote position to end_lsn, not commit_lsn, as
commit_lsn is the start of commit record, and we use the origin remote
position as start position when restarting replication stream. If we'd
use commit_lsn, we could request data that we already received from the
remote server after a crash of a downstream server.
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Since change of slot name is a supported operation, handle it more
gracefully, instead of in the this-should-not-happen way.
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
On EXEC_BACKEND builds, this can fail if ASLR is in use.
Backpatch to 9.5. On master, completely remove the bgw_main field
completely, since there is no situation in which it is safe for an
EXEC_BACKEND build. On 9.6 and 9.5, leave the field intact to avoid
breaking things for third-party code that doesn't care about working
under EXEC_BACKEND. Prior to 9.5, there are no in-core bgworker
entrypoints.
Petr Jelinek, reviewed by me.
Discussion: http://postgr.es/m/09d8ad33-4287-a09b-a77f-77f8761adb5e@2ndquadrant.com
There are no functional changes here; this simply encapsulates knowledge
of the ItemPointerData struct so that a future patch can change things
without more breakage.
All direct users of ip_blkid and ip_posid are changed to use existing
macros ItemPointerGetBlockNumber and ItemPointerGetOffsetNumber
respectively. For callers where that's inappropriate (because they
Assert that the itempointer is is valid-looking), add
ItemPointerGetBlockNumberNoCheck and ItemPointerGetOffsetNumberNoCheck,
which lack the assertion but are otherwise identical.
Author: Pavan Deolasee
Discussion: https://postgr.es/m/CABOikdNnFon4cJiL=h1mZH3bgUeU+sWHuU4Yr8AB=j3A2p1GiA@mail.gmail.com
Fix an incorrect assert condition (noted by Coverity), and spell the new
name of the function correctly. Typos introduced in commit 7c4f52409.
Michael Paquier
Add functionality for a new subscription to copy the initial data in the
tables and then sync with the ongoing apply process.
For the copying, add a new internal COPY option to have the COPY source
data provided by a callback function. The initial data copy works on
the subscriber by receiving COPY data from the publisher and then
providing it locally into a COPY that writes to the destination table.
A WAL receiver can now execute full SQL commands. This is used here to
obtain information about tables and publications.
Several new options were added to CREATE and ALTER SUBSCRIPTION to
control whether and when initial table syncing happens.
Change pg_dump option --no-create-subscription-slots to
--no-subscription-connect and use the new CREATE SUBSCRIPTION
... NOCONNECT option for that.
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Tested-by: Erik Rijkers <er@xs4all.nl>
Uses page-based mechanism to ensure we’re using the correct timeline.
Tests are included to exercise the functionality using a cold disk-level copy
of the master that's started up as a replica with slots intact, but the
intended use of the functionality is with later features.
Craig Ringer, reviewed by Simon Riggs and Andres Freund
Previous commits, notably 53be0b1add and
6f3bd98ebf, made it possible to see from
pg_stat_activity when a backend was stuck waiting for another backend,
but it's also fairly common for a backend to be stuck waiting for an
I/O. Add wait events for those operations, too.
Rushabh Lathia, with further hacking by me. Reviewed and tested by
Michael Paquier, Amit Kapila, Rajkumar Raghuwanshi, and Rahila Syed.
Discussion: http://postgr.es/m/CAGPqQf0LsYHXREPAZqYGVkDqHSyjf=KsD=k0GTVPAuzyThh-VQ@mail.gmail.com
This makes almost all core code follow the policy introduced in the
previous commit. Specific decisions:
- Text search support functions with char* and length arguments, such as
prsstart and lexize, may receive unaligned strings. I doubt
maintainers of non-core text search code will notice.
- Use plain VARDATA() on values detoasted or synthesized earlier in the
same function. Use VARDATA_ANY() on varlenas sourced outside the
function, even if they happen to always have four-byte headers. As an
exception, retain the universal practice of using VARDATA() on return
values of SendFunctionCall().
- Retain PG_GETARG_BYTEA_P() in pageinspect. (Page images are too large
for a one-byte header, so this misses no optimization.) Sites that do
not call get_page_from_raw() typically need the four-byte alignment.
- For now, do not change btree_gist. Its use of four-byte headers in
memory is partly entangled with storage of 4-byte headers inside
GBT_VARKEY, on disk.
- For now, do not change gtrgm_consistent() or gtrgm_distance(). They
incorporate the varlena header into a cache, and there are multiple
credible implementation strategies to consider.
There's no really good reason why the autovacuum launcher and logical
replication launcher should announce themselves at startup and shutdown
by default. Users don't care that those processes exist, and it's
inconsistent that those background processes announce themselves while
others don't. So, reduce those messages from LOG to DEBUG1 level.
I was sorely tempted to reduce the "starting logical replication worker
for subscription ..." message to DEBUG1 as well, but forebore for now.
Those processes might possibly be of direct interest to users, at least
until logical replication is a lot better shaken out than it is today.
Discussion: https://postgr.es/m/19479.1489121003@sss.pgh.pa.us
Any logical rep workers must have their subscription entries in
pg_subscription. To ensure this, we need to prevent the launcher
from starting new worker corresponding to the subscription that
DROP SUBSCRIPTION command is removing. To implement this,
previously LogicalRepLauncherLock was introduced and held until
the end of transaction running DROP SUBSCRIPTION. But using
LWLock for that purpose was not valid.
Instead, this commit changes DROP SUBSCRIPTION so that it takes
AccessExclusiveLock on pg_subscription, in order to ensure that
the launcher cannot see any subscriptions being removed. Also this
commit gets rid of LogicalRepLauncherLock.
Patch by me, reviewed by Petr Jelinek
Discussion: https://www.postgresql.org/message-id/CAHGQGwHPi8ky-yANFfe0sgmhKtsYcQLTnKx07bW9S7-Rn1746w@mail.gmail.com
Note that this change alone does not yet fully address the performance
problems triggering this work, a large portion of the slowdown is
triggered by the tuple allocator, which isn't converted to the new
allocator. It would be possible to do so, but using evenly sized
objects, like both the current implementation in reorderbuffer.c and
slab.c, wastes a fair amount of memory. A later patch by Tomas will
introduce a better approach.
Author: Tomas Vondra
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
c.h #includes a number of core libc header files, such as <stdio.h>.
There's no point in re-including these after having read postgres.h,
postgres_fe.h, or c.h; so remove code that did so.
While at it, also fix some places that were ignoring our standard pattern
of "include postgres[_fe].h, then system header files, then other Postgres
header files". While there's not any great magic in doing it that way
rather than system headers last, it's silly to have just a few files
deviating from the general pattern. (But I didn't attempt to enforce this
globally, only in files I was touching anyway.)
I'd be the first to say that this is mostly compulsive neatnik-ism,
but over time it might save enough compile cycles to be useful.
Twiddle the replication-related code so that its timestamp variables
are declared TimestampTz, rather than the uninformative "int64" that
was previously used for meant-to-be-always-integer timestamps.
This resolves the int64-vs-TimestampTz declaration inconsistencies
introduced by commit 7c030783a, though in the opposite direction to
what was originally suggested.
This required including datatype/timestamp.h in a couple more places
than before. I decided it would be a good idea to slim down that
header by not having it pull in <float.h> etc, as those headers are
no longer at all relevant to its purpose. Unsurprisingly, a small number
of .c files turn out to have been depending on those inclusions, so add
them back in the .c files as needed.
Discussion: https://postgr.es/m/26788.1487455319@sss.pgh.pa.us
Discussion: https://postgr.es/m/27694.1487456324@sss.pgh.pa.us
This extends the work done in commit 2f5c9d9c9 to provide a more nearly
complete abstraction layer hiding the details of index updating for catalog
changes. That commit only invented abstractions for catalog inserts and
updates, leaving nearby code for catalog deletes still calling the
heap-level routines directly. That seems rather ugly from here, and it
does little to help if we ever want to shift to a storage system in which
indexing work is needed at delete time.
Hence, create a wrapper function CatalogTupleDelete(), and replace calls
of simple_heap_delete() on catalog tuples with it. There are now very
few direct calls of [simple_]heap_delete remaining in the tree.
Discussion: https://postgr.es/m/462.1485902736@sss.pgh.pa.us
Split the existing CatalogUpdateIndexes into two different routines,
CatalogTupleInsert and CatalogTupleUpdate, which do both the heap
insert/update plus the index update. This removes over 300 lines of
boilerplate code all over src/backend/catalog/ and src/backend/commands.
The resulting code is much more pleasing to the eye.
Also, by encapsulating what happens in detail during an UPDATE, this
facilitates the upcoming WARM patch, which is going to add a few more
lines to the update case making the boilerplate even more boring.
The original CatalogUpdateIndexes is removed; there was only one use
left, and since it's just three lines, we can as well expand it in place
there. We could keep it, but WARM is going to break all the UPDATE
out-of-core callsites anyway, so there seems to be no benefit in doing
so.
Author: Pavan Deolasee
Discussion: https://www.postgr.es/m/CABOikdOcFYSZ4vA2gYfs=M2cdXzXX4qGHeEiW3fu9PCfkHLa2A@mail.gmail.com
Some background activity (like checkpoints, archive timeout, standby
snapshots) is not supposed to happen on an idle system. Unfortunately
so far it was not easy to determine when a system is idle, which
defeated some of the attempts to avoid redundant activity on an idle
system.
To make that easier, allow to make individual WAL insertions as not
being "important". By checking whether any important activity happened
since the last time an activity was performed, it now is easy to check
whether some action needs to be repeated.
Use the new facility for checkpoints, archive timeout and standby
snapshots.
The lack of a facility causes some issues in older releases, but in my
opinion the consequences (superflous checkpoints / archived segments)
aren't grave enough to warrant backpatching.
Author: Michael Paquier, editorialized by Andres Freund
Reviewed-By: Andres Freund, David Steele, Amit Kapila, Kyotaro HORIGUCHI
Bug: #13685
Discussion:
https://www.postgresql.org/message-id/20151016203031.3019.72930@wrigleys.postgresql.orghttps://www.postgresql.org/message-id/CAB7nPqQcPqxEM3S735Bd2RzApNqSNJVietAC=6kfkYv_45dKwA@mail.gmail.com
Backpatch: -
array_base and array_stride were added so that we could identify the
offset of an LWLock within a tranche, but this facility is only very
marginally used apart from the main tranche. So, give every lock in
the main tranche its own tranche ID and get rid of array_base,
array_stride, and all that's attached. For debugging facilities
(Trace_lwlocks and LWLOCK_STATS) print the pointer address of the
LWLock using %p instead of the offset. This is arguably more useful,
and certainly a lot cheaper. Drop the offset-within-tranche from
the information reported to dtrace and from one can't-happen message
inside lwlock.c.
The main user-visible impact of this change is that pg_stat_activity
will now report all waits for LWLocks as "LWLock" rather than
reporting some as "LWLockTranche" and others as "LWLockNamed".
The main motivation for this change is that the need to specify an
array_base and an array_stride is awkward for parallel query. There
is only a very limited supply of tranche IDs so we can't just keep
allocating new ones, and if we try to use the same tranche IDs every
time then we run into trouble when multiple parallel contexts are
use simultaneously. So if we didn't get rid of this mechanism we'd
have to make it even more complicated. By simplifying it in this
way, we instead reduce the size of the generated code for lwlock.c
by about 5%.
Discussion: http://postgr.es/m/CA+TgmoYsFn6NUW1x0AZtupJGUAs1UDY4dJtCN47_Q6D0sP80PA@mail.gmail.com
Before initializing iteration over a subtransaction's changes, the last
few changes were not spilled to disk. That's correct if the transaction
didn't spill to disk, but otherwise... This bug can lead to missed or
misorderd subtransaction contents when they were spilled to disk.
Move spilling of the remaining in-memory changes to
ReorderBufferIterTXNInit(), where it can easily be applied to the top
transaction and, if present, subtransactions.
Since this code had too many bugs already, noticeably increase test
coverage.
Fixes: #14319
Reported-By: Huan Ruan
Discussion: <20160909012610.20024.58169@wrigleys.postgresql.org>
Backport: 9,4-, where logical decoding was added
Commit 3fe3511d05 introduced a new
case into this function, but neglected to ensure that the "ondisk"
pointer got updated after a possible reallocation as the code does
in other cases.
Stas Kelvich, per diagnosis by Konstantin Knizhnik.
Add a location field to the DefElem struct, used to parse many utility
commands. Update various error messages to supply error position
information.
To propogate the error position information in a more systematic way,
create a ParseState in standard_ProcessUtility() and pass that to
interested functions implementing the utility commands. This seems
better than passing the query string and then reassembling a parse state
ad hoc, which violates the encapsulation of the ParseState type.
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>