mirror of https://github.com/postgres/postgres.git synced 2025-05-03 22:24:49 +03:00

18621 Commits

Andres Freund
0446551fc1 Fix error message typo introduced in 691d79a07933.
Reported-By: Michael Paquier
Discussion: https://postgr.es/m/20181101003405.GB1727@paquier.xyz
Backpatch: 9.4-, like the previous commit
2018-11-01 10:44:53 -07:00
Peter Geoghegan
886319d71b Adjust trace_sort log messages.
The project message style guide dictates: "When citing the name of an
object, state what kind of object it is".  The parallel CREATE INDEX
patch added a worker number to most of the trace_sort messages within
tuplesort.c without specifying the object type.  Bring these messages
into compliance with the style guide.

We're still treating a leader or serial Tuplesortstate as having worker
number -1.  trace_sort is a developer option, and these two cases are
highly comparable, so this seems appropriate.

Per complaint from Tom Lane.

Discussion: https://postgr.es/m/8330.1540831863@sss.pgh.pa.us
Backpatch: 11-, where parallel CREATE INDEX was introduced.
2018-11-01 09:18:55 -07:00
Andres Freund
c33a01c797 Disallow starting server with insufficient wal_level for existing slot.
Previously it was possible to create a slot, change wal_level, and
restart, even if the new wal_level was insufficient for the
slot. That's a problem for both logical and physical slots, because
the necessary WAL records are not generated.

This removes a few tests in newer versions that, somewhat
inexplicably, checked whether restarting with a too-low wal_level worked
(a buggy behaviour!).

Reported-By: Joshua D. Drake
Author: Andres Freund
Discussion: https://postgr.es/m/20181029191304.lbsmhshkyymhw22w@alap3.anarazel.de
Backpatch: 9.4-, where replication slots were introduced
2018-10-31 15:46:40 -07:00
Tom Lane
2493e2c2d1 Fix memory leak in repeated SPGIST index scans.
spgendscan neglected to pfree all the memory allocated by spgbeginscan.
It's possible to get away with that in most normal queries, since the
memory is allocated in the executor's per-query context which is about
to get deleted anyway; but it causes severe memory leakage during
creation or filling of large exclusion-constraint indexes.

Also, document that amendscan is supposed to free what ambeginscan
allocates.  The docs' lack of clarity on that point probably caused this
bug to begin with.  (There is discussion of changing that API spec going
forward, but I don't think it'd be appropriate for the back branches.)
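
A minimal sketch of the contract documented above, with hypothetical
index-AM names; the point is only that whatever the beginscan callback
palloc'd gets pfree'd again in the endscan callback:

    #include "postgres.h"
    #include "access/relscan.h"

    /* hypothetical per-scan state set up by myam_beginscan() */
    typedef struct MyScanOpaqueData
    {
        void       *workspace;      /* palloc'd working memory */
    } MyScanOpaqueData;

    typedef MyScanOpaqueData *MyScanOpaque;

    void
    myam_endscan(IndexScanDesc scan)
    {
        MyScanOpaque so = (MyScanOpaque) scan->opaque;

        /* release everything myam_beginscan() allocated, then the struct */
        pfree(so->workspace);
        pfree(so);
        scan->opaque = NULL;
    }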

Per report from Bruno Wolff.  It's been like this since the beginning,
so back-patch to all active branches.

In HEAD, also fix an independent leak caused by commit 2a6368343
(allocating memory during spgrescan instead of spgbeginscan, which
might be all right if it got cleaned up, but it didn't).  And do a bit
of code beautification on that commit, too.

Discussion: https://postgr.es/m/20181024012314.GA27428@wolff.to
2018-10-31 17:04:42 -04:00
Tom Lane
2bd6dcdeff Fix interaction of CASE and ArrayCoerceExpr.
An array-type coercion appearing within a CASE that has a constant
(after const-folding) test expression was mangled by the planner, causing
all the elements of the resulting array to be equal to the coerced value
of the CASE's test expression.  This is my oversight in commit c12d570fa:
that changed ArrayCoerceExpr to use a subexpression involving a
CaseTestExpr, and I didn't notice that eval_const_expressions needed an
adjustment to keep from folding such a CaseTestExpr to a constant when
it's inside a suitable CASE.

This is another in what's getting to be a depressingly long line of bugs
associated with misidentification of the referent of a CaseTestExpr.
We're overdue to redesign that mechanism; but any such fix is unlikely
to be back-patchable into v11.  As a stopgap, fix eval_const_expressions
to do what it must here.  Also add a bunch of comments pointing out the
restrictions and assumptions that are needed to make this work at all.

Also fix a related oversight: contain_context_dependent_node() was not
aware of the relationship of ArrayCoerceExpr to CaseTestExpr.  That was
somewhat fail-soft, in that the outcome of a wrong answer would be to
prevent optimizations that could have been made, but let's fix it while
we're at it.

Per bug #15471 from Matt Williams.  Back-patch to v11 where the faulty
logic came in.

Discussion: https://postgr.es/m/15471-1117f49271989bad@postgresql.org
2018-10-30 15:26:11 -04:00
Thomas Munro
c653cb27b4 Remove incorrect comment in dshash.c.
Back-patch to 11.

Author: Antonin Houska
Discussion: https://postgr.es/m/8726.1540553521%40localhost
2018-10-29 13:00:11 +13:00
Andrew Dunstan
372102b81d Correctly set t_self for heap tuples in expand_tuple
Commit 16828d5c0 incorrectly set an invalid pointer for t_self for heap
tuples. This patch correctly copies it from the source tuple, and
includes a regression test that relies on it being set correctly.

Backpatch to release 11.

Fixes bug #15448 reported by Tillmann Schulz

Diagnosis and test case by Amit Langote
2018-10-24 10:57:35 -04:00
Tom Lane
7aaeb7b45a Server-side fix for delayed NOTIFY and SIGTERM processing.
Commit 4f85fde8e introduced some code that was meant to ensure that we'd
process cancel, die, sinval catchup, and notify interrupts while waiting
for client input.  But there was a flaw: it supposed that the process
latch would be set upon arrival at secure_read() if any such interrupt
was pending.  In reality, we might well have cleared the process latch
at some earlier point while those flags remained set -- particularly
notifyInterruptPending, which can't be handled as long as we're within
a transaction.

To fix the NOTIFY case, also attempt to process signals (except
ProcDiePending) before trying to read.

Also, if we see that ProcDiePending is set before we read, forcibly set the
process latch to ensure that we will handle that signal promptly if no data
is available.  I also made it set the process latch on the way out, in case
there is similar logic elsewhere.  (It remains true that we won't service
ProcDiePending here unless we need to wait for input.)

The code for handling ProcDiePending during a write needs those changes,
too.

Also be a little more careful about when to reset whereToSendOutput,
and improve related comments.

Back-patch to 9.5 where this code was added.  I'm not entirely convinced
that older branches don't have similar issues, but the complaint at hand
is just about the >= 9.5 code.

Jeff Janes and Tom Lane

Discussion: https://postgr.es/m/CAOYf6ec-TmRYjKBXLLaGaB-jrd=mjG1Hzn1a1wufUAR39PQYhw@mail.gmail.com
2018-10-19 21:39:21 -04:00
Tom Lane
06292bb949 Add missing quote_identifier calls for CREATE TRIGGER ... REFERENCING.
Mixed-case names for transition tables weren't dumped correctly.
Oversight in commit 8c48375e5, per bug #15440 from Karl Czajkowski.
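
A hedged sketch of the rule being applied; the helper and variable names
are illustrative, not the actual ruleutils.c code:

    #include "postgres.h"
    #include "lib/stringinfo.h"
    #include "utils/builtins.h"

    static void
    append_transition_clause(StringInfo buf, const char *transition_table_name)
    {
        /* quote_identifier() adds double quotes only when the name needs
         * them, so mixed-case transition table names survive dump/reload */
        appendStringInfo(buf, " REFERENCING OLD TABLE AS %s",
                         quote_identifier(transition_table_name));
    }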

In passing, I couldn't resist a bit of code beautification.

Back-patch to v10 where this was introduced.

Discussion: https://postgr.es/m/15440-02d1468e94d63d76@postgresql.org
2018-10-19 00:50:16 -04:00
Tom Lane
d112682373 Avoid statically allocating gmtsub()'s timezone workspace.
localtime.c's "struct state" is a rather large object, ~23KB.  We were
statically allocating one for gmtsub() to use to represent the GMT
timezone, even though that function is not at all heavily used and is
never reached in most backends.  Let's malloc it on-demand, instead.

This does pose the question of how to handle a malloc failure, but
there's already a well-defined error report convention here, ie
set errno and return NULL.

We have but one caller of pg_gmtime in HEAD, and two in back branches,
neither of which bothered to check for errors.  Make them do so.
The possible errors are sufficiently unlikely (out-of-range timestamp,
and now malloc failure) that I think elog() is adequate.

Back-patch to all supported branches to keep our copies of the IANA
timezone code in sync.  This particular change is in a stanza that
already differs from upstream, so it's a wash for maintenance purposes
--- but only as long as we keep the branches the same.

Discussion: https://postgr.es/m/20181015200754.7y7zfuzsoux2c4ya@alap3.anarazel.de
2018-10-16 11:50:18 -04:00
Tom Lane
db9034bf76 Check for stack overrun in standard_ProcessUtility().
ProcessUtility can recurse, and indeed can be driven to infinite
recursion, so it ought to have a check_stack_depth() call.  This
covers the reported bug (portal trying to execute itself) and a bunch
of other cases that could perhaps arise somewhere.
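
A minimal sketch of the guard described above; the function is a stand-in
for standard_ProcessUtility(), whose real parameter list is elided:

    #include "postgres.h"
    #include "miscadmin.h"
    #include "nodes/nodes.h"

    static void
    process_utility_recursive(Node *parsetree)
    {
        /* error out cleanly instead of overflowing the C stack when a
         * utility statement ends up re-invoking ProcessUtility */
        check_stack_depth();

        /* ... normal utility processing, which may recurse back here ... */
    }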

Per bug #15428 from Malthe Borch.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/15428-b3c2915ec470b033@postgresql.org
2018-10-15 14:01:38 -04:00
Peter Eisentraut
6c6deadb04 Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 63764ec4ef426dc469efe1cbcd9f2c45ef9fbe95
2018-10-15 11:33:11 +02:00
Michael Paquier
18781cd2af Avoid duplicate XIDs at recovery when building initial snapshot
On a primary, sets of XLOG_RUNNING_XACTS records are generated on a
periodic basis to allow recovery to build the initial state of
transactions for a hot standby.  The set of transaction IDs is created
by scanning all the entries in ProcArray.  However, its logic never
accounted for the fact that a two-phase transaction finishing its prepare
step can put ProcArray into a state where two entries carry the same
transaction ID: one for the initial transaction, which gets cleared when
the prepare finishes, and a second, dummy entry to track that the
transaction is still running after the prepare finishes.  This ensures a
continuous presence of the transaction, so that callers of, for example,
TransactionIdIsInProgress() are always able to see it as alive.

So, if an XLOG_RUNNING_XACTS record takes a standby snapshot while a
two-phase transaction finishes its prepare, the record can end up with
duplicated XIDs, which is a state expected by design.  If such a record
gets applied on a standby to initialize its recovery state, recovery would
simply fail, though the odds of facing this failure are very low in
practice.  It would be tempting to change the generation of
XLOG_RUNNING_XACTS so that duplicates are removed at the source, but that
would require holding ProcArrayLock for longer, which would impact all
workloads, particularly those making heavy use of two-phase transactions.

Since XLOG_RUNNING_XACTS is actually only used to initialize the standby
state at recovery, the solution taken here is instead to discard
duplicates when applying the initial snapshot.
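
A hedged sketch of that standby-side handling: sort the XID array taken
from the XLOG_RUNNING_XACTS record and drop adjacent duplicates before
installing it.  The comparator and loop are illustrative, not the committed
code:

    #include "postgres.h"
    #include <stdlib.h>

    static int
    xid_cmp(const void *a, const void *b)
    {
        TransactionId xa = *(const TransactionId *) a;
        TransactionId xb = *(const TransactionId *) b;

        return (xa < xb) ? -1 : ((xa > xb) ? 1 : 0);
    }

    /* returns the new element count, with duplicates squeezed out */
    static int
    deduplicate_xids(TransactionId *xids, int nxids)
    {
        int         i,
                    j = 0;

        qsort(xids, nxids, sizeof(TransactionId), xid_cmp);
        for (i = 0; i < nxids; i++)
        {
            if (i == 0 || xids[i] != xids[j - 1])
                xids[j++] = xids[i];
        }
        return j;
    }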

Diagnosed-by: Konstantin Knizhnik
Author: Michael Paquier
Discussion: https://postgr.es/m/0c96b653-4696-d4b4-6b5d-78143175d113@postgrespro.ru
Backpatch-through: 9.3
2018-10-14 22:23:29 +09:00
Tom Lane
3d059655be Simplify use of AllocSetContextCreate() wrapper macro.
We can allow this macro to accept either abbreviated or non-abbreviated
allocation parameters by making use of __VA_ARGS__.  As noted by Andres
Freund, it's unlikely that any compiler would have __builtin_constant_p
but not __VA_ARGS__, so this gives up little or no error checking, and
it avoids a minor but annoying API break for extensions.
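
A minimal sketch of the two call forms the reworked macro accepts; the
context name is illustrative:

    #include "postgres.h"
    #include "utils/memutils.h"

    static MemoryContext
    make_work_context(void)
    {
        /* abbreviated form, using the standard size macro */
        return AllocSetContextCreate(CurrentMemoryContext,
                                     "WorkContext",
                                     ALLOCSET_DEFAULT_SIZES);
    }

    static MemoryContext
    make_work_context_longhand(void)
    {
        /* non-abbreviated form, spelling out the three size parameters */
        return AllocSetContextCreate(CurrentMemoryContext,
                                     "WorkContext",
                                     ALLOCSET_DEFAULT_MINSIZE,
                                     ALLOCSET_DEFAULT_INITSIZE,
                                     ALLOCSET_DEFAULT_MAXSIZE);
    }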

With this change, there is no reason for anybody to call
AllocSetContextCreateExtended directly, so in HEAD I renamed it to
AllocSetContextCreateInternal.  It's probably too late for an ABI
break like that in 11, though.

Discussion: https://postgr.es/m/20181012170355.bhxi273skjt6sag4@alap3.anarazel.de
2018-10-12 14:26:56 -04:00
Alvaro Herrera
355684ee3f Correct attach/detach logic for FKs in partitions
There was no code to handle foreign key constraints on partitioned
tables in the case of ALTER TABLE DETACH; and if you happened to ATTACH
a partition that already had an equivalent constraint, that one was
ignored and a new constraint was created.  Adding this to the fact that
foreign key cloning reuses the constraint name on the partition instead
of generating a new name (as it probably should, to cater to SQL
standard rules about constraint naming within schemas), the result was a
pretty poor user experience -- the most visible failure was that just
detaching a partition and re-attaching it failed with an error such as

  ERROR:  duplicate key value violates unique constraint "pg_constraint_conrelid_contypid_conname_index"
  DETAIL:  Key (conrelid, contypid, conname)=(26702, 0, test_result_asset_id_fkey) already exists.

because it would try to create an identically-named constraint in the
partition.  To make matters worse, if you tried to drop the constraint
in the now-independent partition, that would fail because the constraint
was still seen as dependent on the constraint in its former parent
partitioned table:
  ERROR:  cannot drop inherited constraint "test_result_asset_id_fkey" of relation "test_result_cbsystem_0001_0050_monthly_2018_09"

This fix attacks the problem from two angles: first, when the partition
is detached, the constraint is also marked as independent, so the drop
now works.  Second, when the partition is re-attached, we scan existing
constraints searching for one matching the FK in the parent, and if one
exists, we link that one to the parent constraint.  So we don't end up
with a duplicate -- and better yet, we don't need to scan the referenced
table to verify that the constraint holds.

To implement this I made a small change to the previously planner-only
struct ForeignKeyCacheInfo to contain the constraint OID; also relcache
now maintains the list of FKs for partitioned tables too.

Backpatch to 11.

Reported-by: Michael Vitale (bug #15425)
Discussion: https://postgr.es/m/15425-2dbc9d2aa999f816@postgresql.org
2018-10-12 12:38:03 -03:00
Andres Freund
88670a4366 Fix logical decoding error when system table w/ toast is repeatedly rewritten.
Repeatedly rewriting a mapped catalog table with VACUUM FULL or
CLUSTER could cause logical decoding to fail with:
ERROR, "could not map filenode \"%s\" to relation OID"

To trigger the problem the rewritten catalog had to have live tuples
with toasted columns.

The problem was triggered because, during catalog table rewrites, the
heap_insert() check that prevents logical decoding information from being
emitted for system catalogs failed to treat the new heap's toast table
as a system catalog (because the new heap is not recognized as a
catalog table via RelationIsLogicallyLogged()).  The relmapper, in
contrast to the normal catalog contents, does not contain historical
information.  After a single rewrite of a mapped table the new relation
is known to the relmapper, but if the table is rewritten twice before
logical decoding occurs, the relfilenode can no longer be mapped to a
relation, which then leads us to error out.  This only happens for toast
tables, because the main table contents aren't re-inserted with
heap_insert().

The fix is simple: add a new heap_insert() flag that prevents logical
decoding information from being emitted, and accept during decoding
that there might not be tuple data for toast tables.
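
A hedged sketch of the insertion-site half of that fix; the flag name and
the surrounding variables are assumptions based on this description, not a
quote of the committed code:

    #include "postgres.h"
    #include "access/heapam.h"
    #include "utils/rel.h"

    static void
    insert_into_new_toast_heap(Relation new_toast_rel, HeapTuple tup,
                               CommandId cid, BulkInsertState bistate)
    {
        /* HEAP_INSERT_NO_LOGICAL (name assumed) suppresses logical decoding
         * information for this insert, matching how the rewritten main
         * catalog contents are treated */
        heap_insert(new_toast_rel, tup, cid,
                    HEAP_INSERT_SKIP_FSM | HEAP_INSERT_NO_LOGICAL,
                    bistate);
    }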

Unfortunately that does not fix pre-existing logical decoding
errors. Doing so would require not throwing an error when a filenode
cannot be mapped to a relation during decoding, and that seems too
likely to hide bugs.  If it's crucial to fix decoding for an existing
slot, temporarily changing the ERROR in ReorderBufferCommit() to a
WARNING appears to be the best fix.

Author: Andres Freund
Discussion: https://postgr.es/m/20180914021046.oi7dm4ra3ot2g2kt@alap3.anarazel.de
Backpatch: 9.4-, where logical decoding was introduced
2018-10-10 13:53:02 -07:00
Tom Lane
1145c26b74 Advance transaction timestamp for intra-procedure transactions.
Per discussion, this behavior seems less astonishing than not doing so.

Peter Eisentraut and Tom Lane

Discussion: https://postgr.es/m/20180920234040.GC29981@momjian.us
2018-10-08 16:16:36 -04:00
Alvaro Herrera
8569ef63f4 Silence compiler warning in Assert()
gcc 6.3 does not whine about this mistake I made in 39808e8868c8 but
evidently lots of other compilers do, according to Michael Paquier,
Peter Eisentraut, Arthur Zakirov, Tomas Vondra.

Discussion: too many to list
2018-10-08 10:36:33 -03:00
Peter Eisentraut
69ff26b96f Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 64b916c6c8a34d9e6aad88e78cc2356a941f1335
2018-10-08 12:03:54 +02:00
Peter Eisentraut
c2055a4a17 Track procedure calls in pg_stat_user_functions
This was forgotten when procedures were implemented.

Reported-by: Lukas Fittl <lukas@fittl.com>
2018-10-08 11:23:07 +02:00
Michael Paquier
c8ed820c68 Improve two error messages related to foreign keys on partitioned tables
Error messages for creating a foreign key on a partitioned table using
ONLY or NOT VALID incorrectly mentioned the objects they worked on.
Along the way, this commit adds some regression tests that were missing
for those cases.

Author: Laurenz Albe
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/c11c05810a9ed65e9b2c817a9ef442275a32fe80.camel@cybertec.at
2018-10-08 17:57:52 +09:00
Alvaro Herrera
1c7f585b5c Fix catalog insertion order for ATTACH PARTITION
Commit 2fbdf1b38bc changed the order in which we inserted catalog rows
when creating partitions, so that we could remove an unsightly hack
required for untimely relcache invalidations.  However, that commit only
changed the ordering for CREATE TABLE PARTITION OF, and left ALTER TABLE
ATTACH PARTITION unchanged, so the latter can be affected when catalog
invalidations occur, for instance when the partition key involves an SQL
function.

Reported-by: Rajkumar Raghuwanshi
Author: Amit Langote
Reviewed-by: Michaël Paquier
Discussion: https://postgr.es/m/CAKcux6=nTz9KSfTr_6Z2mpzLJ_09JN-rK6=dWic6gGyTSWueyQ@mail.gmail.com
2018-10-06 22:13:19 -03:00
Alvaro Herrera
1a852f7c1e Fix event triggers for partitioned tables
Index DDL cascading on partitioned tables introduced a way for ALTER
TABLE to be called reentrantly.  This caused an important deficiency
in event trigger support to be exposed: on exiting the reentrant call,
the alter table state object was clobbered, causing a crash when the
outer alter table tries to finalize its processing.  Fix the crash by
creating a stack of event trigger state objects.  There are still ways
to cause things to misbehave (and probably other crashers) with more
elaborate tricks, but at least it now doesn't crash in the obvious
scenario.

Backpatch to 9.5, where DDL deparsing of event triggers was introduced.

Reported-by: Marco Slot
Authors: Michaël Paquier, Álvaro Herrera
Discussion: https://postgr.es/m/CANNhMLCpi+HQ7M36uPfGbJZEQLyTy7XvX=5EFkpR-b1bo0uJew@mail.gmail.com
2018-10-06 19:17:46 -03:00
Tom Lane
6bf278df89 Propagate xactStartTimestamp and stmtStartTimestamp to parallel workers.
Previously, a worker process would establish values for these based on
its own start time.  In v10 and up, this can trivially be shown to cause
misbehavior of transaction_timestamp(), timestamp_in(), and related
functions which are (perhaps unwisely?) marked parallel-safe.  It seems
likely that other behaviors might diverge from what happens in the parent
as well.

It's not as trivial to demonstrate problems in 9.6 or 9.5, but I'm sure
it's still possible, so back-patch to all branches containing parallel
worker infrastructure.

In HEAD only, mark now() and statement_timestamp() as parallel-safe
(other affected functions already were).  While in theory we could
still squeeze that change into v11, it doesn't seem important enough
to force a last-minute catversion bump.

Konstantin Knizhnik, whacked around a bit by me

Discussion: https://postgr.es/m/6406dbd2-5d37-4cb6-6eb2-9c44172c7e7c@postgrespro.ru
2018-10-06 12:00:09 -04:00
Michael Paquier
c905b67b79 Assign constraint name when cloning FK definition for partitions
This is used, for example, when attaching a partition to a partitioned
table which includes foreign keys; in this case the constraint name was
missing from the cloned data.  This could lead to hard crashes, as the
constraint name is always expected when validating the foreign key
constraint.  In particular, when using log_min_messages >= DEBUG1, a
log message would be generated with this unassigned constraint name,
leading to an assertion failure on HEAD.

While at it, rename a variable in ATExecAttachPartition which was
declared twice with the same name.

Author: Michael Paquier
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/20181005042236.GG1629@paquier.xyz
Backpatch-through: 11
2018-10-06 14:59:52 +09:00
Tom Lane
67e7d4da72 Allow btree comparison functions to return INT_MIN.
Historically we forbade datatype-specific comparison functions from
returning INT_MIN, so that it would be safe to invert the sort order
just by negating the comparison result.  However, this was never
really safe for comparison functions that directly return the result
of memcmp(), strcmp(), etc, as POSIX doesn't place any such restriction
on those library functions.  Buildfarm results show that at least on
recent Linux on s390x, memcmp() actually does return INT_MIN sometimes,
causing sort failures.

The agreed-on answer is to remove this restriction and fix relevant
call sites to not make such an assumption; code such as "res = -res"
should be replaced by "INVERT_COMPARE_RESULT(res)".  The same is needed
in a few places that just directly negated the result of memcmp or
strcmp.
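
A minimal sketch of the replacement pattern; the comment describes the idea
(map any negative result to +1 instead of negating), without claiming to be
the macro's exact definition:

    #include "postgres.h"
    #include <string.h>

    static int
    reverse_memcmp(const void *a, const void *b, size_t len)
    {
        int         res = memcmp(a, b, len);

        /* unsafe: "return -res;" misbehaves if memcmp() returned INT_MIN */
        INVERT_COMPARE_RESULT(res); /* negative -> +1, positive -> negated */
        return res;
    }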

To help find places having this problem, I've also added a compile option
to nbtcompare.c that causes some of the commonly used comparators to
return INT_MIN/INT_MAX instead of their usual -1/+1.  It'd likely be
a good idea to have at least one buildfarm member running with
"-DSTRESS_SORT_INT_MIN".  That's far from a complete test of course,
but it should help to prevent fresh introductions of such bugs.

This is a longstanding portability hazard, so back-patch to all supported
branches.

Discussion: https://postgr.es/m/20180928185215.ffoq2xrq5d3pafna@alap3.anarazel.de
2018-10-05 16:01:29 -04:00
Alvaro Herrera
ff347f8aff Fix duplicate primary keys in partitions
When using the CREATE TABLE .. PARTITION OF syntax, it's possible to
cause a partition to get two primary keys if the parent already has one.
Tighten the check to disallow that.

Reported-by: Rajkumar Raghuwanshi
Author: Amul Sul
Discussion: https://postgr.es/m/CAKcux6=OnSV3-qd8Gb6W=KPPwcCz6Fe_O_MQYjTa24__Xn8XxA@mail.gmail.com
2018-10-04 11:40:10 -03:00
Andres Freund
e97c4d967b Fix issues around EXPLAIN with JIT.
I (Andres) was more than a bit hasty in committing 33001fd7a7072d48327
after last-minute changes, leading to a number of problems (JIT output
was only shown for JIT in parallel workers, and plain EXPLAIN without
ANALYZE didn't work).  Lukas luckily found these issues quickly.

Instead of combining instrumentation in standard_ExecutorEnd(), do
so on demand in the new ExplainPrintJITSummary().

Also update a documentation example of the JIT output, changed in
52050ad8ebec8d831.

Author: Lukas Fittl, with minor changes by me
Discussion: https://postgr.es/m/CAP53PkxmgJht69pabxBXJBM+0oc6kf3KHMborLP7H2ouJ0CCtQ@mail.gmail.com
Backpatch: 11, where JIT compilation was introduced
2018-10-03 13:11:44 -07:00
Amit Kapila
ca5ca25d08 MAXALIGN the target address where we store flattened value.
The API (EOH_flatten_into) that flattens the expanded value representation
expects the target address to be maxaligned.  All its usages adhere to that
principle except when serializing datums for parallel query.  Fix that
usage.
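
A hedged sketch of the requirement being enforced: maxalign the destination
address before handing it to EOH_flatten_into().  The buffer handling is
illustrative:

    #include "postgres.h"
    #include "utils/expandeddatum.h"

    static char *
    flatten_expanded_value(ExpandedObjectHeader *eoh, char *cursor)
    {
        Size        flat_size = EOH_get_flat_size(eoh);

        /* EOH_flatten_into() expects a maxaligned target address */
        cursor = (char *) MAXALIGN(cursor);
        EOH_flatten_into(eoh, (void *) cursor, flat_size);

        return cursor + flat_size;
    }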

Diagnosed-by: Tom Lane
Author: Tom Lane and Amit Kapila
Backpatch-through: 9.6
Discussion: https://postgr.es/m/11629.1536550032@sss.pgh.pa.us
2018-10-03 09:14:09 +05:30
Tom Lane
419cc8add5 Fix corner-case failures in has_foo_privilege() family of functions.
The variants of these functions that take numeric inputs (OIDs or
column numbers) are supposed to return NULL rather than failing
on bad input; this rule reduces problems with snapshot skew when
queries apply the functions to all rows of a catalog.

has_column_privilege() had careless handling of the case where the
table OID didn't exist.  You might get something like this:
	select has_column_privilege(9999,'nosuchcol','select');
	ERROR:  column "nosuchcol" of relation "(null)" does not exist
or you might get a crash, depending on the platform's printf's response
to a null string pointer.

In addition, while applying the column-number variant to a dropped
column returned NULL as desired, applying the column-name variant
did not:
	select has_column_privilege('mytable','........pg.dropped.2........','select');
	ERROR:  column "........pg.dropped.2........" of relation "mytable" does not exist
It seems better to make this case return NULL as well.

Also, the OID-accepting variants of has_foreign_data_wrapper_privilege,
has_server_privilege, and has_tablespace_privilege didn't follow the
principle of returning NULL for nonexistent OIDs.  Superusers got TRUE,
everybody else got an error.

Per investigation of Jaime Casanova's report of a new crash in HEAD.
These behaviors have been like this for a long time, so back-patch to
all supported branches.

Patch by me; thanks to Stephen Frost for discussion and review

Discussion: https://postgr.es/m/CAJGNTeP=-6Gyqq5TN9OvYEydi7Fv1oGyYj650LGTnW44oAzYCg@mail.gmail.com
2018-10-02 11:54:12 -04:00
Peter Eisentraut
3b983c3354 Change PROCEDURE to FUNCTION in CREATE EVENT TRIGGER syntax
This was claimed to have been done in
0a63f996e018ac508c858e87fa39cc254a5db49f, but that actually only
changed the documentation and not the grammar.  (That commit did fully
change it for CREATE TRIGGER.)
2018-10-01 23:04:46 +02:00
Tom Lane
4c985549fe Fix ALTER COLUMN TYPE to not open a relation without any lock.
If the column being modified is referenced by a foreign key constraint
of another table, ALTER TABLE would open the other table (to re-parse
the constraint's definition) without having first obtained a lock on it.
This was evidently intentional, but that doesn't mean it's really safe.
It's especially not safe in 9.3, which pre-dates use of MVCC scans for
catalog reads, but even in current releases it doesn't seem like a good
idea.

We know we'll need AccessExclusiveLock shortly to drop the obsoleted
constraint, so just get that a little sooner to close the hole.

Per testing with a patch that complains if we open a relation without
holding any lock on it.  I don't plan to back-patch that patch, but we
should close the holes it identifies in all supported branches.

Discussion: https://postgr.es/m/2038.1538335244@sss.pgh.pa.us
2018-10-01 11:39:13 -04:00
Tom Lane
a8fb03fcea Improve error reporting for unsupported effective_io_concurrency setting.
Give a specific error complaining about lack of posix_fadvise() when
someone tries to set effective_io_concurrency > 0 on platforms
without that.
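
A hedged sketch of the GUC check-hook shape this implies; the message text
and hook contents are illustrative:

    #include "postgres.h"
    #include "utils/guc.h"

    static bool
    check_effective_io_concurrency(int *newval, void **extra, GucSource source)
    {
    #ifndef USE_PREFETCH
        if (*newval != 0)
        {
            /* surfaced as the DETAIL line of the "invalid value" error */
            GUC_check_errdetail("effective_io_concurrency must be set to 0 on platforms that lack posix_fadvise().");
            return false;
        }
    #endif
        return true;
    }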

This probably isn't worth extensive back-patching, but I (tgl) felt
cramming it into v11 was reasonable.

James Robinson

Discussion: https://postgr.es/m/153771876450.14994.560017943128223619@wrigleys.postgresql.org
Discussion: https://postgr.es/m/A3942987-5BC7-4F05-B54D-2A0EC2914B33@jlr-photo.com
2018-09-28 16:12:13 -04:00
Amit Kapila
6c8671bc39 Fix assertion failure when updating full_page_writes for checkpointer.
When the checkpointer receives a SIGHUP signal to update its configuration,
it may need to update the shared memory for full_page_writes and need to
write a WAL record for it.  Now, it is quite possible that the XLOG
machinery has not been initialized by that time, which will lead to an
assertion failure while doing so.  The fix is to allow the initialization of
the XLOG machinery outside the critical section.

This bug was introduced by commit 2c03216d83, which added the XLOG
machinery initialization in the RecoveryInProgress code path.

Reported-by: Dilip Kumar
Author: Dilip Kumar
Reviewed-by: Michael Paquier and Amit Kapila
Backpatch-through: 9.5
Discussion: https://postgr.es/m/CAFiTN-u4BA8KXcQUWDPNgaKAjDXC=C2whnzBM8TAcv=stckYUw@mail.gmail.com
2018-09-28 16:40:43 +05:30
Michael Paquier
88926fd481 Fix WAL recycling on standbys depending on archive_mode
A restart point or a checkpoint recycling WAL segments treats segments
marked with neither ".done" (archiving is done) nor ".ready" (segment is
ready to be archived) in archive_status the same way whether archive_mode
is "on" or "always".  While for a primary this is fine, a standby
running a restart point with archive_mode = on would try to mark such a
segment as ready for archiving, which is something that will never
happen except after the standby is promoted.

Note that this problem applies only to WAL segments coming from the
local pg_wal the first time archive recovery is run.  Segments part of a
self-contained base backup are the most common case where this could
happen; however, even in this case the .done markers would normally be
part of the backup.  Segments recovered from an archive are
marked as .ready or .done by the startup process, and segments finished
streaming are marked as such by the WAL receiver, so they are handled
already.

Reported-by: Haruka Takatsuka
Author: Michael Paquier
Discussion: https://postgr.es/m/15402-a453c90ed4cf88b2@postgresql.org
Backpatch-through: 9.5, where archive_mode = always has been added.
2018-09-28 11:55:43 +09:00
Tom Lane
49507dec46 Fix assorted bugs in pg_get_partition_constraintdef().
It failed if passed a nonexistent relation OID, or one that was a non-heap
relation, because of blindly applying heap_open to a user-supplied OID.
This is not OK behavior for a SQL-exposed function; we have a project
policy that we should return NULL in such cases.  Moreover, since
pg_get_partition_constraintdef ought now to work on indexes, restricting
it to heaps is flat wrong anyway.

The underlying function generate_partition_qual() wasn't on board with
indexes having partition quals either, nor for that matter with rels
having relispartition set and yet a null relpartbound.  (One wonders
whether the person who wrote the function comment blocks claiming that
these functions allow a missing relpartbound had ever tested it.)

Fix by testing relispartition before opening the rel, and by using
relation_open not heap_open.  (If any other relkinds ever grow the
ability to have relispartition set, the code will work with them
automatically.)  Also, don't reject null relpartbound in
generate_partition_qual.

Back-patch to v11, and all but the null-relpartbound change to v10.
(It's not really necessary to change generate_partition_qual at all
in v10, but I thought s/heap_open/relation_open/ would be a good
idea anyway just to keep the code in sync with later branches.)

Per report from Justin Pryzby.

Discussion: https://postgr.es/m/20180927200020.GJ776@telsasoft.com
2018-09-27 18:15:06 -04:00
Peter Eisentraut
6485469606 Recurse to sequences on ownership change for all relkinds
When a table's ownership is changed, we must apply that change to any
owned sequences as well.  (Otherwise, it would result in a situation that
cannot be restored, because linked sequences must have the same owner as
the table.)  But this was previously only applied to regular tables and
materialized views, even though it should also apply to at least foreign
tables.  This patch removes the relkind check altogether, because it
doesn't save very much and just introduces the possibility of similar
omissions.

Bug: #15238
Reported-by: Christoph Berg <christoph.berg@credativ.de>
2018-09-26 20:19:31 +02:00
Michael Paquier
180feb8c7e Rework activation of commit timestamps during recovery
The activation and deactivation of commit timestamp tracking have not
been handled consistently for a primary or standby at recovery.  The
facility can be activated at three different moments of recovery:
- The beginning, where a primary would use the GUC value for the
decision-making, and where a standby relies on the contents of the
control file.
- When replaying an XLOG_PARAMETER_CHANGE record at redo.
- The end, where both primary and standby rely on the GUC value.

Using the GUC value for a primary at the beginning of recovery causes
problems with commit timestamp access when doing crash recovery.
In particular, when replaying transaction commits, an attempt could be
made to read the commit timestamp of a transaction that committed at a
moment when track_commit_timestamp was disabled.

A test case is added to reproduce the failure.  The test works down to
v11 as it takes advantage of transaction commits within procedures.

Reported-by: Hailong Li
Author: Masahiko Sawada, Michael Paquier
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/11224478-a782-203b-1f17-e4797b39bdf0@qunar.com
Backpatch-through: 9.5, where commit timestamps have been introduced.
2018-09-26 10:29:20 +09:00
Alvaro Herrera
2886a8f704 Remove obsolete comment
The documented shortcoming was actually fixed in 4c728f3829
so the comment is not true anymore.
2018-09-25 17:55:36 -03:00
Andres Freund
e63441c3f5 Collect JIT instrumentation from workers.
Previously, when using parallel query, EXPLAIN (ANALYZE)'s JIT
compilation timings did not include the overhead from doing so on the
workers.  Fix that.

We do so by simply aggregating the cost of doing JIT compilation on
workers and the leader together. Arguably that's not quite accurate,
because the total time spent doing so is spent in parallel - but it's
hard to do much better.  For additional detail, when VERBOSE is
specified, the stats for workers are displayed separately.

Author: Amit Khandekar and Andres Freund
Discussion: https://postgr.es/m/CAJ3gD9eLrz51RK_gTkod+71iDcjpB_N8eC6vU2AW-VicsAERpQ@mail.gmail.com
Backpatch: 11-
2018-09-25 13:12:45 -07:00
Andres Freund
6859bd2632 Make EXPLAIN output for JIT compilation more dense.
A discussion about also reporting JIT compilation overhead on workers
brought unhappiness with the verbosity of the current explain format
to light.  Make the text format more dense, and restructure the
structured output to mirror that more closely.

As we're re-jiggering the output format anyway: The denser format
allows us to report all flags for JIT compilation (now also reporting
PGJIT_EXPR and PGJIT_DEFORM), and report the total time in addition to
the individual times.

Per complaint from Tom Lane.

Author: Andres Freund
Discussion: https://postgr.es/m/27812.1537221015@sss.pgh.pa.us
Backpatch: 11-, where JIT compilation was introduced
2018-09-24 13:35:46 -07:00
Andrew Dunstan
9625ab7924 Fast default trigger and expand_tuple fixes
Ensure that triggers get properly filled-in tuples for the OLD value.
Also fix the logic of detecting missing null values. The previous logic
failed to detect a missing null column before the first missing column
with a default. Fixing this has simplified the logic a bit.

Regression tests are added to test changes. This should ensure better
coverage of expand_tuple().

Original bug reports, and some code and test scripts from Tomas Vondra

Backpatch to release 11.
2018-09-24 16:20:08 -04:00
Tom Lane
bfdd02f88b Fix over-allocation of space for array_out()'s result string.
array_out overestimated the space needed for its output, possibly by
a very substantial amount if the array is multi-dimensional, because
of wrong order of operations in the loop that counts the number of
curly-brace pairs needed.  While the output string is normally
short-lived, this could still cause problems in extreme cases.

An additional minor error was that it counted one more delimiter than
is actually needed.

Repair those errors, add an Assert that the space is now correctly
calculated, and make some minor improvements in the comments.

I also failed to resist the temptation to get rid of an integer
modulus operation per array element; a simple comparison is sufficient.

This bug dates clear back to Berkeley days, so back-patch to all
supported versions.

Keiichi Hirobe, minor additional work by me

Discussion: https://postgr.es/m/CAH=EFxE9W0tRvQkixR2XJRRCToUYUEDkJZk6tnADXugPBRdcdg@mail.gmail.com
2018-09-24 11:30:51 -04:00
Noah Misch
89f2b64da3 Initialize random() in bootstrap/stand-alone postgres and in initdb.
This removes a difference between the standard IsUnderPostmaster
execution environment and that of --boot and --single.  In a stand-alone
backend, "SELECT random()" always started at the same seed.

On a system capable of using posix shared memory, initdb could still
conclude "selecting dynamic shared memory implementation ... sysv".
Crashed --boot or --single postgres processes orphaned shared memory
objects having names that collided with the not-actually-random names
that initdb probed.  The sysv fallback appeared after ten crashes of
--boot or --single postgres.  Since --boot and --single are rare in
production use, systems used for PostgreSQL development are the
principal candidate to notice this symptom.

Back-patch to 9.3 (all supported versions).  PostgreSQL 9.4 introduced
dynamic shared memory, but 9.3 does share the "SELECT random()" problem.

Reviewed by Tom Lane and Kyotaro HORIGUCHI.

Discussion: https://postgr.es/m/20180915221546.GA3159382@rfd.leadboat.com
2018-09-23 22:56:42 -07:00
Tom Lane
fe30cd25ec Fix failure in WHERE CURRENT OF after rewinding the referenced cursor.
In a case where we have multiple relation-scan nodes in a cursor plan,
such as a scan of an inheritance tree, it's possible to fetch from a
given scan node, then rewind the cursor and fetch some row from an
earlier scan node.  In such a case, execCurrent.c mistakenly thought
that the later scan node was still active, because ExecReScan hadn't
done anything to make it look not-active.  We'd get some sort of
failure in the case of a SeqScan node, because the node's scan tuple
slot would be pointing at a HeapTuple whose t_self gets reset to
invalid by heapam.c.  But it seems possible that for other relation
scan node types we'd actually return a valid tuple TID to the caller,
resulting in updating or deleting a tuple that shouldn't have been
considered current.  To fix, forcibly clear the ScanTupleSlot in
ExecScanReScan.

Another issue here, which seems only latent at the moment but could
easily become a live bug in future, is that rewinding a cursor does
not necessarily lead to *immediately* applying ExecReScan to every
scan-level node in the plan tree.  Upper-level nodes will think that
they can postpone that call if their child node is already marked
with chgParam flags.  I don't see a way for that to happen today in
a plan tree that's simple enough for execCurrent.c's search_plan_tree
to understand, but that's one heck of a fragile assumption.  So, add
some logic in search_plan_tree to detect chgParam flags being set on
nodes that it descended to/through, and assume that that means we
should consider lower scan nodes to be logically reset even if their
ReScan call hasn't actually happened yet.

Per bug #15395 from Matvey Arye.  This has been broken for a long time,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/153764171023.14986.280404050547008575@wrigleys.postgresql.org
2018-09-23 16:05:45 -04:00
Thomas Munro
ec7b27675d Use size_t consistently in dsa.{ch}.
Takeshi Ideriha complained that there is a mixture of Size and size_t
in dsa.c and its corresponding header.  Let's use size_t.  Back-patch to 10
where dsa.c landed, to make future back-patching easy.

Discussion: https://postgr.es/m/4E72940DA2BF16479384A86D54D0988A6F19ABD9%40G01JPEXMBKW04
2018-09-22 00:41:49 +12:00
Thomas Munro
8ffc3be10f Fix segment_bins corruption in dsa.c.
If a segment has been freed by dsa.c because it is entirely empty, other
backends must make sure to unmap it before following links to new
segments that might happen to have the same index number, or they could
finish up looking at a defunct segment and then corrupt the segment_bins
lists.  The correct protocol requires checking freed_segment_counter
after acquiring the area lock and before resolving any index number to a
segment.  Add the missing checks and an assertion.

Back-patch to 10, where dsa.c first arrived.

Author: Thomas Munro
Reported-by: Tomas Vondra
Discussion: https://postgr.es/m/CAEepm%3D0thg%2Bja5zGVa7jBy-uqyHrTqTm8HGhEOtMmigGrAqTbw%40mail.gmail.com
2018-09-20 15:59:34 +12:00
Thomas Munro
9d178fb928 Defer restoration of libraries in parallel workers.
Several users of extensions complained of crashes in parallel workers
that turned out to be due to syscache access from their _PG_init()
functions.  Reorder the initialization of parallel workers so that
libraries are restored after the caches are initialized, and inside a
transaction.

This was reported in bug #15350 and elsewhere.  We don't consider it
to be a bug: extensions shouldn't do that, because then they can't be
used in shared_preload_libraries.  However, it's a fairly obscure
hazard and these extensions worked in practice before parallel query
came along.  So let's make it work.  Later commits might add a warning
message and eventually an error.

Back-patch to 9.6, where parallel query landed.

Author: Thomas Munro
Reviewed-by: Amit Kapila
Reported-by: Kieran McCusker, Jimmy
Discussion: https://postgr.es/m/153512195228.1489.8545997741965926448%40wrigleys.postgresql.org
2018-09-20 14:21:32 +12:00
Tom Lane
6449616419 Fix minor error message style guide violation.
No periods at the ends of primary error messages, please.

Daniel Gustafsson

Discussion: https://postgr.es/m/43E004C0-18C6-42B4-A313-003B43EB0571@yesql.se
2018-09-19 17:06:53 -04:00
Tom Lane
c84f83975e Don't ignore locktable-full failures in StandbyAcquireAccessExclusiveLock.
Commit 37c54863c removed the code in StandbyAcquireAccessExclusiveLock
that checked the return value of LockAcquireExtended.  That created a
bug, because it's still passing reportMemoryError = false to
LockAcquireExtended, meaning that LOCKACQUIRE_NOT_AVAIL will be returned
if we're out of shared memory for the lock table.

In such a situation, the startup process would believe it had acquired an
exclusive lock even though it hadn't, with potentially dire consequences.

To fix, just drop the use of reportMemoryError = false, which allows us
to simplify the call into a plain LockAcquire().  It's unclear that the
locktable-full situation arises often enough that it's worth having a
better recovery method than crash-and-restart.  (I strongly suspect that
the only reason the code path existed at all was that it was relatively
simple to do in the pre-37c54863c implementation.  But now it's not.)
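
A hedged sketch of the simplified call; the lock-tag setup and the
session-lock/no-wait flags illustrate the pattern rather than quoting the
startup-process code:

    #include "postgres.h"
    #include "storage/lock.h"

    static void
    acquire_recovery_lock(Oid dbOid, Oid relOid)
    {
        LOCKTAG     locktag;

        SET_LOCKTAG_RELATION(locktag, dbOid, relOid);

        /* plain LockAcquire() errors out if the lock table is exhausted,
         * instead of silently returning LOCKACQUIRE_NOT_AVAIL */
        (void) LockAcquire(&locktag, AccessExclusiveLock, true, false);
    }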

LockAcquireExtended's reportMemoryError parameter is now dead code and
could be removed.  I refrained from doing so, however, because there
was some interest in resurrecting the behavior if we do get reports of
locktable-full failures in the field.  Also, it seems unwise to remove
the parameter concurrently with shipping commit f868a8143, which added a
parameter; if there are any third-party callers of LockAcquireExtended,
we want them to get a wrong-number-of-parameters compile error rather
than a possibly-silent misinterpretation of its last parameter.

Back-patch to 9.6 where the bug was introduced.

Discussion: https://postgr.es/m/6202.1536359835@sss.pgh.pa.us
2018-09-19 12:43:51 -04:00