Produce a NOTICE when the label already exists, for consistency with other
CREATE IF NOT EXISTS commands. Also, fix the code so it produces something
more user-friendly than an index violation when the label already exists.
Not incidentally, this also makes it possible to add a regression test that
the previous patch omitted, for fear of exposing an unpredictable OID in the
results.
Also some wordsmithing on the documentation.
If the label is already in the enum, the statement becomes a no-op.
This will reduce the pain that comes from our not allowing this
operation inside a transaction block.
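For illustration, a minimal sketch of the intended user-visible behavior
(the type and label names here are made up):

    CREATE TYPE mood AS ENUM ('sad', 'ok');

    ALTER TYPE mood ADD VALUE IF NOT EXISTS 'happy';  -- adds the label
    ALTER TYPE mood ADD VALUE IF NOT EXISTS 'happy';  -- NOTICE, no-op
    ALTER TYPE mood ADD VALUE 'happy';                -- error: already exists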
Andrew Dunstan, reviewed by Tom Lane and Magnus Hagander.
The previous scheme had bugs in some corner cases involving tables that had
been renamed since a view was made. This could result in dumped views that
failed to reload or reloaded incorrectly, as seen in bug #7553 from Lloyd
Albin, as well as in some pgsql-hackers discussion back in January. Also,
its behavior for printing EXPLAIN plans was sometimes confusing because of
willingness to use the same alias for multiple RTEs (it was Ashutosh
Bapat's complaint about that aspect that started the January thread).
To fix, ensure that each RTE in the query has a unique unqualified alias,
by modifying the alias if necessary (we add "_" and digits as needed to
create a non-conflicting name). Then we can just print its variables with
that alias, avoiding the confusing and bug-prone scheme of sometimes
schema-qualifying variable names. In EXPLAIN, it proves to be expedient to
take the further step of only assigning such aliases to RTEs that are
actually referenced in the query, since the planner has a habit of
generating extra RTEs with the same alias in situations such as
inheritance-tree expansion.
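As a hypothetical illustration of the kind of case at issue (object names
invented here), consider a view whose underlying table is renamed and its
old name reused:

    CREATE TABLE t (a int);
    CREATE VIEW v AS SELECT t.a FROM t;
    ALTER TABLE t RENAME TO t_old;
    CREATE TABLE t (a int);

    -- The view still refers to t_old.  When its definition is printed,
    -- each RTE must carry a unique alias so the dumped text cannot be
    -- misread as referring to the new table named t.
    SELECT pg_get_viewdef('v'::regclass);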
Although this fixes a bug of very long standing, I'm hesitant to back-patch
such a noticeable behavioral change. My experiments while creating a
regression test convinced me that actually incorrect output (as opposed to
confusing output) occurs only in very narrow cases, which is backed up by
the lack of previous complaints from the field. So we may be better off
living with it in released branches; and in any case it'd be smart to let
this ripen awhile in HEAD before we consider back-patching it.
Similar changes were done to pg_hba.conf earlier already, this commit makes
pg_ident.conf to behave the same as pg_hba.conf.
This has two user-visible effects. First, if pg_ident.conf contains multiple
errors, the whole file is parsed at postmaster startup time and all the
errors are immediately reported. Before this patch, the file was parsed and
the errors were reported only when someone tried to connect using an
authentication method that uses the file, and the parsing stopped at the first
error. Second, if you SIGHUP to reload the config files, and the new
pg_ident.conf file contains an error, the error is logged but the old file
stays in effect.
Also, regular expressions in pg_ident.conf are now compiled only once when
the file is loaded, rather than every time a user is authenticated. That
should speed up authentication if you have a lot of regexps in the file.
Amit Kapila
These calls were removed in commit 4240e429d0c2d889d0cda23c618f94e12c13ade7
as part of a general refactoring and improvement of DDL locking. However,
there's a problem not solved by the rewrite, which is that GRANT/REVOKE
update pg_class.relacl without taking any particular lock on the target
table as such. If another backend fails to do AcceptInvalidationMessages,
it won't notice a recently-committed change in ACLs. Bug #7557 from Piotr
Czachur demonstrates that there's at least one code path in 9.2.0 in which
a command fails to do any AcceptInvalidationMessages calls at all, if the
current transaction already holds all the locks it will need.
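Schematically (object and role names are invented), the kind of
interleaving at issue looks like this:

    -- session 1:
    REVOKE SELECT ON TABLE t FROM app_user;

    -- session 2, running as app_user inside a transaction that already
    -- holds every lock it needs: without an AcceptInvalidationMessages
    -- call in heap_openrv, the cached pg_class.relacl can still show the
    -- old privileges, so the REVOKE is not noticed promptly.
    SELECT * FROM t;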
Since we're hard up against the release deadline for 9.2.1, fix this by
putting back the AcceptInvalidationMessages calls in heap_openrv and
heap_openrv_extended, thereby restoring the historical behavior in this
area. We ought to look for a more elegant and perhaps more bulletproof
solution, but there's no time for that right now.
In commit 9e8da0f75731aaa7605cf4656c21ea09e84d2eb1, I improved btree
to handle ScalarArrayOpExpr quals natively, so that constructs like
"indexedcol IN (list)" could be supported by index-only scans. Using
such a qual results in multiple scans of the index, under-the-hood.
I went to some lengths to ensure that this still produces rows in index
order ... but I failed to recognize that if a higher-order index column
is lacking an equality constraint, rescans can produce out-of-order
data from that column. Tweak the planner to not expect sorted output
in that case. Per trouble report from Robert McGehee.
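A sketch of the shape of query affected (table and index names are
invented):

    CREATE TABLE tbl (a int, b int);
    CREATE INDEX tbl_a_b_idx ON tbl (a, b);

    -- The IN list causes one index scan per array element.  Since a has
    -- no equality constraint, each rescan restarts from the beginning of
    -- a's range, so the combined output is not sorted by (a, b); the
    -- planner must now add an explicit Sort if that ordering is needed.
    SELECT * FROM tbl WHERE a > 10 AND b IN (1, 2) ORDER BY a, b;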
Currently, we make mangled copies of plpython/{expected,sql} in
plpython/python3/{expected,sql}, and run the tests in
plpython/python3. This has the disadvantage that the regression.diffs
file, if any, ends up in plpython/python3, which is not the normal
location. If we instead make the mangled copies in
plpython/{expected,sql}/python3/, we can run the tests from the normal
directory, regression.diffs ends up in the normal place, and the
pg_regress invocation also becomes a lot simpler. It's also more
obvious at run time what's going on, because the tests end up being
named "python3/something" in the test output.
Some experimentation with examples similar to bug #7539 has convinced me
that indxpath.c's original implementation of parameterized-path generation
was several bricks shy of a load. In general, if we are relying on a
particular outer rel or set of outer rels for a parameterized path, the
path should use every indexable join clause that's available from that rel
or rels. Any join clauses that get left out of the indexqual will end up
getting applied as plain filter quals (qpquals), and that's generally a
significant loser compared to having the index AM enforce them. (This is
particularly true with btree, which can skip the index scan entirely if
it can see that the given indexquals are mutually contradictory.) The
original heuristics failed to ensure this, though, and were overly
complicated anyway. Rewrite to make the code explicitly identify each
useful set of outer rels and then select all applicable join clauses for
each one. The one plan that changes in the regression tests is in fact
for the better according to the planner's cost estimates.
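As a schematic example (table, column, and index names are invented), for
a nestloop with the indexed table on the inside:

    CREATE TABLE small (x int, y int);
    CREATE TABLE big (x int, y int);
    CREATE INDEX big_x_y_idx ON big (x, y);

    -- A parameterized inner path on "big" should use both join clauses
    -- as index conditions, roughly
    --     Index Cond: ((x = small.x) AND (y = small.y))
    -- rather than enforcing one of them as a Filter qual after the scan.
    EXPLAIN SELECT *
    FROM small JOIN big ON big.x = small.x AND big.y = small.y;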
(Note: this is not a correctness issue but just a matter of plan quality.
I don't yet know what is going on in bug #7539, but I don't expect this
change to fix that.)
The recovery code clearly documents that a shutdown checkpoint is executed
at end of recovery: a shutdown checkpoint WAL record is written, but the
buffer manager had been altered to treat end of recovery as a normal
checkpoint.
This bug exacerbates the bufmgr relpersistence bug.
Bug spotted by Andres Freund, patch by me.
Looks like the correct size of DOS-ified tenk.data is 680800 not 680801.
(I got the latter from a version of unix2dos that appends a trailing ^Z,
which evidently is not git's practice.)
The idea here is to provide a more easily diagnosable failure diff when
the problem is that tenk.data has been DOS-ified, as I believe to be
happening currently on buildfarm member hamerkop. Per suggestion from
Magnus Hagander.
Also, sync output/largeobject_1.source with current regression test.
Failure to do that in commit 3a0e4d36ebd7f477822d5bae41ba121a40d22ccc
turns out to be the real reason that hamerkop has been complaining.
Given what we now know about the cause of this bug, it seems like it'd
be a real good idea to include it in the plperl regression tests, so as
to catch any platform-specific cases where the code gets misoptimized.
This at least saves some palloc overhead, and should furthermore reduce
the risk of anything going wrong, eg somebody resetting the context the
current_call_data record was in.
In commit 1bc16a946008a7cbb33a9a06a7c6765a807d7f59 I added a minor
optimization to drop the component variables of a GROUP BY expression from
the target list computed at the aggregation level of a query, if those Vars
weren't referenced elsewhere in the tlist. However, I overlooked that the
window-function planning code would deconstruct such expressions and thus
need to have access to their component variables. Fix it to not do that.
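A hypothetical query of the affected shape (names invented) is one whose
window clause orders by the same expression that appears in GROUP BY:

    CREATE TABLE t (a int, b int, c int);

    -- a and b appear only inside the GROUP BY expression; when the window
    -- ORDER BY expression is deconstructed, those component Vars are
    -- needed again, so they must not be dropped from the
    -- aggregation-level target list.
    SELECT a + b AS s, sum(c) AS total,
           rank() OVER (ORDER BY a + b)
    FROM t
    GROUP BY a + b;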
While at it, I removed the distinction between volatile and nonvolatile
window partition/order expressions: the code now computes all of them
at the aggregation level. This saves a relatively expensive check for
volatility, and it's unclear that the resulting plan isn't better anyway.
Per bug #7535 from Louis-David Mitterrand. Back-patch to 9.2.
I made multiple errors in commit 97532f7c29468010b87e40a04f8daa3eb097f654,
stemming mostly from failure to think about the available frequency data
as being element frequencies not value frequencies (so that occurrences of
different elements are not mutually exclusive). This led to sillinesses
such as estimating that "word" would match more rows than "word:*".
The choice to clamp to a minimum estimate of DEFAULT_TS_MATCH_SEL also
seems pretty ill-considered in hindsight, as it would frequently result in
an estimate much larger than the available data suggests. We do need some
sort of clamp, since a pattern not matching any of the MCELEMs probably
still needs a selectivity estimate of more than zero. I chose instead to
clamp to at least what a non-MCELEM word would be estimated as, preserving
the property that "word:*" doesn't get an estimate less than plain "word",
whether or not the word appears in MCELEM.
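In terms of user-visible estimates (table and column names are
hypothetical), the invariant being preserved is:

    -- The row estimate for the prefix pattern should not come out smaller
    -- than the estimate for the plain word, whether or not the word is in
    -- the column's most-common-elements statistics.
    EXPLAIN SELECT * FROM docs WHERE tsv @@ to_tsquery('word');
    EXPLAIN SELECT * FROM docs WHERE tsv @@ to_tsquery('word:*');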
Per investigation of a gripe from Bill Martin, though I suspect that his
example case actually isn't even reaching the erroneous code.
Back-patch to 9.1 where this code was introduced.
validate_plperl_function() supposed that it could free an old
plperl_proc_desc struct immediately upon detecting that it was stale.
However, if a plperl function is called recursively, this could result
in deleting the struct out from under an outer invocation, leading to
misbehavior or crashes. Add a simple reference-count mechanism to
ensure that such structs are freed only when the last reference goes
away.
Per investigation of bug #7516 from Marko Tiikkaja. I am not certain
that this error explains his report, because he says he didn't have
any recursive calls --- but it's hard to see how else it could have
crashed right there. In any case, this definitely fixes some problems
in the area.
Back-patch to all active branches.
Investigation shows that some intermittent build failures in ecpg are the
result of a gmake bug that was reported quite some time ago:
http://savannah.gnu.org/bugs/?30653
Preventing parallel builds of the ecpg subdirectories seems to dodge the
bug. Per yesterday's pgsql-hackers discussion, there are some other things
in the subdirectory makefiles that seem rather unsafe for parallel builds
too, but there's little point in fixing them as long as we have to work
around a make bug.
Back-patch to 9.1; parallel builds weren't very well supported before
that anyway.
Commit 2cfb1c6f77734db81b6e74bcae630f93b94f69be fixed some issues caused
by Python 3.3 choosing to iterate through dict entries in a different order
than before. But here's another one: the test cases adjusted here made two
bad entries in a dict and expected that the one complained of would always
be the same.
Possibly this should be back-patched further than 9.2, but there seems
little point unless the earlier fix is too.
Create an internal function pqDropConnection that does the physical socket
close and cleans up closely-associated state. This removes a bunch of ad
hoc, not always consistent closure code. The ulterior motive is to have a
single place to wait for a spawned child backend to exit, but this seems
like good cleanup even if that never happens.
I went back and forth on whether to include "conn->status = CONNECTION_BAD"
in pqDropConnection's actions, but for the moment decided not to. Only a
minority of the call sites actually want that, and in any case it's
arguable that conn->status is slightly higher-level state, and thus not
part of this function's purview.
This fix removes an unnecessary incompatibility with the old behavior of
the unix_socket_directory parameter. Since pathnames with embedded spaces
are fairly popular on some platforms, the incompatibility could be
significant in practice. We'll still strip unquoted leading/trailing
spaces, however.
No docs update since the documentation already implied that it worked
like this.
Per bug #7514 from Murray Cumming.
When the startup process restores a WAL file from the archive, it deletes
any old file with the same name and renames the new file in its place. On
Windows, however, when a file is deleted, it still lingers as long as a
process holds a file handle open on it. With cascading replication, a
walsender process can hold the old file open, so the rename() in the startup
process would fail. To fix that, rename the old file to a temporary name, to
make the original file name available for reuse, before deleting the old
file.
Give the correct name of the GUC parameter being complained of.
Also, emit a more suitable SQLSTATE (INVALID_PARAMETER_VALUE,
not the default INTERNAL_ERROR).
Gurjeet Singh, errcode adjustment by me
Perl, for some unaccountable reason, believes it's a good idea to reset
SIGFPE handling to SIG_IGN. Which wouldn't be a good idea even if it
worked; but on some platforms (Linux at least) it doesn't work at all,
instead resulting in forced process termination if the signal occurs.
Given the lack of other complaints, it seems safe to assume that Perl
never actually provokes SIGFPE and so there is no value in the setting
anyway. Hence, reset it to our normal handler after initializing Perl.
Report, analysis and patch by Andres Freund.
The planner previously assumed that parameter Vars having the same absolute
query level, varno, and varattno could safely be assigned the same runtime
PARAM_EXEC slot, even though they might be different Vars appearing in
different subqueries. This was (probably) safe before the introduction of
CTEs, but the lazy-evaluation mechanism used for CTEs means that a CTE can
be executed during execution of some other subquery, causing the lifespan
of Params at the same syntactic nesting level as the CTE to overlap with
use of the same slots inside the CTE. In 9.1 we created additional hazards
by using the same parameter-assignment technology for nestloop inner scan
parameters, but it was broken before that, as illustrated by the added
regression test.
To fix, restructure the planner's management of PlannerParamItems so that
items having different semantic lifespans are kept rigorously separated.
This will probably result in complex queries using more runtime PARAM_EXEC
slots than before, but the slots are cheap enough that this hardly matters.
Also, stop generating PlannerParamItems containing Params for subquery
outputs: all we really need to do is reserve the PARAM_EXEC slot number,
and that now only takes incrementing a counter. The planning code is
simpler and probably faster than before, as well as being more correct.
Per report from Vik Reykja.
These changes will mostly also need to be made in the back branches, but
I'm going to hold off on that until after 9.2.0 wraps.
The cascading replication code assumed that the current RecoveryTargetTLI
never changes, but that's not true with recovery_target_timeline='latest'.
The obvious upshot of that is that RecoveryTargetTLI in shared memory needs
to be protected by a lock. A less obvious consequence is that when a
cascading standby is connected, and the standby switches to a new target
timeline after scanning the archive, it will continue to stream WAL to the
cascading standby, but from a wrong file, ie. the file of the previous
timeline. For example, if the standby is currently streaming from the middle
of file 000000010000000000000005, and the timeline changes, the standby
will continue to stream from that file. However, the WAL on the new
timeline is in file 000000020000000000000005, so the standby sends garbage
from 000000010000000000000005 to the cascading standby, instead of the
correct WAL from file 000000020000000000000005.
This also fixes a related bug where a partial WAL segment is restored from
the archive and streamed to a cascading standby. The code assumed that when
a WAL segment is copied from the archive, it can immediately be fully
streamed to a cascading standby. However, if the segment is only partially
filled, ie. it has the right size but only the first N bytes contain valid WAL,
that's not safe. That can happen if a partial WAL segment is manually copied
to the archive, or if a partial WAL segment is archived because a server is
started up on a new timeline within that segment. The cascading standby will
get confused if the WAL it received is not valid, and will get stuck until
it's restarted. This patch fixes that problem by not allowing WAL restored
from the archive to be streamed to a cascading standby until it's been
replayed, and thus validated.
Serializable Snapshot Isolation used for serializable transactions
depends on acquiring SIRead locks on all heap relation tuples which
are used to generate the query result, so that a later delete or
update of any of the tuples can flag a read-write conflict between
transactions. This is normally handled in heapam.c, with tuple level
locking. Since an index-only scan avoids heap access in many cases,
building the result from the index tuple, the necessary predicate
locks were not being acquired for all tuples in an index-only scan.
To prevent problems with tuple IDs which are vacuumed and re-used
while the transaction still matters, the xmin of the tuple is part of
the tag for the tuple lock. Since xmin is not available to the
index-only scan for result rows generated from the index tuples, it
is not possible to acquire a tuple-level predicate lock in such
cases, in spite of having the tid. If we went to the heap to get the
xmin value, it would no longer be an index-only scan. Rather than
prohibit index-only scans under serializable transaction isolation,
we acquire an SIRead lock on the page containing the tuple, when it
was not necessary to visit the heap for other reasons.
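Schematically (table and column names are invented), the case covered is
an index-only scan inside a serializable transaction:

    CREATE TABLE accounts (id int PRIMARY KEY, balance numeric);

    BEGIN ISOLATION LEVEL SERIALIZABLE;
    -- If this runs as an index-only scan and the heap page need not be
    -- visited, the tuple's xmin is unavailable, so an SIRead lock is
    -- taken on the heap page rather than on the individual tuple.
    SELECT id FROM accounts WHERE id BETWEEN 1 AND 100;
    COMMIT;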
Backpatch to 9.2.
Kevin Grittner and Tom Lane
Each setup block is run as a single PQexec submission, and some
statements such as VACUUM cannot be combined with others in such a
block.
Backpatch to 9.2.
Kevin Grittner and Tom Lane
The same message is used in both pg_restore and pg_dump, and it's
confusing to output "restoring data for table xyz" when the user
is just doing a pg_dump.
This gets rid of a dangerous-looking use of the not-volatile XLogCtl
pointer in a couple of spinlock-protected sections, where the normal
coding rule is that you should only access shared memory through a
pointer-to-volatile. I think the risk is only hypothetical not actual,
since for there to be a bug the compiler would have to move the spinlock
acquire or release across the memcpy() call, which one sincerely hopes
it will not. Still, it looks cleaner this way.
Per comment from Daniel Farina and subsequent discussion.
Formerly psql would only show reloptions for relkinds 'r' and 'f' (plain tables
and foreign tables). However, as of 9.2, views can also have reloptions,
namely security_barrier. The relkind restriction seems pointless and
not at all future-proof, so just print reloptions whenever there are any.
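For example (the view name is invented), with a security_barrier view:

    CREATE VIEW v_secure WITH (security_barrier = true) AS
        SELECT 1 AS x;

    -- In psql, \d+ v_secure now lists the view's reloptions
    -- (e.g. "Options: security_barrier=true") instead of omitting them
    -- because the relkind is 'v'.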
In passing, make some cosmetic improvements to the code that pulls the
"tableinfo" fields out of the PGresult.
Noted and patched by Dean Rasheed, with adjustment for all relkinds by me.
We can detect whether the planner top level is going to care at all about
cheap startup cost (it will only do so if query_planner's tuple_fraction
argument is greater than zero). If it isn't, we might as well discard
paths immediately whose only advantage over others is cheap startup cost.
This turns out to get rid of quite a lot of paths in complex queries ---
I saw planner runtime reduction of more than a third on one large query.
Since add_path isn't currently passed the PlannerInfo "root", the easiest
way to tell it whether to do this was to add a bool flag to RelOptInfo.
That's a bit redundant, since all relations in a given query level will
have the same setting. But in the future it's possible that we'd refine
the control decision to work on a per-relation basis, so this seems like
a good arrangement anyway.
Per my suggestion of a few months ago.
If a PlaceHolderVar contains a pulled-up LATERAL reference, its minimum
possible evaluation level might be higher in the join tree than its
original syntactic location. That in turn affects the ph_needed level for
any contained PlaceHolderVars (that is, those PHVs had better propagate up
the join tree at least to the evaluation level of the outer PHV). We got
this mostly right, but mark_placeholder_maybe_needed() failed to account
for the effect, and in consequence could leave the inner PHVs with
ph_may_need less than what their ultimate ph_needed value will be. That's
bad because it could lead to failure to select a join order that will allow
evaluation of the inner PHV at a valid location. Fix that, and add an
Assert that checks that we don't ever set ph_needed to more than
ph_may_need.
Only warn when connecting to a newer server, since connecting to older
servers works pretty well nowadays. Also update the documentation a
little about current psql/server compatibility expectations.
This was removed in commit cd004067742ee16ee63e55abfb4acbd5f09fbaab;
we're not quite sure why, but there have been reports of crashes due
to ActiveState Perl being built with it when we are not, and restoring
it certainly seems like the right thing to do. There is still some
uncertainty as to why it sometimes fails and sometimes doesn't.
Original patch from Owais Khani, substantially reworked and
extended by Andrew Dunstan.