Dept of second thoughts: given the RESET SESSION AUTHORIZATION that
was just added by commit cec550139, we don't need the reconnection
that used to be here. Might as well buy back a few microseconds.
Test the external-sort code path in CLUSTER for two different scenarios:
multiple-pass external sorting, and the best case for replacement
selection, where only one run is produced, so that no merge is required.
This test would have caught the bug fixed in commit 1b0fc8507, at
least when run with valgrind enabled.
In passing, add a short-circuit test in plan_cluster_use_sort() to make
dead certain that it selects sorting when enable_indexscan is off. As
things stand, that would happen anyway, but it seems like good future
proofing for this test.
Peter Geoghegan
Discussion: <CAM3SWZSgxehDkDMq1FdiW2A0Dxc79wH0hz1x-TnGy=1BXEL+nw@mail.gmail.com>
Change assorted places in our Perl code that did things like
system("prog $path/file");
to do it more like
system('prog', "$path/file");
which is safe against spaces and other special characters in the path
variable. The latter was already the prevailing style, but a few bits
of code hadn't gotten this memo. Back-patch to 9.4 as relevant.
Michael Paquier, Kyotaro Horiguchi
Discussion: <20160704.160213.111134711.horiguchi.kyotaro@lab.ntt.co.jp>
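The difference between the two calling styles is not specific to Perl. As a rough analogue only (this is Python, not the Perl code touched by the commit, and the path and helper command are made up for illustration), the sketch below shows how a shell-style command string gets split on the space in the path, while a list of arguments reaches the program intact:

```python
import shlex
import subprocess
import sys

# Hypothetical path containing a space; not taken from the commit.
path = "/tmp/dir with spaces"

# String form: a POSIX shell would tokenize the command on whitespace,
# so the single path turns into several separate arguments.
unsafe = f"prog {path}/file"
tokens_via_shell = shlex.split(unsafe)
print(tokens_via_shell)  # ['prog', '/tmp/dir', 'with', 'spaces/file']

# List form: no shell is involved, and each list element is passed to the
# child process as exactly one argument.
argv = [sys.executable, "-c", "import sys; print(sys.argv[1:])", f"{path}/file"]
out = subprocess.run(argv, capture_output=True, text=True, check=True).stdout.strip()
print(out)  # ['/tmp/dir with spaces/file']
```

The list form is the counterpart of Perl's system('prog', "$path/file").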
Per discussion on pgsql-hackers, conninfo is better as the column name
because it's more commonly used in PostgreSQL.
Catalog version bumped due to the change of pg_proc.
Author: Michael Paquier
ExecInsertIndexTuples treated an exclusion constraint as subject to
noDupErr processing even when it was not listed in arbiterIndexes, and
would therefore not error out for a conflict in such a constraint, instead
returning it as an arbiter-index failure. That led to an infinite loop in
ExecInsert, since ExecCheckIndexConstraints ignored the index as-intended
and therefore didn't throw the expected error. To fix, make the exclusion
constraint code path use the same condition as the index_insert call does
to decide whether no-error-for-duplicates behavior is appropriate. While
at it, refactor a little bit to avoid unnecessary list_member_oid calls.
(That surely wouldn't save anything worth noticing, but I find the code
a bit clearer this way.)
Per bug report from Heikki Rauhala. Back-patch to 9.5 where ON CONFLICT
was introduced.
Report: <4C976D6B-76B4-434C-8052-D009F7B7AEDA@reaktor.fi>
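A minimal sketch of the condition described above, written in Python rather than the executor's C. The function name and the convention that an empty arbiter list means "every index is an arbiter" are assumptions made for illustration; the point is only that the exclusion-constraint path and the index_insert path should agree on one test:

```python
from typing import Sequence


def apply_no_dup_err(no_dup_err: bool,
                     arbiter_indexes: Sequence[int],
                     index_oid: int) -> bool:
    """Return True if a conflict on this index should be reported to the
    caller as an arbiter conflict instead of raising an error.

    Assumption: an empty arbiter list means all indexes act as arbiters.
    """
    return no_dup_err and (not arbiter_indexes or index_oid in arbiter_indexes)


# Hypothetical OIDs: 100 is the arbiter index, 200 is an unrelated
# exclusion constraint's index.
print(apply_no_dup_err(True, [100], 100))  # True: listed as an arbiter
print(apply_no_dup_err(True, [100], 200))  # False: must raise the usual error
```

With a single shared test like this, an index that is not an arbiter can never be reported as an arbiter-index failure, which is what led to the loop.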
In commit 915b703e1 I gave get_agg_clause_costs() the responsibility of
marking Aggref nodes with the appropriate aggtranstype. I failed to notice
that where it was being called from, it might see only a subset of the
Aggref nodes that were in the original targetlist. Specifically, if there
are duplicate aggregate calls in the tlist, either make_sort_input_target
or make_window_input_target might put just a single instance into the
grouping_target, and then only that instance would get marked. Fix by
moving the call back into grouping_planner(), before we start building
assorted PathTargets from the query tlist. Per report from Stefan Huehner.
Report: <20160702131056.GD3165@huehner.biz>
In commit 68fa28f77 I tried to teach SS_finalize_plan() to cope with
initPlans attached anywhere in the plan tree, by dint of moving its
handling of those into the recursion in finalize_plan(). It turns out that
that doesn't really work: if a lower-level plan node emits an initPlan
output parameter in its targetlist, it's legitimate for upper levels to
reference those Params --- and at the point where this code runs, those
references look just like the Param itself, so finalize_plan() quite
properly rejects them as being in the wrong place. We could lobotomize
the checks enough to allow that, probably, but then it's not clear that
we'd have any meaningful check for misplaced Params at all. What seems
better, at least in the near term, is to tweak standard_planner() a bit
so that initPlans are never placed anywhere but the topmost plan node
for a query level, restoring the behavior that occurred pre-9.6. Possibly
we can do better if this code is ever merged into setrefs.c: then it would
be possible to check a Param's placement only when we'd failed to replace
it with a Var referencing a child plan node's targetlist.
BTW, I'm now suspicious that finalize_plan is doing the wrong thing by
returning the node's allParam rather than extParam to be incorporated
in the parent node's set of used parameters. However, it makes no
difference given that initPlans only appear at top level, so I'll leave
that alone for now.
Another thing that emerged from this is that standard_planner() needs
to check for initPlans before deciding that it's safe to stick a Gather
node on top in force_parallel_mode mode. We previously guarded against
that by deciding the plan wasn't wholePlanParallelSafe if any subplans
had been found, but after commit 5ce5e4a12 it's necessary to have this
substitute test, because path parallel_safe markings don't account for
initPlans. (Normally, we'd have decided the paths weren't safe anyway
due to appearances of SubPlan nodes, Params, or CTE scans somewhere in
the tree --- but it's possible for those all to be optimized away while
initPlans still remain.)
Per fuzz testing by Andreas Seltenreich.
Report: <874m89rw7x.fsf@credativ.de>
As of 9.6, pg_regress doesn't build unless storage/lwlocknames.h has been
created; but there was nothing forcing that to happen if you just went into
src/test/regress/ and built there. We previously had a similar complaint
about plpython.
To fix in a way that won't break next time we invent a generated header,
make src/backend/Makefile expose a phony target for updating all the
include files it builds, and invoke that before building pg_regress or
plpython. In principle, maybe we ought to invoke that everywhere; but
it would add a lot of usually-useless make cycles, so let's just do it
in the places where people have complained.
I made a couple of cosmetic adjustments in src/backend/Makefile as well,
to deal with the generated headers in consistent orders.
Michael Paquier and Tom Lane
Report: <31398.1467036827@sss.pgh.pa.us>
Report: <20150916200959.GB32090@msg.df7cb.de>
Commit 3fc6e2d7f5b652b417fa6937c34de2438d60fa9f introduced new "upper"
RelOptInfo structures but didn't set consider_parallel for them
correctly, a point I completely missed when reviewing it. Later,
commit e06a38965b3bcdaa881e7e06892d4d8ab6c2c980 made the situation
worse by doing it incorrectly for the grouping relation. Try to
straighten all of that out. Along the way, get rid of the annoying
wholePlanParallelSafe flag, which was only necessary because of
the fact that upper planning stages didn't use paths at the time
that code was written.
The most important immediate impact of these changes is that
force_parallel_mode will provide useful test coverage in quite a few
more scenarios than it did previously, but it's also necessary
preparation for fixing some problems related to subqueries.
Patch by me, reviewed by Tom Lane.
In non-text output formats, parallelized aggregates were reporting
"Partial" or "Finalize" as a field named "Operation", which might be all
right in the absence of any context --- but other plan node types use that
field to report SQL-visible semantics, such as Select/Insert/Update/Delete.
So that naming choice didn't seem good to me. I changed it to "Partial
Mode".
Also, the field did not appear at all for a non-parallelized Agg plan node,
which is contrary to expectation in non-text formats. We're notionally
producing objects that conform to a schema, so the set of fields for a
given node type and EXPLAIN mode should be well-defined. I set it up to
fill in "Simple" in such cases.
Other fields that were added for parallel query, namely "Parallel Aware"
and Gather's "Single Copy", had not gotten the word on that point either.
Make them appear always in non-text output.
Also, the latter two fields were nominally producing boolean output, but
were getting it wrong, because bool values shouldn't be quoted in JSON or
YAML. Somehow we'd not needed an ExplainPropertyBool formatting subroutine
before 9.6; but now we do, so invent it.
Discussion: <16002.1466972724@sss.pgh.pa.us>
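To illustrate the quoting point with a generic JSON library (Python's json module here, not EXPLAIN's own output code), compare a boolean emitted as a string with one emitted as a real JSON boolean. Only the field name "Parallel Aware" comes from the text above; the rest is illustrative:

```python
import json

# A quoted value is a JSON string, which consumers must parse themselves.
quoted = json.dumps({"Parallel Aware": "true"})
# An unquoted value is a JSON boolean.
proper = json.dumps({"Parallel Aware": True})

print(quoted)  # {"Parallel Aware": "true"}
print(proper)  # {"Parallel Aware": true}

# A consumer reading the two documents back sees different types.
print(type(json.loads(quoted)["Parallel Aware"]).__name__)  # str
print(type(json.loads(proper)["Parallel Aware"]).__name__)  # bool
```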
Since get_relation_foreign_keys doesn't try to determine whether RTEs
are actually part of the query semantics, it might make FK info records
linking to RTEs that won't have a RelOptInfo at all. Cope with that.
Per bug #14219 from Andrew Gierth.
Report: <20160629183338.1397.43514@wrigleys.postgresql.org>
If we need to use a gating Result node for pseudoconstant quals,
create_scan_plan() intentionally suppresses use_physical_tlist's checks
on whether there are matches for sortgroupref labels, on the grounds that
we don't need matches because we can label the Result's projection output
properly. However, it then called apply_pathtarget_labeling_to_tlist
anyway. This oversight was harmless when written, but in commit aeb9ae645
I made that function throw an error if there was no match. Thus, the
combination of a table scan, pseudoconstant quals, and a non-simple-Var
sortgroupref column threw the dreaded "ORDER/GROUP BY expression not found
in targetlist" error. To fix, just skip applying the labeling in this
case. Per report from Rushabh Lathia.
Report: <CAGPqQf2iLB8t6t-XrL-zR233DFTXxEsfVZ4WSqaYfLupEoDxXA@mail.gmail.com>
Previously, these commands always planned the given query and went through
executor startup before deciding not to actually run the query if WITH NO
DATA is specified. This behavior is problematic for pg_dump because it
may cause errors to be raised that we would rather not see before a
REFRESH MATERIALIZED VIEW command is issued. See for example bug #13907
from Marian Krucina. This change is not sufficient to fix that particular
bug, because we also need to tweak pg_dump to issue the REFRESH later,
but it's a necessary step on the way.
A user-visible side effect of doing things this way is that the returned
command tag for WITH NO DATA cases will now be "CREATE MATERIALIZED VIEW"
or "CREATE TABLE AS", not "SELECT 0". We could preserve the old behavior
but it would take more code, and arguably that was just an implementation
artifact, not intended behavior, anyhow.
In 9.5 and HEAD, also get rid of the static variable CreateAsReladdr, which
was trouble waiting to happen; there is not any prohibition on nested
CREATE commands.
Back-patch to 9.3 where CREATE MATERIALIZED VIEW was introduced.
Michael Paquier and Tom Lane
Report: <20160202161407.2778.24659@wrigleys.postgresql.org>
The <-> operator now has higher precedence than the & (AND) operator. This
change was motivated by an unexpected difference between two similar queries:
'a & b <-> c'::tsquery and 'b <-> c & a'. Previously the first query meant
'(a & b) <-> c' and the second one '(b <-> c) & a'; now the phrase operator
is evaluated first in both.
Per suggestion from Tom Lane 32260.1465402409@sss.pgh.pa.us
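The effect of the new precedence can be sketched with a toy precedence-climbing parser. This is not the tsquery parser: the tokenizer, the tuple-based tree, and the omission of the ! operator are simplifications chosen for illustration, and the placement of | below & is an assumption of this sketch:

```python
import re

# Higher number binds tighter: <-> above &, & above | (assumed ordering).
PRECEDENCE = {"|": 1, "&": 2, "<->": 3}


def parse(text):
    """Parse a simplified query string into nested (op, left, right) tuples."""
    tokens = re.findall(r"<->|[&|()]|\w+", text)
    pos = 0

    def primary():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = expr(1)
            pos += 1  # skip the closing ")"
            return node
        return tok

    def expr(min_prec):
        nonlocal pos
        left = primary()
        while (pos < len(tokens) and tokens[pos] in PRECEDENCE
               and PRECEDENCE[tokens[pos]] >= min_prec):
            op = tokens[pos]
            pos += 1
            # Parse the right side at one level higher: left-associative.
            right = expr(PRECEDENCE[op] + 1)
            left = (op, left, right)
        return left

    return expr(1)


print(parse("a & b <-> c"))  # ('&', 'a', ('<->', 'b', 'c'))
print(parse("b <-> c & a"))  # ('&', ('<->', 'b', 'c'), 'a')
```

In both spellings the phrase pair b <-> c is grouped first, so the two queries now mean the same thing.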
If there is no positional information for the lexemes, the phrase operator
will not fall back to the AND operator. This change requires modifying the
TS_execute() interface, because in some places (in indexes, for example)
positional information is inaccessible, and in those cases we need to force
the fallback to AND.
Per discussion c19fcfec308e6ccd952cdde9e648b505@mail.gmail.com
The original specification for this called for the deserialization function
to have signature "deserialize(serialtype) returns transtype", which is a
security violation if transtype is INTERNAL (which it always would be in
practice) and serialtype is not (which ditto). The patch blithely overrode
the opr_sanity check for that, which was sloppy-enough work in itself,
but the indisputable reason this cannot be allowed to stand is that CREATE
FUNCTION will reject such a signature and thus it'd be impossible for
extensions to create parallelizable aggregates.
The minimum fix to make the signature type-safe is to add a second, dummy
argument of type INTERNAL. But to lock it down a bit more and make misuse
of INTERNAL-accepting functions less likely, let's get rid of the ability
to specify a "serialtype" for an aggregate and just say that the only
useful serialtype is BYTEA --- which, in practice, is the only interesting
value anyway, due to the usefulness of the send/recv infrastructure for
this purpose. That means we only have to allow "serialize(internal)
returns bytea" and "deserialize(bytea, internal) returns internal" as
the signatures for these support functions.
In passing fix bogus signature of int4_avg_combine, which I found thanks
to adding an opr_sanity check on combinefunc signatures.
catversion bump due to removing pg_aggregate.aggserialtype and adjusting
signatures of assorted built-in functions.
David Rowley and Tom Lane
Discussion: <27247.1466185504@sss.pgh.pa.us>
The annotation for "ERROR: language "foo" is not trusted" used to say
"HINT: Only superusers can use untrusted languages", which was fairly
poorly thought out. For one thing, it's not a hint about what to do,
but a statement of fact, which makes it errdetail. But also, this
fails to clarify things much, because there's a missing step in the
chain of reasoning. I think it's more useful to say "GRANT and REVOKE
are not allowed on untrusted languages, because only superusers can use
untrusted languages".
It's been like this for a long time, but given the lack of previous
complaints, I don't think this is worth back-patching.
Discussion: <1417.1466289901@sss.pgh.pa.us>
The previous code neglected the fact that the scanjoin_target might
carry sortgroupref labelings that we need to absorb. Instead, do
create_projection_path() unconditionally, and tweak the path's cost
estimate after the fact. (I'm now convinced that we ought to refactor
the way we account for sometimes not needing a separate projection step,
but right now is not the time for that sort of cleanup.)
Problem identified by Amit Kapila, patch by me.
Commit 04ae11f62e643e07c411c4935ea6af46cb112aa9 removed some broken
code to apply the scan/join target to partial paths, but its theory
that this processing step is totally unnecessary turns out to be wrong.
Put similar code back again, but this time, check for parallel-safety
and avoid in-place modifications to paths that may already have been
used as part of some other path.
(This is not an entirely elegant solution to this problem; it might
be better, for example, to postpone generate_gather_paths for the
topmost scan/join rel until after the scan/join target has been
applied. But this is not the time for such redesign work.)
Amit Kapila and Robert Haas
If you really want to vacuum every single page in the relation,
regardless of apparent visibility status or anything else, you can use
this option. In previous releases, this behavior could be achieved
using VACUUM (FREEZE), but because we can now recognize all-frozen
pages as not needing to be frozen again, that no longer works. There
should be no need for routine use of this option, but maybe bugs or
disaster recovery will necessitate its use.
Patch by me, reviewed by Andres Freund.
In commit 8c1d9d56e9a00680a035b8b333a98ea16b121eb7, I attempted to
add a regression test that would fail if the target list was pushed
into a parallel worker, but due to brain fade on my part, it just
randomly fails whether anything is bad or not, because the error check
inside the parallel_restricted() function tests whether there is
*any process in the system* that is not connected to a client, not
whether the process running the query is not connected to a client.
A little experimentation has left me pessimistic about the
prospects of doing better here in a short amount of time, so let's
just fall back to checking that the plan is as we expect and leave
the execution-time check for another day.
Commit 14a254fb52423c57059851abafbd1247261f7f03 managed not to
exercise the code it was intended to test, and the comment explaining
why no "parallel worker" line showed up in the context wasn't right.
Amit Kapila, tweaked by me per Amit's analysis.
To achieve this, ANALYZE the data table before querying it, as suggested
by Tom Lane. On my system, this enables the test to pass with 128 kB of
work_mem (a value with which other tests fail -- so it seems good
enough).
Reported by Michaël Paquier.
This commit reverts 137805f89 as well as the associated commits 015e88942,
5306df283, and 68d704edb. We found multiple bugs in this feature, and
there was concern about possible planner slowdown (though to be fair,
exhibiting a very large slowdown proved difficult). The way forward
requires a considerable rewrite, which may or may not be possible to
accomplish in time for beta2. In my judgment reviewing the rewrite will
be easier to accomplish starting from a clean slate, so let's temporarily
revert what's there now. This also leaves us in a safe state if it turns
out to be necessary to postpone the rewrite to the next development cycle.
Discussion: <20160429102531.GA13701@huehner.biz>
dumpAccessMethod() didn't get the memo that we now have a bitfield for
the components which should be dumped instead of a simple boolean.
Correct that by checking if the relevant bit is set for each component
being dumped out (and not dumping it out if it isn't set).
This corrects an issue where CREATE ACCESS METHOD commands were being
included in non-binary-upgrades when an extension included an access
method (as the bloom extension does).
Also add a regression test to make sure that we only dump out the
ACCESS METHOD commands, when they are part of an extension, when doing
a binary upgrade.
Pointed out by Thom Brown.
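The bitfield-versus-boolean point can be sketched as follows. This is Python rather than pg_dump's C, and the flag names and emitted strings are hypothetical stand-ins, not pg_dump's actual definitions:

```python
from enum import IntFlag


class DumpComponent(IntFlag):
    """Hypothetical component bits; names are illustrative only."""
    NONE = 0
    DEFINITION = 1 << 0
    COMMENT = 1 << 1


def dump_access_method(components):
    """Emit only the pieces whose bit is set, instead of treating the
    whole value as a single true/false switch."""
    emitted = []
    if components & DumpComponent.DEFINITION:
        emitted.append("CREATE ACCESS METHOD")
    if components & DumpComponent.COMMENT:
        emitted.append("COMMENT ON ACCESS METHOD")
    return emitted


# Treating the value as a boolean would dump the definition here too,
# because any nonzero bitfield is "true".
print(dump_access_method(DumpComponent.COMMENT))  # ['COMMENT ON ACCESS METHOD']
print(dump_access_method(DumpComponent.NONE))     # []
```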
Further thought about bug #14174 motivated me to try the case of a
R/W datum being returned from a VALUES list, and sure enough it was
broken. Fix that.
Also add a regression test case exercising the same scenario for
FunctionScan. That's not broken right now, because the function's
result will get shoved into a tuplestore between generation and use;
but it could easily become broken whenever we get around to optimizing
FunctionScan better.
There don't seem to be any other places where we put the result of
expression evaluation into a virtual tuple slot that could then be
the source for Vars of further expression evaluation, so I think
this is the end of this bug.
If a plan node output expression returns an "expanded" datum, and that
output column is referenced in more than one place in upper-level plan
nodes, we need to ensure that what is returned is a read-only reference
not a read/write reference. Otherwise one of the referencing sites could
scribble on or even delete the expanded datum before we have evaluated the
others. Commit 1dc5ebc9077ab742, which introduced this feature, supposed
that it'd be sufficient to make SubqueryScan nodes force their output
columns to read-only state. The folly of that was revealed by bug #14174
from Andrew Gierth, and really should have been immediately obvious
considering that the planner will happily optimize SubqueryScan nodes
out of the plan without any regard for this issue.
The safest fix seems to be to make ExecProject() force its results into
read-only state; that will cover every case where a plan node returns
expression results. Actually we can delegate this to ExecTargetList()
since we can recursively assume that plain Vars will not reference
read-write datums. That should keep the extra overhead down to something
minimal. We no longer need ExecMakeSlotContentsReadOnly(), which was
introduced only in support of the idea that just a few plan node types
would need to do this.
In the future it would be nice to have the planner account for this problem
and inject force-to-read-only expression evaluation nodes into only the
places where there's a risk of multiple evaluation. That's not a suitable
solution for 9.5 or even 9.6 at this point, though.
Report: <20160603124628.9932.41279@wrigleys.postgresql.org>
Mostly these are just comments but there are a few in documentation
and a handful in code and tests. Hopefully this doesn't cause too much
unnecessary pain for backpatching. I refrained from fixing some of the most
common ones, like "thru", for that reason. The rest don't seem numerous
enough to cause problems.
Thanks to Kevin Lyda's tool https://pypi.python.org/pypi/misspellings
The IF EXISTS option was documented, and implemented in the grammar, but
it didn't actually work for lack of support in does_not_exist_skipping().
Per bug #14160.
Report and patch by Kouhei Sutou
Report: <20160527070433.19424.81712@wrigleys.postgresql.org>
As part of upper planner pathification (commit 3fc6e2d7f5b652b4) I redid
createplan.c's approach to the physical-tlist optimization, in which scan
nodes are allowed to return exactly the underlying table's columns so as
to save doing a projection step at runtime. The logic was intentionally
more aggressive than before about applying the optimization, which is
generally a good thing, but Andres Freund found a case in which it got
too aggressive. Namely, if any column is referenced more than once in
the parent plan node's sorting or grouping column list, we can't optimize
because then that column would need to have more than one ressortgroupref
label, and we only have space for one.
Add logic to detect this situation in use_physical_tlist(), and also add
some error checking in apply_pathtarget_labeling_to_tlist(), which this
example proves was being overly cavalier about whether what it was doing
made any sense.
The added test case exposes the problem only because we do not eliminate
duplicate grouping keys. That might be something to fix someday, but it
doesn't seem like appropriate post-beta work.
Report: <20160526021235.w4nq7k3gnheg7vit@alap3.anarazel.de>
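The duplicate-column condition can be sketched in isolation. This Python fragment covers only the "same column referenced more than once" test described above; the function name is hypothetical and the real use_physical_tlist() checks other things as well:

```python
from collections import Counter


def has_duplicate_sort_group_column(columns):
    """Return True if any column appears more than once in the parent
    node's sorting or grouping column list. Such a column would need two
    sortgroupref labels, but there is room for only one."""
    return any(count > 1 for count in Counter(columns).values())


# Hypothetical column lists.
print(has_duplicate_sort_group_column(["a", "b"]))       # False: optimization allowed
print(has_duplicate_sort_group_column(["a", "b", "a"]))  # True: must project instead
```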
All of the other tables used in the query in dumpTable(), which is
collecting column-level ACLs, are qualified, so we should be qualifying
the pg_init_privs, the related sub-select against pg_class and the
other queries added by the pg_dump catalog ACLs work.
Also, use ::regclass (or ::pg_catalog.regclass, where appropriate)
instead of using a poorly constructed query to get the OID for various
catalog tables.
Issues identified by Noah and Alvaro, patch by me.
subquery_planner() failed to apply expression preprocessing to the
arbiterElems and arbiterWhere fields of an OnConflictExpr. No doubt the
theory was that this wasn't necessary because we don't actually try to
execute those expressions; but that's wrong, because it results in failure
to match to index expressions or index predicates that are changed at all
by preprocessing. Per bug #14132 from Reynold Smith.
Also add pullup_replace_vars processing for onConflictWhere. Perhaps
it's impossible to have a subquery reference there, but I'm not exactly
convinced; and even if true today it's a failure waiting to happen.
Also add some comments to other places where one or another field of
OnConflictExpr is intentionally ignored, with explanation as to why it's
okay to do so.
Also, catalog/dependency.c failed to record any dependency on the named
constraint in ON CONFLICT ON CONSTRAINT, allowing such a constraint to
be dropped while rules exist that depend on it, and allowing pg_dump to
dump such a rule before the constraint it refers to. The normal execution
path managed to error out reasonably for a dangling constraint reference,
but ruleutils.c dumped core; so in addition to fixing the omission, add
a protective check in ruleutils.c, since we can't retroactively add a
dependency in existing databases.
Back-patch to 9.5 where this code was introduced.
Report: <20160510190350.2608.48667@wrigleys.postgresql.org>
The test_pg_dump extension doesn't have a C component, so we need
to exclude it from the MSVC build system trying to figure out how
to build it.
Also add a "MODULES" line to the Makefile, as test_extensions has.
Might not be necessary, but seems good to keep things consistent.
Lastly, remove the 'installcheck' line from test_pg_dump, as that
was causing redefinition errors, at least on my box. This also
makes test_pg_dump consistent with how commit_ts is set up.
The Makefile for test_pg_dump shouldn't have a MODULES_big line
because there's no actual compiled bit for that extension. Hopefully
this will fix the Windows buildfarm members which were complaining.
In passing, also add the 'prove_installcheck' bit to the pg_dump and
test_pg_dump Makefiles, to get the buildfarm members to actually run
those tests.
Default roles really should be like regular roles, for the most part.
This removes a number of checks that were trying to make default roles
extra special by not allowing them to be used as regular roles.
We still prevent users from creating roles in the "pg_" namespace or
from altering roles which exist in that namespace via ALTER ROLE, as
we can't preserve such changes, but otherwise the roles are very much
like regular roles.
Based on discussion with Robert and Tom.
This TAP test suite will create a new cluster, populate it based on
the 'create_sql' values in the '%tests' hash, run all of the runs
defined in the '%pgdump_runs' hash, and then for each test in the
'%tests' hash, compare each run's output to the regular expression
defined for the test under the 'like' and 'unlike' functions, as
appropriate.
While this test suite covers a fair bit of ground (67% of pg_dump.c
and quite a bit of the other files in src/bin/pg_dump), there is
still quite a bit which remains to be added to provide better code
coverage. Still, this is quite a bit better than we had, and has
found a few bugs already (note that the CREATE TRANSFORM test is
commented out, as it is currently failing).
Idea for using the TAP system from Tom, though all of the code is mine.
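The shape of that comparison loop can be sketched as below. This is Python, not the Perl TAP script; the run names, dump outputs, and test entries are invented for illustration, and only the 'like'/'unlike' idea is taken from the description above:

```python
import re

# Hypothetical dump outputs, keyed by run name.
runs = {
    "full": "CREATE TABLE t (id int);\nCOMMENT ON TABLE t IS 'x';\n",
    "no_comments": "CREATE TABLE t (id int);\n",
}

# Each test names a regular expression plus the runs where it must match
# ('like') and the runs where it must not match ('unlike').
tests = {
    "CREATE TABLE t": {
        "regexp": re.compile(r"^CREATE TABLE t ", re.M),
        "like": {"full", "no_comments"},
        "unlike": set(),
    },
    "COMMENT ON TABLE t": {
        "regexp": re.compile(r"^COMMENT ON TABLE t ", re.M),
        "like": {"full"},
        "unlike": {"no_comments"},
    },
}


def check(tests, runs):
    """Return a list of (test, run, problem) tuples; empty means all passed."""
    failures = []
    for name, spec in tests.items():
        for run, output in runs.items():
            matched = spec["regexp"].search(output) is not None
            if run in spec["like"] and not matched:
                failures.append((name, run, "expected a match"))
            if run in spec["unlike"] and matched:
                failures.append((name, run, "expected no match"))
    return failures


print(check(tests, runs))  # []
```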
The similarity of the original names to SQL keywords seems like a bad
idea. Rename them before we're stuck with 'em forever.
In passing, minor code and docs cleanup.
Discussion: <4875.1462210058@sss.pgh.pa.us>
Commit 7d9a4737c268f61fb8800957631f12d3f13be218 greatly improved the
accuracy of the numeric transcendental functions; however, it failed to
consider the case where the result from pow() is close to the overflow
threshold, for example 0.12 ^ -2345.6. For such inputs, where the
result has more than 2000 digits before the decimal point, the decimal
result weight estimate was being clamped to 2000, leading to a loss of
precision in the final calculation.
Fix this by replacing the clamping code with an overflow test that
aborts the calculation early if the final result is sure to overflow,
based on the overflow limit in exp_var(). This provides the same
protection against integer overflow in the subsequent result scale
computation as the original clamping code, but it also ensures that
precision is never lost and saves compute cycles in cases that are
sure to overflow.
The new early overflow test works with the initial low-precision
result (expected to be accurate to around 8 significant digits) and
includes a small fuzz factor to ensure that it doesn't kick in for
values that would not overflow exp_var(), so the overall overflow
threshold of pow() is unchanged and consistent for all inputs with
non-integer exponents.
Author: Dean Rasheed
Reviewed-by: Tom Lane
Discussion: http://www.postgresql.org/message-id/CAEZATCUj3U-cQj0jjoia=qgs0SjE3auroxh8swvNKvZWUqegrg@mail.gmail.com
See-also: http://www.postgresql.org/message-id/CAEZATCV7w+8iB=07dJ8Q0zihXQT1semcQuTeK+4_rogC_zq5Hw@mail.gmail.com
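A quick way to see why the example input runs past the old clamp is to estimate how many digits the result has before the decimal point. The sketch below uses double-precision logarithms in Python; it is not the numeric.c algorithm, and the helper name is made up:

```python
import math


def estimated_integer_digits(base, exponent):
    """Estimate the number of digits before the decimal point of
    base ** exponent, using log10(base ** exponent) = exponent * log10(base)."""
    log10_result = exponent * math.log10(base)
    return max(math.floor(log10_result) + 1, 0)


digits = estimated_integer_digits(0.12, -2345.6)
# The estimate exceeds the 2000-digit clamp mentioned above.
print(digits, digits > 2000)
```

Because the estimate is above 2000, clamping the weight to 2000 understated the size of the result for this input.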
This reverts commits f07d18b6e94d, 82c83b337202, 3a3b309041b0, and
24c5f1a103ce.
This feature has shown enough immaturity that it was deemed better to
rip it out before rushing some more fixes at the last minute. There are
discussions on larger changes in this area for the next release.