struct representing a tree entry, rather than being a separately allocated
piece of storage. This API is at least as clean as the old one (if not
more so --- there were some bizarre choices in there) and it permits
very substantial memory savings, on the order of 2X in ginbulk.c's usage.
Also, fix minor memory leaks in code called by ginEntryInsert, in
particular in ginInsertValue and entryFillRoot, as well as ginEntryInsert
itself. These leaks resulted in the GIN index build context continuing
to bloat even after we'd filled it to maintenance_work_mem and started
to dump data out to the index.
In combination these fixes restore the GIN index build code to honoring
the maintenance_work_mem limit about as well as it did in 8.4. Speed
seems on par with 8.4 too, maybe even a bit faster, for a non-pathological
case in which HEAD was formerly slower.
Back-patch to 9.0 so we don't have a performance regression from 8.4.
possible (ie, whenever the tsquery is a constant), even when no statistics
are available for the tsvector. For example, foo @@ 'a & b'::tsquery
can be expected to be more selective than foo @@ 'a'::tsquery, whether
or not we know anything about foo. We use DEFAULT_TS_MATCH_SEL as the assumed
selectivity of individual query terms when no stats are available, then
combine the terms according to the query's AND/OR structure as usual.
Per experimentation with Artur Dabrowski's example. (The fact that there
are no stats available in that example is a problem in itself, but
nonetheless tsmatchsel should be smarter about the case.)
Back-patch to 8.4 to keep all versions of tsmatchsel() in sync.
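The no-statistics estimate described above can be sketched roughly as
follows. This is only an illustration, not the tsmatchsel() code: the
QNode type, the sketch_* names and the 0.005 default are assumptions made
for the example, and the AND/OR rules shown are the usual
independent-probability combination.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed per-term default selectivity; the value is illustrative only. */
#define SKETCH_DEFAULT_TS_MATCH_SEL 0.005

/* Hypothetical query tree: a term, or AND/OR/NOT over subtrees. */
typedef enum { Q_TERM, Q_AND, Q_OR, Q_NOT } QNodeKind;

typedef struct QNode
{
    QNodeKind   kind;
    const struct QNode *left;   /* first operand (sole operand for Q_NOT) */
    const struct QNode *right;  /* second operand for Q_AND / Q_OR */
} QNode;

/* Keep an estimate inside the valid probability range [0, 1]. */
static double
sketch_clamp(double s)
{
    if (s < 0.0)
        return 0.0;
    if (s > 1.0)
        return 1.0;
    return s;
}

/*
 * Estimate the selectivity of a constant query when nothing is known
 * about the column: every term gets the default, then the terms are
 * combined according to the query's boolean structure.
 */
double
sketch_query_sel(const QNode *n)
{
    double      l;
    double      r;

    assert(n != NULL);
    switch (n->kind)
    {
        case Q_TERM:
            return SKETCH_DEFAULT_TS_MATCH_SEL;
        case Q_NOT:
            return sketch_clamp(1.0 - sketch_query_sel(n->left));
        case Q_AND:
            l = sketch_query_sel(n->left);
            r = sketch_query_sel(n->right);
            return sketch_clamp(l * r);
        case Q_OR:
            l = sketch_query_sel(n->left);
            r = sketch_query_sel(n->right);
            return sketch_clamp(l + r - l * r);
    }
    return SKETCH_DEFAULT_TS_MATCH_SEL;     /* not reached for valid kinds */
}
```

With two Q_TERM leaves, the Q_AND node comes out lower than a single
term and the Q_OR node comes out higher, which is the ordering the
'a & b' versus 'a' example relies on.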
routines to make them behave better in the presence of "lossy" index pointers.
The previous coding was outright incorrect for some cases, as recently
reported by Artur Dabrowski: scanGetItem would fail to return index entries in
cases where one index key had multiple exact pointers on the same page as
another key had a lossy pointer. Also, keyGetItem was extremely inefficient
for cases where a single index key generates multiple "entry" streams, such as
an @@ operator with a multiple-clause tsquery. The presence of a lossy page
pointer in any one stream defeated its ability to use the opclass
consistentFn, resulting in probing many heap pages that didn't really need to
be visited. In Artur's example case, a query like
WHERE tsvector @@ to_tsquery('a & b')
was about 50X slower than the theoretically equivalent
WHERE tsvector @@ to_tsquery('a') AND tsvector @@ to_tsquery('b')
The way that I chose to fix this was to have GIN call the consistentFn
twice with both TRUE and FALSE values for the in-doubt entry stream,
returning a hit if either call produces TRUE, but not if they both return
FALSE. The code handles this for the case of a single in-doubt entry stream,
but punts (falling back to the stupid behavior) if there's more than one lossy
reference to the same page. The idea could be scaled up to deal with multiple
lossy references, but I think that would probably be wasted complexity. At
least to judge by Artur's example, such cases don't occur often enough to be
worth trying to optimize.
Back-patch to 8.4. 8.3 did not have lossy GIN index pointers, so not
subject to these problems.
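A rough sketch of the single-in-doubt-stream rule, for illustration
only. The function-pointer type and all sketch_* names are hypothetical
and much simpler than the real consistentFn interface; the second call
is skipped when the first already returns true, which gives the same
answer as always calling twice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an opclass consistency check. */
typedef bool (*SketchConsistentFn) (const bool *check, size_t nentries);

/*
 * Decide whether a page may contain a match when exactly one entry
 * stream (index in_doubt) has only a lossy pointer for it.  Try the
 * in-doubt entry as TRUE and then as FALSE; report a hit if either
 * setting satisfies the check.
 */
bool
sketch_lossy_page_may_match(SketchConsistentFn consistent,
                            bool *check, size_t nentries, size_t in_doubt)
{
    bool        saved;
    bool        hit;

    assert(consistent != NULL && check != NULL && in_doubt < nentries);

    saved = check[in_doubt];
    check[in_doubt] = true;
    hit = consistent(check, nentries);
    if (!hit)
    {
        check[in_doubt] = false;
        hit = consistent(check, nentries);
    }
    check[in_doubt] = saved;    /* leave the caller's array unchanged */
    return hit;
}

/* Example check: an AND query, true only if every entry is present. */
bool
sketch_all_true(const bool *check, size_t nentries)
{
    size_t      i;

    for (i = 0; i < nentries; i++)
    {
        if (!check[i])
            return false;
    }
    return true;
}
```

For an AND of two terms, a page where the exact entry is known to be
absent is rejected without a heap visit, whichever way the in-doubt
entry is set.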
look through join alias Vars to avoid breaking join queries, and
move the test to someplace where it will catch more possible ways
of calling a function. We still ought to throw away the whole thing
in favor of a data-type-based solution, but that's not feasible in
the back branches.
This needs to be back-patched further than 9.0, but I don't have time
to do so today. Committing now so that the fix gets into 9.0beta4.
Transaction aborts now record their LSN to avoid corner case
behaviour in SR/HS, hence change of name of variables and functions.
As pointed out by Fujii Masao. Cosmetic changes only.
related functions. Per today's discussion, we will henceforth assume
that datatype I/O functions are either stable or immutable, never volatile.
(This implies in particular that domain CHECK constraint expressions shouldn't
be volatile, since domain_in executes them.) In turn, functions that execute
the I/O functions of arbitrary datatypes should always be labeled stable.
This affects the labeling of array_to_string, which was unsafely marked
immutable, and record_in, record_out, record_recv, record_send,
domain_in, domain_recv, which were over-conservatively marked volatile.
The array I/O functions were already marked stable, which is correct
per this policy but would have been wrong if we maintained domain_in
as volatile.
Back-patch to 9.0, along with an earlier fix to correctly mark cash_in
and cash_out as stable not immutable (since they depend on lc_monetary).
No catversion bump --- the implications of this are not currently
severe enough to justify a forced initdb.
assuming that a local char[] array would be aligned on at least a word
boundary. There are architectures on which that is pretty much guaranteed to
NOT be the case ... and those arches also don't like non-aligned memory
accesses, meaning that log_newpage() would crash if it ever got invoked.
Even on Intel-ish machines there's a potential for a large performance penalty
from doing I/O to an inadequately aligned buffer. So palloc it instead.
Backpatch to 8.0 --- 7.4 doesn't have this code.
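The alignment point can be illustrated with a small standalone sketch.
It uses calloc() as a stand-in for palloc, and SKETCH_BLCKSZ and the
sketch_* names are assumptions for the example. The C standard requires
heap allocations to be suitably aligned for any ordinary object type,
whereas a local char[] array carries no such guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_BLCKSZ 8192      /* assumed page size for the example */

/* Report whether a pointer is a multiple of the given alignment. */
int
sketch_is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0);
    return ((uintptr_t) p % alignment) == 0;
}

/*
 * Allocate a zeroed page-sized buffer on the heap.  Unlike
 * "char buf[SKETCH_BLCKSZ]" on the stack, the result is aligned well
 * enough to be accessed as words.  The caller must free() it.
 */
char *
sketch_alloc_page(void)
{
    return calloc(1, SKETCH_BLCKSZ);
}
```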
If a zeroed page is present in the heap, ALTER TABLE .. SET TABLESPACE will
set the LSN and TLI while copying it, which is wrong, and heap_xlog_newpage()
will do the same thing during replay, so the corruption propagates to any
standby. Note, however, that the bug can't be demonstrated unless archiving
is enabled, since without archiving we skip WAL logging altogether, and the
LSN/TLI are not set.
Back-patch to 8.0; prior releases do not have tablespaces.
Analysis and patch by Jeff Davis. Adjustments for back-branches and minor
wordsmithing by me.
list in ExecLockRows() forgot to allow for the possibility that some of the
rowmarks are for child tables that aren't relevant to the current row.
Per report from Kenichiro Tanaka.
Avoid hard-coding lockmode used for many altering DDL commands, allowing easier
future changes of lock levels. Implementation of initial analysis on DDL
sub-commands, so that many lock levels are now at ShareUpdateExclusiveLock or
ShareRowExclusiveLock, allowing certain DDL not to block reads/writes.
First of a number of planned changes in this area; additional docs required
when the full project is complete.
a pass-by-reference datatype with a nontrivial projection step.
We were using the same memory context for the projection operation as for
the temporary context used by the hashtable routines in execGrouping.c.
However, the hashtable routines feel free to reset their temp context at
any time, which'd lead to destroying input data that was still needed.
Report and diagnosis by Tao Ma.
Back-patch to 8.1, where the problem was introduced by the changes that
allowed us to work with "virtual" tuples instead of materializing intermediate
tuple values everywhere. The earlier code looks quite similar, but it doesn't
suffer the problem because the data gets copied into another context as a
result of having to materialize ExecProject's output tuple.
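The hazard can be sketched with a toy bump allocator standing in for a
memory context. Everything here (SketchArena and the sketch_* functions)
is hypothetical and far simpler than the real memory context machinery;
it only shows why data that must survive has to live in a different
context from one that other code feels free to reset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy arena: a fixed buffer handed out front to back (bytes only). */
typedef struct SketchArena
{
    char        buf[256];
    size_t      used;
} SketchArena;

/* Hand out n bytes from the arena, or NULL if it is full. */
void *
sketch_arena_alloc(SketchArena *a, size_t n)
{
    void       *p;

    assert(a != NULL);
    if (n > sizeof(a->buf) - a->used)
        return NULL;
    p = a->buf + a->used;
    a->used += n;
    return p;
}

/* Reset the arena: everything allocated from it becomes garbage. */
void
sketch_arena_reset(SketchArena *a)
{
    assert(a != NULL);
    memset(a->buf, 0, sizeof(a->buf));
    a->used = 0;
}
```

If a projected value is placed in its own arena, resetting the
hashtable's temporary arena leaves it intact; placing both in one arena
would wipe the value on the first reset.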
We used to be consistent about this, but my recent patch to add a
restart_after_crash GUC failed to follow the existing convention.
Report and patch from Fujii Masao.
We now use the phrase 'via local socket in' rather than 'on host' in both
\c and \conninfo output, when applicable.
Fujii Masao, with some kibitzing by me.
I've added a quote_all_identifiers GUC which affects the behavior
of the backend, and a --quote-all-identifiers argument to pg_dump
and pg_dumpall which sets the GUC and also affects the quoting done
internally by those applications.
Design by Tom Lane; review by Alex Hunsaker; in response to bug #5488
filed by Hartmut Goebel.
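For illustration, unconditional identifier quoting can be sketched as
below. This is not the backend or pg_dump code; sketch_quote_ident and
its buffer-size convention are assumptions for the example. It follows
the standard SQL rule that an embedded double quote is written twice.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Write ident into dst wrapped in double quotes, doubling any embedded
 * double quote.  Returns the number of characters needed, excluding the
 * terminating NUL.  Nothing is written if dst is NULL or too small, so
 * a caller can probe for the size first.
 */
size_t
sketch_quote_ident(const char *ident, char *dst, size_t dstsize)
{
    size_t      needed = 2;     /* opening and closing quote */
    const char *p;
    char       *out;

    assert(ident != NULL);
    for (p = ident; *p; p++)
        needed += (*p == '"') ? 2 : 1;

    if (dst == NULL || dstsize < needed + 1)
        return needed;

    out = dst;
    *out++ = '"';
    for (p = ident; *p; p++)
    {
        if (*p == '"')
            *out++ = '"';       /* double the embedded quote */
        *out++ = *p;
    }
    *out++ = '"';
    *out = '\0';
    return needed;
}
```

Quoting every identifier this way makes the output independent of any
one release's keyword list, at the cost of noisier output.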
Remove bespoke code in DoCopy and RI_Initial_Check, which now instead
call ExecCheckRTPerms with a manufactured RangeTblEntry.
This is intended to make it feasible for an enhanced security provider
to actually make use of ExecutorCheckPerms_hook, but also has the
advantage that RI_Initial_Check can allow use of the fast-path when
column-level but not table-level permissions are present.
KaiGai Kohei. Reviewed (in an earlier version) by Stephen Frost, and by me.
Some further changes to the comments by me.
Per discussion with David Christensen, there can be multiple
instances of PG accessible via local sockets, and you need the port
to see which one you're actually connected to. David's original
patch worked this way, but I inadvertently ripped it out during
commit.
Normally, we automatically restart after a backend crash, but in some
cases when PostgreSQL is invoked by clusterware it may be desirable to
suppress this behavior, so we provide an option which does this.
Since no existing GUC group quite fits, create a new group called
"error handling options" for this and the previously undocumented GUC
exit_on_error, which is now documented.
Review by Fujii Masao.
path when CSV logging is configured but not yet operational. It's sufficient
to send the message to stderr, as we were already doing, and the "Not safe"
gripe has already confused at least two core members ...
Backpatch to 9.0, but not further --- doesn't seem appropriate to change
this behavior in stable branches.
any implicit casting previously applied to the targetlist item. This is
reasonable because the implicit cast, by definition, wasn't written by the
user; so we are preserving the expected behavior that ORDER BY items match
textually equivalent tlist items. The case never arose before because there
couldn't be any implicit casting of a top-level SELECT item before we process
ORDER BY etc. But now it can arise in the context of aggregates containing
ORDER BY clauses, since the "targetlist" is the already-casted list of
arguments for the aggregate. The net effect is that the datatype used for
ORDER BY/DISTINCT purposes is the aggregate's declared input type, not that
of the original input column; which is a bit debatable but not horrendous,
and to do otherwise would require major rework that doesn't seem justified.
Per bug #5564 from Daniel Grace. Back-patch to 9.0 where aggregate ORDER BY
was implemented.
This adds a libpq connection parameter requirepeer that specifies the user
name that the server process is expected to run under.
reviewed by KaiGai Kohei
log files created by the syslogger process.
In passing, make unix_file_permissions display its value in octal, same
as log_file_mode now does.
Martin Pihlak
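Displaying a permission setting in octal can be sketched as follows.
The function name and the four-digit width are assumptions made for
illustration, not the actual GUC show hook.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format a file mode the way it is normally written, as zero-padded
 * octal (for example 0600), keeping only the permission bits.
 * Returns snprintf's result.
 */
int
sketch_format_mode(unsigned int mode, char *dst, size_t dstsize)
{
    assert(dst != NULL && dstsize > 0);
    return snprintf(dst, dstsize, "%04o", mode & 07777u);
}
```

Printing the value in decimal would show 0600 as 384, which is much
harder to relate to the setting as the user wrote it.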
from defining non-self-conflicting constraints.
Jeff Davis
Note: I (tgl) objected to removing this check in 9.0 on the grounds that it
was an important sanity check in new, poorly tested code. However, it should
be all right to remove it for 9.1, since we'll get field testing from the
9.0 branch.
to dump a PUBLIC user mapping correctly, as per bug #5560 from Shigeru Hanada.
Use the pg_user_mappings view rather than trying to access pg_user_mapping
directly, so that the code doesn't fail when run by a non-superuser. And
clean up some minor carelessness such as unsafe usage of fmtId().
Back-patch to 8.4 where this code was added.
parameter against server cert's CN field) to succeed in the case where
both host and hostaddr are specified. As with the existing precedents
for Kerberos, GSSAPI, and SSPI, it is the calling application's responsibility
to ensure that host and hostaddr match up --- we just use the host name as given.
Per bug #5559 from Christopher Head.
In passing, make the error handling and messages for the no-host-name-given
failure more consistent among these four cases, and correct a lie in the
documentation: we don't attempt to reverse-lookup host from hostaddr
if host is missing.
Back-patch to 8.4 where SSL cert verification was introduced.
rather than just $N. This brings the display of nestloop-inner-indexscan
plans back to where it's been, and incidentally improves the display of
SubPlan parameters as well. In passing, simplify the EXPLAIN code by
having it deal primarily in the PlanState tree rather than separately
searching Plan and PlanState trees. This is noticeably cleaner for
subplans, and about a wash elsewhere.
One small difference from previous behavior is that EXPLAIN will no longer
qualify local variable references in inner-indexscan plan nodes, since it
no longer sees such nodes as possibly referencing multiple tables. Vars
referenced through PARAM_EXEC Params are still forcibly qualified, though,
so I don't think the display is any more confusing than before. Adjust a
couple of examples in the documentation to match this behavior.
loop from being dropped, I missed subtransaction cleanup. Pinned portals
must be dropped at subtransaction cleanup just as they are at main
transaction cleanup.
Per bug #5556 by Robert Walker. Backpatch to 8.0; 7.4 didn't have
subtransactions.
relation using the general PARAM_EXEC executor parameter mechanism, rather
than the ad-hoc kluge of passing the outer tuple down through ExecReScan.
The previous method was hard to understand and could never be extended to
handle parameters coming from multiple join levels. This patch doesn't
change the set of possible plans nor have any significant performance effect,
but it's necessary infrastructure for future generalization of the concept
of an inner indexscan plan.
ExecReScan's second parameter is now unused, so it's removed.