discussion on pgsql-hackers: in READ COMMITTED mode we just have to force
a QuerySnapshot update in the trigger, but in SERIALIZABLE mode we have
to run the scan under a current snapshot and then complain if any rows
would be updated/deleted that are not visible in the transaction snapshot.
to allow es_snapshot to be set to SnapshotNow rather than a query snapshot.
This solves a bug reported by Wade Klaver, wherein triggers fired as a
result of RI cascade updates could misbehave.
now able to cope with assigning new relfilenode values to nailed-in-cache
indexes, so they can be reindexed using the fully crash-safe method. This
leaves only shared system indexes as special cases. Remove the 'index
deactivation' code, since it provides no useful protection in the shared-
index case. Require reindexing of shared indexes to be done in standalone
mode, but remove other restrictions on REINDEX. -P (IgnoreSystemIndexes)
now prevents using indexes for lookups, but does not disable index updates.
It is therefore safe to allow it from PGOPTIONS. Upshot: reindexing system catalogs
can be done without a standalone backend for all cases except
shared catalogs.
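As an illustrative sketch of the new rules (catalog names chosen as examples):

    REINDEX TABLE pg_class;      -- non-shared system catalog: now allowed
                                 -- in an ordinary multiuser session
    REINDEX TABLE pg_database;   -- shared catalog: still requires a
                                 -- standalone backend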
really general fix might be difficult, I believe the only case where
AtCommit_Notify could see an uncommitted tuple is where the other guy
has just unlistened and not yet committed. The best solution seems to
be to just skip updating that tuple, on the assumption that the other
guy does not want to hear about the notification anyway. This is not
perfect --- if the other guy rolls back his unlisten instead of committing,
then he really should have gotten this notify. But to do that, we'd have
to wait to see if he commits or not, or make UNLISTEN hold exclusive lock
on pg_listener until commit. Either of these answers is deadlock-prone,
not to mention horrible for interactive performance. Do it this way
for now. (What happened to that project to do LISTEN/NOTIFY in memory
with no table, anyway?)
handling many-way scans: instead of re-evaluating all prior indexscan
quals to see if a tuple has been fetched more than once, use a hash table
indexed by tuple CTID. But fall back to the old way if the hash table
grows to exceed SortMem.
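As a hypothetical illustration, a query of this shape can be driven by two
indexscans, with the CTID hash table filtering out heap tuples fetched by both:

    CREATE TABLE t (a int, b int);
    CREATE INDEX t_a_idx ON t (a);
    CREATE INDEX t_b_idx ON t (b);
    -- a row satisfying both a = 1 and b = 2 is reachable from both
    -- indexes, but is returned only once
    SELECT * FROM t WHERE a = 1 OR b = 2;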
as well as the hash function (formerly the comparison function was hardwired
as memcmp()). This makes it possible to eliminate the special-purpose
hashtable management code in execGrouping.c in favor of using dynahash to
manage tuple hashtables; which is a win because dynahash knows how to expand
a hashtable when the original size estimate was too small, whereas the
special-purpose code was too stupid to do that. (See recent gripe from
Stephan Szabo about poor performance when hash table size estimate is way
off.) Free side benefit: when using string_hash, the default comparison
function is now strncmp() instead of memcmp(). This should eliminate some
part of the overhead associated with larger NAMEDATALEN values.
be anything yielding an array of the proper kind, not only sub-ARRAY[]
constructs; do subscript checking at runtime not parse time. Also,
adjust array_cat to make array || array comply with the SQL99 spec.
Joe Conway
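A brief illustration of both points:

    -- element expressions need not be literal sub-ARRAY[] constructs;
    -- anything yielding an array of the proper kind is accepted:
    SELECT ARRAY[ARRAY[1,2], array_cat(ARRAY[3], ARRAY[4])];
        -- yields {{1,2},{3,4}}
    -- array || array concatenation per SQL99:
    SELECT ARRAY[1,2] || ARRAY[3,4];
        -- yields {1,2,3,4}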
yet, though). Avoid using nth() to fetch tlist entries; provide a
common routine get_tle_by_resno() to search a tlist for a particular
resno. This replaces a couple uses of nth() and a dozen hand-coded
search loops. Also, replace a few uses of nth(length-1, list) with
llast().
error, if any input element is NULL. This is not what we ultimately want,
but until arrays can have NULL elements, it will have to do. Patch from
Joe Conway.
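Until then, a NULL input element is simply rejected; for example:

    SELECT ARRAY[1, NULL, 3];    -- raises an error for now, since arrays
                                 -- cannot yet hold NULL elements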
It also works to create a non-polymorphic aggregate from polymorphic
functions, should you want to do that. Regression test added, docs still
lacking. By Joe Conway, with some kibitzing from Tom Lane.
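A sketch of what this enables (the aggregate and table names are invented;
array_append is the built-in transition function used here):

    CREATE AGGREGATE accum (
        basetype = anyelement,
        sfunc    = array_append,    -- array_append(anyarray, anyelement)
        stype    = anyarray,
        initcond = '{}'
    );
    -- one aggregate definition now serves any element type:
    SELECT accum(f1) FROM int4_tbl;
    SELECT accum(f1) FROM text_tbl;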
ANYELEMENT. The effect is to postpone typechecking of the function
body until runtime. Documentation is still lacking.
Original patch by Joe Conway, modified to postpone type checking
by Tom Lane.
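A minimal sketch (function name invented for illustration):

    -- the body is only type-checked when the function is called:
    CREATE FUNCTION make_pair(anyelement) RETURNS anyarray
        AS 'SELECT ARRAY[$1, $1]' LANGUAGE sql;
    SELECT make_pair(42);             -- {42,42} as integer[]
    SELECT make_pair('ab'::text);     -- {ab,ab} as text[]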
'scalar op ANY (array)' and 'scalar op ALL (array)', where the operator is applied between the
lefthand scalar and each element of the array. The operator must
yield boolean; the result of the construct is the OR or AND of the
per-element results, respectively.
Original coding by Joe Conway, after an idea of Peter's. Rewritten
by Tom to keep the implementation strictly separate from subqueries.
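For example:

    SELECT 10 = ANY (ARRAY[2, 4, 10]);    -- true: OR of per-element results
    SELECT 10 > ALL (ARRAY[1, 2, 3]);     -- true: AND of per-element results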
comparison functions), replacing the highly bogus bitwise array_eq. Create
a btree index opclass for ANYARRAY --- it is now possible to create indexes
on array columns.
Arrange to cache the results of catalog lookups across multiple array
operations, instead of repeating the lookups on every call.
Add string_to_array and array_to_string functions.
Remove singleton_array, array_accum, array_assign, and array_subscript
functions, since these were for proof-of-concept and not intended to become
supported functions.
Minor adjustments to behavior in some corner cases with empty or
zero-dimensional arrays.
Joe Conway (with some editorializing by Tom Lane).
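A few examples of the resulting behavior (table and column names are
illustrative):

    SELECT ARRAY[1,2,3] < ARRAY[1,2,4];           -- true: element-by-element
    CREATE TABLE ta (vals int[]);
    CREATE INDEX ta_vals_idx ON ta (vals);        -- btree index on an array column
    SELECT string_to_array('a,b,c', ',');         -- {a,b,c}
    SELECT array_to_string(ARRAY[1,2,3], '-');    -- 1-2-3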
specific hash functions used by hash indexes, rather than the old
not-datatype-aware ComputeHashFunc routine. This makes it safe to do
hash joining on several datatypes that previously couldn't use hashing.
The sets of datatypes that are hash indexable and hash joinable are now
exactly the same, whereas before each had some that weren't in the other.
extensions to support our historical behavior. An aggregate belongs
to the closest query level of any of the variables in its argument,
or the current query level if there are no variables (e.g., COUNT(*)).
The implementation involves adding an agglevelsup field to Aggref,
and treating outer aggregates like outer variables at planning time.
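An illustrative case (table names invented): the sum() below appears textually
inside the subquery, but all its argument variables come from the outer query,
so it aggregates at the outer GROUP BY level, with agglevelsup = 1:

    SELECT deptno
    FROM emp
    GROUP BY deptno
    HAVING EXISTS (SELECT 1 FROM budgets b
                   WHERE b.amount > sum(emp.salary));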
when the plan is ReScanned, we don't have to rebuild the hash table
if there is no parameter change for its child node. This idea has
been used for a long time in Sort and Material nodes, but was not in
the hash code till now.
introducing new 'FastList' list-construction subroutines to use in
hot spots. This avoids the O(N^2) behavior of repeated lappend() calls
by keeping a tail pointer, while not changing behavior by reversing
list order as the lcons() method would do.
of an index can now be a computed expression instead of a simple variable.
Restrictions on expressions are the same as for predicates (only immutable
functions, no sub-selects). This fixes problems recently introduced with
inlining SQL functions, because the inlining transformation is applied to
both expression trees so the planner can still match them up. Along the
way, improve efficiency of handling index predicates (both predicates and
index expressions are now cached by the relcache) and fix 7.3 oversight
that didn't record dependencies of predicate expressions.
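For example (hypothetical table):

    -- the index expression may use only immutable functions and
    -- no sub-selects, same as for predicates:
    CREATE INDEX people_lower_name_idx ON people (lower(name));
    -- the planner can now match that index against queries such as:
    SELECT * FROM people WHERE lower(name) = 'smith';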
handle multiple 'formats' for data I/O. Restructure CommandDest and
DestReceiver stuff one more time (it's finally starting to look a bit
clean though). Code now matches latest 3.0 protocol document as far
as message formats go --- but there is no support for binary I/O yet.
DestReceiver pointers instead of just CommandDest values. The DestReceiver
is made at the point where the destination is selected, rather than
deep inside the executor. This cleans up the original kluge implementation
of tstoreReceiver.c, and makes it easy to support retrieving results
from utility statements inside portals. Thus, you can now do fun things
like Bind and Execute a FETCH or EXPLAIN command, and it'll all work
as expected (e.g., you can Describe the portal, or use Execute's count
parameter to suspend the output partway through). Implementation involves
stuffing the utility command's output into a Tuplestore, which would be
kind of annoying for huge output sets, but should be quite acceptable
for typical uses of utility commands.
the column by table OID and column number, if it's a simple column
reference. Along the way, get rid of reskey/reskeyop fields in Resdoms.
Turns out that representation was not convenient for either the planner
or the executor; we can make the planner deliver exactly what the
executor wants with no more effort.
initdb forced due to change in stored rule representation.
which does the same thing. Perhaps at one time there was a reason to
allow plan nodes to store their result types in different places, but
AFAICT that's been unnecessary for a good while.
Both plannable queries and utility commands are now always executed
within Portals, which have been revamped so that they can handle the
load (they used to be good only for single SELECT queries). Restructure
code to push command-completion-tag selection logic out of postgres.c,
so that it won't have to be duplicated between simple and extended queries.
initdb forced due to addition of a field to Query nodes.