memory in the executor's per-query memory context. It is also inefficient:
it invokes get_call_result_type() and TupleDescGetAttInMetadata() for
every call to return_next, rather than invoking them once (per PL/Perl
function call) and memoizing the result.
This patch makes the following changes:
- refactor the code to include all the "per PL/Perl function call" data
inside a single struct, "current_call_data". This means we don't need to
save and restore N pointers for every recursive call into PL/Perl; we
can just save and restore one.
- look up the return type metadata needed by plperl_return_next() once,
and then stash it in "current_call_data", so as to avoid doing the
lookup for every call to return_next.
- create a temporary memory context in which to evaluate the return
type's input functions. This memory context is reset for each call to
return_next.
The patch appears to fix the memory leak, and substantially reduces
the overhead imposed by return_next.
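As a rough illustration of the memory-context part of the fix (a sketch
only, not the actual patch; the context name and variable names are made
up, and the call uses the five-argument AllocSetContextCreate of this era):

    /* Created once per PL/Perl set-returning function call */
    MemoryContext tmp_cxt = AllocSetContextCreate(CurrentMemoryContext,
                                                  "plperl return_next temp",
                                                  ALLOCSET_DEFAULT_MINSIZE,
                                                  ALLOCSET_DEFAULT_INITSIZE,
                                                  ALLOCSET_DEFAULT_MAXSIZE);

    /* Then, inside each return_next call: */
    MemoryContext old_cxt = MemoryContextSwitchTo(tmp_cxt);
    /* ... evaluate the return type's input functions here; anything they
     * palloc lands in tmp_cxt, not the executor's per-query context ... */
    MemoryContextSwitchTo(old_cxt);
    MemoryContextReset(tmp_cxt);    /* discard this row's garbage */
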
While we normally prefer the notation "foo.*" for a whole-row Var, that does
not work at SELECT top level, because in that context the parser will assume
that what is wanted is to expand the "*" into a list of separate target
columns, yielding behavior different from a whole-row Var. We have to emit
just "foo" instead in that context. Per report from Sokolov Yura.
because pqSendSome will absorb input data anytime it'd be forced to block.
Avoiding a kernel call per PQputCopyData call helps COPY speed materially.
Alon Goldshuv
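For context, here is a minimal client-side COPY loop using the libpq calls
in question (table name and data are illustrative, error handling trimmed):

    #include <stdio.h>
    #include <libpq-fe.h>

    int
    copy_rows(PGconn *conn)
    {
        PGresult *res = PQexec(conn, "COPY mytab FROM STDIN");
        int       i;
        int       ok;

        if (PQresultStatus(res) != PGRES_COPY_IN)
            return -1;
        PQclear(res);

        /* Each PQputCopyData call normally just appends to libpq's output
         * buffer; flushing happens when the buffer fills, and pqSendSome
         * absorbs any incoming data rather than blocking. */
        for (i = 0; i < 1000; i++)
        {
            char line[64];
            int  len = snprintf(line, sizeof(line), "%d\tvalue %d\n", i, i);

            if (PQputCopyData(conn, line, len) != 1)
                return -1;
        }

        if (PQputCopyEnd(conn, NULL) != 1)
            return -1;

        res = PQgetResult(conn);
        ok = (PQresultStatus(res) == PGRES_COMMAND_OK) ? 0 : -1;
        PQclear(res);
        return ok;
    }
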
to try to create a log segment file concurrently, but the code erroneously
specified O_EXCL to open(), resulting in a needless failure. Before 7.4,
it was even a PANIC condition :-(. Correct code is actually simpler than
what we had, because we can just say O_CREAT to start with and not need a
second open() call. I believe this accounts for several recent reports of
hard-to-reproduce "could not create file ...: File exists" errors in both
pg_clog and pg_subtrans.
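The simplification, in outline (illustrative file name and mode, not the
exact slru/xlog code):

    #include <fcntl.h>

    /*
     * Old way: open(path, O_RDWR | O_CREAT | O_EXCL, ...) and, on EEXIST,
     * a second open() without O_CREAT.  New way: a single call that either
     * creates the segment or opens the one a concurrent process just
     * created, so a concurrent creator is no longer an error at all.
     */
    int fd = open("pg_subtrans/0000", O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        /* report the failure; EEXIST cannot happen without O_EXCL */ ;
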
sizebitvec of tsearch2, as well as identical code in several other
contrib modules. This provided about a 20X speedup in building a
large tsearch2 index ... didn't try to measure its effects for other
operations. Thanks to Stephan Vollmer for providing a test case.
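The usual way to speed up such bit-counting is to count a byte at a time
through a lookup table instead of shifting through every bit; a
self-contained sketch of that technique (offered as illustration, not
necessarily the exact contrib change):

    static unsigned char number_of_ones[256];

    static void
    init_ones_table(void)
    {
        int b;

        for (b = 0; b < 256; b++)
        {
            int v, n = 0;

            for (v = b; v != 0; v >>= 1)
                n += v & 1;
            number_of_ones[b] = (unsigned char) n;
        }
    }

    /* sizebitvec-style: total 1 bits in a signature of 'len' bytes */
    static int
    count_bits(const unsigned char *sign, int len)
    {
        int i, total = 0;

        for (i = 0; i < len; i++)
            total += number_of_ones[sign[i]];
        return total;
    }
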
temp table, not only our own process' tables. It's not real important
since vacuum.c will skip temp tables anyway, but might as well make the
code do what it claims to do.
index's support-function cache (in index_getprocinfo). Since none of that
data can change for an index that's in active use, it seems sufficient to
treat all open indexes the same way we were treating "nailed" system indexes
--- that is, just re-read the pg_class row and leave the rest of the relcache
entry strictly alone. The pg_class re-read might not be strictly necessary
either, but since the reltablespace and relfilenode can change in normal
operation it seems safest to do it. (We don't support changing any of the
other info about an index at all, at the moment.)
Back-patch as far as 8.0. It might be possible to adapt the patch to 7.4,
but it would take more work than I care to expend for such a low-probability
problem. 7.3 is out of luck for sure.
occurs when it tries to heap_open pg_tablespace. When control returns to
smgrcreate, that routine will be holding a dangling pointer to a closed
SMgrRelation, resulting in mayhem. This is of course a consequence of
the violation of proper module layering inherent in having smgr.c call
a tablespace command routine, but the simplest fix seems to be to change
the locking mechanism. There's no real need for TablespaceCreateDbspace
to touch pg_tablespace at all --- it's only opening it as a way of locking
against a parallel DROP TABLESPACE command. A much better answer is to
create a special-purpose LWLock to interlock these two operations.
This drops TablespaceCreateDbspace quite a few layers down the food chain
and makes it something reasonably safe for smgr to call.
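In outline, the interlock now looks something like this (a backend-style
sketch under the assumption that the new lock is an ordinary LWLock, here
called TablespaceCreateLock, with the directory handling reduced to a bare
mkdir):

    static void
    create_tablespace_dir(const char *dir)
    {
        struct stat st;

        /* Both TablespaceCreateDbspace and DROP TABLESPACE take this lock,
         * so neither can see the other's work half-done -- and no catalog
         * access is needed from inside smgr. */
        LWLockAcquire(TablespaceCreateLock, LW_EXCLUSIVE);

        if (stat(dir, &st) < 0)
        {
            if (mkdir(dir, S_IRWXU) < 0)
                ereport(ERROR,
                        (errcode_for_file_access(),
                         errmsg("could not create directory \"%s\": %m",
                                dir)));
        }

        LWLockRelease(TablespaceCreateLock);
    }
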
This is utterly insignificant in normal operation, but it becomes a
problem during cache inval stress testing. The original coding in fact
had no leak --- the 8.0 List rewrite created the issue. I wonder whether
list_concat should pfree the discarded header?
files: avoid creating stats hashtable entries for tables that aren't being
touched except by vacuum/analyze, ensure that entries for dropped tables are
removed promptly, and tweak the data layout to avoid storing useless struct
padding. Also improve the performance of pgstat_vacuum_tabstat(), and make
sure that autovacuum invokes it exactly once per autovac cycle rather than
multiple times or not at all. This should cure recent complaints about 8.1
showing much higher stats I/O volume than was seen in 8.0. It'd still be a
good idea to revisit the design with an eye to not re-writing the entire
stats dataset every half second ... but that would be too much to backpatch,
I fear.
discarded by cache flush while still in use. This is a minimal patch that
just copies the tupdesc anywhere it could be needed across a flush. Applied
to back branches only; Neil Conway is working on a better long-term solution
for HEAD.
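The minimal fix amounts to taking a private copy of the descriptor anywhere
it must survive a flush, roughly as follows (a fragment for illustration;
'rel' stands in for whatever relation is involved):

    /* Don't keep pointing at the relcache entry's own tupdesc ... */
    TupleDesc safe_desc = CreateTupleDescCopy(RelationGetDescr(rel));

    /* ... safe_desc is a palloc'd private copy, so a cache flush that
     * rebuilds rel's relcache entry can no longer free the descriptor
     * out from under code still using it. */
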
a va_list. Christof Petig's previous patch made this change, but neglected
to update ecpglib/descriptor.c, resulting in a compiler warning (and a
likely runtime crash) on AMD64 and PPC.
our own command (or more generally, xmin = our xact and cmin >= current
command ID) should not be seen as good. Else we may try to update rows
we already updated. This error was inserted last August while fixing the
even bigger problem that the old coding wouldn't see *any* tuples inserted
by our own transaction as good. Per report from Euler Taveira de Oliveira.
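Stated as code, the rule being restored is roughly this (a paraphrase
rather than the exact tqual.c test; GetCurrentCommandId() is the
no-argument form of this era):

    if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple)) &&
        HeapTupleHeaderGetCmin(tuple) >= GetCurrentCommandId())
    {
        /* Inserted by the current command (or a later one): not "good",
         * else the command could see -- and try to update -- row versions
         * it has already produced itself. */
        return false;
    }
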
It seems that recent gcc versions can optimize away calls to these functions
even when the functions do not exist on the platform, resulting in a bogus
positive result. Avoid this by using a non-constant argument and ensuring
that the function result is not simply discarded. Per report from
François Laupretre.
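The shape of such a probe, as a standalone sketch (finite() is used here
purely as an example of a function gcc likes to fold away; the real test
may target other functions):

    #include <math.h>
    #include <stdlib.h>

    int
    main(int argc, char *argv[])
    {
        double x = atof(argv[argc - 1]);    /* non-constant argument */

        return finite(x) ? 0 : 1;           /* result actually used */
    }
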
one argument at a time and then inserting the argument into a Python
list via PyList_SetItem(). This "steals" the reference to the argument:
that is, the reference to the new list member is now held by the Python
list itself. This works fine, except if an elog occurs. This causes the
function's PG_CATCH() block to be invoked, which decrements the
reference counts on both the current argument and the list of arguments.
If the elog happens to occur during the second or subsequent iteration
of the loop, the reference count on the current argument will be
decremented twice.
The fix is simple: set the local pointer to the current argument to NULL
immediately after adding it to the argument list. This ensures that the
Py_XDECREF() in the PG_CATCH() block doesn't double-decrement.
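A condensed sketch of the loop in question (loosely modeled on the
PL/Python code; convert_argument() and nargs are stand-ins for the real
conversion routine and argument count):

    PyObject *args = PyList_New(nargs);
    PyObject *volatile arg = NULL;
    int       i;

    PG_TRY();
    {
        for (i = 0; i < nargs; i++)
        {
            arg = convert_argument(i);      /* returns a new reference */
            PyList_SetItem(args, i, arg);   /* steals that reference */
            arg = NULL;                     /* so the catch block can't
                                             * decrement it a second time */
        }
    }
    PG_CATCH();
    {
        Py_XDECREF(arg);    /* only non-NULL if SetItem never got it */
        Py_XDECREF(args);   /* drops the list and every member it holds */
        PG_RE_THROW();
    }
    PG_END_TRY();
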
operator names. This is needed when dumping operator definitions that have
COMMUTATOR (or similar) links to operators in other schemas.
Apparently Daniel Whitter is the first person ever to try this :-(
use it. While it normally has been opened earlier during btree index
build, testing shows that it's possible for the link to be closed again
if an sinval reset occurs while the index is being built.
dead and have become unreferenced. Before 8.1, such members were left
for AtEOXact_CatCache() to clean up, but now AtEOXact_CatCache isn't
supposed to have anything to do. In an assert-enabled build this bug
leads to an assertion failure at transaction end, but in a non-assert
build the dead member is effectively just a small memory leak.
Per report from Jeremy Drake.
rather than elog(FATAL), when there is no more room in ShmemBackendArray.
This is a security issue since too many connection requests arriving close
together could cause the postmaster to shut down, resulting in denial of
service. Reported by Yoshiyuki Asaba, fixed by Magnus Hagander.
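The change boils down to something like this in the postmaster's
connection-startup path (a sketch; assign_backend_slot() is a made-up
stand-in for the ShmemBackendArray bookkeeping):

    if (!assign_backend_slot(port))
    {
        /* Log it and refuse this one connection; don't elog(FATAL),
         * which would take the postmaster -- and with it the whole
         * server -- down. */
        ereport(LOG,
                (errcode(ERRCODE_TOO_MANY_CONNECTIONS),
                 errmsg("no free slots in shared memory backend array")));
        return STATUS_ERROR;
    }
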
the relation but it finds a pre-existing valid buffer. The buffer does not
correspond to any page known to the kernel, so we *must* do smgrextend to
ensure that the space becomes allocated. The 7.x branches all do this
correctly, but the corner case got lost somewhere during 8.0 bufmgr rewrites.
(My fault no doubt :-( ... I think I assumed that such a buffer must be
not-BM_VALID, which is not so.)