In the previous coding, we would look up a relation in RangeVarGetRelid,
lock the resulting OID, and then AcceptInvalidationMessages(). While
this was sufficient to ensure that we noticed any changes to the
relation definition before building the relcache entry, it didn't
handle the possibility that the name we looked up no longer referenced
the same OID. This was particularly problematic in the case where a
table had been dropped and recreated: we'd latch on to the entry for
the old relation and fail later on. Now, we acquire the relation lock
inside RangeVarGetRelid, and retry the name lookup if we notice that
invalidation messages have been processed meanwhile. Many operations
that would previously have failed with an error in the presence of
concurrent DDL will now succeed.
There is a good deal of work remaining to be done here: many callers
of RangeVarGetRelid still pass NoLock for one reason or another. In
addition, nothing in this patch guards against the possibility that
the meaning of an unqualified name might change due to the creation
of a relation in a schema earlier in the user's search path than the
one where it was previously found. Furthermore, there's nothing at
all here to guard against similar race conditions for non-relations.
For all that, it's a start.
Noah Misch and Robert Haas
This patch gets us out from under the Unix limitation of two user-defined
signal types. We already had done something similar for signals directed to
the postmaster process; this adds multiplexing for signals directed to
backends and auxiliary processes (so long as they're connected to shared
memory).
As proof of concept, replace the former usage of SIGUSR1 and SIGUSR2
for backends with use of the multiplexing mechanism. There are still some
hard-wired definitions of SIGUSR1 and SIGUSR2 for other process types,
but getting rid of those doesn't seem interesting at the moment.
Fujii Masao
replication patch needs a signal, but we've already used SIGUSR1 and
SIGUSR2 in normal backends. This patch allows reusing SIGUSR1 for that,
and for other purposes too if the need arises.
unnecessary cache resets. The major changes are:
* When the queue overflows, we only issue a cache reset to the specific
backend or backends that still haven't read the oldest message, rather
than resetting everyone as in the original coding.
* When we observe backend(s) falling well behind, we signal SIGUSR1
to only one backend, the one that is furthest behind and doesn't already
have a signal outstanding for it. When it finishes catching up, it will
in turn signal SIGUSR1 to the next-furthest-back guy, if there is one that
is far enough behind to justify a signal. The PMSIGNAL_WAKEN_CHILDREN
mechanism is removed.
* We don't attempt to clean out dead messages after every message-receipt
operation; rather, we do it on the insertion side, and only when the queue
fullness passes certain thresholds.
* Split SInvalLock into SInvalReadLock and SInvalWriteLock so that readers
don't block writers nor vice versa (except during the infrequent queue
cleanout operations).
* Transfer multiple sinval messages for each acquisition of a read or
write lock.
deals with the queue, including locking etc, is all in sinvaladt.c. This means
that the struct definition of the queue, and the queue pointer, are now
internal "implementation details" inside sinvaladt.c.
Per my proposal dated 25-Jun-2007 and followup discussion.
comment line where output as too long, and update typedefs for /lib
directory. Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).
Backpatch to 8.1.X.
to 'Size' (that is, size_t), and install overflow detection checks in it.
This allows us to remove the former arbitrary restrictions on NBuffers
etc. It won't make any difference in a 32-bit machine, but in a 64-bit
machine you could theoretically have terabytes of shared buffers.
(How efficiently we could manage 'em remains to be seen.) Similarly,
num_temp_buffers, work_mem, and maintenance_work_mem can be set above
2Gb on a 64-bit machine. Original patch from Koichi Suzuki, additional
work by moi.
communication structure, and make it its own module with its own lock.
This should reduce contention at least a little, and it definitely makes
the code seem cleaner. Per my recent proposal.
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
pg_subtrans --- what we need is the oldest xmin of any snapshot in use
in the current top transaction. Introduce a new variable TransactionXmin
to play this role. Fixes intermittent regression failure reported by
Neil Conway.
Asserts would lead to a server core dump if an error occurred while
trying to abort a failed subtransaction (thereby leading to re-execution
of whatever parts of AbortSubTransaction had already run). This of course
does not prevent such an error from creating an infinite loop, but at
least we don't make the situation worse. Responds to an open item on
the subtransactions to-do list.
updates are no longer WAL-logged nor even fsync'd; we do not need to,
since after a crash no old pg_subtrans data is needed again. We truncate
pg_subtrans to RecentGlobalXmin at each checkpoint. slru.c's API is
refactored a little bit to separate out the necessary decisions.
RecentXmin (== MyProc->xmin). This ensures that it will be safe to
truncate pg_subtrans at RecentGlobalXmin, which should largely eliminate
any fear of bloat. Along the way, eliminate SubTransXidsHaveCommonAncestor,
which isn't really needed and could not give a trustworthy result anyway
under the lookback restriction.
In an unrelated but nearby change, #ifdef out GetUndoRecPtr, which has
been dead code since 2001 and seems unlikely to ever be resurrected.
therefore starting with GetCurrentTransactionId is wrong. Fixes
miscomputation of RecentGlobalXmin leading to bizarre behavior
reported by Gavin Sherry.
number of active subtransaction XIDs in each backend's PGPROC entry,
and use this to avoid expensive probes into pg_subtrans during
TransactionIdIsInProgress. Extend EOXactCallback API to allow add-on
modules to get control at subxact start/end. (This is deliberately
not compatible with the former API, since any uses of that API probably
need manual review anyway.) Add basic reference documentation for
SAVEPOINT and related commands. Minor other cleanups to check off some
of the open issues for subtransactions.
Alvaro Herrera and Tom Lane.
performance front, but with feature freeze upon us I think it's time to
drive a stake in the ground and say that this will be in 7.5.
Alvaro Herrera, with some help from Tom Lane.
(SIGUSR1, which we have not been using recently) instead of piggybacking
on SIGUSR2-driven NOTIFY processing. This has several good results:
the processing needed to drain the sinval queue is a lot less than the
processing needed to answer a NOTIFY; there's less contention since we
don't have a bunch of backends all trying to acquire exclusive lock on
pg_listener; backends that are sitting inside a transaction block can
still drain the queue, whereas NOTIFY processing can't run if there's
an open transaction block. (This last is a fairly serious issue that
I don't think we ever recognized before --- with clients like JDBC that
tend to sit with open transaction blocks, the sinval queue draining
mechanism never really worked as intended, probably resulting in a lot
of useless cache-reset overhead.) This is the last of several proposed
changes in response to Philip Warner's recent report of sinval-induced
performance problems.
discussion on pgsql-hackers: in READ COMMITTED mode we just have to force
a QuerySnapshot update in the trigger, but in SERIALIZABLE mode we have
to run the scan under a current snapshot and then complain if any rows
would be updated/deleted that are not visible in the transaction snapshot.
now able to cope with assigning new relfilenode values to nailed-in-cache
indexes, so they can be reindexed using the fully crash-safe method. This
leaves only shared system indexes as special cases. Remove the 'index
deactivation' code, since it provides no useful protection in the shared-
index case. Require reindexing of shared indexes to be done in standalone
mode, but remove other restrictions on REINDEX. -P (IgnoreSystemIndexes)
now prevents using indexes for lookups, but does not disable index updates.
It is therefore safe to allow from PGOPTIONS. Upshot: reindexing system catalogs
can be done without a standalone backend for all cases except
shared catalogs.
free'd for every transaction or statement, respectively. This patch
puts these data structures into static memory, thus saving a few CPU
cycles and two malloc calls per transaction or (in isolation level
READ COMMITTED) per query.
Manfred Koizar
database access outside a transaction; revert bogus performance improvement
in SIBackendInit(); improve comments; add documentation (this part courtesy
Neil Conway).
connections by the superuser only.
This patch replaces the last patch I sent a couple of days ago.
It closes a connection that has not been authorised by a superuser if it would
leave less than the GUC variable ReservedBackends
(superuser_reserved_connections in postgres.conf) backend process slots free
in the SISeg. This differs to the first patch which only reserved the last
ReservedBackends slots in the procState array. This has made the free slot
test more expensive due to the use of a lock.
After thinking about a comment on the first patch I've also made it a fatal
error if the number of reserved slots is not less than the maximum number of
connections.
Nigel J. Andrews