1
0
mirror of https://github.com/postgres/postgres.git synced 2025-05-08 07:21:33 +03:00

942 Commits

Author SHA1 Message Date
Tom Lane
1e172f9692 Fix aboriginal bug in BufFileDumpBuffer that would cause it to write the
wrong data when dumping a bufferload that crosses a component-file boundary.
This probably has not been seen in the wild because (a) component files are
normally 1GB apiece and (b) non-block-aligned buffer usage is relatively
rare.  But it's fairly easy to reproduce a problem if one reduces RELSEG_SIZE
in a test build.  Kudos to Kurt Harriman for spotting the bug.
2007-06-01 23:43:17 +00:00
Tom Lane
b26329654e Fix dynahash.c to suppress hash bucket splits while a hash_seq_search() scan
is in progress on the same hashtable.  This seems the least invasive way to
fix the recently-recognized problem that a split could cause the scan to
visit entries twice or (with much lower probability) miss them entirely.
The only field-reported problem caused by this is the "failed to re-find
shared lock object" PANIC in COMMIT PREPARED reported by Michel Dorochevsky,
which was caused by multiply visited entries.  However, it seems certain
that mdsync() is vulnerable to missing required fsync's due to missed
entries, and I am fearful that RelationCacheInitializePhase2() might be at
risk as well.  Because of that and the generalized hazard presented by this
bug, back-patch all the supported branches.

Along the way, fix pg_prepared_statement() and pg_cursor() to not assume
that the hashtables they are examining will stay static between calls.
This is risky regardless of the newly noted dynahash problem, because
hash_seq_search() has never promised to cope with deletion of table entries
other than the just-returned one.  There may be no bug here because the only
supported way to call these functions is via ExecMakeTableFunctionResult()
which will cycle them to completion before doing anything very interesting,
but it seems best to get rid of the assumption.  This affects 8.2 and HEAD
only, since those functions weren't there earlier.
2007-04-26 23:24:57 +00:00
Tom Lane
366e9eea24 Rearrange mdsync() looping logic to avoid the problem that a sufficiently
fast flow of new fsync requests can prevent mdsync() from ever completing.
This was an unforeseen consequence of a patch added in Mar 2006 to prevent
the fsync request queue from overflowing.  Problem identified by Heikki
Linnakangas and independently by ITAGAKI Takahiro; fix based on ideas from
Takahiro-san, Heikki, and Tom.

Back-patch as far as 8.1 because a previous back-patch introduced the problem
into 8.1 ...
2007-04-12 17:11:00 +00:00
Tom Lane
cb476c1ec3 Back-port changes of Jan 16 and 17 to "revoke" pending fsync requests during
DROP TABLE and DROP DATABASE.  Should prevent unexpected "permission denied"
failures on Windows, and is cleaner on other platforms too since we no longer
have to take it on faith that ENOENT is okay during an fsync attempt.

Patched as far back as 8.1; per recent discussion I think we are not going
to worry about Windows-specific issues in 8.0 anymore.
2007-01-27 20:15:47 +00:00
Tom Lane
1e78ea0ffe Modify local buffer management to request memory for local buffers in blocks
of increasing size, instead of one at a time.  This reduces the memory
management overhead when num_temp_buffers is large: in the previous coding
we would actually waste 50% of the space used for temp buffers, because aset.c
would round the individual requests up to 16K.  Problem noted while studying
a performance issue reported by Steven Flatt.

Back-patch as far as 8.1 --- older versions used few enough local buffers
that the issue isn't significant for them.
2006-12-27 22:31:59 +00:00
Peter Eisentraut
409600942b KB -> kB 2006-11-24 09:20:12 +00:00
Tom Lane
3ad0728c81 On systems that have setsid(2) (which should be just about everything except
Windows), arrange for each postmaster child process to be its own process
group leader, and deliver signals SIGINT, SIGTERM, SIGQUIT to the whole
process group not only the direct child process.  This provides saner behavior
for archive and recovery scripts; in particular, it's possible to shut down a
warm-standby recovery server using "pg_ctl stop -m immediate", since delivery
of SIGQUIT to the startup subprocess will result in killing the waiting
recovery_command.  Also, this makes Query Cancel and statement_timeout apply
to scripts being run from backends via system().  (There is no support in the
core backend for that, but it's widely done using untrusted PLs.)  Per gripe
from Stephen Harris and subsequent discussion.
2006-11-21 20:59:53 +00:00
Tom Lane
1a5c450f30 When truncating a relation in-place (eg during VACUUM), do not try to unlink
any no-longer-needed segments; just truncate them to zero bytes and leave
the files in place for possible future re-use.  This avoids problems when
the segments are re-used due to relation growth shortly after truncation.
Before, the bgwriter, and possibly other backends, could still be holding
open file references to the old segment files, and would write dirty blocks
into those files where they'd disappear from the view of other processes.

Back-patch as far as 8.0.  I believe the 7.x branches are not vulnerable,
because they had no bgwriter, and "blind" writes by other backends would
always be done via freshly-opened file references.
2006-11-20 01:07:56 +00:00
Tom Lane
36e012e727 Remove temporary Windows-specific debugging code; it seems the problem
with fopen() not using FILE_SHARE_DELETE was indeed the bug we were after,
given lack of recent reports.
2006-11-06 17:10:22 +00:00
Tom Lane
48188e1621 Fix recently-understood problems with handling of XID freezing, particularly
in PITR scenarios.  We now WAL-log the replacement of old XIDs with
FrozenTransactionId, so that such replacement is guaranteed to propagate to
PITR slave databases.  Also, rather than relying on hint-bit updates to be
preserved, pg_clog is not truncated until all instances of an XID are known to
have been replaced by FrozenTransactionId.  Add new GUC variables and
pg_autovacuum columns to allow management of the freezing policy, so that
users can trade off the size of pg_clog against the amount of freezing work
done.  Revise the already-existing code that forces autovacuum of tables
approaching the wraparound point to make it more bulletproof; also, revise the
autovacuum logic so that anti-wraparound vacuuming is done per-table rather
than per-database.  initdb forced because of changes in pg_class, pg_database,
and pg_autovacuum catalogs.  Heikki Linnakangas, Simon Riggs, and Tom Lane.
2006-11-05 22:42:10 +00:00
Tom Lane
954c1813ac Remove an unnecessary HOLD_INTERRUPTS/RESUME_INTERRUPTS pair.
This was required back when RESUME_INTERRUPTS could actually
execute ProcessInterrupts, but that hasn't been true since 2001...
2006-10-22 20:34:54 +00:00
Tom Lane
e0dece127d Redesign the patch for allocation of shmem space and LWLocks for add-on
modules; the first try was not usable in EXEC_BACKEND builds (e.g.,
Windows).  Instead, just provide some entry points to increase the
allocation requests during postmaster start, and provide a dedicated
LWLock that can be used to synchronize allocation operations performed
by backends.  Per discussion with Marc Munro.
2006-10-15 22:04:08 +00:00
Bruce Momjian
f99a569a2e pgindent run for 8.2. 2006-10-04 00:30:14 +00:00
Tom Lane
c92f7e258e Replace strncpy with strlcpy in selected places that seem possibly relevant
to performance.  (A wholesale effort to get rid of strncpy should be
undertaken sometime, but not during beta.)  This commit also fixes dynahash.c
to correctly truncate overlength string keys for hashtables, so that its
callers don't have to anymore.
2006-09-27 18:40:10 +00:00
Tom Lane
ffae5cc5a6 Add a check to prevent overwriting valid data if smgrnblocks() gives a
wrong answer, as has been seen to occur with a buggy Linux kernel.  Not
really our bug, but it's a simple test in a seldom-used control path,
so might as well have a defense.
2006-09-25 22:01:10 +00:00
Tom Lane
d40d34863e Fix pg_locks view to call advisory locks advisory locks, while preserving
backward compatibility for anyone using the old userlock code that's now
on pgfoundry --- locks from that code still show as 'userlock'.
2006-09-22 23:20:14 +00:00
Tom Lane
9e936693a9 Fix free space map to correctly track the total amount of FSM space needed
even when a single relation requires more than max_fsm_pages pages.  Also,
make VACUUM emit a warning in this case, since it likely means that VACUUM
FULL or other drastic corrective measure is needed.  Per reports from Jeff
Frost and others of unexpected changes in the claimed max_fsm_pages need.
2006-09-21 20:31:22 +00:00
Tom Lane
9b4cda0df6 Add built-in userlock manipulation functions to replace the former
contrib functionality.  Along the way, remove the USER_LOCKS configuration
symbol, since it no longer makes any sense to try to compile that out.
No user documentation yet ... mmoncure has promised to write some.
Thanks to Abhijit Menon-Sen for creating a first draft to work from.
2006-09-18 22:40:40 +00:00
Tom Lane
2e5e856f6b Marginal cleanup in arrangements for ensuring StrategyHintVacuum is cleared
after an error during VACUUM.  We have a PG_TRY block anyway around the only
call sites, so just reset it in the CATCH clause instead of having
AtEOXact_Buffers blindly do it during xact end.  I think the old code was
actively wrong for the case of a failure during ANALYZE inside a
subtransaction --- the flag wouldn't get cleared until main transaction end.
Probably not worth back-patching though.
2006-09-17 22:16:22 +00:00
Bruce Momjian
a0e87ad7a5 Specify lo_write() to take a _const_ buffer, to match documentation. 2006-09-07 15:37:25 +00:00
Tom Lane
8fad2e3ff4 Arrange for GetSnapshotData to copy live-subtransaction XIDs from the
PGPROC array into snapshots, and use this information to avoid visits
to pg_subtrans in HeapTupleSatisfiesSnapshot.  This appears to solve
the pg_subtrans-related context swap storm problem that's been reported
by several people for 8.1.  While at it, modify GetSnapshotData to not
take an exclusive lock on ProcArrayLock, as closer analysis shows that
shared lock is always sufficient.
Itagaki Takahiro and Tom Lane
2006-09-03 15:59:39 +00:00
Tom Lane
e06fda0a8b Add a function GetLockConflicts() to lock.c to report xacts holding
locks that would conflict with a specified lock request, without
actually trying to get that lock.  Use this instead of the former ad hoc
method of doing the first wait step in CREATE INDEX CONCURRENTLY.
Fixes problem with undetected deadlock and in many cases will allow the
index creation to proceed sooner than it otherwise could've.  Per
discussion with Greg Stark.
2006-08-27 19:14:34 +00:00
Tom Lane
e093dcdd28 Add the ability to create indexes 'concurrently', that is, without
blocking concurrent writes to the table.  Greg Stark, with a little help
from Tom Lane.
2006-08-25 04:06:58 +00:00
Tom Lane
f836c2e37e Add some debug logging code to AllocateFile's failure path to log the
specific Windows error code (GetLastError).  This is a hopefully temporary
hack to try to diagnose rare failures.  Magnus Hagander
2006-08-24 03:15:43 +00:00
Tom Lane
9bf760f7de Add a 'waiting' column to pg_stat_activity to carry the same information
that ps_status provides by appending 'waiting' to the PS display.  This
completes the project of making it feasible to turn off process title
updates and instead rely on pg_stat_activity.  Per my suggestion a few
weeks ago.
2006-08-19 01:36:34 +00:00
Tom Lane
7aa772f03e Now that we've rearranged relation open to get a lock before touching
the rel, it's easy to get rid of the narrow race-condition window that
used to exist in VACUUM and CLUSTER.  Did some minor code-beautification
work in the same area, too.
2006-08-18 16:09:13 +00:00
Tom Lane
2dd7ab0627 Put back another improperly-removed #include. 2006-08-07 21:56:25 +00:00
Tom Lane
3467758809 Fix missing 'static' keywords --- some compilers gripe about this. 2006-08-04 16:42:56 +00:00
Bruce Momjian
2c6d96cef6 Add support for loadable modules to allocated shared memory and
lightweight locks.

Marc Munro
2006-08-01 19:03:11 +00:00
Tom Lane
09d3670df3 Change the relation_open protocol so that we obtain lock on a relation
(table or index) before trying to open its relcache entry.  This fixes
race conditions in which someone else commits a change to the relation's
catalog entries while we are in process of doing relcache load.  Problems
of that ilk have been reported sporadically for years, but it was not
really practical to fix until recently --- for instance, the recent
addition of WAL-log support for in-place updates helped.

Along the way, remove pg_am.amconcurrent: all AMs are now expected to support
concurrent update.
2006-07-31 20:09:10 +00:00
Tom Lane
8822263635 Fix a couple of comments. 2006-07-30 20:17:11 +00:00
Alvaro Herrera
92c2ecc130 Modify snapshot definition so that lazy vacuums are ignored by other
vacuums.  This allows a OLTP-like system with big tables to continue
regular vacuuming on small-but-frequently-updated tables while the
big tables are being vacuumed.

Original patch from Hannu Krossing, rewritten by Tom Lane and updated
by me.
2006-07-30 02:07:18 +00:00
Peter Eisentraut
e9b4969062 DTrace support, with a small initial set of probes
by Robert Lor
2006-07-24 16:32:45 +00:00
Tom Lane
a794fb0681 Convert the lock manager to use the new dynahash.c support for partitioned
hash tables, instead of the previous kluge involving multiple hash tables.
This partially undoes my patch of last December.
2006-07-23 23:08:46 +00:00
Tom Lane
b25dc481c8 Fix oversight in sizing of shared buffer lookup hashtable. Because
BufferAlloc tries to insert a new mapping entry before deleting the old one
for a buffer, we have a transient need for more than NBuffers entries ---
one more in 8.1, and as many as NUM_BUFFER_PARTITIONS more in CVS HEAD.
In theory this could lead to an "out of shared memory" failure if shmem
had already been completely claimed by the time the extra entries were
needed.
2006-07-23 18:34:45 +00:00
Tom Lane
10b9ca3d05 Split the buffer mapping table into multiple separately lockable
partitions, as per discussion.  Passes functionality checks, but
I don't have any performance data yet.
2006-07-23 03:07:58 +00:00
Tom Lane
51ee9fa157 Add support to dynahash.c for partitioning shared hashtables according
to the low-order bits of the entry hash value.  Also make some incidental
cleanups in the dynahash API, such as not exporting the hash header
structs to the world.
2006-07-22 23:04:39 +00:00
Tom Lane
c0e9b3139f Hmm, seems --disable-spinlocks has been broken for awhile and nobody
noticed.  Fix SpinlockSemas() to report the correct count considering
that PG 8.1 adds a spinlock to each shared-buffer header.
2006-07-22 21:04:40 +00:00
Tom Lane
3ff58b48c9 Put back another not-so-unnecessary #include, per report from Hiroshi Saito. 2006-07-16 01:05:23 +00:00
Tom Lane
daecd97617 Put back some more not-so-unused-as-all-that #includes. This un-breaks
the EXEC_BACKEND code on my machines, so hopefully it will fix the
Windows buildfarm members.
2006-07-15 15:47:17 +00:00
Tom Lane
cd24163f6d Fix another passel of include-file breakage. Kris Jurka, Tom Lane 2006-07-14 16:59:19 +00:00
Bruce Momjian
e0522505bd Remove 576 references of include files that were not needed. 2006-07-14 14:52:27 +00:00
Tom Lane
ae643747b1 Fix a passel of recently-committed violations of the rule 'thou shalt
have no other gods before c.h'.  Also remove some demonstrably redundant
#include lines, mostly of <errno.h> which was added to c.h years ago.
2006-07-14 05:28:29 +00:00
Bruce Momjian
a22d76d96a Allow include files to compile own their own.
Strip unused include files out unused include files, and add needed
includes to C files.

The next step is to remove unused include files in C files.
2006-07-13 16:49:20 +00:00
Bruce Momjian
370a709c75 Add GUC update_process_title to control whether 'ps' display is updated
for every command, default to on.
2006-06-27 22:16:44 +00:00
Tom Lane
27c3e3de09 Remove redundant gettimeofday() calls to the extent practical without
changing semantics too much.  statement_timestamp is now set immediately
upon receipt of a client command message, and the various places that used
to do their own gettimeofday() calls to mark command startup are referenced
to that instead.  I have also made stats_command_string use that same
value for pg_stat_activity.query_start for both the command itself and
its eventual replacement by <IDLE> or <idle in transaction>.  There was
some debate about that, but no argument that seemed convincing enough to
justify an extra gettimeofday() call.
2006-06-20 22:52:00 +00:00
Tom Lane
b13c9686d0 Take the statistics collector out of the loop for monitoring backends'
current commands; instead, store current-status information in shared
memory.  This substantially reduces the overhead of stats_command_string
and also ensures that pg_stat_activity is fully up to date at all times.
Per my recent proposal.
2006-06-19 01:51:22 +00:00
Tom Lane
8ff80c1bd3 Remove obsolete comment about VACUUM FULL: it takes buffer content locks
now, and must do so to ensure bgwriter doesn't write a page that is in
process of being compacted.
2006-06-08 14:58:33 +00:00
Bruce Momjian
26cfefabad Fix printf mask for SizeVfdCache
Qingqing Zhou
2006-05-30 13:04:59 +00:00
Tom Lane
2246e31775 Upon closer inspection, the sparc code in s_lock.c is dead code, and
always has been, because it's not got any .globl declaration!  We've
been relying on the solaris_sparc.s code instead.  Rip it out.
(Not back-patched, since this is just cosmetic cleanup.)
2006-05-12 16:50:52 +00:00