* removed a few redundant defines
* get_user_name safe under win32
* rationalized pipe read EOF for win32 (UPDATED PATCH USED)
* changed all backend instances of sleep() to pg_usleep
- except for the SLEEP_ON_ASSERT in assert.c, as it would exceed a
32-bit long [Note to patcher: If a SLEEP_ON_ASSERT of 2000 seconds is
acceptable, please replace with pg_usleep(2000000000L)]
I added a comment to that part of the code:
/*
* It would be nice to use pg_usleep() here, but only does 2000 sec
* or 33 minutes, which seems too short.
*/
sleep(1000000);
Claudio Natoli
wit: Add a header record to each WAL segment file so that it can be reliably
identified. Avoid splitting WAL records across segment files (this is not
strictly necessary, but makes it simpler to incorporate the header records).
Make WAL entries for file creation, deletion, and truncation (as foreseen but
never implemented by Vadim). Also, add support for making XLOG_SEG_SIZE
configurable at compile time, similarly to BLCKSZ. Fix a couple bugs I
introduced in WAL replay during recent smgr API changes. initdb is forced
due to changes in pg_control contents.
the relcache, and so the notion of 'blind write' is gone. This should
improve efficiency in bgwriter and background checkpoint processes.
Internal restructuring in md.c to remove the not-very-useful array of
MdfdVec objects --- might as well just use pointers.
Also remove the long-dead 'persistent main memory' storage manager (mm.c),
since it seems quite unlikely to ever get resurrected.
pointer type when it is not necessary to do so.
For future reference, casting NULL to a pointer type is only necessary
when (a) invoking a function AND either (b) the function has no prototype
OR (c) the function is a varargs function.
- Update comment in IsReservedName() to the present day
- Improve some variable & function names in commands/vacuum.c. I
was planning to rewrite this to avoid lappend(), but since I
still intend to do the list rewrite, there's no need for that.
- Update some smgr comments which seemed to imply that we still
forced all dirty pages to disk at commit-time.
- Replace some #ifdef DIAGNOSTIC code with assertions.
- Make the distinction between OS-level file descriptors and
virtual file descriptors a little clearer in a few comments
- Other minor comment improvements in the smgr code
Adjustable threshold is gone in favor of keeping track of total requested
page storage and doling out proportional fractions to each relation
(with a minimum amount per relation, and some quantization of the results
to avoid thrashing with small changes in page counts). Provide special-
case code for indexes so as not to waste space storing useless page
free space counts. Restructure internal data storage to be a flat array
instead of list-of-chunks; this may cost a little more work in data
copying when reorganizing, but allows binary search to be used during
lookup_fsm_page_entry().
previously determined not to be the last segment of a relation.
This reduces the expected cost to one seek, rather than one seek per
segment. We can get away with this because truncation of a relation
will cause a relcache flush and so the md.c file descriptor will be
closed; when it is re-opened we will re-determine the last segment.
> There's no longer a separate call to heap_storage_create in that routine
> --- the right place to make the test is now in the storage_create
> boolean parameter being passed to heap_create. A simple change, but
> it passeth patch's understanding ...
Thanks.
Attached is a patch against cvs tip as of 8:30 PM PST or so. Turned out
that even after fixing the failed hunks, there was a new spot in
bufmgr.c which needed to be fixed (related to temp relations;
RelationUpdateNumberOfBlocks). But thankfully the regression test code
caught it :-)
Joe Conway
The local buffer manager is no longer used for newly-created relations
(unless they are TEMP); a new non-TEMP relation goes through the shared
bufmgr and thus will participate normally in checkpoints. But TEMP relations
use the local buffer manager throughout their lifespan. Also, operations
in TEMP relations are not logged in WAL, thus improving performance.
Since it's no longer necessary to fsync relations as they move out of the
local buffers into shared buffers, quite a lot of smgr.c/md.c/fd.c code
is no longer needed and has been removed: there's no concept of a dirty
relation anymore in md.c/fd.c, and we never fsync anything but WAL.
Still TODO: improve local buffer management algorithms so that it would
be reasonable to increase NLocBuffer.
o Change all current CVS messages of NOTICE to WARNING. We were going
to do this just before 7.3 beta but it has to be done now, as you will
see below.
o Change current INFO messages that should be controlled by
client_min_messages to NOTICE.
o Force remaining INFO messages, like from EXPLAIN, VACUUM VERBOSE, etc.
to always go to the client.
o Remove INFO from the client_min_messages options and add NOTICE.
Seems we do need three non-ERROR elog levels to handle the various
behaviors we need for these messages.
Regression passed.
now just below FATAL in server_min_messages. Added more text to
highlight ordering difference between it and client_min_messages.
---------------------------------------------------------------------------
REALLYFATAL => PANIC
STOP => PANIC
New INFO level the prints to client by default
New LOG level the prints to server log by default
Cause VACUUM information to print only to the client
NOTICE => INFO where purely information messages are sent
DEBUG => LOG for purely server status messages
DEBUG removed, kept as backward compatible
DEBUG5, DEBUG4, DEBUG3, DEBUG2, DEBUG1 added
DebugLvl removed in favor of new DEBUG[1-5] symbols
New server_min_messages GUC parameter with values:
DEBUG[5-1], INFO, NOTICE, ERROR, LOG, FATAL, PANIC
New client_min_messages GUC parameter with values:
DEBUG[5-1], LOG, INFO, NOTICE, ERROR, FATAL, PANIC
Server startup now logged with LOG instead of DEBUG
Remove debug_level GUC parameter
elog() numbers now start at 10
Add test to print error message if older elog() values are passed to elog()
Bootstrap mode now has a -d that requires an argument, like postmaster
readability. Bizarre '(long *) TRUE' return convention is gone,
in favor of just raising an error internally in dynahash.c when
we detect hashtable corruption. HashTableWalk is gone, in favor
of using hash_seq_search directly, since it had no hope of working
with non-LONGALIGNable datatypes. Simplify some other code that was
made undesirably grotty by promixity to HashTableWalk.
portability issues). Caller-visible data structures are now allocated
on MAXALIGN boundaries, allowing safe use of datatypes wider than 'long'.
Rejigger hash_create API so that caller specifies size of key and
total size of entry, not size of key and size of rest of entry.
This simplifies life considerably since each number is just a sizeof(),
and padding issues etc. are taken care of automatically.
existing lock manager and spinlocks: it understands exclusive vs shared
lock but has few other fancy features. Replace most uses of spinlocks
with lightweight locks. All remaining uses of spinlocks have very short
lock hold times (a few dozen instructions), so tweak spinlock backoff
code to work efficiently given this assumption. All per my proposal on
pghackers 26-Sep-01.
useful as yet, since its primary source of information is (full) VACUUM,
which makes a concerted effort to get rid of free space before telling
the map about it ... next stop is concurrent VACUUM ...
stub) into the rest of the system. Adopt a cleaner approach to preventing
deadlock in concurrent heap_updates: allow RelationGetBufferForTuple to
select any page of the rel, and put the onus on it to lock both buffers
in a consistent order. Remove no-longer-needed isExtend hack from
API of ReleaseAndReadBuffer.
do anything yet, but it has the necessary connections to initialization
and so forth. Make some gestures towards allowing number of blocks in
a relation to be BlockNumber, ie, unsigned int, rather than signed int.
(I doubt I got all the places that are sloppy about it, yet.) On the
way, replace the hardwired NLOCKS_PER_XACT fudge factor with a GUC
variable.
checkpoint's redo pointer, not its undo pointer, per discussion in
pghackers a few days ago. No point in hanging onto undo information
until we have the ability to do something with it --- and this solves
a rather large problem with log space for long-running transactions.
Also, change all calls of write() to detect the case where write
returned a count less than requested, but failed to set errno.
Presume that this situation indicates ENOSPC, and give the appropriate
error message, rather than a random message associated with the previous
value of errno.
not being consulted anywhere, so remove it and remove the _mdnblocks()
calls that were used to set it. Change smgrextend interface to pass in
the target block number (ie, current file length) --- the caller always
knows this already, having already done smgrnblocks(), so it's silly to
do it over again inside mdextend. Net result: extension of a file now
takes one lseek(SEEK_END) and a write(), not three lseeks and a write.
(WAL logging for this is not done yet, however.) Clean up a number of really
crufty things that are no longer needed now that DROP behaves nicely. Make
temp table mapper do the right things when drop or rename affecting a temp
table is rolled back. Also, remove "relation modified while in use" error
check, in favor of locking tables at first reference and holding that lock
throughout the statement.
There's now only one transition value and transition function.
NULL handling in aggregates is a lot cleaner. Also, use Numeric
accumulators instead of integer accumulators for sum/avg on integer
datatypes --- this avoids overflow at the cost of being a little slower.
Implement VARIANCE() and STDDEV() aggregates in the standard backend.
Also, enable new LIKE selectivity estimators by default. Unrelated
change, but as long as I had to force initdb anyway...