Keep a per-backend local dirtybit as well as a shared dirtybit for each
shared buffer. The shared dirtybit still controls writing the buffer,
but the local bit controls whether we need
to fsync the buffer's file. This arrangement fixes a bug that allowed
some required fsyncs to be missed, and should improve performance as well.
For more info see my post of same date on pghackers.
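A minimal sketch of the arrangement (illustrative names and layout, not
the actual bufmgr interfaces): the shared bit drives buffer writes, while
the per-backend bit records "I dirtied this buffer, so I owe an fsync of
its file at commit."

    #include <stdbool.h>
    #include <stdio.h>

    #define NBUFFERS 16

    /* Shared state: one dirtybit per shared buffer (illustrative). */
    static bool shared_dirty[NBUFFERS];
    /* Per-backend state: do *we* owe an fsync for this buffer's file? */
    static bool local_needs_fsync[NBUFFERS];

    static void mark_buffer_dirty(int buf)
    {
        shared_dirty[buf] = true;       /* controls writing the buffer */
        local_needs_fsync[buf] = true;  /* controls whether we fsync */
    }

    static void commit_fsync_pass(void)
    {
        for (int buf = 0; buf < NBUFFERS; buf++)
            if (local_needs_fsync[buf])
            {
                printf("fsync file behind buffer %d\n", buf);
                local_needs_fsync[buf] = false;
            }
    }

    int main(void)
    {
        mark_buffer_dirty(3);
        commit_fsync_pass();   /* fsyncs buffer 3's file exactly once */
        return 0;
    }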
Clean up problems found by running gcc and HP's cc with warnings cranked
way up: signed vs. unsigned comparisons, routines declared static and then
defined not-static, that kind of thing. Tedious, but perhaps useful...
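For instance, the two kinds of fixes involved look roughly like this
(illustrative code, not from the tree):

    #include <stdio.h>
    #include <string.h>

    /* Signed vs. unsigned: strlen() returns size_t, so use size_t for
     * the index instead of comparing a signed int to an unsigned length. */
    static size_t count_spaces(const char *s)
    {
        size_t n = 0;
        for (size_t i = 0; i < strlen(s); i++)
            if (s[i] == ' ')
                n++;
        return n;
    }

    /* Declared static and defined static: the declaration and definition
     * must agree, or compilers with warnings cranked up will complain. */
    static int helper(void);
    static int helper(void) { return 42; }

    int main(void)
    {
        printf("%zu %d\n", count_spaces("a b c"), helper());
        return 0;
    }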
Initdb help correction
Changed end/abort to commit/rollback and changed related notices
Commented out way old printing functions in libpq
Fixed a typo in alter table / alter column
Implements the CREATE CONSTRAINT TRIGGER and SET CONSTRAINTS commands.
TODO:
Generic builtin trigger procedures
Automatic execution of appropriate CREATE CONSTRAINT... at CREATE TABLE
Support of new trigger type in pg_dump
Swapping of huge # of events to disk
Jan
* Buffer refcount cleanup (per my "progress report" to pghackers, 9/22).
* Add links to backend PROC structs to sinval's array of per-backend info,
and use these links for routines that need to check the state of all
backends (rather than the slow, complicated search of the ShmemIndex
hashtable that was used before). Add databaseOID to PROC structs.
* Use this to implement an interlock that prevents DESTROY DATABASE of
a database containing running backends. (It's a little tricky to prevent
a concurrently-starting backend from getting in there, since the new
backend is not able to lock anything at the time it tries to look up
its database in pg_database. My solution is to recheck that the DB is
OK at the end of InitPostgres. It may not be a 100% solution, but it's
a lot better than no interlock at all...)
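A sketch of what the all-backends check becomes with the new links (stub
types and an illustrative function name, not the real sinval code):

    #include <stdbool.h>
    #include <stdio.h>

    typedef unsigned int Oid;

    /* Illustrative stand-in for the backend PROC struct, new field added. */
    typedef struct PROC
    {
        Oid databaseOID;    /* which database this backend is attached to */
    } PROC;

    #define MAXBACKENDS 8

    /* sinval's per-backend array now links directly to the PROC structs,
     * so checking the state of all backends is a simple array walk rather
     * than a search of the ShmemIndex hashtable. */
    static PROC *procLinks[MAXBACKENDS];

    /* The DESTROY DATABASE interlock: is any running backend inside the
     * given database?  (A concurrently starting backend can still slip
     * past this check, which is why InitPostgres rechecks its database
     * at the end of startup.) */
    static bool DatabaseHasActiveBackends(Oid dbOid)
    {
        for (int i = 0; i < MAXBACKENDS; i++)
            if (procLinks[i] != NULL && procLinks[i]->databaseOID == dbOid)
                return true;
        return false;
    }

    int main(void)
    {
        PROC p = { 17 };
        procLinks[0] = &p;
        printf("%d\n", DatabaseHasActiveBackends(17));   /* prints 1 */
        return 0;
    }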
* In ALTER TABLE RENAME, flush buffers for the relation before doing the
rename of the physical files, to ensure we don't get failures later from
mdblindwrt().
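The point is purely the ordering, sketched here with illustrative names
(not the actual bufmgr/smgr interfaces):

    #include <stdio.h>

    static void FlushRelationBuffers(const char *rel)
    {
        /* write out and drop all dirty buffers belonging to rel here */
        printf("flushed buffers for %s\n", rel);
    }

    static int RenameRelationFiles(const char *oldpath, const char *newpath)
    {
        /* Flush FIRST: otherwise a blind write (mdblindwrt) issued later
         * for a still-dirty buffer would try to open the old, now-missing
         * filename and fail. */
        FlushRelationBuffers(oldpath);
        return rename(oldpath, newpath);
    }

    int main(void)
    {
        if (RenameRelationFiles("old_rel", "new_rel") != 0)
            perror("rename");
        return 0;
    }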
* Update TRUNCATE patch so that it actually compiles against current
sources :-(.
You should do "make clean all" after pulling these changes.
See attached mail for more details.
-------------------------------------------------------------------
From: "Vadim Mikheev" <vadim@krs.ru>
To: "Hiroshi Inoue" <Inoue@tpf.co.jp>
References: <000201befa94$42fe04c0$2801007e@cadzone.tpf.co.jp>
Subject: Re: elog(ERROR) in vacuum
Date: Fri, 10 Sep 1999 10:27:10 +0900
Organization: OJSC Rostelecom (Krasnoyarsk)
Message-ID: <37D85E6E.5AFA126D@krs.ru>
Hiroshi Inoue wrote:
>
> Hello Vadim,
>
> I have a question about vacuum.
>
> VACUUM has a phase like commit which calls TransactionIdCommit().
> But if elog(ERROR) occurred after that, the status of the transaction is
> changed from XID_COMMIT to XID_ABORT.
>
> Seems to me this causes inconsistency.
> Shouldn't AbortTransaction() be changed not to call TransactionIdAbort()
> in case of vacuum?
You're right!
As usual -:)
Vadim
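A toy illustration of the hazard Hiroshi describes, with stub names (the
guard shown here just encodes his suggestion; the committed fix is
described further below):

    #include <stdbool.h>
    #include <stdio.h>

    typedef enum { XID_INPROGRESS, XID_COMMIT, XID_ABORT } XidStatus;

    static XidStatus xidStatus = XID_INPROGRESS;
    static bool xidCommitted = false;   /* has commit already been logged? */

    static void TransactionIdCommitStub(void)
    {
        xidStatus = XID_COMMIT;
        xidCommitted = true;
    }

    /* The hazard: an unguarded abort path overwrites XID_COMMIT with
     * XID_ABORT when elog(ERROR) fires after VACUUM's mid-run commit. */
    static void TransactionIdAbortStub(void)
    {
        if (xidCommitted)
            return;            /* never un-commit a committed xact */
        xidStatus = XID_ABORT;
    }

    int main(void)
    {
        TransactionIdCommitStub();  /* VACUUM's internal commit */
        TransactionIdAbortStub();   /* later elog(ERROR) -> abort path */
        printf("status = %s\n",
               xidStatus == XID_COMMIT ? "COMMIT" : "ABORT");
        return 0;
    }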
Fix cleanup of allocation contexts during transaction abort --- before, it
only worked if there was exactly one level of allocation context stacked
in the blank portal. Now it does the right thing for any depth, including
zero...
When a shared inval report arrives for a relcache entry that has a positive
refcount, the entry is now rebuilt from pg_class data. This ensures
that relcache entries will track changes made by other backends. Formerly,
a shared inval report would just be ignored if it happened to arrive while
the relcache entry was in use. Also, fix relcache to reset ref counts
to zero during transaction abort. Finally, change LockRelation() so that
it checks for shared inval reports after obtaining the lock. In this way,
once any kind of lock has been obtained on a rel, we can trust the relcache
entry to be up-to-date.
Also, move responsibility for calling vc_abort into main xact.c list of
things-to-call-at-abort. What in the world was it doing down inside of
TransactionIdAbort()?
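A rough sketch of the new relcache behavior (stub types and names, not the
real relcache API):

    #include <stdbool.h>
    #include <stdio.h>

    /* Illustrative relcache entry. */
    typedef struct RelCacheEntryStub
    {
        int  refcount;
        bool valid;
    } RelCacheEntryStub;

    static void RebuildFromPgClassStub(RelCacheEntryStub *rel)
    {
        rel->valid = true;    /* reread pg_class data here */
    }

    /* Shared-inval report arrives: formerly ignored when refcount > 0,
     * now the entry is rebuilt so it tracks other backends' changes. */
    static void RelCacheInvalidateEntryStub(RelCacheEntryStub *rel)
    {
        if (rel->refcount > 0)
            RebuildFromPgClassStub(rel);
        else
            rel->valid = false;          /* just flush it */
    }

    /* Process pending inval reports after the lock is obtained; from then
     * on the relcache entry can be trusted to be up-to-date. */
    static void LockRelationStub(RelCacheEntryStub *rel)
    {
        /* ... acquire the lock ... */
        RelCacheInvalidateEntryStub(rel);   /* handle pending sinval */
    }

    int main(void)
    {
        RelCacheEntryStub r = { 1, true };
        LockRelationStub(&r);
        printf("valid=%d\n", r.valid);
        return 0;
    }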
and possibly for other cases too:
DO NOT cache status of transaction in unknown state
(i.e. non-committed and non-aborted ones)
Example:
T1 reads a row updated/inserted by running T2 and caches T2's status.
T2 commits.
Now T1 reads a row updated by T2 that has HEAP_XMAX_COMMITTED
set in t_infomask (so the cached T2 status is not updated).
Now T1's EvalPlanQual gets the updated row version without HEAP_XMIN_COMMITTED
-> TransactionIdDidCommit(t_xmin) and TransactionIdDidAbort(t_xmin)
both return FALSE, and T1 decides that t_xmin is not committed and gets the
ERROR above.
It's too late to find a smarter way to handle such cases, so I just
changed xact status caching and removed TransactionIdFlushCache()
from the code.
Changed: transam.c, xact.c, lmgr.c and transam.h - the last three
only because TransactionIdFlushCache() is removed.
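A toy illustration of the new caching rule (hypothetical cache layout;
collisions and real xid bookkeeping are ignored):

    #include <stdio.h>

    typedef enum { XACT_RUNNING, XACT_COMMITTED, XACT_ABORTED } XactState;

    #define XID_CACHE_SIZE 64
    static XactState xidCache[XID_CACHE_SIZE];
    static int xidCached[XID_CACHE_SIZE];  /* 1 if slot holds final status */

    /* The rule after the fix: cache only final states.  A still-running
     * transaction may commit later, so caching "not committed" for it
     * would make TransactionIdDidCommit() keep answering FALSE even
     * after the commit. */
    static void maybe_cache_status(unsigned int xid, XactState s)
    {
        if (s == XACT_RUNNING)
            return;                          /* unknown state: do NOT cache */
        xidCache[xid % XID_CACHE_SIZE] = s;
        xidCached[xid % XID_CACHE_SIZE] = 1;
    }

    int main(void)
    {
        maybe_cache_status(100, XACT_RUNNING);    /* ignored */
        maybe_cache_status(100, XACT_COMMITTED);  /* cached once final */
        printf("cached=%d state=%d\n", xidCached[100 % XID_CACHE_SIZE],
               (int) xidCache[100 % XID_CACHE_SIZE]);
        return 0;
    }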
2. heapam.c:
T1 marked a row for update. T2 waits for T1's commit/abort.
T1 commits. T3 updates the row before T2 locks the row page.
Now T2 sees that the new row t_xmax is different from the xact id (T1)
T2 was waiting for. Old code did an Assert here. New code goes to
HeapTupleSatisfiesUpdate. Obvious changes too.
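Sketched with stub names (not the real heapam logic):

    #include <stdio.h>

    typedef unsigned int TransactionId;

    typedef enum { MAY_UPDATE, BEING_UPDATED } UpdateVerdict;

    /* Illustrative stand-in for re-evaluating the tuple's visibility. */
    static UpdateVerdict HeapTupleSatisfiesUpdateStub(TransactionId t_xmax)
    {
        return t_xmax != 0 ? BEING_UPDATED : MAY_UPDATE;
    }

    /* T2 waited out T1, but T3 got in first, so the row's t_xmax no
     * longer matches the xid T2 was waiting for.  Old code:
     * Assert(t_xmax == waited) -> backend crash.  New code: recheck. */
    static UpdateVerdict after_wait(TransactionId t_xmax,
                                    TransactionId waited)
    {
        if (t_xmax != waited)
            return HeapTupleSatisfiesUpdateStub(t_xmax); /* don't Assert */
        return MAY_UPDATE;
    }

    int main(void)
    {
        /* waited for T1 (xid 1), but found T3 (xid 3) in t_xmax */
        printf("%d\n", after_wait(3, 1));
        return 0;
    }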
3. Added Assert to vacuum.c
4. bufmgr.c: break
Assert(buf->r_locks == 0 && !buf->ri_lock)
into two Asserts.
2. varsup.c:ReadNewTransactionId(): don't read nextXid from disk -
this func doesn't allocate next xid, so ShmemVariableCache->nextXid
may be used (but GetNewTransactionId() must be called first).
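Roughly, with stub names:

    #include <stdio.h>

    typedef unsigned int TransactionId;

    /* Illustrative stand-in for the shared-memory variable cache. */
    static struct { TransactionId nextXid; } ShmemVariableCacheStub = { 1 };

    /* Allocates the next xid -- the only operation that advances the
     * counter (and, in the real code, the one that reads it from disk). */
    static TransactionId GetNewTransactionIdStub(void)
    {
        return ShmemVariableCacheStub.nextXid++;
    }

    /* Only *reads* the counter: no disk access needed, because the
     * in-memory value is authoritative once GetNewTransactionId() ran. */
    static TransactionId ReadNewTransactionIdStub(void)
    {
        return ShmemVariableCacheStub.nextXid;
    }

    int main(void)
    {
        GetNewTransactionIdStub();                 /* must be called first */
        printf("nextXid = %u\n", ReadNewTransactionIdStub());
        return 0;
    }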
3. vacuum.c: change elog(ERROR, "Child item....") to elog(NOTICE) -
this is not an ERROR; proper handling just isn't implemented yet.
4. s_lock.c: double S_MAX_BUSY.
5. shmem.c:GetSnapshotData(): have to call ReadNewTransactionId()
_after_ SpinAcquire(ShmemIndexLock).
1. MyProc->xid assignment is moved out of xact.c:StartTransaction()
(it is now set when the xid is allocated), so that other
transactions will not assume that the MyProc transaction was committed
before snapshot calculations. With the old MyProc->xid assignment
(in xact.c:StartTransaction()) it was possible to see the same
row twice (I used gdb for this)!...
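The ordering constraint from item 5 above, sketched with stub locking
primitives (illustrative names only):

    #include <stdio.h>

    typedef unsigned int TransactionId;

    static int shmemIndexLock;                 /* illustrative spinlock */
    static TransactionId nextXid = 100;

    static void SpinAcquireStub(int *lock) { *lock = 1; }
    static void SpinReleaseStub(int *lock) { *lock = 0; }

    /* Reading nextXid before taking the lock would let a transaction
     * commit between the read and the scan of running backends, so the
     * snapshot could wrongly treat it as already committed. */
    static TransactionId GetSnapshotXmaxStub(void)
    {
        TransactionId xmax;

        SpinAcquireStub(&shmemIndexLock);  /* lock FIRST ... */
        xmax = nextXid;                    /* ... then read the counter */
        /* scan per-backend xids here while still holding the lock */
        SpinReleaseStub(&shmemIndexLock);
        return xmax;
    }

    int main(void)
    {
        printf("snapshot xmax = %u\n", GetSnapshotXmaxStub());
        return 0;
    }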
2. Assignments of InvalidTransactionId to MyProc->xid and MyProc->xmin
are moved from xact.c:CommitTransaction() to
xact.c:RecordTransactionCommit() - this invalidation must be done
before releasing transaction locks or bad (too high) XmaxRecent value
might be used by vacuum ("ERROR: Child itemid marked as unused"
reported by "Hiroshi Inoue" <Inoue@tpf.co.jp>; once again, gdb
allowed me reproduce this error).
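The ordering is the whole point; a sketch with stub names:

    #include <stdio.h>

    typedef unsigned int TransactionId;
    #define InvalidTransactionId 0

    static struct { TransactionId xid, xmin; } MyProcStub = { 42, 42 };

    static void ReleaseTransactionLocksStub(void)
    {
        printf("locks released; MyProc.xid = %u\n", MyProcStub.xid);
    }

    /* Clear MyProc's xid/xmin BEFORE releasing transaction locks, or
     * (per the report above) a bad (too high) XmaxRecent value might be
     * used by a VACUUM that wakes up on the lock release. */
    static void RecordTransactionCommitStub(void)
    {
        /* ... record the commit ... */
        MyProcStub.xid = InvalidTransactionId;
        MyProcStub.xmin = InvalidTransactionId;
    }

    int main(void)
    {
        RecordTransactionCommitStub();     /* invalidates xid/xmin first */
        ReleaseTransactionLocksStub();     /* only then release locks */
        return 0;
    }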
Cause files opened through fd.c's virtual-file-descriptor mechanism to be
closed automatically at transaction abort or commit, should
they still be open. Also close any still-open stdio files allocated with
AllocateFile at abort/commit. This should eliminate problems with leakage
of file descriptors after an error. Also, put in some primitive buffered-IO
support so that psort.c can use virtual files without severe performance
penalties.
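A minimal sketch of the AllocateFile bookkeeping (hypothetical table and
function names; the real fd.c tracks more state):

    #include <stdio.h>

    #define MAX_ALLOCATED_FILES 32

    /* Illustrative table of stdio files opened through AllocateFile(). */
    static FILE *allocatedFiles[MAX_ALLOCATED_FILES];

    static FILE *AllocateFileStub(const char *name, const char *mode)
    {
        FILE *f = fopen(name, mode);
        if (f)
            for (int i = 0; i < MAX_ALLOCATED_FILES; i++)
                if (allocatedFiles[i] == NULL)
                {
                    allocatedFiles[i] = f;   /* remember it for cleanup */
                    break;
                }
        return f;
    }

    /* Called at transaction commit AND abort: close anything still open,
     * so an elog(ERROR) mid-command no longer leaks file descriptors. */
    static void AtEOXact_FilesStub(void)
    {
        for (int i = 0; i < MAX_ALLOCATED_FILES; i++)
            if (allocatedFiles[i] != NULL)
            {
                fclose(allocatedFiles[i]);
                allocatedFiles[i] = NULL;
            }
    }

    int main(void)
    {
        AllocateFileStub("/tmp/pg_sketch.tmp", "w");
        AtEOXact_FilesStub();   /* runs at commit or abort */
        return 0;
    }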
2. Much faster btree tuple deletion in the case when the first index
tuple on a page is deleted (no movement to the left page(s)).
3. Remember the blkno of the new root page in the BTPageOpaque of the
left/right siblings when the root page is split.
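A sketch of item 3, using an illustrative stub of the per-page opaque
data (field and function names are stand-ins):

    #include <stdio.h>

    typedef unsigned int BlockNumber;

    /* Illustrative slice of the btree page opaque data. */
    typedef struct BTPageOpaqueStub
    {
        BlockNumber btpo_parent;   /* block number of the parent page */
    } BTPageOpaqueStub;

    /* When the root page is split, both resulting siblings record the
     * new root's block number, so anyone ascending from either child
     * can find the freshly created root. */
    static void record_new_root(BTPageOpaqueStub *left,
                                BTPageOpaqueStub *right,
                                BlockNumber newroot)
    {
        left->btpo_parent = newroot;
        right->btpo_parent = newroot;
    }

    int main(void)
    {
        BTPageOpaqueStub l = { 0 }, r = { 0 };
        record_new_root(&l, &r, 3);
        printf("%u %u\n", l.btpo_parent, r.btpo_parent);
        return 0;
    }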
Large-object index relations are left open between lo_* function
calls. Outside a transaction, the backend detects them as buffer
leaks; it sends a NOTICE, and frees them. This sometimes causes a
segmentation fault (at least on Linux). These indexes are initialized
on the first lo_read/lo_write/lo_tell call, and (normally) closed
on a lo_close call. Thus the buffer leaks appear when lo direct
access functions are used, and not with lo_import/lo_export functions
(libpq version calls lo_close before ending the command, and the
backend version uses another path).
The included patches (against a recent snapshot, and against 6.3.2)
cause the indexes to be closed on transaction end (that is, on an explicit
'END' statement, or on command termination outside transaction blocks),
thus preventing the buffer leaks while improving performance inside
transactions. Some (all?) 'classic' memory leaks are also removed.
I hope it will be ok.
--- Pascal ANDRE, graduated from Ecole Centrale Paris andre@via.ecp.fr
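A sketch of the patch's approach, with hypothetical names: descriptors
opened by lo_* calls are registered, and an end-of-transaction hook
closes whatever lo_close was never called on.

    #include <stdio.h>

    #define MAX_LO_DESCS 16

    /* Illustrative registry of descriptors opened by lo_* calls. */
    static int loOpen[MAX_LO_DESCS];   /* 1 = index opened, not closed */

    static void lo_open_stub(int fd)  { loOpen[fd] = 1; }
    static void lo_close_stub(int fd) { loOpen[fd] = 0; }

    /* Run at transaction end (explicit END, or command end outside a
     * transaction block): close anything the client never lo_close'd,
     * so buffers pinned by the LO's index are released cleanly instead
     * of being reported -- or crashing -- as leaks. */
    static void AtEOXact_LargeObjectsStub(void)
    {
        for (int fd = 0; fd < MAX_LO_DESCS; fd++)
            if (loOpen[fd])
                lo_close_stub(fd);
    }

    int main(void)
    {
        lo_open_stub(5);               /* client forgets lo_close */
        AtEOXact_LargeObjectsStub();   /* transaction end cleans up */
        printf("open=%d\n", loOpen[5]);
        return 0;
    }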