Also fix the name of the dtrace probe for LWLockAcquireOrWait(). The
function was renamed from LWLockWaitUntilFree to LWLockAqcuireOrWait, but
the dtrace probe was neglected.
Pointed out by Andres Freund and the buildfarm.
This makes it possible to store lwlocks as part of some other data
structure in the main shared memory segment, or in a dynamic shared
memory segment. There is still a main LWLock array and this patch does
not move anything out of it, but it provides necessary infrastructure
for doing that in the future.
This change is likely to increase the size of LWLockPadded on some
platforms, especially 32-bit platforms where it was previously only
16 bytes.
Patch by me. Review by Andres Freund and KaiGai Kohei.
When a backend needs to flush the WAL, and someone else is already flushing
the WAL, wait until it releases the WALInsertLock and check if we still need
to do the flush or if the other backend already did the work for us, before
acquiring WALInsertLock. This helps group commit, because when the WAL flush
finishes, all the backends that were waiting for it can be woken up in one
go, and the can all concurrently observe that they're done, rather than
waking them up one by one in a cascading fashion.
This is based on a new LWLock function, LWLockWaitUntilFree(), which has
peculiar semantics. If the lock is immediately free, it grabs the lock and
returns true. If it's not free, it waits until it is released, but then
returns false without grabbing the lock. This is used in XLogFlush(), so
that when the lock is acquired, the backend flushes the WAL, but if it's
not, the backend first checks the current flush location before retrying.
Original patch and benchmarking by Peter Geoghegan and Simon Riggs, although
this patch as committed ended up being very different from that.
This allows us to reliably remove all leftover temporary relation
files on cluster startup without reference to system catalogs or WAL;
therefore, we no longer include temporary relations in XLOG_XACT_COMMIT
and XLOG_XACT_ABORT WAL records.
Since these changes require including a backend ID in each
SharedInvalSmgrMsg, the size of the SharedInvalidationMessage.id
field has been reduced from two bytes to one, and the maximum number
of connections has been reduced from INT_MAX / 4 to 2^23-1. It would
be possible to remove these restrictions by increasing the size of
SharedInvalidationMessage by 4 bytes, but right now that doesn't seem
like a good trade-off.
Review by Jaime Casanova and Tom Lane.
probes --- the BUFFER_READ_DONE probe provides the same information and more
besides. Expand the LOCK_WAIT_START/DONE probe arguments so that there's
actually some chance of telling what is being waited for. Update and
clean up the documentation.
DTrace probes, so that ordinary reads can be distinguished from relation
extension operations. Move buffer_read_start probe to before the
smgrnblocks() call that's needed in the isExtend case, since really that step
should be charged as part of the time needed for the extension operation.
(This makes it slightly harder to match the read_start with the associated
read_done, since now you can't match them on blockNumber, but it should still
be possible since isExtend operations on the same relation can never be
interleaved.) Per recent discussion.
In passing, add the page identity (forkNum/blockNum) to the parameters of the
buffer_flush_start/buffer_flush_done probes, which were unaccountably lacking
the info.
some bufmgr probes, take out redundant and memory-leak-inducing path arguments
to smgr__md__read__done and smgr__md__write__done, fix bogus attempt to
recalculate space used in sort__done, clean up formatting in places where
I'm not sure pgindent will do a nice job by itself.
includes a few new ones.
- Fixed compilation errors on OS X for probes that use typedefs
- Fixed a number of probes to pass ForkNumber per the relation forks
patch
- The new probes are those that were taken out from the previous
submitted patch and required simple fixes. Will submit the other probes
that may require more discussion in a separate patch.
Robert Lor