Mirror of https://github.com/postgres/postgres.git (synced 2025-11-13 16:22:44 +03:00)
pgindent run for 9.4
This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not affect backpatching.
@@ -44,7 +44,7 @@ int32 *PrivateRefCount;
 *
 * IO_IN_PROGRESS -- this is a flag in the buffer descriptor.
 * It must be set when an IO is initiated and cleared at
 * the end of the IO. It is there to make sure that one
 * process doesn't start to use a buffer while another is
 * faulting it in. see WaitIO and related routines.
 *
@@ -54,7 +54,7 @@ int32 *PrivateRefCount;
 *
 * PrivateRefCount -- Each buffer also has a private refcount that keeps
 * track of the number of times the buffer is pinned in the current
 * process. This is used for two purposes: first, if we pin a
 * a buffer more than once, we only need to change the shared refcount
 * once, thus only lock the shared state once; second, when a transaction
 * aborts, it should only unpin the buffers exactly the number of times it

@@ -3,7 +3,7 @@
 * buf_table.c
 * routines for mapping BufferTags to buffer indexes.
 *
 * Note: the routines in this file do no locking of their own. The caller
 * must hold a suitable lock on the appropriate BufMappingLock, as specified
 * in the comments. We can't do the locking inside these functions because
 * in most cases the caller needs to adjust the buffer header contents
@@ -112,7 +112,7 @@ BufTableLookup(BufferTag *tagPtr, uint32 hashcode)
 * Insert a hashtable entry for given tag and buffer ID,
 * unless an entry already exists for that tag
 *
 * Returns -1 on successful insertion. If a conflicting entry exists
 * already, returns the buffer ID in that entry.
 *
 * Caller must hold exclusive lock on BufMappingLock for tag's partition

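For context, a minimal sketch of the lookup-then-insert protocol this hunk describes, assuming the 9.4-era buffer-manager internals (BufTableHashCode, BufMappingPartitionLock); tag and victim_buf_id are placeholder names, and this is illustrative only, not part of the commit:

    /* Compute the hash once; it selects the BufMappingLock partition. */
    uint32  hashcode = BufTableHashCode(&tag);
    LWLock *partitionLock = BufMappingPartitionLock(hashcode);
    int     buf_id;

    /* Probe the mapping under shared lock. */
    LWLockAcquire(partitionLock, LW_SHARED);
    buf_id = BufTableLookup(&tag, hashcode);    /* >= 0 if the page is already cached */
    LWLockRelease(partitionLock);

    /* Installing a new mapping requires the exclusive lock described above. */
    LWLockAcquire(partitionLock, LW_EXCLUSIVE);
    buf_id = BufTableInsert(&tag, hashcode, victim_buf_id);     /* -1 on success, else existing buffer ID */
    LWLockRelease(partitionLock);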
@@ -116,7 +116,7 @@ static int rnode_comparator(const void *p1, const void *p2);
 * PrefetchBuffer -- initiate asynchronous read of a block of a relation
 *
 * This is named by analogy to ReadBuffer but doesn't actually allocate a
 * buffer. Instead it tries to ensure that a future ReadBuffer for the given
 * block will not be delayed by the I/O. Prefetching is optional.
 * No-op if prefetching isn't compiled in.
 */
@@ -206,7 +206,7 @@ ReadBuffer(Relation reln, BlockNumber blockNum)
 * Assume when this function is called, that reln has been opened already.
 *
 * In RBM_NORMAL mode, the page is read from disk, and the page header is
 * validated. An error is thrown if the page header is not valid. (But
 * note that an all-zero page is considered "valid"; see PageIsVerified().)
 *
 * RBM_ZERO_ON_ERROR is like the normal mode, but if the page header is not
@@ -214,7 +214,7 @@ ReadBuffer(Relation reln, BlockNumber blockNum)
 * for non-critical data, where the caller is prepared to repair errors.
 *
 * In RBM_ZERO mode, if the page isn't in buffer cache already, it's filled
 * with zeros instead of reading it from disk. Useful when the caller is
 * going to fill the page from scratch, since this saves I/O and avoids
 * unnecessary failure if the page-on-disk has corrupt page headers.
 * Caution: do not use this mode to read a page that is beyond the relation's
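As a usage illustration of the read modes documented above (a sketch only; rel and blkno are placeholders, and the calls use the public 9.4-era ReadBufferExtended API):

    /* Normal read: the page header is validated; corruption raises an error. */
    Buffer  buf = ReadBufferExtended(rel, MAIN_FORKNUM, blkno, RBM_NORMAL, NULL);
    ReleaseBuffer(buf);

    /* Tolerant read for non-critical data: a damaged page comes back zero-filled. */
    buf = ReadBufferExtended(rel, MAIN_FORKNUM, blkno, RBM_ZERO_ON_ERROR, NULL);
    ReleaseBuffer(buf);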
@@ -371,7 +371,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 * This can happen because mdread doesn't complain about reads beyond
 * EOF (when zero_damaged_pages is ON) and so a previous attempt to
 * read a block beyond EOF could have left a "valid" zero-filled
 * buffer. Unfortunately, we have also seen this case occurring
 * because of buggy Linux kernels that sometimes return an
 * lseek(SEEK_END) result that doesn't account for a recent write. In
 * that situation, the pre-existing buffer would contain valid data
@@ -597,7 +597,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,

/*
 * Didn't find it in the buffer pool. We'll have to initialize a new
 * buffer. Remember to unlock the mapping lock while doing the work.
 */
LWLockRelease(newPartitionLock);

@@ -607,7 +607,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
bool lock_held;

/*
 * Select a victim buffer. The buffer is returned with its header
 * spinlock still held! Also (in most cases) the BufFreelistLock is
 * still held, since it would be bad to hold the spinlock while
 * possibly waking up other processes.
@@ -656,7 +656,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 * If using a nondefault strategy, and writing the buffer
 * would require a WAL flush, let the strategy decide whether
 * to go ahead and write/reuse the buffer or to choose another
 * victim. We need lock to inspect the page LSN, so this
 * can't be done inside StrategyGetBuffer.
 */
if (strategy != NULL)
@@ -786,7 +786,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
{
/*
 * We can only get here if (a) someone else is still reading
 * in the page, or (b) a previous read attempt failed. We
 * have to wait for any active read attempt to finish, and
 * then set up our own read attempt if the page is still not
 * BM_VALID. StartBufferIO does it all.
@@ -879,7 +879,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 * This is used only in contexts such as dropping a relation. We assume
 * that no other backend could possibly be interested in using the page,
 * so the only reason the buffer might be pinned is if someone else is
 * trying to write it out. We have to let them finish before we can
 * reclaim the buffer.
 *
 * The buffer could get reclaimed by someone else while we are waiting
@@ -978,7 +978,7 @@ retry:
 *
 * Marks buffer contents as dirty (actual write happens later).
 *
 * Buffer must be pinned and exclusive-locked. (If caller does not hold
 * exclusive lock, then somebody could be in process of writing the buffer,
 * leading to risk of bad data written to disk.)
 */
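The pin-and-exclusive-lock requirement above translates into the usual calling pattern, sketched here with WAL logging omitted for brevity (rel and blkno are placeholders):

    Buffer  buf = ReadBuffer(rel, blkno);
    LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);     /* buffer is now pinned and exclusive-locked */
    Page    page = BufferGetPage(buf);
    /* ... modify the page contents ... */
    MarkBufferDirty(buf);                       /* the actual write to disk happens later */
    UnlockReleaseBuffer(buf);                   /* drop the content lock and the pin */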
@@ -1027,7 +1027,7 @@ MarkBufferDirty(Buffer buffer)
 *
 * Formerly, this saved one cycle of acquiring/releasing the BufMgrLock
 * compared to calling the two routines separately. Now it's mainly just
 * a convenience function. However, if the passed buffer is valid and
 * already contains the desired block, we just return it as-is; and that
 * does save considerable work compared to a full release and reacquire.
 *
@@ -1079,7 +1079,7 @@ ReleaseAndReadBuffer(Buffer buffer,
 * when we first pin it; for other strategies we just make sure the usage_count
 * isn't zero. (The idea of the latter is that we don't want synchronized
 * heap scans to inflate the count, but we need it to not be zero to discourage
 * other backends from stealing buffers from our ring. As long as we cycle
 * through the ring faster than the global clock-sweep cycles, buffers in
 * our ring won't be chosen as victims for replacement by other backends.)
 *
@@ -1087,7 +1087,7 @@ ReleaseAndReadBuffer(Buffer buffer,
 *
 * Note that ResourceOwnerEnlargeBuffers must have been done already.
 *
 * Returns TRUE if buffer is BM_VALID, else FALSE. This provision allows
 * some callers to avoid an extra spinlock cycle.
 */
static bool
@@ -1241,7 +1241,7 @@ BufferSync(int flags)
 * have the flag set.
 *
 * Note that if we fail to write some buffer, we may leave buffers with
 * BM_CHECKPOINT_NEEDED still set. This is OK since any such buffer would
 * certainly need to be written for the next checkpoint attempt, too.
 */
num_to_write = 0;
@@ -1344,7 +1344,7 @@ BufferSync(int flags)
 * This is called periodically by the background writer process.
 *
 * Returns true if it's appropriate for the bgwriter process to go into
 * low-power hibernation mode. (This happens if the strategy clock sweep
 * has been "lapped" and no buffer allocations have occurred recently,
 * or if the bgwriter has been effectively disabled by setting
 * bgwriter_lru_maxpages to 0.)
@@ -2110,7 +2110,7 @@ BufferGetLSNAtomic(Buffer buffer)
 * specified relation fork that have block numbers >= firstDelBlock.
 * (In particular, with firstDelBlock = 0, all pages are removed.)
 * Dirty pages are simply dropped, without bothering to write them
 * out first. Therefore, this is NOT rollback-able, and so should be
 * used only with extreme caution!
 *
 * Currently, this is called only from smgr.c when the underlying file
@@ -2119,7 +2119,7 @@ BufferGetLSNAtomic(Buffer buffer)
 * be deleted momentarily anyway, and there is no point in writing it.
 * It is the responsibility of higher-level code to ensure that the
 * deletion or truncation does not lose any data that could be needed
 * later. It is also the responsibility of higher-level code to ensure
 * that no other process could be trying to load more pages of the
 * relation into buffers.
 *
@@ -2281,9 +2281,9 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
 *
 * This function removes all the buffers in the buffer cache for a
 * particular database. Dirty pages are simply dropped, without
 * bothering to write them out first. This is used when we destroy a
 * database, to avoid trying to flush data to disk when the directory
 * tree no longer exists. Implementation is pretty similar to
 * DropRelFileNodeBuffers() which is for destroying just one relation.
 * --------------------------------------------------------------------
 */

@@ -36,7 +36,7 @@ typedef struct
 */

/*
 * Statistics. These counters should be wide enough that they can't
 * overflow during a single bgwriter cycle.
 */
uint32 completePasses; /* Complete cycles of the clock sweep */
@@ -135,7 +135,7 @@ StrategyGetBuffer(BufferAccessStrategy strategy, bool *lock_held)

/*
 * We count buffer allocation requests so that the bgwriter can estimate
 * the rate of buffer consumption. Note that buffers recycled by a
 * strategy object are intentionally not counted here.
 */
StrategyControl->numBufferAllocs++;
@@ -266,7 +266,7 @@ StrategyFreeBuffer(volatile BufferDesc *buf)
 *
 * In addition, we return the completed-pass count (which is effectively
 * the higher-order bits of nextVictimBuffer) and the count of recent buffer
 * allocs if non-NULL pointers are passed. The alloc count is reset after
 * being read.
 */
int
@@ -291,7 +291,7 @@ StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc)
 * StrategyNotifyBgWriter -- set or clear allocation notification latch
 *
 * If bgwriterLatch isn't NULL, the next invocation of StrategyGetBuffer will
 * set that latch. Pass NULL to clear the pending notification before it
 * happens. This feature is used by the bgwriter process to wake itself up
 * from hibernation, and is not meant for anybody else to use.
 */
@@ -484,7 +484,7 @@ GetBufferFromRing(BufferAccessStrategy strategy)

/*
 * If the slot hasn't been filled yet, tell the caller to allocate a new
 * buffer with the normal allocation strategy. He will then fill this
 * slot by calling AddBufferToRing with the new buffer.
 */
bufnum = strategy->buffers[strategy->current];
@@ -537,7 +537,7 @@ AddBufferToRing(BufferAccessStrategy strategy, volatile BufferDesc *buf)
 *
 * When a nondefault strategy is used, the buffer manager calls this function
 * when it turns out that the buffer selected by StrategyGetBuffer needs to
 * be written out and doing so would require flushing WAL too. This gives us
 * a chance to choose a different victim.
 *
 * Returns true if buffer manager should ask for a new victim, and false

@@ -94,7 +94,7 @@ LocalPrefetchBuffer(SMgrRelation smgr, ForkNumber forkNum,
 * Find or create a local buffer for the given page of the given relation.
 *
 * API is similar to bufmgr.c's BufferAlloc, except that we do not need
 * to do any locking since this is all local. Also, IO_IN_PROGRESS
 * does not get set. Lastly, we support only default access strategy
 * (hence, usage_count is always advanced).
 */
@@ -292,7 +292,7 @@ MarkLocalBufferDirty(Buffer buffer)
 * specified relation that have block numbers >= firstDelBlock.
 * (In particular, with firstDelBlock = 0, all pages are removed.)
 * Dirty pages are simply dropped, without bothering to write them
 * out first. Therefore, this is NOT rollback-able, and so should be
 * used only with extreme caution!
 *
 * See DropRelFileNodeBuffers in bufmgr.c for more notes.
@@ -459,7 +459,7 @@ GetLocalBufferStorage(void)
/*
 * We allocate local buffers in a context of their own, so that the
 * space eaten for them is easily recognizable in MemoryContextStats
 * output. Create the context on first use.
 */
if (LocalBufferContext == NULL)
LocalBufferContext =

@@ -29,7 +29,7 @@
 * that was current at that time.
 *
 * BufFile also supports temporary files that exceed the OS file size limit
 * (by opening multiple fd.c temporary files). This is an essential feature
 * for sorts and hashjoins on large amounts of data.
 *-------------------------------------------------------------------------
 */
@@ -72,7 +72,7 @@ struct BufFile
bool dirty; /* does buffer need to be written? */

/*
 * resowner is the ResourceOwner to use for underlying temp files. (We
 * don't need to remember the memory context we're using explicitly,
 * because after creation we only repalloc our arrays larger.)
 */
@@ -519,7 +519,7 @@ BufFileSeek(BufFile *file, int fileno, off_t offset, int whence)
{
/*
 * Seek is to a point within existing buffer; we can just adjust
 * pos-within-buffer, without flushing buffer. Note this is OK
 * whether reading or writing, but buffer remains dirty if we were
 * writing.
 */

@@ -83,7 +83,7 @@
 * and other code that tries to open files without consulting fd.c. This
 * is the number left free. (While we can be pretty sure we won't get
 * EMFILE, there's never any guarantee that we won't get ENFILE due to
 * other processes chewing up FDs. So it's a bad idea to try to open files
 * without consulting fd.c. Nonetheless we cannot control all code.)
 *
 * Because this is just a fixed setting, we are effectively assuming that
@@ -168,8 +168,8 @@ typedef struct vfd
} Vfd;

/*
 * Virtual File Descriptor array pointer and size. This grows as
 * needed. 'File' values are indexes into this array.
 * Note that VfdCache[0] is not a usable VFD, just a list header.
 */
static Vfd *VfdCache;
@@ -189,7 +189,7 @@ static bool have_xact_temporary_files = false;
/*
 * Tracks the total size of all temporary files. Note: when temp_file_limit
 * is being enforced, this cannot overflow since the limit cannot be more
 * than INT_MAX kilobytes. When not enforcing, it could theoretically
 * overflow, but we don't care.
 */
static uint64 temporary_files_size = 0;
@@ -252,7 +252,7 @@ static int nextTempTableSpace = 0;
 *
 * The Least Recently Used ring is a doubly linked list that begins and
 * ends on element zero. Element zero is special -- it doesn't represent
 * a file and its "fd" field always == VFD_CLOSED. Element zero is just an
 * anchor that shows us the beginning/end of the ring.
 * Only VFD elements that are currently really open (have an FD assigned) are
 * in the Lru ring. Elements that are "virtually" open can be recognized
@@ -473,7 +473,7 @@ InitFileAccess(void)
 * We stop counting if usable_fds reaches max_to_probe. Note: a small
 * value of max_to_probe might result in an underestimate of already_open;
 * we must fill in any "gaps" in the set of used FDs before the calculation
 * of already_open will give the right answer. In practice, max_to_probe
 * of a couple of dozen should be enough to ensure good results.
 *
 * We assume stdin (FD 0) is available for dup'ing
@@ -550,7 +550,7 @@ count_usable_fds(int max_to_probe, int *usable_fds, int *already_open)
pfree(fd);

/*
 * Return results. usable_fds is just the number of successful dups. We
 * assume that the system limit is highestfd+1 (remember 0 is a legal FD
 * number) and so already_open is highestfd+1 - usable_fds.
 */
@@ -1045,7 +1045,7 @@ OpenTemporaryFile(bool interXact)

/*
 * If not, or if tablespace is bad, create in database's default
 * tablespace. MyDatabaseTableSpace should normally be set before we get
 * here, but just in case it isn't, fall back to pg_default tablespace.
 */
if (file <= 0)
@@ -1339,7 +1339,7 @@ FileWrite(File file, char *buffer, int amount)

/*
 * If enforcing temp_file_limit and it's a temp file, check to see if the
 * write would overrun temp_file_limit, and throw error if so. Note: it's
 * really a modularity violation to throw error here; we should set errno
 * and return -1. However, there's no way to report a suitable error
 * message if we do that. All current callers would just throw error
@@ -1618,7 +1618,7 @@ reserveAllocatedDesc(void)
/*
 * Routines that want to use stdio (ie, FILE*) should use AllocateFile
 * rather than plain fopen(). This lets fd.c deal with freeing FDs if
 * necessary to open the file. When done, call FreeFile rather than fclose.
 *
 * Note that files that will be open for any significant length of time
 * should NOT be handled this way, since they cannot share kernel file
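A minimal sketch of the AllocateFile/FreeFile idiom described above, for a short-lived stdio handle (path is a placeholder):

    FILE   *fp = AllocateFile(path, "r");

    if (fp == NULL)
        ereport(ERROR,
                (errcode_for_file_access(),
                 errmsg("could not open file \"%s\": %m", path)));
    /* ... read the file ... */
    FreeFile(fp);               /* use FreeFile, not fclose, so fd.c stays consistent */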
@@ -1923,7 +1923,7 @@ TryAgain:
 * Read a directory opened with AllocateDir, ereport'ing any error.
 *
 * This is easier to use than raw readdir() since it takes care of some
 * otherwise rather tedious and error-prone manipulation of errno. Also,
 * if you are happy with a generic error message for AllocateDir failure,
 * you can just do
 *
@@ -2058,7 +2058,7 @@ SetTempTablespaces(Oid *tableSpaces, int numSpaces)
numTempTableSpaces = numSpaces;

/*
 * Select a random starting point in the list. This is to minimize
 * conflicts between backends that are most likely sharing the same list
 * of temp tablespaces. Note that if we create multiple temp files in the
 * same transaction, we'll advance circularly through the list --- this
@@ -2087,7 +2087,7 @@ TempTablespacesAreSet(void)
/*
 * GetNextTempTableSpace
 *
 * Select the next temp tablespace to use. A result of InvalidOid means
 * to use the current database's default tablespace.
 */
Oid

@@ -48,7 +48,7 @@
 * Range Category
 * 0 - 31 0
 * 32 - 63 1
 * ... ... ...
 * 8096 - 8127 253
 * 8128 - 8163 254
 * 8164 - 8192 255
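The table maps available bytes to one of 256 FSM categories in steps of BLCKSZ/256 = 32 bytes, with the top category reserved for an (almost) empty page. Roughly, as a sketch under those assumptions (max_request_size is a placeholder for the largest request the FSM will track, about 8164 bytes per the table; this is not the literal freespace.c code):

    uint8   cat = avail / 32;           /* 32 = BLCKSZ / 256 */

    if (cat > 254)
        cat = 254;                      /* categories 0..254 cover partially full pages */
    if (avail >= max_request_size)
        cat = 255;                      /* only a nearly empty page earns category 255 */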
@@ -123,7 +123,7 @@ static uint8 fsm_vacuum_page(Relation rel, FSMAddress addr, bool *eof);
 * will turn out to have too little space available by the time the caller
 * gets a lock on it. In that case, the caller should report the actual
 * amount of free space available on that page and then try again (see
 * RecordAndGetPageWithFreeSpace). If InvalidBlockNumber is returned,
 * extend the relation.
 */
BlockNumber

@@ -185,13 +185,13 @@ restart:

/*----------
 * Start the search from the target slot. At every step, move one
 * node to the right, then climb up to the parent. Stop when we reach
 * a node with enough free space (as we must, since the root has enough
 * space).
 *
 * The idea is to gradually expand our "search triangle", that is, all
 * nodes covered by the current node, and to be sure we search to the
 * right from the start point. At the first step, only the target slot
 * is examined. When we move up from a left child to its parent, we are
 * adding the right-hand subtree of that parent to the search triangle.
 * When we move right then up from a right child, we are dropping the

@@ -59,29 +59,29 @@
/* Backend-local tracking for on-detach callbacks. */
typedef struct dsm_segment_detach_callback
{
on_dsm_detach_callback function;
Datum arg;
slist_node node;
} dsm_segment_detach_callback;

/* Backend-local state for a dynamic shared memory segment. */
struct dsm_segment
{
dlist_node node; /* List link in dsm_segment_list. */
ResourceOwner resowner; /* Resource owner. */
dsm_handle handle; /* Segment name. */
uint32 control_slot; /* Slot in control segment. */
void *impl_private; /* Implementation-specific private data. */
void *mapped_address; /* Mapping address, or NULL if unmapped. */
Size mapped_size; /* Size of our mapping. */
slist_head on_detach; /* On-detach callbacks. */
};

/* Shared-memory state for a dynamic shared memory segment. */
typedef struct dsm_control_item
{
dsm_handle handle;
uint32 refcnt; /* 2+ = active, 1 = moribund, 0 = gone */
} dsm_control_item;

/* Layout of the dynamic shared memory control segment. */
@@ -90,7 +90,7 @@ typedef struct dsm_control_header
uint32 magic;
uint32 nitems;
uint32 maxitems;
dsm_control_item item[FLEXIBLE_ARRAY_MEMBER];
} dsm_control_header;

static void dsm_cleanup_for_mmap(void);
@@ -132,7 +132,7 @@ static dlist_head dsm_segment_list = DLIST_STATIC_INIT(dsm_segment_list);
static dsm_handle dsm_control_handle;
static dsm_control_header *dsm_control;
static Size dsm_control_mapped_size = 0;
static void *dsm_control_impl_private = NULL;

/*
 * Start up the dynamic shared memory system.
@@ -166,14 +166,14 @@ dsm_postmaster_startup(PGShmemHeader *shim)
maxitems = PG_DYNSHMEM_FIXED_SLOTS
+ PG_DYNSHMEM_SLOTS_PER_BACKEND * MaxBackends;
elog(DEBUG2, "dynamic shared memory system will support %u segments",
maxitems);
segsize = dsm_control_bytes_needed(maxitems);

/*
 * Loop until we find an unused identifier for the new control segment.
 * We sometimes use 0 as a sentinel value indicating that no control
 * segment is known to exist, so avoid using that value for a real
 * control segment.
 * Loop until we find an unused identifier for the new control segment. We
 * sometimes use 0 as a sentinel value indicating that no control segment
 * is known to exist, so avoid using that value for a real control
 * segment.
 */
for (;;)
{
@@ -224,17 +224,17 @@ dsm_cleanup_using_control_segment(dsm_handle old_control_handle)

/*
 * Try to attach the segment. If this fails, it probably just means that
 * the operating system has been rebooted and the segment no longer exists,
 * or an unrelated proces has used the same shm ID. So just fall out
 * quietly.
 * the operating system has been rebooted and the segment no longer
 * exists, or an unrelated proces has used the same shm ID. So just fall
 * out quietly.
 */
if (!dsm_impl_op(DSM_OP_ATTACH, old_control_handle, 0, &impl_private,
&mapped_address, &mapped_size, DEBUG1))
return;

/*
 * We've managed to reattach it, but the contents might not be sane.
 * If they aren't, we disregard the segment after all.
 * We've managed to reattach it, but the contents might not be sane. If
 * they aren't, we disregard the segment after all.
 */
old_control = (dsm_control_header *) mapped_address;
if (!dsm_control_segment_sane(old_control, mapped_size))
@@ -245,14 +245,14 @@ dsm_cleanup_using_control_segment(dsm_handle old_control_handle)
}

/*
 * OK, the control segment looks basically valid, so we can get use
 * it to get a list of segments that need to be removed.
 * OK, the control segment looks basically valid, so we can get use it to
 * get a list of segments that need to be removed.
 */
nitems = old_control->nitems;
for (i = 0; i < nitems; ++i)
{
dsm_handle handle;
uint32 refcnt;

/* If the reference count is 0, the slot is actually unused. */
refcnt = old_control->item[i].refcnt;
@@ -262,7 +262,7 @@ dsm_cleanup_using_control_segment(dsm_handle old_control_handle)
/* Log debugging information. */
handle = old_control->item[i].handle;
elog(DEBUG2, "cleaning up orphaned dynamic shared memory with ID %u (reference count %u)",
handle, refcnt);

/* Destroy the referenced segment. */
dsm_impl_op(DSM_OP_DESTROY, handle, 0, &junk_impl_private,
@@ -290,7 +290,7 @@ dsm_cleanup_using_control_segment(dsm_handle old_control_handle)
static void
dsm_cleanup_for_mmap(void)
{
DIR *dir;
struct dirent *dent;

/* Open the directory; can't use AllocateDir in postmaster. */
@@ -298,15 +298,16 @@ dsm_cleanup_for_mmap(void)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not open directory \"%s\": %m",
PG_DYNSHMEM_DIR)));

/* Scan for something with a name of the correct format. */
while ((dent = ReadDir(dir, PG_DYNSHMEM_DIR)) != NULL)
{
if (strncmp(dent->d_name, PG_DYNSHMEM_MMAP_FILE_PREFIX,
strlen(PG_DYNSHMEM_MMAP_FILE_PREFIX)) == 0)
{
char buf[MAXPGPATH];

snprintf(buf, MAXPGPATH, PG_DYNSHMEM_DIR "/%s", dent->d_name);

elog(DEBUG2, "removing file \"%s\"", buf);
@@ -314,7 +315,7 @@ dsm_cleanup_for_mmap(void)
/* We found a matching file; so remove it. */
if (unlink(buf) != 0)
{
int save_errno;

save_errno = errno;
closedir(dir);
@@ -352,8 +353,8 @@ dsm_postmaster_shutdown(int code, Datum arg)
 * If some other backend exited uncleanly, it might have corrupted the
 * control segment while it was dying. In that case, we warn and ignore
 * the contents of the control segment. This may end up leaving behind
 * stray shared memory segments, but there's not much we can do about
 * that if the metadata is gone.
 * stray shared memory segments, but there's not much we can do about that
 * if the metadata is gone.
 */
nitems = dsm_control->nitems;
if (!dsm_control_segment_sane(dsm_control, dsm_control_mapped_size))
@@ -375,7 +376,7 @@ dsm_postmaster_shutdown(int code, Datum arg)
/* Log debugging information. */
handle = dsm_control->item[i].handle;
elog(DEBUG2, "cleaning up orphaned dynamic shared memory with ID %u",
handle);

/* Destroy the segment. */
dsm_impl_op(DSM_OP_DESTROY, handle, 0, &junk_impl_private,
@@ -427,7 +428,7 @@ dsm_backend_startup(void)
&dsm_control_mapped_size, WARNING);
ereport(FATAL,
(errcode(ERRCODE_INTERNAL_ERROR),
errmsg("dynamic shared memory control segment is not valid")));
}
}
#endif
@@ -455,9 +456,9 @@ dsm_set_control_handle(dsm_handle h)
dsm_segment *
dsm_create(Size size)
{
dsm_segment *seg = dsm_create_descriptor();
uint32 i;
uint32 nitems;

/* Unsafe in postmaster (and pointless in a stand-alone backend). */
Assert(IsUnderPostmaster);
@@ -524,10 +525,10 @@ dsm_create(Size size)
dsm_segment *
dsm_attach(dsm_handle h)
{
dsm_segment *seg;
dlist_iter iter;
uint32 i;
uint32 nitems;

/* Unsafe in postmaster (and pointless in a stand-alone backend). */
Assert(IsUnderPostmaster);
@@ -537,13 +538,13 @@ dsm_attach(dsm_handle h)

/*
 * Since this is just a debugging cross-check, we could leave it out
 * altogether, or include it only in assert-enabled builds. But since
 * the list of attached segments should normally be very short, let's
 * include it always for right now.
 * altogether, or include it only in assert-enabled builds. But since the
 * list of attached segments should normally be very short, let's include
 * it always for right now.
 *
 * If you're hitting this error, you probably want to attempt to
 * find an existing mapping via dsm_find_mapping() before calling
 * dsm_attach() to create a new one.
 * If you're hitting this error, you probably want to attempt to find an
 * existing mapping via dsm_find_mapping() before calling dsm_attach() to
 * create a new one.
 */
dlist_foreach(iter, &dsm_segment_list)
{
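For orientation, a sketch of how these entry points fit together in a 9.4-era caller (the segment size and the callback name my_cleanup_callback are placeholders; error handling is abbreviated):

    /* Creating backend: */
    dsm_segment *seg = dsm_create(1024 * 1024);         /* create and map a new segment */
    dsm_handle   handle = dsm_segment_handle(seg);      /* hand this value to another backend */

    /* Cooperating backend: */
    dsm_segment *peer = dsm_attach(handle);             /* NULL if the segment no longer exists */
    if (peer == NULL)
        elog(ERROR, "could not attach to dynamic shared memory segment");
    on_dsm_detach(peer, my_cleanup_callback, (Datum) 0);    /* run cleanup when we detach */
    dsm_detach(peer);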
@@ -584,10 +585,10 @@ dsm_attach(dsm_handle h)
LWLockRelease(DynamicSharedMemoryControlLock);

/*
 * If we didn't find the handle we're looking for in the control
 * segment, it probably means that everyone else who had it mapped,
 * including the original creator, died before we got to this point.
 * It's up to the caller to decide what to do about that.
 * If we didn't find the handle we're looking for in the control segment,
 * it probably means that everyone else who had it mapped, including the
 * original creator, died before we got to this point. It's up to the
 * caller to decide what to do about that.
 */
if (seg->control_slot == INVALID_CONTROL_SLOT)
{
@@ -612,7 +613,7 @@ dsm_backend_shutdown(void)
{
while (!dlist_is_empty(&dsm_segment_list))
{
dsm_segment *seg;

seg = dlist_head_element(dsm_segment, node, &dsm_segment_list);
dsm_detach(seg);
@@ -628,11 +629,11 @@ dsm_backend_shutdown(void)
void
dsm_detach_all(void)
{
void *control_address = dsm_control;

while (!dlist_is_empty(&dsm_segment_list))
{
dsm_segment *seg;

seg = dlist_head_element(dsm_segment, node, &dsm_segment_list);
dsm_detach(seg);
@@ -697,7 +698,7 @@ dsm_detach(dsm_segment *seg)
{
slist_node *node;
dsm_segment_detach_callback *cb;
on_dsm_detach_callback function;
Datum arg;

node = slist_pop_head_node(&seg->on_detach);
@@ -710,13 +711,12 @@ dsm_detach(dsm_segment *seg)
}

/*
 * Try to remove the mapping, if one exists. Normally, there will be,
 * but maybe not, if we failed partway through a create or attach
 * operation. We remove the mapping before decrementing the reference
 * count so that the process that sees a zero reference count can be
 * certain that no remaining mappings exist. Even if this fails, we
 * pretend that it works, because retrying is likely to fail in the
 * same way.
 * Try to remove the mapping, if one exists. Normally, there will be, but
 * maybe not, if we failed partway through a create or attach operation.
 * We remove the mapping before decrementing the reference count so that
 * the process that sees a zero reference count can be certain that no
 * remaining mappings exist. Even if this fails, we pretend that it
 * works, because retrying is likely to fail in the same way.
 */
if (seg->mapped_address != NULL)
{
@@ -730,8 +730,8 @@ dsm_detach(dsm_segment *seg)
/* Reduce reference count, if we previously increased it. */
if (seg->control_slot != INVALID_CONTROL_SLOT)
{
uint32 refcnt;
uint32 control_slot = seg->control_slot;

LWLockAcquire(DynamicSharedMemoryControlLock, LW_EXCLUSIVE);
Assert(dsm_control->item[control_slot].handle == seg->handle);
@@ -744,15 +744,15 @@ dsm_detach(dsm_segment *seg)
if (refcnt == 1)
{
/*
 * If we fail to destroy the segment here, or are killed before
 * we finish doing so, the reference count will remain at 1, which
 * If we fail to destroy the segment here, or are killed before we
 * finish doing so, the reference count will remain at 1, which
 * will mean that nobody else can attach to the segment. At
 * postmaster shutdown time, or when a new postmaster is started
 * after a hard kill, another attempt will be made to remove the
 * segment.
 *
 * The main case we're worried about here is being killed by
 * a signal before we can finish removing the segment. In that
 * The main case we're worried about here is being killed by a
 * signal before we can finish removing the segment. In that
 * case, it's important to be sure that the segment still gets
 * removed. If we actually fail to remove the segment for some
 * other reason, the postmaster may not have any better luck than
@@ -827,8 +827,8 @@ dsm_keep_segment(dsm_segment *seg)
dsm_segment *
dsm_find_mapping(dsm_handle h)
{
dlist_iter iter;
dsm_segment *seg;

dlist_foreach(iter, &dsm_segment_list)
{
@@ -899,7 +899,7 @@ void
cancel_on_dsm_detach(dsm_segment *seg, on_dsm_detach_callback function,
Datum arg)
{
slist_mutable_iter iter;

slist_foreach_modify(iter, &seg->on_detach)
{
@@ -921,7 +921,7 @@ cancel_on_dsm_detach(dsm_segment *seg, on_dsm_detach_callback function,
void
reset_on_dsm_detach(void)
{
dlist_iter iter;

dlist_foreach(iter, &dsm_segment_list)
{
@@ -952,7 +952,7 @@ reset_on_dsm_detach(void)
static dsm_segment *
dsm_create_descriptor(void)
{
dsm_segment *seg;

ResourceOwnerEnlargeDSMs(CurrentResourceOwner);

@@ -1005,5 +1005,5 @@ static uint64
dsm_control_bytes_needed(uint32 nitems)
{
return offsetof(dsm_control_header, item)
+ sizeof(dsm_control_item) * (uint64) nitems;
+sizeof(dsm_control_item) * (uint64) nitems;
}

@@ -76,40 +76,40 @@ static bool dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,
#endif
#ifdef USE_DSM_SYSV
static bool dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size,
void **impl_private, void **mapped_address,
Size *mapped_size, int elevel);
#endif
#ifdef USE_DSM_WINDOWS
static bool dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
void **impl_private, void **mapped_address,
Size *mapped_size, int elevel);
#endif
#ifdef USE_DSM_MMAP
static bool dsm_impl_mmap(dsm_op op, dsm_handle handle, Size request_size,
void **impl_private, void **mapped_address,
Size *mapped_size, int elevel);
#endif
static int errcode_for_dynamic_shared_memory(void);

const struct config_enum_entry dynamic_shared_memory_options[] = {
#ifdef USE_DSM_POSIX
{ "posix", DSM_IMPL_POSIX, false},
{"posix", DSM_IMPL_POSIX, false},
#endif
#ifdef USE_DSM_SYSV
{ "sysv", DSM_IMPL_SYSV, false},
{"sysv", DSM_IMPL_SYSV, false},
#endif
#ifdef USE_DSM_WINDOWS
{ "windows", DSM_IMPL_WINDOWS, false},
{"windows", DSM_IMPL_WINDOWS, false},
#endif
#ifdef USE_DSM_MMAP
{ "mmap", DSM_IMPL_MMAP, false},
{"mmap", DSM_IMPL_MMAP, false},
#endif
{ "none", DSM_IMPL_NONE, false},
{"none", DSM_IMPL_NONE, false},
{NULL, 0, false}
};

/* Implementation selector. */
int dynamic_shared_memory_type;

/* Size of buffer to be used for zero-filling. */
#define ZBUFFER_SIZE 8192
@@ -137,20 +137,20 @@ int dynamic_shared_memory_type;
 * segment.
 *
 * Arguments:
 * op: The operation to be performed.
 * handle: The handle of an existing object, or for DSM_OP_CREATE, the
 * a new handle the caller wants created.
 * request_size: For DSM_OP_CREATE, the requested size. For DSM_OP_RESIZE,
 * the new size. Otherwise, 0.
 * impl_private: Private, implementation-specific data. Will be a pointer
 * to NULL for the first operation on a shared memory segment within this
 * backend; thereafter, it will point to the value to which it was set
 * on the previous call.
 * mapped_address: Pointer to start of current mapping; pointer to NULL
 * if none. Updated with new mapping address.
 * mapped_size: Pointer to size of current mapping; pointer to 0 if none.
 * Updated with new mapped size.
 * elevel: Level at which to log errors.
 *
 * Return value: true on success, false on failure. When false is returned,
 * a message should first be logged at the specified elevel, except in the
@@ -165,7 +165,7 @@ dsm_impl_op(dsm_op op, dsm_handle handle, Size request_size,
{
Assert(op == DSM_OP_CREATE || op == DSM_OP_RESIZE || request_size == 0);
Assert((op != DSM_OP_CREATE && op != DSM_OP_ATTACH) ||
(*mapped_address == NULL && *mapped_size == 0));

switch (dynamic_shared_memory_type)
{
@@ -243,10 +243,10 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,
void **impl_private, void **mapped_address, Size *mapped_size,
int elevel)
{
char name[64];
int flags;
int fd;
char *address;

snprintf(name, 64, "/PostgreSQL.%u", handle);

@@ -258,8 +258,8 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,
{
ereport(elevel,
(errcode_for_dynamic_shared_memory(),
errmsg("could not unmap shared memory segment \"%s\": %m",
name)));
return false;
}
*mapped_address = NULL;
@@ -268,8 +268,8 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,
{
ereport(elevel,
(errcode_for_dynamic_shared_memory(),
errmsg("could not remove shared memory segment \"%s\": %m",
name)));
return false;
}
return true;
@@ -290,7 +290,7 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,
ereport(elevel,
(errcode_for_dynamic_shared_memory(),
errmsg("could not open shared memory segment \"%s\": %m",
name)));
return false;
}

@@ -304,7 +304,7 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,

if (fstat(fd, &st) != 0)
{
int save_errno;

/* Back out what's already been done. */
save_errno = errno;
@@ -314,14 +314,14 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,
ereport(elevel,
(errcode_for_dynamic_shared_memory(),
errmsg("could not stat shared memory segment \"%s\": %m",
name)));
return false;
}
request_size = st.st_size;
}
else if (*mapped_size != request_size && ftruncate(fd, request_size))
{
int save_errno;

/* Back out what's already been done. */
save_errno = errno;
@@ -332,8 +332,8 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,

ereport(elevel,
(errcode_for_dynamic_shared_memory(),
errmsg("could not resize shared memory segment %s to %zu bytes: %m",
name, request_size)));
return false;
}

@@ -347,7 +347,7 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,
return true;
if (munmap(*mapped_address, *mapped_size) != 0)
{
int save_errno;

/* Back out what's already been done. */
save_errno = errno;
@@ -358,8 +358,8 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,

ereport(elevel,
(errcode_for_dynamic_shared_memory(),
errmsg("could not unmap shared memory segment \"%s\": %m",
name)));
return false;
}
*mapped_address = NULL;
@@ -367,11 +367,11 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,
}

/* Map it. */
address = mmap(NULL, request_size, PROT_READ|PROT_WRITE,
MAP_SHARED|MAP_HASSEMAPHORE, fd, 0);
address = mmap(NULL, request_size, PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_HASSEMAPHORE, fd, 0);
if (address == MAP_FAILED)
{
int save_errno;

/* Back out what's already been done. */
save_errno = errno;
@@ -409,11 +409,11 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size,
void **impl_private, void **mapped_address, Size *mapped_size,
int elevel)
{
key_t key;
int ident;
char *address;
char name[64];
int *ident_cache;

/* Resize is not supported for System V shared memory. */
if (op == DSM_OP_RESIZE)
@@ -427,38 +427,38 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size,
return true;

/*
 * POSIX shared memory and mmap-based shared memory identify segments
 * with names. To avoid needless error message variation, we use the
 * handle as the name.
 * POSIX shared memory and mmap-based shared memory identify segments with
 * names. To avoid needless error message variation, we use the handle as
 * the name.
 */
snprintf(name, 64, "%u", handle);

/*
 * The System V shared memory namespace is very restricted; names are
 * of type key_t, which is expected to be some sort of integer data type,
 * but not necessarily the same one as dsm_handle. Since we use
 * dsm_handle to identify shared memory segments across processes, this
 * might seem like a problem, but it's really not. If dsm_handle is
 * bigger than key_t, the cast below might truncate away some bits from
 * the handle the user-provided, but it'll truncate exactly the same bits
 * away in exactly the same fashion every time we use that handle, which
 * is all that really matters. Conversely, if dsm_handle is smaller than
 * key_t, we won't use the full range of available key space, but that's
 * no big deal either.
 * The System V shared memory namespace is very restricted; names are of
 * type key_t, which is expected to be some sort of integer data type, but
 * not necessarily the same one as dsm_handle. Since we use dsm_handle to
 * identify shared memory segments across processes, this might seem like
 * a problem, but it's really not. If dsm_handle is bigger than key_t,
 * the cast below might truncate away some bits from the handle the
 * user-provided, but it'll truncate exactly the same bits away in exactly
 * the same fashion every time we use that handle, which is all that
 * really matters. Conversely, if dsm_handle is smaller than key_t, we
 * won't use the full range of available key space, but that's no big deal
 * either.
 *
 * We do make sure that the key isn't negative, because that might not
 * be portable.
 * We do make sure that the key isn't negative, because that might not be
 * portable.
 */
key = (key_t) handle;
if (key < 1) /* avoid compiler warning if type is unsigned */
key = -key;

/*
 * There's one special key, IPC_PRIVATE, which can't be used. If we end
 * up with that value by chance during a create operation, just pretend
 * it already exists, so that caller will retry. If we run into it
 * anywhere else, the caller has passed a handle that doesn't correspond
 * to anything we ever created, which should not happen.
 * up with that value by chance during a create operation, just pretend it
 * already exists, so that caller will retry. If we run into it anywhere
 * else, the caller has passed a handle that doesn't correspond to
 * anything we ever created, which should not happen.
 */
if (key == IPC_PRIVATE)
{
@@ -469,9 +469,9 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size,
}

/*
 * Before we can do anything with a shared memory segment, we have to
 * map the shared memory key to a shared memory identifier using shmget().
 * To avoid repeated lookups, we store the key using impl_private.
 * Before we can do anything with a shared memory segment, we have to map
 * the shared memory key to a shared memory identifier using shmget(). To
 * avoid repeated lookups, we store the key using impl_private.
 */
if (*impl_private != NULL)
{
@@ -480,8 +480,8 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size,
}
else
{
int flags = IPCProtection;
size_t segsize;

/*
 * Allocate the memory BEFORE acquiring the resource, so that we don't
@@ -506,7 +506,8 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size,
{
if (errno != EEXIST)
{
int save_errno = errno;

pfree(ident_cache);
errno = save_errno;
ereport(elevel,
@@ -529,8 +530,8 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size,
{
ereport(elevel,
(errcode_for_dynamic_shared_memory(),
errmsg("could not unmap shared memory segment \"%s\": %m",
name)));
return false;
}
*mapped_address = NULL;
@@ -539,8 +540,8 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size,
{
ereport(elevel,
(errcode_for_dynamic_shared_memory(),
errmsg("could not remove shared memory segment \"%s\": %m",
name)));
return false;
}
return true;
@@ -553,7 +554,7 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size,

if (shmctl(ident, IPC_STAT, &shm) != 0)
{
int save_errno;

/* Back out what's already been done. */
save_errno = errno;
@@ -564,7 +565,7 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size,
ereport(elevel,
(errcode_for_dynamic_shared_memory(),
errmsg("could not stat shared memory segment \"%s\": %m",
name)));
return false;
}
request_size = shm.shm_segsz;
@@ -574,7 +575,7 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size,
address = shmat(ident, NULL, PG_SHMAT_FLAGS);
if (address == (void *) -1)
{
int save_errno;

/* Back out what's already been done. */
save_errno = errno;
@@ -614,9 +615,9 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
void **impl_private, void **mapped_address,
Size *mapped_size, int elevel)
{
char *address;
HANDLE hmap;
char name[64];
MEMORY_BASIC_INFORMATION info;

/* Resize is not supported for Windows shared memory. */
@@ -631,12 +632,12 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
return true;

/*
 * Storing the shared memory segment in the Global\ namespace, can
 * allow any process running in any session to access that file
 * mapping object provided that the caller has the required access rights.
 * But to avoid issues faced in main shared memory, we are using the naming
 * convention similar to main shared memory. We can change here once
 * issue mentioned in GetSharedMemName is resolved.
 * Storing the shared memory segment in the Global\ namespace, can allow
 * any process running in any session to access that file mapping object
 * provided that the caller has the required access rights. But to avoid
 * issues faced in main shared memory, we are using the naming convention
 * similar to main shared memory. We can change here once issue mentioned
 * in GetSharedMemName is resolved.
 */
snprintf(name, 64, "%s.%u", SEGMENT_NAME_PREFIX, handle);

@@ -652,8 +653,8 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
_dosmaperr(GetLastError());
ereport(elevel,
(errcode_for_dynamic_shared_memory(),
errmsg("could not unmap shared memory segment \"%s\": %m",
name)));
return false;
}
if (*impl_private != NULL
@@ -662,8 +663,8 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
_dosmaperr(GetLastError());
ereport(elevel,
(errcode_for_dynamic_shared_memory(),
errmsg("could not remove shared memory segment \"%s\": %m",
name)));
return false;
}

@@ -688,9 +689,9 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
size_low = (DWORD) request_size;

hmap = CreateFileMapping(INVALID_HANDLE_VALUE, /* Use the pagefile */
NULL, /* Default security attrs */
PAGE_READWRITE, /* Memory is read/write */
size_high, /* Upper 32 bits of size */
size_low, /* Lower 32 bits of size */
name);
if (!hmap)
@@ -698,8 +699,8 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
_dosmaperr(GetLastError());
ereport(elevel,
(errcode_for_dynamic_shared_memory(),
errmsg("could not create shared memory segment \"%s\": %m",
name)));
return false;
}
_dosmaperr(GetLastError());
@@ -718,8 +719,8 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
else
{
hmap = OpenFileMapping(FILE_MAP_WRITE | FILE_MAP_READ,
FALSE, /* do not inherit the name */
name); /* name of mapping object */
if (!hmap)
{
_dosmaperr(GetLastError());
@@ -736,7 +737,7 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
0, 0, 0);
if (!address)
{
int save_errno;

_dosmaperr(GetLastError());
/* Back out what's already been done. */
@@ -752,14 +753,14 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
||||
@@ -752,14 +753,14 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
|
||||
}
|
||||
|
||||
/*
|
||||
* VirtualQuery gives size in page_size units, which is 4K for Windows.
|
||||
* We need size only when we are attaching, but it's better to get the
|
||||
* size when creating new segment to keep size consistent both for
|
||||
* VirtualQuery gives size in page_size units, which is 4K for Windows. We
|
||||
* need size only when we are attaching, but it's better to get the size
|
||||
* when creating new segment to keep size consistent both for
|
||||
* DSM_OP_CREATE and DSM_OP_ATTACH.
|
||||
*/
|
||||
if (VirtualQuery(address, &info, sizeof(info)) == 0)
|
||||
{
|
||||
int save_errno;
|
||||
int save_errno;
|
||||
|
||||
_dosmaperr(GetLastError());
|
||||
/* Back out what's already been done. */
|
||||
@@ -770,8 +771,8 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
|
||||
|
||||
ereport(elevel,
|
||||
(errcode_for_dynamic_shared_memory(),
|
||||
errmsg("could not stat shared memory segment \"%s\": %m",
|
||||
name)));
|
||||
errmsg("could not stat shared memory segment \"%s\": %m",
|
||||
name)));
|
||||
return false;
|
||||
}
|
||||
|
||||
@@ -799,13 +800,13 @@ dsm_impl_mmap(dsm_op op, dsm_handle handle, Size request_size,
|
||||
void **impl_private, void **mapped_address, Size *mapped_size,
|
||||
int elevel)
|
||||
{
|
||||
char name[64];
|
||||
int flags;
|
||||
int fd;
|
||||
char *address;
|
||||
char name[64];
|
||||
int flags;
|
||||
int fd;
|
||||
char *address;
|
||||
|
||||
snprintf(name, 64, PG_DYNSHMEM_DIR "/" PG_DYNSHMEM_MMAP_FILE_PREFIX "%u",
|
||||
handle);
|
||||
handle);
|
||||
|
||||
/* Handle teardown cases. */
|
||||
if (op == DSM_OP_DETACH || op == DSM_OP_DESTROY)
|
||||
@@ -815,8 +816,8 @@ dsm_impl_mmap(dsm_op op, dsm_handle handle, Size request_size,
|
||||
{
|
||||
ereport(elevel,
|
||||
(errcode_for_dynamic_shared_memory(),
|
||||
errmsg("could not unmap shared memory segment \"%s\": %m",
|
||||
name)));
|
||||
errmsg("could not unmap shared memory segment \"%s\": %m",
|
||||
name)));
|
||||
return false;
|
||||
}
|
||||
*mapped_address = NULL;
|
||||
@@ -825,8 +826,8 @@ dsm_impl_mmap(dsm_op op, dsm_handle handle, Size request_size,
|
||||
{
|
||||
ereport(elevel,
|
||||
(errcode_for_dynamic_shared_memory(),
|
||||
errmsg("could not remove shared memory segment \"%s\": %m",
|
||||
name)));
|
||||
errmsg("could not remove shared memory segment \"%s\": %m",
|
||||
name)));
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
@@ -840,7 +841,7 @@ dsm_impl_mmap(dsm_op op, dsm_handle handle, Size request_size,
|
||||
ereport(elevel,
|
||||
(errcode_for_dynamic_shared_memory(),
|
||||
errmsg("could not open shared memory segment \"%s\": %m",
|
||||
name)));
|
||||
name)));
|
||||
return false;
|
||||
}
|
||||
|
||||
@@ -854,7 +855,7 @@ dsm_impl_mmap(dsm_op op, dsm_handle handle, Size request_size,
|
||||
|
||||
if (fstat(fd, &st) != 0)
|
||||
{
|
||||
int save_errno;
|
||||
int save_errno;
|
||||
|
||||
/* Back out what's already been done. */
|
||||
save_errno = errno;
|
||||
@@ -864,14 +865,14 @@ dsm_impl_mmap(dsm_op op, dsm_handle handle, Size request_size,
|
||||
ereport(elevel,
|
||||
(errcode_for_dynamic_shared_memory(),
|
||||
errmsg("could not stat shared memory segment \"%s\": %m",
|
||||
name)));
|
||||
name)));
|
||||
return false;
|
||||
}
|
||||
request_size = st.st_size;
|
||||
}
|
||||
else if (*mapped_size > request_size && ftruncate(fd, request_size))
|
||||
{
|
||||
int save_errno;
|
||||
int save_errno;
|
||||
|
||||
/* Back out what's already been done. */
|
||||
save_errno = errno;
|
||||
@@ -882,8 +883,8 @@ dsm_impl_mmap(dsm_op op, dsm_handle handle, Size request_size,
|
||||
|
||||
ereport(elevel,
|
||||
(errcode_for_dynamic_shared_memory(),
|
||||
errmsg("could not resize shared memory segment %s to %zu bytes: %m",
|
||||
name, request_size)));
|
||||
errmsg("could not resize shared memory segment %s to %zu bytes: %m",
|
||||
name, request_size)));
|
||||
return false;
|
||||
}
|
||||
else if (*mapped_size < request_size)
|
||||
@@ -891,23 +892,23 @@ dsm_impl_mmap(dsm_op op, dsm_handle handle, Size request_size,
/*
* Allocate a buffer full of zeros.
*
* Note: palloc zbuffer, instead of just using a local char array,
* to ensure it is reasonably well-aligned; this may save a few
* cycles transferring data to the kernel.
* Note: palloc zbuffer, instead of just using a local char array, to
* ensure it is reasonably well-aligned; this may save a few cycles
* transferring data to the kernel.
*/
char *zbuffer = (char *) palloc0(ZBUFFER_SIZE);
uint32 remaining = request_size;
bool success = true;
char *zbuffer = (char *) palloc0(ZBUFFER_SIZE);
uint32 remaining = request_size;
bool success = true;

/*
* Zero-fill the file. We have to do this the hard way to ensure
* that all the file space has really been allocated, so that we
* don't later seg fault when accessing the memory mapping. This
* is pretty pessimal.
* Zero-fill the file. We have to do this the hard way to ensure that
* all the file space has really been allocated, so that we don't
* later seg fault when accessing the memory mapping. This is pretty
* pessimal.
*/
while (success && remaining > 0)
{
Size goal = remaining;
Size goal = remaining;

if (goal > ZBUFFER_SIZE)
goal = ZBUFFER_SIZE;
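[Illustrative aside, not part of the commit: the hunk above zero-fills the backing file in fixed-size chunks so that every block is really allocated before the mapping is used. A hedged standalone sketch of the same idea using plain write(); zero_fill and ZBUF_SIZE are invented names, and the real code reports failures through ereport() rather than a boolean.]

#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

#define ZBUF_SIZE 8192              /* illustrative chunk size */

/*
 * Write 'remaining' zero bytes to fd in fixed-size chunks, handling short
 * writes, so the filesystem allocates every block before the file is mmap'd.
 */
static bool
zero_fill(int fd, size_t remaining)
{
    static char zbuffer[ZBUF_SIZE];     /* static storage is already zeroed */

    while (remaining > 0)
    {
        size_t      goal = remaining < ZBUF_SIZE ? remaining : ZBUF_SIZE;
        ssize_t     written = write(fd, zbuffer, goal);

        if (written <= 0)
            return false;               /* caller can inspect errno */
        remaining -= (size_t) written;
    }
    return true;
}

int
main(void)
{
    int         fd = open("/tmp/zerofill_demo", O_RDWR | O_CREAT | O_TRUNC, 0600);

    return (fd >= 0 && zero_fill(fd, 100000)) ? 0 : 1;
}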
@@ -919,7 +920,7 @@ dsm_impl_mmap(dsm_op op, dsm_handle handle, Size request_size,
|
||||
|
||||
if (!success)
|
||||
{
|
||||
int save_errno;
|
||||
int save_errno;
|
||||
|
||||
/* Back out what's already been done. */
|
||||
save_errno = errno;
|
||||
@@ -931,7 +932,7 @@ dsm_impl_mmap(dsm_op op, dsm_handle handle, Size request_size,
|
||||
ereport(elevel,
|
||||
(errcode_for_dynamic_shared_memory(),
|
||||
errmsg("could not resize shared memory segment %s to %zu bytes: %m",
|
||||
name, request_size)));
|
||||
name, request_size)));
|
||||
return false;
|
||||
}
|
||||
}
|
||||
@@ -946,7 +947,7 @@ dsm_impl_mmap(dsm_op op, dsm_handle handle, Size request_size,
|
||||
return true;
|
||||
if (munmap(*mapped_address, *mapped_size) != 0)
|
||||
{
|
||||
int save_errno;
|
||||
int save_errno;
|
||||
|
||||
/* Back out what's already been done. */
|
||||
save_errno = errno;
|
||||
@@ -957,8 +958,8 @@ dsm_impl_mmap(dsm_op op, dsm_handle handle, Size request_size,
|
||||
|
||||
ereport(elevel,
|
||||
(errcode_for_dynamic_shared_memory(),
|
||||
errmsg("could not unmap shared memory segment \"%s\": %m",
|
||||
name)));
|
||||
errmsg("could not unmap shared memory segment \"%s\": %m",
|
||||
name)));
|
||||
return false;
|
||||
}
|
||||
*mapped_address = NULL;
|
||||
@@ -966,11 +967,11 @@ dsm_impl_mmap(dsm_op op, dsm_handle handle, Size request_size,
|
||||
}
|
||||
|
||||
/* Map it. */
|
||||
address = mmap(NULL, request_size, PROT_READ|PROT_WRITE,
|
||||
MAP_SHARED|MAP_HASSEMAPHORE, fd, 0);
|
||||
address = mmap(NULL, request_size, PROT_READ | PROT_WRITE,
|
||||
MAP_SHARED | MAP_HASSEMAPHORE, fd, 0);
|
||||
if (address == MAP_FAILED)
|
||||
{
|
||||
int save_errno;
|
||||
int save_errno;
|
||||
|
||||
/* Back out what's already been done. */
|
||||
save_errno = errno;
|
||||
@@ -1009,24 +1010,24 @@ dsm_impl_keep_segment(dsm_handle handle, void *impl_private)
|
||||
{
|
||||
#ifdef USE_DSM_WINDOWS
|
||||
case DSM_IMPL_WINDOWS:
|
||||
{
|
||||
HANDLE hmap;
|
||||
|
||||
if (!DuplicateHandle(GetCurrentProcess(), impl_private,
|
||||
PostmasterHandle, &hmap, 0, FALSE,
|
||||
DUPLICATE_SAME_ACCESS))
|
||||
{
|
||||
char name[64];
|
||||
HANDLE hmap;
|
||||
|
||||
snprintf(name, 64, "%s.%u", SEGMENT_NAME_PREFIX, handle);
|
||||
_dosmaperr(GetLastError());
|
||||
ereport(ERROR,
|
||||
(errcode_for_dynamic_shared_memory(),
|
||||
errmsg("could not duplicate handle for \"%s\": %m",
|
||||
name)));
|
||||
if (!DuplicateHandle(GetCurrentProcess(), impl_private,
|
||||
PostmasterHandle, &hmap, 0, FALSE,
|
||||
DUPLICATE_SAME_ACCESS))
|
||||
{
|
||||
char name[64];
|
||||
|
||||
snprintf(name, 64, "%s.%u", SEGMENT_NAME_PREFIX, handle);
|
||||
_dosmaperr(GetLastError());
|
||||
ereport(ERROR,
|
||||
(errcode_for_dynamic_shared_memory(),
|
||||
errmsg("could not duplicate handle for \"%s\": %m",
|
||||
name)));
|
||||
}
|
||||
break;
|
||||
}
|
||||
break;
|
||||
}
|
||||
#endif
|
||||
default:
|
||||
break;
|
||||
|
||||
@@ -4,7 +4,7 @@
* POSTGRES inter-process communication definitions.
*
* This file is misnamed, as it no longer has much of anything directly
* to do with IPC. The functionality here is concerned with managing
* to do with IPC. The functionality here is concerned with managing
* exit-time cleanup for either a postmaster or a backend.
*
*
@@ -90,7 +90,7 @@ static int on_proc_exit_index,
* -cim 2/6/90
*
* Unfortunately, we can't really guarantee that add-on code
* obeys the rule of not calling exit() directly. So, while
* obeys the rule of not calling exit() directly. So, while
* this is the preferred way out of the system, we also register
* an atexit callback that will make sure cleanup happens.
* ----------------------------------------------------------------
@@ -109,7 +109,7 @@ proc_exit(int code)
|
||||
* fixed file name, each backend will overwrite earlier profiles. To
|
||||
* fix that, we create a separate subdirectory for each backend
|
||||
* (./gprof/pid) and 'cd' to that subdirectory before we exit() - that
|
||||
* forces mcleanup() to write each profile into its own directory. We
|
||||
* forces mcleanup() to write each profile into its own directory. We
|
||||
* end up with something like: $PGDATA/gprof/8829/gmon.out
|
||||
* $PGDATA/gprof/8845/gmon.out ...
|
||||
*
|
||||
@@ -219,16 +219,16 @@ shmem_exit(int code)
/*
* Call before_shmem_exit callbacks.
*
* These should be things that need most of the system to still be
* up and working, such as cleanup of temp relations, which requires
* catalog access; or things that need to be completed because later
* cleanup steps depend on them, such as releasing lwlocks.
* These should be things that need most of the system to still be up and
* working, such as cleanup of temp relations, which requires catalog
* access; or things that need to be completed because later cleanup steps
* depend on them, such as releasing lwlocks.
*/
elog(DEBUG3, "shmem_exit(%d): %d before_shmem_exit callbacks to make",
code, before_shmem_exit_index);
while (--before_shmem_exit_index >= 0)
(*before_shmem_exit_list[before_shmem_exit_index].function) (code,
before_shmem_exit_list[before_shmem_exit_index].arg);
before_shmem_exit_list[before_shmem_exit_index].arg);
before_shmem_exit_index = 0;
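[Illustrative aside, not part of the commit: the hunk above walks a callback array backwards, so the most recently registered before_shmem_exit callback runs first. A toy LIFO registry showing the same invocation order; every name in this sketch is invented for illustration.]

#include <stdio.h>

typedef void (*exit_callback) (int code, void *arg);

#define MAX_EXIT_CALLBACKS 20

static struct
{
    exit_callback func;
    void       *arg;
}           callbacks[MAX_EXIT_CALLBACKS];
static int  callback_index = 0;

/* Register a callback; the newest registration runs first (LIFO). */
static void
register_exit_callback(exit_callback func, void *arg)
{
    if (callback_index < MAX_EXIT_CALLBACKS)
    {
        callbacks[callback_index].func = func;
        callbacks[callback_index].arg = arg;
        callback_index++;
    }
}

/* Run and unregister the callbacks in reverse order of registration. */
static void
run_exit_callbacks(int code)
{
    while (--callback_index >= 0)
        callbacks[callback_index].func(code, callbacks[callback_index].arg);
    callback_index = 0;
}

static void
say(int code, void *arg)
{
    printf("exiting (%d): %s\n", code, (const char *) arg);
}

int
main(void)
{
    register_exit_callback(say, (void *) "first registered, runs last");
    register_exit_callback(say, (void *) "last registered, runs first");
    run_exit_callbacks(0);
    return 0;
}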
|
||||
/*
|
||||
@@ -241,9 +241,9 @@ shmem_exit(int code)
|
||||
* callback before invoking it, so that we don't get stuck in an infinite
|
||||
* loop if one of those callbacks itself throws an ERROR or FATAL.
|
||||
*
|
||||
* Note that explicitly calling this function here is quite different
|
||||
* from registering it as an on_shmem_exit callback for precisely this
|
||||
* reason: if one dynamic shared memory callback errors out, the remaining
|
||||
* Note that explicitly calling this function here is quite different from
|
||||
* registering it as an on_shmem_exit callback for precisely this reason:
|
||||
* if one dynamic shared memory callback errors out, the remaining
|
||||
* callbacks will still be invoked. Thus, hard-coding this call puts it
|
||||
* equal footing with callbacks for the main shared memory segment.
|
||||
*/
|
||||
@@ -261,7 +261,7 @@ shmem_exit(int code)
|
||||
code, on_shmem_exit_index);
|
||||
while (--on_shmem_exit_index >= 0)
|
||||
(*on_shmem_exit_list[on_shmem_exit_index].function) (code,
|
||||
on_shmem_exit_list[on_shmem_exit_index].arg);
|
||||
on_shmem_exit_list[on_shmem_exit_index].arg);
|
||||
on_shmem_exit_index = 0;
|
||||
}
|
||||
|
||||
@@ -287,7 +287,7 @@ atexit_callback(void)
|
||||
* on_proc_exit
|
||||
*
|
||||
* this function adds a callback function to the list of
|
||||
* functions invoked by proc_exit(). -cim 2/6/90
|
||||
* functions invoked by proc_exit(). -cim 2/6/90
|
||||
* ----------------------------------------------------------------
|
||||
*/
|
||||
void
|
||||
@@ -380,7 +380,7 @@ cancel_before_shmem_exit(pg_on_exit_callback function, Datum arg)
|
||||
{
|
||||
if (before_shmem_exit_index > 0 &&
|
||||
before_shmem_exit_list[before_shmem_exit_index - 1].function
|
||||
== function &&
|
||||
== function &&
|
||||
before_shmem_exit_list[before_shmem_exit_index - 1].arg == arg)
|
||||
--before_shmem_exit_index;
|
||||
}
|
||||
|
||||
@@ -55,7 +55,7 @@ static bool addin_request_allowed = true;
|
||||
* a loadable module.
|
||||
*
|
||||
* This is only useful if called from the _PG_init hook of a library that
|
||||
* is loaded into the postmaster via shared_preload_libraries. Once
|
||||
* is loaded into the postmaster via shared_preload_libraries. Once
|
||||
* shared memory has been allocated, calls will be ignored. (We could
|
||||
* raise an error, but it seems better to make it a no-op, so that
|
||||
* libraries containing such calls can be reloaded if needed.)
|
||||
@@ -85,7 +85,7 @@ RequestAddinShmemSpace(Size size)
|
||||
* This is a bit code-wasteful and could be cleaned up.)
|
||||
*
|
||||
* If "makePrivate" is true then we only need private memory, not shared
|
||||
* memory. This is true for a standalone backend, false for a postmaster.
|
||||
* memory. This is true for a standalone backend, false for a postmaster.
|
||||
*/
|
||||
void
|
||||
CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
|
||||
|
||||
@@ -26,9 +26,9 @@
|
||||
|
||||
/*
|
||||
* The postmaster is signaled by its children by sending SIGUSR1. The
|
||||
* specific reason is communicated via flags in shared memory. We keep
|
||||
* specific reason is communicated via flags in shared memory. We keep
|
||||
* a boolean flag for each possible "reason", so that different reasons
|
||||
* can be signaled by different backends at the same time. (However,
|
||||
* can be signaled by different backends at the same time. (However,
|
||||
* if the same reason is signaled more than once simultaneously, the
|
||||
* postmaster will observe it only once.)
|
||||
*
|
||||
@@ -42,7 +42,7 @@
|
||||
* have three possible states: UNUSED, ASSIGNED, ACTIVE. An UNUSED slot is
|
||||
* available for assignment. An ASSIGNED slot is associated with a postmaster
|
||||
* child process, but either the process has not touched shared memory yet,
|
||||
* or it has successfully cleaned up after itself. A ACTIVE slot means the
|
||||
* or it has successfully cleaned up after itself. A ACTIVE slot means the
|
||||
* process is actively using shared memory. The slots are assigned to
|
||||
* child processes at random, and postmaster.c is responsible for tracking
|
||||
* which one goes with which PID.
|
||||
|
||||
@@ -19,11 +19,11 @@
*
* During hot standby, we also keep a list of XIDs representing transactions
* that are known to be running in the master (or more precisely, were running
* as of the current point in the WAL stream). This list is kept in the
* as of the current point in the WAL stream). This list is kept in the
* KnownAssignedXids array, and is updated by watching the sequence of
* arriving XIDs. This is necessary because if we leave those XIDs out of
* snapshots taken for standby queries, then they will appear to be already
* complete, leading to MVCC failures. Note that in hot standby, the PGPROC
* complete, leading to MVCC failures. Note that in hot standby, the PGPROC
* array represents standby processes, which by definition are not running
* transactions that have XIDs.
*
@@ -276,7 +276,7 @@ ProcArrayAdd(PGPROC *proc)
|
||||
if (arrayP->numProcs >= arrayP->maxProcs)
|
||||
{
|
||||
/*
|
||||
* Ooops, no room. (This really shouldn't happen, since there is a
|
||||
* Ooops, no room. (This really shouldn't happen, since there is a
|
||||
* fixed supply of PGPROC structs too, and so we should have failed
|
||||
* earlier.)
|
||||
*/
|
||||
@@ -686,7 +686,7 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
|
||||
ExtendSUBTRANS(latestObservedXid);
|
||||
TransactionIdAdvance(latestObservedXid);
|
||||
}
|
||||
TransactionIdRetreat(latestObservedXid); /* = running->nextXid - 1 */
|
||||
TransactionIdRetreat(latestObservedXid); /* = running->nextXid - 1 */
|
||||
|
||||
/* ----------
|
||||
* Now we've got the running xids we need to set the global values that
|
||||
@@ -733,7 +733,7 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
|
||||
* ShmemVariableCache->nextXid must be beyond any observed xid.
|
||||
*
|
||||
* We don't expect anyone else to modify nextXid, hence we don't need to
|
||||
* hold a lock while examining it. We still acquire the lock to modify
|
||||
* hold a lock while examining it. We still acquire the lock to modify
|
||||
* it, though.
|
||||
*/
|
||||
nextXid = latestObservedXid;
|
||||
@@ -1485,7 +1485,7 @@ GetSnapshotData(Snapshot snapshot)
|
||||
* do that much work while holding the ProcArrayLock.
|
||||
*
|
||||
* The other backend can add more subxids concurrently, but cannot
|
||||
* remove any. Hence it's important to fetch nxids just once.
|
||||
* remove any. Hence it's important to fetch nxids just once.
|
||||
* Should be safe to use memcpy, though. (We needn't worry about
|
||||
* missing any xids added concurrently, because they must postdate
|
||||
* xmax.)
|
||||
@@ -2153,7 +2153,7 @@ BackendPidGetProc(int pid)
|
||||
* Only main transaction Ids are considered. This function is mainly
|
||||
* useful for determining what backend owns a lock.
|
||||
*
|
||||
* Beware that not every xact has an XID assigned. However, as long as you
|
||||
* Beware that not every xact has an XID assigned. However, as long as you
|
||||
* only call this using an XID found on disk, you're safe.
|
||||
*/
|
||||
int
|
||||
@@ -2217,7 +2217,7 @@ IsBackendPid(int pid)
|
||||
* some snapshot we have. Since we examine the procarray with only shared
|
||||
* lock, there are race conditions: a backend could set its xmin just after
|
||||
* we look. Indeed, on multiprocessors with weak memory ordering, the
|
||||
* other backend could have set its xmin *before* we look. We know however
|
||||
* other backend could have set its xmin *before* we look. We know however
|
||||
* that such a backend must have held shared ProcArrayLock overlapping our
|
||||
* own hold of ProcArrayLock, else we would see its xmin update. Therefore,
|
||||
* any snapshot the other backend is taking concurrently with our scan cannot
|
||||
@@ -2723,7 +2723,7 @@ ProcArrayGetReplicationSlotXmin(TransactionId *xmin,
|
||||
* XidCacheRemoveRunningXids
|
||||
*
|
||||
* Remove a bunch of TransactionIds from the list of known-running
|
||||
* subtransactions for my backend. Both the specified xid and those in
|
||||
* subtransactions for my backend. Both the specified xid and those in
|
||||
* the xids[] array (of length nxids) are removed from the subxids cache.
|
||||
* latestXid must be the latest XID among the group.
|
||||
*/
|
||||
@@ -2829,7 +2829,7 @@ DisplayXidCache(void)
|
||||
* treated as running by standby transactions, even though they are not in
|
||||
* the standby server's PGXACT array.
|
||||
*
|
||||
* We record all XIDs that we know have been assigned. That includes all the
|
||||
* We record all XIDs that we know have been assigned. That includes all the
|
||||
* XIDs seen in WAL records, plus all unobserved XIDs that we can deduce have
|
||||
* been assigned. We can deduce the existence of unobserved XIDs because we
|
||||
* know XIDs are assigned in sequence, with no gaps. The KnownAssignedXids
|
||||
@@ -2838,7 +2838,7 @@ DisplayXidCache(void)
|
||||
*
|
||||
* During hot standby we do not fret too much about the distinction between
|
||||
* top-level XIDs and subtransaction XIDs. We store both together in the
|
||||
* KnownAssignedXids list. In backends, this is copied into snapshots in
|
||||
* KnownAssignedXids list. In backends, this is copied into snapshots in
|
||||
* GetSnapshotData(), taking advantage of the fact that XidInMVCCSnapshot()
|
||||
* doesn't care about the distinction either. Subtransaction XIDs are
|
||||
* effectively treated as top-level XIDs and in the typical case pg_subtrans
|
||||
@@ -3053,14 +3053,14 @@ ExpireOldKnownAssignedTransactionIds(TransactionId xid)
|
||||
* must hold shared ProcArrayLock to examine the array. To remove XIDs from
|
||||
* the array, the startup process must hold ProcArrayLock exclusively, for
|
||||
* the usual transactional reasons (compare commit/abort of a transaction
|
||||
* during normal running). Compressing unused entries out of the array
|
||||
* during normal running). Compressing unused entries out of the array
|
||||
* likewise requires exclusive lock. To add XIDs to the array, we just insert
|
||||
* them into slots to the right of the head pointer and then advance the head
|
||||
* pointer. This wouldn't require any lock at all, except that on machines
|
||||
* with weak memory ordering we need to be careful that other processors
|
||||
* see the array element changes before they see the head pointer change.
|
||||
* We handle this by using a spinlock to protect reads and writes of the
|
||||
* head/tail pointers. (We could dispense with the spinlock if we were to
|
||||
* head/tail pointers. (We could dispense with the spinlock if we were to
|
||||
* create suitable memory access barrier primitives and use those instead.)
|
||||
* The spinlock must be taken to read or write the head/tail pointers unless
|
||||
* the caller holds ProcArrayLock exclusively.
|
||||
@@ -3157,7 +3157,7 @@ KnownAssignedXidsCompress(bool force)
|
||||
* If exclusive_lock is true then caller already holds ProcArrayLock in
|
||||
* exclusive mode, so we need no extra locking here. Else caller holds no
|
||||
* lock, so we need to be sure we maintain sufficient interlocks against
|
||||
* concurrent readers. (Only the startup process ever calls this, so no need
|
||||
* concurrent readers. (Only the startup process ever calls this, so no need
|
||||
* to worry about concurrent writers.)
|
||||
*/
|
||||
static void
|
||||
@@ -3203,7 +3203,7 @@ KnownAssignedXidsAdd(TransactionId from_xid, TransactionId to_xid,
|
||||
Assert(tail >= 0 && tail < pArray->maxKnownAssignedXids);
|
||||
|
||||
/*
|
||||
* Verify that insertions occur in TransactionId sequence. Note that even
|
||||
* Verify that insertions occur in TransactionId sequence. Note that even
|
||||
* if the last existing element is marked invalid, it must still have a
|
||||
* correctly sequenced XID value.
|
||||
*/
|
||||
@@ -3306,7 +3306,7 @@ KnownAssignedXidsSearch(TransactionId xid, bool remove)
}

/*
* Standard binary search. Note we can ignore the KnownAssignedXidsValid
* Standard binary search. Note we can ignore the KnownAssignedXidsValid
* array here, since even invalid entries will contain sorted XIDs.
*/
first = tail;
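[Illustrative aside, not part of the commit: the comment above relies on the fact that even logically invalid entries keep correctly ordered XID values, so a plain binary search over the array is safe. A minimal sketch of such a search follows; it uses ordinary integer comparison, whereas the server compares XIDs with its own wraparound-aware rules, and all names here are invented.]

#include <stdint.h>
#include <stdio.h>

typedef uint32_t TransactionId;     /* stand-in for the real typedef */

/*
 * Return the index of xid in a sorted array, or -1 if it is not present.
 * Search runs over the half-open range [first, last).
 */
static int
xid_binary_search(const TransactionId *xids, int first, int last,
                  TransactionId xid)
{
    while (first < last)
    {
        int         mid = first + (last - first) / 2;

        if (xids[mid] == xid)
            return mid;
        else if (xids[mid] < xid)
            first = mid + 1;
        else
            last = mid;
    }
    return -1;
}

int
main(void)
{
    TransactionId a[] = {100, 105, 110, 120, 121};

    printf("%d\n", xid_binary_search(a, 0, 5, 110));    /* prints 2 */
    return 0;
}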
|
||||
@@ -64,7 +64,7 @@ typedef struct
|
||||
* Spurious wakeups must be expected. Make sure that the flag is cleared
|
||||
* in the error path.
|
||||
*/
|
||||
bool set_latch_on_sigusr1;
|
||||
bool set_latch_on_sigusr1;
|
||||
|
||||
static ProcSignalSlot *ProcSignalSlots = NULL;
|
||||
static volatile ProcSignalSlot *MyProcSignalSlot = NULL;
|
||||
|
||||
@@ -142,7 +142,7 @@ static shm_mq_result shm_mq_send_bytes(shm_mq_handle *mq, Size nbytes,
void *data, bool nowait, Size *bytes_written);
static shm_mq_result shm_mq_receive_bytes(shm_mq *mq, Size bytes_needed,
bool nowait, Size *nbytesp, void **datap);
static bool shm_mq_wait_internal(volatile shm_mq *mq, PGPROC * volatile *ptr,
static bool shm_mq_wait_internal(volatile shm_mq *mq, PGPROC *volatile * ptr,
BackgroundWorkerHandle *handle);
static uint64 shm_mq_get_bytes_read(volatile shm_mq *mq, bool *detached);
static void shm_mq_inc_bytes_read(volatile shm_mq *mq, Size n);
@@ -152,8 +152,8 @@ static shm_mq_result shm_mq_notify_receiver(volatile shm_mq *mq);
static void shm_mq_detach_callback(dsm_segment *seg, Datum arg);

/* Minimum queue size is enough for header and at least one chunk of data. */
const Size shm_mq_minimum_size =
MAXALIGN(offsetof(shm_mq, mq_ring)) + MAXIMUM_ALIGNOF;
const Size shm_mq_minimum_size =
MAXALIGN(offsetof(shm_mq, mq_ring)) + MAXIMUM_ALIGNOF;

#define MQH_INITIAL_BUFSIZE 8192
|
||||
@@ -193,7 +193,7 @@ void
|
||||
shm_mq_set_receiver(shm_mq *mq, PGPROC *proc)
|
||||
{
|
||||
volatile shm_mq *vmq = mq;
|
||||
PGPROC *sender;
|
||||
PGPROC *sender;
|
||||
|
||||
SpinLockAcquire(&mq->mq_mutex);
|
||||
Assert(vmq->mq_receiver == NULL);
|
||||
@@ -212,7 +212,7 @@ void
|
||||
shm_mq_set_sender(shm_mq *mq, PGPROC *proc)
|
||||
{
|
||||
volatile shm_mq *vmq = mq;
|
||||
PGPROC *receiver;
|
||||
PGPROC *receiver;
|
||||
|
||||
SpinLockAcquire(&mq->mq_mutex);
|
||||
Assert(vmq->mq_sender == NULL);
|
||||
@@ -231,7 +231,7 @@ PGPROC *
|
||||
shm_mq_get_receiver(shm_mq *mq)
|
||||
{
|
||||
volatile shm_mq *vmq = mq;
|
||||
PGPROC *receiver;
|
||||
PGPROC *receiver;
|
||||
|
||||
SpinLockAcquire(&mq->mq_mutex);
|
||||
receiver = vmq->mq_receiver;
|
||||
@@ -247,7 +247,7 @@ PGPROC *
|
||||
shm_mq_get_sender(shm_mq *mq)
|
||||
{
|
||||
volatile shm_mq *vmq = mq;
|
||||
PGPROC *sender;
|
||||
PGPROC *sender;
|
||||
|
||||
SpinLockAcquire(&mq->mq_mutex);
|
||||
sender = vmq->mq_sender;
|
||||
@@ -280,7 +280,7 @@ shm_mq_get_sender(shm_mq *mq)
|
||||
shm_mq_handle *
|
||||
shm_mq_attach(shm_mq *mq, dsm_segment *seg, BackgroundWorkerHandle *handle)
|
||||
{
|
||||
shm_mq_handle *mqh = palloc(sizeof(shm_mq_handle));
|
||||
shm_mq_handle *mqh = palloc(sizeof(shm_mq_handle));
|
||||
|
||||
Assert(mq->mq_receiver == MyProc || mq->mq_sender == MyProc);
|
||||
mqh->mqh_queue = mq;
|
||||
@@ -317,9 +317,9 @@ shm_mq_attach(shm_mq *mq, dsm_segment *seg, BackgroundWorkerHandle *handle)
|
||||
shm_mq_result
|
||||
shm_mq_send(shm_mq_handle *mqh, Size nbytes, void *data, bool nowait)
|
||||
{
|
||||
shm_mq_result res;
|
||||
shm_mq *mq = mqh->mqh_queue;
|
||||
Size bytes_written;
|
||||
shm_mq_result res;
|
||||
shm_mq *mq = mqh->mqh_queue;
|
||||
Size bytes_written;
|
||||
|
||||
Assert(mq->mq_sender == MyProc);
|
||||
|
||||
@@ -328,7 +328,7 @@ shm_mq_send(shm_mq_handle *mqh, Size nbytes, void *data, bool nowait)
|
||||
{
|
||||
Assert(mqh->mqh_partial_bytes < sizeof(Size));
|
||||
res = shm_mq_send_bytes(mqh, sizeof(Size) - mqh->mqh_partial_bytes,
|
||||
((char *) &nbytes) + mqh->mqh_partial_bytes,
|
||||
((char *) &nbytes) +mqh->mqh_partial_bytes,
|
||||
nowait, &bytes_written);
|
||||
mqh->mqh_partial_bytes += bytes_written;
|
||||
if (res != SHM_MQ_SUCCESS)
|
||||
@@ -390,11 +390,11 @@ shm_mq_send(shm_mq_handle *mqh, Size nbytes, void *data, bool nowait)
|
||||
shm_mq_result
|
||||
shm_mq_receive(shm_mq_handle *mqh, Size *nbytesp, void **datap, bool nowait)
|
||||
{
|
||||
shm_mq *mq = mqh->mqh_queue;
|
||||
shm_mq_result res;
|
||||
Size rb = 0;
|
||||
Size nbytes;
|
||||
void *rawdata;
|
||||
shm_mq *mq = mqh->mqh_queue;
|
||||
shm_mq_result res;
|
||||
Size rb = 0;
|
||||
Size nbytes;
|
||||
void *rawdata;
|
||||
|
||||
Assert(mq->mq_receiver == MyProc);
|
||||
|
||||
@@ -439,18 +439,19 @@ shm_mq_receive(shm_mq_handle *mqh, Size *nbytesp, void **datap, bool nowait)
|
||||
*/
|
||||
if (mqh->mqh_partial_bytes == 0 && rb >= sizeof(Size))
|
||||
{
|
||||
Size needed;
|
||||
Size needed;
|
||||
|
||||
nbytes = * (Size *) rawdata;
|
||||
nbytes = *(Size *) rawdata;
|
||||
|
||||
/* If we've already got the whole message, we're done. */
|
||||
needed = MAXALIGN(sizeof(Size)) + MAXALIGN(nbytes);
|
||||
if (rb >= needed)
|
||||
{
|
||||
/*
|
||||
* Technically, we could consume the message length information
|
||||
* at this point, but the extra write to shared memory wouldn't
|
||||
* be free and in most cases we would reap no benefit.
|
||||
* Technically, we could consume the message length
|
||||
* information at this point, but the extra write to shared
|
||||
* memory wouldn't be free and in most cases we would reap no
|
||||
* benefit.
|
||||
*/
|
||||
mqh->mqh_consume_pending = needed;
|
||||
*nbytesp = nbytes;
|
||||
@@ -469,7 +470,7 @@ shm_mq_receive(shm_mq_handle *mqh, Size *nbytesp, void **datap, bool nowait)
|
||||
}
|
||||
else
|
||||
{
|
||||
Size lengthbytes;
|
||||
Size lengthbytes;
|
||||
|
||||
/* Can't be split unless bigger than required alignment. */
|
||||
Assert(sizeof(Size) > MAXIMUM_ALIGNOF);
|
||||
@@ -498,7 +499,7 @@ shm_mq_receive(shm_mq_handle *mqh, Size *nbytesp, void **datap, bool nowait)
|
||||
if (mqh->mqh_partial_bytes >= sizeof(Size))
|
||||
{
|
||||
Assert(mqh->mqh_partial_bytes == sizeof(Size));
|
||||
mqh->mqh_expected_bytes = * (Size *) mqh->mqh_buffer;
|
||||
mqh->mqh_expected_bytes = *(Size *) mqh->mqh_buffer;
|
||||
mqh->mqh_length_word_complete = true;
|
||||
mqh->mqh_partial_bytes = 0;
|
||||
}
|
||||
@@ -527,12 +528,12 @@ shm_mq_receive(shm_mq_handle *mqh, Size *nbytesp, void **datap, bool nowait)
|
||||
|
||||
/*
|
||||
* The message has wrapped the buffer. We'll need to copy it in order
|
||||
* to return it to the client in one chunk. First, make sure we have a
|
||||
* large enough buffer available.
|
||||
* to return it to the client in one chunk. First, make sure we have
|
||||
* a large enough buffer available.
|
||||
*/
|
||||
if (mqh->mqh_buflen < nbytes)
|
||||
{
|
||||
Size newbuflen = Max(mqh->mqh_buflen, MQH_INITIAL_BUFSIZE);
|
||||
Size newbuflen = Max(mqh->mqh_buflen, MQH_INITIAL_BUFSIZE);
|
||||
|
||||
while (newbuflen < nbytes)
|
||||
newbuflen *= 2;
|
||||
@@ -551,7 +552,7 @@ shm_mq_receive(shm_mq_handle *mqh, Size *nbytesp, void **datap, bool nowait)
|
||||
/* Loop until we've copied the entire message. */
|
||||
for (;;)
|
||||
{
|
||||
Size still_needed;
|
||||
Size still_needed;
|
||||
|
||||
/* Copy as much as we can. */
|
||||
Assert(mqh->mqh_partial_bytes + rb <= nbytes);
|
||||
@@ -559,10 +560,10 @@ shm_mq_receive(shm_mq_handle *mqh, Size *nbytesp, void **datap, bool nowait)
|
||||
mqh->mqh_partial_bytes += rb;
|
||||
|
||||
/*
|
||||
* Update count of bytes read, with alignment padding. Note
|
||||
* that this will never actually insert any padding except at the
|
||||
* end of a message, because the buffer size is a multiple of
|
||||
* MAXIMUM_ALIGNOF, and each read and write is as well.
|
||||
* Update count of bytes read, with alignment padding. Note that this
|
||||
* will never actually insert any padding except at the end of a
|
||||
* message, because the buffer size is a multiple of MAXIMUM_ALIGNOF,
|
||||
* and each read and write is as well.
|
||||
*/
|
||||
Assert(mqh->mqh_partial_bytes == nbytes || rb == MAXALIGN(rb));
|
||||
shm_mq_inc_bytes_read(mq, MAXALIGN(rb));
|
||||
@@ -601,7 +602,7 @@ shm_mq_result
|
||||
shm_mq_wait_for_attach(shm_mq_handle *mqh)
|
||||
{
|
||||
shm_mq *mq = mqh->mqh_queue;
|
||||
PGPROC **victim;
|
||||
PGPROC **victim;
|
||||
|
||||
if (shm_mq_get_receiver(mq) == MyProc)
|
||||
victim = &mq->mq_sender;
|
||||
@@ -663,8 +664,8 @@ shm_mq_send_bytes(shm_mq_handle *mqh, Size nbytes, void *data, bool nowait,
|
||||
|
||||
while (sent < nbytes)
|
||||
{
|
||||
bool detached;
|
||||
uint64 rb;
|
||||
bool detached;
|
||||
uint64 rb;
|
||||
|
||||
/* Compute number of ring buffer bytes used and available. */
|
||||
rb = shm_mq_get_bytes_read(mq, &detached);
|
||||
@@ -679,7 +680,7 @@ shm_mq_send_bytes(shm_mq_handle *mqh, Size nbytes, void *data, bool nowait,
|
||||
|
||||
if (available == 0)
|
||||
{
|
||||
shm_mq_result res;
|
||||
shm_mq_result res;
|
||||
|
||||
/*
|
||||
* The queue is full, so if the receiver isn't yet known to be
|
||||
@@ -717,11 +718,11 @@ shm_mq_send_bytes(shm_mq_handle *mqh, Size nbytes, void *data, bool nowait,
|
||||
}
|
||||
|
||||
/*
|
||||
* Wait for our latch to be set. It might already be set for
|
||||
* some unrelated reason, but that'll just result in one extra
|
||||
* trip through the loop. It's worth it to avoid resetting the
|
||||
* latch at top of loop, because setting an already-set latch is
|
||||
* much cheaper than setting one that has been reset.
|
||||
* Wait for our latch to be set. It might already be set for some
|
||||
* unrelated reason, but that'll just result in one extra trip
|
||||
* through the loop. It's worth it to avoid resetting the latch
|
||||
* at top of loop, because setting an already-set latch is much
|
||||
* cheaper than setting one that has been reset.
|
||||
*/
|
||||
WaitLatch(&MyProc->procLatch, WL_LATCH_SET, 0);
|
||||
|
||||
@@ -733,8 +734,8 @@ shm_mq_send_bytes(shm_mq_handle *mqh, Size nbytes, void *data, bool nowait,
}
else
{
Size offset = mq->mq_bytes_written % (uint64) ringsize;
Size sendnow = Min(available, ringsize - offset);
Size offset = mq->mq_bytes_written % (uint64) ringsize;
Size sendnow = Min(available, ringsize - offset);

/* Write as much data as we can via a single memcpy(). */
memcpy(&mq->mq_ring[mq->mq_ring_offset + offset],
@@ -751,9 +752,9 @@ shm_mq_send_bytes(shm_mq_handle *mqh, Size nbytes, void *data, bool nowait,
shm_mq_inc_bytes_written(mq, MAXALIGN(sendnow));

/*
* For efficiency, we don't set the reader's latch here. We'll
* do that only when the buffer fills up or after writing an
* entire message.
* For efficiency, we don't set the reader's latch here. We'll do
* that only when the buffer fills up or after writing an entire
* message.
*/
}
}
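[Illustrative aside, not part of the commit: the hunk above computes the write offset as the running byte counter modulo the ring size and caps each memcpy() at the distance to the physical end of the ring, so the caller's loop handles wraparound. A toy version of that offset arithmetic follows; flow control against the reader is deliberately omitted, and all names are illustrative.]

#include <stdint.h>
#include <string.h>

/* Toy ring buffer; tracks only the writer's running byte counter. */
typedef struct
{
    uint64_t    bytes_written;      /* monotonically increasing counter */
    size_t      ring_size;
    char       *ring;
} toy_ring;

/*
 * Copy up to 'available' bytes from src, but never past the physical end of
 * the ring: one memcpy per call, with wraparound left to the caller's loop.
 * (A real queue would also stop at the reader's position.)
 */
static size_t
ring_write_once(toy_ring *rb, const char *src, size_t available)
{
    size_t      offset = (size_t) (rb->bytes_written % rb->ring_size);
    size_t      chunk = rb->ring_size - offset;

    if (chunk > available)
        chunk = available;
    memcpy(rb->ring + offset, src, chunk);
    rb->bytes_written += chunk;
    return chunk;
}

int
main(void)
{
    char        storage[8];
    toy_ring    rb = {0, sizeof(storage), storage};
    const char *msg = "hello ring";
    size_t      sent = 0;

    while (sent < strlen(msg))
        sent += ring_write_once(&rb, msg + sent, strlen(msg) - sent);
    return 0;
}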
@@ -801,10 +802,10 @@ shm_mq_receive_bytes(shm_mq *mq, Size bytes_needed, bool nowait,
|
||||
/*
|
||||
* Fall out before waiting if the queue has been detached.
|
||||
*
|
||||
* Note that we don't check for this until *after* considering
|
||||
* whether the data already available is enough, since the
|
||||
* receiver can finish receiving a message stored in the buffer
|
||||
* even after the sender has detached.
|
||||
* Note that we don't check for this until *after* considering whether
|
||||
* the data already available is enough, since the receiver can finish
|
||||
* receiving a message stored in the buffer even after the sender has
|
||||
* detached.
|
||||
*/
|
||||
if (detached)
|
||||
return SHM_MQ_DETACHED;
|
||||
@@ -814,11 +815,11 @@ shm_mq_receive_bytes(shm_mq *mq, Size bytes_needed, bool nowait,
|
||||
return SHM_MQ_WOULD_BLOCK;
|
||||
|
||||
/*
|
||||
* Wait for our latch to be set. It might already be set for
|
||||
* some unrelated reason, but that'll just result in one extra
|
||||
* trip through the loop. It's worth it to avoid resetting the
|
||||
* latch at top of loop, because setting an already-set latch is
|
||||
* much cheaper than setting one that has been reset.
|
||||
* Wait for our latch to be set. It might already be set for some
|
||||
* unrelated reason, but that'll just result in one extra trip through
|
||||
* the loop. It's worth it to avoid resetting the latch at top of
|
||||
* loop, because setting an already-set latch is much cheaper than
|
||||
* setting one that has been reset.
|
||||
*/
|
||||
WaitLatch(&MyProc->procLatch, WL_LATCH_SET, 0);
|
||||
|
||||
@@ -842,11 +843,11 @@ shm_mq_receive_bytes(shm_mq *mq, Size bytes_needed, bool nowait,
|
||||
* non-NULL when our counterpart attaches to the queue.
|
||||
*/
|
||||
static bool
|
||||
shm_mq_wait_internal(volatile shm_mq *mq, PGPROC * volatile *ptr,
|
||||
shm_mq_wait_internal(volatile shm_mq *mq, PGPROC *volatile * ptr,
|
||||
BackgroundWorkerHandle *handle)
|
||||
{
|
||||
bool save_set_latch_on_sigusr1;
|
||||
bool result = false;
|
||||
bool save_set_latch_on_sigusr1;
|
||||
bool result = false;
|
||||
|
||||
save_set_latch_on_sigusr1 = set_latch_on_sigusr1;
|
||||
if (handle != NULL)
|
||||
@@ -856,9 +857,9 @@ shm_mq_wait_internal(volatile shm_mq *mq, PGPROC * volatile *ptr,
|
||||
{
|
||||
for (;;)
|
||||
{
|
||||
BgwHandleStatus status;
|
||||
pid_t pid;
|
||||
bool detached;
|
||||
BgwHandleStatus status;
|
||||
pid_t pid;
|
||||
bool detached;
|
||||
|
||||
/* Acquire the lock just long enough to check the pointer. */
|
||||
SpinLockAcquire(&mq->mq_mutex);
|
||||
@@ -913,7 +914,7 @@ shm_mq_wait_internal(volatile shm_mq *mq, PGPROC * volatile *ptr,
|
||||
static uint64
|
||||
shm_mq_get_bytes_read(volatile shm_mq *mq, bool *detached)
|
||||
{
|
||||
uint64 v;
|
||||
uint64 v;
|
||||
|
||||
SpinLockAcquire(&mq->mq_mutex);
|
||||
v = mq->mq_bytes_read;
|
||||
@@ -948,7 +949,7 @@ shm_mq_inc_bytes_read(volatile shm_mq *mq, Size n)
|
||||
static uint64
|
||||
shm_mq_get_bytes_written(volatile shm_mq *mq, bool *detached)
|
||||
{
|
||||
uint64 v;
|
||||
uint64 v;
|
||||
|
||||
SpinLockAcquire(&mq->mq_mutex);
|
||||
v = mq->mq_bytes_written;
|
||||
@@ -975,8 +976,8 @@ shm_mq_inc_bytes_written(volatile shm_mq *mq, Size n)
|
||||
static shm_mq_result
|
||||
shm_mq_notify_receiver(volatile shm_mq *mq)
|
||||
{
|
||||
PGPROC *receiver;
|
||||
bool detached;
|
||||
PGPROC *receiver;
|
||||
bool detached;
|
||||
|
||||
SpinLockAcquire(&mq->mq_mutex);
|
||||
detached = mq->mq_detached;
|
||||
|
||||
@@ -19,17 +19,17 @@

typedef struct shm_toc_entry
{
uint64 key; /* Arbitrary identifier */
uint64 offset; /* Bytes offset */
uint64 key; /* Arbitrary identifier */
uint64 offset; /* Bytes offset */
} shm_toc_entry;

struct shm_toc
{
uint64 toc_magic; /* Magic number for this TOC */
slock_t toc_mutex; /* Spinlock for mutual exclusion */
Size toc_total_bytes; /* Bytes managed by this TOC */
uint64 toc_magic; /* Magic number for this TOC */
slock_t toc_mutex; /* Spinlock for mutual exclusion */
Size toc_total_bytes; /* Bytes managed by this TOC */
Size toc_allocated_bytes; /* Bytes allocated of those managed */
Size toc_nentry; /* Number of entries in TOC */
Size toc_nentry; /* Number of entries in TOC */
shm_toc_entry toc_entry[FLEXIBLE_ARRAY_MEMBER];
};
|
||||
@@ -39,7 +39,7 @@ struct shm_toc
|
||||
shm_toc *
|
||||
shm_toc_create(uint64 magic, void *address, Size nbytes)
|
||||
{
|
||||
shm_toc *toc = (shm_toc *) address;
|
||||
shm_toc *toc = (shm_toc *) address;
|
||||
|
||||
Assert(nbytes > offsetof(shm_toc, toc_entry));
|
||||
toc->toc_magic = magic;
|
||||
@@ -58,7 +58,7 @@ shm_toc_create(uint64 magic, void *address, Size nbytes)
|
||||
extern shm_toc *
|
||||
shm_toc_attach(uint64 magic, void *address)
|
||||
{
|
||||
shm_toc *toc = (shm_toc *) address;
|
||||
shm_toc *toc = (shm_toc *) address;
|
||||
|
||||
if (toc->toc_magic != magic)
|
||||
return NULL;
|
||||
@@ -96,7 +96,7 @@ shm_toc_allocate(shm_toc *toc, Size nbytes)
|
||||
total_bytes = vtoc->toc_total_bytes;
|
||||
allocated_bytes = vtoc->toc_allocated_bytes;
|
||||
nentry = vtoc->toc_nentry;
|
||||
toc_bytes = offsetof(shm_toc, toc_entry) + nentry * sizeof(shm_toc_entry)
|
||||
toc_bytes = offsetof(shm_toc, toc_entry) +nentry * sizeof(shm_toc_entry)
|
||||
+ allocated_bytes;
|
||||
|
||||
/* Check for memory exhaustion and overflow. */
|
||||
@@ -132,7 +132,7 @@ shm_toc_freespace(shm_toc *toc)
|
||||
nentry = vtoc->toc_nentry;
|
||||
SpinLockRelease(&toc->toc_mutex);
|
||||
|
||||
toc_bytes = offsetof(shm_toc, toc_entry) + nentry * sizeof(shm_toc_entry);
|
||||
toc_bytes = offsetof(shm_toc, toc_entry) +nentry * sizeof(shm_toc_entry);
|
||||
Assert(allocated_bytes + BUFFERALIGN(toc_bytes) <= total_bytes);
|
||||
return total_bytes - (allocated_bytes + BUFFERALIGN(toc_bytes));
|
||||
}
|
||||
@@ -176,7 +176,7 @@ shm_toc_insert(shm_toc *toc, uint64 key, void *address)
|
||||
total_bytes = vtoc->toc_total_bytes;
|
||||
allocated_bytes = vtoc->toc_allocated_bytes;
|
||||
nentry = vtoc->toc_nentry;
|
||||
toc_bytes = offsetof(shm_toc, toc_entry) + nentry * sizeof(shm_toc_entry)
|
||||
toc_bytes = offsetof(shm_toc, toc_entry) +nentry * sizeof(shm_toc_entry)
|
||||
+ allocated_bytes;
|
||||
|
||||
/* Check for memory exhaustion and overflow. */
|
||||
@@ -241,6 +241,6 @@ Size
shm_toc_estimate(shm_toc_estimator *e)
{
return add_size(offsetof(shm_toc, toc_entry),
add_size(mul_size(e->number_of_keys, sizeof(shm_toc_entry)),
e->space_for_chunks));
add_size(mul_size(e->number_of_keys, sizeof(shm_toc_entry)),
e->space_for_chunks));
}
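[Illustrative aside, not part of the commit: shm_toc_estimate() above composes its result from add_size()/mul_size(), which error out on overflow rather than silently wrapping. A hedged standalone sketch of overflow-checked size arithmetic; checked_add, checked_mul and estimate_total are invented names, and the real helpers report errors through ereport().]

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Overflow-checked helpers in the spirit of add_size()/mul_size(). */
static size_t
checked_add(size_t a, size_t b)
{
    size_t      result = a + b;

    if (result < a)                 /* unsigned wraparound means overflow */
    {
        fprintf(stderr, "size overflow\n");
        exit(1);
    }
    return result;
}

static size_t
checked_mul(size_t a, size_t b)
{
    size_t      result = a * b;

    if (a != 0 && result / a != b)
    {
        fprintf(stderr, "size overflow\n");
        exit(1);
    }
    return result;
}

/* Estimate: header + n fixed-size entries + payload bytes. */
static size_t
estimate_total(size_t header, size_t nkeys, size_t entry_size, size_t payload)
{
    return checked_add(header,
                       checked_add(checked_mul(nkeys, entry_size), payload));
}

int
main(void)
{
    printf("%zu\n", estimate_total(64, 8, 16, 4096));
    return 0;
}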
|
||||
@@ -26,7 +26,7 @@
|
||||
* for a module and should never be allocated after the shared memory
|
||||
* initialization phase. Hash tables have a fixed maximum size, but
|
||||
* their actual size can vary dynamically. When entries are added
|
||||
* to the table, more space is allocated. Queues link data structures
|
||||
* to the table, more space is allocated. Queues link data structures
|
||||
* that have been allocated either within fixed-size structures or as hash
|
||||
* buckets. Each shared data structure has a string name to identify
|
||||
* it (assigned in the module that declares it).
|
||||
@@ -40,7 +40,7 @@
|
||||
* The shmem index has two purposes: first, it gives us
|
||||
* a simple model of how the world looks when a backend process
|
||||
* initializes. If something is present in the shmem index,
|
||||
* it is initialized. If it is not, it is uninitialized. Second,
|
||||
* it is initialized. If it is not, it is uninitialized. Second,
|
||||
* the shmem index allows us to allocate shared memory on demand
|
||||
* instead of trying to preallocate structures and hard-wire the
|
||||
* sizes and locations in header files. If you are using a lot
|
||||
@@ -55,8 +55,8 @@
|
||||
* pointers using the method described in (b) above.
|
||||
*
|
||||
* (d) memory allocation model: shared memory can never be
|
||||
* freed, once allocated. Each hash table has its own free list,
|
||||
* so hash buckets can be reused when an item is deleted. However,
|
||||
* freed, once allocated. Each hash table has its own free list,
|
||||
* so hash buckets can be reused when an item is deleted. However,
|
||||
* if one hash table grows very large and then shrinks, its space
|
||||
* cannot be redistributed to other tables. We could build a simple
|
||||
* hash bucket garbage collector if need be. Right now, it seems
|
||||
@@ -232,7 +232,7 @@ InitShmemIndex(void)
|
||||
*
|
||||
* Since ShmemInitHash calls ShmemInitStruct, which expects the ShmemIndex
|
||||
* hashtable to exist already, we have a bit of a circularity problem in
|
||||
* initializing the ShmemIndex itself. The special "ShmemIndex" hash
|
||||
* initializing the ShmemIndex itself. The special "ShmemIndex" hash
|
||||
* table name will tell ShmemInitStruct to fake it.
|
||||
*/
|
||||
info.keysize = SHMEM_INDEX_KEYSIZE;
|
||||
@@ -309,7 +309,7 @@ ShmemInitHash(const char *name, /* table string name for shmem index */
|
||||
* ShmemInitStruct -- Create/attach to a structure in shared memory.
|
||||
*
|
||||
* This is called during initialization to find or allocate
|
||||
* a data structure in shared memory. If no other process
|
||||
* a data structure in shared memory. If no other process
|
||||
* has created the structure, this routine allocates space
|
||||
* for it. If it exists already, a pointer to the existing
|
||||
* structure is returned.
|
||||
@@ -318,7 +318,7 @@ ShmemInitHash(const char *name, /* table string name for shmem index */
|
||||
* already in the shmem index (hence, already initialized).
|
||||
*
|
||||
* Note: before Postgres 9.0, this function returned NULL for some failure
|
||||
* cases. Now, it always throws error instead, so callers need not check
|
||||
* cases. Now, it always throws error instead, so callers need not check
|
||||
* for NULL.
|
||||
*/
|
||||
void *
|
||||
@@ -350,7 +350,7 @@ ShmemInitStruct(const char *name, Size size, bool *foundPtr)
|
||||
* be trying to init the shmem index itself.
|
||||
*
|
||||
* Notice that the ShmemIndexLock is released before the shmem
|
||||
* index has been initialized. This should be OK because no other
|
||||
* index has been initialized. This should be OK because no other
|
||||
* process can be accessing shared memory yet.
|
||||
*/
|
||||
Assert(shmemseghdr->index == NULL);
|
||||
|
||||
@@ -14,7 +14,7 @@
|
||||
*
|
||||
* Package for managing doubly-linked lists in shared memory.
|
||||
* The only tricky thing is that SHM_QUEUE will usually be a field
|
||||
* in a larger record. SHMQueueNext has to return a pointer
|
||||
* in a larger record. SHMQueueNext has to return a pointer
|
||||
* to the record itself instead of a pointer to the SHMQueue field
|
||||
* of the record. It takes an extra parameter and does some extra
|
||||
* pointer arithmetic to do this correctly.
|
||||
|
||||
@@ -29,7 +29,7 @@ uint64 SharedInvalidMessageCounter;
|
||||
* Because backends sitting idle will not be reading sinval events, we
|
||||
* need a way to give an idle backend a swift kick in the rear and make
|
||||
* it catch up before the sinval queue overflows and forces it to go
|
||||
* through a cache reset exercise. This is done by sending
|
||||
* through a cache reset exercise. This is done by sending
|
||||
* PROCSIG_CATCHUP_INTERRUPT to any backend that gets too far behind.
|
||||
*
|
||||
* State for catchup events consists of two flags: one saying whether
|
||||
@@ -68,7 +68,7 @@ SendSharedInvalidMessages(const SharedInvalidationMessage *msgs, int n)
|
||||
* NOTE: it is entirely possible for this routine to be invoked recursively
|
||||
* as a consequence of processing inside the invalFunction or resetFunction.
|
||||
* Furthermore, such a recursive call must guarantee that all outstanding
|
||||
* inval messages have been processed before it exits. This is the reason
|
||||
* inval messages have been processed before it exits. This is the reason
|
||||
* for the strange-looking choice to use a statically allocated buffer array
|
||||
* and counters; it's so that a recursive call can process messages already
|
||||
* sucked out of sinvaladt.c.
|
||||
@@ -137,7 +137,7 @@ ReceiveSharedInvalidMessages(
|
||||
* We are now caught up. If we received a catchup signal, reset that
|
||||
* flag, and call SICleanupQueue(). This is not so much because we need
|
||||
* to flush dead messages right now, as that we want to pass on the
|
||||
* catchup signal to the next slowest backend. "Daisy chaining" the
|
||||
* catchup signal to the next slowest backend. "Daisy chaining" the
|
||||
* catchup signal this way avoids creating spikes in system load for what
|
||||
* should be just a background maintenance activity.
|
||||
*/
|
||||
@@ -157,7 +157,7 @@ ReceiveSharedInvalidMessages(
|
||||
*
|
||||
* If we are idle (catchupInterruptEnabled is set), we can safely
|
||||
* invoke ProcessCatchupEvent directly. Otherwise, just set a flag
|
||||
* to do it later. (Note that it's quite possible for normal processing
|
||||
* to do it later. (Note that it's quite possible for normal processing
|
||||
* of the current transaction to cause ReceiveSharedInvalidMessages()
|
||||
* to be run later on; in that case the flag will get cleared again,
|
||||
* since there's no longer any reason to do anything.)
|
||||
@@ -233,7 +233,7 @@ HandleCatchupInterrupt(void)
|
||||
* EnableCatchupInterrupt
|
||||
*
|
||||
* This is called by the PostgresMain main loop just before waiting
|
||||
* for a frontend command. We process any pending catchup events,
|
||||
* for a frontend command. We process any pending catchup events,
|
||||
* and enable the signal handler to process future events directly.
|
||||
*
|
||||
* NOTE: the signal handler starts out disabled, and stays so until
|
||||
@@ -278,7 +278,7 @@ EnableCatchupInterrupt(void)
|
||||
* DisableCatchupInterrupt
|
||||
*
|
||||
* This is called by the PostgresMain main loop just after receiving
|
||||
* a frontend command. Signal handler execution of catchup events
|
||||
* a frontend command. Signal handler execution of catchup events
|
||||
* is disabled until the next EnableCatchupInterrupt call.
|
||||
*
|
||||
* The PROCSIG_NOTIFY_INTERRUPT signal handler also needs to call this,
|
||||
|
||||
@@ -46,7 +46,7 @@
* In reality, the messages are stored in a circular buffer of MAXNUMMESSAGES
* entries. We translate MsgNum values into circular-buffer indexes by
* computing MsgNum % MAXNUMMESSAGES (this should be fast as long as
* MAXNUMMESSAGES is a constant and a power of 2). As long as maxMsgNum
* MAXNUMMESSAGES is a constant and a power of 2). As long as maxMsgNum
* doesn't exceed minMsgNum by more than MAXNUMMESSAGES, we have enough space
* in the buffer. If the buffer does overflow, we recover by setting the
* "reset" flag for each backend that has fallen too far behind. A backend
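[Illustrative aside, not part of the commit: the comment above notes that MsgNum % MAXNUMMESSAGES is cheap when the buffer size is a constant power of two, since the compiler can reduce the modulo to a mask. A small sketch of the equivalent mask form for non-negative message numbers; the constant chosen here is arbitrary.]

#include <stdio.h>

#define NUM_SLOTS 4096              /* must be a power of two */

/* msgnum % NUM_SLOTS, written as a mask; valid for msgnum >= 0. */
static int
msg_index(int msgnum)
{
    return msgnum & (NUM_SLOTS - 1);
}

int
main(void)
{
    printf("%d %d\n", msg_index(4095), msg_index(4097));    /* 4095 1 */
    return 0;
}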
@@ -59,7 +59,7 @@
|
||||
* normal behavior is that at most one such interrupt is in flight at a time;
|
||||
* when a backend completes processing a catchup interrupt, it executes
|
||||
* SICleanupQueue, which will signal the next-furthest-behind backend if
|
||||
* needed. This avoids undue contention from multiple backends all trying
|
||||
* needed. This avoids undue contention from multiple backends all trying
|
||||
* to catch up at once. However, the furthest-back backend might be stuck
|
||||
* in a state where it can't catch up. Eventually it will get reset, so it
|
||||
* won't cause any more problems for anyone but itself. But we don't want
|
||||
@@ -90,7 +90,7 @@
|
||||
* the writer wants to change maxMsgNum while readers need to read it.
|
||||
* We deal with that by having a spinlock that readers must take for just
|
||||
* long enough to read maxMsgNum, while writers take it for just long enough
|
||||
* to write maxMsgNum. (The exact rule is that you need the spinlock to
|
||||
* to write maxMsgNum. (The exact rule is that you need the spinlock to
|
||||
* read maxMsgNum if you are not holding SInvalWriteLock, and you need the
|
||||
* spinlock to write maxMsgNum unless you are holding both locks.)
|
||||
*
|
||||
@@ -442,7 +442,7 @@ SIInsertDataEntries(const SharedInvalidationMessage *data, int n)
|
||||
SISeg *segP = shmInvalBuffer;
|
||||
|
||||
/*
|
||||
* N can be arbitrarily large. We divide the work into groups of no more
|
||||
* N can be arbitrarily large. We divide the work into groups of no more
|
||||
* than WRITE_QUANTUM messages, to be sure that we don't hold the lock for
|
||||
* an unreasonably long time. (This is not so much because we care about
|
||||
* letting in other writers, as that some just-caught-up backend might be
|
||||
@@ -465,7 +465,7 @@ SIInsertDataEntries(const SharedInvalidationMessage *data, int n)
|
||||
* If the buffer is full, we *must* acquire some space. Clean the
|
||||
* queue and reset anyone who is preventing space from being freed.
|
||||
* Otherwise, clean the queue only when it's exceeded the next
|
||||
* fullness threshold. We have to loop and recheck the buffer state
|
||||
* fullness threshold. We have to loop and recheck the buffer state
|
||||
* after any call of SICleanupQueue.
|
||||
*/
|
||||
for (;;)
|
||||
@@ -533,11 +533,11 @@ SIInsertDataEntries(const SharedInvalidationMessage *data, int n)
|
||||
* executing on behalf of other backends, since each instance will modify only
|
||||
* fields of its own backend's ProcState, and no instance will look at fields
|
||||
* of other backends' ProcStates. We express this by grabbing SInvalReadLock
|
||||
* in shared mode. Note that this is not exactly the normal (read-only)
|
||||
* in shared mode. Note that this is not exactly the normal (read-only)
|
||||
* interpretation of a shared lock! Look closely at the interactions before
|
||||
* allowing SInvalReadLock to be grabbed in shared mode for any other reason!
|
||||
*
|
||||
* NB: this can also run in parallel with SIInsertDataEntries. It is not
|
||||
* NB: this can also run in parallel with SIInsertDataEntries. It is not
|
||||
* guaranteed that we will return any messages added after the routine is
|
||||
* entered.
|
||||
*
|
||||
@@ -557,10 +557,10 @@ SIGetDataEntries(SharedInvalidationMessage *data, int datasize)
|
||||
|
||||
/*
|
||||
* Before starting to take locks, do a quick, unlocked test to see whether
|
||||
* there can possibly be anything to read. On a multiprocessor system,
|
||||
* there can possibly be anything to read. On a multiprocessor system,
|
||||
* it's possible that this load could migrate backwards and occur before
|
||||
* we actually enter this function, so we might miss a sinval message that
|
||||
* was just added by some other processor. But they can't migrate
|
||||
* was just added by some other processor. But they can't migrate
|
||||
* backwards over a preceding lock acquisition, so it should be OK. If we
|
||||
* haven't acquired a lock preventing against further relevant
|
||||
* invalidations, any such occurrence is not much different than if the
|
||||
@@ -651,7 +651,7 @@ SIGetDataEntries(SharedInvalidationMessage *data, int datasize)
|
||||
*
|
||||
* Caution: because we transiently release write lock when we have to signal
|
||||
* some other backend, it is NOT guaranteed that there are still minFree
|
||||
* free message slots at exit. Caller must recheck and perhaps retry.
|
||||
* free message slots at exit. Caller must recheck and perhaps retry.
|
||||
*/
|
||||
void
|
||||
SICleanupQueue(bool callerHasWriteLock, int minFree)
|
||||
@@ -672,7 +672,7 @@ SICleanupQueue(bool callerHasWriteLock, int minFree)
|
||||
/*
|
||||
* Recompute minMsgNum = minimum of all backends' nextMsgNum, identify the
|
||||
* furthest-back backend that needs signaling (if any), and reset any
|
||||
* backends that are too far back. Note that because we ignore sendOnly
|
||||
* backends that are too far back. Note that because we ignore sendOnly
|
||||
* backends here it is possible for them to keep sending messages without
|
||||
* a problem even when they are the only active backend.
|
||||
*/
|
||||
|
||||
@@ -130,7 +130,7 @@ GetStandbyLimitTime(void)

/*
* The cutoff time is the last WAL data receipt time plus the appropriate
* delay variable. Delay of -1 means wait forever.
* delay variable. Delay of -1 means wait forever.
*/
GetXLogReceiptTime(&rtime, &fromStream);
if (fromStream)
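The computation described here is just receipt time plus the configured delay, with -1 meaning no cutoff at all. A minimal illustration, where the caller interprets a zero return as "wait forever" (the helper name is made up; only TimestampTz and TimestampTzPlusMilliseconds are real):

    #include "utils/timestamp.h"

    static TimestampTz
    sketch_limit_time(TimestampTz receipt_time, int delay_ms)
    {
        if (delay_ms < 0)
            return 0;           /* delay of -1 disables the limit */
        return TimestampTzPlusMilliseconds(receipt_time, delay_ms);
    }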
@@ -475,7 +475,7 @@ SendRecoveryConflictWithBufferPin(ProcSignalReason reason)
|
||||
* determine whether an actual deadlock condition is present: the lock we
|
||||
* need to wait for might be unrelated to any held by the Startup process.
|
||||
* Sooner or later, this mechanism should get ripped out in favor of somehow
|
||||
* accounting for buffer locks in DeadLockCheck(). However, errors here
|
||||
* accounting for buffer locks in DeadLockCheck(). However, errors here
|
||||
* seem to be very low-probability in practice, so for now it's not worth
|
||||
* the trouble.
|
||||
*/
|
||||
@@ -867,7 +867,7 @@ standby_redo(XLogRecPtr lsn, XLogRecord *record)
|
||||
XLogRecPtr
|
||||
LogStandbySnapshot(void)
|
||||
{
|
||||
XLogRecPtr recptr;
|
||||
XLogRecPtr recptr;
|
||||
RunningTransactions running;
|
||||
xl_standby_lock *locks;
|
||||
int nlocks;
|
||||
@@ -889,8 +889,8 @@ LogStandbySnapshot(void)
|
||||
running = GetRunningTransactionData();
|
||||
|
||||
/*
|
||||
* GetRunningTransactionData() acquired ProcArrayLock, we must release
|
||||
* it. For Hot Standby this can be done before inserting the WAL record
|
||||
* GetRunningTransactionData() acquired ProcArrayLock, we must release it.
|
||||
* For Hot Standby this can be done before inserting the WAL record
|
||||
* because ProcArrayApplyRecoveryInfo() rechecks the commit status using
|
||||
* the clog. For logical decoding, though, the lock can't be released
|
||||
* early because the clog might be "in the future" from the POV of the
|
||||
@@ -977,9 +977,9 @@ LogCurrentRunningXacts(RunningTransactions CurrRunningXacts)
|
||||
/*
|
||||
* Ensure running_xacts information is synced to disk not too far in the
|
||||
* future. We don't want to stall anything though (i.e. use XLogFlush()),
|
||||
* so we let the wal writer do it during normal
|
||||
* operation. XLogSetAsyncXactLSN() conveniently will mark the LSN as
|
||||
* to-be-synced and nudge the WALWriter into action if sleeping. Check
|
||||
* so we let the wal writer do it during normal operation.
|
||||
* XLogSetAsyncXactLSN() conveniently will mark the LSN as to-be-synced
|
||||
* and nudge the WALWriter into action if sleeping. Check
|
||||
* XLogBackgroundFlush() for details why a record might not be flushed
|
||||
* without it.
|
||||
*/
|
||||
|
||||
@@ -266,10 +266,10 @@ inv_open(Oid lobjId, int flags, MemoryContext mcxt)
|
||||
errmsg("large object %u does not exist", lobjId)));
|
||||
|
||||
/*
|
||||
* We must register the snapshot in TopTransaction's resowner, because
|
||||
* it must stay alive until the LO is closed rather than until the
|
||||
* current portal shuts down. Do this after checking that the LO exists,
|
||||
* to avoid leaking the snapshot if an error is thrown.
|
||||
* We must register the snapshot in TopTransaction's resowner, because it
|
||||
* must stay alive until the LO is closed rather than until the current
|
||||
* portal shuts down. Do this after checking that the LO exists, to avoid
|
||||
* leaking the snapshot if an error is thrown.
|
||||
*/
|
||||
if (snapshot)
|
||||
snapshot = RegisterSnapshotOnOwner(snapshot,
|
||||
@@ -809,7 +809,7 @@ inv_truncate(LargeObjectDesc *obj_desc, int64 len)
|
||||
|
||||
/*
|
||||
* If we found the page of the truncation point we need to truncate the
|
||||
* data in it. Otherwise if we're in a hole, we need to create a page to
|
||||
* data in it. Otherwise if we're in a hole, we need to create a page to
|
||||
* mark the end of data.
|
||||
*/
|
||||
if (olddata != NULL && olddata->pageno == pageno)
|
||||
|
||||
@@ -51,7 +51,7 @@ typedef struct
|
||||
} WAIT_ORDER;
|
||||
|
||||
/*
|
||||
* Information saved about each edge in a detected deadlock cycle. This
|
||||
* Information saved about each edge in a detected deadlock cycle. This
|
||||
* is used to print a diagnostic message upon failure.
|
||||
*
|
||||
* Note: because we want to examine this info after releasing the lock
|
||||
@@ -119,7 +119,7 @@ static PGPROC *blocking_autovacuum_proc = NULL;
|
||||
* InitDeadLockChecking -- initialize deadlock checker during backend startup
|
||||
*
|
||||
* This does per-backend initialization of the deadlock checker; primarily,
|
||||
* allocation of working memory for DeadLockCheck. We do this per-backend
|
||||
* allocation of working memory for DeadLockCheck. We do this per-backend
|
||||
* since there's no percentage in making the kernel do copy-on-write
|
||||
* inheritance of workspace from the postmaster. We want to allocate the
|
||||
* space at startup because (a) the deadlock checker might be invoked when
|
||||
@@ -291,10 +291,10 @@ GetBlockingAutoVacuumPgproc(void)
|
||||
* DeadLockCheckRecurse -- recursively search for valid orderings
|
||||
*
|
||||
* curConstraints[] holds the current set of constraints being considered
|
||||
* by an outer level of recursion. Add to this each possible solution
|
||||
* by an outer level of recursion. Add to this each possible solution
|
||||
* constraint for any cycle detected at this level.
|
||||
*
|
||||
* Returns TRUE if no solution exists. Returns FALSE if a deadlock-free
|
||||
* Returns TRUE if no solution exists. Returns FALSE if a deadlock-free
|
||||
* state is attainable, in which case waitOrders[] shows the required
|
||||
* rearrangements of lock wait queues (if any).
|
||||
*/
|
||||
@@ -429,7 +429,7 @@ TestConfiguration(PGPROC *startProc)
|
||||
*
|
||||
* Since we need to be able to check hypothetical configurations that would
|
||||
* exist after wait queue rearrangement, the routine pays attention to the
|
||||
* table of hypothetical queue orders in waitOrders[]. These orders will
|
||||
* table of hypothetical queue orders in waitOrders[]. These orders will
|
||||
* be believed in preference to the actual ordering seen in the locktable.
|
||||
*/
|
||||
static bool
|
||||
@@ -506,7 +506,7 @@ FindLockCycleRecurse(PGPROC *checkProc,
|
||||
conflictMask = lockMethodTable->conflictTab[checkProc->waitLockMode];
|
||||
|
||||
/*
|
||||
* Scan for procs that already hold conflicting locks. These are "hard"
|
||||
* Scan for procs that already hold conflicting locks. These are "hard"
|
||||
* edges in the waits-for graph.
|
||||
*/
|
||||
procLocks = &(lock->procLocks);
|
||||
@@ -705,7 +705,7 @@ ExpandConstraints(EDGE *constraints,
|
||||
nWaitOrders = 0;
|
||||
|
||||
/*
|
||||
* Scan constraint list backwards. This is because the last-added
|
||||
* Scan constraint list backwards. This is because the last-added
|
||||
* constraint is the only one that could fail, and so we want to test it
|
||||
* for inconsistency first.
|
||||
*/
|
||||
@@ -759,7 +759,7 @@ ExpandConstraints(EDGE *constraints,
* The initial queue ordering is taken directly from the lock's wait queue.
* The output is an array of PGPROC pointers, of length equal to the lock's
* wait queue length (the caller is responsible for providing this space).
* The partial order is specified by an array of EDGE structs. Each EDGE
* The partial order is specified by an array of EDGE structs. Each EDGE
* is one that we need to reverse, therefore the "waiter" must appear before
* the "blocker" in the output array. The EDGE array may well contain
* edges associated with other locks; these should be ignored.
@@ -829,7 +829,7 @@ TopoSort(LOCK *lock,
afterConstraints[k] = i + 1;
}
/*--------------------
* Now scan the topoProcs array backwards. At each step, output the
* Now scan the topoProcs array backwards. At each step, output the
* last proc that has no remaining before-constraints, and decrease
* the beforeConstraints count of each of the procs it was constrained
* against.
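Read together, these two comments describe a topological sort of the wait queue in which every EDGE contributes a "waiter comes before blocker" constraint, and a cycle among the constraints means no valid ordering exists. A compact sketch over plain integer positions; the counting array mirrors the beforeConstraints idea, but the names and the fixed bound are simplifications:

    /*
     * Order n queue positions so that, for every constraint pair
     * (waiter, blocker), the waiter is emitted before the blocker.
     * Returns false if the constraints are cyclic, true otherwise.
     */
    static bool
    sketch_topo_sort(int n, const int (*constraints)[2], int nconstraints,
                     int *output)
    {
        int         before[64];     /* unsatisfied "must come later" counts */
        bool        emitted[64];
        int         i, j, k;

        if (n > 64)
            return false;           /* sketch only; real code sizes arrays dynamically */

        for (i = 0; i < n; i++)
        {
            before[i] = 0;
            emitted[i] = false;
        }
        for (k = 0; k < nconstraints; k++)
            before[constraints[k][1]]++;    /* each blocker waits on its waiter */

        for (i = 0; i < n; i++)
        {
            /* pick any not-yet-emitted position with no pending constraints */
            for (j = 0; j < n; j++)
                if (!emitted[j] && before[j] == 0)
                    break;
            if (j == n)
                return false;       /* everything left is still constrained: cycle */

            output[i] = j;
            emitted[j] = true;

            /* emitting j as a waiter releases the constraints it imposed */
            for (k = 0; k < nconstraints; k++)
                if (constraints[k][0] == j)
                    before[constraints[k][1]]--;
        }
        return true;
    }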
@@ -35,7 +35,7 @@ typedef struct XactLockTableWaitInfo
|
||||
{
|
||||
XLTW_Oper oper;
|
||||
Relation rel;
|
||||
ItemPointer ctid;
|
||||
ItemPointer ctid;
|
||||
} XactLockTableWaitInfo;
|
||||
|
||||
static void XactLockTableWaitErrorCb(void *arg);
|
||||
@@ -80,7 +80,7 @@ SetLocktagRelationOid(LOCKTAG *tag, Oid relid)
|
||||
/*
|
||||
* LockRelationOid
|
||||
*
|
||||
* Lock a relation given only its OID. This should generally be used
|
||||
* Lock a relation given only its OID. This should generally be used
|
||||
* before attempting to open the relation's relcache entry.
|
||||
*/
|
||||
void
|
||||
@@ -268,7 +268,7 @@ LockHasWaitersRelation(Relation relation, LOCKMODE lockmode)
|
||||
/*
|
||||
* LockRelationIdForSession
|
||||
*
|
||||
* This routine grabs a session-level lock on the target relation. The
|
||||
* This routine grabs a session-level lock on the target relation. The
|
||||
* session lock persists across transaction boundaries. It will be removed
|
||||
* when UnlockRelationIdForSession() is called, or if an ereport(ERROR) occurs,
|
||||
* or if the backend exits.
|
||||
@@ -471,7 +471,7 @@ XactLockTableInsert(TransactionId xid)
|
||||
*
|
||||
* Delete the lock showing that the given transaction ID is running.
|
||||
* (This is never used for main transaction IDs; those locks are only
|
||||
* released implicitly at transaction end. But we do use it for subtrans IDs.)
|
||||
* released implicitly at transaction end. But we do use it for subtrans IDs.)
|
||||
*/
|
||||
void
|
||||
XactLockTableDelete(TransactionId xid)
|
||||
@@ -494,7 +494,7 @@ XactLockTableDelete(TransactionId xid)
|
||||
* subtransaction, we will exit as soon as it aborts or its top parent commits.
|
||||
* It takes some extra work to ensure this, because to save on shared memory
|
||||
* the XID lock of a subtransaction is released when it ends, whether
|
||||
* successfully or unsuccessfully. So we have to check if it's "still running"
|
||||
* successfully or unsuccessfully. So we have to check if it's "still running"
|
||||
* and if so wait for its parent.
|
||||
*/
|
||||
void
|
||||
@@ -663,7 +663,7 @@ WaitForLockersMultiple(List *locktags, LOCKMODE lockmode)
|
||||
|
||||
/*
|
||||
* Note: GetLockConflicts() never reports our own xid, hence we need not
|
||||
* check for that. Also, prepared xacts are not reported, which is fine
|
||||
* check for that. Also, prepared xacts are not reported, which is fine
|
||||
* since they certainly aren't going to do anything anymore.
|
||||
*/
|
||||
|
||||
@@ -690,7 +690,7 @@ WaitForLockersMultiple(List *locktags, LOCKMODE lockmode)
|
||||
void
|
||||
WaitForLockers(LOCKTAG heaplocktag, LOCKMODE lockmode)
|
||||
{
|
||||
List *l;
|
||||
List *l;
|
||||
|
||||
l = list_make1(&heaplocktag);
|
||||
WaitForLockersMultiple(l, lockmode);
|
||||
|
||||
@@ -187,7 +187,7 @@ static int FastPathLocalUseCount = 0;
|
||||
|
||||
/*
|
||||
* The fast-path lock mechanism is concerned only with relation locks on
|
||||
* unshared relations by backends bound to a database. The fast-path
|
||||
* unshared relations by backends bound to a database. The fast-path
|
||||
* mechanism exists mostly to accelerate acquisition and release of locks
|
||||
* that rarely conflict. Because ShareUpdateExclusiveLock is
|
||||
* self-conflicting, it can't use the fast-path mechanism; but it also does
|
||||
@@ -914,7 +914,7 @@ LockAcquireExtended(const LOCKTAG *locktag,
|
||||
|
||||
/*
|
||||
* If lock requested conflicts with locks requested by waiters, must join
|
||||
* wait queue. Otherwise, check for conflict with already-held locks.
|
||||
* wait queue. Otherwise, check for conflict with already-held locks.
|
||||
* (That's last because most complex check.)
|
||||
*/
|
||||
if (lockMethodTable->conflictTab[lockmode] & lock->waitMask)
|
||||
@@ -995,7 +995,7 @@ LockAcquireExtended(const LOCKTAG *locktag,
|
||||
|
||||
/*
|
||||
* NOTE: do not do any material change of state between here and
|
||||
* return. All required changes in locktable state must have been
|
||||
* return. All required changes in locktable state must have been
|
||||
* done when the lock was granted to us --- see notes in WaitOnLock.
|
||||
*/
|
||||
|
||||
@@ -1032,7 +1032,7 @@ LockAcquireExtended(const LOCKTAG *locktag,
|
||||
{
|
||||
/*
|
||||
* Decode the locktag back to the original values, to avoid sending
|
||||
* lots of empty bytes with every message. See lock.h to check how a
|
||||
* lots of empty bytes with every message. See lock.h to check how a
|
||||
* locktag is defined for LOCKTAG_RELATION
|
||||
*/
|
||||
LogAccessExclusiveLock(locktag->locktag_field1,
|
||||
@@ -1289,7 +1289,7 @@ LockCheckConflicts(LockMethod lockMethodTable,
}

/*
* Rats. Something conflicts. But it could still be my own lock. We have
* Rats. Something conflicts. But it could still be my own lock. We have
* to construct a conflict mask that does not reflect our own locks, but
* only lock types held by other processes.
*/
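The mask manipulation this comment alludes to is a one-line bit operation: drop the requester's own modes from the granted set and test what remains against the conflict table. A rough sketch with illustrative parameter names (the real function works from the PROCLOCK hold masks):

    static bool
    sketch_conflicts_with_others(const int *conflictTab, int requestedMode,
                                 int grantedModes, int myHeldModes)
    {
        int         otherModes = grantedModes & ~myHeldModes;

        return (conflictTab[requestedMode] & otherModes) != 0;
    }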
@@ -1381,7 +1381,7 @@ UnGrantLock(LOCK *lock, LOCKMODE lockmode,
|
||||
|
||||
/*
|
||||
* We need only run ProcLockWakeup if the released lock conflicts with at
|
||||
* least one of the lock types requested by waiter(s). Otherwise whatever
|
||||
* least one of the lock types requested by waiter(s). Otherwise whatever
|
||||
* conflict made them wait must still exist. NOTE: before MVCC, we could
|
||||
* skip wakeup if lock->granted[lockmode] was still positive. But that's
|
||||
* not true anymore, because the remaining granted locks might belong to
|
||||
@@ -1401,7 +1401,7 @@ UnGrantLock(LOCK *lock, LOCKMODE lockmode,
|
||||
}
|
||||
|
||||
/*
|
||||
* CleanUpLock -- clean up after releasing a lock. We garbage-collect the
|
||||
* CleanUpLock -- clean up after releasing a lock. We garbage-collect the
|
||||
* proclock and lock objects if possible, and call ProcLockWakeup if there
|
||||
* are remaining requests and the caller says it's OK. (Normally, this
|
||||
* should be called after UnGrantLock, and wakeupNeeded is the result from
|
||||
@@ -1823,7 +1823,7 @@ LockRelease(const LOCKTAG *locktag, LOCKMODE lockmode, bool sessionLock)
|
||||
}
|
||||
|
||||
/*
|
||||
* Decrease the total local count. If we're still holding the lock, we're
|
||||
* Decrease the total local count. If we're still holding the lock, we're
|
||||
* done.
|
||||
*/
|
||||
locallock->nLocks--;
|
||||
@@ -1955,7 +1955,7 @@ LockReleaseAll(LOCKMETHODID lockmethodid, bool allLocks)
|
||||
#endif
|
||||
|
||||
/*
|
||||
* Get rid of our fast-path VXID lock, if appropriate. Note that this is
|
||||
* Get rid of our fast-path VXID lock, if appropriate. Note that this is
|
||||
* the only way that the lock we hold on our own VXID can ever get
|
||||
* released: it is always and only released when a toplevel transaction
|
||||
* ends.
|
||||
@@ -2042,7 +2042,7 @@ LockReleaseAll(LOCKMETHODID lockmethodid, bool allLocks)
|
||||
* fast-path data structures, we must acquire it before attempting
|
||||
* to release the lock via the fast-path. We will continue to
|
||||
* hold the LWLock until we're done scanning the locallock table,
|
||||
* unless we hit a transferred fast-path lock. (XXX is this
|
||||
* unless we hit a transferred fast-path lock. (XXX is this
|
||||
* really such a good idea? There could be a lot of entries ...)
|
||||
*/
|
||||
if (!have_fast_path_lwlock)
|
||||
@@ -2061,7 +2061,7 @@ LockReleaseAll(LOCKMETHODID lockmethodid, bool allLocks)
|
||||
|
||||
/*
|
||||
* Our lock, originally taken via the fast path, has been
|
||||
* transferred to the main lock table. That's going to require
|
||||
* transferred to the main lock table. That's going to require
|
||||
* some extra work, so release our fast-path lock before starting.
|
||||
*/
|
||||
LWLockRelease(MyProc->backendLock);
|
||||
@@ -2070,7 +2070,7 @@ LockReleaseAll(LOCKMETHODID lockmethodid, bool allLocks)
|
||||
/*
|
||||
* Now dump the lock. We haven't got a pointer to the LOCK or
|
||||
* PROCLOCK in this case, so we have to handle this a bit
|
||||
* differently than a normal lock release. Unfortunately, this
|
||||
* differently than a normal lock release. Unfortunately, this
|
||||
* requires an extra LWLock acquire-and-release cycle on the
|
||||
* partitionLock, but hopefully it shouldn't happen often.
|
||||
*/
|
||||
@@ -2505,9 +2505,9 @@ FastPathTransferRelationLocks(LockMethod lockMethodTable, const LOCKTAG *locktag
|
||||
* acquiring proc->backendLock. In particular, it's certainly safe to
|
||||
* assume that if the target backend holds any fast-path locks, it
|
||||
* must have performed a memory-fencing operation (in particular, an
|
||||
* LWLock acquisition) since setting proc->databaseId. However, it's
|
||||
* LWLock acquisition) since setting proc->databaseId. However, it's
|
||||
* less clear that our backend is certain to have performed a memory
|
||||
* fencing operation since the other backend set proc->databaseId. So
|
||||
* fencing operation since the other backend set proc->databaseId. So
|
||||
* for now, we test it after acquiring the LWLock just to be safe.
|
||||
*/
|
||||
if (proc->databaseId != locktag->locktag_field1)
|
||||
@@ -3021,7 +3021,7 @@ AtPrepare_Locks(void)
|
||||
continue;
|
||||
|
||||
/*
|
||||
* If we have both session- and transaction-level locks, fail. This
|
||||
* If we have both session- and transaction-level locks, fail. This
|
||||
* should never happen with regular locks, since we only take those at
|
||||
* session level in some special operations like VACUUM. It's
|
||||
* possible to hit this with advisory locks, though.
|
||||
@@ -3030,7 +3030,7 @@ AtPrepare_Locks(void)
|
||||
* the transactional hold to the prepared xact. However, that would
|
||||
* require two PROCLOCK objects, and we cannot be sure that another
|
||||
* PROCLOCK will be available when it comes time for PostPrepare_Locks
|
||||
* to do the deed. So for now, we error out while we can still do so
|
||||
* to do the deed. So for now, we error out while we can still do so
|
||||
* safely.
|
||||
*/
|
||||
if (haveSessionLock)
|
||||
@@ -3219,7 +3219,7 @@ PostPrepare_Locks(TransactionId xid)
|
||||
/*
|
||||
* We cannot simply modify proclock->tag.myProc to reassign
|
||||
* ownership of the lock, because that's part of the hash key and
|
||||
* the proclock would then be in the wrong hash chain. Instead
|
||||
* the proclock would then be in the wrong hash chain. Instead
|
||||
* use hash_update_hash_key. (We used to create a new hash entry,
|
||||
* but that risks out-of-memory failure if other processes are
|
||||
* busy making proclocks too.) We must unlink the proclock from
|
||||
@@ -3319,7 +3319,7 @@ GetLockStatusData(void)
|
||||
|
||||
/*
|
||||
* First, we iterate through the per-backend fast-path arrays, locking
|
||||
* them one at a time. This might produce an inconsistent picture of the
|
||||
* them one at a time. This might produce an inconsistent picture of the
|
||||
* system state, but taking all of those LWLocks at the same time seems
|
||||
* impractical (in particular, note MAX_SIMUL_LWLOCKS). It shouldn't
|
||||
* matter too much, because none of these locks can be involved in lock
|
||||
@@ -3398,7 +3398,7 @@ GetLockStatusData(void)
|
||||
* will be self-consistent.
|
||||
*
|
||||
* Since this is a read-only operation, we take shared instead of
|
||||
* exclusive lock. There's not a whole lot of point to this, because all
|
||||
* exclusive lock. There's not a whole lot of point to this, because all
|
||||
* the normal operations require exclusive lock, but it doesn't hurt
|
||||
* anything either. It will at least allow two backends to do
|
||||
* GetLockStatusData in parallel.
|
||||
@@ -3917,7 +3917,7 @@ lock_twophase_postabort(TransactionId xid, uint16 info,
|
||||
* as MyProc->lxid, you might wonder if we really need both. The
|
||||
* difference is that MyProc->lxid is set and cleared unlocked, and
|
||||
* examined by procarray.c, while fpLocalTransactionId is protected by
|
||||
* backendLock and is used only by the locking subsystem. Doing it this
|
||||
* backendLock and is used only by the locking subsystem. Doing it this
|
||||
* way makes it easier to verify that there are no funny race conditions.
|
||||
*
|
||||
* We don't bother recording this lock in the local lock table, since it's
|
||||
|
||||
@@ -6,7 +6,7 @@
|
||||
* Lightweight locks are intended primarily to provide mutual exclusion of
|
||||
* access to shared-memory data structures. Therefore, they offer both
|
||||
* exclusive and shared lock modes (to support read/write and read-only
|
||||
* access to a shared object). There are few other frammishes. User-level
|
||||
* access to a shared object). There are few other frammishes. User-level
|
||||
* locking should be done with the full lock manager --- which depends on
|
||||
* LWLocks to protect its shared state.
|
||||
*
|
||||
@@ -54,7 +54,7 @@ extern slock_t *ShmemLock;
|
||||
* to the current backend.
|
||||
*/
|
||||
static LWLockTranche **LWLockTrancheArray = NULL;
|
||||
static int LWLockTranchesAllocated = 0;
|
||||
static int LWLockTranchesAllocated = 0;
|
||||
|
||||
#define T_NAME(lock) \
|
||||
(LWLockTrancheArray[(lock)->tranche]->name)
|
||||
@@ -91,18 +91,18 @@ static bool LWLockAcquireCommon(LWLock *l, LWLockMode mode, uint64 *valptr,
|
||||
#ifdef LWLOCK_STATS
|
||||
typedef struct lwlock_stats_key
|
||||
{
|
||||
int tranche;
|
||||
int instance;
|
||||
} lwlock_stats_key;
|
||||
int tranche;
|
||||
int instance;
|
||||
} lwlock_stats_key;
|
||||
|
||||
typedef struct lwlock_stats
|
||||
{
|
||||
lwlock_stats_key key;
|
||||
int sh_acquire_count;
|
||||
int ex_acquire_count;
|
||||
int block_count;
|
||||
int spin_delay_count;
|
||||
} lwlock_stats;
|
||||
lwlock_stats_key key;
|
||||
int sh_acquire_count;
|
||||
int ex_acquire_count;
|
||||
int block_count;
|
||||
int spin_delay_count;
|
||||
} lwlock_stats;
|
||||
|
||||
static int counts_for_pid = 0;
|
||||
static HTAB *lwlock_stats_htab;
|
||||
@@ -173,7 +173,7 @@ print_lwlock_stats(int code, Datum arg)
|
||||
while ((lwstats = (lwlock_stats *) hash_seq_search(&scan)) != NULL)
|
||||
{
|
||||
fprintf(stderr,
|
||||
"PID %d lwlock %s %d: shacq %u exacq %u blk %u spindelay %u\n",
|
||||
"PID %d lwlock %s %d: shacq %u exacq %u blk %u spindelay %u\n",
|
||||
MyProcPid, LWLockTrancheArray[lwstats->key.tranche]->name,
|
||||
lwstats->key.instance, lwstats->sh_acquire_count,
|
||||
lwstats->ex_acquire_count, lwstats->block_count,
|
||||
@@ -186,9 +186,9 @@ print_lwlock_stats(int code, Datum arg)
|
||||
static lwlock_stats *
|
||||
get_lwlock_stats_entry(LWLock *lock)
|
||||
{
|
||||
lwlock_stats_key key;
|
||||
lwlock_stats_key key;
|
||||
lwlock_stats *lwstats;
|
||||
bool found;
|
||||
bool found;
|
||||
|
||||
/* Set up local count state first time through in a given process */
|
||||
if (counts_for_pid != MyProcPid)
|
||||
@@ -270,7 +270,7 @@ NumLWLocks(void)
|
||||
* a loadable module.
|
||||
*
|
||||
* This is only useful if called from the _PG_init hook of a library that
|
||||
* is loaded into the postmaster via shared_preload_libraries. Once
|
||||
* is loaded into the postmaster via shared_preload_libraries. Once
|
||||
* shared memory has been allocated, calls will be ignored. (We could
|
||||
* raise an error, but it seems better to make it a no-op, so that
|
||||
* libraries containing such calls can be reloaded if needed.)
|
||||
@@ -339,12 +339,12 @@ CreateLWLocks(void)
|
||||
* before the first LWLock. LWLockCounter[0] is the allocation
|
||||
* counter for lwlocks, LWLockCounter[1] is the maximum number that
|
||||
* can be allocated from the main array, and LWLockCounter[2] is the
|
||||
* allocation counter for tranches.
|
||||
* allocation counter for tranches.
|
||||
*/
|
||||
LWLockCounter = (int *) ((char *) MainLWLockArray - 3 * sizeof(int));
|
||||
LWLockCounter[0] = NUM_FIXED_LWLOCKS;
|
||||
LWLockCounter[1] = numLocks;
|
||||
LWLockCounter[2] = 1; /* 0 is the main array */
|
||||
LWLockCounter[2] = 1; /* 0 is the main array */
|
||||
}
|
||||
|
||||
if (LWLockTrancheArray == NULL)
|
||||
@@ -352,7 +352,7 @@ CreateLWLocks(void)
|
||||
LWLockTranchesAllocated = 16;
|
||||
LWLockTrancheArray = (LWLockTranche **)
|
||||
MemoryContextAlloc(TopMemoryContext,
|
||||
LWLockTranchesAllocated * sizeof(LWLockTranche *));
|
||||
LWLockTranchesAllocated * sizeof(LWLockTranche *));
|
||||
}
|
||||
|
||||
MainLWLockTranche.name = "main";
|
||||
@@ -422,7 +422,7 @@ LWLockRegisterTranche(int tranche_id, LWLockTranche *tranche)
|
||||
|
||||
if (tranche_id >= LWLockTranchesAllocated)
|
||||
{
|
||||
int i = LWLockTranchesAllocated;
|
||||
int i = LWLockTranchesAllocated;
|
||||
|
||||
while (i <= tranche_id)
|
||||
i *= 2;
|
||||
@@ -534,7 +534,7 @@ LWLockAcquireCommon(LWLock *l, LWLockMode mode, uint64 *valptr, uint64 val)
|
||||
* in the presence of contention. The efficiency of being able to do that
|
||||
* outweighs the inefficiency of sometimes wasting a process dispatch
|
||||
* cycle because the lock is not free when a released waiter finally gets
|
||||
* to run. See pgsql-hackers archives for 29-Dec-01.
|
||||
* to run. See pgsql-hackers archives for 29-Dec-01.
|
||||
*/
|
||||
for (;;)
|
||||
{
|
||||
@@ -731,7 +731,7 @@ LWLockConditionalAcquire(LWLock *l, LWLockMode mode)
/*
* LWLockAcquireOrWait - Acquire lock, or wait until it's free
*
* The semantics of this function are a bit funky. If the lock is currently
* The semantics of this function are a bit funky. If the lock is currently
* free, it is acquired in the given mode, and the function returns true. If
* the lock isn't immediately free, the function waits until it is released
* and returns false, but does not acquire the lock.
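A typical caller pattern for these semantics is "whoever gets the lock does the job, everyone else just waits for it to be finished", which is essentially how WAL flushing uses it. Sketch, with a placeholder lock name and placeholder work functions:

    if (LWLockAcquireOrWait(SomeSharedLock, LW_EXCLUSIVE))
    {
        /* We got the lock: do the shared work ourselves. */
        sketch_do_the_work();
        LWLockRelease(SomeSharedLock);
    }
    else
    {
        /*
         * Someone else held the lock and has since released it; the work we
         * wanted is probably already done, so recheck rather than redo it.
         */
        sketch_recheck_result();
    }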
@@ -920,8 +920,8 @@ LWLockWaitForVar(LWLock *l, uint64 *valptr, uint64 oldval, uint64 *newval)
|
||||
return true;
|
||||
|
||||
/*
|
||||
* Lock out cancel/die interrupts while we sleep on the lock. There is
|
||||
* no cleanup mechanism to remove us from the wait queue if we got
|
||||
* Lock out cancel/die interrupts while we sleep on the lock. There is no
|
||||
* cleanup mechanism to remove us from the wait queue if we got
|
||||
* interrupted.
|
||||
*/
|
||||
HOLD_INTERRUPTS();
|
||||
|
||||
@@ -32,11 +32,11 @@
|
||||
* examining the MVCC data.)
|
||||
*
|
||||
* (1) Besides tuples actually read, they must cover ranges of tuples
|
||||
* which would have been read based on the predicate. This will
|
||||
* which would have been read based on the predicate. This will
|
||||
* require modelling the predicates through locks against database
|
||||
* objects such as pages, index ranges, or entire tables.
|
||||
*
|
||||
* (2) They must be kept in RAM for quick access. Because of this, it
|
||||
* (2) They must be kept in RAM for quick access. Because of this, it
|
||||
* isn't possible to always maintain tuple-level granularity -- when
|
||||
* the space allocated to store these approaches exhaustion, a
|
||||
* request for a lock may need to scan for situations where a single
|
||||
@@ -49,7 +49,7 @@
|
||||
*
|
||||
* (4) While they are associated with a transaction, they must survive
|
||||
* a successful COMMIT of that transaction, and remain until all
|
||||
* overlapping transactions complete. This even means that they
|
||||
* overlapping transactions complete. This even means that they
|
||||
* must survive termination of the transaction's process. If a
|
||||
* top level transaction is rolled back, however, it is immediately
|
||||
* flagged so that it can be ignored, and its SIREAD locks can be
|
||||
@@ -90,7 +90,7 @@
|
||||
* may yet matter because they overlap still-active transactions.
|
||||
*
|
||||
* SerializablePredicateLockListLock
|
||||
* - Protects the linked list of locks held by a transaction. Note
|
||||
* - Protects the linked list of locks held by a transaction. Note
|
||||
* that the locks themselves are also covered by the partition
|
||||
* locks of their respective lock targets; this lock only affects
|
||||
* the linked list connecting the locks related to a transaction.
|
||||
@@ -101,11 +101,11 @@
|
||||
* - It is relatively infrequent that another process needs to
|
||||
* modify the list for a transaction, but it does happen for such
|
||||
* things as index page splits for pages with predicate locks and
|
||||
* freeing of predicate locked pages by a vacuum process. When
|
||||
* freeing of predicate locked pages by a vacuum process. When
|
||||
* removing a lock in such cases, the lock itself contains the
|
||||
* pointers needed to remove it from the list. When adding a
|
||||
* lock in such cases, the lock can be added using the anchor in
|
||||
* the transaction structure. Neither requires walking the list.
|
||||
* the transaction structure. Neither requires walking the list.
|
||||
* - Cleaning up the list for a terminated transaction is sometimes
|
||||
* not done on a retail basis, in which case no lock is required.
|
||||
* - Due to the above, a process accessing its active transaction's
|
||||
@@ -355,7 +355,7 @@ int max_predicate_locks_per_xact; /* set by guc.c */
|
||||
|
||||
/*
|
||||
* This provides a list of objects in order to track transactions
|
||||
* participating in predicate locking. Entries in the list are fixed size,
|
||||
* participating in predicate locking. Entries in the list are fixed size,
|
||||
* and reside in shared memory. The memory address of an entry must remain
|
||||
* fixed during its lifetime. The list will be protected from concurrent
|
||||
* update externally; no provision is made in this code to manage that. The
|
||||
@@ -547,7 +547,7 @@ SerializationNeededForWrite(Relation relation)
|
||||
|
||||
/*
|
||||
* These functions are a simple implementation of a list for this specific
|
||||
* type of struct. If there is ever a generalized shared memory list, we
|
||||
* type of struct. If there is ever a generalized shared memory list, we
|
||||
* should probably switch to that.
|
||||
*/
|
||||
static SERIALIZABLEXACT *
|
||||
@@ -767,7 +767,7 @@ OldSerXidPagePrecedesLogically(int p, int q)
int diff;

/*
* We have to compare modulo (OLDSERXID_MAX_PAGE+1)/2. Both inputs should
* We have to compare modulo (OLDSERXID_MAX_PAGE+1)/2. Both inputs should
* be in the range 0..OLDSERXID_MAX_PAGE.
*/
Assert(p >= 0 && p <= OLDSERXID_MAX_PAGE);
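The modulo comparison can be written as a wraparound-aware difference: fold p - q into a window of width OLDSERXID_MAX_PAGE + 1 centred on zero, then test its sign. A sketch of that idea (the real function's exact arithmetic may differ in detail):

    static bool
    sketch_page_precedes(int p, int q)
    {
        int         range = OLDSERXID_MAX_PAGE + 1;
        int         diff = p - q;

        /* Fold the difference into roughly -range/2 .. +range/2. */
        if (diff >= range / 2)
            diff -= range;
        else if (diff < -(range / 2))
            diff += range;

        return diff < 0;
    }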
@@ -929,7 +929,7 @@ OldSerXidAdd(TransactionId xid, SerCommitSeqNo minConflictCommitSeqNo)
|
||||
}
|
||||
|
||||
/*
|
||||
* Get the minimum commitSeqNo for any conflict out for the given xid. For
|
||||
* Get the minimum commitSeqNo for any conflict out for the given xid. For
|
||||
* a transaction which exists but has no conflict out, InvalidSerCommitSeqNo
|
||||
* will be returned.
|
||||
*/
|
||||
@@ -982,7 +982,7 @@ OldSerXidSetActiveSerXmin(TransactionId xid)
|
||||
/*
|
||||
* When no sxacts are active, nothing overlaps, set the xid values to
|
||||
* invalid to show that there are no valid entries. Don't clear headPage,
|
||||
* though. A new xmin might still land on that page, and we don't want to
|
||||
* though. A new xmin might still land on that page, and we don't want to
|
||||
* repeatedly zero out the same page.
|
||||
*/
|
||||
if (!TransactionIdIsValid(xid))
|
||||
@@ -1467,7 +1467,7 @@ SummarizeOldestCommittedSxact(void)
|
||||
|
||||
/*
|
||||
* Grab the first sxact off the finished list -- this will be the earliest
|
||||
* commit. Remove it from the list.
|
||||
* commit. Remove it from the list.
|
||||
*/
|
||||
sxact = (SERIALIZABLEXACT *)
|
||||
SHMQueueNext(FinishedSerializableTransactions,
|
||||
@@ -1620,7 +1620,7 @@ SetSerializableTransactionSnapshot(Snapshot snapshot,
|
||||
/*
|
||||
* We do not allow SERIALIZABLE READ ONLY DEFERRABLE transactions to
|
||||
* import snapshots, since there's no way to wait for a safe snapshot when
|
||||
* we're using the snap we're told to. (XXX instead of throwing an error,
|
||||
* we're using the snap we're told to. (XXX instead of throwing an error,
|
||||
* we could just ignore the XactDeferrable flag?)
|
||||
*/
|
||||
if (XactReadOnly && XactDeferrable)
|
||||
@@ -1669,7 +1669,7 @@ GetSerializableTransactionSnapshotInt(Snapshot snapshot,
|
||||
* release SerializableXactHashLock to call SummarizeOldestCommittedSxact,
|
||||
* this means we have to create the sxact first, which is a bit annoying
|
||||
* (in particular, an elog(ERROR) in procarray.c would cause us to leak
|
||||
* the sxact). Consider refactoring to avoid this.
|
||||
* the sxact). Consider refactoring to avoid this.
|
||||
*/
|
||||
#ifdef TEST_OLDSERXID
|
||||
SummarizeOldestCommittedSxact();
|
||||
@@ -2051,7 +2051,7 @@ RemoveTargetIfNoLongerUsed(PREDICATELOCKTARGET *target, uint32 targettaghash)
|
||||
/*
|
||||
* Delete child target locks owned by this process.
|
||||
* This implementation is assuming that the usage of each target tag field
|
||||
* is uniform. No need to make this hard if we don't have to.
|
||||
* is uniform. No need to make this hard if we don't have to.
|
||||
*
|
||||
* We aren't acquiring lightweight locks for the predicate lock or lock
|
||||
* target structures associated with this transaction unless we're going
|
||||
@@ -2092,7 +2092,7 @@ DeleteChildTargetLocks(const PREDICATELOCKTARGETTAG *newtargettag)
|
||||
if (TargetTagIsCoveredBy(oldtargettag, *newtargettag))
|
||||
{
|
||||
uint32 oldtargettaghash;
|
||||
LWLock *partitionLock;
|
||||
LWLock *partitionLock;
|
||||
PREDICATELOCK *rmpredlock PG_USED_FOR_ASSERTS_ONLY;
|
||||
|
||||
oldtargettaghash = PredicateLockTargetTagHashCode(&oldtargettag);
|
||||
@@ -2497,7 +2497,7 @@ PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot)
|
||||
}
|
||||
|
||||
/*
|
||||
* Do quick-but-not-definitive test for a relation lock first. This will
|
||||
* Do quick-but-not-definitive test for a relation lock first. This will
|
||||
* never cause a return when the relation is *not* locked, but will
|
||||
* occasionally let the check continue when there really *is* a relation
|
||||
* level lock.
|
||||
@@ -2809,7 +2809,7 @@ exit:
|
||||
* transaction which is not serializable.
|
||||
*
|
||||
* NOTE: This is currently only called with transfer set to true, but that may
|
||||
* change. If we decide to clean up the locks from a table on commit of a
|
||||
* change. If we decide to clean up the locks from a table on commit of a
|
||||
* transaction which executed DROP TABLE, the false condition will be useful.
|
||||
*/
|
||||
static void
|
||||
@@ -2890,7 +2890,7 @@ DropAllPredicateLocksFromTable(Relation relation, bool transfer)
|
||||
continue; /* already the right lock */
|
||||
|
||||
/*
|
||||
* If we made it here, we have work to do. We make sure the heap
|
||||
* If we made it here, we have work to do. We make sure the heap
|
||||
* relation lock exists, then we walk the list of predicate locks for
|
||||
* the old target we found, moving all locks to the heap relation lock
|
||||
* -- unless they already hold that.
|
||||
@@ -3338,7 +3338,7 @@ ReleasePredicateLocks(bool isCommit)
|
||||
}
|
||||
|
||||
/*
|
||||
* Release all outConflicts to committed transactions. If we're rolling
|
||||
* Release all outConflicts to committed transactions. If we're rolling
|
||||
* back clear them all. Set SXACT_FLAG_CONFLICT_OUT if any point to
|
||||
* previously committed transactions.
|
||||
*/
|
||||
@@ -3657,7 +3657,7 @@ ClearOldPredicateLocks(void)
|
||||
* matter -- but keep the transaction entry itself and any outConflicts.
|
||||
*
|
||||
* When the summarize flag is set, we've run short of room for sxact data
|
||||
* and must summarize to the SLRU. Predicate locks are transferred to a
|
||||
* and must summarize to the SLRU. Predicate locks are transferred to a
|
||||
* dummy "old" transaction, with duplicate locks on a single target
|
||||
* collapsing to a single lock with the "latest" commitSeqNo from among
|
||||
* the conflicting locks..
|
||||
@@ -3850,7 +3850,7 @@ XidIsConcurrent(TransactionId xid)
|
||||
/*
|
||||
* CheckForSerializableConflictOut
|
||||
* We are reading a tuple which has been modified. If it is visible to
|
||||
* us but has been deleted, that indicates a rw-conflict out. If it's
|
||||
* us but has been deleted, that indicates a rw-conflict out. If it's
|
||||
* not visible and was created by a concurrent (overlapping)
|
||||
* serializable transaction, that is also a rw-conflict out,
|
||||
*
|
||||
@@ -3937,7 +3937,7 @@ CheckForSerializableConflictOut(bool visible, Relation relation,
|
||||
Assert(TransactionIdFollowsOrEquals(xid, TransactionXmin));
|
||||
|
||||
/*
|
||||
* Find top level xid. Bail out if xid is too early to be a conflict, or
|
||||
* Find top level xid. Bail out if xid is too early to be a conflict, or
|
||||
* if it's our own xid.
|
||||
*/
|
||||
if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
|
||||
@@ -4002,7 +4002,7 @@ CheckForSerializableConflictOut(bool visible, Relation relation,
|
||||
|
||||
/*
|
||||
* We have a conflict out to a transaction which has a conflict out to a
|
||||
* summarized transaction. That summarized transaction must have
|
||||
* summarized transaction. That summarized transaction must have
|
||||
* committed first, and we can't tell when it committed in relation to our
|
||||
* snapshot acquisition, so something needs to be canceled.
|
||||
*/
|
||||
@@ -4036,7 +4036,7 @@ CheckForSerializableConflictOut(bool visible, Relation relation,
|
||||
&& (!SxactHasConflictOut(sxact)
|
||||
|| MySerializableXact->SeqNo.lastCommitBeforeSnapshot < sxact->SeqNo.earliestOutConflictCommit))
|
||||
{
|
||||
/* Read-only transaction will appear to run first. No conflict. */
|
||||
/* Read-only transaction will appear to run first. No conflict. */
|
||||
LWLockRelease(SerializableXactHashLock);
|
||||
return;
|
||||
}
|
||||
@@ -4282,8 +4282,8 @@ CheckForSerializableConflictIn(Relation relation, HeapTuple tuple,
|
||||
SET_PREDICATELOCKTARGETTAG_TUPLE(targettag,
|
||||
relation->rd_node.dbNode,
|
||||
relation->rd_id,
|
||||
ItemPointerGetBlockNumber(&(tuple->t_self)),
|
||||
ItemPointerGetOffsetNumber(&(tuple->t_self)));
|
||||
ItemPointerGetBlockNumber(&(tuple->t_self)),
|
||||
ItemPointerGetOffsetNumber(&(tuple->t_self)));
|
||||
CheckTargetForConflictsIn(&targettag);
|
||||
}
|
||||
|
||||
@@ -4627,7 +4627,7 @@ OnConflict_CheckForSerializationFailure(const SERIALIZABLEXACT *reader,
|
||||
*
|
||||
* If a dangerous structure is found, the pivot (the near conflict) is
|
||||
* marked for death, because rolling back another transaction might mean
|
||||
* that we flail without ever making progress. This transaction is
|
||||
* that we flail without ever making progress. This transaction is
|
||||
* committing writes, so letting it commit ensures progress. If we
|
||||
* canceled the far conflict, it might immediately fail again on retry.
|
||||
*/
|
||||
|
||||
@@ -229,10 +229,10 @@ InitProcGlobal(void)
|
||||
|
||||
/*
|
||||
* Newly created PGPROCs for normal backends, autovacuum and bgworkers
|
||||
* must be queued up on the appropriate free list. Because there can
|
||||
* must be queued up on the appropriate free list. Because there can
|
||||
* only ever be a small, fixed number of auxiliary processes, no free
|
||||
* list is used in that case; InitAuxiliaryProcess() instead uses a
|
||||
* linear search. PGPROCs for prepared transactions are added to a
|
||||
* linear search. PGPROCs for prepared transactions are added to a
|
||||
* free list by TwoPhaseShmemInit().
|
||||
*/
|
||||
if (i < MaxConnections)
|
||||
@@ -291,7 +291,7 @@ InitProcess(void)
|
||||
elog(ERROR, "you already exist");
|
||||
|
||||
/*
|
||||
* Initialize process-local latch support. This could fail if the kernel
|
||||
* Initialize process-local latch support. This could fail if the kernel
|
||||
* is low on resources, and if so we want to exit cleanly before acquiring
|
||||
* any shared-memory resources.
|
||||
*/
|
||||
@@ -400,7 +400,7 @@ InitProcess(void)
|
||||
|
||||
/*
|
||||
* We might be reusing a semaphore that belonged to a failed process. So
|
||||
* be careful and reinitialize its value here. (This is not strictly
|
||||
* be careful and reinitialize its value here. (This is not strictly
|
||||
* necessary anymore, but seems like a good idea for cleanliness.)
|
||||
*/
|
||||
PGSemaphoreReset(&MyProc->sem);
|
||||
@@ -450,7 +450,7 @@ InitProcessPhase2(void)
|
||||
*
|
||||
* Auxiliary processes are presently not expected to wait for real (lockmgr)
|
||||
* locks, so we need not set up the deadlock checker. They are never added
|
||||
* to the ProcArray or the sinval messaging mechanism, either. They also
|
||||
* to the ProcArray or the sinval messaging mechanism, either. They also
|
||||
* don't get a VXID assigned, since this is only useful when we actually
|
||||
* hold lockmgr locks.
|
||||
*
|
||||
@@ -476,7 +476,7 @@ InitAuxiliaryProcess(void)
|
||||
elog(ERROR, "you already exist");
|
||||
|
||||
/*
|
||||
* Initialize process-local latch support. This could fail if the kernel
|
||||
* Initialize process-local latch support. This could fail if the kernel
|
||||
* is low on resources, and if so we want to exit cleanly before acquiring
|
||||
* any shared-memory resources.
|
||||
*/
|
||||
@@ -557,7 +557,7 @@ InitAuxiliaryProcess(void)
|
||||
|
||||
/*
|
||||
* We might be reusing a semaphore that belonged to a failed process. So
|
||||
* be careful and reinitialize its value here. (This is not strictly
|
||||
* be careful and reinitialize its value here. (This is not strictly
|
||||
* necessary anymore, but seems like a good idea for cleanliness.)
|
||||
*/
|
||||
PGSemaphoreReset(&MyProc->sem);
|
||||
@@ -715,7 +715,7 @@ LockErrorCleanup(void)
|
||||
|
||||
/*
|
||||
* We used to do PGSemaphoreReset() here to ensure that our proc's wait
|
||||
* semaphore is reset to zero. This prevented a leftover wakeup signal
|
||||
* semaphore is reset to zero. This prevented a leftover wakeup signal
|
||||
* from remaining in the semaphore if someone else had granted us the lock
|
||||
* we wanted before we were able to remove ourselves from the wait-list.
|
||||
* However, now that ProcSleep loops until waitStatus changes, a leftover
|
||||
@@ -851,7 +851,7 @@ ProcKill(int code, Datum arg)
|
||||
|
||||
/*
|
||||
* AuxiliaryProcKill() -- Cut-down version of ProcKill for auxiliary
|
||||
* processes (bgwriter, etc). The PGPROC and sema are not released, only
|
||||
* processes (bgwriter, etc). The PGPROC and sema are not released, only
|
||||
* marked as not-in-use.
|
||||
*/
|
||||
static void
|
||||
@@ -977,7 +977,7 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
|
||||
*
|
||||
* Special case: if I find I should go in front of some waiter, check to
|
||||
* see if I conflict with already-held locks or the requests before that
|
||||
* waiter. If not, then just grant myself the requested lock immediately.
|
||||
* waiter. If not, then just grant myself the requested lock immediately.
|
||||
* This is the same as the test for immediate grant in LockAcquire, except
|
||||
* we are only considering the part of the wait queue before my insertion
|
||||
* point.
|
||||
@@ -996,7 +996,7 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
|
||||
if (lockMethodTable->conflictTab[lockmode] & proc->heldLocks)
|
||||
{
|
||||
/*
|
||||
* Yes, so we have a deadlock. Easiest way to clean up
|
||||
* Yes, so we have a deadlock. Easiest way to clean up
|
||||
* correctly is to call RemoveFromWaitQueue(), but we
|
||||
* can't do that until we are *on* the wait queue. So, set
|
||||
* a flag to check below, and break out of loop. Also,
|
||||
@@ -1117,8 +1117,8 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
|
||||
|
||||
/*
|
||||
* If someone wakes us between LWLockRelease and PGSemaphoreLock,
|
||||
* PGSemaphoreLock will not block. The wakeup is "saved" by the semaphore
|
||||
* implementation. While this is normally good, there are cases where a
|
||||
* PGSemaphoreLock will not block. The wakeup is "saved" by the semaphore
|
||||
* implementation. While this is normally good, there are cases where a
|
||||
* saved wakeup might be leftover from a previous operation (for example,
|
||||
* we aborted ProcWaitForSignal just before someone did ProcSendSignal).
|
||||
* So, loop to wait again if the waitStatus shows we haven't been granted
|
||||
@@ -1138,7 +1138,7 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
|
||||
|
||||
/*
|
||||
* waitStatus could change from STATUS_WAITING to something else
|
||||
* asynchronously. Read it just once per loop to prevent surprising
|
||||
* asynchronously. Read it just once per loop to prevent surprising
|
||||
* behavior (such as missing log messages).
|
||||
*/
|
||||
myWaitStatus = MyProc->waitStatus;
|
||||
@@ -1623,10 +1623,10 @@ check_done:
|
||||
* This can share the semaphore normally used for waiting for locks,
|
||||
* since a backend could never be waiting for a lock and a signal at
|
||||
* the same time. As with locks, it's OK if the signal arrives just
|
||||
* before we actually reach the waiting state. Also as with locks,
|
||||
* before we actually reach the waiting state. Also as with locks,
|
||||
* it's necessary that the caller be robust against bogus wakeups:
|
||||
* always check that the desired state has occurred, and wait again
|
||||
* if not. This copes with possible "leftover" wakeups.
|
||||
* if not. This copes with possible "leftover" wakeups.
|
||||
*/
|
||||
void
|
||||
ProcWaitForSignal(void)
|
||||
|
||||
@@ -79,7 +79,7 @@ s_lock(volatile slock_t *lock, const char *file, int line)
*
* We time out and declare error after NUM_DELAYS delays (thus, exactly
* that many tries). With the given settings, this will usually take 2 or
* so minutes. It seems better to fix the total number of tries (and thus
* so minutes. It seems better to fix the total number of tries (and thus
* the probability of unintended failure) than to fix the total time
* spent.
*/
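The policy described here, capping the number of tries rather than the elapsed time, gives a loop of roughly the following shape. The constants and helper are illustrative; the real values and the TAS-based test live in s_lock.c:

    #define SKETCH_SPINS_PER_DELAY  100     /* spins between sleeps (illustrative) */
    #define SKETCH_NUM_DELAYS       1000    /* give up after this many sleeps */

    static void
    sketch_spin_until_free(volatile int *lock)
    {
        int         spins = 0;
        int         delays = 0;

        while (*lock != 0)              /* the real code retries TAS here */
        {
            if (++spins >= SKETCH_SPINS_PER_DELAY)
            {
                if (++delays >= SKETCH_NUM_DELAYS)
                    elog(PANIC, "stuck spinlock detected");
                pg_usleep(1000L);       /* the real code varies this delay */
                spins = 0;
            }
        }
    }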
@@ -137,7 +137,7 @@ s_lock(volatile slock_t *lock, const char *file, int line)
|
||||
* Note: spins_per_delay is local within our current process. We want to
|
||||
* average these observations across multiple backends, since it's
|
||||
* relatively rare for this function to even get entered, and so a single
|
||||
* backend might not live long enough to converge on a good value. That
|
||||
* backend might not live long enough to converge on a good value. That
|
||||
* is handled by the two routines below.
|
||||
*/
|
||||
if (cur_delay == 0)
|
||||
@@ -177,7 +177,7 @@ update_spins_per_delay(int shared_spins_per_delay)
/*
* We use an exponential moving average with a relatively slow adaption
* rate, so that noise in any one backend's result won't affect the shared
* value too much. As long as both inputs are within the allowed range,
* value too much. As long as both inputs are within the allowed range,
* the result must be too, so we need not worry about clamping the result.
*
* We deliberately truncate rather than rounding; this is so that single
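The moving average in question blends the stored shared value with this backend's latest observation, weighting the old value heavily, and relies on truncating integer division so that one-off high readings decay. A sketch using an illustrative 7-to-1 weighting (the actual constant is defined in s_lock.c):

    static int
    sketch_update_spins_per_delay(int shared_spins_per_delay, int observed)
    {
        /* 7 parts old value, 1 part new observation; division truncates */
        return (shared_spins_per_delay * 7 + observed) / 8;
    }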
@@ -5,7 +5,7 @@
|
||||
*
|
||||
*
|
||||
* For machines that have test-and-set (TAS) instructions, s_lock.h/.c
|
||||
* define the spinlock implementation. This file contains only a stub
|
||||
* define the spinlock implementation. This file contains only a stub
|
||||
* implementation for spinlocks using PGSemaphores. Unless semaphores
|
||||
* are implemented in a way that doesn't involve a kernel call, this
|
||||
* is too slow to be very useful :-(
|
||||
@@ -74,7 +74,7 @@ SpinlockSemas(void)
|
||||
extern void
|
||||
SpinlockSemaInit(PGSemaphore spinsemas)
|
||||
{
|
||||
int i;
|
||||
int i;
|
||||
|
||||
for (i = 0; i < NUM_SPINLOCK_SEMAPHORES; ++i)
|
||||
PGSemaphoreCreate(&spinsemas[i]);
|
||||
@@ -88,7 +88,7 @@ SpinlockSemaInit(PGSemaphore spinsemas)
|
||||
void
|
||||
s_init_lock_sema(volatile slock_t *lock)
|
||||
{
|
||||
static int counter = 0;
|
||||
static int counter = 0;
|
||||
|
||||
*lock = (++counter) % NUM_SPINLOCK_SEMAPHORES;
|
||||
}
|
||||
|
||||
@@ -63,7 +63,7 @@ PageInit(Page page, Size pageSize, Size specialSize)
|
||||
* PageIsVerified
|
||||
* Check that the page header and checksum (if any) appear valid.
|
||||
*
|
||||
* This is called when a page has just been read in from disk. The idea is
|
||||
* This is called when a page has just been read in from disk. The idea is
|
||||
* to cheaply detect trashed pages before we go nuts following bogus item
|
||||
* pointers, testing invalid transaction identifiers, etc.
|
||||
*
|
||||
@@ -155,7 +155,7 @@ PageIsVerified(Page page, BlockNumber blkno)
|
||||
/*
|
||||
* PageAddItem
|
||||
*
|
||||
* Add an item to a page. Return value is offset at which it was
|
||||
* Add an item to a page. Return value is offset at which it was
|
||||
* inserted, or InvalidOffsetNumber if there's not room to insert.
|
||||
*
|
||||
* If overwrite is true, we just store the item at the specified
|
||||
@@ -769,7 +769,7 @@ PageIndexTupleDelete(Page page, OffsetNumber offnum)
|
||||
* PageIndexMultiDelete
|
||||
*
|
||||
* This routine handles the case of deleting multiple tuples from an
|
||||
* index page at once. It is considerably faster than a loop around
|
||||
* index page at once. It is considerably faster than a loop around
|
||||
* PageIndexTupleDelete ... however, the caller *must* supply the array
|
||||
* of item numbers to be deleted in item number order!
|
||||
*/
|
||||
@@ -780,7 +780,7 @@ PageIndexMultiDelete(Page page, OffsetNumber *itemnos, int nitems)
|
||||
Offset pd_lower = phdr->pd_lower;
|
||||
Offset pd_upper = phdr->pd_upper;
|
||||
Offset pd_special = phdr->pd_special;
|
||||
itemIdSortData itemidbase[MaxIndexTuplesPerPage];
|
||||
itemIdSortData itemidbase[MaxIndexTuplesPerPage];
|
||||
itemIdSort itemidptr;
|
||||
ItemId lp;
|
||||
int nline,
|
||||
@@ -903,7 +903,7 @@ PageIndexMultiDelete(Page page, OffsetNumber *itemnos, int nitems)
|
||||
* If checksums are disabled, or if the page is not initialized, just return
|
||||
* the input. Otherwise, we must make a copy of the page before calculating
|
||||
* the checksum, to prevent concurrent modifications (e.g. setting hint bits)
|
||||
* from making the final checksum invalid. It doesn't matter if we include or
|
||||
* from making the final checksum invalid. It doesn't matter if we include or
|
||||
* exclude hints during the copy, as long as we write a valid page and
|
||||
* associated checksum.
|
||||
*
|
||||
|
||||
@@ -86,7 +86,7 @@
|
||||
* not needed because of an mdtruncate() operation. The reason for leaving
|
||||
* them present at size zero, rather than unlinking them, is that other
|
||||
* backends and/or the checkpointer might be holding open file references to
|
||||
* such segments. If the relation expands again after mdtruncate(), such
|
||||
* such segments. If the relation expands again after mdtruncate(), such
|
||||
* that a deactivated segment becomes active again, it is important that
|
||||
* such file references still be valid --- else data might get written
|
||||
* out to an unlinked old copy of a segment file that will eventually
|
||||
@@ -123,7 +123,7 @@ static MemoryContext MdCxt; /* context for all md.c allocations */
* we keep track of pending fsync operations: we need to remember all relation
* segments that have been written since the last checkpoint, so that we can
* fsync them down to disk before completing the next checkpoint. This hash
* table remembers the pending operations. We use a hash table mostly as
* table remembers the pending operations. We use a hash table mostly as
* a convenient way of merging duplicate requests.
*
* We use a similar mechanism to remember no-longer-needed files that can
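Merging duplicate requests falls out of the data structure: the hash key identifies the relation, and each entry keeps a set of segment numbers per fork, so requesting the same segment twice just sets an already-set bit. A sketch with a simplified entry type (the real md.c entry is keyed on RelFileNode and holds one Bitmapset per fork, much as below):

    #include "nodes/bitmapset.h"
    #include "storage/relfilenode.h"

    typedef struct
    {
        RelFileNode rnode;                      /* hash key: which relation */
        Bitmapset  *requests[MAX_FORKNUM + 1];  /* segments needing fsync */
    } SketchPendingOps;

    /*
     * Remember that segment 'segno' of fork 'forknum' must be fsync'd before
     * the next checkpoint; duplicate calls collapse into one pending bit.
     */
    static void
    sketch_remember_fsync(SketchPendingOps *entry, ForkNumber forknum, int segno)
    {
        entry->requests[forknum] = bms_add_member(entry->requests[forknum], segno);
    }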
@@ -291,7 +291,7 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
|
||||
* During bootstrap, there are cases where a system relation will be
|
||||
* accessed (by internal backend processes) before the bootstrap
|
||||
* script nominally creates it. Therefore, allow the file to exist
|
||||
* already, even if isRedo is not set. (See also mdopen)
|
||||
* already, even if isRedo is not set. (See also mdopen)
|
||||
*/
|
||||
if (isRedo || IsBootstrapProcessingMode())
|
||||
fd = PathNameOpenFile(path, O_RDWR | PG_BINARY, 0600);
|
||||
@@ -336,7 +336,7 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
|
||||
* if the contents of the file were repopulated by subsequent WAL entries.
|
||||
* But if we didn't WAL-log insertions, but instead relied on fsyncing the
|
||||
* file after populating it (as for instance CLUSTER and CREATE INDEX do),
|
||||
* the contents of the file would be lost forever. By leaving the empty file
|
||||
* the contents of the file would be lost forever. By leaving the empty file
|
||||
* until after the next checkpoint, we prevent reassignment of the relfilenode
|
||||
* number until it's safe, because relfilenode assignment skips over any
|
||||
* existing file.
|
||||
@@ -349,7 +349,7 @@ mdcreate(SMgrRelation reln, ForkNumber forkNum, bool isRedo)
|
||||
*
|
||||
* All the above applies only to the relation's main fork; other forks can
|
||||
* just be removed immediately, since they are not needed to prevent the
|
||||
* relfilenode number from being recycled. Also, we do not carefully
|
||||
* relfilenode number from being recycled. Also, we do not carefully
|
||||
* track whether other forks have been created or not, but just attempt to
|
||||
* unlink them unconditionally; so we should never complain about ENOENT.
|
||||
*
|
||||
@@ -366,7 +366,7 @@ mdunlink(RelFileNodeBackend rnode, ForkNumber forkNum, bool isRedo)
|
||||
{
|
||||
/*
|
||||
* We have to clean out any pending fsync requests for the doomed
|
||||
* relation, else the next mdsync() will fail. There can't be any such
|
||||
* relation, else the next mdsync() will fail. There can't be any such
|
||||
* requests for a temp relation, though. We can send just one request
|
||||
* even when deleting multiple forks, since the fsync queuing code accepts
|
||||
* the "InvalidForkNumber = all forks" convention.
|
||||
@@ -503,7 +503,7 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
|
||||
/*
|
||||
* Note: because caller usually obtained blocknum by calling mdnblocks,
|
||||
* which did a seek(SEEK_END), this seek is often redundant and will be
|
||||
* optimized away by fd.c. It's not redundant, however, if there is a
|
||||
* optimized away by fd.c. It's not redundant, however, if there is a
|
||||
* partial page at the end of the file. In that case we want to try to
|
||||
* overwrite the partial page with a full page. It's also not redundant
|
||||
* if bufmgr.c had to dump another buffer of the same file to make room
|
||||
@@ -803,9 +803,9 @@ mdnblocks(SMgrRelation reln, ForkNumber forknum)
* exactly RELSEG_SIZE long, and it's useless to recheck that each time.
*
* NOTE: this assumption could only be wrong if another backend has
* truncated the relation. We rely on higher code levels to handle that
* truncated the relation. We rely on higher code levels to handle that
* scenario by closing and re-opening the md fd, which is handled via
* relcache flush. (Since the checkpointer doesn't participate in
* relcache flush. (Since the checkpointer doesn't participate in
* relcache flush, it could have segment chain entries for inactive
* segments; that's OK because the checkpointer never needs to compute
* relation size.)
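The assumption this hunk's comment leans on, that every segment before the last is exactly RELSEG_SIZE blocks, is what keeps the size computation to one multiplication. A small illustration with the default segment size:

#define RELSEG_SIZE 131072          /* pages per segment at the default build options */

/* total blocks, given the last segment's number and how many blocks it holds */
static unsigned int
relation_nblocks(unsigned int last_segno, unsigned int blocks_in_last_seg)
{
    return last_segno * RELSEG_SIZE + blocks_in_last_seg;
}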
@@ -999,7 +999,7 @@ mdsync(void)

/*
* If we are in the checkpointer, the sync had better include all fsync
* requests that were queued by backends up to this point. The tightest
* requests that were queued by backends up to this point. The tightest
* race condition that could occur is that a buffer that must be written
* and fsync'd for the checkpoint could have been dumped by a backend just
* before it was visited by BufferSync(). We know the backend will have
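The ordering requirement in this hunk's comment, that the checkpointer pull in every queued request before it starts fsyncing, comes down to one call placed ahead of the table scan. A sketch, with the scan itself elided:

#include <stdbool.h>

extern void AbsorbFsyncRequests(void);      /* drains the shared fsync-request queue */

static void
sync_pending_ops(bool am_checkpointer)
{
    if (am_checkpointer)
        AbsorbFsyncRequests();              /* everything queued up to now becomes visible */

    /* ... then scan the pending-operations table and fsync each segment ... */
}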
@@ -1115,7 +1115,7 @@ mdsync(void)
* that have been deleted (unlinked) by the time we get to
* them. Rather than just hoping an ENOENT (or EACCES on
* Windows) error can be ignored, what we do on error is
* absorb pending requests and then retry. Since mdunlink()
* absorb pending requests and then retry. Since mdunlink()
* queues a "cancel" message before actually unlinking, the
* fsync request is guaranteed to be marked canceled after the
* absorb if it really was this case. DROP DATABASE likewise
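The error-handling rule described here (on failure, absorb pending requests, and only then decide whether the failure was an expected post-unlink ENOENT) can be sketched as below; try_fsync_segment() and entry_canceled() are hypothetical stand-ins for the internals:

#include <stdbool.h>

extern void AbsorbFsyncRequests(void);
extern bool try_fsync_segment(void *entry);     /* hypothetical */
extern bool entry_canceled(void *entry);        /* hypothetical */

static bool
fsync_with_retry(void *entry)
{
    int retries = 1;

    for (;;)
    {
        if (try_fsync_segment(entry))
            return true;                        /* fsync succeeded */

        AbsorbFsyncRequests();                  /* may deliver the "cancel" queued by the unlink path */

        if (entry_canceled(entry))
            return true;                        /* the ENOENT/EACCES was expected; ignore it */

        if (retries-- == 0)
            return false;                       /* genuine failure */
        /* otherwise loop and retry the fsync once more */
    }
}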
@@ -1219,7 +1219,7 @@ mdsync(void)

/*
* We've finished everything that was requested before we started to
* scan the entry. If no new requests have been inserted meanwhile,
* scan the entry. If no new requests have been inserted meanwhile,
* remove the entry. Otherwise, update its cycle counter, as all the
* requests now in it must have arrived during this cycle.
*/
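The bookkeeping this comment describes, once the scan of an entry finishes, is a two-way choice. A hedged sketch with illustrative names and a minimal entry type:

#include <stdbool.h>

typedef struct PendingEntry
{
    int         cycle_ctr;          /* cycle of the oldest request in the entry */
    /* ... per-fork request sets ... */
} PendingEntry;

extern bool entry_has_new_requests(const PendingEntry *entry, int current_cycle);  /* hypothetical */
extern void remove_entry(PendingEntry *entry);                                     /* hypothetical */

static void
finish_entry_scan(PendingEntry *entry, int current_cycle)
{
    if (!entry_has_new_requests(entry, current_cycle))
        remove_entry(entry);                    /* nothing arrived meanwhile: drop it */
    else
        entry->cycle_ctr = current_cycle;       /* everything left arrived during this cycle */
}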
@@ -1324,7 +1324,7 @@ mdpostckpt(void)

/*
* As in mdsync, we don't want to stop absorbing fsync requests for a
* long time when there are many deletions to be done. We can safely
* long time when there are many deletions to be done. We can safely
* call AbsorbFsyncRequests() at this point in the loop (note it might
* try to delete list entries).
*/
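Here the concern is not starving the fsync-request queue while a long list of files is being deleted; the usual shape is to absorb every few unlinks. A sketch, with the batch-size constant chosen for illustration only:

extern void AbsorbFsyncRequests(void);

#define UNLINKS_PER_ABSORB 10       /* illustrative batch size */

static void
process_unlink_list(int pending_count)
{
    int absorb_counter = UNLINKS_PER_ABSORB;

    while (pending_count-- > 0)
    {
        /* ... unlink one doomed relation file here ... */

        if (--absorb_counter <= 0)
        {
            AbsorbFsyncRequests();  /* keep the backends' request queue from filling up */
            absorb_counter = UNLINKS_PER_ABSORB;
        }
    }
}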
@@ -1449,7 +1449,7 @@ RememberFsyncRequest(RelFileNode rnode, ForkNumber forknum, BlockNumber segno)
/*
* We can't just delete the entry since mdsync could have an
* active hashtable scan. Instead we delete the bitmapsets; this
* is safe because of the way mdsync is coded. We also set the
* is safe because of the way mdsync is coded. We also set the
* "canceled" flags so that mdsync can tell that a cancel arrived
* for the fork(s).
*/
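The cancel path in this hunk never deletes the hash entry outright, since mdsync may be scanning it; it empties the per-fork request set and flags the fork as canceled instead. A hedged sketch of that shape (not the source verbatim; the request sets are left as opaque pointers):

#include <stdbool.h>
#include <stddef.h>

#define MAX_FORKNUM 3               /* MAIN, FSM, VISIBILITYMAP, INIT */

typedef struct PendingOperationEntry
{
    void   *requests[MAX_FORKNUM + 1];  /* per-fork set of segment numbers to fsync */
    bool    canceled[MAX_FORKNUM + 1];  /* a cancel arrived for this fork */
} PendingOperationEntry;

static void
cancel_fork_requests(PendingOperationEntry *entry, int forknum)
{
    entry->requests[forknum] = NULL;    /* the bitmapset would be freed here */
    entry->canceled[forknum] = true;    /* so mdsync can tell a cancel arrived */
}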
@@ -1551,7 +1551,7 @@ RememberFsyncRequest(RelFileNode rnode, ForkNumber forknum, BlockNumber segno)

/*
* NB: it's intentional that we don't change cycle_ctr if the entry
* already exists. The cycle_ctr must represent the oldest fsync
* already exists. The cycle_ctr must represent the oldest fsync
* request that could be in the entry.
*/

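The invariant stated in this hunk (an entry's cycle_ctr names the oldest request it could hold) falls out of stamping only brand-new entries. A fragment-style sketch that assumes the file's pendingOpsTable hash, mdsync_cycle_ctr counter, and a suitable key variable:

bool found;
PendingOperationEntry *entry;

entry = hash_search(pendingOpsTable, &key, HASH_ENTER, &found);
if (!found)
    entry->cycle_ctr = mdsync_cycle_ctr;    /* new entry: stamp it with the current cycle */
/* existing entry: leave cycle_ctr alone so it still reflects the oldest request */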
@@ -1720,7 +1720,7 @@ _mdfd_getseg(SMgrRelation reln, ForkNumber forknum, BlockNumber blkno,
{
/*
* Normally we will create new segments only if authorized by the
* caller (i.e., we are doing mdextend()). But when doing WAL
* caller (i.e., we are doing mdextend()). But when doing WAL
* recovery, create segments anyway; this allows cases such as
* replaying WAL data that has a write into a high-numbered
* segment of a relation that was later deleted. We want to go

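The policy in this hunk's comment, creating missing segments only when the caller allows it or when WAL is being replayed, reduces to one extra condition on the open path. A sketch with plain POSIX open(); the flag names are illustrative:

#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>

static int
open_segment(const char *segpath, bool caller_allows_create, bool in_recovery)
{
    int fd = open(segpath, O_RDWR, 0600);

    if (fd < 0 && errno == ENOENT && (caller_allows_create || in_recovery))
    {
        /* WAL replay may touch a high-numbered segment that doesn't exist yet */
        fd = open(segpath, O_RDWR | O_CREAT, 0600);
    }
    return fd;
}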
@@ -494,7 +494,7 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
}

/*
* Get rid of any remaining buffers for the relations. bufmgr will just
* Get rid of any remaining buffers for the relations. bufmgr will just
* drop them without bothering to write the contents.
*/
DropRelFileNodesAllBuffers(rnodes, nrels);
@@ -679,7 +679,7 @@ smgrtruncate(SMgrRelation reln, ForkNumber forknum, BlockNumber nblocks)
* Send a shared-inval message to force other backends to close any smgr
* references they may have for this rel. This is useful because they
* might have open file pointers to segments that got removed, and/or
* smgr_targblock variables pointing past the new rel end. (The inval
* smgr_targblock variables pointing past the new rel end. (The inval
* message will come back to our backend, too, causing a
* probably-unnecessary local smgr flush. But we don't expect that this
* is a performance-critical path.) As in the unlink code, we want to be
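The hunk above ends mid-sentence, but the mechanism it describes is the shared-invalidation message sent at truncate time so other backends close their smgr state before segments disappear. A sketch assuming the backend's own headers (storage/smgr.h, utils/inval.h); CacheInvalidateSmgr() is the inval.c entry point, the wrapper is illustrative:

#include "postgres.h"
#include "storage/smgr.h"
#include "utils/inval.h"

static void
invalidate_before_truncate(SMgrRelation reln)
{
    /* other backends drop their open segment handles and stale smgr_targblock */
    CacheInvalidateSmgr(reln->smgr_rnode);
}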