postgres

mirror of https://github.com/postgres/postgres.git synced 2025-12-04 12:02:48 +03:00

Author	SHA1	Message	Date
Andres Freund	01368e5d9d	Split all OBJS style lines in makefiles into one-line-per-entry style. When maintaining or merging patches, one of the most common sources for conflicts are the list of objects in makefiles. Especially when the split across lines has been changed on both sides, which is somewhat common due to attempting to stay below 80 columns, those conflicts are unnecessarily laborious to resolve. By splitting, and alphabetically sorting, OBJS style lines into one object per line, conflicts should be less frequent, and easier to resolve when they still occur. Author: Andres Freund Discussion: https://postgr.es/m/20191029200901.vww4idgcxv74cwes@alap3.anarazel.de	2019-11-05 14:41:07 -08:00
Michael Paquier	6ca86bb7e9	Fix typos in the code Author: Vignesh C Reviewed-by: Dilip Kumar, Michael Paquier Discussion: https://postgr.es/m/CALDaNm0ni+GAOe4+fbXiOxNrVudajMYmhJFtXGX-zBPoN8ixhw@mail.gmail.com	2019-10-30 10:03:00 +09:00
Fujii Masao	6d05086c0a	Speedup truncations of relation forks. When a relation is truncated, shared_buffers needs to be scanned so that any buffers for the relation forks are invalidated in it. Previously, shared_buffers was scanned for each relation forks, i.e., MAIN, FSM and VM, when VACUUM truncated off any empty pages at the end of relation or TRUNCATE truncated the relation in place. Since shared_buffers needed to be scanned multiple times, it could take a long time to finish those commands especially when shared_buffers was large. This commit changes the logic so that shared_buffers is scanned only one time for those three relation forks. Author: Kirk Jamison Reviewed-by: Masahiko Sawada, Thomas Munro, Alvaro Herrera, Takayuki Tsunakawa and Fujii Masao Discussion: https://postgr.es/m/D09B13F772D2274BB348A310EE3027C64E2067@g01jpexmbkw24	2019-09-24 17:31:26 +09:00
Michael Paquier	0896ae561b	Fix inconsistencies and typos in the tree This is numbered take 7, and addresses a set of issues around: - Fixes for typos and incorrect reference names. - Removal of unneeded comments. - Removal of unreferenced functions and structures. - Fixes regarding variable name consistency. Author: Alexander Lakhin Discussion: https://postgr.es/m/10bfd4ac-3e7c-40ab-2b2e-355ed15495e8@gmail.com	2019-07-16 13:23:53 +09:00
Michael Paquier	6b8548964b	Fix inconsistencies in the code This addresses a couple of issues in the code: - Typos and inconsistencies in comments and function declarations. - Removal of unreferenced function declarations. - Removal of unnecessary compile flags. - A cleanup error in regressplans.sh. Author: Alexander Lakhin Discussion: https://postgr.es/m/0c991fdf-2670-1997-c027-772a420c4604@gmail.com	2019-07-08 13:15:09 +09:00
Michael Paquier	c74d49d41c	Fix many typos and inconsistencies Author: Alexander Lakhin Discussion: https://postgr.es/m/af27d1b3-a128-9d62-46e0-88f424397f44@gmail.com	2019-07-01 10:00:23 +09:00
Michael Paquier	1fb6f62a84	Fix typos in various places Author: Andrea Gelmini Reviewed-by: Michael Paquier, Justin Pryzby Discussion: https://postgr.es/m/20190528181718.GA39034@glet	2019-06-03 13:44:03 +09:00
Tom Lane	8255c7a5ee	Phase 2 pgindent run for v12. Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com	2019-05-22 13:04:48 -04:00
Tom Lane	be76af171c	Initial pgindent run for v12. This is still using the 2.0 version of pg_bsd_indent. I thought it would be good to commit this separately, so as to document the differences between 2.0 and 2.1 behavior. Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us	2019-05-22 12:55:34 -04:00
Andres Freund	7f44ede594	tableam: Don't assume that every AM uses md.c style storage. Previously various parts of the code routed size requests through RelationGetNumberOfBlocks[InFork]. That works if md.c is used by the AM, but not otherwise. Add a tableam callback to return the size of the table. As not every AM will use postgres' BLCKSZ, have it return bytes, and have RelationGetNumberOfBlocksInFork() round the byte size up into blocks. To allow code outside of the AM to determine the actual relation size map InvalidForkNumber the total size of a relation, as not every AM might just need the postgres defined forks. A few users of RelationGetNumberOfBlocks() ought to be converted away from that. One case, the use of it to determine whether a tid is valid, will be fixed in a follow up commit. Others will have to wait for v13. Author: Andres Freund Discussion: https://postgr.es/m/20190423225201.3bbv6tbqzkb5w7cw@alap3.anarazel.de	2019-05-17 18:56:47 -07:00
Thomas Munro	3eb77eba5a	Refactor the fsync queue for wider use. Previously, md.c and checkpointer.c were tightly integrated so that fsync calls could be handed off and processed in the background. Introduce a system of callbacks and file tags, so that other modules can hand off fsync work in the same way. For now only md.c uses the new interface, but other users are being proposed. Since there may be use cases that are not strictly SMGR implementations, use a new function table for sync handlers rather than extending the traditional SMGR one. Instead of using a bitmapset of segment numbers for each RelFileNode in the checkpointer's hash table, make the segment number part of the key. This requires sending explicit "forget" requests for every segment individually when relations are dropped, but suits the file layout schemes of proposed future users better (ie sparse or high segment numbers). Author: Shawn Debnath and Thomas Munro Reviewed-by: Thomas Munro, Andres Freund Discussion: https://postgr.es/m/CAEepm=2gTANm=e3ARnJT=n0h8hf88wqmaZxk0JYkxw+b21fNrw@mail.gmail.com	2019-04-04 23:38:38 +13:00
Bruce Momjian	97c39498e5	Update copyright for 2019 Backpatch-through: certain files through 9.4	2019-01-02 12:44:25 -05:00
Thomas Munro	d67dae036b	Don't count zero-filled buffers as 'read' in EXPLAIN. If you extend a relation, it should count as a block written, not read (we write a zero-filled block). If you ask for a zero-filled buffer, it shouldn't be counted as read or written. Later we might consider counting zero-filled buffers with a separate counter, if they become more common due to future work. Author: Thomas Munro Reviewed-by: Haribabu Kommi, Kyotaro Horiguchi, David Rowley Discussion: https://postgr.es/m/CAEepm%3D3JytB3KPpvSwXzkY%2Bdwc5zC8P8Lk7Nedkoci81_0E9rA%40mail.gmail.com	2018-11-28 11:58:10 +13:00
Tom Lane	3afd75eaac	Remove dubious micro-optimization in ckpt_buforder_comparator(). It seems incorrect to assume that the list of CkptSortItems can never contain duplicate page numbers: concurrent activity could result in some page getting dropped from a low-numbered buffer and later loaded into a high-numbered buffer while BufferSync is scanning the buffer pool. If that happened, the comparator would give self-inconsistent results, potentially confusing qsort(). Saving one comparison step is not worth possibly getting the sort wrong. So far as I can tell, nothing would actually go wrong given our current implementation of qsort(). It might get a bit slower than expected if there were a large number of duplicates of one value, but that's surely a probability-epsilon case. Still, the comment is wrong, and if we ever switched to another sort implementation it might be less forgiving. In passing, avoid casting away const-ness of the argument pointers; I've not seen any compiler complaints from that, but it seems likely that some compilers would not like it. Back-patch to 9.6 where this code came in, just in case I've underestimated the possible consequences. Discussion: https://postgr.es/m/18437.1515607610@sss.pgh.pa.us	2018-01-10 15:50:54 -05:00
Bruce Momjian	9d4649ca49	Update copyright for 2018 Backpatch-through: certain files through 9.3	2018-01-02 23:30:12 -05:00
Tom Lane	c5269472ea	Fix two violations of the ResourceOwnerEnlarge/Remember protocol. The point of having separate ResourceOwnerEnlargeFoo and ResourceOwnerRememberFoo functions is so that resource allocation can happen in between. Doing it in some other order is just wrong. OpenTemporaryFile() did open(), enlarge, remember, which would leak the open file if the enlarge step ran out of memory. Because fd.c has its own layer of resource-remembering, the consequences look like they'd be limited to an intratransaction FD leak, but it's still not good. IncrBufferRefCount() did enlarge, remember, incr-refcount, which would blow up if the incr-refcount step ever failed. It was safe enough when written, but since the introduction of PrivateRefCountHash, I think the assumption that no error could happen there is pretty shaky. The odds of real problems from either bug are probably small, but still, back-patch to supported branches. Thomas Munro and Tom Lane, per a comment from Andres Freund	2017-11-08 16:50:12 -05:00
Peter Eisentraut	2eb4a831e5	Change TRUE/FALSE to true/false The lower case spellings are C and C++ standard and are used in most parts of the PostgreSQL sources. The upper case spellings are only used in some files/modules. So standardize on the standard spellings. The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so those are left as is when using those APIs. In code comments, we use the lower-case spelling for the C concepts and keep the upper-case spelling for the SQL concepts. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-11-08 11:37:28 -05:00
Robert Haas	79ccd7cbd5	pg_prewarm: Add automatic prewarm feature. Periodically while the server is running, and at shutdown, write out a list of blocks in shared buffers. When the server reaches consistency -- unfortunatey, we can't do it before that point without breaking things -- reload those blocks into any still-unused shared buffers. Mithun Cy and Robert Haas, reviewed and tested by Beena Emerson, Amit Kapila, Jim Nasby, and Rafia Sabih. Discussion: http://postgr.es/m/CAD__OugubOs1Vy7kgF6xTjmEqTR4CrGAv8w+ZbaY_+MZeitukw@mail.gmail.com	2017-08-21 14:17:39 -04:00
Tom Lane	382ceffdf7	Phase 3 of pgindent updates. Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 15:35:54 -04:00
Tom Lane	c7b8998ebb	Phase 2 of pgindent updates. Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit `e3860ffa4d` wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 15:19:25 -04:00
Tom Lane	e3860ffa4d	Initial pgindent run with pg_bsd_indent version 2.0. The new indent version includes numerous fixes thanks to Piotr Stefaniak. The main changes visible in this commit are: * Nicer formatting of function-pointer declarations. * No longer unexpectedly removes spaces in expressions using casts, sizeof, or offsetof. * No longer wants to add a space in "struct structname varname", as well as some similar cases for const- or volatile-qualified pointers. Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely. * Fixes bug where comments following declarations were sometimes placed with no space separating them from the code. * Fixes some odd decisions for comments following case labels. * Fixes some cases where comments following code were indented to less than the expected column 33. On the less good side, it now tends to put more whitespace around typedef names that are not listed in typedefs.list. This might encourage us to put more effort into typedef name collection; it's not really a bug in indent itself. There are more changes coming after this round, having to do with comment indentation and alignment of lines appearing within parentheses. I wanted to limit the size of the diffs to something that could be reviewed without one's eyes completely glazing over, so it seemed better to split up the changes as much as practical. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 14:39:04 -04:00
Teodor Sigaev	9cf6033281	Revert unintentional change in increasing usage count during pin of buffers, this makes buffer access strategy have no effect. Change was a part of commit `48354581a4` during 9.6 release cycle, so backpath to 9.6 Reported-by: Jim Nasby Author: Alexander Korotkov Reviewed-by: Jim Nasby, Andres Freund https://commitfest.postgresql.org/13/1029/	2017-03-20 18:48:46 +03:00
Robert Haas	88e66d193f	Rename "pg_clog" directory to "pg_xact". Names containing the letters "log" sometimes confuse users into believing that only non-critical data is present. It is hoped this renaming will discourage ill-considered removals of transaction status data. Michael Paquier Discussion: http://postgr.es/m/CA+Tgmoa9xFQyjRZupbdEFuwUerFTvC6HjZq1ud6GYragGDFFgA@mail.gmail.com	2017-03-17 09:48:38 -04:00
Robert Haas	87f9982034	Fix failure to mark init buffers as BM_PERMANENT. This could result in corruption of the init fork of an unlogged index if the ambuildempty routine for that index used shared buffers to create the init fork, which was true for brin, gin, gist, and hash indexes. Patch by me, based on an earlier patch by Michael Paquier, who also reviewed this one. This also incorporates an idea from Artur Zakirov. Discussion: http://postgr.es/m/CACYUyc8yccE4xfxhqxfh_Mh38j7dRFuxfaK1p6dSNAEUakxUyQ@mail.gmail.com	2017-03-14 11:51:11 -04:00
Tatsuo Ishii	73293ebae2	Fix comments in StrategyNotifyBgWriter(). The interface for the function was changed in `d72731a704` but the comments of the function was not updated. Patch by Yugo Nagata.	2017-01-24 09:39:11 +09:00
Bruce Momjian	1d25779284	Update copyright via script for 2017	2017-01-03 13:48:53 -05:00
Robert Haas	3761fe3c20	Simplify LWLock tranche machinery by removing array_base/array_stride. array_base and array_stride were added so that we could identify the offset of an LWLock within a tranche, but this facility is only very marginally used apart from the main tranche. So, give every lock in the main tranche its own tranche ID and get rid of array_base, array_stride, and all that's attached. For debugging facilities (Trace_lwlocks and LWLOCK_STATS) print the pointer address of the LWLock using %p instead of the offset. This is arguably more useful, and certainly a lot cheaper. Drop the offset-within-tranche from the information reported to dtrace and from one can't-happen message inside lwlock.c. The main user-visible impact of this change is that pg_stat_activity will now report all waits for LWLocks as "LWLock" rather than reporting some as "LWLockTranche" and others as "LWLockNamed". The main motivation for this change is that the need to specify an array_base and an array_stride is awkward for parallel query. There is only a very limited supply of tranche IDs so we can't just keep allocating new ones, and if we try to use the same tranche IDs every time then we run into trouble when multiple parallel contexts are use simultaneously. So if we didn't get rid of this mechanism we'd have to make it even more complicated. By simplifying it in this way, we instead reduce the size of the generated code for lwlock.c by about 5%. Discussion: http://postgr.es/m/CA+TgmoYsFn6NUW1x0AZtupJGUAs1UDY4dJtCN47_Q6D0sP80PA@mail.gmail.com	2016-12-16 11:29:23 -05:00
Robert Haas	f2e6a2ccf1	Add API to check if an existing exclusive lock allows cleanup. LockBufferForCleanup() acquires a cleanup lock unconditionally, and ConditionalLockBufferForCleanup() acquires a cleanup lock if it is possible to do so without waiting; this patch adds a new API, IsBufferCleanupOK(), which tests whether an exclusive lock already held happens to be a cleanup lock. This is possible because a cleanup lock simply means an exclusive lock plus the assurance any other pins on the buffer are newer than our own pin. Therefore, just as the existing functions decide that the exclusive lock that they've just taken is a cleanup lock if they observe the pin count to be 1, this new function allows us to observe that the pin count is 1 on a buffer we've already locked. This is useful in situations where a backend definitely wishes to modify the buffer and also wishes to perform cleanup operations if possible. The patch to eliminate heavyweight locking by hash indexes uses this, and it may have other applications as well. Amit Kapila, per a suggestion from me. Some comment adjustments by me as well.	2016-11-04 09:32:24 -04:00
Andres Freund	b0779abb3a	Fix fallback implementation of pg_atomic_write_u32(). I somehow had assumed that in the spinlock (in turn possibly using semaphores) based fallback atomics implementation 32 bit writes could be done without a lock. As far as the write goes that's correct, since postgres supports only platforms with single-copy atomicity for aligned 32bit writes. But writing without holding the spinlock breaks read-modify-write operations like pg_atomic_compare_exchange_u32(), since they'll potentially "miss" a concurrent write, which can't happen in actual hardware implementations. In 9.6+ when using the fallback atomics implementation this could lead to buffer header locks not being properly marked as released, and potentially some related state corruption. I don't see a related danger in 9.5 (earliest release with the API), because pg_atomic_write_u32() wasn't used in a concurrent manner there. The state variable of local buffers, before this change, were manipulated using pg_atomic_write_u32(), to avoid unnecessary synchronization overhead. As that'd not be the case anymore, introduce and use pg_atomic_unlocked_write_u32(), which does not correctly interact with RMW operations. This bug only caused issues when postgres is compiled on platforms without atomics support (i.e. no common new platform), or when compiled with --disable-atomics, which explains why this wasn't noticed in testing. Reported-By: Tom Lane Discussion: <14947.1475690465@sss.pgh.pa.us> Backpatch: 9.5-, where the atomic operations API was introduced.	2016-10-07 16:55:15 -07:00
Robert Haas	d2ce38e204	Rename WAIT_* constants to PG_WAIT_*. Windows apparently has a constant named WAIT_TIMEOUT, and some of these other names are pretty generic, too. Insert "PG_" at the front of each name in order to disambiguate. Michael Paquier	2016-10-05 08:04:52 -04:00
Robert Haas	6f3bd98ebf	Extend framework from commit `53be0b1ad` to report latch waits. WaitLatch, WaitLatchOrSocket, and WaitEventSetWait now taken an additional wait_event_info parameter; legal values are defined in pgstat.h. This makes it possible to uniquely identify every point in the core code where we are waiting for a latch; extensions can pass WAIT_EXTENSION. Because latches were the major wait primitive not previously covered by this patch, it is now possible to see information in pg_stat_activity on a large number of important wait events not previously addressed, such as ClientRead, ClientWrite, and SyncRep. Unfortunately, many of the wait events added by this patch will fail to appear in pg_stat_activity because they're only used in background processes which don't currently appear in pg_stat_activity. We should fix this either by creating a separate view for such information, or else by deciding to include them in pg_stat_activity after all. Michael Paquier and Robert Haas, reviewed by Alexander Korotkov and Thomas Munro.	2016-10-04 11:01:42 -04:00
Simon Riggs	016abf1fb8	Add debug check function LWLockHeldByMeInMode() Tests whether my process holds a lock in given mode. Add initial usage in MarkBufferDirty(). Thomas Munro	2016-09-05 10:38:08 +01:00
Tom Lane	ea268cdc9a	Add macros to make AllocSetContextCreate() calls simpler and safer. I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls had typos in the context-sizing parameters. While none of these led to especially significant problems, they did create minor inefficiencies, and it's now clear that expecting people to copy-and-paste those calls accurately is not a great idea. Let's reduce the risk of future errors by introducing single macros that encapsulate the common use-cases. Three such macros are enough to cover all but two special-purpose contexts; those two calls can be left as-is, I think. While this patch doesn't in itself improve matters for third-party extensions, it doesn't break anything for them either, and they can gradually adopt the simplified notation over time. In passing, change TopMemoryContext to use the default allocation parameters. Formerly it could only be extended 8K at a time. That was probably reasonable when this code was written; but nowadays we create many more contexts than we did then, so that it's not unusual to have a couple hundred K in TopMemoryContext, even without considering various dubious code that sticks other things there. There seems no good reason not to let it use growing blocks like most other contexts. Back-patch to 9.6, mostly because that's still close enough to HEAD that it's easy to do so, and keeping the branches in sync can be expected to avoid some future back-patching pain. The bugs fixed by these changes don't seem to be significant enough to justify fixing them further back. Discussion: <21072.1472321324@sss.pgh.pa.us>	2016-08-27 17:50:38 -04:00
Andres Freund	48bfeb244f	Improve WritebackContextInit() comment and prototype argument names. Author: Masahiko Sawada Discussion: CAD21AoBD=Of1OzL90Xx4Q-3j=-2q7=S87cs75HfutE=eCday2w@mail.gmail.com	2016-07-01 14:29:03 -07:00
Alvaro Herrera	1ca4a1b5d2	Finish up XLOG_HINT renaming Commit `b8fd1a09f3` renamed XLOG_HINT to XLOG_FPI, but neglected two places. Backpatch to 9.3, like that commit.	2016-06-17 18:05:55 -04:00
Kevin Grittner	bf9a60ee33	Fix interaction between CREATE INDEX and "snapshot too old". Since indexes are created without valid LSNs, an index created while a snapshot older than old_snapshot_threshold existed could cause queries to return incorrect results when those old snapshots were used, if any relevant rows had been subject to early pruning before the index was built. Prevent usage of a newly created index until all such snapshots are released, for relations where this can happen. Questions about the interaction of "snapshot too old" with index creation were initially raised by Andres Freund. Reviewed by Robert Haas.	2016-06-10 09:25:31 -05:00
Tom Lane	cae1c788b9	Improve the situation for parallel query versus temp relations. Transmit the leader's temp-namespace state to workers. This is important because without it, the workers do not really have the same search path as the leader. For example, there is no good reason (and no extant code either) to prevent a worker from executing a temp function that the leader created previously; but as things stood it would fail to find the temp function, and then either fail or execute the wrong function entirely. We still prohibit a worker from creating a temp namespace on its own. In effect, a worker can only see the session's temp namespace if the leader had created it before starting the worker, which seems like the right semantics. Also, transmit the leader's BackendId to workers, and arrange for workers to use that when determining the physical file path of a temp relation belonging to their session. While the original intent was to prevent such accesses entirely, there were a number of holes in that, notably in places like dbsize.c which assume they can safely access temp rels of other sessions anyway. We might as well get this right, as a small down payment on someday allowing workers to access the leader's temp tables. (With this change, directly using "MyBackendId" as a relation or buffer backend ID is deprecated; you should use BackendIdForTempRelations() instead. I left a couple of such uses alone though, as they're not going to be reachable in parallel workers until we do something about localbuf.c.) Move the thou-shalt-not-access-thy-leader's-temp-tables prohibition down into localbuf.c, which is where it actually matters, instead of having it in relation_open(). This amounts to recognizing that access to temp tables' catalog entries is perfectly safe in a worker, it's only the data in local buffers that is problematic. Having done all that, we can get rid of the test in has_parallel_hazard() that says that use of a temp table's rowtype is unsafe in parallel workers. That test was unduly expensive, and if we really did need such a prohibition, that was not even close to being a bulletproof guard for it. (For example, any user-defined function executed in a parallel worker might have attempted such access.)	2016-06-09 20:16:11 -04:00
Robert Haas	4bc424b968	pgindent run for 9.6	2016-06-09 18:02:36 -04:00
Greg Stark	e1623c3959	Fix various common mispellings. Mostly these are just comments but there are a few in documentation and a handful in code and tests. Hopefully this doesn't cause too much unnecessary pain for backpatching. I relented from some of the most common like "thru" for that reason. The rest don't seem numerous enough to cause problems. Thanks to Kevin Lyda's tool https://pypi.python.org/pypi/misspellings	2016-06-03 16:08:45 +01:00
Alvaro Herrera	0c7cd45b6d	Fix range check for effective_io_concurrency Commit `1aba62ec` moved the range check of that option form guc.c into bufmgr.c, but introduced a bug by changing a >= 0.0 to > 0.0, which made the value 0 no longer accepted. Put it back. Reported by Jeff Janes, diagnosed by Tom Lane	2016-05-24 14:55:34 -04:00
Robert Haas	9f84280ae9	Fix assorted defects in `09adc9a8c0`. That commit increased all shared memory allocations to the next higher multiple of PG_CACHE_LINE_SIZE, but it didn't ensure that allocation started on a cache line boundary. It also failed to remove a couple other pieces of now-useless code. BUFFERALIGN() is perhaps obsolete at this point, and likely should be removed at some point, too, but that seems like it can be left to a future cleanup. Mistakes all pointed out by Andres Freund. The patch is mine, with a few extra assertions which I adopted from his version of this fix.	2016-04-21 13:27:41 -04:00
Kevin Grittner	11e178d0dc	Inline initial comparisons in TestForOldSnapshot() Even with old_snapshot_threshold = -1 (which disables the "snapshot too old" feature), performance regressions were seen at moderate to high concurrency. For example, a one-socket, four-core system running 200 connections at saturation could see up to a 2.3% regression, with larger regressions possible on NUMA machines. By inlining the early (smaller, faster) tests in the TestForOldSnapshot() function, the i7 case dropped to a 0.2% regression, which could easily just be noise, and is clearly an improvement. Further testing will show whether more is needed.	2016-04-21 08:40:08 -05:00
Kevin Grittner	a343e223a5	Revert no-op changes to BufferGetPage() The reverted changes were intended to force a choice of whether any newly-added BufferGetPage() calls needed to be accompanied by a test of the snapshot age, to support the "snapshot too old" feature. Such an accompanying test is needed in about 7% of the cases, where the page is being used as part of a scan rather than positioning for other purposes (such as DML or vacuuming). The additional effort required for back-patching, and the doubt whether the intended benefit would really be there, have indicated it is best just to rely on developers to do the right thing based on comments and existing usage, as we do with many other conventions. This change should have little or no effect on generated executable code. Motivated by the back-patching pain of Tom Lane and Robert Haas	2016-04-20 08:31:19 -05:00
Tom Lane	a0382e2d7e	Make partition-lock-release coding more transparent in BufferAlloc(). Coverity complained that oldPartitionLock was possibly dereferenced after having been set to NULL. That actually can't happen, because we'd only use it if (oldFlags & BM_TAG_VALID) is true. But nonetheless Coverity is justified in complaining, because at line 1275 we actually overwrite oldFlags, and then still expect its BM_TAG_VALID bit to be a safe guide to whether to release the oldPartitionLock. Thus, the code would be incorrect if someone else had changed the buffer's BM_TAG_VALID flag meanwhile. That should not happen, since we hold pin on the buffer throughout this sequence, but it's starting to look like a rather shaky chain of logic. And there's no need for such assumptions, because we can simply replace the (oldFlags & BM_TAG_VALID) tests with (oldPartitionLock != NULL), which has identical results and makes it plain to all comers that we don't dereference a null pointer. A small side benefit is that the range of liveness of oldFlags is greatly reduced, possibly allowing the compiler to save a register. This is just cleanup, not an actual bug fix, so there seems no need for a back-patch.	2016-04-18 18:05:56 -04:00
Tom Lane	6b85d4ba9b	Fix portability problem induced by commit `a6f6b7819`. pg_xlogdump includes bufmgr.h. With a compiler that emits code for static inline functions even when they're unreferenced, that leads to unresolved external references in the new static-inline version of BufferGetPage(). So hide it with #ifndef FRONTEND, as we've done for similar issues elsewhere. Per buildfarm member pademelon.	2016-04-15 10:44:28 -04:00
Andres Freund	4b74c6a40e	Make init_spin_delay() C89 compliant #2 . My previous attempt at doing so, in `80abbeba23`, was not sufficient. While that fixed the problem for bufmgr.c and lwlock.c , s_lock.c still has non-constant expressions in the struct initializer, because the file/line/function information comes from the caller of s_lock(). Give up on using a macro, and use a static inline instead. Discussion: 4369.1460435533@sss.pgh.pa.us	2016-04-14 19:26:13 -07:00
Andres Freund	80abbeba23	Make init_spin_delay() C89 compliant and change stuck spinlock reporting. The current definition of init_spin_delay (introduced recently in `48354581a`) wasn't C89 compliant. It's not legal to refer to refer to non-constant expressions, and the ptr argument was one. This, as reported by Tom, lead to a failure on buildfarm animal pademelon. The pointer, especially on system systems with ASLR, isn't super helpful anyway, though. So instead of making init_spin_delay into an inline function, make s_lock_stuck() report the function name in addition to file:line and change init_spin_delay() accordingly. While not a direct replacement, the function name is likely more useful anyway (line numbers are often hard to interpret in third party reports). This also fixes what file/line number is reported for waits via s_lock(). As PG_FUNCNAME_MACRO is now used outside of elog.h, move it to c.h. Reported-By: Tom Lane Discussion: 4369.1460435533@sss.pgh.pa.us	2016-04-13 17:00:53 -07:00
Andres Freund	6b93fcd149	Avoid atomic operation in MarkLocalBufferDirty(). The recent patch to make Pin/UnpinBuffer lockfree in the hot path (`48354581a`), accidentally used pg_atomic_fetch_or_u32() in MarkLocalBufferDirty(). Other code operating on local buffers was careful to only use pg_atomic_read/write_u32 which just read/write from memory; to avoid unnecessary overhead. On its own that'd just make MarkLocalBufferDirty() slightly less efficient, but in addition InitLocalBuffers() doesn't call pg_atomic_init_u32() - thus the spinlock fallback for the atomic operations isn't initialized. That in turn caused, as reported by Tom, buildfarm animal gaur to fail. As those errors are actually useful against this type of error, continue to omit - intentionally this time - initialization of the atomic variable. In addition, add an explicit note about only using pg_atomic_read/write on local buffers's state to BufferDesc's description. Reported-By: Tom Lane Discussion: 1881.1460431476@sss.pgh.pa.us	2016-04-13 15:28:29 -07:00
Kevin Grittner	a6f6b78196	Use static inline function for BufferGetPage() I was initially concerned that the some of the hundreds of references to BufferGetPage() where the literal BGP_NO_SNAPSHOT_TEST were passed might not optimize as well as a macro, leading to some hard-to-find performance regressions in corner cases. Inspection of disassembled code has shown identical code at all inspected locations, and the size difference doesn't amount to even one byte per such call. So make it readable. Per gripes from Álvaro Herrera and Tom Lane	2016-04-11 16:47:50 -05:00
Andres Freund	48354581a4	Allow Pin/UnpinBuffer to operate in a lockfree manner. Pinning/Unpinning a buffer is a very frequent operation; especially in read-mostly cache resident workloads. Benchmarking shows that in various scenarios the spinlock protecting a buffer header's state becomes a significant bottleneck. The problem can be reproduced with pgbench -S on larger machines, but can be considerably worse for queries which touch the same buffers over and over at a high frequency (e.g. nested loops over a small inner table). To allow atomic operations to be used, cram BufferDesc's flags, usage_count, buf_hdr_lock, refcount into a single 32bit atomic variable; that allows to manipulate them together using 32bit compare-and-swap operations. This requires reducing MAX_BACKENDS to 2^18-1 (which could be lifted by using a 64bit field, but it's not a realistic configuration atm). As not all operations can easily implemented in a lockfree manner, implement the previous buf_hdr_lock via a flag bit in the atomic variable. That way we can continue to lock the header in places where it's needed, but can get away without acquiring it in the more frequent hot-paths. There's some additional operations which can be done without the lock, but aren't in this patch; but the most important places are covered. As bufmgr.c now essentially re-implements spinlocks, abstract the delay logic from s_lock.c into something more generic. It now has already two users, and more are coming up; there's a follupw patch for lwlock.c at least. This patch is based on a proof-of-concept written by me, which Alexander Korotkov made into a fully working patch; the committed version is again revised by me. Benchmarking and testing has, amongst others, been provided by Dilip Kumar, Alexander Korotkov, Robert Haas. On a large x86 system improvements for readonly pgbench, with a high client count, of a factor of 8 have been observed. Author: Alexander Korotkov and Andres Freund Discussion: 2400449.GjM57CE0Yg@dinodell	2016-04-10 20:12:32 -07:00

1 2 3 4 5 ...

513 Commits