postgres

mirror of https://github.com/postgres/postgres.git synced 2025-07-24 14:22:24 +03:00

Author	SHA1	Message	Date
Tom Lane	c83aed34be	Avoid wholesale autovacuuming when autovacuum is nominally off. When autovacuum is nominally off, we will still launch autovac workers to vacuum tables that are at risk of XID wraparound. But after we'd done that, an autovac worker would proceed to autovacuum every table in the targeted database, if they meet the usual thresholds for autovacuuming. This is at best pretty unexpected; at worst it delays response to the wraparound threat. Fix it so that if autovacuum is nominally off, we only do forced vacuums and not any other work. Per gripe from Andrey Zhidenkov. This has been like this all along, so back-patch to all supported branches.	2014-07-30 14:41:53 -04:00
Heikki Linnakangas	1578d13dc7	Treat 2PC commit/abort the same as regular xacts in recovery. There were several oversights in recovery code where COMMIT/ABORT PREPARED records were ignored: * pg_last_xact_replay_timestamp() (wasn't updated for 2PC commits) * recovery_min_apply_delay (2PC commits were applied immediately) * recovery_target_xid (recovery would not stop if the XID used 2PC) The first of those was reported by Sergiy Zuban in bug #11032, analyzed by Tom Lane and Andres Freund. The bug was always there, but was masked before commit `d19bd29f07`, because COMMIT PREPARED always created an extra regular transaction that was WAL-logged. Backpatch to all supported versions (older versions didn't have all the features and therefore didn't have all of the above bugs).	2014-07-29 11:58:01 +03:00
Robert Haas	18470e5f2f	Avoid access to already-released lock in LockRefindAndRelease. Spotted by Tom Lane.	2014-07-24 08:39:46 -04:00
Tom Lane	810f0d2a2d	Check block number against the correct fork in get_raw_page(). get_raw_page tried to validate the supplied block number against RelationGetNumberOfBlocks(), which of course is only right when accessing the main fork. In most cases, the main fork is longer than the others, so that the check was too weak (allowing a lower-level error to be reported, but no real harm to be done). However, very small tables could have an FSM larger than their heap, in which case the mistake prevented access to some FSM pages. Per report from Torsten Foertsch. In passing, make the bad-block-number error into an ereport not elog (since it's certainly not an internal error); and fix sloppily maintained comment for RelationGetNumberOfBlocksInFork. This has been wrong since we invented relation forks, so back-patch to all supported branches.	2014-07-22 11:45:57 -04:00
Tom Lane	f54d97c5ef	Reject out-of-range numeric timezone specifications. In commit `631dc390f4`, we started to handle simple numeric timezone offsets via the zic library instead of the old CTimeZone/HasCTZSet kluge. However, we overlooked the fact that the zic code will reject UTC offsets exceeding a week (which seems a bit arbitrary, but not because it's too tight ...). This led to possibly setting session_timezone to NULL, which results in crashes in most timezone-related operations as of 9.4, and crashes in a small number of places even before that. So check for NULL return from pg_tzset_offset() and report an appropriate error message. Per bug #11014 from Duncan Gillis. Back-patch to all supported branches, like the previous patch. (Unfortunately, as of today that no longer includes 8.4.)	2014-07-21 22:41:30 -04:00
Peter Eisentraut	44c4013431	Translation updates	2014-07-21 01:02:07 -04:00
Tom Lane	a223b9e361	Fix two low-probability memory leaks in regular expression parsing. If pg_regcomp failed after having invoked markst/cleanst, it would leak any "struct subre" nodes it had created. (We've already detected all regex syntax errors at that point, so the only likely causes of later failure would be query cancel or out-of-memory.) To fix, make sure freesrnode knows the difference between the pre-cleanst and post-cleanst cleanup procedures. Add some documentation of this less-than-obvious point. Also, newlacon did the wrong thing with an out-of-memory failure from realloc(), so that the previously allocated array would be leaked. Both of these are pretty low-probability scenarios, but a bug is a bug, so patch all the way back. Per bug #10976 from Arthur O'Dwyer.	2014-07-18 13:00:48 -04:00
Alvaro Herrera	b42f09fc80	Fix REASSIGN OWNED for text search objects Trying to reassign objects owned by a user that had text search dictionaries or configurations used to fail with: ERROR: unexpected classid 3600 or ERROR: unexpected classid 3602 Fix by adding cases for those object types in a switch in pg_shdepend.c. Both REASSIGN OWNED and text search objects go back all the way to 8.1, so backpatch to all supported branches. In 9.3 the alter-owner code was made generic, so the required change in recent branches is pretty simple; however, for 9.2 and older ones we need some additional reshuffling to enable specifying objects by OID rather than name. Text search templates and parsers are not owned objects, so there's no change required for them. Per bug #9749 reported by Michal Novotný	2014-07-15 13:24:07 -04:00
Simon Riggs	2dde11a632	Reset master xmin when hot_standby_feedback disabled. If walsender has xmin of standby then ensure we reset the value to 0 when we change from hot_standby_feedback=on to hot_standby_feedback=off.	2014-07-15 14:40:23 +01:00
Tom Lane	261f954e7a	Fix bug with whole-row references to append subplans. ExecEvalWholeRowVar incorrectly supposed that it could "bless" the source TupleTableSlot just once per query. But if the input is coming from an Append (or, perhaps, other cases?) more than one slot might be returned over the query run. This led to "record type has not been registered" errors when a composite datum was extracted from a non-blessed slot. This bug has been there a long time; I guess it escaped notice because when dealing with subqueries the planner tends to expand whole-row Vars into RowExprs, which don't have the same problem. It is possible to trigger the problem in all active branches, though, as illustrated by the added regression test.	2014-07-11 19:12:45 -04:00
Tom Lane	189bd09cbd	Don't assume a subquery's output is unique if there's a SRF in its tlist. While the x output of "select x from t group by x" can be presumed unique, this does not hold for "select x, generate_series(1,10) from t group by x", because we may expand the set-returning function after the grouping step. (Perhaps that should be re-thought; but considering all the other oddities involved with SRFs in targetlists, it seems unlikely we'll change it.) Put a check in query_is_distinct_for() so it's not fooled by such cases. Back-patch to all supported branches. David Rowley	2014-07-08 14:03:23 -04:00
Tom Lane	981518ea19	Add some errdetail to checkRuleResultList(). This function wasn't originally thought to be really user-facing, because converting a table to a view isn't something we expect people to do manually. So not all that much effort was spent on the error messages; in particular, while the code will complain that you got the column types wrong it won't say exactly what they are. But since we repurposed the code to also check compatibility of rule RETURNING lists, it's definitely user-facing. It now seems worthwhile to add errdetail messages showing exactly what the conflict is when there's a mismatch of column names or types. This is prompted by bug #10836 from Matthias Raffelsieper, which might have been forestalled if the error message had reported the wrong column type as being "record". Per Alvaro's advice, back-patch to branches before 9.4, but resist the temptation to rephrase any existing strings there. Adding new strings is not really a translation degradation; anyway having the info presented in English is better than not having it at all.	2014-07-02 14:20:40 -04:00
Tom Lane	0cf16686bc	Back-patch "Fix EquivalenceClass processing for nested append relations". When we committed `a87c729153`, we somehow failed to notice that it didn't merely improve plan quality for expression indexes; there were very closely related cases that failed outright with "could not find pathkey item to sort". The failing cases seem to be those where the planner was already capable of selecting a MergeAppend plan, and there was inheritance involved: the lack of appropriate eclass child members would prevent prepare_sort_from_pathkeys() from succeeding on the MergeAppend's child plan nodes for inheritance child tables. Accordingly, back-patch into 9.1 through 9.3, along with an extra regression test case covering the problem. Per trouble report from Michael Glaesemann.	2014-06-26 10:42:03 -07:00
Heikki Linnakangas	1c9f9e888f	Don't allow foreign tables with OIDs. The syntax doesn't let you specify "WITH OIDS" for foreign tables, but it was still possible with default_with_oids=true. But the rest of the system, including pg_dump, isn't prepared to handle foreign tables with OIDs properly. Backpatch down to 9.1, where foreign tables were introduced. It's possible that there are databases out there that already have foreign tables with OIDs. There isn't much we can do about that, but at least we can prevent them from being created in the future. Patch by Etsuro Fujita, reviewed by Hadi Moshayedi.	2014-06-24 13:30:54 +03:00
Tom Lane	b568d38361	Avoid leaking memory while evaluating arguments for a table function. ExecMakeTableFunctionResult evaluated the arguments for a function-in-FROM in the query-lifespan memory context. This is insignificant in simple cases where the function relation is scanned only once; but if the function is in a sub-SELECT or is on the inside of a nested loop, any memory consumed during argument evaluation can add up quickly. (The potential for trouble here had been foreseen long ago, per existing comments; but we'd not previously seen a complaint from the field about it.) To fix, create an additional temporary context just for this purpose. Per an example from MauMau. Back-patch to all active branches.	2014-06-19 22:13:51 -04:00
Tom Lane	8023235357	Fix ancient encoding error in hungarian.stop. When we grabbed this file off the Snowball project's website, we mistakenly supposed that it was in LATIN1 encoding, but evidently it was actually in LATIN2. This resulted in ő (o-double-acute, U+0151, which is code 0xF5 in LATIN2) being misconverted into õ (o-tilde, U+00F5), as complained of in bug #10589 from Zoltán Sörös. We'd have messed up u-double-acute too, but there aren't any of those in the file. Other characters used in the file have the same codes in LATIN1 and LATIN2, which no doubt helped hide the problem for so long. The error is not only ours: the Snowball project also was confused about which encoding is required for Hungarian. But dealing with that will require source-code changes that I'm not at all sure we'll wish to back-patch. Fixing the stopword file seems reasonably safe to back-patch however.	2014-06-10 22:48:45 -04:00
Tom Lane	187ae17300	Fix planner bug with nested PlaceHolderVars in 9.2 (only). Commit `9e7e29c75a` fixed some problems with LATERAL references in PlaceHolderVars, one of which was that "createplan.c wasn't handling nested PlaceHolderVars properly". I failed to see that this problem might occur in older versions as well; but it can, as demonstrated in bug #10587 from Geoff Speicher. In this case the nesting occurs due to push-down of PlaceHolderVar expressions into a parameterized path. So, back-patch the relevant changes from `9e7e29c75a` into 9.2 where parameterized paths were introduced. (Perhaps I'm still being too myopic, but I'm hesitant to change older branches without some evidence that the case can occur there.)	2014-06-09 21:28:56 -04:00
Tom Lane	93328b2dfb	Fix infinite loop when splitting inner tuples in SPGiST text indexes. Previously, the code used a node label of zero both for strings that contain no bytes beyond the inner tuple's prefix, and for cases where an "allTheSame" inner tuple has to be split to allow a string with a different next byte to be inserted into it. Failing to distinguish these cases meant that if a string ending with the current prefix needed to be inserted into an allTheSame tuple, we got into an infinite loop, because after splitting the tuple we'd descend into the child allTheSame tuple and then find we need to split again. To fix, instead use -1 and -2 as the node labels for these two cases. This requires widening the node label type from "char" to int2, but fortunately SPGiST stores all pass-by-value node label types in their Datum representation, which means that this change is transparently upward compatible so far as the on-disk representation goes. We continue to recognize zero as a dummy node label for reading purposes, but will not attempt to push new index entries down into such a label, so that the loop won't occur even when dealing with an existing index. Per report from Teodor Sigaev. Back-patch to 9.2 where the faulty code was introduced.	2014-06-09 16:30:46 -04:00
Tom Lane	4fb6478275	Add defenses against running with a wrong selection of LOBLKSIZE. It's critical that the backend's idea of LOBLKSIZE match the way data has actually been divided up in pg_largeobject. While we don't provide any direct way to adjust that value, doing so is a one-line source code change and various people have expressed interest recently in changing it. So, just as with TOAST_MAX_CHUNK_SIZE, it seems prudent to record the value in pg_control and cross-check that the backend's compiled-in setting matches the on-disk data. Also tweak the code in inv_api.c so that fetches from pg_largeobject explicitly verify that the length of the data field is not more than LOBLKSIZE. Formerly we just had Asserts() for that, which is no protection at all in production builds. In some of the call sites an overlength data value would translate directly to a security-relevant stack clobber, so it seems worth one extra runtime comparison to be sure. In the back branches, we can't change the contents of pg_control; but we can still make the extra checks in inv_api.c, which will offer some amount of protection against running with the wrong value of LOBLKSIZE.	2014-06-05 11:31:12 -04:00
Andres Freund	315442c011	Fix longstanding bug in HeapTupleSatisfiesVacuum(). HeapTupleSatisfiesVacuum() didn't properly discern between DELETE_IN_PROGRESS and INSERT_IN_PROGRESS for rows that have been inserted in the current transaction and deleted in a aborted subtransaction of the current backend. At the very least that caused problems for CLUSTER and CREATE INDEX in transactions that had aborting subtransactions producing rows, leading to warnings like: WARNING: concurrent delete in progress within table "..." possibly in an endless, uninterruptible, loop. Instead of treating InProgress xmins the same as IsCurrent ones, treat them as being distinct like the other visibility routines. As implemented this separatation can cause a behaviour change for rows that have been inserted and deleted in another, still running, transaction. HTSV will now return INSERT_IN_PROGRESS instead of DELETE_IN_PROGRESS for those. That's both, more in line with the other visibility routines and arguably more correct. The latter because a INSERT_IN_PROGRESS will make callers look at/wait for xmin, instead of xmax. The only current caller where that's possibly worse than the old behaviour is heap_prune_chain() which now won't mark the page as prunable if a row has concurrently been inserted and deleted. That's harmless enough. As a cautionary measure also insert a interrupt check before the gotos in IndexBuildHeapScan() that lead to the uninterruptible loop. There are other possible causes, like a row that several sessions try to update and all fail, for repeated loops and the cost of doing so in the retry case is low. As this bug goes back all the way to the introduction of subtransactions in `573a71a5da` backpatch to all supported releases. Reported-By: Sandro Santilli	2014-06-04 23:25:52 +02:00
Andres Freund	f998e99400	Set the process latch when processing recovery conflict interrupts. Because RecoveryConflictInterrupt() didn't set the process latch anything using the latter to wait for events didn't get notified about recovery conflicts. Most latch users are never the target of recovery conflicts, which explains the lack of reports about this until now. Since 9.3 two possible affected users exist though: The sql callable pg_sleep() now uses latches to wait and background workers are expected to use latches in their main loop. Both would currently wait until the end of WaitLatch's timeout. Fix by adding a SetLatch() to RecoveryConflictInterrupt(). It'd also be possible to fix the issue by having each latch user set set_latch_on_sigusr1. That seems failure prone and though, as most of these callsites won't often receive recovery conflicts and thus will likely only be tested against normal query cancels et al. It'd also be unnecessarily verbose. Backpatch to 9.1 where latches were introduced. Arguably 9.3 would be sufficient, because that's where pg_sleep() was converted to waiting on the latch and background workers got introduced; but there could be user level code making use of the latch pre 9.3.	2014-06-03 14:18:04 +02:00
Tom Lane	952b036054	Revert "Fix bogus %name-prefix option syntax in all our Bison files." This reverts commit `867363cbcb`. It turns out that the %name-prefix syntax without "=" does not work at all in pre-2.4 Bison. We are not prepared to make such a large jump in minimum required Bison version just to suppress a warning message in a version hardly any developers are using yet. When 3.0 gets more popular, we'll figure out a way to deal with this. In the meantime, BISONFLAGS=-Wno-deprecated is recommendable for anyone using 3.0 who doesn't want to see the warning.	2014-05-28 19:29:05 -04:00
Tom Lane	867363cbcb	Fix bogus %name-prefix option syntax in all our Bison files. %name-prefix doesn't use an "=" sign according to the Bison docs, but it silently accepted one anyway, until Bison 3.0. This was originally a typo of mine in commit `012abebab1`, and we seem to have slavishly copied the error into all the other grammar files. Per report from Vik Fearing; analysis by Peter Eisentraut. Back-patch to all active branches, since somebody might try to build a back branch with up-to-date tools.	2014-05-28 15:41:58 -04:00
Magnus Hagander	dbcde0f4d6	Ensure cleanup in case of early errors in streaming base backups Move the code that sends the initial status information as well as the calculation of paths inside the ENSURE_ERROR_CLEANUP block. If this code failed, we would "leak" a counter of number of concurrent backups, thereby making the system always believe it was in backup mode. This could happen if the sending failed (which it probably never did given that the small amount of data to send would never cause a flush). It is very low risk, but all operations after do_pg_start_backup should be protected.	2014-05-28 13:03:21 +02:00
Tom Lane	31f579f09c	Prevent auto_explain from changing the output of a user's EXPLAIN. Commit `af7914c662`, which introduced the EXPLAIN (TIMING) option, for some reason coded explain.c to look at planstate->instrument->need_timer rather than es->timing to decide whether to print timing info. However, the former flag might get set as a result of contrib/auto_explain wanting timing information. We certainly don't want activation of auto_explain to change user-visible statement behavior, so fix that. Also fix an independent bug introduced in the same patch: in the code path for a never-executed node with a machine-friendly output format, if timing was selected, it would fail to print the Actual Rows and Actual Loops items. Per bug #10404 from Tomonari Katsumata. Back-patch to 9.2 where the faulty code was introduced.	2014-05-20 12:20:57 -04:00
Heikki Linnakangas	0128a7712d	Use 0-based numbering in comments about backup blocks. The macros and functions that work with backup blocks in the redo function use 0-based numbering, so let's use that consistently in the function that generates the records too. Makes it so much easier to compare the generation and replay functions. Backpatch to 9.0, where we switched from 1-based to 0-based numbering.	2014-05-19 13:27:21 +03:00
Heikki Linnakangas	0d4c75f4de	Initialize tsId and dbId fields in WAL record of COMMIT PREPARED. Commit `dd428c79` added dbId and tsId to the xl_xact_commit struct but missed that prepared transaction commits reuse that struct. Fix that. Because those fields were left unitialized, replaying a commit prepared WAL record in a hot standby node would fail to remove the relcache init file. That can lead to "could not open file" errors on the standby. Relcache init file only needs to be removed when a system table/index is rewritten in the transaction using two phase commit, so that should be rare in practice. In HEAD, the incorrect dbId/tsId values are also used for filtering in logical replication code, causing the transaction to always be filtered out. Analysis and fix by Andres Freund. Backpatch to 9.0 where hot standby was introduced.	2014-05-16 10:05:57 +03:00
Tom Lane	9601cb7bcc	Fix unportable setvbuf() usage in initdb. In yesterday's commit `2dc4f011fd`, I tried to force buffering of stdout/stderr in initdb to be what it is by default when the program is run interactively on Unix (since that's how most manual testing is done). This tripped over the fact that Windows doesn't support _IOLBF mode. We dealt with that a long time ago in syslogger.c by falling back to unbuffered mode on Windows. Export that solution in port.h and use it in initdb. Back-patch to 8.4, like the previous commit.	2014-05-15 15:58:01 -04:00
Heikki Linnakangas	479a36f234	Handle duplicate XIDs in txid_snapshot. The proc array can contain duplicate XIDs, when a transaction is just being prepared for two-phase commit. To cope, remove any duplicates in txid_current_snapshot(). Also ignore duplicates in the input functions, so that if e.g. you have an old pg_dump file that already contains duplicates, it will be accepted. Report and fix by Jan Wieck. Backpatch to all supported versions.	2014-05-15 18:31:00 +03:00
Heikki Linnakangas	d8dbeb0482	Fix race condition in preparing a transaction for two-phase commit. To lock a prepared transaction's shared memory entry, we used to mark it with the XID of the backend. When the XID was no longer active according to the proc array, the entry was implicitly considered as not locked anymore. However, when preparing a transaction, the backend's proc array entry was cleared before transfering the locks (and some other state) to the prepared transaction's dummy PGPROC entry, so there was a window where another backend could finish the transaction before it was in fact fully prepared. To fix, rewrite the locking mechanism of global transaction entries. Instead of an XID, just have simple locked-or-not flag in each entry (we store the locking backend's backend id rather than a simple boolean, but that's just for debugging purposes). The backend is responsible for explicitly unlocking the entry, and to make sure that that happens, install a callback to unlock it on abort or process exit. Backpatch to all supported versions.	2014-05-15 16:58:14 +03:00
Tom Lane	25c933c5c9	Get rid of bogus dependency on typcategory in to_json() and friends. These functions were relying on typcategory to identify arrays and composites, which is not reliable and not the normal way to do it. Using typcategory to identify boolean, numeric types, and json itself is also pretty questionable, though the code in those cases didn't seem to be at risk of anything worse than wrong output. Instead, use the standard lsyscache functions to identify arrays and composites, and rely on a direct check of the type OID for the other cases. In HEAD, also be sure to look through domains so that a domain is treated the same as its base type for conversions to JSON. However, this is a small behavioral change; given the lack of field complaints, we won't back-patch it. In passing, refactor so that there's only one copy of the code that decides which conversion strategy to apply, not multiple copies that could (and have) gotten out of sync.	2014-05-09 12:55:06 -04:00
Heikki Linnakangas	31633f9925	Protect against torn pages when deleting GIN list pages. To-be-deleted list pages contain no useful information, as they are being deleted, but we must still protect the writes from being torn by a crash after a partial write. To do that, re-initialize the pages on WAL replay. Jeff Janes caught this with a test program to test partial writes. Backpatch to all supported versions.	2014-05-08 14:43:39 +03:00
Tom Lane	022b5f2b22	Fix failure to set ActiveSnapshot while rewinding a cursor. ActiveSnapshot needs to be set when we call ExecutorRewind because some plan node types may execute user-defined functions during their ReScan calls (nodeLimit.c does so, at least). The wisdom of that is somewhat debatable, perhaps, but for now the simplest fix is to make sure the required context is valid. Failure to do this typically led to a null-pointer-dereference core dump, though it's possible that in more complex cases a function could be executed with the wrong snapshot leading to very subtle misbehavior. Per report from Leif Jensen. It's been broken for a long time, so back-patch to all active branches.	2014-05-07 14:25:17 -04:00
Bruce Momjian	0b44914c21	Remove tabs after spaces in C comments This was not changed in HEAD, but will be done later as part of a pgindent run. Future pgindent runs will also do this. Report by Tom Lane Backpatch through all supported branches, but not HEAD	2014-05-06 11:26:27 -04:00
Heikki Linnakangas	17b04a1580	Fix use of free in walsender error handling after a sysid mismatch. Found via valgrind. The bug exists since the introduction of the walsender, so backpatch to 9.0. Andres Freund	2014-05-06 15:17:00 +03:00
Tom Lane	c8fbeeb45e	Fix possible cache invalidation failure in ReceiveSharedInvalidMessages. Commit `fad153ec45` modified sinval.c to reduce the number of calls into sinvaladt.c (which require taking a shared lock) by keeping a local buffer of collected-but-not-yet-processed messages. However, if processing of the last message in a batch resulted in a recursive call to ReceiveSharedInvalidMessages, we could overwrite that message with a new one while the outer invalidation function was still working on it. This would be likely to lead to invalidation of the wrong cache entry, allowing subsequent processing to use stale cache data. The fix is just to make a local copy of each message while we're processing it. Spotted by Andres Freund. Back-patch to 8.4 where the bug was introduced.	2014-05-05 14:43:46 -04:00
Tom Lane	8c43980a18	Fix failure to detoast fields in composite elements of structured types. If we have an array of records stored on disk, the individual record fields cannot contain out-of-line TOAST pointers: the tuptoaster.c mechanisms are only prepared to deal with TOAST pointers appearing in top-level fields of a stored row. The same applies for ranges over composite types, nested composites, etc. However, the existing code only took care of expanding sub-field TOAST pointers for the case of nested composites, not for other structured types containing composites. For example, given a command such as UPDATE tab SET arraycol = ARRAY[(ROW(x,42)::mycompositetype] ... where x is a direct reference to a field of an on-disk tuple, if that field is long enough to be toasted out-of-line then the TOAST pointer would be inserted as-is into the array column. If the source record for x is later deleted, the array field value would become a dangling pointer, leading to errors along the line of "missing chunk number 0 for toast value ..." when the value is referenced. A reproducible test case for this was provided by Jan Pecek, but it seems likely that some of the "missing chunk number" reports we've heard in the past were caused by similar issues. Code-wise, the problem is that PG_DETOAST_DATUM() is not adequate to produce a self-contained Datum value if the Datum is of composite type. Seen in this light, the problem is not just confined to arrays and ranges, but could also affect some other places where detoasting is done in that way, for example form_index_tuple(). I tried teaching the array code to apply toast_flatten_tuple_attribute() along with PG_DETOAST_DATUM() when the array element type is composite, but this was messy and imposed extra cache lookup costs whether or not any TOAST pointers were present, indeed sometimes when the array element type isn't even composite (since sometimes it takes a typcache lookup to find that out). The idea of extending that approach to all the places that currently use PG_DETOAST_DATUM() wasn't attractive at all. This patch instead solves the problem by decreeing that composite Datum values must not contain any out-of-line TOAST pointers in the first place; that is, we expand out-of-line fields at the point of constructing a composite Datum, not at the point where we're about to insert it into a larger tuple. This rule is applied only to true composite Datums, not to tuples that are being passed around the system as tuples, so it's not as invasive as it might sound at first. With this approach, the amount of code that has to be touched for a full solution is greatly reduced, and added cache lookup costs are avoided except when there actually is a TOAST pointer that needs to be inlined. The main drawback of this approach is that we might sometimes dereference a TOAST pointer that will never actually be used by the query, imposing a rather large cost that wasn't there before. On the other side of the coin, if the field value is used multiple times then we'll come out ahead by avoiding repeat detoastings. Experimentation suggests that common SQL coding patterns are unaffected either way, though. Applications that are very negatively affected could be advised to modify their code to not fetch columns they won't be using. In future, we might consider reverting this solution in favor of detoasting only at the point where data is about to be stored to disk, using some method that can drill down into multiple levels of nested structured types. That will require defining new APIs for structured types, though, so it doesn't seem feasible as a back-patchable fix. Note that this patch changes HeapTupleGetDatum() from a macro to a function call; this means that any third-party code using that macro will not get protection against creating TOAST-pointer-containing Datums until it's recompiled. The same applies to any uses of PG_RETURN_HEAPTUPLEHEADER(). It seems likely that this is not a big problem in practice: most of the tuple-returning functions in core and contrib produce outputs that could not possibly be toasted anyway, and the same probably holds for third-party extensions. This bug has existed since TOAST was invented, so back-patch to all supported branches.	2014-05-01 15:19:14 -04:00
Tom Lane	920fbc1b43	Check for interrupts and stack overflow during rule/view dumps. Since ruleutils.c recurses, it could be driven to stack overflow by deeply nested constructs. Very large queries might also take long enough to deparse that a check for interrupts seems like a good idea. Stick appropriate tests into a couple of key places. Noted by Greg Stark. Back-patch to all supported branches.	2014-04-30 13:46:19 -04:00
Tom Lane	0901dbab33	Improve planner to drop constant-NULL inputs of AND/OR where it's legal. In general we can't discard constant-NULL inputs, since they could change the result of the AND/OR to be NULL. But at top level of WHERE, we do not need to distinguish a NULL result from a FALSE result, so it's okay to treat NULL as FALSE and then simplify AND/OR accordingly. This is a very ancient oversight, but in 9.2 and later it can lead to failure to optimize queries that previous releases did optimize, as a result of more aggressive parameter substitution rules making it possible to reduce more subexpressions to NULL constants. This is the root cause of bug #10171 from Arnold Scheffler. We could alternatively have fixed that by teaching orclauses.c to ignore constant-NULL OR arms, but it seems better to get rid of them globally. I resisted the temptation to back-patch this change into all active branches, but it seems appropriate to back-patch as far as 9.2 so that there will not be performance regressions of the kind shown in this bug.	2014-04-29 13:12:33 -04:00
Heikki Linnakangas	1e96eff43a	Fix two bugs in WAL-logging of GIN pending-list pages. In writeListPage, never take a full-page image of the page, because we have all the information required to re-initialize in the WAL record anyway. Before this fix, a full-page image was always generated, unless full_page_writes=off, because when the page is initialized its LSN is always 0. In stable-branches, keep the code to restore the backup blocks if they exist, in case that the WAL is generated with an older minor version, but in master Assert that there are no full-page images. In the redo routine, add missing "off++". Otherwise the tuples are added to the page in reverse order. That happens to be harmless because we always scan and remove all the tuples together, but it was clearly wrong. Also, it was masked by the first bug unless full_page_writes=off, because the page was always restored from a full-page image. Backpatch to all supported versions.	2014-04-28 17:31:07 +03:00
Tom Lane	ea9ac77419	Reset pg_stat_activity.xact_start during PREPARE TRANSACTION. Once we've completed a PREPARE, our session is not running a transaction, so its entry in pg_stat_activity should show xact_start as null, rather than leaving the value as the start time of the now-prepared transaction. I think possibly this oversight was triggered by faulty extrapolation from the adjacent comment that says PrepareTransaction should not call AtEOXact_PgStat, so tweak the wording of that comment. Noted by Andres Freund while considering bug #10123 from Maxim Boguk, although this error doesn't seem to explain that report. Back-patch to all active branches.	2014-04-24 13:30:00 -04:00
Heikki Linnakangas	6c5cba8e0b	Fix typos in comment.	2014-04-23 12:58:18 +03:00
Tom Lane	bac05d622d	Use AF_UNSPEC not PF_UNSPEC in getaddrinfo calls. According to the Single Unix Spec and assorted man pages, you're supposed to use the constants named AF_xxx when setting ai_family for a getaddrinfo call. In a few places we were using PF_xxx instead. Use of PF_xxx appears to be an ancient BSD convention that was not adopted by later standardization. On BSD and most later Unixen, it doesn't matter much because those constants have equivalent values anyway; but nonetheless this code is not per spec. In the same vein, replace PF_INET by AF_INET in one socket() call, which wasn't even consistent with the other socket() call in the same function let alone the remainder of our code. Per investigation of a Cygwin trouble report from Marco Atzeri. It's probably a long shot that this will fix his issue, but it's wrong in any case.	2014-04-16 13:21:31 -04:00
Bruce Momjian	966f015b60	check socket creation errors against PGINVALID_SOCKET Previously, in some places, socket creation errors were checked for negative values, which is not true for Windows because sockets are unsigned. This masked socket creation errors on Windows. Backpatch through 9.0. 8.4 doesn't have the infrastructure to fix this.	2014-04-16 10:45:48 -04:00
Heikki Linnakangas	a4c4e0bf60	Use correctly-sized buffer when zero-filling a WAL file. I mixed up BLCKSZ and XLOG_BLCKSZ when I changed the way the buffer is allocated a couple of weeks ago. With the default settings, they are both 8k, but they can be changed at compile-time.	2014-04-16 10:26:54 +03:00
Heikki Linnakangas	02b9fd73ee	Fix hot standby bug with GiST scans. Don't reset the rightlink of a page when replaying a page update record. This was a leftover from pre-hot standby days, when it was not possible to have scans concurrent with WAL replay. Resetting the right-link was not necessary back then either, but it was done for the sake of tidiness. But with hot standby, it's wrong, because a concurrent scan might still need it. Backpatch all versions with hot standby, 9.0 and above.	2014-04-08 14:51:56 +03:00
Robert Haas	74cf802841	Assert that strong-lock count is >0 everywhere it's decremented. The one existing assertion of this type has tripped a few times in the buildfarm lately, but it's not clear whether the problem is really originating there or whether it's leftovers from a trip through one of the other two paths that lack a matching assertion. So add one. Since the same bug(s) most likely exist(s) in the back-branches also, back-patch to 9.2, where the fast-path lock mechanism was added.	2014-04-07 11:15:13 -04:00
Tom Lane	53463e2479	Block signals earlier during postmaster startup. Formerly, we set up the postmaster's signal handling only when we were about to start launching subprocesses. This is a bad idea though, as it means that for example a SIGINT arriving before that will kill the postmaster instantly, perhaps leaving lockfiles, socket files, shared memory, etc laying about. We'd rather that such a signal caused orderly postmaster termination including releasing of those resources. A simple fix is to move the PostmasterMain stanza that initializes signal handling to an earlier point, before we've created any such resources. Then, an early-arriving signal will be blocked until we're ready to deal with it in the usual way. (The only part that really needs to be moved up is blocking of signals, but it seems best to keep the signal handler installation calls together with that; for one thing this ensures the kernel won't drop any signals we wished to get. The handlers won't get invoked in any case until we unblock signals in ServerLoop.) Per a report from MauMau. He proposed changing the way "pg_ctl stop" works to deal with this, but that'd just be masking one symptom not fixing the core issue. It's been like this since forever, so back-patch to all supported branches.	2014-04-05 18:16:14 -04:00
Tom Lane	bdc3e95c2a	Fix processing of PGC_BACKEND GUC parameters on Windows. EXEC_BACKEND builds (i.e., Windows) failed to absorb values of PGC_BACKEND parameters if they'd been changed post-startup via the config file. This for example prevented log_connections from working if it were turned on post-startup. The mechanism for handling this case has always been a bit of a kluge, and it wasn't revisited when we implemented EXEC_BACKEND. While in a normal forking environment new backends will inherit the postmaster's value of such settings, EXEC_BACKEND backends have to read the settings from the CONFIG_EXEC_PARAMS file, and they were mistakenly rejecting them. So this case has always been broken in the Windows port; so back-patch to all supported branches. Amit Kapila	2014-04-05 12:41:31 -04:00
Tom Lane	1a496a12b3	Fix tablespace creation WAL replay to work on Windows. The code segment that removes the old symlink (if present) wasn't clued into the fact that on Windows, symlinks are junction points which have to be removed with rmdir(). Backpatch to 9.0, where the failing code was introduced. MauMau, reviewed by Muhammad Asif Naeem and Amit Kapila	2014-04-04 23:09:41 -04:00

1 2 3 4 5 ...

13127 Commits