postgres

mirror of https://github.com/postgres/postgres.git synced 2025-12-06 00:02:13 +03:00

Author	SHA1	Message	Date
Andrew Dunstan	d7b5a31ce9	fix whitespace	2014-02-01 16:30:22 -05:00
Tom Lane	6f1a407731	Fix some more bugs in signal handlers and process shutdown logic. WalSndKill was doing things exactly backwards: it should first clear MyWalSnd (to stop signal handlers from touching MyWalSnd->latch), then disown the latch, and only then mark the WalSnd struct unused by clearing its pid field. Also, WalRcvSigUsr1Handler and worker_spi_sighup failed to preserve errno, which is surely a requirement for any signal handler. Per discussion of recent buildfarm failures. Back-patch as far as the relevant code exists.	2014-02-01 16:21:30 -05:00
Andrew Dunstan	27942baf49	Don't use deprecated dllwrap on Cygwin. The preferred method is to use "cc -shared", and this allows binaries to be rebased if required, unlike dllwrap. Backpatch to 9.0 where we have buildfarm coverage. There are still some issues with Cygwin, especially modern Cygwin, but this helps us get closer to good support. Marco Atzeri.	2014-02-01 16:13:32 -05:00
Andrew Dunstan	1e9876c3b6	Copy the libpq DLL to the bin directory on Mingw and Cygwin. This has long been done by the MSVC build system, and has caused confusion in the past when programs like psql have failed to start because they can't find the DLL. If it's in the same directory as it now will be they will find it. Backpatch to all live branches.	2014-02-01 15:16:06 -05:00
Robert Haas	5d807a74bf	Clear MyProc and MyProcSignalState before they become invalid. Evidence from buildfarm member crake suggests that the new test_shm_mq module is routinely crashing the server due to the arrival of a SIGUSR1 after the shared memory segment has been unmapped. Although processes using the new dynamic background worker facilities are more likely to receive a SIGUSR1 around this time, the problem is also possible on older branches, so I'm back-patching the parts of this change that apply to older branches as far as they apply. It's already generally the case that code checks whether these pointers are NULL before deferencing them, so the important thing is mostly to make sure that they do get set to NULL before they become invalid. But in master, there's one case in procsignal_sigusr1_handler that lacks a NULL guard, so add that. Patch by me; review by Tom Lane.	2014-01-31 21:34:44 -05:00
Tom Lane	a4aa854cad	Fix bogus handling of "postponed" lateral quals. When pulling a "postponed" qual from a LATERAL subquery up into the quals of an outer join, we must make sure that the postponed qual is included in those seen by make_outerjoininfo(). Otherwise we might compute a too-small min_lefthand or min_righthand for the outer join, leading to "JOIN qualification cannot refer to other relations" failures from distribute_qual_to_rels. Subtler errors in the created plan seem possible, too, if the extra qual would only affect join ordering constraints. Per bug #9041 from David Leverton. Back-patch to 9.3.	2014-01-30 14:51:19 -05:00
Tom Lane	bf8ee6f157	Fix unsafe references to errno within error messaging logic. Various places were supposing that errno could be expected to hold still within an ereport() nest or similar contexts. This isn't true necessarily, though in some cases it accidentally failed to fail depending on how the compiler chanced to order the subexpressions. This class of thinko explains recent reports of odd failures on clang-built versions, typically missing or inappropriate HINT fields in messages. Problem identified by Christian Kruse, who also submitted the patch this commit is based on. (I fixed a few issues in his patch and found a couple of additional places with the same disease.) Back-patch as appropriate to all supported branches.	2014-01-29 20:04:01 -05:00
Andrew Dunstan	56c08df55b	Enable building with Visual Studion 2013. Backpatch to 9.3. Brar Piening.	2014-01-26 09:45:43 -05:00
Stephen Frost	8cb90b21af	Avoid minor leak in parallel pg_dump During parallel pg_dump, a worker process closing the connection caused a minor memory leak (particularly minor as we are likely about to exit anyway). Instead, free the memory in this case prior to returning NULL to indicate connection closed. Spotting by the Coverity scanner. Back patch to 9.3 where this was introduced.	2014-01-24 15:12:54 -05:00
Fujii Masao	be5d499743	Fix bugs in PQhost(). In the platform that doesn't support Unix-domain socket, when neither host nor hostaddr are specified, the default host 'localhost' is used to connect to the server and PQhost() must return that, but it didn't. This patch fixes PQhost() so that it returns the default host in that case. Also this patch fixes PQhost() so that it doesn't return Unix-domain socket directory path in the platform that doesn't support Unix-domain socket. Back-patch to all supported versions.	2014-01-23 23:00:30 +09:00
Stephen Frost	d1e3070f04	Allow type_func_name_keywords in even more places A while back, `2c92edad48` allowed type_func_name_keywords to be used in more places, including role identifiers. Unfortunately, that commit missed out on cases where name_list was used for lists-of-roles, eg: for DROP ROLE. This resulted in the unfortunate situation that you could CREATE a role with a type_func_name_keywords-allowed identifier, but not DROP it (directly- ALTER could be used to rename it to something which could be DROP'd). This extends allowing type_func_name_keywords to places where role lists can be used. Back-patch to 9.0, as `2c92edad48` was.	2014-01-21 22:56:30 -05:00
Tom Lane	0950d67ee5	Tweak parse location assignment for CURRENT_DATE and related constructs. All these constructs generate parse trees consisting of a Const and a run-time type coercion (perhaps a FuncExpr or a CoerceViaIO). Modify the raw parse output so that we end up with the original token's location attached to the type coercion node while the Const has location -1; before, it was the other way around. This makes no difference in terms of what exprLocation() will say about the parse tree as a whole, so it should not have any user-visible impact. The point of changing it is that we do not want contrib/pg_stat_statements to treat these constructs as replaceable constants. It will do the right thing if the Const has location -1 rather than a valid location. This is a pretty ugly hack, but then this code is ugly already; we should someday replace this translation with special-purpose parse node(s) that would allow ruleutils.c to reconstruct the original query text. (See also commit `5d3fcc4c2e`, which also hacked location assignment rules for the benefit of pg_stat_statements.) Back-patch to 9.2 where pg_stat_statements grew the ability to recognize replaceable constants. Kyotaro Horiguchi	2014-01-21 16:34:31 -05:00
Robert Haas	3888b73f9a	Fix inadvertent semantics change in last patch to plug memory leaks. Commit `a5bca4ef03` accidentally changed the semantics when the "skipping missing configuration file" is emitted, because it forced OK to true instead of leaving the value untouched. Spotted by Tom Lane.	2014-01-21 11:49:13 -05:00
Robert Haas	efcdb625b3	Plug more memory leaks when reloading config file. Commit `138184adc5` plugged some but not all of the leaks from commit `2a0c81a12c`. This tightens things up some more. Amit Kapila, per an observation by Tom Lane	2014-01-21 09:48:14 -05:00
Stephen Frost	86e58ae023	Allow SET TABLESPACE to database default We've always allowed CREATE TABLE to create tables in the database's default tablespace without checking for CREATE permissions on that tablespace. Unfortunately, the original implementation of ALTER TABLE ... SET TABLESPACE didn't pick up on that exception. This changes ALTER TABLE ... SET TABLESPACE to allow the database's default tablespace without checking for CREATE rights on that tablespace, just as CREATE TABLE works today. Users could always do this through a series of commands (CREATE TABLE ... AS SELECT * FROM ...; DROP TABLE ...; etc), so let's fix the oversight in SET TABLESPACE's original implementation.	2014-01-18 18:49:08 -05:00
Peter Eisentraut	586bea6124	Fix client-only installation The psql Makefile was not creating $(datadir) before installing psqlrc.sample there. In most cases, the directory would be created in some other way, but for the documented from-source client-only installation procedure, it could fail. Reported-by: Mike Blackwell <mike.blackwell@rrd.com>	2014-01-17 23:11:02 -05:00
Heikki Linnakangas	e34acac62e	Fix Hot Standby feedback sending when streaming busily. Commit `6f60fdd701` accidentally removed a call to XLogWalRcvSendHSFeedback() after flushing received WAL to disk. The consequence is that when walsender is busy streaming WAL, it doesn't send HS feedback messages. One is sent if nothing is received from the master for 100ms, but if there's a steady stream of WAL, it never happens. Backpatch to 9.3. Andres Freund and Amit Kapila	2014-01-16 23:14:57 +02:00
Tom Lane	74c32f3455	Improve FILES section of psql reference page. Primarily, explain where to find the system-wide psqlrc file, per recent gripe from John Sutton. Do some general wordsmithing and improve the markup, too. Also adjust psqlrc.sample so its comments about file location are somewhat trustworthy. (Not sure why we bother with this file when it's empty, but whatever.) Back-patch to 9.2 where the startup file naming scheme was last changed.	2014-01-14 19:28:06 -05:00
Tom Lane	ebde6c4014	Fix multiple bugs in index page locking during hot-standby WAL replay. In ordinary operation, VACUUM must be careful to take a cleanup lock on each leaf page of a btree index; this ensures that no indexscans could still be "in flight" to heap tuples due to be deleted. (Because of possible index-tuple motion due to concurrent page splits, it's not enough to lock only the pages we're deleting index tuples from.) In Hot Standby, the WAL replay process must likewise lock every leaf page. There were several bugs in the code for that: * The replay scan might come across unused, all-zero pages in the index. While btree_xlog_vacuum itself did the right thing (ie, nothing) with such pages, xlogutils.c supposed that such pages must be corrupt and would throw an error. This accounts for various reports of replication failures with "PANIC: WAL contains references to invalid pages". To fix, add a ReadBufferMode value that instructs XLogReadBufferExtended not to complain when we're doing this. * btree_xlog_vacuum performed the extra locking if standbyState == STANDBY_SNAPSHOT_READY, but that's not the correct test: we won't open up for hot standby queries until the database has reached consistency, and we don't want to do the extra locking till then either, for fear of reading corrupted pages (which bufmgr.c would complain about). Fix by exporting a new function from xlog.c that will report whether we're actually in hot standby replay mode. * To ensure full coverage of the index in the replay scan, btvacuumscan would emit a dummy WAL record for the last page of the index, if no vacuuming work had been done on that page. However, if the last page of the index is all-zero, that would result in corruption of said page, since the functions called on it weren't prepared to handle that case. There's no need to lock any such pages, so change the logic to target the last normal leaf page instead. The first two of these bugs were diagnosed by Andres Freund, the other one by me. Fixes based on ideas from Heikki Linnakangas and myself. This has been wrong since Hot Standby was introduced, so back-patch to 9.0.	2014-01-14 17:34:51 -05:00
Bruce Momjian	efb41ba33a	Fix pg_dumpall on pre-8.1 servers rolname did not exist in pg_shadow. Backpatch to 9.3 Report by Andrew Gierth via IRC	2014-01-12 22:26:00 -05:00
Tom Lane	27ff4cfe76	Disallow LATERAL references to the target table of an UPDATE/DELETE. On second thought, commit `0c051c9008` was over-hasty: rather than allowing this case, we ought to reject it for now. That leaves the field clear for a future feature that allows the target table to be re-specified in the FROM (or USING) clause, which will enable left-joining the target table to something else. We can then also allow LATERAL references to such an explicitly re-specified target table. But allowing them right now will create ambiguities or worse for such a feature, and it isn't something we documented 9.3 as supporting. While at it, add a convenience subroutine to avoid having several copies of the ereport for disalllowed-LATERAL-reference cases.	2014-01-11 19:03:15 -05:00
Tom Lane	5bfcc9ec5e	Fix possible crashes due to using elog/ereport too early in startup. Per reports from Andres Freund and Luke Campbell, a server failure during set_pglocale_pgservice results in a segfault rather than a useful error message, because the infrastructure needed to use ereport hasn't been initialized; specifically, MemoryContextInit hasn't been called. One known cause of this is starting the server in a directory it doesn't have permission to read. We could try to prevent set_pglocale_pgservice from using anything that depends on palloc or elog, but that would be messy, and the odds of future breakage seem high. Moreover there are other things being called in main.c that look likely to use palloc or elog too --- perhaps those things shouldn't be there, but they are there today. The best solution seems to be to move the call of MemoryContextInit to very early in the backend's real main() function. I've verified that an elog or ereport occurring immediately after that is now capable of sending something useful to stderr. I also added code to elog.c to print something intelligible rather than just crashing if MemoryContextInit hasn't created the ErrorContext. This could happen if MemoryContextInit itself fails (due to malloc failure), and provides some future-proofing against someone trying to sneak in new code even earlier in server startup. Back-patch to all supported branches. Since we've only heard reports of this type of failure recently, it may be that some recent change has made it more likely to see a crash of this kind; but it sure looks like it's broken all the way back.	2014-01-11 16:35:30 -05:00
Tom Lane	36785a21ba	Fix compute_scalar_stats() for case that all values exceed WIDTH_THRESHOLD. The standard typanalyze functions skip over values whose detoasted size exceeds WIDTH_THRESHOLD (1024 bytes), so as to limit memory bloat during ANALYZE. However, we (I think I, actually :-() failed to consider the possibility that every non-null value in a column is too wide. While compute_minimal_stats() seems to behave reasonably anyway in such a case, compute_scalar_stats() just fell through and generated no pg_statistic entry at all. That's unnecessarily pessimistic: we can still produce valid stanullfrac and stawidth values in such cases, since we do include too-wide values in the average-width calculation. Furthermore, since the general assumption in this code is that too-wide values are probably all distinct from each other, it seems reasonable to set stadistinct to -1 ("all distinct"). Per complaint from Kadri Raudsepp. This has been like this since roughly neolithic times, so back-patch to all supported branches.	2014-01-11 13:41:51 -05:00
Alvaro Herrera	a25c2b7c4d	Accept pg_upgraded tuples during multixact freezing The new MultiXact freezing routines introduced by commit `8e9a16ab8f` neglected to consider tuples that came from a pg_upgrade'd database; a vacuum run that tried to freeze such tuples would die with an error such as ERROR: MultiXactId 11415437 does no longer exist -- apparent wraparound To fix, ensure that GetMultiXactIdMembers is allowed to return empty multis when the infomask bits are right, as is done in other callsites. Per trouble report from F-Secure. In passing, fix a copy&paste bug reported by Andrey Karpov from VIVA64 from their PVS-Studio static checked, that instead of setting relminmxid to Invalid, we were setting relfrozenxid twice. Not an important mistake because that code branch is about relations for which we don't use the frozenxid/minmxid values at all in the first place, but seems to warrants a fix nonetheless.	2014-01-10 18:03:18 -03:00
Michael Meskes	28fff0ef8b	Fix descriptor output in ECPG. While working on most platforms the old way sometimes created alignment problems. This should fix it. Also the regresion tests were updated to test for the reported case. Report and fix by MauMau <maumau307@gmail.com>	2014-01-09 15:41:51 +01:00
Tom Lane	47ac4473ac	Fix "cannot accept a set" error when only some arms of a CASE return a set. In commit `c1352052ef`, I implemented an optimization that assumed that a function's argument expressions would either always return a set (ie multiple rows), or always not. This is wrong however: we allow CASE expressions in which some arms return a set of some type and others just return a scalar of that type. There may be other examples as well. To fix, replace the run-time test of whether an argument returned a set with a static precheck (expression_returns_set). This adds a little bit of query startup overhead, but it seems barely measurable. Per bug #8228 from David Johnston. This has been broken since 8.0, so patch all supported branches.	2014-01-08 20:18:10 -05:00
Heikki Linnakangas	3aefff422a	Fix pause_at_recovery_target + recovery_target_inclusive combination. If pause_at_recovery_target is set, recovery pauses before applying the target record, even if recovery_target_inclusive is set. If you then continue with pg_xlog_replay_resume(), it will apply the target record before ending recovery. In other words, if you log in while it's paused and verify that the database looks OK, ending recovery changes its state again, possibly destroying data that you were tring to salvage with PITR. Backpatch to 9.1, this has been broken since pause_at_recovery_target was added.	2014-01-08 23:30:46 +02:00
Heikki Linnakangas	425bef6ee7	Fix bug in determining when recovery has reached consistency. When starting WAL replay from an online checkpoint, the last replayed WAL record variable was initialized using the checkpoint record's location, even though the records between the REDO location and the checkpoint record had not been replayed yet. That was noted as "slightly confusing" but harmless in the comment, but in some cases, it fooled CheckRecoveryConsistency to incorrectly conclude that we had already reached a consistent state immediately at the beginning of WAL replay. That caused the system to accept read-only connections in hot standby mode too early, and also PANICs with message "WAL contains references to invalid pages". Fix by initializing the variables to the REDO location instead. In 9.2 and above, change CheckRecoveryConsistency() to use lastReplayedEndRecPtr variable when checking if backup end location has been reached. It was inconsistently using EndRecPtr for that check, but lastReplayedEndRecPtr when checking min recovery point. It made no difference before this patch, because in all the places where CheckRecoveryConsistency was called the two variables were the same, but it was always an accident waiting to happen, and would have been wrong after this patch anyway. Report and analysis by Tomonari Katsumata, bug #8686. Backpatch to 9.0, where hot standby was introduced.	2014-01-08 14:32:22 +02:00
Tom Lane	0a43e5466c	Fix LATERAL references to target table of UPDATE/DELETE. I failed to think much about UPDATE/DELETE when implementing LATERAL :-(. The implemented behavior ended up being that subqueries in the FROM or USING clause (respectively) could access the update/delete target table as though it were a lateral reference; which seems fine if they said LATERAL, but certainly ought to draw an error if they didn't. Fix it so you get a suitable error when you omit LATERAL. Per report from Emre Hasegeli.	2014-01-07 15:25:19 -05:00
Magnus Hagander	91c2755fcb	Move permissions check from do_pg_start_backup to pg_start_backup And the same for do_pg_stop_backup. The code in do_pg_* is not allowed to access the catalogs. For manual base backups, the permissions check can be handled in the calling function, and for streaming base backups only users with the required permissions can get past the authentication step in the first place. Reported by Antonin Houska, diagnosed by Andres Freund	2014-01-07 17:51:02 +01:00
Magnus Hagander	0463b9419b	Avoid including tablespaces inside PGDATA twice in base backups If a tablespace was crated inside PGDATA it was backed up both as part of the PGDATA backup and as the backup of the tablespace. Avoid this by skipping any directory inside PGDATA that contains one of the active tablespaces. Dimitri Fontaine and Magnus Hagander	2014-01-07 17:11:51 +01:00
Heikki Linnakangas	8660f4b759	Remove bogus -K option from pg_dump. I added it to the getopt call by accident in commit `691e595dd9`. Amit Kapila	2014-01-06 12:33:05 +02:00
Tom Lane	341f0bc495	Fix translatability markings in psql, and add defenses against future bugs. Several previous commits have added columns to various \d queries without updating their translate_columns[] arrays, leading to potentially incorrect translations in NLS-enabled builds. Offenders include commit `893686762` (added prosecdef to \df+), `c9ac00e6e` (added description to \dc+) and `3b17efdfd` (added description to \dC+). Fix those cases back to 9.3 or 9.2 as appropriate. Since this is evidently more easily missed than one would like, in HEAD also add an Assert that the supplied array is long enough. This requires an API change for printQuery(), so it seems inappropriate for back branches, but presumably all future changes will be tested in HEAD anyway. In HEAD and 9.3, also clean up a whole lot of sloppiness in the emitted SQL for \dy (event triggers): lack of translatability due to failing to pass words-to-be-translated through gettext_noop(), inadequate schema qualification, and sloppy formatting resulting in unnecessarily ugly -E output. Peter Eisentraut and Tom Lane, per bug #8702 from Sergey Burladyan	2014-01-04 16:05:20 -05:00
Alvaro Herrera	948a3dfbb7	Handle 5-char filenames in SlruScanDirectory Original users of slru.c were all producing 4-digit filenames, so that was all that that code was prepared to handle. Changes to multixact.c in the course of commit `0ac5ad5134` made pg_multixact/members create 5-digit filenames once a certain threshold was reached, which SlruScanDirectory wasn't prepared to deal with; in particular, 5-digit-name files were not removed during truncation. Change that routine to make it aware of those files, and have it process them just like any others. Right now, some pg_multixact/members directories will contain a mixture of 4-char and 5-char filenames. A future commit is expected fix things so that each slru.c user declares the correct maximum width for the files it produces, to avoid such unsightly mixtures. Noticed while investigating bug #8673 reported by Serge Negodyuck.	2014-01-02 18:17:29 -03:00
Alvaro Herrera	03db794596	Wrap multixact/members correctly during extension In the 9.2 code for extending multixact/members, the logic was very simple because the number of entries in a members page was a proper divisor of 2^32, and thus at 2^32 wraparound the logic for page switch was identical than at any other page boundary. In commit `0ac5ad5134` I failed to realize this and introduced code that was not able to go over the 2^32 boundary. Fix that by ensuring that when we reach the last page of the last segment we correctly zero the initial page of the initial segment, using correct uint32-wraparound-safe arithmetic. Noticed while investigating bug #8673 reported by Serge Negodyuck, as diagnosed by Andres Freund.	2014-01-02 18:17:07 -03:00
Alvaro Herrera	f40e79173a	Handle wraparound during truncation in multixact/members In pg_multixact/members, relying on modulo-2^32 arithmetic for wraparound handling doesn't work all that well. Because we don't explicitely track wraparound of the allocation counter for members, it is possible that the "live" area exceeds 2^31 entries; trying to remove SLRU segments that are "old" according to the original logic might lead to removal of segments still in use. To fix, have the truncation routine use a tailored SlruScanDirectory callback that keeps track of the live area in actual use; that way, when the live range exceeds 2^31 entries, the oldest segments still live will not get removed untimely. This new SlruScanDir callback needs to take care not to remove segments that are "in the future": if new SLRU segments appear while the truncation is ongoing, make sure we don't remove them. This requires examination of shared memory state to recheck for false positives, but testing suggests that this doesn't cause a problem. The original coding didn't suffer from this pitfall because segments created when truncation is running are never considered to be removable. Per Andres Freund's investigation of bug #8673 reported by Serge Negodyuck.	2014-01-02 18:16:54 -03:00
Michael Meskes	8404037d89	Do not use an empty hostname. When trying to connect to a given database libecpg should not try using an empty hostname if no hostname was given.	2014-01-01 12:40:28 +01:00
Tom Lane	9a6e2b150f	Fix broken support for event triggers as extension members. CREATE EVENT TRIGGER forgot to mark the event trigger as a member of its extension, and pg_dump didn't pay any attention anyway when deciding whether to dump the event trigger. Per report from Moshe Jacobson. Given the obvious lack of testing here, it's rather astonishing that ALTER EXTENSION ADD/DROP EVENT TRIGGER work, but they seem to.	2013-12-30 14:00:05 -05:00
Kevin Grittner	30c36d8cc0	Don't attempt to limit target database for pg_restore. There was an apparent attempt to limit the target database for pg_restore to version 7.1.0 or later. Due to a leading zero this was interpreted as an octal number, which allowed targets with version numbers down to 2.87.36. The lowest actual release above that was 6.0.0, so that was effectively the limit. Since the success of the restore attempt will depend primarily on on what statements were generated by the dump run, we don't want pg_restore trying to guess whether a given target should be allowed based on version number. Allow a connection to any version. Since it is very unlikely that anyone would be using a recent version of pg_restore to restore to a pre-6.0 database, this has little to no practical impact, but it makes the code less confusing to read. Issue reported and initial patch suggestion from Joel Jacobson based on an article by Andrey Karpov reporting on issues found by PVS-Studio static code analyzer. Final patch based on analysis by Tom Lane. Back-patch to all supported branches.	2013-12-29 15:18:22 -06:00
Andrew Dunstan	7dfd9f6f54	Properly detect invalid JSON numbers when generating JSON. Instead of looking for characters that aren't valid in JSON numbers, we simply pass the output string through the JSON number parser, and if it fails the string is quoted. This means among other things that money and domains over money will be quoted correctly and generate valid JSON. Fixes bug #8676 reported by Anderson Cristian da Silva. Backpatched to 9.2 where JSON generation was introduced.	2013-12-27 17:21:04 -05:00
Kevin Grittner	28b60aa231	Fix misplaced right paren bugs in pgstatfuncs.c. The bug would only show up if the C sockaddr structure contained zero in the first byte for a valid address; otherwise it would fail to fail, which is probably why it went unnoticed for so long. Patch submitted by Joel Jacobson after seeing an article by Andrey Karpov in which he reports finding this through static code analysis using PVS-Studio. While I was at it I moved a definition of a local variable referenced in the buggy code to a more local context. Backpatch to all supported branches.	2013-12-27 15:40:51 -06:00
Tom Lane	663f8419b6	Fix ANALYZE failure on a column that's a domain over a range. Most other range operations seem to work all right on domains, but this one not so much, at least not since commit `918eee0c`. Per bug #8684 from Brett Neumeier.	2013-12-23 22:18:23 -05:00
Alvaro Herrera	fd995b3f4f	Avoid useless palloc during transaction commit We can allocate the initial relations-to-drop array when first needed, instead of at function entry; this avoids allocating it when the function is not going to do anything, which is most of the time. Backpatch to 9.3, where this behavior was introduced by commit `279628a0a7`. There's more that could be done here, such as possible reworking of the code to avoid having to palloc anything, but that doesn't sound as backpatchable as this relatively minor change. Per complaint from Noah Misch in 20131031145234.GA621493@tornado.leadboat.com	2013-12-20 12:37:30 -03:00
Alvaro Herrera	85d3b3c3ac	Optimize updating a row that's locked by same xid Updating or locking a row that was already locked by the same transaction under the same Xid caused a MultiXact to be created; but this is unnecessary, because there's no usefulness in being able to differentiate two locks by the same transaction. In particular, if a transaction executed SELECT FOR UPDATE followed by an UPDATE that didn't modify columns of the key, we would dutifully represent the resulting combination as a multixact -- even though a single key-update is sufficient. Optimize the case so that only the strongest of both locks/updates is represented in Xmax. This can save some Xmax's from becoming MultiXacts, which can be a significant optimization. This missed optimization opportunity was spotted by Andres Freund while investigating a bug reported by Oliver Seemann in message CANCipfpfzoYnOz5jj=UZ70_R=CwDHv36dqWSpwsi27vpm1z5sA@mail.gmail.com and also directly as a performance regression reported by Dong Ye in message d54b8387.000012d8.00000010@YED-DEVD1.vmware.com Reportedly, this patch fixes the performance regression. Since the missing optimization was reported as a significant performance regression from 9.2, backpatch to 9.3. Andres Freund, tweaked by Álvaro Herrera	2013-12-19 16:39:59 -03:00
Alvaro Herrera	db1014bc46	Don't ignore tuple locks propagated by our updates If a tuple was locked by transaction A, and transaction B updated it, the new version of the tuple created by B would be locked by A, yet visible only to B; due to an oversight in HeapTupleSatisfiesUpdate, the lock held by A wouldn't get checked if transaction B later deleted (or key-updated) the new version of the tuple. This might cause referential integrity checks to give false positives (that is, allow deletes that should have been rejected). This is an easy oversight to have made, because prior to improved tuple locks in commit `0ac5ad5134` it wasn't possible to have tuples created by our own transaction that were also locked by remote transactions, and so locks weren't even considered in that code path. It is recommended that foreign keys be rechecked manually in bulk after installing this update, in case some referenced rows are missing with some referencing row remaining. Per bug reported by Daniel Wood in CAPweHKe5QQ1747X2c0tA=5zf4YnS2xcvGf13Opd-1Mq24rF1cQ@mail.gmail.com	2013-12-18 13:31:27 -03:00
Alvaro Herrera	8e9a16ab8f	Rework tuple freezing protocol Tuple freezing was broken in connection to MultiXactIds; commit `8e53ae025d` tried to fix it, but didn't go far enough. As noted by Noah Misch, freezing a tuple whose Xmax is a multi containing an aborted update might cause locks in the multi to go ignored by later transactions. This is because the code depended on a multixact above their cutoff point not having any lock-only member older than the cutoff point for Xids, which is easily defeated in READ COMMITTED transactions. The fix for this involves creating a new MultiXactId when necessary. But this cannot be done during WAL replay, and moreover multixact examination requires using CLOG access routines which are not supposed to be used during WAL replay either; so tuple freezing cannot be done with the old freeze WAL record. Therefore, separate the freezing computation from its execution, and change the WAL record to carry all necessary information. At WAL replay time, it's easy to re-execute freezing because we don't need to re-compute the new infomask/Xmax values but just take them from the WAL record. While at it, restructure the coding to ensure all page changes occur in a single critical section without much room for failures. The previous coding wasn't using a critical section, without any explanation as to why this was acceptable. In replication scenarios using the 9.3 branch, standby servers must be upgraded before their master, so that they are prepared to deal with the new WAL record once the master is upgraded; failure to do so will cause WAL replay to die with a PANIC message. Later upgrade of the standby will allow the process to continue where it left off, so there's no disruption of the data in the standby in any case. Standbys know how to deal with the old WAL record, so it's okay to keep the master running the old code for a while. In master, the old freeze WAL record is gone, for cleanliness' sake; there's no compatibility concern there. Backpatch to 9.3, where the original bug was introduced and where the previous fix was backpatched. Álvaro Herrera and Andres Freund	2013-12-16 11:29:51 -03:00
Tatsuo Ishii	8122e6f85a	Add "SHIFT_JIS" as an accepted encoding name for locale checking. When locale is "ja_JP.SJIS", nl_langinfo(CODESET) returns "SHIFT_JIS" on some platforms, at least on RedHat Linux. So the encoding/locale match table (encoding_match_list) needs the entry. Otherwise client encoding is set to SQL_ASCII. Back patch to all supported branches.	2013-12-15 11:10:41 +09:00
Tom Lane	324577f39b	Fix inherited UPDATE/DELETE with UNION ALL subqueries. Fix an oversight in commit `b3aaf9081a`: we do indeed need to process the planner's append_rel_list when copying RTE subqueries, because if any of them were flattenable UNION ALL subqueries, the append_rel_list shows which subquery RTEs were pulled up out of which other ones. Without this, UNION ALL subqueries aren't correctly inserted into the update plans for inheritance child tables after the first one, typically resulting in no update happening for those child table(s). Per report from Victor Yegorov. Experimentation with this case also exposed a fault in commit `a7b965382c`: if an inherited UPDATE/DELETE was proven totally dummy by constraint exclusion, we might arrive at add_rtes_to_flat_rtable with root->simple_rel_array being NULL. This should be interpreted as not having any RelOptInfos. I chose to code the guard as a check against simple_rel_array_size, so as to also provide some protection against indexing off the end of the array. Back-patch to 9.2 where the faulty code was added.	2013-12-14 17:33:56 -05:00
Alvaro Herrera	eeb811c454	Fix typo	2013-12-13 17:26:58 -03:00
Alvaro Herrera	0bc00363b9	Rework MultiXactId cache code The original performs too poorly; in some scenarios it shows way too high while profiling. Try to make it a bit smarter to avoid excessive cosst. In particular, make it have a maximum size, and have entries be sorted in LRU order; once the max size is reached, evict the oldest entry to avoid it from growing too large. Per complaint from Andres Freund in connection with new tuple freezing code.	2013-12-13 17:16:25 -03:00

... 4 5 6 7 8 ...

24759 Commits