postgres

mirror of https://github.com/postgres/postgres.git synced 2025-11-25 12:03:53 +03:00

Author	SHA1	Message	Date
Bruce Momjian	e126958c2e	Update copyright notices for year 2012.	2012-01-01 18:01:58 -05:00
Bruce Momjian	6416a82a62	Remove unnecessary #include references, per pgrminclude script.	2011-09-01 10:04:27 -04:00
Alvaro Herrera	b93f5a5673	Move Trigger and TriggerDesc structs out of rel.h into a new reltrigger.h This lets us stop including rel.h into execnodes.h, which is a widely used header.	2011-07-04 14:35:58 -04:00
Heikki Linnakangas	cd70dd6bef	Move the PredicateLockRelation() call from nodeSeqscan.c to heapam.c. It's more consistent that way, since all the other PredicateLock* calls are made in various heapam.c and index AM functions. The call in nodeSeqscan.c was unnecessarily aggressive anyway, there's no need to try to lock the relation every time a tuple is fetched, it's enough to do it once. This has the user-visible effect that if a seq scan is initialized in the executor, but never executed, we now acquire the predicate lock on the heap relation anyway. We could avoid that by taking the lock on the first heap_getnext() call instead, but it doesn't seem worth the trouble given that it feels more natural to do it in heap_beginscan(). Also, remove the retail PredicateLockTuple() calls from heap_getnext(). In a seqscan, started with heap_begin(), we're holding a whole-relation predicate lock on the heap so there's no need to lock the tuples individually. Kevin Grittner and me	2011-06-29 21:57:43 +03:00
Heikki Linnakangas	0a0e2b52a5	Make non-MVCC snapshots exempt from predicate locking. Scans with non-MVCC snapshots, like in REINDEX, are basically non-transactional operations. The DDL operation itself might participate in SSI, but there's separate functions for that. Kevin Grittner and Dan Ports, with some changes by me.	2011-06-15 12:11:18 +03:00
Heikki Linnakangas	dafaa3efb7	Implement genuine serializable isolation level. Until now, our Serializable mode has in fact been what's called Snapshot Isolation, which allows some anomalies that could not occur in any serialized ordering of the transactions. This patch fixes that using a method called Serializable Snapshot Isolation, based on research papers by Michael J. Cahill (see README-SSI for full references). In Serializable Snapshot Isolation, transactions run like they do in Snapshot Isolation, but a predicate lock manager observes the reads and writes performed and aborts transactions if it detects that an anomaly might occur. This method produces some false positives, ie. it sometimes aborts transactions even though there is no anomaly. To track reads we implement predicate locking, see storage/lmgr/predicate.c. Whenever a tuple is read, a predicate lock is acquired on the tuple. Shared memory is finite, so when a transaction takes many tuple-level locks on a page, the locks are promoted to a single page-level lock, and further to a single relation level lock if necessary. To lock key values with no matching tuple, a sequential scan always takes a relation-level lock, and an index scan acquires a page-level lock that covers the search key, whether or not there are any matching keys at the moment. A predicate lock doesn't conflict with any regular locks or with another predicate locks in the normal sense. They're only used by the predicate lock manager to detect the danger of anomalies. Only serializable transactions participate in predicate locking, so there should be no extra overhead for for other transactions. Predicate locks can't be released at commit, but must be remembered until all the transactions that overlapped with it have completed. That means that we need to remember an unbounded amount of predicate locks, so we apply a lossy but conservative method of tracking locks for committed transactions. If we run short of shared memory, we overflow to a new "pg_serial" SLRU pool. We don't currently allow Serializable transactions in Hot Standby mode. That would be hard, because even read-only transactions can cause anomalies that wouldn't otherwise occur. Serializable isolation mode now means the new fully serializable level. Repeatable Read gives you the old Snapshot Isolation level that we have always had. Kevin Grittner and Dan Ports, reviewed by Jeff Davis, Heikki Linnakangas and Anssi Kääriäinen	2011-02-08 00:09:08 +02:00
Bruce Momjian	5d950e3b0c	Stamp copyrights for year 2011.	2011-01-01 13:18:15 -05:00
Magnus Hagander	9f2e211386	Remove cvs keywords from all files.	2010-09-20 22:08:53 +02:00
Tom Lane	53e757689c	Make NestLoop plan nodes pass outer-relation variables into their inner relation using the general PARAM_EXEC executor parameter mechanism, rather than the ad-hoc kluge of passing the outer tuple down through ExecReScan. The previous method was hard to understand and could never be extended to handle parameters coming from multiple join levels. This patch doesn't change the set of possible plans nor have any significant performance effect, but it's necessary infrastructure for future generalization of the concept of an inner indexscan plan. ExecReScan's second parameter is now unused, so it's removed.	2010-07-12 17:01:06 +00:00
Bruce Momjian	65e806cba1	pgindent run for 9.0	2010-02-26 02:01:40 +00:00
Bruce Momjian	0239800893	Update copyright for the year 2010.	2010-01-02 16:58:17 +00:00
Tom Lane	9f2ee8f287	Re-implement EvalPlanQual processing to improve its performance and eliminate a lot of strange behaviors that occurred in join cases. We now identify the "current" row for every joined relation in UPDATE, DELETE, and SELECT FOR UPDATE/SHARE queries. If an EvalPlanQual recheck is necessary, we jam the appropriate row into each scan node in the rechecking plan, forcing it to emit only that one row. The former behavior could rescan the whole of each joined relation for each recheck, which was terrible for performance, and what's much worse could result in duplicated output tuples. Also, the original implementation of EvalPlanQual could not re-use the recheck execution tree --- it had to go through a full executor init and shutdown for every row to be tested. To avoid this overhead, I've associated a special runtime Param with each LockRows or ModifyTable plan node, and arranged to make every scan node below such a node depend on that Param. Thus, by signaling a change in that Param, the EPQ machinery can just rescan the already-built test plan. This patch also adds a prohibition on set-returning functions in the targetlist of SELECT FOR UPDATE/SHARE. This is needed to avoid the duplicate-output-tuple problem. It seems fairly reasonable since the other restrictions on SELECT FOR UPDATE are meant to ensure that there is a unique correspondence between source tuples and result tuples, which an output SRF destroys as much as anything else does.	2009-10-26 02:26:45 +00:00
Tom Lane	421d7d8edb	Remove no-longer-needed ExecCountSlots infrastructure.	2009-09-27 21:10:53 +00:00
Bruce Momjian	511db38ace	Update copyright for 2009.	2009-01-01 17:24:05 +00:00
Alvaro Herrera	a3540b0f65	Improve our #include situation by moving pointer types away from the corresponding struct definitions. This allows other headers to avoid including certain highly-loaded headers such as rel.h and relscan.h, instead using just relcache.h, heapam.h or genam.h, which are more lightweight and thus cause less unnecessary dependencies.	2008-06-19 00:46:06 +00:00
Bruce Momjian	9098ab9e32	Update copyrights in source tree to 2008.	2008-01-01 19:46:01 +00:00
Bruce Momjian	29dccf5fe0	Update CVS HEAD for 2007 copyright. Back branches are typically not back-stamped for this.	2007-01-05 22:20:05 +00:00
Tom Lane	68996463d4	Repair bug #2839 : the various ExecReScan functions need to reset ps_TupFromTlist in plan nodes that make use of it. This was being done correctly in join nodes and Result nodes but not in any relation-scan nodes. Bug would lead to bogus results if a set-returning function appeared in the targetlist of a subquery that could be rescanned after partial execution, for example a subquery within EXISTS(). Bug has been around forever :-( ... surprising it wasn't reported before.	2006-12-26 19:26:46 +00:00
Bruce Momjian	f99a569a2e	pgindent run for 8.2.	2006-10-04 00:30:14 +00:00
Bruce Momjian	e0522505bd	Remove 576 references of include files that were not needed.	2006-07-14 14:52:27 +00:00
Tom Lane	06e10abc0b	Fix problems with cached tuple descriptors disappearing while still in use by creating a reference-count mechanism, similar to what we did a long time ago for catcache entries. The back branches have an ugly solution involving lots of extra copies, but this way is more efficient. Reference counting is only applied to tupdescs that are actually in caches --- there seems no need to use it for tupdescs that are generated in the executor, since they'll go away during plan shutdown by virtue of being in the per-query memory context. Neil Conway and Tom Lane	2006-06-16 18:42:24 +00:00
Bruce Momjian	f2f5b05655	Update copyright for 2006. Update scripts.	2006-03-05 15:59:11 +00:00
Tom Lane	2c0ef9777c	Extend the ExecInitNode API so that plan nodes receive a set of flag bits indicating which optional capabilities can actually be exercised at runtime. This will allow Sort and Material nodes, and perhaps later other nodes, to avoid unnecessary overhead in common cases. This commit just adds the infrastructure and arranges to pass the correct flag values down to plan nodes; none of the actual optimizations are here yet. I'm committing this separately in case anyone wants to measure the added overhead. (It should be negligible.) Simon Riggs and Tom Lane	2006-02-28 04:10:28 +00:00
Tom Lane	d780f07ac1	Adjust scan plan nodes to avoid getting an extra AccessShareLock on a relation if it's already been locked by execMain.c as either a result relation or a FOR UPDATE/SHARE relation. This avoids an extra trip to the shared lock manager state. Per my suggestion yesterday.	2005-12-02 20:03:42 +00:00
Tom Lane	dab52ab13d	Improve ExecStoreTuple to be smarter about replacing the contents of a TupleTableSlot: instead of calling ExecClearTuple, inline the needed operations, so that we can avoid redundant steps. In particular, when the old and new tuples are both on the same disk page, avoid releasing and re-acquiring the buffer pin --- this saves work in both the bufmgr and ResourceOwner modules. To make this improvement actually useful, partially revert a change I made on 2004-04-21 that caused SeqNext et al to call ExecClearTuple before ExecStoreTuple. The motivation for that, to avoid grabbing the BufMgrLock separately for releasing the old buffer and grabbing the new one, no longer applies. My profiling says that this saves about 5% of the CPU time for an all-in-memory seqscan.	2005-11-25 04:24:48 +00:00
Bruce Momjian	1dc3498251	Standard pgindent run for 8.1.	2005-10-15 02:49:52 +00:00
Tom Lane	2ef172a2a4	Fix latent bug in ExecSeqRestrPos: it leaves the plan node's result slot in an inconsistent state. (This is only latent because in reality ExecSeqRestrPos is dead code at the moment ... but someday maybe it won't be.) Add some comments about what the API for plan node mark/restore actually is, because it's not immediately obvious.	2005-05-15 21:19:55 +00:00
Tom Lane	f97aebd162	Revise TupleTableSlot code to avoid unnecessary construction and disassembly of tuples when passing data up through multiple plan nodes. A slot can now hold either a normal "physical" HeapTuple, or a "virtual" tuple consisting of Datum/isnull arrays. Upper plan levels can usually just copy the Datum arrays, avoiding heap_formtuple() and possible subsequent nocachegetattr() calls to extract the data again. This work extends Atsushi Ogawa's earlier patch, which provided the key idea of adding Datum arrays to TupleTableSlots. (I believe however that something like this was foreseen way back in Berkeley days --- see the old comment on ExecProject.) A test case involving many levels of join of fairly wide tables (about 80 columns altogether) showed about 3x overall speedup, though simple queries will probably not be helped very much. I have also duplicated some code in heaptuple.c in order to provide versions of heap_formtuple and friends that use "bool" arrays to indicate null attributes, instead of the old convention of "char" arrays containing either 'n' or ' '. This provides a better match to the convention used by ExecEvalExpr. While I have not made a concerted effort to get rid of uses of the old routines, I think they should be deprecated and eventually removed.	2005-03-16 21:38:10 +00:00
PostgreSQL Daemon	2ff501590b	Tag appropriate files for rc3 Also performed an initial run through of upgrading our Copyright date to extend to 2005 ... first run here was very simple ... change everything where: grep 1996-2004 && the word 'Copyright' ... scanned through the generated list with 'less' first, and after, to make sure that I only picked up the right entries ...	2004-12-31 22:04:05 +00:00
Bruce Momjian	b6b71b85bc	Pgindent run for 8.0.	2004-08-29 05:07:03 +00:00
Bruce Momjian	da9a8649d8	Update copyright to 2004.	2004-08-29 04:13:13 +00:00
Tom Lane	37fa3b6c89	Tweak indexscan and seqscan code to arrange that steps from one page to the next are handled by ReleaseAndReadBuffer rather than separate ReleaseBuffer and ReadBuffer calls. This cuts the number of acquisitions of the BufMgrLock by a factor of 2 (possibly more, if an indexscan happens to pull successive rows from the same heap page). Unfortunately this doesn't seem enough to get us out of the recently discussed context-switch storm problem, but it's surely worth doing anyway.	2004-04-21 18:24:26 +00:00
PostgreSQL Daemon	969685ad44	$Header: -> $PostgreSQL Changes ...	2003-11-29 19:52:15 +00:00
Bruce Momjian	46785776c4	Another pgindent run with updated typedefs.	2003-08-08 21:42:59 +00:00
Bruce Momjian	f3c3deb7d0	Update copyrights to 2003.	2003-08-04 02:40:20 +00:00
Bruce Momjian	089003fb46	pgindent run.	2003-08-04 00:43:34 +00:00
Tom Lane	4cff59d8d5	Tweak planner and executor to avoid doing ExecProject() in table scan nodes where it's not really necessary. In many cases where the scan node is not the topmost plan node (eg, joins, aggregation), it's possible to just return the table tuple directly instead of generating an intermediate projection tuple. In preliminary testing, this reduced the CPU time needed for 'SELECT COUNT(*) FROM foo' by about 10%.	2003-02-03 15:07:08 +00:00
Tom Lane	d51260aa9d	Fix wrong/misleading comments, be more consistent about where to call ExecAssignResultTypeFromTL().	2003-01-12 22:01:38 +00:00
Tom Lane	5bab36e9f6	Revise executor APIs so that all per-query state structure is built in a per-query memory context created by CreateExecutorState --- and destroyed by FreeExecutorState. This provides a final solution to the longstanding problem of memory leaked by various ExecEndNode calls.	2002-12-15 16:17:59 +00:00
Tom Lane	3a4f7dde16	Phase 3 of read-only-plans project: ExecInitExpr now builds expression execution state trees, and ExecEvalExpr takes an expression state tree not an expression plan tree. The plan tree is now read-only as far as the executor is concerned. Next step is to begin actually exploiting this property.	2002-12-13 19:46:01 +00:00
Tom Lane	1fd0c59e25	Phase 1 of read-only-plans project: cause executor state nodes to point to plan nodes, not vice-versa. All executor state nodes now inherit from struct PlanState. Copying of plan trees has been simplified by not storing a list of SubPlans in Plan nodes (eliminating duplicate links). The executor still needs such a list, but it can build it during ExecutorStart since it has to scan the plan tree anyway. No initdb forced since no stored-on-disk structures changed, but you will need a full recompile because of node-numbering changes.	2002-12-05 15:50:39 +00:00
Tom Lane	935969415a	Be more realistic about plans involving Materialize nodes: take their cost into account while planning.	2002-11-30 05:21:03 +00:00
Bruce Momjian	e50f52a074	pgindent run.	2002-09-04 20:31:48 +00:00
Bruce Momjian	d84fe82230	Update copyright to 2002.	2002-06-20 20:29:54 +00:00
Tom Lane	44fbe20d62	Restructure indexscan API (index_beginscan, index_getnext) per yesterday's proposal to pghackers. Also remove unnecessary parameters to heap_beginscan, heap_rescan. I modified pg_proc.h to reflect the new numbers of parameters for the AM interface routines, but did not force an initdb because nothing actually looks at those fields.	2002-05-20 23:51:44 +00:00
Tom Lane	7863404417	A bunch of changes aimed at reducing backend startup time... Improve 'pg_internal.init' relcache entry preload mechanism so that it is safe to use for all system catalogs, and arrange to preload a realistic set of system-catalog entries instead of only the three nailed-in-cache indexes that were formerly loaded this way. Fix mechanism for deleting out-of-date pg_internal.init files: this must be synchronized with transaction commit, not just done at random times within transactions. Drive it off relcache invalidation mechanism so that no special-case tests are needed. Cache additional information in relcache entries for indexes (their pg_index tuples and index-operator OIDs) to eliminate repeated lookups. Also cache index opclass info at the per-opclass level to avoid repeated lookups during relcache load. Generalize 'systable scan' utilities originally developed by Hiroshi, move them into genam.c, use in a number of places where there was formerly ugly code for choosing either heap or index scan. In particular this allows simplification of the logic that prevents infinite recursion between syscache and relcache during startup: we can easily switch to heapscans in relcache.c when and where needed to avoid recursion, so IndexScanOK becomes simpler and does not need any expensive initialization. Eliminate useless opening of a heapscan data structure while doing an indexscan (this saves an mdnblocks call and thus at least one kernel call).	2002-02-19 20:11:20 +00:00
Bruce Momjian	6783b2372e	Another pgindent run. Fixes enum indenting, and improves #endif spacing. Also adds space for one-line comments.	2001-10-28 06:26:15 +00:00
Bruce Momjian	b81844b173	pgindent run on all C files. Java run to follow. initdb/regression tests pass.	2001-10-25 05:50:21 +00:00
Tom Lane	c8076f09d2	Restructure index AM interface for index building and index tuple deletion, per previous discussion on pghackers. Most of the duplicate code in different AMs' ambuild routines has been moved out to a common routine in index.c; this means that all index types now do the right things about inserting recently-dead tuples, etc. (I also removed support for EXTEND INDEX in the ambuild routines, since that's about to go away anyway, and it cluttered the code a lot.) The retail indextuple deletion routines have been replaced by a "bulk delete" routine in which the indexscan is inside the access method. I haven't pushed this change as far as it should go yet, but it should allow considerable simplification of the internal bookkeeping for deletions. Also, add flag columns to pg_am to eliminate various hardcoded tests on AM OIDs, and remove unused pg_am columns. Fix rtree and gist index types to not attempt to store NULLs; before this, gist usually crashed, while rtree managed not to crash but computed wacko bounding boxes for NULL entries (which might have had something to do with the performance problems we've heard about occasionally). Add AtEOXact routines to hash, rtree, and gist, all of which have static state that needs to be reset after an error. We discovered this need long ago for btree, but missed the other guys. Oh, one more thing: concurrent VACUUM is now the default.	2001-07-15 22:48:19 +00:00
Tom Lane	a3855c5761	Cause ExecCountSlots() accounting to bear some relationship to reality. Rather surprising we hadn't seen bug reports about this ...	2001-05-27 20:42:20 +00:00

1 2

79 Commits