postgres

mirror of https://github.com/postgres/postgres.git synced 2025-11-16 15:02:33 +03:00

Author	SHA1	Message	Date
Tom Lane	7aa772f03e	Now that we've rearranged relation open to get a lock before touching the rel, it's easy to get rid of the narrow race-condition window that used to exist in VACUUM and CLUSTER. Did some minor code-beautification work in the same area, too.	2006-08-18 16:09:13 +00:00
Tom Lane	e8ea9e9587	Implement archive_timeout feature to force xlog file switches to occur no more than N seconds apart. This allows a simple, if not very high performance, means of guaranteeing that a PITR archive is no more than N seconds behind real time. Also make pg_current_xlog_location return the WAL Write pointer, add pg_current_xlog_insert_location to return the Insert pointer, and fix pg_xlogfile_name_offset to return its results as a two-element record instead of a smashed-together string, as per recent discussion. Simon Riggs	2006-08-17 23:04:10 +00:00
Tom Lane	7a3e30e608	Add INSERT/UPDATE/DELETE RETURNING, with basic docs and regression tests. plpgsql support to come later. Along the way, convert execMain's SELECT INTO support into a DestReceiver, in order to eliminate some ugly special cases. Jonah Harris and Tom Lane	2006-08-12 02:52:06 +00:00
Tom Lane	e002836913	Make recovery from WAL be restartable, by executing a checkpoint-like operation every so often. This improves the usefulness of PITR log shipping for hot standby: formerly, if the standby server crashed, it was necessary to restart it from the last base backup and replay all the WAL since then. Now it will only need to reread about the same amount of WAL as the master server would. The behavior might also come in handy during a long PITR replay sequence. Simon Riggs, with some editorialization by Tom Lane.	2006-08-07 16:57:57 +00:00
Tom Lane	704ddaaa09	Add support for forcing a switch to a new xlog file; cause such a switch to happen automatically during pg_stop_backup(). Add some functions for interrogating the current xlog insertion point and for easily extracting WAL filenames from the hex WAL locations displayed by pg_stop_backup and friends. Simon Riggs with some editorialization by Tom Lane.	2006-08-06 03:53:44 +00:00
Tom Lane	bc8ac3ce40	Add missing pgstat_count_index_scan(), per Andreas Seltenreich.	2006-08-03 15:22:09 +00:00
Tom Lane	09d3670df3	Change the relation_open protocol so that we obtain lock on a relation (table or index) before trying to open its relcache entry. This fixes race conditions in which someone else commits a change to the relation's catalog entries while we are in process of doing relcache load. Problems of that ilk have been reported sporadically for years, but it was not really practical to fix until recently --- for instance, the recent addition of WAL-log support for in-place updates helped. Along the way, remove pg_am.amconcurrent: all AMs are now expected to support concurrent update.	2006-07-31 20:09:10 +00:00
Alvaro Herrera	92c2ecc130	Modify snapshot definition so that lazy vacuums are ignored by other vacuums. This allows an OLTP-like system with big tables to continue regular vacuuming on small-but-frequently-updated tables while the big tables are being vacuumed. Original patch from Hannu Krossing, rewritten by Tom Lane and updated by me.	2006-07-30 02:07:18 +00:00
Tom Lane	e6284649b9	Modify btree to delete known-dead index entries without an actual VACUUM. When we are about to split an index page to do an insertion, first look to see if any entries marked LP_DELETE exist on the page, and if so remove them to try to make enough space for the desired insert. This should reduce index bloat in heavily-updated tables, although of course you still need VACUUM eventually to clean up the heap. Junji Teramoto	2006-07-25 19:13:00 +00:00
Peter Eisentraut	e9b4969062	DTrace support, with a small initial set of probes by Robert Lor	2006-07-24 16:32:45 +00:00
Tom Lane	9dc842f083	Don't try to truncate multixact SLRU files in checkpoints done during xlog recovery. In the first place, it doesn't work because slru's latest_page_number isn't set up yet (this is why we've been hearing reports of strange "apparent wraparound" log messages during crash recovery, but only from people who'd managed to advance their next-mxact counters some considerable distance from 0). In the second place, it seems a bit unwise to be throwing away data during crash recovery anyway. This latter consideration convinces me to just disable truncation during recovery, rather than computing latest_page_number and pushing ahead.	2006-07-20 00:46:42 +00:00
Tom Lane	c36418be40	Fix getDatumCopy(): don't use store_att_byval to copy into a Datum variable (this accounts for regression failures on PPC64, and in fact won't work on any big-endian machine). Get rid of hardwired knowledge about datum size rules; make it look just like datumCopy().	2006-07-16 00:54:22 +00:00
Tom Lane	e040ab44e4	Improve error message wording.	2006-07-16 00:52:05 +00:00
Tom Lane	98bac16e4d	Fix misguided removal of access/tuptoaster.h inclusion, per Kris Jurka. I'm going to insist on reversion of this entire patch unless pgrminclude is upgraded to a less broken state, but in the meantime let's get contrib passing regression again.	2006-07-14 19:05:52 +00:00
Bruce Momjian	e0522505bd	Remove 576 references of include files that were not needed.	2006-07-14 14:52:27 +00:00
Bruce Momjian	a22d76d96a	Allow include files to compile on their own. Strip unused include files out of include files, and add needed includes to C files. The next step is to remove unused include files in C files.	2006-07-13 16:49:20 +00:00
Tom Lane	d29b66882a	Tweak fillfactor code as per my recent proposal. Fix nbtsort.c so that it can handle small fillfactors for ordinary-sized index entries without failing on large ones; fix nbtinsert.c to distinguish leaf and nonleaf pages; change the minimum fillfactor to 10% for all index types.	2006-07-11 21:05:57 +00:00
Teodor Sigaev	001d30ee6b	Add support to GIN for =(anyarray,anyarray) operation	2006-07-11 19:49:14 +00:00
Bruce Momjian	ac230e7431	Alphabetically order reference to include files, "S"-"Z".	2006-07-11 18:26:11 +00:00
Bruce Momjian	0ff3461bcc	Alphabetically order reference to include files, "N" - "S".	2006-07-11 17:26:59 +00:00
Bruce Momjian	3a534ade39	Alphabetically order reference to include files, "G" - "M".	2006-07-11 17:04:13 +00:00
Teodor Sigaev	234163649e	GIN improvements: replace the sorted array of entries in maintenance_work_mem with a binary tree, which should improve create performance; calculate allocated memory more precisely and eliminate leaks with user-defined extractValue(); improve wording in tsearch2	2006-07-11 16:55:34 +00:00
Alvaro Herrera	d4cef0aa2a	Improve vacuum code to track minimum Xids per table instead of per database. To this end, add a couple of columns to pg_class, relminxid and relvacuumxid, based on which we calculate the pg_database columns after each vacuum. We now force all databases to be vacuumed, even template ones. A backend noticing too old a database (meaning pg_database.datminxid is in danger of falling behind Xid wraparound) will signal the postmaster, which in turn will start an autovacuum iteration to process the offending database. In principle this is only there to cope with frozen (non-connectable) databases without forcing users to set them to connectable, but it could force regular user databases to go through a database-wide vacuum at any time. Maybe we should warn users about this somehow. Of course the real solution will be to use autovacuum all the time ;-) There are some additional improvements we could have in this area: for example the vacuum code could be smarter about not updating pg_database for each table when called by autovacuum, and do it only once the whole autovacuum iteration is done. I updated the system catalogs documentation, but I didn't modify the maintenance section. Also having some regression tests for this would be nice but it's not really a very straightforward thing to do. Catalog version bumped due to system catalog changes.	2006-07-10 16:20:52 +00:00
Tom Lane	b7b78d24f7	Code review for FILLFACTOR patch. Change WITH grammar as per earlier discussion (including making def_arg allow reserved words), add missed opt_definition for UNIQUE case. Put the reloptions support code in a less random place (I chose to make a new file access/common/reloptions.c). Eliminate header inclusion creep. Make the index options functions safely user-callable (seems like client apps might like to be able to test validity of options before trying to make an index). Reduce overhead for normal case with no options by allowing rd_options to be NULL. Fix some unmaintainably klugy code, including getting rid of Natts_pg_class_fixed at long last. Some stylistic cleanup too, and pay attention to keeping comments in sync with code. Documentation still needs work, though I did fix the omissions in catalogs.sgml and indexam.sgml.	2006-07-03 22:45:41 +00:00
Bruce Momjian	277807bd9e	Add FILLFACTOR to CREATE INDEX. ITAGAKI Takahiro	2006-07-02 02:23:23 +00:00
Teodor Sigaev	783a73168b	Forgot to add new file :((	2006-06-28 12:08:35 +00:00
Teodor Sigaev	1f7ef548ec	Changes: * new split algorithm (as proposed in http://archives.postgresql.org/pgsql-hackers/2006-06/msg00254.php) * pickSplit() may now be called for the second and subsequent columns * add spl_(l|r)datum_exists to GIST_SPLITVEC: pickSplit should check these values in order to use the already-defined spl_(l|r)datum for splitting, and should set spl_(l|r)datum_exists to 'false' (if it was 'true') to signal to the caller that spl_(l|r)datum was used * support for old pickSplit(): the split is not very optimal but correct * remove the 'bytes' field from GISTENTRY: the size of a value is in any case defined by its type * split GIST_SPLITVEC into two structures: one for use in picksplit and a second for internal use * some code refactoring * support for subsplit in rtree opclasses. TODO: add support for subsplit to contrib modules	2006-06-28 12:00:14 +00:00
Tom Lane	3c71244b74	Put #ifdef NOT_USED around posix_fadvise call. We may want to resurrect this someday, but right now it seems that posix_fadvise is immature to the point of being broken on many platforms ... and we don't have any benchmark evidence proving it's worth spending time on.	2006-06-27 18:59:17 +00:00
Tom Lane	cdd5178c69	Extend the MinimalTuple concept to tuplesort.c, thereby reducing the per-tuple space overhead for sorts in memory. I chose to replace the previous patch that tried to write out the bare minimum amount of data when sorting on disk; instead, just dump the MinimalTuples as-is. This wastes 3 to 10 bytes per tuple depending on architecture and null-bitmap length, but the simplification in the writetup/readtup routines seems worth it.	2006-06-27 16:53:02 +00:00
Tom Lane	3f50ba27cf	Create infrastructure for 'MinimalTuple' representation of in-memory tuples with less header overhead than a regular HeapTuple, per my recent proposal. Teach TupleTableSlot code how to deal with these. As proof of concept, change tuplestore.c to store MinimalTuples instead of HeapTuples. Future patches will expand the concept to other places where it is useful.	2006-06-27 02:51:40 +00:00
Tom Lane	3a04f53e7f	pg_stop_backup was calling XLogArchiveNotify() twice for the newly created backup history file. Bug introduced by the 8.1 change to make pg_stop_backup delete older history files. Per report from Masao Fujii.	2006-06-22 20:42:57 +00:00
Tom Lane	27c3e3de09	Remove redundant gettimeofday() calls to the extent practical without changing semantics too much. statement_timestamp is now set immediately upon receipt of a client command message, and the various places that used to do their own gettimeofday() calls to mark command startup are referenced to that instead. I have also made stats_command_string use that same value for pg_stat_activity.query_start for both the command itself and its eventual replacement by <IDLE> or <idle in transaction>. There was some debate about that, but no argument that seemed convincing enough to justify an extra gettimeofday() call.	2006-06-20 22:52:00 +00:00
Tom Lane	1e8ae13640	Don't try to call posix_fadvise() unless <fcntl.h> supplies a declaration for it. Hopefully will fix core dump evidenced by some buildfarm members since fadvise patch went in. The actual definition of the function is not ABI-compatible with compiler's default assumption in the absence of any declaration, so it's clearly unsafe to try to call it without seeing a declaration.	2006-06-18 18:30:21 +00:00
Tom Lane	06e10abc0b	Fix problems with cached tuple descriptors disappearing while still in use by creating a reference-count mechanism, similar to what we did a long time ago for catcache entries. The back branches have an ugly solution involving lots of extra copies, but this way is more efficient. Reference counting is only applied to tupdescs that are actually in caches --- there seems no need to use it for tupdescs that are generated in the executor, since they'll go away during plan shutdown by virtue of being in the per-query memory context. Neil Conway and Tom Lane	2006-06-16 18:42:24 +00:00
Bruce Momjian	40bc06fa16	Test for POSIX_FADV_DONTNEED to use posix_fadvise().	2006-06-16 04:11:48 +00:00
Bruce Momjian	94a5c4a01b	Use posix_fadvise() to avoid kernel caching of WAL contents on WAL file close. ITAGAKI Takahiro	2006-06-15 19:15:00 +00:00
Teodor Sigaev	b32000eda4	Somewhat improve page split in multicolumn GiST indexes. If the user-defined picksplit on the n-th column generates equal left and right unions, picksplit is then called on the (n+1)-th column.	2006-05-29 12:50:06 +00:00
Teodor Sigaev	0a6fde5a26	Correct checking in findParents(). From Andreas Seltenreich <andreas+pg@gate450.dyndns.org>	2006-05-29 08:39:44 +00:00
Alvaro Herrera	3d58a1c168	Remove traces of otherwise unused RELKIND_SPECIAL symbol. Leave the psql bits in place though, so that it plays nicely with older servers. Per discussion.	2006-05-28 02:27:08 +00:00
Teodor Sigaev	5d1a066e64	Fix findParents() in case of multiple levels to find. By Andreas Seltenreich <andreas+pg@gate450.dyndns.org>	2006-05-26 08:01:17 +00:00
Teodor Sigaev	d2158b0281	* Add NULL support to GiST. * Refactor and simplify code in gistutil.c and gist.c. * In some cases the user-defined picksplit method can now be called for a non-first index column, but there is room to do more here. * Small doc fix related to NULL support.	2006-05-24 11:01:39 +00:00
Teodor Sigaev	09518fbdf4	Call MarkBufferDirty() before XLogInsert() during completion of insert	2006-05-19 17:15:41 +00:00
Teodor Sigaev	420cbff881	Simplify gistSplit() and refactor some related code.	2006-05-19 16:15:17 +00:00
Teodor Sigaev	5890790b4a	Rework completion of incomplete inserts. Now it writes WAL log during inserts.	2006-05-19 11:10:25 +00:00
Teodor Sigaev	8876e37d07	Reduce size of critical section during vacuum full; critical sections are no longer nested. All user-defined functions are now called outside critical sections. Small improvements in WAL protocol. TODO: improve XLOG replay	2006-05-17 16:34:59 +00:00
Tom Lane	3fdeb189e9	Clean up code associated with updating pg_class statistics columns (relpages/reltuples). To do this, create formal support in heapam.c for "overwrite" tuple updates (including xlog replay capability) and use that instead of the ad-hoc overwrites we'd been using in VACUUM and CREATE INDEX. Take the responsibility for updating stats during CREATE INDEX out of the individual index AMs, and do it where it belongs, in catalog/index.c. Aside from being more modular, this avoids having to update the same tuple twice in some paths through CREATE INDEX. It's probably not measurably faster, but for sure it's a lot cleaner than before.	2006-05-10 23:18:39 +00:00
Teodor Sigaev	10dd8df68e	Reduce size of critical section and remove calls of user-defined functions in insertion and deletion; modify gistSplit() to not use buffers. TODO: gistvacuumcleanup and XLOG	2006-05-10 09:19:54 +00:00
Tom Lane	5749f6ef0c	Rewrite btree vacuuming to fold the former bulkdelete and cleanup operations into a single mostly-physical-order scan of the index. This requires some ticklish interlocking considerations, but should create no material performance impact on normal index operations (at least given the already-committed changes to make scans work a page at a time). VACUUM itself should get significantly faster in any index that's degenerated to a very nonlinear page order. Also, we save one pass over the index entirely, except in the case where there were no deletions to do and so only one pass happened anyway. Original patch by Heikki Linnakangas, rework by Tom Lane.	2006-05-08 00:00:17 +00:00
Tom Lane	09cb5c0e7d	Rewrite btree index scans to work a page at a time in all cases (both btgettuple and btgetmulti). This eliminates the problem of "re-finding" the exact stopping point, since the stopping point is effectively always a page boundary, and index items are never moved across pre-existing page boundaries. A small penalty is that the keys_are_unique optimization is effectively disabled (and, therefore, is removed in this patch), causing us to apply _bt_checkkeys() to at least one more tuple than necessary when looking up a unique key. However, the advantages for non-unique cases seem great enough to accept this tradeoff. Aside from simplifying and (sometimes) speeding up the indexscan code, this will allow us to reimplement btbulkdelete as a largely sequential scan instead of index-order traversal, thereby significantly reducing the cost of VACUUM. Those changes will come in a separate patch. Original patch by Heikki Linnakangas, rework by Tom Lane.	2006-05-07 01:21:30 +00:00
Teodor Sigaev	2a58f3bff6	Fix typo noticed by Alvaro Herrera	2006-05-03 06:56:47 +00:00

4679 Commits