Also performed an initial run-through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' both before and after the change, to make sure
that I only picked up the right entries ...
buffer is valid, as ReadBuffer() will elog on error. Most of the call
sites of ReadBuffer() got this right, but this patch fixes those call
sites that did not.
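After this change a call site can simply rely on the returned buffer; for
example (a sketch, with made-up function and variable names):

    #include "postgres.h"
    #include "storage/bufmgr.h"
    #include "utils/rel.h"

    /* Illustrative only: read one page of "rel".  No validity check is
     * needed afterwards, because ReadBuffer() reports failure with
     * elog(ERROR) instead of returning an invalid buffer. */
    static Buffer
    read_one_page(Relation rel, BlockNumber blkno)
    {
        Buffer      buf = ReadBuffer(rel, blkno);

        /* no "if (!BufferIsValid(buf)) ..." needed here */
        return buf;
    }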
in all cases when keep_buf = true. This allows ANALYZE's inner loop to
use heap_release_fetch, which saves multiple buffer lookups for the same
page and avoids overestimation of cost by the vacuum cost mechanism.
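For illustration, the inner-loop pattern this enables looks roughly like the
following; the heap_release_fetch() argument list shown is an assumption
patterned on heap_fetch() of this vintage, not a verbatim signature:

    #include "postgres.h"
    #include "access/heapam.h"
    #include "storage/bufmgr.h"

    /* Sketch only: visit a run of candidate TIDs on one heap page while
     * keeping the same buffer pinned (keep_buf = true), so each tuple does
     * not pay for a fresh buffer lookup and the vacuum-cost accounting
     * charges the page only once. */
    static void
    sample_page(Relation rel, BlockNumber blkno, OffsetNumber maxoff)
    {
        Buffer          buf = InvalidBuffer;
        HeapTupleData   tup;
        OffsetNumber    offnum;

        for (offnum = FirstOffsetNumber; offnum <= maxoff; offnum++)
        {
            ItemPointerSet(&tup.t_self, blkno, offnum);
            if (heap_release_fetch(rel, SnapshotNow, &tup, &buf, true, NULL))
            {
                /* tuple is visible: include it in the sample */
            }
        }
        if (BufferIsValid(buf))
            ReleaseBuffer(buf);
    }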
now are supposed to take some kind of lock on an index whenever you
are going to access the index contents, rather than relying only on a
lock on the parent table.
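For illustration, the convention amounts to something like this when opening
an index directly (a sketch, assuming the one-argument index_open() of this
era; the lock mode is just an example):

    #include "postgres.h"
    #include "access/genam.h"
    #include "storage/lmgr.h"

    /* Sketch: lock the index itself before reading its contents, instead
     * of assuming the lock already held on the parent table is enough. */
    static Relation
    open_index_for_reading(Oid indexoid)
    {
        Relation    irel = index_open(indexoid);

        LockRelation(irel, AccessShareLock);
        return irel;
    }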
In the past, we used a 'Lispy' linked list implementation: a "list" was
merely a pointer to the head node of the list. The problem with that
design is that it makes lappend() and length() linear time. This patch
fixes that problem (and others) by maintaining a count of the list
length and a pointer to the tail node along with each head node pointer.
A "list" is now a pointer to a structure containing some meta-data
about the list; the head and tail pointers in that structure refer
to ListCell structures that maintain the actual linked list of nodes.
The function names of the list API have also been changed to, I hope,
be more logically consistent. For now, the old function names are still
available; they will be disabled by default once the rest of the tree has
been updated to use the new API names.
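For reference, the new shape of the representation and API is roughly this
(a sketch, not the verbatim pg_list.h contents):

    #include "postgres.h"
    #include "nodes/pg_list.h"

    /* Sketch: the List header now stores the length and a tail pointer, so
     * lappend() and list_length() are constant time; the ListCells behind
     * it still form an ordinary linked list. */
    static void
    demo_list_usage(void *a, void *b, void *c)
    {
        List       *items = NIL;
        ListCell   *lc;

        items = lappend(items, a);      /* O(1): tail pointer is maintained */
        items = lappend(items, b);
        items = lappend(items, c);

        Assert(list_length(items) == 3);    /* O(1): uses the stored count */

        foreach(lc, items)
        {
            void   *item = lfirst(lc);

            (void) item;                /* do something with each element */
        }
    }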
ago. This should give significantly better results when the density of
live tuples is not uniform throughout a table. Manfred Koizar, with
minor kibitzing from Tom Lane.
costing us lots more to maintain than it was worth. On shared tables
it was of exactly zero benefit because we couldn't trust it to be
up to date. On temp tables it sometimes saved an lseek, but not often
enough to be worth getting excited about. And the real problem was that
we forced an lseek on every relcache flush in order to update the field.
So all in all it seems best to lose the complexity.
This commit teaches ANALYZE to store such stats in pg_statistic, but
nothing is done yet about teaching the planner to use 'em.
Also, repair longstanding oversight in separate ANALYZE command: it
updated the pg_class.relpages and reltuples counts for the table proper,
but not for indexes.
indexes, it seems like we ought to put another layer of indirection
between the compute_stats functions and the actual data storage. This
would allow us to compute the values on-the-fly, for example.
subroutine in src/port/pgsleep.c. Remove platform dependencies from
miscadmin.h and put them in port.h where they belong. Extend recent
vacuum cost-based-delay patch to apply to VACUUM FULL, ANALYZE, and
non-btree index vacuuming.
By the way, where is the documentation for the cost-based-delay patch?
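As for the new subroutine itself: pg_usleep() takes a microsecond count, so
a delay configured in milliseconds gets scaled by 1000 before sleeping. A
minimal sketch (the wrapper and its parameter are illustrative, not the
actual GUC plumbing):

    #include "postgres.h"
    #include "port.h"           /* pg_usleep() is declared here now */

    /* Sketch: nap for a cost-based delay expressed in milliseconds. */
    static void
    nap_for(int delay_ms)
    {
        if (delay_ms > 0)
            pg_usleep(delay_ms * 1000L);    /* ms -> us */
    }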
- Update the comment in IsReservedName() to reflect the present day.
- Improve some variable & function names in commands/vacuum.c. I
was planning to rewrite this to avoid lappend(), but since I
still intend to do the list rewrite (which will make lappend() O(1)),
there's no need for that.
- Update some smgr comments which seemed to imply that we still
forced all dirty pages to disk at commit-time.
- Replace some #ifdef DIAGNOSTIC code with assertions.
- Make the distinction between OS-level file descriptors and
virtual file descriptors a little clearer in a few comments.
- Other minor comment improvements in the smgr code.
datatype by array_eq and array_cmp; use this to solve problems with memory
leaks in array indexing support. The parser's equality_oper and ordering_oper
routines also use the cache. Change the operator search algorithms to look
for appropriate btree or hash index opclasses, instead of assuming operators
named '<' or '=' have the right semantics. (ORDER BY ASC/DESC now also look
at opclasses, instead of assuming '<' and '>' are the right things.) Add
several more index opclasses so that there is no regression in functionality
for base datatypes. initdb forced due to catalog additions.
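For illustration, a caller now gets the comparison machinery for an
arbitrary datatype through the cache roughly like this (the wrapper function
is made up; lookup_type_cache() and the flag names are from typcache.h):

    #include "postgres.h"
    #include "utils/typcache.h"

    /* Sketch: fetch the btree-opclass-derived equality operator and
     * comparison function for a type once, via the cache, instead of
     * searching pg_operator for something literally named '=' or '<'. */
    static void
    get_compare_info(Oid type_id, Oid *eq_opr, Oid *cmp_proc)
    {
        TypeCacheEntry *typentry;

        typentry = lookup_type_cache(type_id,
                                     TYPECACHE_EQ_OPR | TYPECACHE_CMP_PROC);

        *eq_opr = typentry->eq_opr;     /* InvalidOid if no suitable opclass */
        *cmp_proc = typentry->cmp_proc;
    }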
them as arrays of the internal datatype. This requires treating the
stavalues columns as 'anyarray' rather than 'text[]', which is not 100%
kosher but seems to work fine for the purposes we need for pg_statistic.
Perhaps in the future 'anyarray' will be allowed more generally.
operations: make sure we use operators that are compatible, as determined
by a mergejoin link in pg_operator. Also, add code to planner to ensure
we don't try to use hashed grouping when the grouping operators aren't
marked hashable.
that ANALYZE would not gather any stats for a CHAR(255) column. I still
think a width threshold is appropriate for the reasons mentioned in the
code, but we can loosen it at least.
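The check in question is a simple cutoff on the detoasted size; a sketch of
the loosened version (the constant shown is my assumption of the new value,
and the function is illustrative):

    #include "postgres.h"
    #include "access/tuptoaster.h"

    /* Assumed loosened cutoff: comfortably above a CHAR(255), while still
     * keeping pathologically wide values out of the stats arrays. */
    #define WIDTH_THRESHOLD  1024

    /* Sketch: decide whether a sampled value is too wide to keep. */
    static bool
    value_too_wide(Datum value)
    {
        return toast_raw_datum_size(value) > WIDTH_THRESHOLD;
    }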
array header, and to compute sizing and alignment of array elements
the same way normal tuple access operations do --- viz, using the
tupmacs.h macros att_addlength and att_align. This makes the world
safe for arrays of cstrings or intervals, and should make it much
easier to write array-type-polymorphic functions; as examples see
the cleanups of array_out and contrib/array_iterator. By Joe Conway
and Tom Lane.
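The element-walking idiom this enables looks roughly like the following (a
sketch using the tupmacs.h macro spellings of this era; the function itself
is illustrative):

    #include "postgres.h"
    #include "access/tupmacs.h"
    #include "utils/array.h"

    /* Sketch: step over the elements of an array exactly the way heap-tuple
     * access does, so any element size/alignment (cstring, interval, ...)
     * is handled correctly. */
    static void
    walk_array_elements(ArrayType *array,
                        int typlen, bool typbyval, char typalign)
    {
        int     nitems = ArrayGetNItems(ARR_NDIM(array), ARR_DIMS(array));
        char   *p = ARR_DATA_PTR(array);
        int     i;

        for (i = 0; i < nitems; i++)
        {
            Datum   elt = fetch_att(p, typbyval, typlen);

            (void) elt;         /* do something with the element */

            p = att_addlength(p, typlen, PointerGetDatum(p));
            p = (char *) att_align(p, typalign);
        }
    }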
value '-2' is used to indicate a variable-width type whose width is
computed as strlen(datum)+1. Everything that looks at typlen is updated
except for array support, which Joe Conway is working on; at the moment
it wouldn't work to try to create an array of cstring.
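In other words there are now three width cases; a minimal sketch of the
computation (mirroring what datum-size code has to do; the function name is
made up):

    #include "postgres.h"

    /* Sketch of the three typlen conventions: fixed width, varlena (-1),
     * and the new cstring convention (-2), whose width is strlen + 1
     * counting the terminating NUL. */
    static Size
    datum_width(Datum value, int typlen)
    {
        if (typlen > 0)
            return (Size) typlen;                       /* fixed width */
        else if (typlen == -1)
            return VARSIZE(DatumGetPointer(value));     /* varlena */
        else
            return strlen(DatumGetCString(value)) + 1;  /* typlen == -2 */
    }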
it takes could be held for quite a while after the analyze step completes.
Rethink locking of pg_statistic in light of this fact. The original
scheme took an exclusive lock on pg_statistic, which was okay when the
lock could be expected to be released shortly, but that doesn't hold
anymore. Back off to a normal writer's lock (RowExclusiveLock). This
allows concurrent ANALYZE of nonoverlapping sets of tables, at the price
that concurrent ANALYZEs of the same table may fail with 'tuple
concurrently updated'.
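Concretely, the stats-update code now opens pg_statistic like any ordinary
writer; a sketch, using the heap_openr() spelling of this era:

    #include "postgres.h"
    #include "access/heapam.h"
    #include "catalog/catname.h"

    /* Sketch: a plain writer's lock, so concurrent ANALYZEs of different
     * tables don't block each other.  The actual per-column tuple
     * insert/update logic is elided. */
    static void
    update_stats_rows(void)
    {
        Relation    sd = heap_openr(StatisticRelationName, RowExclusiveLock);

        /* ... insert or update the pg_statistic rows here ... */

        heap_close(sd, RowExclusiveLock);
    }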
hardwired lists of index names for each catalog, use the relcache's
mechanism for caching lists of OIDs of indexes of any table. This
reduces the common case of updating system catalog indexes to a single
line, makes it much easier to add a new system index (in fact, you
can now do so on-the-fly if you want to), and as a nice side benefit
improves performance a little. Per recent pghackers discussion.
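The "single line" is a helper that walks the catalog's own index list from
the relcache; updating a catalog row now looks roughly like this (sketch;
the wrapper function is made up):

    #include "postgres.h"
    #include "access/heapam.h"
    #include "catalog/indexing.h"

    /* Sketch: after changing a system-catalog tuple, one call maintains
     * every index on that catalog, with no hardwired list of index names. */
    static void
    update_catalog_tuple(Relation catrel, HeapTuple newtup)
    {
        simple_heap_update(catrel, &newtup->t_self, newtup);

        CatalogUpdateIndexes(catrel, newtup);   /* the new single line */
    }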
code review by Tom Lane. Remaining issues: functions that take or
return tuple types are likely to break if one drops (or adds!)
a column in the table defining the type. Need to think about what
to do here.
Along the way: some code review for recent COPY changes; mark system
columns attnotnull = true where appropriate, per discussion a month ago.
attstattarget to indicate 'use the default'. The default is now a GUC
variable default_statistics_target, and so may be changed on the fly. Along
the way we gain the ability to have pg_dump dump the per-column statistics
target when it's not the default. Patch by Neil Conway, with some kibitzing
from Tom Lane.
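On the ANALYZE side this boils down to a one-line substitution when a column
has no explicit target (sketch; the helper is made up, and the extern
declaration stands in for wherever the GUC variable is actually declared):

    #include "postgres.h"

    extern int  default_statistics_target;     /* the new GUC variable */

    /* Sketch: a negative attstattarget now means "use the default", read
     * from the GUC at ANALYZE time instead of being frozen in the catalog. */
    static int
    effective_stats_target(int attstattarget)
    {
        return (attstattarget < 0) ? default_statistics_target : attstattarget;
    }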
transaction, so as to avoid returning them out of the index AM. Saves
repeated heap_fetch operations on frequently-updated rows. Also detect
queries on unique keys (equality to all columns of a unique index), and
don't bother continuing the scan once we have found the first match.
Killing is implemented in the btree and hash AMs, but not yet in rtree
or gist, because there isn't an equally convenient place to do it in
those AMs (the outer amgetnext routine can't do it without re-pinning
the index page).
Did some small cleanup on APIs of HeapTupleSatisfies, heap_fetch, and
index_insert to make this a little easier.
in snapshots, per my proposal of a few days ago. Also, tweak heapam.c
routines (heap_insert, heap_update, heap_delete, heap_mark4update) to
be passed the command ID to use, instead of doing GetCurrentCommandID.
For catalog updates they'll still get passed current command ID, but
for updates generated from the main executor they'll get passed the
command ID saved in the snapshot the query is using. This should fix
some corner cases associated with functions and triggers that advance
current command ID while an outer query is still in progress.
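The signature change this implies looks roughly as follows; the argument
order and the snapshot field name are my reading of the description above,
so treat them as assumptions:

    #include "postgres.h"
    #include "access/heapam.h"
    #include "utils/tqual.h"

    /* Sketch: the caller supplies the command ID explicitly.  A query run
     * by the main executor passes the command ID saved in its snapshot;
     * catalog updates keep passing the current command ID. */
    static void
    insert_with_snapshot_cid(Relation rel, HeapTuple tup, Snapshot snapshot)
    {
        CommandId   cid = snapshot->curcid;

        (void) heap_insert(rel, tup, cid);
    }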
yesterday's proposal to pghackers. Also remove unnecessary parameters
to heap_beginscan, heap_rescan. I modified pg_proc.h to reflect the
new numbers of parameters for the AM interface routines, but did not
force an initdb because nothing actually looks at those fields.
qualified operator names directly, for example CREATE OPERATOR myschema.+
( ... ). To qualify an operator name in an expression you need to write
OPERATOR(myschema.+) (thanks to Peter for suggesting an escape hatch).
I also took advantage of having to reformat pg_operator to fix something
that'd been bugging me for a while: mergejoinable operators should have
explicit links to the associated cross-data-type comparison operators,
rather than hardwiring an assumption that they are named < and >.