a separate statement (though it can still be invoked as part of VACUUM, too).
pg_statistic redesigned to be more flexible about what statistics are
stored. ANALYZE now collects a list of several of the most common values,
not just one, plus a histogram (not just the min and max values). Random
sampling is used to make the process reasonably fast even on very large
tables. The number of values and histogram bins collected is now
user-settable via an ALTER TABLE command.
There is more still to do; the new stats are not being used everywhere
they could be in the planner. But the remaining changes for this project
should be localized, and the behavior is already better than before.
A not-very-related change is that sorting now makes use of btree comparison
routines if it can find one, rather than invoking '<' twice.
succeeds or not. Revise rtree page split algorithm to take care about
making a feasible split --- ie, will the incoming tuple actually fit?
Failure to make a feasible split, combined with failure to notice the
failure, account for Jim Stone's recent bug report. I suspect that
hash and gist indices may have the same type of bug, but at least now
we'll get error messages rather than silent failures if so. Also clean
up rtree code to use Datum rather than char* where appropriate.
allocated by plan nodes are not leaked at end of query. This doesn't
really matter for normal queries, but it sure does for queries invoked
repetitively inside SQL functions. Clean up some other grotty code
associated with tupdescs, and fix a few other memory leaks exposed by
tests with simple SQL functions.
1. Support of variable size keys - new algorithm of insertion to tree
(GLI - gist layrered insertion). Previous algorithm was implemented
as described in paper by Joseph M. Hellerstein et.al
"Generalized Search Trees for Database Systems". This (old)
algorithm was not suitable for variable size keys and could be
not effective ( walking up-down ) in case of multiple levels split
Bug fixed:
1. fixed bug in gistPageAddItem - key values were written to disk
uncompressed. This caused failure if decompression function
does real job.
2. NULLs handling - we keep NULLs in tree. Right way is to remove them,
but we don't know how to inform vacuum about index statistics. This is
just cosmetic warning message (like in case with R-Tree),
but I'm not sure how to recognize real problem if we remove NULLs
and suppress this warning as Tom suggested.
3. various memory leaks
This work was done by Teodor Sigaev (teodor@stack.net) and
Oleg Bartunov (oleg@sai.msu.su).
maintained for each cache entry. A cache entry will not be freed until
the matching ReleaseSysCache call has been executed. This eliminates
worries about cache entries getting dropped while still in use. See
my posting to pg-hackers of even date for more info.
(WAL logging for this is not done yet, however.) Clean up a number of really
crufty things that are no longer needed now that DROP behaves nicely. Make
temp table mapper do the right things when drop or rename affecting a temp
table is rolled back. Also, remove "relation modified while in use" error
check, in favor of locking tables at first reference and holding that lock
throughout the statement.
pass-by-ref data types --- eg, an index on lower(textfield) --- no longer
leak memory during index creation or update. Clean up a lot of redundant
code ... did you know that copy, vacuum, truncate, reindex, extend index,
and bootstrap each basically duplicated the main executor's logic for
extracting information about an index and preparing index entries?
Functional indexes should be a little faster now too, due to removal
of repeated function lookups.
CREATE INDEX 'opt_type' clause is deimplemented by these changes,
but I haven't removed it from the parser yet (need to merge with
Thomas' latest change set first).
memory contexts. Currently, only leaks in expressions executed as
quals or projections are handled. Clean up some old dead cruft in
executor while at it --- unused fields in state nodes, that sort of thing.
passing the index-is-unique flag to index build routines (duh! ...
why wasn't it done this way to begin with?). Aside from eliminating
an eyesore, this should save a few milliseconds in btree index creation
because a full scan of pg_index is not needed any more.
--- ie, they're only called for side-effects. Add a PG_RETURN_VOID()
macro and use it where appropriate. This probably doesn't change the
machine code by a single bit ... it's just for documentation.
running gcc and HP's cc with warnings cranked way up. Signed vs unsigned
comparisons, routines declared static and then defined not-static,
that kind of thing. Tedious, but perhaps useful...
from a constraint condition does not violate the constraint (cf. discussion
on pghackers 12/9/99). Implemented by adding a parameter to ExecQual,
specifying whether to return TRUE or FALSE when the qual result is
really NULL in three-valued boolean logic. Currently, ExecRelCheck is
the only caller that asks for TRUE, but if we find any other places that
have the wrong response to NULL, it'll be easy to fix them.
* Buffer refcount cleanup (per my "progress report" to pghackers, 9/22).
* Add links to backend PROC structs to sinval's array of per-backend info,
and use these links for routines that need to check the state of all
backends (rather than the slow, complicated search of the ShmemIndex
hashtable that was used before). Add databaseOID to PROC structs.
* Use this to implement an interlock that prevents DESTROY DATABASE of
a database containing running backends. (It's a little tricky to prevent
a concurrently-starting backend from getting in there, since the new
backend is not able to lock anything at the time it tries to look up
its database in pg_database. My solution is to recheck that the DB is
OK at the end of InitPostgres. It may not be a 100% solution, but it's
a lot better than no interlock at all...)
* In ALTER TABLE RENAME, flush buffers for the relation before doing the
rename of the physical files, to ensure we don't get failures later from
mdblindwrt().
* Update TRUNCATE patch so that it actually compiles against current
sources :-(.
You should do "make clean all" after pulling these changes.
additional argument specifying the kind of lock to acquire/release (or
'NoLock' to do no lock processing). Ensure that all relations are locked
with some appropriate lock level before being examined --- this ensures
that relevant shared-inval messages have been processed and should prevent
problems caused by concurrent VACUUM. Fix several bugs having to do with
mismatched increment/decrement of relation ref count and mismatched
heap_open/close (which amounts to the same thing). A bogus ref count on
a relation doesn't matter much *unless* a SI Inval message happens to
arrive at the wrong time, which is probably why we got away with this
sloppiness for so long. Repair missing grab of AccessExclusiveLock in
DROP TABLE, ALTER/RENAME TABLE, etc, as noted by Hiroshi.
Recommend 'make clean all' after pulling this update; I modified the
Relation struct layout slightly.
Will post further discussion to pghackers list shortly.
no longer returns buffer pointer, can be gotten from scan;
descriptor; bootstrap can create multi-key indexes;
pg_procname index now is multi-key index; oidint2, oidint4, oidname
are gone (must be removed from regression tests); use System Cache
rather than sequential scan in many places; heap_modifytuple no
longer takes buffer parameter; remove unused buffer parameter in
a few other functions; oid8 is not index-able; remove some use of
single-character variable names; cleanup Buffer variables usage
and scan descriptor looping; cleaned up allocation and freeing of
tuples; 18k lines of diff;
Patch by: wieck@sapserv.debis.de (Jan Wieck)
One of the design rules of PostgreSQL is extensibility. And
to follow this rule means (at least for me) that there should
not only be a builtin PL. Instead I would prefer a defined
interface for PL implemetations.