of an index can now be a computed expression instead of a simple variable.
Restrictions on expressions are the same as for predicates (only immutable
functions, no sub-selects). This fixes problems recently introduced with
inlining SQL functions, because the inlining transformation is applied to
both expression trees so the planner can still match them up. Along the
way, improve efficiency of handling index predicates (both predicates and
index expressions are now cached by the relcache) and fix 7.3 oversight
that didn't record dependencies of predicate expressions.
blanks, in hopes of reducing the surprise factor for newbies. Remove
redundant operators for VARCHAR (it depends wholly on TEXT operations now).
Clean up resolution of ambiguous operators/functions to avoid surprising
choices for domains: domains are treated as equivalent to their base types
and binary-coercibility is no longer considered a preference item when
choosing among multiple operators/functions. IsBinaryCoercible now correctly
reflects the notion that you need *only* relabel the type to get from type
A to type B: that is, a domain is binary-coercible to its base type, but
not vice versa. Various marginal cleanup, including merging the essentially
duplicate resolution code in parse_func.c and parse_oper.c. Improve opr_sanity
regression test to understand about binary compatibility (using pg_cast),
and fix a couple of small errors in the catalogs revealed thereby.
Restructure "special operator" handling to fetch operators via index opclasses
rather than hardwiring assumptions about names (cleans up the pattern_ops
stuff a little).
account for NULLs; in hindsight this is obvious since the code for
the MCV-lists case would reduce to this when there are zero entries
in both lists. Per example from Alec Mitchell.
them as arrays of the internal datatype. This requires treating the
stavalues columns as 'anyarray' rather than 'text[]', which is not 100%
kosher but seems to work fine for the purposes we need for pg_statistic.
Perhaps in the future 'anyarray' will be allowed more generally.
passed to join selectivity estimators. Make use of this in eqjoinsel
to derive non-bogus selectivity for IN clauses. Further tweaking of
cost estimation for IN.
initdb forced because of pg_proc.h changes.
Try to model the effect of rescanning input tuples in mergejoins;
account for JOIN_IN short-circuiting where appropriate. Also, recognize
that mergejoin and hashjoin clauses may now be more than single operator
calls, so we have to charge appropriate execution costs.
of known-equal expressions includes any constant expressions (including
Params from outer queries), we actively suppress any 'var = var'
clauses that are or could be deduced from the set, generating only the
deducible 'var = const' clauses instead. The idea here is to push down
the restrictions implied by the equality set to base relations whenever
possible. Once we have applied the 'var = const' clauses, the 'var = var'
clauses are redundant, and should be suppressed both to save work at
execution and to avoid double-counting restrictivity.
There are two implementation techniques: the executor understands a new
JOIN_IN jointype, which emits at most one matching row per left-hand row,
or the result of the IN's sub-select can be fed through a DISTINCT filter
and then joined as an ordinary relation.
Along the way, some minor code cleanup in the optimizer; notably, break
out most of the jointree-rearrangement preprocessing in planner.c and
put it in a new file prep/prepjointree.c.
containing a volatile function), rather than only on 'Var = Var' clauses
as before. This makes it practical to do flatten_join_alias_vars at the
start of planning, which in turn eliminates a bunch of klugery inside the
planner to deal with alias vars. As a free side effect, we now detect
implied equality of non-Var expressions; for example in
SELECT ... WHERE a.x = b.y and b.y = 42
we will deduce a.x = 42 and use that as a restriction qual on a. Also,
we can remove the restriction introduced 12/5/02 to prevent pullup of
subqueries whose targetlists contain sublinks.
Still TODO: make statistical estimation routines in selfuncs.c and costsize.c
smarter about expressions that are more complex than plain Vars. The need
for this is considerably greater now that we have to be able to estimate
the suitability of merge and hash join techniques on such expressions.
costs for expression evaluation, not only per-tuple cost as before.
This extension is needed in order to deal realistically with hashed or
materialized sub-selects.
so that all executable expression nodes inherit from a common supertype
Expr. This is somewhat of an exercise in code purity rather than any
real functional advance, but getting rid of the extra Oper or Func node
formerly used in each operator or function call should provide at least
a little space and speed improvement.
initdb forced by changes in stored-rules representation.
of groups produced by GROUP BY. This improves the accuracy of planning
estimates for grouped subselects, and is needed to check whether a
hashed aggregation plan risks memory overflow.
postgres.h or c.h includes a system header (such as stdio.h or
stdlib.h), there's no need to specifically include it in any of the .c
files in the backend.
Neil Conway
Ray Ontko 28-June-02. Also, fix prefix_selectivity for NAME lefthand
variables (it was bogusly assuming binary compatibility), and adjust
make_greater_string() to not call pg_mbcliplen() with invalid multibyte
data (this last per bug report that I can't find at the moment, but it
was in July '02).
> I see in your recent bytea-LIKE patch
>
> if (datatype != BYTEAOID && pg_database_encoding_max_length()
> 1)
> len = pg_mbcliplen((const unsigned char *) workstr, len,
len - 1);
> else
> len -= -1;
>
> Surely there's one too many minus signs in that last?
Joe Conway
> src/backend/optimizer/path/indxpath.c; see the "special indexable
> operators" stuff near the bottom of that file. (It's a bit of a crock
> that this code is hardwired there, and not somehow accessed through a
> system catalog, but it's what we've got at the moment.)
The attached patch re-enables a bytea right hand argument (as compared
to a text right hand argument), and enables index usage, for bytea LIKE
Joe Conway
with OPAQUE, as per recent pghackers discussion. I still want to do some
more work on the 'cstring' pseudo-type, but I'm going to commit the bulk
of the changes now before the tree starts shifting under me ...
per pghackers discussion. Add some more typsanity tests, and clean
up some problems exposed thereby (broken or missing array types for
some built-in types). Also, clean up loose ends from unknownin/out
patch.
Use "--enable-integer-datetimes" in configuration to use this rather
than the original float8 storage. I would recommend the integer-based
storage for any platform on which it is available. We perhaps should
make this the default for the production release.
Change timezone(timestamptz) results to return timestamp rather than
a character string. Formerly, we didn't have a way to represent
timestamps with an explicit time zone other than freezing the info into
a string. Now, we can reasonably omit the explicit time zone from the
result and return a timestamp with values appropriate for the specified
time zone. Much cleaner, and if you need the time zone in the result
you can put it into a character string pretty easily anyway.
Allow fractional seconds in date/time types even for dates prior to 1BC.
Limit timestamp data types to 6 decimal places of precision. Just right
for a micro-second storage of int8 date/time types, and reduces the
number of places ad-hoc rounding was occuring for the float8-based types.
Use lookup tables for precision/rounding calculations for timestamp and
interval types. Formerly used pow() to calculate the desired value but
with a more limited range there is no reason to not type in a lookup
table. Should be *much* better performance, though formerly there were
some optimizations to help minimize the number of times pow() was called.
Define a HAVE_INT64_TIMESTAMP variable. Based on the configure option
"--enable-integer-datetimes" and the existing internal INT64_IS_BUSTED.
Add explicit date/interval operators and functions for addition and
subtraction. Formerly relied on implicit type promotion from date to
timestamp with time zone.
Change timezone conversion functions for the timetz type from "timetz()"
to "timezone()". This is consistant with other time zone coersion
functions for other types.
Bump the catalog version to 200204201.
Fix up regression tests to reflect changes in fractional seconds
representation for date/times in BC eras.
All regression tests pass on my Linux box.
qualified operator names directly, for example CREATE OPERATOR myschema.+
( ... ). To qualify an operator name in an expression you need to write
OPERATOR(myschema.+) (thanks to Peter for suggesting an escape hatch).
I also took advantage of having to reformat pg_operator to fix something
that'd been bugging me for a while: mergejoinable operators should have
explicit links to the associated cross-data-type comparison operators,
rather than hardwiring an assumption that they are named < and >.
now just below FATAL in server_min_messages. Added more text to
highlight ordering difference between it and client_min_messages.
---------------------------------------------------------------------------
REALLYFATAL => PANIC
STOP => PANIC
New INFO level the prints to client by default
New LOG level the prints to server log by default
Cause VACUUM information to print only to the client
NOTICE => INFO where purely information messages are sent
DEBUG => LOG for purely server status messages
DEBUG removed, kept as backward compatible
DEBUG5, DEBUG4, DEBUG3, DEBUG2, DEBUG1 added
DebugLvl removed in favor of new DEBUG[1-5] symbols
New server_min_messages GUC parameter with values:
DEBUG[5-1], INFO, NOTICE, ERROR, LOG, FATAL, PANIC
New client_min_messages GUC parameter with values:
DEBUG[5-1], LOG, INFO, NOTICE, ERROR, FATAL, PANIC
Server startup now logged with LOG instead of DEBUG
Remove debug_level GUC parameter
elog() numbers now start at 10
Add test to print error message if older elog() values are passed to elog()
Bootstrap mode now has a -d that requires an argument, like postmaster
both input streams to the end. If one variable's range is much less
than the other, an indexscan-based merge can win by not scanning all
of the other table. Per example from Reinhard Max.
>
> 1. Now outputs '\\' instead of '\134' when using encode(bytea, 'escape')
> Note that I ended up leaving \0 as \000 so that there are no ambiguities
> when decoding something like, for example, \0123.
>
> 2. Fixed bug in byteain which allowed input values which were not valid
> octals (e.g. \789), to be parsed as if they were octals.
>
> Joe
>
Here's rev 2 of the bytea string support patch. Changes:
1. Added missing declaration for MatchBytea function
2. Added PQescapeBytea to fe-exec.c
3. Applies cleanly on cvs tip from this afternoon
I'm hoping that someone can review/approve/apply this before beta starts, so
I guess I'd vote (not that it counts for much) to delay beta a few days :-)
Joe Conway
Note: I didn't force an initdb, figuring that one today was enough.
However, there is a new function in pg_proc.h, and pg_dump won't be
able to dump partial indexes until you add that function.
IS TRUE, etc, with some degree of verisimilitude. Split out
selectivity support functions from builtins.h into a new header
file selfuncs.h, so as to reduce the number of header files builtins.h
must depend on. Fix a few missing inclusions exposed thereby.
From Joe Conway, with some kibitzing from Tom Lane.
of costsize.c routines to pass Query root, so that costsize can figure
more things out by itself and not be so dependent on its callers to tell
it everything it needs to know. Use selectivity of hash or merge clause
to estimate number of tuples processed internally in these joins
(this is more useful than it would've been before, since eqjoinsel is
somewhat more accurate than before).