The core of this patch is hash_array() and associated typcache
infrastructure, which works just about exactly like the existing support
for array comparison.
In addition I did some work to ensure that the planner won't think that an
array type is hashable unless its element type is hashable, and similarly
for sorting. This includes adding a datatype parameter to op_hashjoinable
and op_mergejoinable, and adding an explicit "hashable" flag to
SortGroupClause. The lack of a cross-check on the element type was a
pre-existing bug in mergejoin support --- but it didn't matter so much
before, because if you couldn't sort the element type there wasn't any good
alternative to failing anyhow. Now that we have the alternative of hashing
the array type, there are cases where we can avoid a failure by being picky
at the planner stage, so it's time to be picky.
The issue of exactly how to combine the per-element hash values to produce
an array hash is still open for discussion, but the rest of this is pretty
solid, so I'll commit it as-is.
better handling of NULL elements within the arrays. The third parameter
is a string that should be used to represent a NULL element, or should
be translated into a NULL element, respectively. If the third parameter
is NULL it behaves the same as the two-parameter form.
There are two incompatible changes in the behavior of the two-parameter form
of string_to_array. First, it will return an empty (zero-element) array
rather than NULL when the input string is of zero length. Second, if the
field separator is NULL, the function splits the string into individual
characters, rather than returning NULL as before. These two changes make
this form fully compatible with the behavior of the new three-parameter form.
Pavel Stehule, reviewed by Brendan Jurd
how this ought to behave for multi-dimensional arrays. Per discussion,
not having it at all seems better than having it with what might prove
to be the wrong behavior. We can always add it later when we have consensus
on the correct behavior.
alias for array_length(v,1). The efficiency gain here is doubtless
negligible --- what I'm interested in is making sure that if we have
second thoughts about the definition, we will not have to force a
post-beta initdb to change the implementation.
anyelement. This lacks the WITH ORDINALITY option, as well as the multiple
input arrays option added in the most recent SQL specs. But it's still a
pretty useful subset of the spec's functionality, and it is enough to
allow obsoleting contrib/intagg.
function as a special case.
This version still has the suspicious behavior of returning null for an
empty array (rather than zero), but this may need a wholesale revision of
empty array behavior, currently under discussion.
Jim Nasby, Robert Haas, Peter Eisentraut
an array of strings rather than an array of integers, and allow any simple
constant or identifier to be used in typmods; for example
create table foo (f1 widget(42,'23skidoo',point));
Of course the typmodin function has still got to pack this info into a
non-negative int32 for storage, but it's still a useful improvement in
flexibility, especially considering that you can do nearly anything if you
are willing to keep the info in a side table. We can get away with this
change since we have not yet released a version providing user-definable
typmods. Per discussion.
seen by code inspecting the expression. The best way to do this seems
to be to drop the original representation as a function invocation, and
instead make a special expression node type that represents applying
the element-type coercion function to each array element. In this way
the element function is exposed and will be checked for volatility.
Per report from Guillaume Smet.
Get rid of VARATT_SIZE and VARATT_DATA, which were simply redundant with
VARSIZE and VARDATA, and as a consequence almost no code was using the
longer names. Rename the length fields of struct varlena and various
derived structures to catch anyplace that was accessing them directly;
and clean up various places so caught. In itself this patch doesn't
change any behavior at all, but it is necessary infrastructure if we hope
to play any games with the representation of varlena headers.
Greg Stark and Tom Lane
more space is needed, instead of incrementing by a fixed amount; the old
method wastes lots of space and time when the ultimate size is large.
Per gripe from Tatsuo.
we probably should make them work reliably for all arrays. Fix code
to handle NULLs and multidimensional arrays, move it into arrayfuncs.c.
GIN is still restricted to indexing arrays with no null elements, however.
comment line where output as too long, and update typedefs for /lib
directory. Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).
Backpatch to 8.1.X.
functionality, but I still need to make another pass looking at places
that incidentally use arrays (such as ACL manipulation) to make sure they
are null-safe. Contrib needs work too.
I have not changed the behaviors that are still under discussion about
array comparison and what to do with lower bounds.
change saves a great deal of space in pg_proc and its primary index,
and it eliminates the former requirement that INDEX_MAX_KEYS and
FUNC_MAX_ARGS have the same value. INDEX_MAX_KEYS is still embedded
in the on-disk representation (because it affects index tuple header
size), but FUNC_MAX_ARGS is not. I believe it would now be possible
to increase FUNC_MAX_ARGS at little cost, but haven't experimented yet.
There are still a lot of vestigial references to FUNC_MAX_ARGS, which
I will clean up in a separate pass. However, getting rid of it
altogether would require changing the FunctionCallInfoData struct,
and I'm not sure I want to buy into that.
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
eliminating the former hard-wired convention about their names. Allow
pg_cast entries to represent both type coercion and length coercion in
a single step --- this is represented by a function that takes an
extra typmod argument, just like a length coercion function. This
nicely merges the type and length coercion mechanisms into something
at least a little cleaner than we had before. Make use of the single-
coercion-step behavior to fix integer-to-bit coercion so that coercing
to bit(n) yields the rightmost n bits of the integer instead of the
leftmost n bits. This should fix recurrent complaints about the odd
behavior of this coercion. Clean up the documentation of the bit string
functions, and try to put it where people might actually find it.
Also, get rid of the unreliable heuristics in ruleutils.c about whether
to display nested coercion steps; instead require parse_coerce.c to
label them properly in the first place.
of a composite type to get that type's OID as their second parameter,
in place of typelem which is useless. The actual changes are mostly
centralized in getTypeInputInfo and siblings, but I had to fix a few
places that were fetching pg_type.typelem for themselves instead of
using the lsyscache.c routines. Also, I renamed all the related variables
from 'typelem' to 'typioparam' to discourage people from assuming that
they necessarily contain array element types.
comparison functions), replacing the highly bogus bitwise array_eq. Create
a btree index opclass for ANYARRAY --- it is now possible to create indexes
on array columns.
Arrange to cache the results of catalog lookups across multiple array
operations, instead of repeating the lookups on every call.
Add string_to_array and array_to_string functions.
Remove singleton_array, array_accum, array_assign, and array_subscript
functions, since these were for proof-of-concept and not intended to become
supported functions.
Minor adjustments to behavior in some corner cases with empty or
zero-dimensional arrays.
Joe Conway (with some editorializing by Tom Lane).
expressions, ARRAY(sub-SELECT) expressions, some array functions.
Polymorphic functions using ANYARRAY/ANYELEMENT argument and return
types. Some regression tests in place, documentation is lacking.
Joe Conway, with some kibitzing from Tom Lane.
now)" item on the open items, and subsequent plpgsql function I sent in,
made me realize it was too hard to get the upper and lower bound of an
array. The attached creates two functions that I think will be very
useful when combined with the ability of plpgsql to return sets.
array_lower(array, dim_num)
- and -
array_upper(array, dim_num)
They return the value (as an int) of the upper and lower bound of the
requested dim in the provided array.
Joe Conway
to be flexible about assignment casts without introducing ambiguity in
operator/function resolution. Introduce a well-defined promotion hierarchy
for numeric datatypes (int2->int4->int8->numeric->float4->float8).
Change make_const to initially label numeric literals as int4, int8, or
numeric (never float8 anymore).
Explicitly mark Func and RelabelType nodes to indicate whether they came
from a function call, explicit cast, or implicit cast; use this to do
reverse-listing more accurately and without so many heuristics.
Explicit casts to char, varchar, bit, varbit will truncate or pad without
raising an error (the pre-7.2 behavior), while assigning to a column without
any explicit cast will still raise an error for wrong-length data like 7.3.
This more nearly follows the SQL spec than 7.2 behavior (we should be
reporting a 'completion condition' in the explicit-cast cases, but we have
no mechanism for that, so just do silent truncation).
Fix some problems with enforcement of typmod for array elements;
it didn't work at all in 'UPDATE ... SET array[n] = foo', for example.
Provide a generalized array_length_coerce() function to replace the
specialized per-array-type functions that used to be needed (and were
missing for NUMERIC as well as all the datetime types).
Add missing conversions int8<->float4, text<->numeric, oid<->int8.
initdb forced.
array header, and to compute sizing and alignment of array elements
the same way normal tuple access operations do --- viz, using the
tupmacs.h macros att_addlength and att_align. This makes the world
safe for arrays of cstrings or intervals, and should make it much
easier to write array-type-polymorphic functions; as examples see
the cleanups of array_out and contrib/array_iterator. By Joe Conway
and Tom Lane.