1
0
mirror of https://github.com/postgres/postgres.git synced 2025-06-23 14:01:44 +03:00

Make SQL arrays support null elements. This commit fixes the core array

functionality, but I still need to make another pass looking at places
that incidentally use arrays (such as ACL manipulation) to make sure they
are null-safe.  Contrib needs work too.
I have not changed the behaviors that are still under discussion about
array comparison and what to do with lower bounds.
This commit is contained in:
Tom Lane
2005-11-17 22:14:56 +00:00
parent c859308aba
commit cecb607559
35 changed files with 2149 additions and 950 deletions

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/array.sgml,v 1.46 2005/11/04 23:13:59 petere Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/array.sgml,v 1.47 2005/11/17 22:14:50 tgl Exp $ -->
<sect1 id="arrays">
<title>Arrays</title>
@ -110,6 +110,13 @@ CREATE TABLE tictactoe (
three subarrays of integers.
</para>
<para>
To set an element of an array constant to NULL, write <literal>NULL</>
for the element value. (Any upper- or lower-case variant of
<literal>NULL</> will do.) If you want an actual string value
<quote>NULL</>, you must put double quotes around it.
</para>
<para>
(These kinds of array constants are actually only a special case of
the generic type constants discussed in <xref
@ -121,17 +128,6 @@ CREATE TABLE tictactoe (
<para>
Now we can show some <command>INSERT</command> statements.
<programlisting>
INSERT INTO sal_emp
VALUES ('Bill',
'{10000, 10000, 10000, 10000}',
'{{"meeting", "lunch"}, {"meeting"}}');
ERROR: multidimensional arrays must have array expressions with matching dimensions
</programlisting>
Note that multidimensional arrays must have matching extents for each
dimension. A mismatch causes an error report.
<programlisting>
INSERT INTO sal_emp
VALUES ('Bill',
@ -145,15 +141,9 @@ INSERT INTO sal_emp
</programlisting>
</para>
<para>
A limitation of the present array implementation is that individual
elements of an array cannot be SQL null values. The entire array
can be set to null, but you can't have an array with some elements
null and some not. (This is likely to change in the future.)
</para>
<para>
The result of the previous two inserts looks like this:
<programlisting>
SELECT * FROM sal_emp;
name | pay_by_quarter | schedule
@ -183,6 +173,19 @@ INSERT INTO sal_emp
constructor syntax is discussed in more detail in
<xref linkend="sql-syntax-array-constructors">.
</para>
<para>
Multidimensional arrays must have matching extents for each
dimension. A mismatch causes an error report, for example:
<programlisting>
INSERT INTO sal_emp
VALUES ('Bill',
'{10000, 10000, 10000, 10000}',
'{{"meeting", "lunch"}, {"meeting"}}');
ERROR: multidimensional arrays must have array expressions with matching dimensions
</programlisting>
</para>
</sect2>
<sect2>
@ -262,14 +265,22 @@ SELECT schedule[1:2][2] FROM sal_emp WHERE name = 'Bill';
</para>
<para>
Fetching from outside the current bounds of an array yields a
SQL null value, not an error. For example, if <literal>schedule</>
An array subscript expression will return null if either the array itself or
any of the subscript expressions are null. Also, null is returned if a
subscript is outside the array bounds (this case does not raise an error).
For example, if <literal>schedule</>
currently has the dimensions <literal>[1:3][1:2]</> then referencing
<literal>schedule[3][3]</> yields NULL. Similarly, an array reference
with the wrong number of subscripts yields a null rather than an error.
Fetching an array slice that
is completely outside the current bounds likewise yields a null array;
but if the requested slice partially overlaps the array bounds, then it
</para>
<para>
An array slice expression likewise yields null if the array itself or
any of the subscript expressions are null. However, in other corner
cases such as selecting an array slice that
is completely outside the current array bounds, a slice expression
yields an empty (zero-dimensional) array instead of null.
If the requested slice partially overlaps the array bounds, then it
is silently reduced to just the overlapping region.
</para>
@ -349,7 +360,7 @@ UPDATE sal_emp SET pay_by_quarter[1:2] = '{27000,27000}'
</para>
<para>
Array slice assignment allows creation of arrays that do not use one-based
Subscripted assignment allows creation of arrays that do not use one-based
subscripts. For example one might assign to <literal>myarray[-2:7]</> to
create an array with subscript values running from -2 to 7.
</para>
@ -442,7 +453,7 @@ SELECT array_dims(ARRAY[1,2] || ARRAY[[3,4],[5,6]]);
arrays, but <function>array_cat</function> supports multidimensional arrays.
Note that the concatenation operator discussed above is preferred over
direct use of these functions. In fact, the functions are primarily for use
direct use of these functions. In fact, the functions exist primarily for use
in implementing the concatenation operator. However, they may be directly
useful in the creation of user-defined aggregates. Some examples:
@ -544,8 +555,9 @@ SELECT * FROM sal_emp WHERE 10000 = ALL (pay_by_quarter);
<para>
The array output routine will put double quotes around element values
if they are empty strings or contain curly braces, delimiter characters,
double quotes, backslashes, or white space. Double quotes and backslashes
if they are empty strings, contain curly braces, delimiter characters,
double quotes, backslashes, or white space, or match the word
<literal>NULL</>. Double quotes and backslashes
embedded in element values will be backslash-escaped. For numeric
data types it is safe to assume that double quotes will never appear, but
for textual data types one should be prepared to cope with either presence
@ -555,35 +567,15 @@ SELECT * FROM sal_emp WHERE 10000 = ALL (pay_by_quarter);
<para>
By default, the lower bound index value of an array's dimensions is
set to one. If any of an array's dimensions has a lower bound index not
equal to one, an additional decoration that indicates the actual
array dimensions will precede the array structure decoration.
set to one. To represent arrays with other lower bounds, the array
subscript ranges can be specified explicitly before writing the
array contents.
This decoration consists of square brackets (<literal>[]</>)
around each array dimension's lower and upper bounds, with
a colon (<literal>:</>) delimiter character in between. The
array dimension decoration is followed by an equal sign (<literal>=</>).
For example:
<programlisting>
SELECT 1 || ARRAY[2,3] AS array;
array
---------------
[0:2]={1,2,3}
(1 row)
SELECT ARRAY[1,2] || ARRAY[[3,4]] AS array;
array
--------------------------
[0:1][1:2]={{1,2},{3,4}}
(1 row)
</programlisting>
</para>
<para>
This syntax can also be used to specify non-default array subscripts
in an array literal. For example:
<programlisting>
SELECT f1[1][-2][3] AS e1, f1[1][-1][5] AS e2
FROM (SELECT '[1:1][-2:-1][3:5]={{{1,2,3},{4,5,6}}}'::int[] AS f1) AS ss;
@ -592,6 +584,18 @@ SELECT f1[1][-2][3] AS e1, f1[1][-1][5] AS e2
1 | 6
(1 row)
</programlisting>
The array output routine will include explicit dimensions in its result
only when there are one or more lower bounds different from one.
</para>
<para>
If the value written for an element is <literal>NULL</> (in any case
variant), the element is taken to be NULL. The presence of any quotes
or backslashes disables this and allows the literal string value
<quote>NULL</> to be entered. Also, for backwards compatibility with
pre-8.2 versions of <productname>PostgreSQL</>, the <xref
linkend="guc-array-nulls"> configuration parameter may be turned
<literal>off</> to suppress recognition of <literal>NULL</> as a NULL.
</para>
<para>
@ -600,7 +604,9 @@ SELECT f1[1][-2][3] AS e1, f1[1][-1][5] AS e2
if the element value would otherwise confuse the array-value parser.
For example, elements containing curly braces, commas (or whatever the
delimiter character is), double quotes, backslashes, or leading or trailing
whitespace must be double-quoted. To put a double quote or backslash in a
whitespace must be double-quoted. Empty strings and strings matching the
word <literal>NULL</> must be quoted, too. To put a double quote or
backslash in a
quoted array element value, precede it with a backslash. Alternatively, you
can use backslash-escaping to protect all data characters that would
otherwise be taken as array syntax.

View File

@ -1,5 +1,5 @@
<!--
$PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.36 2005/11/04 23:53:18 tgl Exp $
$PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.37 2005/11/17 22:14:50 tgl Exp $
-->
<chapter Id="runtime-config">
<title>Server Configuration</title>
@ -3614,6 +3614,7 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
<sect2 id="runtime-config-compatible-version">
<title>Previous PostgreSQL Versions</title>
<variablelist>
<varlistentry id="guc-add-missing-from" xreflabel="add_missing_from">
@ -3647,40 +3648,27 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
</listitem>
</varlistentry>
<varlistentry id="guc-regex-flavor" xreflabel="regex_flavor">
<term><varname>regex_flavor</varname> (<type>string</type>)</term>
<indexterm><primary>regular expressions</></>
<varlistentry id="guc-array-nulls" xreflabel="array_nulls">
<term><varname>array_nulls</varname> (<type>boolean</type>)</term>
<indexterm>
<primary><varname>regex_flavor</> configuration parameter</primary>
<primary><varname>array_nulls</> configuration parameter</primary>
</indexterm>
<listitem>
<para>
The regular expression <quote>flavor</> can be set to
<literal>advanced</>, <literal>extended</>, or <literal>basic</>.
The default is <literal>advanced</>. The <literal>extended</>
setting may be useful for exact backwards compatibility with
pre-7.4 releases of <productname>PostgreSQL</>. See
<xref linkend="posix-syntax-details"> for details.
This controls whether the array input parser recognizes
unquoted <literal>NULL</> as specifying a NULL array element.
By default, this is <literal>on</>, allowing array values containing
NULLs to be entered. However, <productname>PostgreSQL</> versions
before 8.2 did not support NULLs in arrays, and therefore would
treat <literal>NULL</> as specifying a normal array element with
the string value <quote>NULL</>. For backwards compatibility with
applications that require the old behavior, this variable can be
turned <literal>off</>.
</para>
</listitem>
</varlistentry>
<varlistentry id="guc-sql-inheritance" xreflabel="sql_inheritance">
<term><varname>sql_inheritance</varname> (<type>boolean</type>)</term>
<indexterm>
<primary><varname>sql_inheritance</> configuration parameter</primary>
</indexterm>
<indexterm><primary>inheritance</></>
<listitem>
<para>
This controls the inheritance semantics, in particular whether
subtables are included by various commands by default. They were
not included in versions prior to 7.1. If you need the old
behavior you can set this variable to <literal>off</>, but in
the long run you are encouraged to change your applications to
use the <literal>ONLY</literal> key word to exclude subtables.
See <xref linkend="ddl-inherit"> for more information about
inheritance.
Note that it is possible to create array values containing NULLs
even when this variable is <literal>off</>.
</para>
</listitem>
</varlistentry>
@ -3736,8 +3724,47 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
</listitem>
</varlistentry>
<varlistentry id="guc-regex-flavor" xreflabel="regex_flavor">
<term><varname>regex_flavor</varname> (<type>string</type>)</term>
<indexterm><primary>regular expressions</></>
<indexterm>
<primary><varname>regex_flavor</> configuration parameter</primary>
</indexterm>
<listitem>
<para>
The regular expression <quote>flavor</> can be set to
<literal>advanced</>, <literal>extended</>, or <literal>basic</>.
The default is <literal>advanced</>. The <literal>extended</>
setting may be useful for exact backwards compatibility with
pre-7.4 releases of <productname>PostgreSQL</>. See
<xref linkend="posix-syntax-details"> for details.
</para>
</listitem>
</varlistentry>
<varlistentry id="guc-sql-inheritance" xreflabel="sql_inheritance">
<term><varname>sql_inheritance</varname> (<type>boolean</type>)</term>
<indexterm>
<primary><varname>sql_inheritance</> configuration parameter</primary>
</indexterm>
<indexterm><primary>inheritance</></>
<listitem>
<para>
This controls the inheritance semantics, in particular whether
subtables are included by various commands by default. They were
not included in versions prior to 7.1. If you need the old
behavior you can set this variable to <literal>off</>, but in
the long run you are encouraged to change your applications to
use the <literal>ONLY</literal> key word to exclude subtables.
See <xref linkend="ddl-inherit"> for more information about
inheritance.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2 id="runtime-config-compatible-clients">
<title>Platform and Client Compatibility</title>
<variablelist>

View File

@ -1,5 +1,5 @@
<!--
$PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.292 2005/11/16 03:56:16 momjian Exp $
$PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.293 2005/11/17 22:14:50 tgl Exp $
PostgreSQL documentation
-->
@ -8323,6 +8323,18 @@ AND
case where the array has zero elements).
</para>
<para>
If the array expression yields a null array, the result of
<token>ANY</token> will be null. If the left-hand expression yields null,
the result of <token>ANY</token> is ordinarily null (though a non-strict
comparison operator could possibly yield a different result).
Also, if the right-hand array contains any null elements and no true
comparison result is obtained, the result of <token>ANY</token>
will be null, not false (again, assuming a strict comparison operator).
This is in accordance with SQL's normal rules for Boolean combinations
of null values.
</para>
<para>
<token>SOME</token> is a synonym for <token>ANY</token>.
</para>
@ -8346,6 +8358,18 @@ AND
(including the special case where the array has zero elements).
The result is <quote>false</> if any false result is found.
</para>
<para>
If the array expression yields a null array, the result of
<token>ALL</token> will be null. If the left-hand expression yields null,
the result of <token>ALL</token> is ordinarily null (though a non-strict
comparison operator could possibly yield a different result).
Also, if the right-hand array contains any null elements and no false
comparison result is obtained, the result of <token>ALL</token>
will be null, not true (again, assuming a strict comparison operator).
This is in accordance with SQL's normal rules for Boolean combinations
of null values.
</para>
</sect2>
<sect2>

View File

@ -1,5 +1,5 @@
<!--
$PostgreSQL: pgsql/doc/src/sgml/ref/insert.sgml,v 1.29 2005/01/09 05:57:45 tgl Exp $
$PostgreSQL: pgsql/doc/src/sgml/ref/insert.sgml,v 1.30 2005/11/17 22:14:51 tgl Exp $
PostgreSQL documentation
-->
@ -206,11 +206,11 @@ INSERT INTO films SELECT * FROM tmp_films WHERE date_prod &lt; '2004-05-07';
<programlisting>
-- Create an empty 3x3 gameboard for noughts-and-crosses
-- (these commands create the same board)
INSERT INTO tictactoe (game, board[1:3][1:3])
VALUES (1,'{{"","",""},{"","",""},{"","",""}}');
VALUES (1, '{{" "," "," "},{" "," "," "},{" "," "," "}}');
-- The subscripts in the above example aren't really needed
INSERT INTO tictactoe (game, board)
VALUES (2,'{{,,},{,,},{,,}}');
VALUES (2, '{{X," "," "},{" ",O," "},{" ",X," "}}');
</programlisting>
</para>
</refsect1>

View File

@ -1,5 +1,5 @@
<!--
$PostgreSQL: pgsql/doc/src/sgml/xfunc.sgml,v 1.107 2005/10/15 20:12:33 neilc Exp $
$PostgreSQL: pgsql/doc/src/sgml/xfunc.sgml,v 1.108 2005/11/17 22:14:50 tgl Exp $
-->
<sect1 id="xfunc">
@ -2790,6 +2790,7 @@ make_array(PG_FUNCTION_ARGS)
ArrayType *result;
Oid element_type = get_fn_expr_argtype(fcinfo-&gt;flinfo, 0);
Datum element;
bool isnull;
int16 typlen;
bool typbyval;
char typalign;
@ -2800,8 +2801,12 @@ make_array(PG_FUNCTION_ARGS)
if (!OidIsValid(element_type))
elog(ERROR, "could not determine data type of input");
/* get the provided element */
element = PG_GETARG_DATUM(0);
/* get the provided element, being careful in case it's NULL */
isnull = PG_ARGISNULL(0);
if (isnull)
element = (Datum) 0;
else
element = PG_GETARG_DATUM(0);
/* we have one dimension */
ndims = 1;
@ -2814,7 +2819,7 @@ make_array(PG_FUNCTION_ARGS)
get_typlenbyvalalign(element_type, &amp;typlen, &amp;typbyval, &amp;typalign);
/* now build the array */
result = construct_md_array(&amp;element, ndims, dims, lbs,
result = construct_md_array(&amp;element, &amp;isnull, ndims, dims, lbs,
element_type, typlen, typbyval, typalign);
PG_RETURN_ARRAYTYPE_P(result);
@ -2829,11 +2834,8 @@ make_array(PG_FUNCTION_ARGS)
<programlisting>
CREATE FUNCTION make_array(anyelement) RETURNS anyarray
AS '<replaceable>DIRECTORY</replaceable>/funcs', 'make_array'
LANGUAGE C STRICT;
LANGUAGE C IMMUTABLE;
</programlisting>
Note the use of <literal>STRICT</literal>; this is essential
since the code is not bothering to test for a null input.
</para>
</sect2>
</sect1>