1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-28 23:42:10 +03:00

Downgrade implicit casts to text to be assignment-only, except for the ones

from the other string-category types; this eliminates a lot of surprising
interpretations that the parser could formerly make when there was no directly
applicable operator.

Create a general mechanism that supports casts to and from the standard string
types (text,varchar,bpchar) for *every* datatype, by invoking the datatype's
I/O functions.  These new casts are assignment-only in the to-string direction,
explicit-only in the other, and therefore should create no surprising behavior.
Remove a bunch of thereby-obsoleted datatype-specific casting functions.

The "general mechanism" is a new expression node type CoerceViaIO that can
actually convert between *any* two datatypes if their external text
representations are compatible.  This is more general than needed for the
immediate feature, but might be useful in plpgsql or other places in future.

This commit does nothing about the issue that applying the concatenation
operator || to non-text types will now fail, often with strange error messages
due to misinterpreting the operator as array concatenation.  Since it often
(not always) worked before, we should either make it succeed or at least give
a more user-friendly error; but details are still under debate.

Peter Eisentraut and Tom Lane
This commit is contained in:
Tom Lane
2007-06-05 21:31:09 +00:00
parent 1120b99445
commit 31edbadf4a
60 changed files with 850 additions and 1612 deletions

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/catalogs.sgml,v 2.152 2007/05/15 19:13:54 neilc Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/catalogs.sgml,v 2.153 2007/06/05 21:31:03 tgl Exp $ -->
<!--
Documentation of the system catalogs, directed toward PostgreSQL developers
-->
@ -1358,11 +1358,22 @@
</indexterm>
<para>
The catalog <structname>pg_cast</structname> stores data type conversion paths,
both built-in paths and those defined with
The catalog <structname>pg_cast</structname> stores data type conversion
paths, both built-in paths and those defined with
<xref linkend="sql-createcast" endterm="sql-createcast-title">.
</para>
<para>
It should be noted that <structname>pg_cast</structname> does not represent
every type conversion that the system knows how to perform; only those that
cannot be deduced from some generic rule. For example, casting between a
domain and its base type is not explicitly represented in
<structname>pg_cast</structname>. Another important exception is that
<quote>I/O conversion casts</>, those performed using a data type's own
I/O functions to convert to or from <type>text</> or other string types,
are not explicitly represented in <structname>pg_cast</structname>.
</para>
<table>
<title><structfield>pg_cast</> Columns</title>

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/ref/create_cast.sgml,v 1.25 2007/02/01 00:28:18 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/ref/create_cast.sgml,v 1.26 2007/06/05 21:31:04 tgl Exp $ -->
<refentry id="SQL-CREATECAST">
<refmeta>
@ -35,11 +35,11 @@ CREATE CAST (<replaceable>sourcetype</replaceable> AS <replaceable>targettype</r
specifies how to perform a conversion between
two data types. For example:
<programlisting>
SELECT CAST(42 AS text);
SELECT CAST(42 AS float8);
</programlisting>
converts the integer constant 42 to type <type>text</type> by
converts the integer constant 42 to type <type>float8</type> by
invoking a previously specified function, in this case
<literal>text(int4)</>. (If no suitable cast has been defined, the
<literal>float8(int4)</>. (If no suitable cast has been defined, the
conversion fails.)
</para>
@ -69,8 +69,7 @@ SELECT CAST(42 AS text);
INSERT INTO foo (f1) VALUES (42);
</programlisting>
will be allowed if the cast from type <type>integer</type> to type
<type>text</type> is marked <literal>AS ASSIGNMENT</>, otherwise
not.
<type>text</type> is marked <literal>AS ASSIGNMENT</>, otherwise not.
(We generally use the term <firstterm>assignment
cast</firstterm> to describe this kind of cast.)
</para>
@ -78,19 +77,37 @@ INSERT INTO foo (f1) VALUES (42);
<para>
If the cast is marked <literal>AS IMPLICIT</> then it can be invoked
implicitly in any context, whether assignment or internally in an
expression. For example, since <literal>||</> takes <type>text</>
operands,
<programlisting>
SELECT 'The time is ' || now();
</programlisting>
will be allowed only if the cast from type <type>timestamp</> to
<type>text</type> is marked <literal>AS IMPLICIT</>. Otherwise it
will be necessary to write the cast explicitly, for example:
<programlisting>
SELECT 'The time is ' || CAST(now() AS text);
</programlisting>
(We generally use the term <firstterm>implicit
expression. (We generally use the term <firstterm>implicit
cast</firstterm> to describe this kind of cast.)
For example, consider this query:
<programlisting>
SELECT 2 + 4.0;
</programlisting>
The parser initially marks the constants as being of type <type>integer</>
and <type>numeric</> respectively. There is no <type>integer</>
<literal>+</> <type>numeric</> operator in the system catalogs,
but there is a <type>numeric</> <literal>+</> <type>numeric</> operator.
The query will therefore succeed if a cast from <type>integer</> to
<type>numeric</> is available and is marked <literal>AS IMPLICIT</> &mdash;
which in fact it is. The parser will apply the implicit cast and resolve
the query as if it had been written
<programlisting>
SELECT CAST ( 2 AS numeric ) + 4.0;
</programlisting>
</para>
<para>
Now, the catalogs also provide a cast from <type>numeric</> to
<type>integer</>. If that cast were marked <literal>AS IMPLICIT</> &mdash;
which it is not &mdash; then the parser would be faced with choosing
between the above interpretation and the alternative of casting the
<type>numeric</> constant to <type>integer</> and applying the
<type>integer</> <literal>+</> <type>integer</> operator. Lacking any
knowledge of which choice to prefer, it would give up and declare the
query ambiguous. The fact that only one of the two casts is
implicit is the way in which we teach the parser to prefer resolution
of a mixed <type>numeric</>-and-<type>integer</> expression as
<type>numeric</>; there is no built-in knowledge about that.
</para>
<para>
@ -208,9 +225,7 @@ SELECT 'The time is ' || CAST(now() AS text);
argument. This is used to represent type-specific length coercion
functions in the system catalogs. The named function is used to
coerce a value of the type to the type modifier value given by its
second argument. (Since the grammar presently permits only certain
built-in data types to have type modifiers, this feature is of no
use for user-defined target types, but we mention it for completeness.)
second argument.
</para>
<para>
@ -237,6 +252,32 @@ SELECT 'The time is ' || CAST(now() AS text);
need to declare casts both ways explicitly.
</para>
<indexterm zone="sql-createcast">
<primary>cast</primary>
<secondary>I/O conversion</secondary>
</indexterm>
<para>
It is normally not necessary to create casts between user-defined types
and the standard string types (<type>text</>, <type>varchar</>, and
<type>char(<replaceable>n</>)</type>). <productname>PostgreSQL</> will
automatically handle a cast to a string type by invoking the other
type's output function, or conversely handle a cast from a string type
by invoking the other type's input function. These
automatically-provided casts are known as <firstterm>I/O conversion
casts</>. I/O conversion casts to string types are treated as
assignment casts, while I/O conversion casts from string types are
explicit-only. You can override this behavior by declaring your own
cast to replace an I/O conversion cast, but usually the only reason to
do so is if you want the conversion to be more easily invokable than the
standard assignment-only or explicit-only setting. Another possible
reason is that you want the conversion to behave differently from the
type's I/O function; but that is sufficiently surprising that you
should think twice about whether it's a good idea. (A small number of
the built-in types do indeed have different behaviors for conversions,
mostly because of requirements of the SQL standard.)
</para>
<para>
Prior to <productname>PostgreSQL</> 7.3, every function that had
the same name as a data type, returned that data type, and took one
@ -265,16 +306,20 @@ SELECT 'The time is ' || CAST(now() AS text);
<note>
<para>
There is one small lie in the preceding paragraph: there is still one
case in which <structname>pg_cast</> will be used to resolve the
meaning of an apparent function call. If a
function call <replaceable>name</>(<replaceable>x</>) matches no
actual function, but <replaceable>name</> is the name of a data type
and <structname>pg_cast</> shows a binary-compatible cast to this
type from the type of <replaceable>x</>, then the call will be construed
as an explicit cast. This exception is made so that binary-compatible
casts can be invoked using functional syntax, even though they lack
any function.
Actually the preceding paragraph is an oversimplification: there are
two cases in which a function-call construct will be treated as a cast
request without having matched it to an actual function.
If a function call <replaceable>name</>(<replaceable>x</>) does not
exactly match any existing function, but <replaceable>name</> is the name
of a data type and <structname>pg_cast</> provides a binary-compatible cast
to this type from the type of <replaceable>x</>, then the call will be
construed as a binary-compatible cast. This exception is made so that
binary-compatible casts can be invoked using functional syntax, even
though they lack any function. Likewise, if there is no
<structname>pg_cast</> entry but the cast would be to or from a string
type, the call will be construed as an I/O conversion cast. This
exception allows I/O conversion casts to be invoked using functional
syntax.
</para>
</note>
</refsect1>
@ -284,10 +329,10 @@ SELECT 'The time is ' || CAST(now() AS text);
<title>Examples</title>
<para>
To create a cast from type <type>text</type> to type
<type>int4</type> using the function <literal>int4(text)</literal>:
To create a cast from type <type>bigint</type> to type
<type>int4</type> using the function <literal>int4(bigint)</literal>:
<programlisting>
CREATE CAST (text AS int4) WITH FUNCTION int4(text);
CREATE CAST (bigint AS int4) WITH FUNCTION int4(bigint);
</programlisting>
(This cast is already predefined in the system.)
</para>

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.118 2007/05/11 17:57:11 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.119 2007/06/05 21:31:04 tgl Exp $ -->
<chapter id="sql-syntax">
<title>SQL Syntax</title>
@ -561,18 +561,18 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
The <literal>::</literal>, <literal>CAST()</literal>, and
function-call syntaxes can also be used to specify run-time type
conversions of arbitrary expressions, as discussed in <xref
linkend="sql-syntax-type-casts">. But the form
<literal><replaceable>type</replaceable> '<replaceable>string</replaceable>'</literal>
can only be used to specify the type of a literal constant.
Another restriction on
<literal><replaceable>type</replaceable> '<replaceable>string</replaceable>'</literal>
is that it does not work for array types; use <literal>::</literal>
linkend="sql-syntax-type-casts">. To avoid syntactic ambiguity, the
<literal><replaceable>type</> '<replaceable>string</>'</literal>
syntax can only be used to specify the type of a simple literal constant.
Another restriction on the
<literal><replaceable>type</> '<replaceable>string</>'</literal>
syntax is that it does not work for array types; use <literal>::</literal>
or <literal>CAST()</literal> to specify the type of an array constant.
</para>
<para>
The <literal>CAST()</> syntax conforms to SQL. The
<literal><replaceable>type</replaceable> '<replaceable>string</replaceable>'</literal>
<literal><replaceable>type</> '<replaceable>string</>'</literal>
syntax is a generalization of the standard: SQL specifies this syntax only
for a few data types, but <productname>PostgreSQL</productname> allows it
for all types. The syntax with
@ -1431,16 +1431,21 @@ CAST ( <replaceable>expression</replaceable> AS <replaceable>type</replaceable>
double-quoted, because of syntactic conflicts. Therefore, the use of
the function-like cast syntax leads to inconsistencies and should
probably be avoided in new applications.
(The function-like syntax is in fact just a function call. When
one of the two standard cast syntaxes is used to do a run-time
conversion, it will internally invoke a registered function to
perform the conversion. By convention, these conversion functions
have the same name as their output type, and thus the <quote>function-like
syntax</> is nothing more than a direct invocation of the underlying
conversion function. Obviously, this is not something that a portable
application should rely on.)
</para>
<note>
<para>
The function-like syntax is in fact just a function call. When
one of the two standard cast syntaxes is used to do a run-time
conversion, it will internally invoke a registered function to
perform the conversion. By convention, these conversion functions
have the same name as their output type, and thus the <quote>function-like
syntax</> is nothing more than a direct invocation of the underlying
conversion function. Obviously, this is not something that a portable
application should rely on. For further details see
<xref linkend="sql-createcast" endterm="sql-createcast-title">.
</para>
</note>
</sect2>
<sect2 id="sql-syntax-scalar-subqueries">

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/typeconv.sgml,v 1.51 2007/02/01 19:10:24 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/typeconv.sgml,v 1.52 2007/06/05 21:31:04 tgl Exp $ -->
<chapter Id="typeconv">
<title>Type Conversion</title>
@ -139,7 +139,8 @@ and for the <function>GREATEST</> and <function>LEAST</> functions.
The system catalogs store information about which conversions, called
<firstterm>casts</firstterm>, between data types are valid, and how to
perform those conversions. Additional casts can be added by the user
with the <command>CREATE CAST</command> command. (This is usually
with the <xref linkend="sql-createcast" endterm="sql-createcast-title">
command. (This is usually
done in conjunction with defining new data types. The set of casts
between the built-in types has been carefully crafted and is best not
altered.)
@ -336,28 +337,28 @@ Some examples follow.
</para>
<example>
<title>Exponentiation Operator Type Resolution</title>
<title>Factorial Operator Type Resolution</title>
<para>
There is only one exponentiation
operator defined in the catalog, and it takes arguments of type
<type>double precision</type>.
The scanner assigns an initial type of <type>integer</type> to both arguments
of this query expression:
There is only one factorial operator (postfix <literal>!</>)
defined in the standard catalog, and it takes an argument of type
<type>bigint</type>.
The scanner assigns an initial type of <type>integer</type> to the argument
in this query expression:
<screen>
SELECT 2 ^ 3 AS "exp";
SELECT 40 ! AS "40 factorial";
exp
-----
8
40 factorial
--------------------------------------------------
815915283247897734345611269596115894272000000000
(1 row)
</screen>
So the parser does a type conversion on both operands and the query
So the parser does a type conversion on the operand and the query
is equivalent to
<screen>
SELECT CAST(2 AS double precision) ^ CAST(3 AS double precision) AS "exp";
SELECT CAST(40 AS bigint) ! AS "40 factorial";
</screen>
</para>
</example>
@ -421,7 +422,7 @@ entries for the prefix operator <literal>@</>, all of which implement
absolute-value operations for various numeric data types. One of these
entries is for type <type>float8</type>, which is the preferred type in
the numeric category. Therefore, <productname>PostgreSQL</productname>
will use that entry when faced with a non-numeric input:
will use that entry when faced with an <type>unknown</> input:
<screen>
SELECT @ '-4.5' AS "abs";
abs
@ -429,9 +430,9 @@ SELECT @ '-4.5' AS "abs";
4.5
(1 row)
</screen>
Here the system has performed an implicit conversion from <type>text</type> to <type>float8</type>
before applying the chosen operator. We can verify that <type>float8</type> and
not some other type was used:
Here the system has implicitly resolved the unknown-type literal as type
<type>float8</type> before applying the chosen operator. We can verify that
<type>float8</type> and not some other type was used:
<screen>
SELECT @ '-4.5e500' AS "abs";
@ -447,8 +448,8 @@ try a similar case with <literal>~</>, we get:
SELECT ~ '20' AS "negation";
ERROR: operator is not unique: ~ "unknown"
HINT: Could not choose a best candidate operator. You might need to add explicit
type casts.
HINT: Could not choose a best candidate operator. You might need to add
explicit type casts.
</screen>
This happens because the system cannot decide which of the several
possible <literal>~</> operators should be preferred. We can help
@ -518,12 +519,24 @@ this step.)
<step performance="required">
<para>
If no exact match is found, see whether the function call appears
to be a trivial type conversion request. This happens if the function call
to be a special type conversion request. This happens if the function call
has just one argument and the function name is the same as the (internal)
name of some data type. Furthermore, the function argument must be either
an unknown-type literal or a type that is binary-compatible with the named
data type. When these conditions are met, the function argument is converted
to the named data type without any actual function call.
an unknown-type literal, or a type that is binary-compatible with the named
data type, or a type that could be converted to the named data type by
applying that type's I/O functions (that is, the conversion is either to or
from one of the standard string types). When these conditions are met,
the function call is treated as a form of <literal>CAST</> specification.
<footnote>
<para>
The reason for this step is to support function-style cast specifications
in cases where there is not an actual cast function. If there is a cast
function, it is conventionally named after its output type, and so there
is no need to have a special case. See
<xref linkend="sql-createcast" endterm="sql-createcast-title">
for additional commentary.
</para>
</footnote>
</para>
</step>
<step performance="required">
@ -670,30 +683,31 @@ The parser learns from the <structname>pg_cast</> catalog that
<type>text</type> and <type>varchar</type>
are binary-compatible, meaning that one can be passed to a function that
accepts the other without doing any physical conversion. Therefore, no
explicit type conversion call is really inserted in this case.
type conversion call is really inserted in this case.
</para>
</note>
</para>
<para>
And, if the function is called with an argument of type <type>integer</type>, the parser will
try to convert that to <type>text</type>:
And, if the function is called with an argument of type <type>integer</type>,
the parser will try to convert that to <type>text</type>:
<screen>
SELECT substr(1234, 3);
ERROR: function substr(integer, integer) does not exist
HINT: No function matches the given name and argument types. You might need
to add explicit type casts.
</screen>
This does not work because <type>integer</> does not have an implicit cast
to <type>text</>. An explicit cast will work, however:
<screen>
SELECT substr(CAST (1234 AS text), 3);
substr
--------
34
(1 row)
</screen>
This actually executes as
<screen>
SELECT substr(CAST (1234 AS text), 3);
</screen>
This automatic transformation can succeed because there is an
implicitly invocable cast from <type>integer</type> to
<type>text</type>.
</para>
</example>