mirror of
https://github.com/postgres/postgres.git
synced 2025-07-27 12:41:57 +03:00
Prevent to_number() from losing data when template doesn't match exactly.
Non-data template patterns would consume characters whether or not those characters were what the pattern expected, for example

    SELECT TO_NUMBER('1234', '9,999');

produced 134 because the '2' got eaten by the comma pattern. This seems undesirable, not least because it doesn't happen in Oracle. For the ',' and 'G' template patterns, we can fix this by consuming characters only if they match what the pattern would output. For non-data patterns such as 'L' and 'TH', it seems impractical to tighten things up to the point of consuming only exact matches to what the pattern would output; but we can improve matters quite a lot by redefining the behavior as "consume only characters that aren't digits, signs, decimal point, or comma".

Also, fix it so that the behavior is to consume the number of *characters* the pattern would output, not the number of *bytes*. The old coding would do surprising things with non-ASCII currency symbols, for example. (It would be good to apply that rule for literal text as well, but this commit only fixes it for non-data patterns.)

Oliver Ford, reviewed by Thomas Munro and Nathan Wagner, and whacked around a bit more by me

Discussion: https://postgr.es/m/CAGMVOdvpbMqPf9XWNzOwBpzJfErkydr_fEGhmuDGa015z97mwg@mail.gmail.com
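
The consumption rules described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not the actual C implementation: the function name `parse_simplified`, the `strict` flag, and the restriction to the `9`, `,` and `L` patterns are all hypothetical simplifications chosen for illustration. With `strict=False` every non-digit pattern eats one input character unconditionally, which roughly mimics the old behavior; with `strict=True` the comma is consumed only on an exact match and `L` only skips non-data characters.

```python
# Characters treated as "data" by the rule quoted in the commit message:
# digits, signs, decimal point, comma.
DATA_CHARS = set("0123456789+-.,")


def parse_simplified(text, template, strict=True):
    """Toy model (hypothetical) of to_number() matching for '9', ',' and 'L'.

    strict=True  : rules as described in the commit message.
    strict=False : rough imitation of the old behavior, where ',' and 'L'
                   consume one input character unconditionally.
    """
    digits = []
    pos = 0  # index in characters, not bytes (Python str is Unicode)
    for pat in template:
        if pos >= len(text):
            break
        ch = text[pos]
        if pat == "9":
            # Data pattern: take the character only if it is an ASCII digit.
            if ch in "0123456789":
                digits.append(ch)
                pos += 1
        elif pat == ",":
            # Consume only an actual comma when strict.
            if not strict or ch == ",":
                pos += 1
        elif pat == "L":
            # Non-data pattern: skip one character unless it is a data character.
            if not strict or ch not in DATA_CHARS:
                pos += 1
        else:
            raise ValueError(f"pattern {pat!r} is not modeled in this sketch")
    return int("".join(digits)) if digits else None


print(parse_simplified("1234", "9,999", strict=False))  # 134 (the '2' is eaten)
print(parse_simplified("1234", "9,999"))                # 1234
print(parse_simplified("€12", "L99"))                   # 12
```

Because the sketch indexes a Python string, the `L` pattern skips one character regardless of how many bytes the currency symbol occupies, which is the character-versus-byte distinction the message refers to.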
@@ -5850,7 +5850,10 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
 data based on the given value. Any text that is not a template pattern is
 simply copied verbatim. Similarly, in an input template string (for the
 other functions), template patterns identify the values to be supplied by
-the input data string.
+the input data string. If there are characters in the template string
+that are not template patterns, the corresponding characters in the input
+data string are simply skipped over (whether or not they are equal to the
+template string characters).
 </para>
 
 <para>
@@ -6176,13 +6179,15 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
 Ordinary text is allowed in <function>to_char</function>
 templates and will be output literally. You can put a substring
 in double quotes to force it to be interpreted as literal text
-even if it contains pattern key words. For example, in
+even if it contains template patterns. For example, in
 <literal>'"Hello Year "YYYY'</literal>, the <literal>YYYY</literal>
 will be replaced by the year data, but the single <literal>Y</literal> in <literal>Year</literal>
-will not be. In <function>to_date</function>, <function>to_number</function>,
-and <function>to_timestamp</function>, double-quoted strings skip the number of
-input characters contained in the string, e.g. <literal>"XX"</literal>
-skips two input characters.
+will not be.
+In <function>to_date</function>, <function>to_number</function>,
+and <function>to_timestamp</function>, literal text and double-quoted
+strings result in skipping the number of characters contained in the
+string; for example <literal>"XX"</literal> skips two input characters
+(whether or not they are <literal>XX</literal>).
 </para>
 </listitem>
 
@@ -6483,6 +6488,17 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
 </para>
 </listitem>
 
+<listitem>
+<para>
+In <function>to_number</function>, if non-data template patterns such
+as <literal>L</literal> or <literal>TH</literal> are used, the
+corresponding number of input characters are skipped, whether or not
+they match the template pattern, unless they are data characters
+(that is, digits, sign, decimal point, or comma). For
+example, <literal>TH</literal> would skip two non-data characters.
+</para>
+</listitem>
+
 <listitem>
 <para>
 <literal>V</literal> with <function>to_char</function>