mirror of https://github.com/postgres/postgres.git synced 2025-08-08 06:02:22 +03:00

Handle Unicode surrogate pairs correctly when processing JSON.

In 9.2, Unicode escape sequences are not analysed at all other than
to make sure that they are in the form \uXXXX. But in 9.3 many of the
new operators and functions try to turn JSON text values into text in
the server encoding, and this includes de-escaping Unicode escape
sequences. That processing did not take into account that the text
might contain a surrogate pair designating a character outside the
BMP. That is now handled correctly.

This also enforces correct use of surrogate pairs, something that is not
done by the type's input routines. This fact is noted in the docs.
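
As an illustration only, here is a minimal sketch of how a surrogate pair
taken from two consecutive \uXXXX escapes can be validated and combined into
a single code point. The helper names (is_high_surrogate, is_low_surrogate,
combine_surrogates) are hypothetical and are not taken from the PostgreSQL
source; only the arithmetic follows the standard UTF-16 definition.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helpers for illustration; not the names used in json.c. */
static int
is_high_surrogate(uint32_t u)
{
    return u >= 0xD800 && u <= 0xDBFF;
}

static int
is_low_surrogate(uint32_t u)
{
    return u >= 0xDC00 && u <= 0xDFFF;
}

/*
 * Combine a high/low surrogate pair into one code point outside the BMP.
 * Returns 0 on success, or -1 if the two values are not a valid pair
 * (for example a high surrogate followed by an ordinary character, or a
 * low surrogate appearing first).
 */
static int
combine_surrogates(uint32_t hi, uint32_t lo, uint32_t *cp)
{
    assert(cp != NULL);

    if (!is_high_surrogate(hi) || !is_low_surrogate(lo))
        return -1;

    /* Each surrogate carries 10 bits; the result is offset by 0x10000. */
    *cp = 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
    return 0;
}
```

Under these assumptions, the escapes \ud83d\ude00 combine to U+1F600, while
\ud83d followed by a non-surrogate such as \u0041 is rejected. That is the
kind of stricter check the message describes for the functions and
operators, as opposed to the type's input routines.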
Andrew Dunstan
2013-06-08 09:12:48 -04:00
parent c99d5d1bcc
commit 94e3311b97
4 changed files with 92 additions and 0 deletions

@@ -10150,6 +10150,15 @@ table2-mapping
</tgroup>
</table>
<note>
<para>
The <type>json</type> functions and operators can impose stricter validity requirements
than the type's input functions. In particular, they check much more closely that any use
of Unicode surrogate pairs to designate characters outside the Unicode Basic Multilingual
Plane is correct.
</para>
</note>
<note>
<para>
The <xref linkend="hstore"> extension has a cast from <type>hstore</type> to