Mirror of https://github.com/postgres/postgres.git, synced 2025-07-03 20:02:46 +03:00
Fix jsonb Unicode escape processing, and in consequence disallow \u0000.
We've been trying to support \u0000 in JSON values since commit 78ed8e03c6, and have introduced increasingly worse hacks to try to make it work, such as commit 0ad1a81632. However, it fundamentally can't work in the way envisioned, because the stored representation looks the same as for \\u0000 which is not the same thing at all. It's also entirely bogus to output \u0000 when de-escaped output is called for.

The right way to do this would be to store an actual 0x00 byte, and then throw error only if asked to produce de-escaped textual output. However, getting to that point seems likely to take considerable work and may well never be practical in the 9.4.x series.

To preserve our options for better behavior while getting rid of the nasty side-effects of 0ad1a81632, revert that commit in toto and instead throw error if \u0000 is used in a context where it needs to be de-escaped. (These are the same contexts where non-ASCII Unicode escapes throw error if the database encoding isn't UTF8, so this behavior is by no means without precedent.)

In passing, make both the \u0000 case and the non-ASCII Unicode case report ERRCODE_UNTRANSLATABLE_CHARACTER / "unsupported Unicode escape sequence" rather than claiming there's something wrong with the input syntax.

Back-patch to 9.4, where we have to do something because 0ad1a81632 broke things for many cases having nothing to do with \u0000. 9.3 also has bogus behavior, but only for that specific escape value, so given the lack of field complaints it seems better to leave 9.3 alone.
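
The ambiguity described above can be modelled outside PostgreSQL. What follows is a minimal Python sketch, not the server's implementation: the function names `deescape_with_hack` and `deescape_strict` are hypothetical, the "hack" is only a simplified model of keeping \u0000 as literal text, and `ValueError` stands in for the ERRCODE_UNTRANSLATABLE_CHARACTER error.

```python
import json

NUL = "\x00"


def deescape_with_hack(json_string_literal: str) -> str:
    """Hypothetical model of the reverted approach: decode the JSON string,
    but keep a NUL as the six literal characters backslash, u, 0, 0, 0, 0."""
    return json.loads(json_string_literal).replace(NUL, "\\u0000")


def deescape_strict(json_string_literal: str) -> str:
    """Model of the new behavior: refuse to de-escape the NUL escape.
    ValueError is a stand-in for the server-side error."""
    value = json.loads(json_string_literal)
    if NUL in value:
        raise ValueError("unsupported Unicode escape sequence")
    return value


# A genuine Unicode escape for U+0000.
nul_escape = r'"null \u0000 escape"'
# An escaped backslash followed by the plain characters u0000.
not_an_escape = r'"null \\u0000 escape"'

# Under the modelled hack, two different JSON strings collapse to one value,
# so the original input can no longer be told apart.
assert deescape_with_hack(nul_escape) == deescape_with_hack(not_an_escape)

# Under the strict model, the escaped backslash still de-escapes normally...
print(deescape_strict(not_an_escape))

# ...while the NUL escape is rejected only when de-escaping is requested.
try:
    deescape_strict(nul_escape)
except ValueError as exc:
    print("error:", exc)
```

The strict variant mirrors the regression tests in the diff: the input is accepted as JSON text, and the error appears only where a de-escaped value is needed (the `->>` cases).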
@@ -111,14 +111,6 @@ SET LOCAL TIME ZONE -8;
 select to_json(timestamptz '2014-05-28 12:22:35.614298-04');
 COMMIT;
 
--- unicode escape - backslash is not escaped
-
-select to_json(text '\uabcd');
-
--- any other backslash is escaped
-
-select to_json(text '\abcd');
-
 --json_agg
 
 SELECT json_agg(q)
@@ -401,9 +393,17 @@ select json '{ "a": "\ude04X" }' -> 'a'; -- orphan low surrogate
 
 --handling of simple unicode escapes
 
+select json '{ "a": "the Copyright \u00a9 sign" }' as correct_in_utf8;
+select json '{ "a": "dollar \u0024 character" }' as correct_everywhere;
+select json '{ "a": "dollar \\u0024 character" }' as not_an_escape;
+select json '{ "a": "null \u0000 escape" }' as not_unescaped;
+select json '{ "a": "null \\u0000 escape" }' as not_an_escape;
+
 select json '{ "a": "the Copyright \u00a9 sign" }' ->> 'a' as correct_in_utf8;
 select json '{ "a": "dollar \u0024 character" }' ->> 'a' as correct_everywhere;
-select json '{ "a": "null \u0000 escape" }' ->> 'a' as not_unescaped;
+select json '{ "a": "dollar \\u0024 character" }' ->> 'a' as not_an_escape;
+select json '{ "a": "null \u0000 escape" }' ->> 'a' as fails;
+select json '{ "a": "null \\u0000 escape" }' ->> 'a' as not_an_escape;
 
 --json_typeof() function
 select value, json_typeof(value)