mirror of https://github.com/postgres/postgres.git synced 2025-07-28 23:42:10 +03:00

Make collation-aware system catalog columns use "C" collation.

Up to now we allowed text columns in system catalogs to use collation
"default", but that isn't really safe because it might mean something
different in template0 than it means in a database cloned from template0.
In particular, this could mean that cloned pg_statistic entries for such
columns weren't entirely valid, possibly leading to bogus planner
estimates, though (probably) not any outright failures.

In the wake of commit 5e0928005, a better solution is available: if we
label such columns with "C" collation, then their pg_statistic entries
will also use that collation and hence will be valid independently of
the database collation.
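
As a rough illustration (not part of the patch), the effect can be checked
from SQL. This is a minimal sketch that assumes a server containing this
change; pg_description is used only as an example catalog, and the second
query returns a row only if that catalog has been analyzed.

```sql
-- Show each collatable column of pg_description with its declared collation.
-- With this change the expectation is "C" rather than "default".
SELECT a.attname, c.collname
FROM pg_attribute a
JOIN pg_collation c ON c.oid = a.attcollation
WHERE a.attrelid = 'pg_catalog.pg_description'::regclass
  AND a.attnum > 0;

-- Look at the stored histogram for the text column.  Because the column is
-- labeled "C", these bounds do not depend on the database collation.
SELECT attname, histogram_bounds
FROM pg_stats
WHERE schemaname = 'pg_catalog'
  AND tablename = 'pg_description'
  AND attname = 'description';
```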

This also provides a cleaner solution for indexes on such columns than
the hack added by commit 0b28ea79c: the indexes will naturally inherit
"C" collation and don't have to be forced to use text_pattern_ops.

Also, with the planned improvement of type "name" to be collation-aware,
this policy will apply cleanly to both text and name columns.

Because of the pg_statistic angle, we should also apply this policy
to the tables in information_schema.  This patch does that by adjusting
information_schema's textual domain types to specify "C" collation.
That has the user-visible effect that order-sensitive comparisons to
textual information_schema view columns will now use "C" collation
by default.  The SQL standard says that the collation of those view
columns is implementation-defined, so I think this is legal per spec.
At some point this might allow for translation of such comparisons
into indexable conditions on the underlying "name" columns, although
additional work will be needed before that can happen.
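
To make the user-visible effect concrete, here is a small sketch (an
illustration only, not taken from the patch; the 'public' schema filter is
just an assumed example). The first query picks up the "C" collation of the
view column, while the second uses an explicit COLLATE clause to ask for the
database's default collation instead.

```sql
-- Ordering now follows the "C" collation attached to the view column.
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
ORDER BY table_name;

-- An explicit COLLATE clause overrides the column's collation, here
-- requesting the database default ordering.
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
ORDER BY table_name COLLATE "default";
```

The two orderings can differ whenever the database collation does not sort
in byte order, for example with mixed-case identifiers.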

Discussion: https://postgr.es/m/19346.1544895309@sss.pgh.pa.us
Tom Lane
2018-12-18 12:48:15 -05:00
parent b2d9e17768
commit 6b0faf7236
9 changed files with 60 additions and 32 deletions

@@ -2060,18 +2060,25 @@ ORDER BY 1;
 -- a representational error in pg_index, but simply wrong catalog design.
 -- It's bad because we expect to be able to clone template0 and assign the
 -- copy a different database collation. It would especially not work for
--- shared catalogs. Note that although text columns will show a collation
--- in indcollation, they're still okay to index with text_pattern_ops,
--- so allow that case.
+-- shared catalogs.
+SELECT relname, attname, attcollation
+FROM pg_class c, pg_attribute a
+WHERE c.oid = attrelid AND c.oid < 16384 AND
+    c.relkind != 'v' AND -- we don't care about columns in views
+    attcollation != 0 AND
+    attcollation != (SELECT oid FROM pg_collation WHERE collname = 'C');
+ relname | attname | attcollation
+---------+---------+--------------
+(0 rows)
+
+-- Double-check that collation-sensitive indexes have "C" collation, too.
 SELECT indexrelid::regclass, indrelid::regclass, iclass, icoll
 FROM (SELECT indexrelid, indrelid,
              unnest(indclass) as iclass, unnest(indcollation) as icoll
       FROM pg_index
       WHERE indrelid < 16384) ss
-WHERE icoll != 0 AND iclass !=
-    (SELECT oid FROM pg_opclass
-     WHERE opcname = 'text_pattern_ops' AND opcmethod =
-           (SELECT oid FROM pg_am WHERE amname = 'btree'));
+WHERE icoll != 0 AND
+    icoll != (SELECT oid FROM pg_collation WHERE collname = 'C');
  indexrelid | indrelid | iclass | icoll
 ------------+----------+--------+-------
 (0 rows)

@@ -1333,16 +1333,21 @@ ORDER BY 1;
 -- a representational error in pg_index, but simply wrong catalog design.
 -- It's bad because we expect to be able to clone template0 and assign the
 -- copy a different database collation. It would especially not work for
--- shared catalogs. Note that although text columns will show a collation
--- in indcollation, they're still okay to index with text_pattern_ops,
--- so allow that case.
+-- shared catalogs.
+
+SELECT relname, attname, attcollation
+FROM pg_class c, pg_attribute a
+WHERE c.oid = attrelid AND c.oid < 16384 AND
+    c.relkind != 'v' AND -- we don't care about columns in views
+    attcollation != 0 AND
+    attcollation != (SELECT oid FROM pg_collation WHERE collname = 'C');
+
+-- Double-check that collation-sensitive indexes have "C" collation, too.
 
 SELECT indexrelid::regclass, indrelid::regclass, iclass, icoll
 FROM (SELECT indexrelid, indrelid,
              unnest(indclass) as iclass, unnest(indcollation) as icoll
       FROM pg_index
       WHERE indrelid < 16384) ss
-WHERE icoll != 0 AND iclass !=
-    (SELECT oid FROM pg_opclass
-     WHERE opcname = 'text_pattern_ops' AND opcmethod =
-           (SELECT oid FROM pg_am WHERE amname = 'btree'));
+WHERE icoll != 0 AND
+    icoll != (SELECT oid FROM pg_collation WHERE collname = 'C');