mirror of https://github.com/postgres/postgres.git synced 2025-12-19 17:02:53 +03:00

Make LC_COLLATE and LC_CTYPE database-level settings. Collation and
ctype are now more like encoding, stored in new datcollate and datctype
columns in pg_database.

This is a stripped-down version of Radek Strnad's patch, with further
changes by me.
This commit is contained in:
Heikki Linnakangas
2008-09-23 09:20:39 +00:00
parent c52aab5525
commit 61d9674988
30 changed files with 440 additions and 248 deletions
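
As a rough illustration of the behavior described in the commit message, the sketch below creates a database with its own collation and character classification and then reads the new catalog columns back. It follows the `COLLATE` / `CTYPE` option spelling shown in this commit's synopsis; released PostgreSQL versions spell these options `LC_COLLATE` and `LC_CTYPE`. The database name `sales_sv` and the locale name `sv_SE.UTF-8` are assumptions made for the example, and the locale must actually exist on the server's operating system.

```sql
-- Hypothetical database name and locale, chosen only for illustration.
-- template0 is used because the documentation in this commit says that
-- COLLATE and CTYPE must otherwise match the template database.
CREATE DATABASE sales_sv
    WITH TEMPLATE = template0
         ENCODING = 'UTF8'
         COLLATE = 'sv_SE.UTF-8'
         CTYPE = 'sv_SE.UTF-8';

-- Inspect the per-database settings stored in the new pg_database columns.
SELECT datname,
       pg_encoding_to_char(encoding) AS encoding,
       datcollate,
       datctype
FROM pg_database
ORDER BY datname;
```

The encoding is given explicitly because, per the updated documentation, it has to be compatible with the chosen collation and ctype settings.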

@@ -1,5 +1,5 @@
<!--
$PostgreSQL: pgsql/doc/src/sgml/ref/create_database.sgml,v 1.48 2007/09/28 22:25:49 tgl Exp $
$PostgreSQL: pgsql/doc/src/sgml/ref/create_database.sgml,v 1.49 2008/09/23 09:20:34 heikki Exp $
PostgreSQL documentation
-->
@@ -24,6 +24,8 @@ CREATE DATABASE <replaceable class="PARAMETER">name</replaceable>
[ [ WITH ] [ OWNER [=] <replaceable class="parameter">dbowner</replaceable> ]
[ TEMPLATE [=] <replaceable class="parameter">template</replaceable> ]
[ ENCODING [=] <replaceable class="parameter">encoding</replaceable> ]
[ COLLATE [=] <replaceable class="parameter">collate</replaceable> ]
[ CTYPE [=] <replaceable class="parameter">ctype</replaceable> ]
[ TABLESPACE [=] <replaceable class="parameter">tablespace</replaceable> ]
[ CONNECTION LIMIT [=] <replaceable class="parameter">connlimit</replaceable> ] ]
</synopsis>
@@ -112,6 +114,29 @@ CREATE DATABASE <replaceable class="PARAMETER">name</replaceable>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">collate</replaceable></term>
<listitem>
<para>
Collation order (<literal>LC_COLLATE</>) to use in the new database.
This affects the sort order applied to strings, e.g. in queries with
ORDER BY, as well as the order used in indexes on text columns.
The default is to use the collation order of the template database.
See below for additional restrictions.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">ctype</replaceable></term>
<listitem>
<para>
Character classification (<literal>LC_CTYPE</>) to use in the new
database. This affects the categorization of characters, e.g. lower,
upper and digit. The default is to use the character classification of
the template database. See below for additional restrictions.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">tablespace</replaceable></term>
<listitem>
@@ -180,13 +205,11 @@ CREATE DATABASE <replaceable class="PARAMETER">name</replaceable>
</para>
<para>
Any character set encoding specified for the new database must be
compatible with the server's <envar>LC_CTYPE</> locale setting.
The character set encoding specified for the new database must be
compatible with the chosen COLLATE and CTYPE settings.
If <envar>LC_CTYPE</> is <literal>C</> (or equivalently
<literal>POSIX</>), then all encodings are allowed, but for other
locale settings there is only one encoding that will work properly,
and so the apparent freedom to specify an encoding is illusory if
you didn't initialize the database cluster in <literal>C</> locale.
locale settings there is only one encoding that will work properly.
<command>CREATE DATABASE</> will allow superusers to specify
<literal>SQL_ASCII</> encoding regardless of the locale setting,
but this choice is deprecated and may result in misbehavior of
@@ -194,6 +217,16 @@ CREATE DATABASE <replaceable class="PARAMETER">name</replaceable>
with the locale is stored in the database.
</para>
<para>
The <literal>COLLATE</> and <literal>CTYPE</> settings must match
those of the template database, except when template0 is used as
template. This is because <literal>COLLATE</> and <literal>CTYPE</>
affect the ordering in indexes, so that any indexes copied from the
template database would be invalid in the new database with different
settings. <literal>template0</literal>, however, is known to not
contain any indexes that would be affected.
</para>
<para>
The <literal>CONNECTION LIMIT</> option is only enforced approximately;
if two new sessions start at about the same time when just one

@@ -1,5 +1,5 @@
<!--
$PostgreSQL: pgsql/doc/src/sgml/ref/initdb.sgml,v 1.43 2007/03/26 17:23:36 tgl Exp $
$PostgreSQL: pgsql/doc/src/sgml/ref/initdb.sgml,v 1.44 2008/09/23 09:20:34 heikki Exp $
PostgreSQL documentation
-->
@@ -76,25 +76,34 @@ PostgreSQL documentation
<para>
<command>initdb</command> initializes the database cluster's default
locale and character set encoding. The collation order
(<literal>LC_COLLATE</>) and character set classes
(<literal>LC_CTYPE</>, e.g. upper, lower, digit) are fixed for all
databases and cannot be changed. Collation orders other than
<literal>C</> or <literal>POSIX</> also have a performance penalty.
For these reasons it is important to choose the right locale when
running <command>initdb</command>. The remaining locale categories
can be changed later when the server is started. All server locale
values (<literal>lc_*</>) can be displayed via <command>SHOW ALL</>.
locale and character set encoding. The character set encoding,
collation order (<literal>LC_COLLATE</>) and character set classes
(<literal>LC_CTYPE</>, e.g. upper, lower, digit) can be set separately
for a database when it is created. <command>initdb</command> determines
those settings for the <literal>template1</literal> database, which will
serve as the default for all other databases.
</para>
<para>
To alter the default collation order or character set classes, use the
<option>--lc-collate</option> and <option>--lc-ctype</option> options.
Collation orders other than <literal>C</> or <literal>POSIX</> also have
a performance penalty. For these reasons it is important to choose the
right locale when running <command>initdb</command>.
</para>
<para>
The remaining locale categories can be changed later when the server
is started. You can also use <option>--locale</option> to set the
default for all locale categories, including collation order and
character set classes. All server locale values (<literal>lc_*</>) can
be displayed via <command>SHOW ALL</>.
More details can be found in <xref linkend="locale">.
</para>
<para>
The character set encoding can be set separately for a database when
it is created. <command>initdb</command> determines the encoding for
the <literal>template1</literal> database, which will serve as the
default for all other databases. To alter the default encoding use
the <option>--encoding</option> option. More details can be found in
<xref linkend="multibyte">.
To alter the default encoding, use the <option>--encoding</option> option.
More details can be found in <xref linkend="multibyte">.
</para>
</refsect1>

@@ -1,5 +1,5 @@
<!--
$PostgreSQL: pgsql/doc/src/sgml/ref/pg_controldata.sgml,v 1.10 2007/02/20 18:10:58 momjian Exp $
$PostgreSQL: pgsql/doc/src/sgml/ref/pg_controldata.sgml,v 1.11 2008/09/23 09:20:35 heikki Exp $
PostgreSQL documentation
-->
@@ -30,7 +30,7 @@ PostgreSQL documentation
<title>Description</title>
<para>
<command>pg_controldata</command> prints information initialized during
<command>initdb</>, such as the catalog version and server locale.
<command>initdb</>, such as the catalog version.
It also shows information about write-ahead logging and checkpoint
processing. This information is cluster-wide, and not specific to any one
database.

@@ -1,5 +1,5 @@
<!--
$PostgreSQL: pgsql/doc/src/sgml/ref/pg_resetxlog.sgml,v 1.20 2007/01/31 23:26:04 momjian Exp $
$PostgreSQL: pgsql/doc/src/sgml/ref/pg_resetxlog.sgml,v 1.21 2008/09/23 09:20:35 heikki Exp $
PostgreSQL documentation
-->
@@ -62,14 +62,10 @@ PostgreSQL documentation
by specifying the <literal>-f</> (force) switch. In this case plausible
values will be substituted for the missing data. Most of the fields can be
expected to match, but manual assistance might be needed for the next OID,
next transaction ID and epoch, next multitransaction ID and offset,
WAL starting address, and database locale fields.
The first six of these can be set using the switches discussed below.
<command>pg_resetxlog</command>'s own environment is the source for its
guess at the locale fields; take care that <envar>LANG</> and so forth
match the environment that <command>initdb</> was run in.
If you are not able to determine correct values for all these fields,
<literal>-f</> can still be used, but
next transaction ID and epoch, next multitransaction ID and offset, and
WAL starting address fields. These fields can be set using the switches
discussed below. If you are not able to determine correct values for all
these fields, <literal>-f</> can still be used, but
the recovered database must be treated with even more suspicion than
usual: an immediate dump and reload is imperative. <emphasis>Do not</>
execute any data-modifying operations in the database before you dump,

@@ -1,5 +1,5 @@
<!--
$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.103 2008/02/15 22:17:06 tgl Exp $
$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.104 2008/09/23 09:20:35 heikki Exp $
PostgreSQL documentation
-->
@@ -747,8 +747,7 @@ SELECT name FROM distributors ORDER BY code;
<para>
Character-string data is sorted according to the locale-specific
collation order that was established when the database cluster
was initialized.
collation order that was established when the database was created.
</para>
</refsect2>

@@ -1,5 +1,5 @@
<!--
$PostgreSQL: pgsql/doc/src/sgml/ref/show.sgml,v 1.45 2008/01/03 21:23:15 tgl Exp $
$PostgreSQL: pgsql/doc/src/sgml/ref/show.sgml,v 1.46 2008/09/23 09:20:35 heikki Exp $
PostgreSQL documentation
-->
@@ -82,8 +82,8 @@ SHOW ALL
<para>
Shows the database's locale setting for collation (text
ordering). At present, this parameter can be shown but not
set, because the setting is determined at
<command>initdb</> time.
set, because the setting is determined at database creation
time.
</para>
</listitem>
</varlistentry>
@@ -94,8 +94,8 @@ SHOW ALL
<para>
Shows the database's locale setting for character
classification. At present, this parameter can be shown but
not set, because the setting is determined at
<command>initdb</> time.
not set, because the setting is determined at database creation
time.
</para>
</listitem>
</varlistentry>