Mirror of https://github.com/postgres/postgres.git (synced 2025-12-19 17:02:53 +03:00)
Make LC_COLLATE and LC_CTYPE database-level settings.

Collation and ctype are now more like encoding, stored in new datcollate and
datctype columns in pg_database.

This is a stripped-down version of Radek Strnad's patch, with further changes
by me.
@@ -1,5 +1,5 @@
 <!--
-$PostgreSQL: pgsql/doc/src/sgml/ref/create_database.sgml,v 1.48 2007/09/28 22:25:49 tgl Exp $
+$PostgreSQL: pgsql/doc/src/sgml/ref/create_database.sgml,v 1.49 2008/09/23 09:20:34 heikki Exp $
 PostgreSQL documentation
 -->
 
@@ -24,6 +24,8 @@ CREATE DATABASE <replaceable class="PARAMETER">name</replaceable>
 [ [ WITH ] [ OWNER [=] <replaceable class="parameter">dbowner</replaceable> ]
 [ TEMPLATE [=] <replaceable class="parameter">template</replaceable> ]
 [ ENCODING [=] <replaceable class="parameter">encoding</replaceable> ]
+[ COLLATE [=] <replaceable class="parameter">collate</replaceable> ]
+[ CTYPE [=] <replaceable class="parameter">ctype</replaceable> ]
 [ TABLESPACE [=] <replaceable class="parameter">tablespace</replaceable> ]
 [ CONNECTION LIMIT [=] <replaceable class="parameter">connlimit</replaceable> ] ]
 </synopsis>
@@ -112,6 +114,29 @@ CREATE DATABASE <replaceable class="PARAMETER">name</replaceable>
 </para>
 </listitem>
 </varlistentry>
+<varlistentry>
+<term><replaceable class="parameter">collate</replaceable></term>
+<listitem>
+<para>
+Collation order (<literal>LC_COLLATE</>) to use in the new database.
+This affects the sort order applied to strings, e.g in queries with
+ORDER BY, as well as the order used in indexes on text columns.
+The default is to use the collation order of the template database.
+See below for additional restrictions.
+</para>
+</listitem>
+</varlistentry>
+<varlistentry>
+<term><replaceable class="parameter">ctype</replaceable></term>
+<listitem>
+<para>
+Character classification (<literal>LC_CTYPE</>) to use in the new
+database. This affects the categorization of characters, e.g. lower,
+upper and digit. The default is to use the character classification of
+the template database. See below for additional restrictions.
+</para>
+</listitem>
+</varlistentry>
 <varlistentry>
 <term><replaceable class="parameter">tablespace</replaceable></term>
 <listitem>
@@ -180,13 +205,11 @@ CREATE DATABASE <replaceable class="PARAMETER">name</replaceable>
 </para>
 
 <para>
-Any character set encoding specified for the new database must be
-compatible with the server's <envar>LC_CTYPE</> locale setting.
+The character set encoding specified for the new database must be
+compatible with the chosen COLLATE and CTYPE settings.
 If <envar>LC_CTYPE</> is <literal>C</> (or equivalently
 <literal>POSIX</>), then all encodings are allowed, but for other
-locale settings there is only one encoding that will work properly,
-and so the apparent freedom to specify an encoding is illusory if
-you didn't initialize the database cluster in <literal>C</> locale.
+locale settings there is only one encoding that will work properly.
 <command>CREATE DATABASE</> will allow superusers to specify
 <literal>SQL_ASCII</> encoding regardless of the locale setting,
 but this choice is deprecated and may result in misbehavior of
@@ -194,6 +217,16 @@ CREATE DATABASE <replaceable class="PARAMETER">name</replaceable>
 with the locale is stored in the database.
 </para>
 
+<para>
+The <literal>COLLATE</> and <literal>CTYPE</> settings must match
+those of the template database, except when template0 is used as
+template. This is because <literal>COLLATE</> and <literal>CTYPE</>
+affects the ordering in indexes, so that any indexes copied from the
+template database would be invalid in the new database with different
+settings. <literal>template0</literal>, however, is known to not
+contain any indexes that would be affected.
+</para>
+
 <para>
 The <literal>CONNECTION LIMIT</> option is only enforced approximately;
 if two new sessions start at about the same time when just one
@@ -1,5 +1,5 @@
 <!--
-$PostgreSQL: pgsql/doc/src/sgml/ref/initdb.sgml,v 1.43 2007/03/26 17:23:36 tgl Exp $
+$PostgreSQL: pgsql/doc/src/sgml/ref/initdb.sgml,v 1.44 2008/09/23 09:20:34 heikki Exp $
 PostgreSQL documentation
 -->
 
@@ -76,25 +76,34 @@ PostgreSQL documentation
 
 <para>
 <command>initdb</command> initializes the database cluster's default
-locale and character set encoding. The collation order
-(<literal>LC_COLLATE</>) and character set classes
-(<literal>LC_CTYPE</>, e.g. upper, lower, digit) are fixed for all
-databases and cannot be changed. Collation orders other than
-<literal>C</> or <literal>POSIX</> also have a performance penalty.
-For these reasons it is important to choose the right locale when
-running <command>initdb</command>. The remaining locale categories
-can be changed later when the server is started. All server locale
-values (<literal>lc_*</>) can be displayed via <command>SHOW ALL</>.
+locale and character set encoding. The character set encoding,
+collation order (<literal>LC_COLLATE</>) and character set classes
+(<literal>LC_CTYPE</>, e.g. upper, lower, digit) can be set separately
+for a database when it is created. <command>initdb</command> determines
+those settings for the <literal>template1</literal> database, which will
+serve as the default for all other databases.
+</para>
+
+<para>
+To alter the default collation order or character set classes, use the
+<option>--lc-collate</option> and <option>--lc-ctype</option> options.
+Collation orders other than <literal>C</> or <literal>POSIX</> also have
+a performance penalty. For these reasons it is important to choose the
+right locale when running <command>initdb</command>.
+</para>
+
+<para>
+The remaining locale categories can be changed later when the server
+is started. You can also use <option>--locale</option> to set the
+default for all locale categories, including collation order and
+character set classes. All server locale values (<literal>lc_*</>) can
+be displayed via <command>SHOW ALL</>.
 More details can be found in <xref linkend="locale">.
 </para>
 
 <para>
-The character set encoding can be set separately for a database when
-it is created. <command>initdb</command> determines the encoding for
-the <literal>template1</literal> database, which will serve as the
-default for all other databases. To alter the default encoding use
-the <option>--encoding</option> option. More details can be found in
-<xref linkend="multibyte">.
+To alter the default encoding, use the <option>--encoding</option>.
+More details can be found in <xref linkend="multibyte">.
 </para>
 
 </refsect1>
@@ -1,5 +1,5 @@
 <!--
-$PostgreSQL: pgsql/doc/src/sgml/ref/pg_controldata.sgml,v 1.10 2007/02/20 18:10:58 momjian Exp $
+$PostgreSQL: pgsql/doc/src/sgml/ref/pg_controldata.sgml,v 1.11 2008/09/23 09:20:35 heikki Exp $
 PostgreSQL documentation
 -->
 
@@ -30,7 +30,7 @@ PostgreSQL documentation
 <title>Description</title>
 <para>
 <command>pg_controldata</command> prints information initialized during
-<command>initdb</>, such as the catalog version and server locale.
+<command>initdb</>, such as the catalog version.
 It also shows information about write-ahead logging and checkpoint
 processing. This information is cluster-wide, and not specific to any one
 database.
@@ -1,5 +1,5 @@
 <!--
-$PostgreSQL: pgsql/doc/src/sgml/ref/pg_resetxlog.sgml,v 1.20 2007/01/31 23:26:04 momjian Exp $
+$PostgreSQL: pgsql/doc/src/sgml/ref/pg_resetxlog.sgml,v 1.21 2008/09/23 09:20:35 heikki Exp $
 PostgreSQL documentation
 -->
 
@@ -62,14 +62,10 @@ PostgreSQL documentation
 by specifying the <literal>-f</> (force) switch. In this case plausible
 values will be substituted for the missing data. Most of the fields can be
 expected to match, but manual assistance might be needed for the next OID,
-next transaction ID and epoch, next multitransaction ID and offset,
-WAL starting address, and database locale fields.
-The first six of these can be set using the switches discussed below.
-<command>pg_resetxlog</command>'s own environment is the source for its
-guess at the locale fields; take care that <envar>LANG</> and so forth
-match the environment that <command>initdb</> was run in.
-If you are not able to determine correct values for all these fields,
-<literal>-f</> can still be used, but
+next transaction ID and epoch, next multitransaction ID and offset, and
+WAL starting address fields. These fields can be set using the switches
+discussed below. If you are not able to determine correct values for all
+these fields, <literal>-f</> can still be used, but
 the recovered database must be treated with even more suspicion than
 usual: an immediate dump and reload is imperative. <emphasis>Do not</>
 execute any data-modifying operations in the database before you dump,
@@ -1,5 +1,5 @@
 <!--
-$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.103 2008/02/15 22:17:06 tgl Exp $
+$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.104 2008/09/23 09:20:35 heikki Exp $
 PostgreSQL documentation
 -->
 
@@ -747,8 +747,7 @@ SELECT name FROM distributors ORDER BY code;
 
 <para>
 Character-string data is sorted according to the locale-specific
-collation order that was established when the database cluster
-was initialized.
+collation order that was established when the database was created.
 </para>
 </refsect2>
 
@@ -1,5 +1,5 @@
 <!--
-$PostgreSQL: pgsql/doc/src/sgml/ref/show.sgml,v 1.45 2008/01/03 21:23:15 tgl Exp $
+$PostgreSQL: pgsql/doc/src/sgml/ref/show.sgml,v 1.46 2008/09/23 09:20:35 heikki Exp $
 PostgreSQL documentation
 -->
 
@@ -82,8 +82,8 @@ SHOW ALL
 <para>
 Shows the database's locale setting for collation (text
 ordering). At present, this parameter can be shown but not
-set, because the setting is determined at
-<command>initdb</> time.
+set, because the setting is determined at database creation
+time.
 </para>
 </listitem>
 </varlistentry>
@@ -94,8 +94,8 @@ SHOW ALL
 <para>
 Shows the database's locale setting for character
 classification. At present, this parameter can be shown but
-not set, because the setting is determined at
-<command>initdb</> time.
+not set, because the setting is determined at database creation
+time.
 </para>
 </listitem>
 </varlistentry>
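
Illustration (not part of the commit): the documentation changes above describe per-database COLLATE and CTYPE options, the requirement to use template0 when they differ from the template's settings, and the new datcollate and datctype columns. A minimal sketch of how these might be exercised follows. The database name and the locale name are hypothetical, the locale must exist on the server's operating system, and the option spelling follows the synopsis in this commit, so it may differ in other server versions.

```sql
-- Sketch only: "sales_sv" and 'sv_SE.UTF-8' are hypothetical examples.
-- template0 is used because COLLATE/CTYPE may differ from template1's
-- settings, and indexes copied from template1 would then be invalid.
CREATE DATABASE sales_sv
    WITH TEMPLATE = template0
         ENCODING = 'UTF8'
         COLLATE = 'sv_SE.UTF-8'
         CTYPE = 'sv_SE.UTF-8';

-- The settings are stored per database in pg_database.
SELECT datname, datcollate, datctype
  FROM pg_database
 ORDER BY datname;

-- After connecting to the new database, the read-only parameters
-- report the values chosen at database creation time.
SHOW lc_collate;
SHOW lc_ctype;
```

The encoding given must be compatible with the chosen locale, as the create_database.sgml text in the diff states; with the C or POSIX locale any encoding is accepted.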