mirror of
https://github.com/postgres/postgres.git
synced 2025-07-28 23:42:10 +03:00
Introduce "builtin" collation provider.

New provider for collations, like "libc" or "icu", but without any
external dependency.

Initially, the only locale supported by the builtin provider is "C",
which is identical to the libc provider's "C" locale. The libc
provider's "C" locale has always been treated as a special case that
uses an internal implementation, without using libc at all -- so the
new builtin provider uses the same implementation.

The builtin provider's locale is independent of the server environment
variables LC_COLLATE and LC_CTYPE. Using the builtin provider, the
database collation locale can be "C" while LC_COLLATE and LC_CTYPE are
set to "en_US", which is impossible with the libc provider.

By offering a new builtin provider, it clarifies that the semantics of
a collation using this provider will never depend on libc, and makes
it easier to document the behavior.

Discussion: https://postgr.es/m/ab925f69-5f9d-f85e-b87c-bd2a44798659@joeconway.com
Discussion: https://postgr.es/m/dd9261f4-7a98-4565-93ec-336c1c110d90@manitou-mail.org
Discussion: https://postgr.es/m/ff4c2f2f9c8fc7ca27c1c24ae37ecaeaeaff6b53.camel%40j-davis.com
Reviewed-by: Daniel Vérité, Peter Eisentraut, Jeremy Schneider
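
Illustration (not part of the commit): the "C" locale orders strings by the byte values of their encoded form, which is why the documentation changes below note that the behavior may depend on the database encoding. The following is a minimal Python sketch of that idea, not PostgreSQL code. The helper name c_locale_sort and the sample strings are hypothetical, and Python codec names stand in for database encodings.

```python
def c_locale_sort(strings, encoding="utf-8"):
    """Sort strings by the raw bytes of their encoded form.

    This mimics byte-wise ("C" locale) comparison; it is an illustrative
    stand-in, not the server's implementation.
    """
    return sorted(strings, key=lambda s: s.encode(encoding))


# Uppercase ASCII letters have smaller byte values than lowercase ones,
# so they sort first regardless of alphabetical order.
words = ["banana", "Apple", "cherry", "apple", "Zebra"]
print(c_locale_sort(words))
# ['Apple', 'Zebra', 'apple', 'banana', 'cherry']

# The same two characters can compare differently under another encoding:
# in UTF-8, "é" is C3 A9 and "€" is E2 82 AC; in ISO-8859-15, "€" is A4
# and "é" is E9.
print(c_locale_sort(["é", "€"], "utf-8"))        # ['é', '€']
print(c_locale_sort(["é", "€"], "iso-8859-15"))  # ['€', 'é']
```

The result depends only on the encoding passed in, never on LC_COLLATE or LC_CTYPE, which matches the independence from the server environment described in the commit message.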
@@ -342,22 +342,14 @@ initdb --locale=sv_SE
 <title>Locale Providers</title>
 
 <para>
-<productname>PostgreSQL</productname> supports multiple <firstterm>locale
-providers</firstterm>. This specifies which library supplies the locale
-data. One standard provider name is <literal>libc</literal>, which uses
-the locales provided by the operating system C library. These are the
-locales used by most tools provided by the operating system. Another
-provider is <literal>icu</literal>, which uses the external
-ICU<indexterm><primary>ICU</primary></indexterm> library. ICU locales can
-only be used if support for ICU was configured when PostgreSQL was built.
+A locale provider specifies which library defines the locale behavior for
+collations and character classifications.
 </para>
 
 <para>
 The commands and tools that select the locale settings, as described
-above, each have an option to select the locale provider. The examples
-shown earlier all use the <literal>libc</literal> provider, which is the
-default. Here is an example to initialize a database cluster using the
-ICU provider:
+above, each have an option to select the locale provider. Here is an
+example to initialize a database cluster using the ICU provider:
 <programlisting>
 initdb --locale-provider=icu --icu-locale=en
 </programlisting>
@@ -370,12 +362,76 @@ initdb --locale-provider=icu --icu-locale=en
 </para>
 
 <para>
-Which locale provider to use depends on individual requirements. For most
-basic uses, either provider will give adequate results. For the libc
-provider, it depends on what the operating system offers; some operating
-systems are better than others. For advanced uses, ICU offers more locale
-variants and customization options.
+Regardless of the locale provider, the operating system is still used to
+provide some locale-aware behavior, such as messages (see <xref
+linkend="guc-lc-messages"/>).
 </para>
+
+<para>
+The available locale providers are listed below:
+</para>
+
+<variablelist>
+<varlistentry>
+<term><literal>builtin</literal></term>
+<listitem>
+<para>
+The <literal>builtin</literal> provider uses built-in operations. Only
+the <literal>C</literal> locale is supported for this provider.
+</para>
+<para>
+The <literal>C</literal> locale behavior is identical to the
+<literal>C</literal> locale in the libc provider. When using this
+locale, the behavior may depend on the database encoding.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><literal>icu</literal></term>
+<listitem>
+<para>
+The <literal>icu</literal> provider uses the external
+ICU<indexterm><primary>ICU</primary></indexterm>
+library. <productname>PostgreSQL</productname> must have been
+configured with support.
+</para>
+<para>
+ICU provides collation and character classification behavior that is
+independent of the operating system and database encoding, which is
+preferable if you expect to transition to other platforms without any
+change in results. <literal>LC_COLLATE</literal> and
+<literal>LC_CTYPE</literal> can be set independently of the ICU
+locale.
+</para>
+<note>
+<para>
+For the ICU provider, results may depend on the version of the ICU
+library used, as it is updated to reflect changes in natural language
+over time.
+</para>
+</note>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><literal>libc</literal></term>
+<listitem>
+<para>
+The <literal>libc</literal> provider uses the operating system's C
+library. The collation and character classification behavior is
+controlled by the settings <literal>LC_COLLATE</literal> and
+<literal>LC_CTYPE</literal>, so they cannot be set independently.
+</para>
+<note>
+<para>
+The same locale name may have different behavior on different
+platforms when using the libc provider.
+</para>
+</note>
+</listitem>
+</varlistentry>
+</variablelist>
 </sect2>
 
 <sect2 id="icu-locales">

@@ -96,6 +96,11 @@ CREATE COLLATION [ IF NOT EXISTS ] <replaceable>name</replaceable> FROM <replace
 <replaceable>locale</replaceable>, you cannot specify either of those
 parameters.
 </para>
+<para>
+If <replaceable>provider</replaceable> is <literal>builtin</literal>,
+then <replaceable>locale</replaceable> must be specified and set to
+<literal>C</literal>.
+</para>
 </listitem>
 </varlistentry>
 
@@ -129,9 +134,9 @@ CREATE COLLATION [ IF NOT EXISTS ] <replaceable>name</replaceable> FROM <replace
 <listitem>
 <para>
 Specifies the provider to use for locale services associated with this
-collation. Possible values are
-<literal>icu</literal><indexterm><primary>ICU</primary></indexterm>
-(if the server was built with ICU support) or <literal>libc</literal>.
+collation. Possible values are <literal>builtin</literal>,
+<literal>icu</literal><indexterm><primary>ICU</primary></indexterm> (if
+the server was built with ICU support) or <literal>libc</literal>.
 <literal>libc</literal> is the default. See <xref
 linkend="locale-providers"/> for details.
 </para>

@@ -162,6 +162,11 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
 linkend="create-database-lc-ctype"/>, or <xref
 linkend="create-database-icu-locale"/> individually.
 </para>
+<para>
+If <xref linkend="create-database-locale-provider"/> is
+<literal>builtin</literal>, then <replaceable>locale</replaceable>
+must be specified and set to <literal>C</literal>.
+</para>
 <tip>
 <para>
 The other locale settings <xref linkend="guc-lc-messages"/>, <xref
@@ -243,7 +248,7 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
 <listitem>
 <para>
 Specifies the provider to use for the default collation in this
-database. Possible values are
+database. Possible values are <literal>builtin</literal>,
 <literal>icu</literal><indexterm><primary>ICU</primary></indexterm>
 (if the server was built with ICU support) or <literal>libc</literal>.
 By default, the provider is the same as that of the <xref

@@ -171,7 +171,7 @@ PostgreSQL documentation
 </varlistentry>
 
 <varlistentry>
-<term><option>--locale-provider={<literal>libc</literal>|<literal>icu</literal>}</option></term>
+<term><option>--locale-provider={<literal>builtin</literal>|<literal>libc</literal>|<literal>icu</literal>}</option></term>
 <listitem>
 <para>
 Specifies the locale provider for the database's default collation.

@@ -286,6 +286,11 @@ PostgreSQL documentation
 environment that <command>initdb</command> runs in. Locale
 support is described in <xref linkend="locale"/>.
 </para>
+<para>
+If <option>--locale-provider</option> is <literal>builtin</literal>,
+<option>--locale</option> must be specified and set to
+<literal>C</literal>.
+</para>
 </listitem>
 </varlistentry>
 
@@ -314,8 +319,18 @@ PostgreSQL documentation
 </listitem>
 </varlistentry>
 
+<varlistentry id="app-initdb-builtin-locale">
+<term><option>--builtin-locale=<replaceable>locale</replaceable></option></term>
+<listitem>
+<para>
+Specifies the locale name when the builtin provider is used. Locale support
+is described in <xref linkend="locale"/>.
+</para>
+</listitem>
+</varlistentry>
+
 <varlistentry id="app-initdb-option-locale-provider">
-<term><option>--locale-provider={<literal>libc</literal>|<literal>icu</literal>}</option></term>
+<term><option>--locale-provider={<literal>builtin</literal>|<literal>libc</literal>|<literal>icu</literal>}</option></term>
 <listitem>
 <para>
 This option sets the locale provider for databases created in the new