mirror of
https://github.com/postgres/postgres.git
synced 2025-09-03 15:22:11 +03:00
Proofreading improvements for the Administration documentation book.
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/charset.sgml,v 2.95 2009/05/18 08:59:28 petere Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/charset.sgml,v 2.96 2010/02/03 17:25:05 momjian Exp $ -->
 
 <chapter id="charset">
 <title>Localization</>
@@ -6,8 +6,8 @@
 <para>
 This chapter describes the available localization features from the
 point of view of the administrator.
-<productname>PostgreSQL</productname> supports localization with
-two approaches:
+<productname>PostgreSQL</productname> supports two localization
+facilities:
 
 <itemizedlist>
 <listitem>
@@ -67,10 +67,10 @@ initdb --locale=sv_SE
 (<literal>sv</>) as spoken
 in Sweden (<literal>SE</>). Other possibilities might be
 <literal>en_US</> (U.S. English) and <literal>fr_CA</> (French
-Canadian). If more than one character set can be useful for a
+Canadian). If more than one character set can be used for a
 locale then the specifications look like this:
-<literal>cs_CZ.ISO8859-2</>. What locales are available under what
-names on your system depends on what was provided by the operating
+<literal>cs_CZ.ISO8859-2</>. What locales are available on your
+system under what names depends on what was provided by the operating
 system vendor and what was installed. On most Unix systems, the command
 <literal>locale -a</> will provide a list of available locales.
 Windows uses more verbose locale names, such as <literal>German_Germany</>
@@ -80,8 +80,8 @@ initdb --locale=sv_SE
 <para>
 Occasionally it is useful to mix rules from several locales, e.g.,
 use English collation rules but Spanish messages. To support that, a
-set of locale subcategories exist that control only a certain
-aspect of the localization rules:
+set of locale subcategories exist that control only certain
+aspects of the localization rules:
 
 <informaltable>
 <tgroup cols="2">
@@ -127,13 +127,13 @@ initdb --locale=sv_SE
 </para>
 
 <para>
-The nature of some locale categories is that their value has to be
+Some locale categories must have their values
 fixed when the database is created. You can use different settings
 for different databases, but once a database is created, you cannot
 change them for that database anymore. <literal>LC_COLLATE</literal>
-and <literal>LC_CTYPE</literal> are these categories. They affect
+and <literal>LC_CTYPE</literal> are these type of categories. They affect
 the sort order of indexes, so they must be kept fixed, or indexes on
-text columns will become corrupt. The default values for these
+text columns would become corrupt. The default values for these
 categories are determined when <command>initdb</command> is run, and
 those values are used when new databases are created, unless
 specified otherwise in the <command>CREATE DATABASE</command> command.
@@ -146,7 +146,7 @@ initdb --locale=sv_SE
 linkend="runtime-config-client-format"> for details). The values
 that are chosen by <command>initdb</command> are actually only written
 into the configuration file <filename>postgresql.conf</filename> to
-serve as defaults when the server is started. If you delete these
+serve as defaults when the server is started. If you disable these
 assignments from <filename>postgresql.conf</filename> then the
 server will inherit the settings from its execution environment.
 </para>
@@ -178,7 +178,7 @@ initdb --locale=sv_SE
 settings for the purpose of setting the language of messages. If
 in doubt, please refer to the documentation of your operating
 system, in particular the documentation about
-<application>gettext</>, for more information.
+<application>gettext</>.
 </para>
 </note>
 
@@ -320,8 +320,9 @@ initdb --locale=sv_SE
 
 <para>
 An important restriction, however, is that each database's character set
-must be compatible with the database's <envar>LC_CTYPE</> and
-<envar>LC_COLLATE</> locale settings. For <literal>C</> or
+must be compatible with the database's <envar>LC_CTYPE</> (character
+classification) and <envar>LC_COLLATE</> (string sort order) locale
+settings. For <literal>C</> or
 <literal>POSIX</> locale, any character set is allowed, but for other
 locales there is only one character set that will work correctly.
 (On Windows, however, UTF-8 encoding can be used with any locale.)
@@ -543,7 +544,7 @@ initdb --locale=sv_SE
 <entry>LATIN1 with Euro and accents</entry>
 <entry>Yes</entry>
 <entry>1</entry>
-<entry>ISO885915</entry>
+<entry><literal>ISO885915</></entry>
 </row>
 <row>
 <entry><literal>LATIN10</literal></entry>
@@ -694,7 +695,7 @@ initdb --locale=sv_SE
 </table>
 
 <para>
-Not all <acronym>API</>s support all the listed character sets. For example, the
+Not all client <acronym>API</>s support all the listed character sets. For example, the
 <productname>PostgreSQL</>
 JDBC driver does not support <literal>MULE_INTERNAL</>, <literal>LATIN6</>,
 <literal>LATIN8</>, and <literal>LATIN10</>.
@@ -710,7 +711,7 @@ initdb --locale=sv_SE
 much a declaration that a specific encoding is in use, as a declaration
 of ignorance about the encoding. In most cases, if you are
 working with any non-ASCII data, it is unwise to use the
-<literal>SQL_ASCII</> setting, because
+<literal>SQL_ASCII</> setting because
 <productname>PostgreSQL</productname> will be unable to help you by
 converting or validating non-ASCII characters.
 </para>
@@ -720,17 +721,17 @@ initdb --locale=sv_SE
 <title>Setting the Character Set</title>
 
 <para>
-<command>initdb</> defines the default character set
+<command>initdb</> defines the default character set (encoding)
 for a <productname>PostgreSQL</productname> cluster. For example,
 
 <screen>
 initdb -E EUC_JP
 </screen>
 
-sets the default character set (encoding) to
+sets the default character set to
 <literal>EUC_JP</literal> (Extended Unix Code for Japanese). You
 can use <option>--encoding</option> instead of
-<option>-E</option> if you prefer to type longer option strings.
+<option>-E</option> if you prefer longer option strings.
 If no <option>-E</> or <option>--encoding</option> option is
 given, <command>initdb</> attempts to determine the appropriate
 encoding to use based on the specified or default locale.
@@ -762,8 +763,8 @@ CREATE DATABASE korean WITH ENCODING 'EUC_KR' LC_COLLATE='ko_KR.euckr' LC_CTYPE=
 <para>
 The encoding for a database is stored in the system catalog
 <literal>pg_database</literal>. You can see it by using the
-<option>-l</option> option or the <command>\l</command> command
-of <command>psql</command>.
+<command>psql</command> <option>-l</option> option or the
+<command>\l</command> command.
 
 <screen>
 $ <userinput>psql -l</userinput>
@@ -784,11 +785,11 @@ $ <userinput>psql -l</userinput>
 <important>
 <para>
 On most modern operating systems, <productname>PostgreSQL</productname>
-can determine which character set is implied by an <envar>LC_CTYPE</>
+can determine which character set is implied by the <envar>LC_CTYPE</>
 setting, and it will enforce that only the matching database encoding is
 used. On older systems it is your responsibility to ensure that you use
 the encoding expected by the locale you have selected. A mistake in
-this area is likely to lead to strange misbehavior of locale-dependent
+this area is likely to lead to strange behavior of locale-dependent
 operations such as sorting.
 </para>
 
@@ -1190,9 +1191,9 @@ RESET client_encoding;
 <para>
 If the conversion of a particular character is not possible
 &mdash; suppose you chose <literal>EUC_JP</literal> for the
-server and <literal>LATIN1</literal> for the client, then some
-Japanese characters do not have a representation in
-<literal>LATIN1</literal> &mdash; then an error is reported.
+server and <literal>LATIN1</literal> for the client, and some
+Japanese characters are returned that do not have a representation in
+<literal>LATIN1</literal> &mdash; an error is reported.
 </para>
 
 <para>
@@ -1249,7 +1250,8 @@ RESET client_encoding;
 
 <listitem>
 <para>
-<acronym>UTF</acronym>-8 is defined here.
+<acronym>UTF</acronym>-8 (8-bit UCS/Unicode Transformation
+Format) is defined here.
 </para>
 </listitem>
 </varlistentry>
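Note on the hunk at -1190: the reworded paragraph says an error is reported when a character stored in an EUC_JP server has no representation in a LATIN1 client encoding. The sketch below illustrates that situation. It is an assumption-laden analogy only: Python's built-in codecs stand in for PostgreSQL's conversion routines, no PostgreSQL code or API is called, and the helper name `convert` is hypothetical.

```python
# Illustration only: Python codecs stand in for PostgreSQL's server-to-client
# encoding conversion. Nothing here talks to a PostgreSQL server.

def convert(data: bytes, server_encoding: str, client_encoding: str) -> bytes:
    """Hypothetical helper: re-encode bytes from server to client encoding."""
    text = data.decode(server_encoding)
    # Raises UnicodeEncodeError if a character cannot be represented
    # in the client encoding.
    return text.encode(client_encoding)

# Bytes as they might be stored in an EUC_JP database.
stored = "日本語".encode("euc_jp")

try:
    convert(stored, "euc_jp", "latin-1")
    conversion_failed = False
except UnicodeEncodeError:
    # Japanese characters have no LATIN1 representation, so conversion fails.
    conversion_failed = True

# Plain ASCII text is representable in both encodings and converts cleanly.
ascii_ok = convert("abc".encode("euc_jp"), "euc_jp", "latin-1")
```

In this sketch the failure surfaces as a Python exception; the documentation only states that PostgreSQL reports an error in the corresponding case.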