mirror of https://github.com/postgres/postgres.git synced 2025-07-27 12:41:57 +03:00

I really hope that I haven't missed anything in this one...

From: t-ishii@sra.co.jp

Attached are patches to enhance the multi-byte support.  (patches are
against 7/18 snapshot)

* determine encoding at initdb/createdb rather than compile time

Now initdb and createdb have an option to specify the encoding. I also
modified the syntax of CREATE DATABASE to accept an encoding option. See
README.mb for more details.

For this purpose I have added a new column "encoding" to pg_database.
pg_attribute and pg_class are also changed to match the
modification to pg_database.  Specifically, I have added pg_database_mb.h,
pg_attribute_mb.h and pg_class_mb.h. These are used only when MB is
enabled. The reason for having separate files is that I couldn't find a way to
use ifdef or the like in those files. I have to admit it looks
ugly, but I see no other way.

* support for PGCLIENTENCODING when issuing COPY command

commands/copy.c modified.

* support for SQL92 syntax "SET NAMES"

See gram.y.

* support for LATIN2-5
* add UNICODE regression test case
* new test suite for MB

New directory test/mb added.

* clean up source files

The basic idea is to give MB its own subdirectories for easier maintenance.
These are include/mb and backend/utils/mb.
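
The encoding precedence described in the first item above (an encoding given at database creation overrides the initdb default, which in turn falls back to the compile-time setting) can be sketched as follows. This is only an illustrative model, not code from the patch: the function names and the COMPILE_TIME_ENCODING constant are hypothetical stand-ins.

```python
from typing import Optional

# Hypothetical stand-in for the encoding chosen at build time
# (e.g. via configure --with-mb=EUC_JP or Makefile.custom).
COMPILE_TIME_ENCODING = "EUC_JP"


def installation_encoding(initdb_option: Optional[str]) -> str:
    """Default encoding of an installation.

    Models initdb: the value of -e / -pgencoding if given,
    otherwise the compile-time encoding.
    """
    return initdb_option or COMPILE_TIME_ENCODING


def database_encoding(createdb_option: Optional[str],
                      initdb_option: Optional[str]) -> str:
    """Encoding of a newly created database.

    Models createdb -E / CREATE DATABASE ... WITH ENCODING: an explicit
    value wins, otherwise the installation default is used.
    """
    return createdb_option or installation_encoding(initdb_option)


if __name__ == "__main__":
    # initdb -e EUC_JP, then createdb -E EUC_KR korean
    print(database_encoding("EUC_KR", "EUC_JP"))
    # initdb without -e, createdb without -E
    print(database_encoding(None, None))
```

The point of the two-level fallback is that one installation can hold databases with different encodings while still having a single default.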
Marc G. Fournier
1998-07-24 03:32:46 +00:00
parent 6e66468f3a
commit bf00bbb0c4
82 changed files with 2161 additions and 759 deletions


@ -1,4 +1,4 @@
postgresql 6.4 multi-byte (MB) support README Jun 5 1998
postgresql 6.4 multi-byte (MB) support README Jul 22 1998
Tatsuo Ishii
t-ishii@sra.co.jp
@ -10,7 +10,10 @@ The MB support is intended for allowing PostgreSQL to handle
multi-byte character sets such as EUC(Extended Unix Code), Unicode and
Mule internal code. With the MB enabled you can use multi-byte
character sets in regexp ,LIKE and some functions. The encoding system
chosen is determined at the compile time.
chosen is determined when initializing your PostgreSQL installation
using initdb(1). Note that this can be overridden when creating a
database using createdb(1) or the create database SQL command, so you
can have multiple databases with different encoding systems.
MB also fixes some problems concerning with 8-bit single byte
character sets including ISO8859. (I would not say all of problems
@ -36,7 +39,11 @@ where encoding_system is one of:
EUC_TW Taiwan EUC
UNICODE Unicode(UTF-8)
MULE_INTERNAL Mule internal
LATIN1 ISO 8859-1 English and some European laguages
LATIN1 ISO 8859-1 English and some European languages
LATIN2 ISO 8859-2 English and some European languages
LATIN3 ISO 8859-3 English and some European languages
LATIN4 ISO 8859-4 English and some European languages
LATIN5 ISO 8859-5 English and some European languages
Example:
@ -50,7 +57,28 @@ Example:
If MB is disabled, nothing is changed except better supporting for
8-bit single byte character sets.
2. PGCLIENTENCODING
2. How to set encoding
initdb command defines the default encoding for a PostgreSQL
installation. For example:
% initdb -e EUC_JP
sets the default encoding to EUC_JP(Extended Unix Code for Japanese).
Note that you can use "-pgencoding" instead of "-e" if you prefer a longer
option string :-) If no -e or -pgencoding option is given, the encoding
specified at compile time is used.
You can create a database with a different encoding.
% createdb -E EUC_KR korean
will create a database named "korean" with EUC_KR encoding.
Another way to accomplish this is to use a SQL command:
CREATE DATABASE korean WITH ENCODING = 'EUC_KR';
3. PGCLIENTENCODING
If an environment variable PGCLIENTENCODING is defined on the
frontend, automatic encoding translation is done by the backend. For
@ -68,7 +96,11 @@ Supported encodings for PGCLIENTENCODING are:
EUC_KR Korean EUC
EUC_TW Taiwan EUC
MULE_INTERNAL Mule internal
LATIN1 ISO 8859-1 English and some European laguages
LATIN1 ISO 8859-1 English and some European languages
LATIN2 ISO 8859-2 English and some European languages
LATIN3 ISO 8859-3 English and some European languages
LATIN4 ISO 8859-4 English and some European languages
LATIN5 ISO 8859-5 English and some European languages
Note that UNICODE is not supported(yet). Also note that the
translation is not always possible. Suppose you choose EUC_JP for the
@ -86,7 +118,12 @@ new command:
SET CLIENT_ENCODING TO 'encoding';
where encoding is one of the encodings that can be set for
PGCLIENTENCODING. To query the current the frontend encoding:
PGCLIENTENCODING. Also you can use SQL92 syntax "SET NAMES" for this
purpose:
SET NAMES 'encoding';
To query the current frontend encoding:
SHOW CLIENT_ENCODING;
@ -114,7 +151,16 @@ Unicode: http://www.unicode.org/
5. History
Jun 5, 1988
Jul 22, 1998
* determine encoding at initdb/createdb rather than compile time
* support for PGCLIENTENCODING when issuing COPY command
* support for SQL92 syntax "SET NAMES"
* support for LATIN2-5
* add UNICODE regression test case
* new test suite for MB
* clean up source files
Jun 5, 1998
* add support for the encoding translation between the backend
and the frontend
* new command SET CLIENT_ENCODING etc. added


@ -1,4 +1,4 @@
postgresql 6.3.2 multi-byte (MB) support README created 1998/5/25
postgresql 6.4 multi-byte (MB) support README created 1998/7/22
Tatsuo Ishii
t-ishii@sra.co.jp
@ -9,7 +9,7 @@ postgresql 6.3.2 multi-byte (MB) support README created 1998/5/25
Multi-byte support in PostgreSQL has the following features.
1. As multi-byte character sets, the EUC of various languages such as Japanese and Chinese, Unicode,
mule internal code, and ISO-8859-1 can be selected at compile time.
mule internal code, and ISO-8859-1 can be selected at database creation time.
Data is stored in the database in this encoding as is.
2. Multi-byte characters can be used in table names (provided that the OS allows multi-byte
file names)
@ -23,6 +23,7 @@ postgresql 6.3.2 multi-byte (MB) support README created 1998/5/25
differs from that of the backend, encoding conversion is performed automatically.
Installation:
By default, PostgreSQL does not support multi-byte character sets.
This section explains how to enable multi-byte support.
@ -34,9 +35,11 @@ postgresql 6.3.2 multi-byte (MB) support README created 1998/5/25
% configure --with-mb=EUC_JP
The following encodings, including EUC_JP, can be specified.
(In the current implementation, the encoding is determined at compile time and
cannot be changed dynamically at run time)
The following encodings, including EUC_JP, can be specified when
initializing the database with initdb and when creating a database
(with the Unix command createdb or the SQL command create database).
The encoding specified in Makefile.custom or configure becomes
the default encoding for initdb.
EUC_JP Japanese EUC
EUC_CN Chinese EUC based on GB. Code set 2 is
@ -48,9 +51,9 @@ postgresql 6.3.2 multi-byte (MB) support README created 1998/5/25
that is, up to 0xffff.
MULE_INTERNAL Mule internal code. However, Type N variable-length characters are
not supported.
LATIN1 ISO8859 Latin 1. It is a single-byte encoding, but
included as a trial :-) Incidentally, LATIN2 etc. are
not supported.
LATIN* ISO8859 Latin series. * can be 1 through 5.
These are single-byte encodings, but
included as a trial :-)
As a guide for choosing: if you use only English and Japanese, choose EUC_JP (similarly,
EUC_CN if you use only Chinese, and so on); if you also want to use other languages
@ -69,13 +72,42 @@ postgresql 6.3.2 multi-byte (MB) support README created 1998/5/25
A simple installation procedure is also described at
http://www.sra.co.jp/people/t-ishii/PostgreSQL/.
Specifying the encoding in initdb/createdb/create database
With initdb, the encoding can be specified with the following options.
-e encoding
-pgencoding encoding
The encoding specified here becomes the encoding used when the encoding is
omitted in subsequent createdb/create database commands. If the -e or -pgencoding
option is omitted, the encoding specified in Makefile.custom or configure
is used.
With createdb, the encoding can be specified with the following option.
-E encoding
With create database, the encoding can be specified with the following option.
CREATE DATABASE dbname WITH ENCODING = 'encoding';
To specify LOCATION at the same time, write the following.
CREATE DATABASE dbname WITH LOCATION = 'path' ENCODING = 'encoding';
If the encoding is omitted, createdb/create database use the encoding
specified with initdb.
About the environment variable PGCLIENTENCODING:
By default, the server-side encoding specified at compile time and the encoding
on the client side, such as psql, are assumed to match. To use an encoding
different from the server side, set the environment variable
PGCLIENTENCODING. In addition to the encodings listed above, SJIS (Shift JIS)
can be specified.
If the environment variable PGCLIENTENCODING is not set, libpq asks the
server for its encoding at session start and sets the environment variable
PGCLIENTENCODING to that value.
If the environment variable PGCLIENTENCODING is set, its value takes precedence, and
an encoding different from the server side can be used. In addition to the encodings
listed above, SJIS (Shift JIS) can be specified.
Incidentally, SJIS also supports the 1-byte kana of JISX0201, the so-called "half-width
katakana" (this is by no means a recommendation to use "half-width katakana"
@ -150,6 +182,18 @@ postgresql 6.3.2 multi-byte (MB) support README created 1998/5/25
Revision history:
1998/7/22 Released patches for 6.4 alpha.
* Implemented the ability to set the server-side encoding with
initdb/createdb/create database. For this purpose, a new column
encoding was added to the pg_database system catalog (only when MB is enabled)
* copy now supports PGCLIENTENCODING
* Support for SQL92 "SET NAMES" (only when MB is enabled)
* Support for LATIN2-5
* Added a unicode test case to the regression test
* Added test/mb, a regression test directory dedicated to MB
* Substantially reorganized the source file locations. MB-related files
now live in include/mb and backend/utils/mb
1998/5/25 Bug fixes (released to the pgsql-jp ML as mb_b3.patch;
to be incorporated into the 6.4 snapshot upstream)
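
The PGCLIENTENCODING rule described in this README (an explicitly set value takes precedence; otherwise the client adopts the server's encoding at session start, and translation is only needed when the two differ) can be modelled roughly as below. This is a sketch under assumptions, not libpq code: the function names are hypothetical, and the environment is passed in as a plain mapping so the example runs without a server.

```python
from typing import Mapping


def client_encoding(env: Mapping[str, str], server_encoding: str) -> str:
    """Pick the frontend encoding for a session.

    If PGCLIENTENCODING is set (and non-empty) it wins; otherwise the
    client falls back to whatever encoding the server reports.
    """
    explicit = env.get("PGCLIENTENCODING")
    return explicit if explicit else server_encoding


def needs_translation(env: Mapping[str, str], server_encoding: str) -> bool:
    """True when frontend and backend encodings differ, i.e. when the
    automatic encoding translation would have work to do."""
    return client_encoding(env, server_encoding) != server_encoding


if __name__ == "__main__":
    # Client using Shift JIS against an EUC_JP database.
    print(client_encoding({"PGCLIENTENCODING": "SJIS"}, "EUC_JP"))
    print(needs_translation({"PGCLIENTENCODING": "SJIS"}, "EUC_JP"))
    # No variable set: the client simply follows the server.
    print(client_encoding({}, "EUC_KR"))
```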