the script is not executable as UCS_to_most.pl is in CVS. It also won't
pick up any custom setting of the perl version/location to use. This
patch calls perl scripts like $(PERL) $(srcdir)/script.pl.
Kris Jurka
up a bunch of the support utilities.
In src/backend/utils/mb/Unicode remove nearly duplicate copies of the
UCS_to_XXX perl script and replace with one version to handle all generic
files. Update the Makefile so that it knows about all the map files.
This produces a slight difference in some of the map files, using a
uniform naming convention and not mapping the null character.
In src/backend/utils/mb/conversion_procs create a master utf8<->win
codepage function like the ISO 8859 versions instead of having a separate
handler for each conversion.
There is an externally visible change in the name of the win1258 to utf8
conversion. According to the documentation notes, it was named
incorrectly and this changes it to a standard name.
Running the Unicode mapping perl scripts has shown some additional mapping
changes in koi8r and iso8859-7.
> > > > It was made to cope with encoding such as an Asian bloc in 7.2Beta2.
> > > >
> > > > Added ServerEncoding
> > > > Korean (JOHAB), Thai (WIN874),
> > > > Vietnamese (TCVN), Arabic (WIN1256)
> > > >
> > > > Added ClientEncoding
> > > > Simplified Chinese (GBK), Korean (UHC)
> > > >
> > > >
> > > >
> http://www.sankyo-unyu.co.jp/Pool/postgresql-7.2b2.newencoding.diff.tar.gz
> > > > (608K)
> > >
> > > Looks good. I need some people to review this for me.
> >
> > For me they look good too. The only missing part is a
> > documentation. I will ask him to write it up. If he couldn't, I will
> > do it for him.
> > > The diff is 3mb
> > > but appears to address only additions to multibyte. I have attached a
> > > list of files it modifies. Also, look at the sizes of the mb/
> > > directory. It is getting large:
> > >
> > > 4 ./CVS
> > > 6 ./Unicode/CVS
> > > 3433 ./Unicode
> > > 6197 .
> >
> > Yes. We definitely need the on-the-fly encoding addition capability:
> > i.e. CREATE CHRACTER SET in the future...
> > --
> > Tatsuo Ishii
> >
> >
Address chainge.
http://www.sankyo-unyu.co.jp/Pool/postgresql-7.2.newencoding.diff.gz
Add PsqlODBC and document ...etc patch.
Eiji Tokuya
-------------------------------------------------------------------
Subject: Re: [PATCHES] encoding names
From: Karel Zak <zakkr@zf.jcu.cz>
To: Peter Eisentraut <peter_e@gmx.net>
Cc: pgsql-patches <pgsql-patches@postgresql.org>
Date: Fri, 31 Aug 2001 17:24:38 +0200
On Thu, Aug 30, 2001 at 01:30:40AM +0200, Peter Eisentraut wrote:
> > - convert encoding 'name' to 'id'
>
> I thought we decided not to add functions returning "new" names until we
> know exactly what the new names should be, and pending schema
Ok, the patch not to add functions.
> better
>
> ...(): encoding name too long
Fixed.
I found new bug in command/variable.c in parse_client_encoding(), nobody
probably never see this error:
if (pg_set_client_encoding(encoding))
{
elog(ERROR, "Conversion between %s and %s is not supported",
value, GetDatabaseEncodingName());
}
because pg_set_client_encoding() returns -1 for error and 0 as true.
It's fixed too.
IMHO it can be apply.
Karel
PS:
* following files are renamed:
src/utils/mb/Unicode/KOI8_to_utf8.map -->
src/utils/mb/Unicode/koi8r_to_utf8.map
src/utils/mb/Unicode/WIN_to_utf8.map -->
src/utils/mb/Unicode/win1251_to_utf8.map
src/utils/mb/Unicode/utf8_to_KOI8.map -->
src/utils/mb/Unicode/utf8_to_koi8r.map
src/utils/mb/Unicode/utf8_to_WIN.map -->
src/utils/mb/Unicode/utf8_to_win1251.map
* new file:
src/utils/mb/encname.c
* removed file:
src/utils/mb/common.c
--
Karel Zak <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/
C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz
--------------------------------------------------
Subject: Bug in unicode conversion ...
From: Jan Varga <varga@utcru.sk>
To: t-ishii@sra.co.jp
Date: Sat, 18 Nov 2000 17:41:20 +0100 (CET)
Hi,
I tried this new feature in PostgreSQL. I found one bug.
Script UCS_to_8859.pl skips input lines which
1. code <0x80 or
2. ucs <0x100
I think second one is not good idea because some codes in ISO8859-2
have ucs <0x100 (e.g. 0xE9 - 0x00E9)
--------------------------------------------------