[HanoiLUG] vietnamese language and Unicode
Phan Thái Trung
phanthaitrung at gmail.com
Thu Jul 5 13:29:19 ICT 2007
2007/7/5, Jean Christophe André <jean-christophe.andre at auf.org>:
>
> Phan Thái Trung wrote:
> > and is not supported in modern operating systems like Windows XP or Linux.
> I don't know much about Windows XP, but with Linux, yes it is!!
>
> Linux has supported TCVN3 for a few years already, in console
> (text) mode and X Window (graphical) mode as well as in the recoding
> functions of the GNU libc. And when I arrived in Vietnam I had
> little difficulty finding command-line tools for Vietnamese
> recoding on Linux.
I don't think Linux comes with TCVN3 fonts installed by default, so one
must search for, download/copy & install TCVN3 fonts before using them.
>
> But in fact the really huge problem with these encodings is that they have
> been used in a way that invalidated them for interoperability.
>
> I mean, since Vietnamese people had no choice but to use proprietary
> software at the time they created these encodings, they had to use a very
> bad trick to be able to use them with this software.
> More precisely, since it was not possible to add the official TCVN3
> encoding to Windows (and it still is not, thanks to proprietary
> software), they had to use the default ISO-8859-1 encoding instead to
> store real data. And to be able to display it correctly, they had to
> create Vietnamese fonts with TCVN3 encoding positions but declared as using
> the ISO-8859-1 encoding too.
I think you are talking about storing data in a web-like environment, or in a
database which only supports ANSI characters. In those environments, apps
need to encode non-ANSI characters such as TCVN3 text into character sets such
as ISO-8859-1 or UTF-x.
But in desktop apps, as opposed to web apps, TCVN3 strings usually do not need
to be encoded, because TCVN3 uses 1-byte characters and they can easily be
stored in a normal string in the development environment without loss. This is
one of the reasons why TCVN3 is still commonly used in Vietnamese apps.
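
The "stored without loss" point can be sketched in a few lines of Python.
This is only an illustration: the byte value 0xAD for u-with-horn is an
assumption about the TCVN3 layout, and Latin-1 stands in here for the
"normal string" of a legacy development environment.

```python
# 't' followed by byte 0xAD, assumed here to be the TCVN3 position of
# the letter u-with-horn.
raw = bytes([0x74, 0xAD])

# Latin-1 (ISO-8859-1) maps every byte 0x00-0xFF to exactly one code point,
# so legacy bytes survive a decode/encode round trip unchanged.
as_latin1 = raw.decode("latin-1")
round_trip = as_latin1.encode("latin-1")

# The same two letters as real Unicode text, encoded as UTF-8.
as_unicode = "t\u01b0"  # U+01B0 LATIN SMALL LETTER U WITH HORN
utf8_bytes = as_unicode.encode("utf-8")

print(round_trip == raw)          # were the legacy bytes preserved?
print(len(raw), len(utf8_bytes))  # one byte per letter vs. multi-byte UTF-8
```

The bytes are preserved, but the program never learns that they are
Vietnamese letters: it only sees Latin-1 characters that a TCVN3 font
happens to draw differently.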
The 2nd reason is how hard it is to develop true Unicode apps on Windows
(currently widely used in VN). It is really a difficult problem even at this
moment. I'm currently an IT instructor in an IT department, where my
students develop Vietnamese software every year. The students have
difficulty developing Unicode-enabled apps in the common IDEs. Their
products mostly still use TCVN3-like fonts (badly!) and very rarely
support Unicode. Many Vietnamese software products have the same problem today.
The best way is to use modern frameworks such as MS .NET or Java, but those
do not produce native machine-code apps.
> Even worse, it has consequences for usage with OpenOffice.org too, even
> when using it on Windows! For example, [ư] (u with horn) is encoded as the
> dash sign [-] in ISO-8859-1, which is used for hyphenation (cutting a word
> when it is too long), so that OpenOffice.org treats it specially and does
> not allow it to be recoded with the Unikey toolkit to go from TCVN3 to
> Unicode...
This was only one reason, and it mattered to people who were not technical at
that time. They did not care about Unicode, ISO or Thu tuong Chinh phu (the
Prime Minister) blah blah... they only wanted their web browser to display
Vietnamese content correctly, that's all. There are some patches by Dang Minh
Tuan (author of Vietkey) to fix the dash sign [-] problem in IE, but that is
not a good choice. The good choice is to switch to Unicode, above all in the
web environment of that time.
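
The dash sign problem discussed above can be reproduced with a small Python
sketch. The table below is partial and illustrative only: the byte
positions are assumptions about the TCVN3 layout, and
decode_tcvn3_partial is a hypothetical helper, not part of Unikey,
Vietkey or any library.

```python
import unicodedata

# Partial, illustrative table of assumed TCVN3 byte positions for the
# lowercase base letters. Not a complete or authoritative mapping.
TCVN3_PARTIAL = {
    0xA8: "\u0103",  # a with breve
    0xA9: "\u00e2",  # a with circumflex
    0xAA: "\u00ea",  # e with circumflex
    0xAB: "\u00f4",  # o with circumflex
    0xAC: "\u01a1",  # o with horn
    0xAD: "\u01b0",  # u with horn
    0xAE: "\u0111",  # d with stroke
}

def decode_tcvn3_partial(data: bytes) -> str:
    """Hypothetical helper: map known bytes, pass the rest through as Latin-1."""
    return "".join(TCVN3_PARTIAL.get(b, chr(b)) for b in data)

raw = bytes([0x74, 0xAD])  # 't' + the byte assumed to mean u-with-horn

wrong = raw.decode("latin-1")       # what a Latin-1 aware program sees
right = decode_tcvn3_partial(raw)   # what the author of the text meant

print(unicodedata.name(wrong[1]))
print(unicodedata.name(right[1]))
```

Under these assumptions, a program that trusts the ISO-8859-1 label sees a
hyphenation character where the writer typed a letter, which is why
mislabelled data is hard to repair later and why real Unicode text avoids
the problem.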