[Koha-translate] Re: to begin with...

dbkliv dbkliv at netscape.net
Thu Jan 9 16:13:57 CET 2003


news-misc at ada.dhs.org wrote:
> In article <3E1BDE62.2030003 at koha-fr.org>,
> nicolas morin <nicolas at koha-fr.org> wrote:
> 
> 
>>dbkliv wrote:
>>
>>
>>>ISO-10646 (Unicode) is probably the strongest bet here.
>>
>>I'd be happy to hear more about that...
> 
> 
> Using ISO-10646 directly is not necessary. This is because
> the HTML 4.0 standard defines ISO-10646 as the underlying
> character set of all HTML documents, no matter how you encode
> the document. (That is the reason why entities (the &...;
> sequences) can work at all -- HTML entities are long-hand
> aliases to ISO-10646 characters.)
> 

I wasn't aware of this - though it makes perfect sense. Thanks
for the clarification.

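I'm curious what the problems are with representing French with
Unicode (of whatever form). Nearly all of the special French characters
(&eacute; etc.) fall into the 256 characters of ISO-8859-1 (Latin-1),
which extends 7-bit ASCII, and the first 256 Unicode code points are
identical to Latin-1 - so at that low range the two match, don't they?
(The one exception I know of is the oe ligature, &oelig;, which sits
outside Latin-1.)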
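
For anyone who wants to check the low-range question directly, here is
a minimal sketch in Python using only the standard library. The helper
name fits() is hypothetical and purely illustrative; nothing here comes
from Koha itself. It shows that an HTML entity resolves to a Unicode
(ISO-10646) code point, that the 256 Latin-1 byte values map to the
code points of the same number, and which encodings can hold a given
French character.

```python
import html

# A named HTML entity is an alias for an ISO-10646 (Unicode) character.
e_acute = html.unescape("&eacute;")
print(ord(e_acute))  # 233, i.e. U+00E9

# Each byte 0..255, decoded as ISO-8859-1, gives the code point of the
# same number, so Latin-1 and Unicode agree over that whole range.
latin1_matches_unicode = all(
    ord(bytes([b]).decode("iso-8859-1")) == b for b in range(256)
)
print(latin1_matches_unicode)  # True


def fits(text, encoding):
    """Return True if every character of text is encodable in encoding.

    Hypothetical helper for illustration only.
    """
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# 7-bit ASCII stops at 127, so it cannot hold an accented e;
# Latin-1 can, but the oe ligature (U+0153) needs a wider encoding.
for ch in ("\u00e9", "\u0153"):
    print(hex(ord(ch)), [enc for enc in ("ascii", "iso-8859-1", "utf-8")
                         if fits(ch, enc)])
```

The last loop is the practical point for translators: text that stays
within Latin-1 survives in either encoding, while anything outside it
(such as the oe ligature) needs UTF-8 or an entity like &oelig;.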

At any rate, French translation is not my forte; I tend to deal
with different languages. So I'm honestly just passingly curious,
not challenging you here. :)

Your further comments about UCS-2 vs UTF-8 are well worth noting
for translators.





