[Koha-win32] Re: [Koha] Unicode

Carol Ku carolcool01 at yahoo.com
Tue Dec 20 07:53:34 CET 2005


When I imported a MARC record with Chinese characters, using MARCEdit to convert the file to UTF-8, several of the characters did not display properly (see attached file). I had already set the web browser encoding to match MARCEdit, but the same problem still comes up. Could it be that some characters were not read correctly by Koha during the import process?

Has anyone encountered a similar problem?

Thomas D <koha at alinto.com> wrote:

Carol,

I believe that Firefox is up to version 1.5 but 1.5 may be only a beta
release currently.  I suspect the configuration of Firefox is more likely to
be the problem.  Try changing the font used within Firefox to the same one
used for MARCEdit.  Find edit preferences in the drop down menus for Firefox
or the MS Windows version equivalent.  Go to Edit Preferences : General :
Choose Fonts and Colors.

I suspect the problem appears, even though your current Firefox font may
display the same character correctly when you type it, because Unicode
sometimes encodes two almost identical glyphs for the same basic character
as separate code points.  Whatever font you have currently configured for
Firefox may be unable to display the glyph that MARCEdit assigned when it
converted the record to UTF-8.  The optimal font should display each of the
different glyphs correctly for the same basic character.  The actual
keystrokes required to generate the different glyphs will differ even if
the basic character is the same.
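
A minimal Python sketch of that point: two Unicode code points that most fonts draw identically but that compare as different characters. The pair U+8C48 / U+F900 was chosen only for illustration; which characters were affected in the imported records is not known from this thread.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

unified = "\u8c48"  # CJK unified ideograph
compat = "\uf900"   # CJK compatibility ideograph with the same appearance

print(describe(unified))
print(describe(compat))

# The two strings look alike on screen but are not equal as code points.
print(unified == compat)

# NFC normalization maps the compatibility ideograph onto the unified one.
print(unicodedata.normalize("NFC", compat) == unified)
```

If a record turned out to contain a compatibility code point like this, a font lacking that glyph could show a box or a wrong character even though the unified form displays correctly. Whether that is what happened here is an assumption.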


MAILING LISTS

The links on koha.org seem to have been fixed now.  Try
http://www.koha.org/community/mailing-lists.html .


Thomas D


Quoting Carol Ku  :
> ---------------- Beginning of the original message ------------------
> 
> Hi Thomas:
>   I am using Mozilla Firefox. I thought there was only one version, 1.0?
> Koha is 2.2.4.  Yes, I used bulkmarcimport.  Yes, the problem remains in
> Chinese, though only for one or two characters.  If I were to enter the
> data myself, it wouldn't cause a problem.
>    
>   Oh, the Windows mailing list: I sent emails to Savannah before, but all
> were bounced back, so I resubscribed through the Koha home page and was
> given the old mailing list.  Yes, you are right, I have not been getting
> any response from the Windows mailing list.
>   
> 
> Thomas D  wrote:
> 
> MULTITUDE OF CHARACTER SETS AND ENCODINGS
> 
> There are variant encodings of Unicode. UTF-8 is but one of them. There
> are also other encodings such as UTF-16, UTF-32, and UCS-2. Conversion
> applications can convert between different encodings.
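
To make the difference between these encodings concrete, here is a small Python sketch that encodes the same two Chinese characters in several ways. The sample text and the helper name `encoded_forms` are illustrative only. Python has no separate UCS-2 codec; for characters in the Basic Multilingual Plane, such as these, UCS-2 produces the same bytes as UTF-16.

```python
def encoded_forms(text):
    """Map each encoding name to the hex bytes it produces for text."""
    codecs = ("utf-8", "utf-16-le", "utf-32-le")
    return {name: text.encode(name).hex(" ") for name in codecs}

sample = "\u4e2d\u6587"  # the two characters U+4E2D and U+6587

for name, hex_bytes in encoded_forms(sample).items():
    print(f"{name:10} {hex_bytes}")

# Round trip: bytes written in one encoding must be read back with the same
# one, otherwise the characters come out wrong.
utf8_bytes = sample.encode("utf-8")
print(utf8_bytes.decode("utf-8") == sample)
```

The same characters occupy 6 bytes in UTF-8, 4 in UTF-16, and 8 in UTF-32, so a program that guesses the wrong encoding will misread them.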
> 
> MS Windows can use UTF-16 directly for keyboard output but not UTF-8. To my
> knowledge there are no keyboard generation applications that work around
> this problem directly.
> 
> Unix can use UTF-8 directly for keyboard output so encoding conversion
> issues are less problematic.
> 
> MARC records have more usually used other, older character sets to
> represent similar sets of characters to Unicode. These library character
> set standards were developed before Unicode existed. One such standard
> that is prevalent in MARC-21 records is the MARC-8 character set. MARC-8
> should not be confused with UTF-8. They are not compatible, but character
> set conversion applications can convert between them.
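
One way to tell which of the two a MARC 21 record claims to use is position 09 of its 24-byte leader: a blank means MARC-8 and `a` means UCS/Unicode. The Python sketch below reads that flag. The function names and the synthetic leaders are illustrative, and whether bulkmarcimport.pl in Koha 2.2.4 honors the flag is not stated in this thread.

```python
LEADER_LENGTH = 24            # fixed by the MARC 21 record structure
RECORD_TERMINATOR = b"\x1d"   # ISO 2709 end-of-record byte

def coding_scheme(record: bytes) -> str:
    """Report the character coding scheme declared in Leader/09."""
    if len(record) < LEADER_LENGTH:
        raise ValueError("record is shorter than a MARC leader")
    flag = record[9:10]
    if flag == b"a":
        return "UCS/Unicode"
    if flag == b" ":
        return "MARC-8"
    return "unknown"

def split_records(data: bytes) -> list:
    """Split the raw bytes of a MARC file into individual records."""
    return [chunk + RECORD_TERMINATOR
            for chunk in data.split(RECORD_TERMINATOR) if chunk.strip()]

# Synthetic leaders for demonstration only, not real catalogue data.
unicode_leader = b"00000nam a2200000 a 4500"
marc8_leader = b"00000nam  2200000 a 4500"

print(coding_scheme(unicode_leader))
print(coding_scheme(marc8_leader))
```

A record whose data is UTF-8 but whose leader still declares MARC-8 (or the reverse) is a plausible source of a few wrongly displayed characters, though that is only a guess for the records discussed here.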
> 
> 
> CHARACTER SETS AND ENCODINGS IN KOHA
> 
> Koha 3.0 should convert between MARC-8 and UTF-8 for at least the major
> Western European languages. Chinese may have to wait for Koha 3.0.X,
> especially as I do not know how to identify which Chinese glyphs are
> which. At least with Western European languages, I know how to read the
> alphabets even when I do not know how to read the language.
> 
> Previously you changed the Koha SQL columns from ISO 8859 to UTF-8 where
> necessary, along with the charset headers for the web pages that Koha
> sends to the webserver. The web browser then seems to have done a certain
> degree of conversion work automatically, which I had not expected to
> happen for characters that you typed, as opposed to characters that were
> merely displayed within the web browser. However, this seems to have
> worked for you so far on MS Windows. I would presume that the web browser
> itself is converting between UTF-16 from MS Windows and UTF-8 inside the
> web browser before posting back to Koha.
> 
> If the issue that you have now affects only one or two characters after a
> conversion, it seems that the converting application had a partial
> failure. I would suggest that the conversion inside MARCEdit was
> successful but that the conversion inside your web browser for Koha was
> less successful. What web browser and version are you using with Koha?
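
A partial conversion failure of this kind can sometimes be confirmed from the bytes themselves. The Python sketch below lists every byte span in a buffer that is not valid UTF-8; the function name `find_undecodable` and the sample data are hypothetical, and it cannot detect characters that are valid UTF-8 but simply the wrong character.

```python
def find_undecodable(data: bytes) -> list:
    """Return (byte offset, offending bytes) for each span that is not valid UTF-8."""
    problems = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the rest of the buffer decodes cleanly
        except UnicodeDecodeError as err:
            start, end = pos + err.start, pos + err.end
            problems.append((start, data[start:end]))
            pos = end  # resume scanning after the bad span
    return problems

# Sample buffer: two valid characters with one stray byte between them.
good = "\u4e2d".encode("utf-8")
damaged = good + b"\xff" + "\u6587".encode("utf-8")

print(find_undecodable(good))
print(find_undecodable(damaged))
```

An empty result for the converted file would suggest that the conversion produced well-formed UTF-8 and that the problem lies later, in the import or the display.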
> 
> I am assuming that almost all of the characters in your problematic
> records are in Chinese. I am also assuming that you have used
> bulkmarcimport.pl to import these records. Please let me know if either is
> not the case.
> 
> 
> KOHA WINDOWS LIST CHANGE
> 
> Do you have any responses from the Savannah list any longer? The address
> for the Koha Windows list is now koha-win32 at nongnu.org . This change is
> part of a move of the Koha project from Sourceforge to Savannah. The
> Sourceforge site had become much too unresponsive given the volume of
> users relative to the provision of servers. You should have a better
> response about MS Windows issues on the MS Windows list. Unfortunately,
> mailing lists on Savannah do seem to suffer from delays in the mail queue.
> 
> 
> Thomas D
> 
> 
> Quoting Carol Ku :
> > ---------------- Beginning of the original message ------------------
> > 
> > I imported two book records with Chinese characters. However, about one
> > or two characters show up wacky. I used MARCEdit to convert the text
> > file into a MARC UTF-8 file. When I open the file using MARCEdit, all
> > the characters look fine.
> > 
> > I was told that MARCEdit uses Arial Unicode MS. Is it the same code as
> > UTF-8? If not, how can I overcome this problem?
> > 
> > 
> > _______________________________________________
> > Koha mailing list
> > Koha at lists.katipo.co.nz
> > http://lists.katipo.co.nz/mailman/listinfo/koha
> > 
> > ------------------- End of the original message ---------------------
> 
> ------------------- End of the original message ---------------------





