[Koha-devel] My Zebra Indexing with ICU option is not working

Jared Camins-Esakov jcamins at cpbibliography.com
Thu Jan 10 14:41:00 CET 2013


Waqar,

> After enabling the ICU option manually, it did not work for my huge data
> set until I deleted most of the biblio records.
>
> The only difference was that I changed <icu_chain locale="en"> to a blank
> <icu_chain locale=""> instead of <icu_chain locale="en_IN.UTF-8">, as I am
> not sure what to do with the locale for other languages.
>

As far as I know, you cannot leave the locale empty. Leave it as locale="en"
if you don't know what locale you want.
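
A minimal sketch of an ICU chain file with the locale left at "en" follows. The
element names are the ones Zebra's ICU chain format uses, but the particular
rules here are only an illustrative assumption and are not necessarily what
your own chain file contains:

```xml
<!-- Illustrative sketch only: the rules below are assumptions, not a copy of any stock file -->
<icu_chain locale="en">
  <!-- strip control characters before tokenizing -->
  <transform rule="[:Control:] Any-Remove"/>
  <!-- split the input into words -->
  <tokenize rule="w"/>
  <!-- drop whitespace and punctuation tokens -->
  <transform rule="[[:WhiteSpace:][:Punctuation:]] Remove"/>
  <display/>
  <!-- lowercase the index terms -->
  <casemap rule="l"/>
</icu_chain>
```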

> So,
>
> 1 - Why am I getting these warnings, although I have already renamed my
>     default.idx located at /home/koha/koha-dev/etc/zebradb/etc (I hope
>     this is the right location)?
>

I'm not sure why you're talking about renaming default.idx, but you should
not be. default.idx is required whether you are using ICU or not. You just
have to adjust whether it is using an icuchain or charmap file.
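
For illustration, the word index stanza in default.idx might look roughly like
the sketch below. The directive names are standard Zebra ones; the file names
word-phrase-utf.chr and words-icu.xml are assumed from a typical Koha install
and may differ on yours:

```
# Word index ("w") stanza in default.idx (sketch; file names are assumptions)
index w
completeness 0
position 1
alwaysmatches 1
firstinfield 1
# Use exactly one of the following two lines:
#charmap word-phrase-utf.chr
icuchain words-icu.xml
```

The other index stanzas stay in the file either way, which is presumably why
Zebra complains about unknown register types when the file is missing.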

> ====================
> REINDEXING zebra
> ====================
> 12:22:45-10/01 zebraidx(2057) [warn] Unknown register type: 0
> 12:22:45-10/01 zebraidx(2057) [warn] Unknown register type: n
> 12:22:45-10/01 zebraidx(2057) [warn] Unknown register type: y
> 12:22:45-10/01 zebraidx(2057) [warn] Unknown register type: d
>
> 2 - Right after that I have this warning
>
>     [warn] previous transaction didn't reach commit
>

That will be because either A) there was a very corrupted record or (more
likely) B) the lack of a default.idx file is giving Zebra conniptions.


> 3 - I deleted records until only 9000 were left. Now zebra started to
> index other languages. Why is it working only on this small set of data?
> Do I need to define an escape sequence somewhere for some special characters?
>

I had problems using ICU indexing for datasets larger than ~400k records. I
believe the problem was a corrupted record, but with 450k records I couldn't
be bothered to track down which record it was.

Regards,
Jared

-- 
Jared Camins-Esakov
Bibliographer, C & P Bibliography Services, LLC
(phone) +1 (917) 727-3445
(e-mail) jcamins at cpbibliography.com
(web) http://www.cpbibliography.com/