[Koha-devel] [Koha] need help in zebra indexing for Arabic words

Karam Qubsi karamqubsi at gmail.com
Fri Oct 26 00:58:26 CEST 2012


Hi all,
I solved this in Zebra by customizing the transliterate rules in the
words-icu.xml file.

I will share a complete file that solves this for Arabic soon!

The solution is to add rules like the ones shown here. For the example I
will not use Arabic characters, to keep it simple.

Suppose we have a language X that is written with connected letters, where
some letters are not important for searching. Take the word "*the*word":
the searcher is not interested in finding *the*; he is really searching
for "word".

So I solved this by following this guide:
http://userguide.icu-project.org/transforms/general/rules#TOC-Context

and made Zebra convert thew to w.
We may have to do this for every letter: thea to a, theb to b, and so on up
to thez to z,

as in the following:
  <transliterate rule="{ thea > a "/>
  <transliterate rule="{ thew > w "/>
  ...
  <transliterate rule="{ thez > z "/>
So if someone searches for theword, Zebra will convert thew to w, so
searching for theword is the same as searching for word :D
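
A small sketch of the effect, for readers who want to try the idea outside Zebra. This is plain Python, not ICU: the `normalize` function and its regular expression are hypothetical stand-ins that assume the rules rewrite "the" plus a letter to that letter wherever the sequence occurs. ICU's actual matching may differ.

```python
import re

# Hypothetical stand-in for the transliterate rules: "the" + letter -> letter.
_PREFIX = re.compile(r"the([a-z])")

def normalize(term):
    """Imitate the placeholder rules in plain Python (this is not ICU)."""
    return _PREFIX.sub(r"\1", term.lower())

# The indexed word and the search term end up as the same string.
print(normalize("theword"))
print(normalize("word"))
print(normalize("theword") == normalize("word"))
```

In this sketch a word that merely contains the sequence, such as "other", is shortened as well, so it may be worth checking a few such words against a real index.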

And for Arabic:
  <transliterate rule="{ الا > ا "/>
  <transliterate rule="{ الب > ب "/>
  ...
  <transliterate rule="{ الي > ي "/>
So searching for "بحث"
will find "البحث".
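
Since one rule is needed per letter, the lines can be generated rather than typed by hand. A minimal sketch, assuming the 28 basic Arabic letters are enough (hamza forms such as أ, إ and آ are left out) and that every rule has exactly the shape shown above; `make_rules`, `LETTERS` and `ARTICLE` are names made up for this example.

```python
# The 28 basic Arabic letters; hamza forms are left out here (assumption).
LETTERS = "ابتثجحخدذرزسشصضطظعغفقكلمنهوي"
ARTICLE = "ال"

def make_rules(article, letters):
    """Return one <transliterate> line per letter, in the shape used above."""
    return ['  <transliterate rule="{ ' + article + c + " > " + c + ' "/>'
            for c in letters]

rules = make_rules(ARTICLE, LETTERS)
# Print the lines so they can be pasted into words-icu.xml.
print("\n".join(rules))
```

The output still has to be placed in words-icu.xml by hand, and the index presumably has to be rebuilt before the new rules affect existing records.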

This solves the whole problem :)
I hope this helps you, Mohamed.

Thank you, Frédéric and Paul.

Karam


On Thu, Oct 25, 2012 at 9:23 AM, Karam Qubsi <karamqubsi at gmail.com> wrote:

> Yes, it's not a Koha problem,
> but I think some people have fixed this in Zebra (or maybe it just takes
> some more options in the Zebra files).
>
> Massoud Alshareef from KnowledgeWare Technologies mentions that they have
> done that and solved the problem,
> in: http://koha-community.org/category/koha-news/support-company-press/
>
> I hope he can help us with this (cc'd to him).
>
> I have heard that Solr is very good, but I have not looked into whether its
> Arabic support is better than Zebra's. I have just seen this, though:
> http://wiki.apache.org/solr/LanguageAnalysis#Arabic
>
> Anyway, thanks a lot. I will look into this further, and if I find a
> solution I will share it with you.
>
>
> Best regards,
> Karam
>
>
>
> On Thu, Oct 25, 2012 at 8:02 AM, Paul Poulain <paul.poulain at biblibre.com> wrote:
>
>> On 25/10/2012 13:53, Frédéric Demians wrote:
>> > No, you don't need help, you need to contract a developer to do the job.
>> What Frédéric is explaining here is that you can't achieve this with the
>> current Koha. And I suspect it's not a Koha problem, but a Zebra/ICU one.
>>
>> Side comment: we're working on the integration of a new search engine
>> layer (Solr). Maybe Solr will fix this problem?
>>
>> Anyway, we're looking for some funding to continue the work on the search
>> layer (see:
>> http://wiki.koha-community.org/w/index.php?title=C_%26_P_Search_Rewrite_RFC
>> )
>>
>>
>> --
>> Paul POULAIN
>> http://www.biblibre.com
>> Expert en Logiciels Libres pour l'info-doc
>> Tel : (33) 4 91 81 35 08