[Koha-devel] New Zebra Version 2.0.62 (Relevance plus st-numeric && CHR documentation fix)

David Cook dcook at prosentient.com.au
Mon Feb 1 01:44:46 CET 2016


Hi Barton and all,

 

Just noticed that Zebra 2.0.62 came out on the weekend: http://lists.indexdata.dk/pipermail/zebralist/2016-January/002642.html

 

You’ll see “Allow @attr 2=102 for numeric index” near the bottom of the announcement. That should allow the use of “rk=()” around a query that contains an st-numeric qualifier. I haven’t tested it yet, but just a heads-up. Barton, you may find that it fixes the problem you were experiencing in November.

 

They’ve also updated the documentation for the “equivalent” directive in the charmap files (commit: https://github.com/indexdata/idzebra/commit/0e8272b3d5a18a0695f8c7a58617caa0ad938059, docs: http://www.indexdata.com/zebra/doc/character-map-files.html). It mentions that “equivalent” is actually for searching and not for sorting. It also mentions that the equivalent directive is applied before the map directive.

 

I think we might want to update some of our CHR files in light of this clarification. At the moment, in Koha, I think if you search for “carers” you’ll also get hits for “careers” because of the following directives:

 

equivalent ëē(ee)
map ë e
map ē e

 

I think it’s because the term “careers” would be equivalent to “carērs” and “carërs”, according to the equivalent directive. And “carērs” and “carërs” would both map to “carers” with the map directive, which means that “carers” would get hits for a “careers” value. 

 

I’ve seen the same thing with “Siemon” and “Simon”:

 

equivalent ï(ie)
map ï i

equivalent: Siemon = Sïmon
map: Sïmon = Simon

 

So a search for “Siemon” would also match “Simon”.
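To make the two examples above concrete, here is a rough Python sketch of the behaviour I am guessing at. It is a toy model only and not Zebra's actual algorithm: the names EQUIVALENT, MAP, normalized_forms and would_match are made up for illustration, and the two CHR rule sets are simply combined into one table.

```python
# Toy model of the hypothesis above; this is NOT how Zebra is implemented.
# EQUIVALENT mirrors "equivalent ëē(ee)" and "equivalent ï(ie)".
EQUIVALENT = {"ee": "ëē", "ie": "ï"}
# MAP mirrors "map ë e", "map ē e" and "map ï i".
MAP = {"ë": "e", "ē": "e", "ï": "i"}


def equivalent_forms(term, equivalent):
    """Return the term plus variants where one multi-character sequence
    is replaced by each character declared equivalent to it."""
    forms = {term}
    for seq, chars in equivalent.items():
        start = term.find(seq)
        while start != -1:
            for ch in chars:
                forms.add(term[:start] + ch + term[start + len(seq):])
            start = term.find(seq, start + 1)
    return forms


def apply_map(term):
    """Apply the single-character map rules to a term."""
    return "".join(MAP.get(ch, ch) for ch in term)


def normalized_forms(term, equivalent=EQUIVALENT):
    """Equivalent step first, then map, as the updated documentation describes."""
    return {apply_map(form) for form in equivalent_forms(term, equivalent)}


def would_match(query, indexed, equivalent=EQUIVALENT):
    """Assume a hit whenever the two terms share a normalized form."""
    return bool(normalized_forms(query, equivalent) & normalized_forms(indexed, equivalent))


print(would_match("carers", "careers"))                  # with equivalent + map
print(would_match("Siemon", "Simon"))                    # with equivalent + map
print(would_match("carers", "careers", equivalent={}))   # map only
print(would_match("carërs", "carers", equivalent={}))    # map only
```

In this model the first two checks come out as matches and the third does not, while the map-only rules still fold “carërs” onto “carers”.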

 

Conclusion: I think we could actually go without the “equivalent” directive and just use “map”, since “map” is for indexing, searching, and sorting. We get everything we need from “map”.

 

I don’t think the equivalent is even 100% correct. For Scandinavian languages, this line would be incorrect:

 

equivalent aáàãåâăąȧǎȁȃ

 

å should actually be equivalent to “aa”. If you look at the system Zebra file scan.chr, you’ll see “equivalent å(aa)”. 

 

Anyway, just an FYI to everyone…

 

David Cook

Systems Librarian

Prosentient Systems

72/330 Wattle St, Ultimo, NSW 2007

 

From: Barton Chittenden [mailto:barton at bywatersolutions.com] 
Sent: Monday, 9 November 2015 11:14 PM
To: David Cook <dcook at prosentient.com.au>
Subject: Re: Searching numeric ranges

 

 

 

On Mon, Nov 9, 2015 at 1:33 AM, David Cook <dcook at prosentient.com.au> wrote:

Hi Barton:

 

I’d have to explore some more and I’ve already overstayed my day by 30 minutes.

 

Quite understandable :-) 

 

I’d suggest adding the “SetEnv DEBUG 1” to your Apache configuration. That should output the final Search.pm query to your logs, and then you can explore from there.

 

Yeah, I just hard-coded DEBUG=1 in search.pm on my development instance. All the search debugging spam, with none of the other spam.

 

We determined that the ranking operator/zebra construct rk( ... ) is incompatible with st-numeric :-P ... or at least that was Jesse Weaver's assessment... I was log-blind (you see too much snow, you go snowblind, you see too many logs, you go log-blind). I'll dig up more details in a bit.

 

Thanks,

 

--Barton

 

I might have some time to look tomorrow though :)

 

David Cook

Systems Librarian

Prosentient Systems

72/330 Wattle St, Ultimo, NSW 2007

 

From: Barton Chittenden [mailto:barton at bywatersolutions.com]
Sent: Friday, 6 November 2015 8:04 AM
To: David Cook <dcook at prosentient.com.au>
Subject: Fwd: Searching numeric ranges

 

Hey, I thought this might be in your area of expertise.

 

The CCL part is working but once it gets inside C4/Search.pm ... god knows....

 

Cheers,

 

--Barton

---------- Forwarded message ----------
From: Barton Chittenden <barton at bywatersolutions.com>
Date: Thu, Nov 5, 2015 at 11:47 AM
Subject: Searching numeric ranges
To: Koha-devel <koha-devel at lists.koha-community.org>

I am working on searching lexile number ranges.

 

ccl.properties shows

 

    lex 1=9903 r=r

 

The 'r=r' bit means that I should be able to search using a numeric range separated by a dash, e.g.

 

   500-600

 

Should return any numeric results from 500 to 600.

 

The following query works:

 

   cgi-bin/koha/catalogue/search.pl?q=ccl%3Dlex%2Cst-numeric%3D500-600

 

However, when I try adding that as an item in the search menu, as follows:

 

    $(document).ready(function(){
        // Add Lexile to the search pull-downs
        $("select[name='idx']").append("<option value='lex,st-numeric'>Lexile (e.g. 600 or 550-650 )</option>");
    });

 

That gets munged... the URL reads

 

    cgi-bin/koha/catalogue/search.pl?idx=lex%2Cst-numeric&q=500-600

 

 

And I get the following message:

 

    No results found

    No results match your search for 'lex,st-numeric: 500-600'. 
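For anyone comparing the two URLs above, this small Python sketch (standard library only) rebuilds both query strings and decodes them again. It only shows how the parameters differ on the wire and assumes nothing about what C4/Search.pm does with them afterwards; the variable names are just for illustration.

```python
from urllib.parse import parse_qs, quote, urlencode

# Working case: the whole CCL expression travels in the single "q" parameter.
working = "q=" + quote("ccl=lex,st-numeric=500-600", safe="")

# Non-working case: the search menu sends the qualifier in "idx" and the
# bare range in "q".
failing = urlencode({"idx": "lex,st-numeric", "q": "500-600"})

print(working)
print(failing)

# Decoding shows what search.pl receives in each case.
print(parse_qs(working))
print(parse_qs(failing))
```

In the first case search.pl sees one CCL string with the qualifier and range together; in the second it has to reassemble the query from idx and q, which is where the search seems to go wrong.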

 

--Barton

 

 
