[Koha-zebra] Zebra and non-filing characters

Sebastian Hammer quinn at indexdata.com
Thu Dec 22 21:19:35 CET 2005


Joshua Ferraro wrote:

>Hello everyone,
>
>This is just generic question regarding Zebra's handling of
>MARC non-filing characters. I know there is a 'stopwords'-like
>function available using the 'map' directive:
>
>map (^The\s) @
>
>but I'm wondering whether Zebra is also capable of examining the
>non-filing character specs within each MARC field to decide
>whether to index or not to index ...
>  
>
You mean using an indicator in the field to determine how many 
characters to skip? To the best of my knowledge, this is not supported 
at present, sorry.
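
For illustration only, here is a minimal Python sketch of what indicator-driven skipping would amount to if it were done in a preprocessing step before the records reach Zebra. This is not Zebra functionality; the function name filing_form and the idea of passing the indicator in as a one-character string are assumptions made for the example.

```python
def filing_form(title: str, nonfiling: str) -> str:
    """Return the title with its non-filing prefix removed.

    `nonfiling` is the raw indicator value as a one-character string
    (hypothetical input; in MARC21 this is a digit 0-9 or a blank).
    """
    # A blank or non-digit indicator is treated as "skip nothing".
    if nonfiling.isascii() and nonfiling.isdigit():
        skip = int(nonfiling)
    else:
        skip = 0
    return title[skip:]


if __name__ == "__main__":
    print(filing_form("The Hobbit", "4"))
    print(filing_form("Dune", " "))
```

Note that this only changes what gets indexed; it says nothing about what to do with a leading article typed into a query.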

What I don't like about that approach anyway is that it leaves it 
ambiguous what happens when the user puts a leading article into a search 
term... I think you'd be better off just configuring the system to ignore 
the most common leading articles as described above.
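
To make the ambiguity point concrete, here is a small Python sketch (again, not Zebra code) in which the same leading-article rule is applied both to the indexed titles and to the incoming query, so that "The Hobbit" and "Hobbit" find the same entry. The article list, the normalize and lookup helpers, and the dictionary used as an index are all assumptions for the example.

```python
import re

# Hypothetical list of English leading articles; other languages
# would need their own patterns.
LEADING_ARTICLE = re.compile(r"^(the|a|an)\s+", re.IGNORECASE)


def normalize(text: str) -> str:
    """Strip one leading article and lowercase the rest."""
    return LEADING_ARTICLE.sub("", text.strip(), count=1).lower()


# Toy "index": normalized key -> original title.
TITLES = ["The Hobbit", "A Tale of Two Cities", "Dune"]
INDEX = {normalize(t): t for t in TITLES}


def lookup(query: str):
    """Apply the same normalization to the query as to the index."""
    return INDEX.get(normalize(query))


if __name__ == "__main__":
    print(lookup("the hobbit"))
    print(lookup("Hobbit"))
```

Because one rule is used on both sides, it does not matter whether the user types the article or not.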

It is true that this would require separate configuration for different 
languages, but you probably wouldn't get around that anyway, since many 
non-English-speaking countries use other record formats than MARC21, and 
the use of indicators to control indexing is not universal. The Danish 
MARC (cleverly named DANMARC) format, for instance, uses a special 
character inside of the subfields to mark the part which should not be 
indexed.
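
As a rough sketch of the in-subfield marker idea, the Python snippet below drops everything up to and including a marker character before the value is indexed. The marker is passed in as a parameter; the "|" used in the example is a placeholder chosen for illustration and is not claimed to be the character DANMARC actually uses. The function name strip_to_marker is likewise hypothetical.

```python
def strip_to_marker(value: str, marker: str) -> str:
    """Return the part of a subfield value that follows `marker`.

    If the marker does not occur, the value is returned unchanged.
    """
    _before, found, after = value.partition(marker)
    return after if found else value


if __name__ == "__main__":
    # "|" is a placeholder marker for this example only.
    print(strip_to_marker("The |Hobbit", "|"))
    print(strip_to_marker("Dune", "|"))
```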

--Sebastian

>Cheers,
>
>  
>

-- 
Sebastian Hammer, Index Data
quinn at indexdata.com   www.indexdata.com
Ph: (603) 209-6853






