[Koha-devel] marc_word and searching
Joshua Ferraro
jferraro at athenscounty.lib.oh.us
Mon May 24 10:44:02 CEST 2004
Paul et al,
I've been trying to figure out how best to solve our ' and , problem
with the marc searching and I've got a few comments to make about the
way that the searches are currently done (using marc_word) and the
problems with how marc_word stores data.
So here's a classic example of an author that fails currently:
o'brian, patrick
Right now the search separates the 'o', the ' itself, the 'brian' and the
'patrick', and the resulting query looks like this:
select distinct m1.bibid
from biblio,biblioitems,marc_biblio,marc_word as m1,marc_word as m2,marc_word as m3,marc_word as m4
where biblio.biblionumber=marc_biblio.biblionumber
and biblio.biblionumber=biblioitems.biblionumber
and m1.bibid=marc_biblio.bibid
and (m1.bibid=m2.bibid and m1.bibid=m3.bibid and m1.bibid=m4.bibid)
and ((m1.word like 'o%' and m1.tag+m1.subfieldid in ('100a','110a', '700a', '710a'))
and (m2.word like '\'%' and m2.tag+m2.subfieldid in ('100a','110a', '700a', '710a'))
and (m3.word like 'brian%' and m3.tag+m3.subfieldid in ('100a','110a', '700a', '710a'))
and (m4.word like 'patrick%' and m4.tag+m4.subfieldid in ('100a','110a', '700a', '710a')))
order by biblio.title
So there is at least one major problem with this query (which does not return
any results): marc_word does not store values as small as ' or o. Since every
word in the query has to match, there are of course no results ...
Even if I replace the ' and , in the query with spaces
(I add the following after line 117 in SearchMarc.pm):
@$value[$i] =~ s/'/ /g;
@$value[$i] =~ s/,/ /g;
which turns the search into:
'o brian patrick'
it still fails ('o' is too small for marc_word); and of course removing them
altogether with
@$value[$i] =~ s/'//g;
@$value[$i] =~ s/,//g;
resulting in:
'obrian patrick'
fails too--the data simply isn't stored right for this kind of search.
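
To make the two attempts above easier to compare, here is a rough standalone
sketch of what each substitution does to the search string. The helper names
(search_words, indexable) and the $MIN_LEN cutoff are made up for this
illustration; the real minimum word length used when marc_word is filled is
not stated in this mail, so 2 is only an assumption.

```perl
use strict;
use warnings;

# Hypothetical cutoff: words shorter than this are assumed to be missing
# from marc_word. The real value is an assumption, not taken from Koha.
my $MIN_LEN = 2;

# Replace every ' and , with $replacement, then split on whitespace.
sub search_words {
    my ($term, $replacement) = @_;
    (my $clean = $term) =~ s/[',]/$replacement/g;
    return grep { length } split /\s+/, $clean;
}

# Keep only the words long enough to have been indexed.
sub indexable {
    return grep { length($_) >= $MIN_LEN } @_;
}

my @spaced = search_words("o'brian, patrick", ' ');
my @joined = search_words("o'brian, patrick", '');

print "spaced: @spaced\n";
print "joined: @joined\n";
print "spaced words that can match: ", join(' ', indexable(@spaced)), "\n";
```

With spaces the single-letter word 'o' is still part of the search, and with
plain removal the search asks for 'obrian', which only helps if marc_word
holds that exact word.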
So I see two ways to fix this problem: 1) stop using marc_word for these
kinds of searches and use marc_subfield_table (which has the whole
"o'brian, patrick" in subfield_value) or 2) fix the way that marc_word
stores small values (it should store everything, including , and ' and
single letters like 'a', 'o', etc.).
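
As a rough sketch of option 1, the query could be built against
marc_subfield_table with bind values, so the ' in the author name needs no
special handling. The function name author_query is hypothetical, the
subfield_value column is named as in this mail, and the bibid, tag and
subfieldcode columns are assumptions; all of them should be checked against
the real schema.

```perl
use strict;
use warnings;

# Build the SQL and bind values for a "starts with" author search.
# Column names are assumptions (see the note above), not a confirmed schema.
sub author_query {
    my ($author, @fields) = @_;           # fields look like '100a'
    my (@conds, @binds);
    for my $f (@fields) {
        # Split '100a' into tag '100' and subfield code 'a'.
        my ($tag, $code) = $f =~ /^(\d{3})(\w)$/ or next;
        push @conds, '(tag = ? and subfieldcode = ?)';
        push @binds, $tag, $code;
    }
    die "no usable tag/subfield given\n" unless @conds;
    my $sql = 'select distinct bibid from marc_subfield_table'
            . ' where subfield_value like ? and ('
            . join(' or ', @conds) . ')';
    return ($sql, "$author%", @binds);
}

my ($sql, @binds) = author_query("o'brian, patrick", '100a', '110a', '700a', '710a');
print "$sql\n";
print join(' | ', @binds), "\n";
```

The returned statement and values could then go to DBI's prepare and execute.
A search like this only matches when the stored value starts with the author
string as typed, so it trades the word-by-word matching of marc_word for an
exact prefix match.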
Any comments? Further suggestions?
Joshua