[Koha-devel] new features to test
paul POULAIN
paul.poulain at free.fr
Mon Mar 22 06:40:05 CET 2004
Stephen Hedges wrote:
>Paul -
>
>I've been testing the changes you made to search.marc/search.pl and
>SearchMarc.pm and thought you would like to see some results.
>
yep, thanks.
>First some questions. The "query" that gets displayed as you are building
>the search is confusing. It always seems to start with "biblioitems.illus
>contains ''." Is that actually correct? And does the query start by
>using the koha fields? Is that where it's finding the tags to search on?
>(I should have taken a closer look at your code.)
>
a bug, probably. that's why it's in HEAD branch :-D
>Also three general problems. The marclist seems to be broken, since it is
>not available in the pull-down menu (neither in the default template nor
>the npl template). And MARCdetail.pl never seems to display any results.
>And all of the queries are followed in the log with this error:
>
>DBD::mysql::st execute failed: Unknown column 'seealso' in 'field list' at
>/usr/local/koha/intranet/modules//C4/Biblio.pm line 235.
>[sounds like we need to updatedatabase]
>
yes ;-)
>So now for some search results --
>
>* Search on keyword "dog" (worked OK):
>explain select distinct m1.bibid from
>biblio,biblioitems,marc_biblio,marc_word as m1 where
>biblio.biblionumber=marc_biblio.biblionumber and
>biblio.biblionumber=biblioitems.biblionumber and
>m1.bibid=marc_biblio.bibid and (m1.word like 'dog%' and
>m1.tag+m1.subfieldid in ('')) order by biblio.title;
>+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
>| table | type | possible_keys | key | key_len | ref
> | rows | Extra |
>+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
>| m1 | range | bibid,word | word | 255 | NULL
> | 2200 | Using where; Using temporary; Using filesort |
>| marc_biblio | eq_ref | PRIMARY,biblionumber | PRIMARY | 8 | m1.bibid
> | 1 | Using where; Distinct |
>| biblio | eq_ref | PRIMARY,blbnoidx | PRIMARY | 4 |
>marc_biblio.biblionumber | 1 | Distinct
> |
>| biblioitems | ref | bibnoidx | bibnoidx | 4 |
>biblio.biblionumber | 12 | Using index; Distinct
> |
>+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
>
Nice result set : 2200 x 1 x 1 x12 rows to parse. Not too many.
>* Search on author "Henry" (worked OK):
>
<snip>
>* Search on author "Sue Henry" order by author (worked _very_ well!)
>
<snip>
>* Search on author "Sue Henry" order by title (worked _very_ well!)
>
<snip>
>* BUT NOTE -- Search on author "Henry, Sue" did NOT work:
>select distinct m1.bibid from biblio,biblioitems,marc_biblio,marc_word as
>m1,marc_word as m2 where biblio.biblionumber=marc_biblio.biblionumber and
>biblio.biblionumber=biblioitems.biblionumber and
>m1.bibid=marc_biblio.bibid and (m1.bibid=m2.bibid) and ((m1.word like
>'henry,%' and m1.tag+m1.subfieldid in ('100a'))and (m2.word like 'sue%'
>and m2.tag+m2.subfieldid in('100a'))) order by biblio.author
>[no results returned]
>* Search on title "I'll be home for Christmas" (failed):
>select distinct m1.bibid from biblio,biblioitems,marc_biblio,marc_word as
>m1,marc_word as m2,marc_word as m3 where
>biblio.biblionumber=marc_biblio.biblionumber and
>biblio.biblionumber=biblioitems.biblionumber and
>m1.bibid=marc_biblio.bibid and (m1.bibid=m2.bibid and m1.bibid=m3.bibid)
>and ((m1.word like 'I'll%' and m1.tag+m1.subfieldid in ('245a'))and
>(m2.word like 'home%' and m2.tag+m2.subfieldid in('245a'))and (m3.word
>like 'Christmas%' and m3.tag+m3.subfieldid in('245a'))) order by
>biblio.title
>DBD::mysql::st execute failed: You have an error in your SQL syntax.
>Check the manual that corresponds to your MySQL server version for the
>right syntax to use near 'll%' and m1.tag+m1.subfieldid in ('245a'))and
>(m2.word like 'ho
>[we need to escape apostrophes]
>
we need to remove ' and , and other useless characters. enter a
bugs.koha.org pls
>* Search on barcode (worked very well):
>+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
>| table | type | possible_keys | key | key_len | ref
> | rows | Extra |
>+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
>| m1 | range | bibid,word | word | 255 | NULL
> | 1 | Using where; Using temporary; Using filesort |
>| marc_biblio | eq_ref | PRIMARY,biblionumber | PRIMARY | 8 | m1.bibid
> | 1 | Using where; Distinct |
>| biblio | eq_ref | PRIMARY,blbnoidx | PRIMARY | 4 |
>marc_biblio.biblionumber | 1 | Distinct
> |
>| biblioitems | ref | bibnoidx | bibnoidx | 4 |
>biblio.biblionumber | 12 | Using index; Distinct
> |
>+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
>
wonderful :1 x 1 x 1 x 12 rows to parse in the DB !!! For a 1Gb DB if
your test DB is a copy of your real one ?
>* Search ("Q2") on keyword "England" and itemtype "AB" (failed--returned
>no results);
>explain select distinct m1.bibid from
>biblio,biblioitems,marc_biblio,marc_word as m1,marc_word as m2 where
>biblio.biblionumber=marc_biblio.biblionumber and
>biblio.biblionumber=biblioitems.biblionumber and
>m1.bibid=marc_biblio.bibid and (m1.bibid=m2.bibid) and ((m1.word like
>'england%' and m1.tag+m1.subfieldid in (''))and (m2.word like 'AB%' and
>m2.tag+m2.subfieldid in('item'))) order by biblio.title;
>+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
>| table | type | possible_keys | key | key_len | ref
> | rows | Extra |
>+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
>| m1 | range | bibid,word | word | 255 | NULL
> | 5377 | Using where; Using temporary; Using filesort |
>| marc_biblio | eq_ref | PRIMARY,biblionumber | PRIMARY | 8 | m1.bibid
> | 1 | Using where; Distinct |
>| biblio | eq_ref | PRIMARY,blbnoidx | PRIMARY | 4 |
>marc_biblio.biblionumber | 1 | Distinct
> |
>| biblioitems | ref | bibnoidx | bibnoidx | 4 |
>biblio.biblionumber | 12 | Using index; Distinct
> |
>| m2 | ref | bibid,word | bibid | 8 | m1.bibid
> | 20 | Using where; Distinct |
>+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
>
mmm... this one is important : the marc_word does NOT store words <= 2
letters.
sub MARCaddword {
# split a subfield string and adds it into the word table.
# removes stopwords
my ($dbh,$bibid,$tag,$tagorder,$subfieldid,$subfieldorder,$sentence)
=@_;
$sentence =~ s/(\.|\?|\:|\!|\'|,|\-|\"|\(|\)|\[|\]|\{|\})/ /g;
my @words = split / /,$sentence;
my $stopwords= C4::Context->stopwords;
my $sth=$dbh->prepare("insert into marc_word (bibid, tag, tagorder,
subfieldid, subfieldorder, word, sndx_word)
values (?,?,?,?,?,?,soundex(?))");
foreach my $word (@words) {
# we record only words longer than 2 car and not in stopwords hash
if (length($word)>2 and !($stopwords->{uc($word)})) {
$sth->execute($bibid,$tag,$tagorder,$subfieldid,$subfieldorder,$word,$word);
if ($sth->err()) {
warn "ERROR ==> insert into marc_word (bibid, tag, tagorder,
subfieldid, subfieldorder, word, sndx_word) values
($bibid,$tag,$tagorder,$subfieldid,$subfieldorder,$word,soundex($word))\n";
}
}
}
}
What should we decide here ?
* also store <=2 letters word.
* say that itemtype & other fields must be >2 chars ?
>Hope this helps --- Stephen
>
yes, thanks a lot.
--
Paul POULAIN
Consultant indépendant en logiciels libres
responsable francophone de koha (SIGB libre http://www.koha-fr.org)
More information about the Koha-devel
mailing list