[Koha-devel] new features to test

paul POULAIN paul.poulain at free.fr
Mon Mar 22 06:40:05 CET 2004


Stephen Hedges wrote:

>Paul -
>
>I've been testing the changes you made to search.marc/search.pl and
>SearchMarc.pm and thought you would like to see some results.
>
yep, thanks.

>First some questions.  The "query" that gets displayed as you are building
>the search is confusing.  It always seems to start with "biblioitems.illus
>contains ''."  Is that actually correct?  And does the query start by
>using the koha fields?  Is that where it's finding the tags to search on? 
>(I should have taken a closer look at your code.)
>
a bug, probably. that's why it's in HEAD branch :-D

>Also three general problems.  The marclist seems to be broken, since it is
>not available in the pull-down menu (neither in the default template nor
>the npl template).  And MARCdetail.pl never seems to display any results. 
>And all of the queries are followed in the log with this error:
>
>DBD::mysql::st execute failed: Unknown column 'seealso' in 'field list' at
>/usr/local/koha/intranet/modules//C4/Biblio.pm line 235.
>[sounds like we need to updatedatabase]
>
yes ;-)

>So now for some search results --
>
>* Search on keyword "dog" (worked OK):
>explain select distinct m1.bibid from
>biblio,biblioitems,marc_biblio,marc_word as m1 where
>biblio.biblionumber=marc_biblio.biblionumber and
>biblio.biblionumber=biblioitems.biblionumber and
>m1.bibid=marc_biblio.bibid and (m1.word  like 'dog%' and
>m1.tag+m1.subfieldid in ('')) order by biblio.title;
>+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
>| table       | type   | possible_keys        | key      | key_len | ref
>              | rows | Extra                                        |
>+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
>| m1          | range  | bibid,word           | word     |     255 | NULL
>              | 2200 | Using where; Using temporary; Using filesort |
>| marc_biblio | eq_ref | PRIMARY,biblionumber | PRIMARY  |       8 | m1.bibid
>              |    1 | Using where; Distinct                        |
>| biblio      | eq_ref | PRIMARY,blbnoidx     | PRIMARY  |       4 |
>marc_biblio.biblionumber |    1 | Distinct                                
>    |
>| biblioitems | ref    | bibnoidx             | bibnoidx |       4 |
>biblio.biblionumber      |   12 | Using index; Distinct                   
>    |
>+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
>
Nice result set : 2200 x 1 x 1 x12 rows to parse. Not too many.

>* Search on author "Henry" (worked OK):
>
<snip>

>* Search on author "Sue Henry" order by author (worked _very_ well!)
>
<snip>

>* Search on author "Sue Henry" order by title (worked _very_ well!)
>
<snip>

>* BUT NOTE -- Search on author "Henry, Sue" did NOT work:
>select distinct m1.bibid from biblio,biblioitems,marc_biblio,marc_word as
>m1,marc_word as m2 where biblio.biblionumber=marc_biblio.biblionumber and
>biblio.biblionumber=biblioitems.biblionumber and
>m1.bibid=marc_biblio.bibid and (m1.bibid=m2.bibid) and ((m1.word  like
>'henry,%' and m1.tag+m1.subfieldid in ('100a'))and (m2.word like 'sue%'
>and m2.tag+m2.subfieldid in('100a'))) order by biblio.author
>[no results returned]
>* Search on title "I'll be home for Christmas" (failed):
>select distinct m1.bibid from biblio,biblioitems,marc_biblio,marc_word as
>m1,marc_word as m2,marc_word as m3 where
>biblio.biblionumber=marc_biblio.biblionumber and
>biblio.biblionumber=biblioitems.biblionumber and
>m1.bibid=marc_biblio.bibid and (m1.bibid=m2.bibid and m1.bibid=m3.bibid)
>and ((m1.word  like 'I'll%' and m1.tag+m1.subfieldid in ('245a'))and
>(m2.word like 'home%' and m2.tag+m2.subfieldid in('245a'))and (m3.word
>like 'Christmas%' and m3.tag+m3.subfieldid in('245a'))) order by
>biblio.title
>DBD::mysql::st execute failed: You have an error in your SQL syntax. 
>Check the manual that corresponds to your MySQL server version for the
>right syntax to use near 'll%' and m1.tag+m1.subfieldid in ('245a'))and
>(m2.word like 'ho
>[we need to escape apostrophes]
>
we need to remove ' and , and other useless characters. enter a 
bugs.koha.org pls

>* Search on barcode (worked very well):
>+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
>| table       | type   | possible_keys        | key      | key_len | ref
>              | rows | Extra                                        |
>+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
>| m1          | range  | bibid,word           | word     |     255 | NULL
>              |    1 | Using where; Using temporary; Using filesort |
>| marc_biblio | eq_ref | PRIMARY,biblionumber | PRIMARY  |       8 | m1.bibid
>              |    1 | Using where; Distinct                        |
>| biblio      | eq_ref | PRIMARY,blbnoidx     | PRIMARY  |       4 |
>marc_biblio.biblionumber |    1 | Distinct                                
>    |
>| biblioitems | ref    | bibnoidx             | bibnoidx |       4 |
>biblio.biblionumber      |   12 | Using index; Distinct                   
>    |
>+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
>
wonderful :1 x 1 x 1 x 12 rows to parse in the DB !!! For a 1Gb DB if 
your test DB is a copy of your real one ?

>* Search ("Q2") on keyword "England" and itemtype "AB" (failed--returned
>no results);
>explain select distinct m1.bibid from
>biblio,biblioitems,marc_biblio,marc_word as m1,marc_word as m2 where
>biblio.biblionumber=marc_biblio.biblionumber and
>biblio.biblionumber=biblioitems.biblionumber and
>m1.bibid=marc_biblio.bibid and (m1.bibid=m2.bibid) and ((m1.word  like
>'england%' and m1.tag+m1.subfieldid in (''))and (m2.word like 'AB%' and
>m2.tag+m2.subfieldid in('item'))) order by biblio.title;
>+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
>| table       | type   | possible_keys        | key      | key_len | ref
>              | rows | Extra                                        |
>+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
>| m1          | range  | bibid,word           | word     |     255 | NULL
>              | 5377 | Using where; Using temporary; Using filesort |
>| marc_biblio | eq_ref | PRIMARY,biblionumber | PRIMARY  |       8 | m1.bibid
>              |    1 | Using where; Distinct                        |
>| biblio      | eq_ref | PRIMARY,blbnoidx     | PRIMARY  |       4 |
>marc_biblio.biblionumber |    1 | Distinct                                
>    |
>| biblioitems | ref    | bibnoidx             | bibnoidx |       4 |
>biblio.biblionumber      |   12 | Using index; Distinct                   
>    |
>| m2          | ref    | bibid,word           | bibid    |       8 | m1.bibid
>              |   20 | Using where; Distinct                        |
>+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
>
mmm... this one is important : the marc_word does NOT store words <= 2 
letters.

sub MARCaddword {
# split a subfield string and adds it into the word table.
# removes stopwords
    my ($dbh,$bibid,$tag,$tagorder,$subfieldid,$subfieldorder,$sentence) 
=@_;
    $sentence =~ s/(\.|\?|\:|\!|\'|,|\-|\"|\(|\)|\[|\]|\{|\})/ /g;
    my @words = split / /,$sentence;
    my $stopwords= C4::Context->stopwords;
    my $sth=$dbh->prepare("insert into marc_word (bibid, tag, tagorder, 
subfieldid, subfieldorder, word, sndx_word)
            values (?,?,?,?,?,?,soundex(?))");
    foreach my $word (@words) {
# we record only words longer than 2 car and not in stopwords hash
    if (length($word)>2 and !($stopwords->{uc($word)})) {
        
$sth->execute($bibid,$tag,$tagorder,$subfieldid,$subfieldorder,$word,$word);
        if ($sth->err()) {
        warn "ERROR ==> insert into marc_word (bibid, tag, tagorder, 
subfieldid, subfieldorder, word, sndx_word) values 
($bibid,$tag,$tagorder,$subfieldid,$subfieldorder,$word,soundex($word))\n";
        }
    }
    }
}

What should we decide here ?
* also store <=2 letters word.
* say that itemtype & other fields must be >2 chars ?

>Hope this helps --- Stephen
>
yes, thanks a lot.

-- 
Paul POULAIN
Consultant indépendant en logiciels libres
responsable francophone de koha (SIGB libre http://www.koha-fr.org)






More information about the Koha-devel mailing list