[Koha-devel] new features to test

Joshua Ferraro jferraro at athenscounty.lib.oh.us
Fri Mar 26 08:44:13 CET 2004


Paul,

I updated to latest CVS and ran updatedatabase on our test machine.  Then I
tested the 'quick search' MARC searching and I have some results and comments
listed below.


*Title Search (Single Term) 'cryptonomicon'
(8 seconds to return result to browser)

mysql> explain select distinct m1.bibid from biblio,biblioitems,marc_biblio,marc_word as m1 where biblio.biblionumber=marc_biblio.biblionumber and biblio.biblionumber=biblioitems.biblionumber and m1.bibid=marc_biblio.bibid and (m1.word  like 'cryptonomicon%' and m1.tag+m1.subfieldid in ('245a')) order by biblio.title; +-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
| table       | type   | possible_keys        | key      | key_len | ref                      | rows | Extra                                        |
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
| m1          | range  | bibid,word           | word     |     255 | NULL                     |    1 | Using where; Using temporary; Using filesort |
| marc_biblio | eq_ref | PRIMARY,biblionumber | PRIMARY  |       8 | m1.bibid                 |    1 | Using where; Distinct                        |
| biblio      | eq_ref | PRIMARY,blbnoidx     | PRIMARY  |       4 | marc_biblio.biblionumber |    1 | Distinct                                     |
| biblioitems | ref    | bibnoidx             | bibnoidx |       4 | biblio.biblionumber      |   12 | Using index; Distinct                        |
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
4 rows in set (0.00 sec)


*Title Search (Multi-term) 'the great escape'
(8 sec to return result to browser)

mysql> explain select distinct m1.bibid from biblio,biblioitems,marc_biblio,marc_word as m1,marc_word as m2 where biblio.biblionumber=marc_biblio.biblionumber and biblio.biblionumber=biblioitems.biblionumber and m1.bibid=marc_biblio.bibid and (m1.bibid=m2.bibid) and ((m1.word  like 'Great%' and m1.tag+m1.subfieldid in ('245a'))and (m2.word like 'Escape%' and m2.tag+m2.subfieldid in('245a'))) order by biblio.title;
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
| table       | type   | possible_keys        | key      | key_len | ref                      | rows | Extra                                        |
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
| m2          | range  | bibid,word           | word     |     255 | NULL                     |  431 | Using where; Using temporary; Using filesort |
| m1          | ref    | bibid,word           | bibid    |       8 | m2.bibid                 |  131 | Using where                                  |
| marc_biblio | eq_ref | PRIMARY,biblionumber | PRIMARY  |       8 | m1.bibid                 |    1 | Using where; Distinct                        |
| biblio      | eq_ref | PRIMARY,blbnoidx     | PRIMARY  |       4 | marc_biblio.biblionumber |    1 | Distinct                                     |
| biblioitems | ref    | bibnoidx             | bibnoidx |       4 | biblio.biblionumber      |   12 | Using index; Distinct                        |
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
5 rows in set (0.00 sec)

*Author Search (Single Term) 'stephenson'
(5 seconds to return result to browser)

mysql> explain select distinct m1.bibid from biblio,biblioitems,marc_biblio,marc_word as m1 where biblio.biblionumber=marc_biblio.biblionumber and biblio.biblionumber=biblioitems.biblionumber and m1.bibid=marc_biblio.bibid and (m1.word  like 'Stephenson%' and m1.tag+m1.subfieldid in ('100a')) order by biblio.title;
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
| table       | type   | possible_keys        | key      | key_len | ref                      | rows | Extra                                        |
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
| m1          | range  | bibid,word           | word     |     255 | NULL                     |   41 | Using where; Using temporary; Using filesort |
| marc_biblio | eq_ref | PRIMARY,biblionumber | PRIMARY  |       8 | m1.bibid                 |    1 | Using where; Distinct                        |
| biblio      | eq_ref | PRIMARY,blbnoidx     | PRIMARY  |       4 | marc_biblio.biblionumber |    1 | Distinct                                     |
| biblioitems | ref    | bibnoidx             | bibnoidx |       4 | biblio.biblionumber      |   12 | Using index; Distinct                        |
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
4 rows in set (0.01 sec)

* Author Search (Multi-string) 'John Ghrisham'
(6 seconds to return results to browser)

mysql> explain select distinct m1.bibid from biblio,biblioitems,marc_biblio,marc_word as m1,marc_word as m2 where biblio.biblionumber=marc_biblio.biblionumber and biblio.biblionumber=biblioitems.biblionumber and m1.bibid=marc_biblio.bibid and (m1.bibid=m2.bibid) and ((m1.word  like 'John%' and m1.tag+m1.subfieldid in ('100a'))and (m2.word like 'Grisham%' and m2.tag+m2.subfieldid in('100a'))) order by biblio.title;
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
| table       | type   | possible_keys        | key      | key_len | ref                      | rows | Extra                                        |
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
| m2          | range  | bibid,word           | word     |     255 | NULL                     |  169 | Using where; Using temporary; Using filesort |
| m1          | ref    | bibid,word           | bibid    |       8 | m2.bibid                 |  131 | Using where                                  |
| marc_biblio | eq_ref | PRIMARY,biblionumber | PRIMARY  |       8 | m1.bibid                 |    1 | Using where; Distinct                        |
| biblio      | eq_ref | PRIMARY,blbnoidx     | PRIMARY  |       4 | marc_biblio.biblionumber |    1 | Distinct                                     |
| biblioitems | ref    | bibnoidx             | bibnoidx |       4 | biblio.biblionumber      |   12 | Using index; Distinct                        |
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
5 rows in set (0.00 sec)


Recommendations:

Still can't do an 'Author' search on 'Grisham, John'.  Also there is not
a 'Subject' search available.

To improve the accuracy of the 'quick search' it should automatically
search the following MARC fields:
        Title-245a, 245b, 246a, 246b, 440a, 740a
        Author-100a, 110a, 700a, 710a,
        Subject-_all_ of the 600 fields


I still consider these searches to be slow even though the mysql is fast.
What's the bottleneck?

Also, NPL folks were wondering whether it's time to have another IRC
meeting to discuss MARC searching, retrieving results from the MARC
tables, and the general direction that development should take in the
coming months.

Thanks,

Joshua Ferraro




More information about the Koha-devel mailing list