[Koha-bugs] [Bug 12478] Elasticsearch support for Koha

bugzilla-daemon at bugs.koha-community.org bugzilla-daemon at bugs.koha-community.org
Thu Aug 27 16:01:41 CEST 2015


http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=12478

--- Comment #79 from Jonathan Druart <jonathan.druart at bugs.koha-community.org> ---
Me again :)

So, I have tried to do some tests locally using your branch (OPAC biblio search
only).
The first problem I ran into was finding a MARC21 DB (since the UNIMARC mappings
are not defined, I cannot test with a UNIMARC DB).
I have used the one created for the sandboxes
(http://git.koha-community.org/gitweb/?p=contrib/global.git;a=blob;f=sandbox/sql/sandbox1.sql.gz;h=19268bccb43b2a33d5644b7d86cbb1abb323016b;hb=HEAD).
But there are only 436 biblios, which is not enough to test some things (facets,
for instance).
Or maybe you can share your DB?

Here are some notes:

1/ Add deps to C4/Installer/PerlDependencies.pm

2/ The number of tests provided is very low.

3/ catalyst/elastic_search is 1004 commits behind origin/master, please rebase

4/ The message "No 'elasticsearch' block is defined in koha-conf.xml" should be
raised before starting the indexation process, and not when committing the first
batch.
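
To illustrate the "fail early" idea, here is a rough sketch in Python (not the
actual Perl of rebuild_elastic_search.pl; rebuild_index, MissingConfigError and
the dict-shaped config are made up for the example). The check happens once,
before any record is read or committed:

```python
class MissingConfigError(Exception):
    """Raised when the required configuration block is absent."""


def rebuild_index(config, records, batch_size=1000):
    """Hypothetical indexer: validate the config first, then commit in batches."""
    # Fail fast: check the configuration before touching any record.
    if "elasticsearch" not in config:
        raise MissingConfigError(
            "No 'elasticsearch' block is defined in koha-conf.xml"
        )

    committed = 0
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        committed += len(batch)  # stand-in for the real commit call
    return committed
```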

5/ You really need to tune the default value for the commit :)
commit 100:  perl misc/search_tools/rebuild_elastic_search.pl -b  77.57s user 0.86s system 91% cpu 1:25.62 total
commit 1000: perl misc/search_tools/rebuild_elastic_search.pl -b  24.68s user 0.52s system 79% cpu 31.595 total
For Solr, we used 5000.
Yes I know, it's configurable.
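
For what it's worth, the effect of the commit size on the number of commits is
easy to see with a small Python sketch (batches and count_commits are
hypothetical helpers, not code from the branch):

```python
def batches(records, size):
    """Yield successive lists of at most `size` records."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Do not forget the last, partially filled batch.
        yield batch


def count_commits(total_records, size):
    """Number of commits needed to index `total_records` records."""
    return sum(1 for _ in batches(range(total_records), size))
```

With the 436 biblios of the sandbox DB, count_commits(436, 100) gives 5 commits
while count_commits(436, 1000) gives a single one.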

6/ Verbose does not work as expected, it could be fixed with
-    print $msg if ($verbose <= $level);
+    print $msg if ($verbose >= $level);
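
The intent, as I understand it, is that a message is printed when the requested
verbosity is at least the level of the message. The same logic as a Python
sketch (make_logger is a made-up name, the script itself is Perl):

```python
import io


def make_logger(verbose, out):
    """Return a log(level, msg) function writing to `out`."""
    def log(level, msg):
        # Print only if the requested verbosity reaches the message level.
        if verbose >= level:
            out.write(msg)
    return log


out = io.StringIO()
log = make_logger(1, out)
log(1, "shown at -v\n")
log(2, "shown only at -v -v\n")
```

With verbose set to 1, only the first message ends up in the output.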

7/ perl -e "use Pod::Checker;podchecker('misc/search_tools/rebuild_elastic_search.pl')";
*** WARNING: empty section in previous paragraph at line 36 in file misc/search_tools/rebuild_elastic_search.pl
*** ERROR: =over on line 38 without closing =back at line EOF in file misc/search_tools/rebuild_elastic_search.pl
8/ 2 occurrences of "Solr" reintroduced in installer/data/mysql/sysprefs.sql
and koha-tmpl/intranet-tmpl/prog/en/modules/admin/preferences/admin.pref

9/ Test!
I have launched some searches, with the same DB (the one from the sandbox).
On a local install using your remote branch, and on another one using master
(sandbox7 provided by BibLibre).

a. Search for 'd' (screenshot opac_search_for_d_sort_by_relevance.png, ES on
the left, Zebra on the right).
Main differences:
- 183 vs 182 results (?)
- the order is not the same (makes sense)
- Locations and Places facets are missing
- 6 entries are displayed in the facets for ES (current behavior is 5). 
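
The facet display I would expect could be sketched like this in Python
(visible_facets is a made-up helper; the limit of 5 is the current behavior
mentioned above, and I am assuming a 'Show more' link only makes sense when
entries are actually hidden):

```python
def visible_facets(entries, limit=5):
    """Return the entries to display and whether a 'Show more' link is needed."""
    shown = entries[:limit]
    # A 'Show more' link is only useful if something is hidden.
    show_more = len(entries) > limit
    return shown, show_more
```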

b. Search for 'd', sort by title AZ (screenshot
opac_search_for_d_sort_by_title.png)
- Zebra displays only 1 facet
- The order is still completely different

c. Search for 'harry', sort by title AZ (screenshot
opac_search_for_harry_sort_by_title.png)
- The 'Show more' link is displayed even if only 2 entries are available for a
facet
- The order is still different ("The discovery of heaven" should be sorted
either before "Dollhouse" (if "the" is a stopword) or after "Hareios*")
- The availability is wrong for ES (The item for Dollhouse is not for loan)
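
To make the two acceptable orders explicit, a small Python sketch (title_sort_key
and the English-only article list are assumptions for the example, not what
either engine really does):

```python
LEADING_ARTICLES = ("the ", "a ", "an ")  # assumption: English titles only


def title_sort_key(title, skip_articles=True):
    """Case-insensitive sort key, optionally ignoring a leading article."""
    key = title.casefold()
    if skip_articles:
        for article in LEADING_ARTICLES:
            if key.startswith(article):
                return key[len(article):]
    return key


titles = ["Hareios*", "The discovery of heaven", "Dollhouse"]
with_stopword = sorted(titles, key=title_sort_key)
without_stopword = sorted(titles, key=lambda t: title_sort_key(t, False))
```

If "the" is skipped, "The discovery of heaven" comes first, before "Dollhouse";
if not, it comes last, after "Hareios*". Any other position looks wrong to me.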

d. Search for Books (limit by item type in the adv search), sort by pubdate
(screenshot limit_by_book_sort_by_pubdate.png)
- "Return to the last advanced search" link is not displayed
- The item types facet contains several entries, which does not make sense
- The number of results differs significantly (395 vs 364)
- The order is still completely different. I had a look in the index and found:
"Pictura murală*" has "pubdate":"||||" (/_search?q=_id:39&pretty)
The Korean Go Association's learn to play go  "pubdate":"uuuu"
(/_search?q=_id:155&pretty)
Where do these values come from? Shouldn't it be a date, or at least an integer?
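
These look like the MARC21 008 date placeholders ('u' for an unknown digit, '|'
for no attempt to code), but that is only a guess on my side. Whatever the
source is, something like the following Python sketch (normalize_pubdate is a
hypothetical helper) would keep non-numeric values out of the sort field:

```python
def normalize_pubdate(raw):
    """Return a 4-digit year as an int, or None if it is not fully numeric."""
    value = (raw or "").strip()
    # Only accept exactly four ASCII digits; "uuuu", "||||" or "19uu" are dropped.
    if len(value) == 4 and value.isascii() and value.isdigit():
        return int(value)
    return None
```

Records for which this returns None could then be indexed without a pubdate
instead of being sorted on "||||" or "uuuu".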

It's not easy to know what is indexed where. Did you have a look at the indexes
configuration page the Solr stuff had?
It provided an interface to configure the different mappings, it was very
useful.
