[Koha-devel] 16.05, zebra and jessie

Jonathan Druart jonathan.druart at bugs.koha-community.org
Mon Sep 5 10:38:51 CEST 2016


Philippe,

If you need help testing something, please ask on the ML or #koha;
the same goes if you need documentation to set something up. One of us
might have the answer.

Going in the same direction at the same time is certainly one of the
biggest problems we have in the Koha community.
To get an overview of what we (the devs) are currently doing, my
advice is to read the "what's on" email I have been sending for a few
months. The goal is to let people who want to help know where they
can help. In the medium term, I hope it will help concentrate
efforts on the same problems and help us all focus on the
same things.

Cheers,
Jonathan

2016-08-31 13:17 GMT+01:00 Philippe Blouin <philippe.blouin at inlibro.com>:
> We had time and resources available for a while.  We got Bouzid to learn
> Elastic Search, learn about its integration in Koha, then wanted him to
> contribute.  But we were told twice to ... wait.
>
> We certainly have no time anymore.  We'll be back, but probably not until
> 2017 unless sponsored.
>
> Now, may I suggest something?  Documentation.  Not user friendly, but at
> least dev-enlightening.
>
> "You" guys put time in there, as with plack and memcache and whatnot, but if
> you're on the outside, you have no way of getting a head start besides
> reading some 1200 bugzilla comments in chronological order to know what
> the status is, how to get it to work on a git, etc...  (my ref to plack is not
> accidental).
>
> You can put tens of hours alone into the next bug and then wonder why nobody
> comes in when you call for testers, or invest a few hours in documenting a
> quick way for dumbasses like me to catch up.
>
> Now, everyone is constrained by time, and I apologize if the comment above
> is frustrating to read.  I don't want to be rude, and I appreciate all the
> hours everyone puts in to improve the product while I don't or can't.  Please
> see that just as my opinion on why different big devs don't get as many
> helpers as they deserve.
>
>
>
> Philippe Blouin,
> Responsable du développement informatique
>
> Tél.  : (888) 604-2627
> philippe.blouin at inLibro.com
>
> inLibro | pour esprit libre | www.inLibro.com
> On 08/31/2016 03:55 AM, Jonathan Druart wrote:
>
> 2016-08-31 0:19 GMT+01:00 David Cook <dcook at prosentient.com.au>:
>
> Is ElasticSearch usable with Koha at this point? I heard a lot in 2015, but
> after Robin left I haven’t heard a word other than rumours that the patches
> had been pushed?
>
> It's pushed and testable.
> It does not work for new installs (bug 16838) and is broken if
> OpacSuppression is set (bug 16660). Nobody seems to take care of these
> bugs (I guess I should).
> Improvements have been submitted but don't get a lot of attention from signoffers:
> Bug 14567 - Browse interface for OPAC
> Bug 14899 - Mapping configuration page for Elastic search
>
> Cheers,
> Jonathan
>
> David Cook
>
> Systems Librarian
>
> Prosentient Systems
>
> 72/330 Wattle St
>
> Ultimo, NSW 2007
>
> Australia
>
>
>
> Office: 02 9212 0899
>
> Direct: 02 8005 0595
>
>
>
> From: koha-devel-bounces at lists.koha-community.org
> [mailto:koha-devel-bounces at lists.koha-community.org] On Behalf Of Tomas
> Cohen Arazi
> Sent: Tuesday, 23 August 2016 11:42 PM
> To: Barton Chittenden <barton at bywatersolutions.com>; Jonathan Druart
> <jonathan.druart at bugs.koha-community.org>
> Cc: koha-devel at lists.koha-community.org
> Subject: Re: [Koha-devel] 16.05, zebra and jessie
>
>
>
> I have seen use_zebra_facets=1 cause no facets to be rendered when GRS-1
> configuration files are kept during upgrades up to where GRS-1 got
> deprecated (3.20?). Is that the case? What does the About > System information
> page say about your config?
>
>
>
> The slowness is not in zebra per se, but in the way we retrieve the facets
> from it (so on the Koha/Perl side). We retrieve one facet at a time instead of
> fetching them all in one call. And they come in XML format, so they need to be
> parsed. So, if anyone is willing to improve it, they just need to optimize this
> function (read the TODO):
>
>
>
> sub _get_facet_from_result_set {
>
>     my $facet_idx = shift;
>     my $rs        = shift;
>     my $sep       = shift;
>
>     my $internal_sep  = '<*>';
>     my $facetMaxCount = C4::Context->preference('FacetMaxCount') // 20;
>
>     return if ( ! defined $facet_idx || ! defined $rs );
>     # zebra's facet element, untokenized index
>     my $facet_element = 'zebra::facet::' . $facet_idx . ':0:' . $facetMaxCount;
>     # configure zebra results for retrieving the desired facet
>     $rs->option( elementSetName => $facet_element );
>     # get the facet record from result set
>     my $facet = $rs->record( 0 )->raw;
>     # if the facet has no results...
>     return if !defined $facet;
>     # TODO: benchmark DOM vs. SAX performance
>     my $facet_dom = XML::LibXML->load_xml(
>       string => ($facet)
>     );
>     my @terms = $facet_dom->getElementsByTagName('term');
>     return if ! @terms;
>
>     my $facets = {};
>     foreach my $term ( @terms ) {
>         my $facet_value = $term->textContent;
>         $facet_value =~ s/\Q$internal_sep\E/$sep/ if defined $sep;
>         $facets->{ $facet_value } = $term->getAttribute( 'occur' );
>     }
>
>     return $facets;
> }
>
>
>
> Another option would be to make _get_facets_from_zebra build the element set
> containing all facets so they are read in one call (comma-separate all
> elements). The problem is that Zebra returns zero if one of the elements is
> empty. Jared proposed to create a ghost record with all facet fields. I
> didn't manage to make it work. Another option is to patch Zebra. I started
> that, but abandoned it once the ES code got pushed.
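
A minimal sketch of the "one call" idea above, for illustration only. build_facet_element_set is a hypothetical helper, not something in Koha, and the index names are just examples. It assumes Zebra accepts the zebra::facet:: prefix once, followed by comma-separated index:type:count specs; that should be checked against the Zebra documentation before relying on it. It only builds the element set name and does nothing about the empty-facet problem mentioned above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Koha): build a single Zebra element set
# name covering several facet indexes, so that one record fetch could
# return all of them instead of one fetch per facet.
sub build_facet_element_set {
    my ( $facet_indexes, $max_count ) = @_;
    return if !$facet_indexes || !@$facet_indexes;

    # Same per-facet spec as in _get_facet_from_result_set: <index>:0:<max>
    my @specs = map { join( ':', $_, 0, $max_count ) } @$facet_indexes;

    # Assumption: the prefix appears once, followed by comma-separated specs
    return 'zebra::facet::' . join( ',', @specs );
}

# Example index names, chosen only for illustration
print build_facet_element_set( [ 'au', 'su-to', 'itype' ], 20 ), "\n";
```

The resulting string would be passed to $rs->option( elementSetName => ... ) in place of the per-facet value, and the returned XML would then have to be split per facet before counting terms.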
>
>
>
> So, if use_zebra_facets=0 is good enough, maybe it should be recommended.
> The problem is that these are not real facets, just an extraction of the
> fields from the first x records.
>
> As I said, it could be good enough anyway.
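
To make the difference concrete, here is a rough sketch with made-up data. count_field_values is a hypothetical helper, not Koha code. With a limit it counts field values in the first x records only (roughly what use_zebra_facets=0 gives, as described above); without one it counts across the whole result set, which is what a real facet would report.

```perl
use strict;
use warnings;

# Count the values of one field across a list of records.
# $limit (optional) restricts counting to the first $limit records,
# mimicking facets built from only the first x results.
sub count_field_values {
    my ( $records, $field, $limit ) = @_;

    my $last = $#{$records};
    $last = $limit - 1 if defined $limit && $limit - 1 < $last;

    my %counts;
    for my $record ( @{$records}[ 0 .. $last ] ) {
        $counts{$_}++ for @{ $record->{$field} // [] };
    }
    return \%counts;
}

# Made-up result set: six records, faceting on the item type
my @records = map { +{ itype => [$_] } } qw( BK BK BK DVD A-DOC A-DOC );

my $sampled = count_field_values( \@records, 'itype', 3 );    # first 3 only
my $full    = count_field_values( \@records, 'itype' );       # whole set

for my $value ( sort keys %$full ) {
    printf "%-6s sampled=%d full=%d\n",
        $value, $sampled->{$value} // 0, $full->{$value};
}
```

With this data the sampled counts only ever see BK, so DVD and A-DOC would not show up as facet values at all even though they are in the result set.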
>
>
>
> Regards
>
>
>
>
>
> El mar., 23 ago. 2016 a las 10:21, Barton Chittenden
> (<barton at bywatersolutions.com>) escribió:
>
> Zebra tends to be I/O bound -- we've seen it write enormous .zrs files to
> disk (~16G/query on large libraries). Bug 13665 mentions that searches could
> be taking upwards of 40 seconds to complete -- I think that we've seen
> searches time out and return no results at about 1 minute.
>
>
>
> Is it possible to tune Zebra's space/time optimizations in any way so that
> it doesn't write such large files to disk?
>
>
>
> On Tue, Aug 23, 2016 at 5:38 AM, Jonathan Druart
> <jonathan.druart at bugs.koha-community.org> wrote:
>
> See bug 13665 - Retrieve facets from zebra is slow
> to understand why and when use_zebra_facets=1 is slow
>
>
> 2016-08-22 21:31 GMT+01:00 Barton Chittenden <barton at bywatersolutions.com>:
>
> I haven't run into the issue with the dashes in idzebra-2.0 2.0.59, but I
> have run into this, when using ICU-Chains:
>
> Bug 16581 : ICU tokenization bug in idzebra-2.0 2.0.59-1
> URL       :
> https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=16581
> Priority  : P5 - low
> Urgency   : enhancement
> Status    : NEW
>
> I also know that when use_zebra_facets was first introduced, it was *very*
> slow -- I can't find any bugs about that though. It's possible that it got
> so slow under idzebra-2.0 2.0.61 that the searches are timing out.
>
> It should be possible to set the logging for zebra so that you can see the
> PQF queries:
>
> See
>
> Bug 15714 : Remove zebra.log from debian scripts and add optional log
> levels
> URL       :
> https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=15714
> Priority  : P5 - low
> Urgency   : enhancement
> Status    : RESOLVED
>
> For setting the log levels
>
> And
> http://koha.1045719.n5.nabble.com/Improving-Zebra-logging-td5861827.html
>
> For a general discussion of how to use them.
>
> ... This should give you some idea of what's failing, both in terms of the
> dashes in 2.0.59 and the non-functional zebra facets under 2.0.61.
>
> My general feeling is that 2.0.59 is irredeemably broken by bug 16581, and
> we need at least 2.0.60, but I don't have any experience with zebra
> facets.
>
> --Barton
>
>
>
> On Mon, Aug 22, 2016 at 2:49 PM, Mark Tompsett <mtompset at hotmail.com>
> wrote:
>
> Greetings,
>
> Similar problem. I hope someone has a better solution than setting it to
> 0.
>
> GPML,
> Mark Tompsett
>
> -----Original Message-----
> From: Philippe Blouin
> Sent: Monday, August 22, 2016 2:40 PM
> To: koha-devel at lists.koha-community.org
> Subject: [Koha-devel] 16.05, zebra and jessie
>
> Hello!
>
> We're trying to find the correct combination.  We're new on Jessie,
> so we still have some tweaking needed...
>
> - By default, we get zebra 2.0.59 installed on Jessie through the
> packages.
> - On 16.05, we get some very bad results in the search when the itemtype
> contains a hyphen (-), like 'A-DOC'.
> - So we installed zebra 2.0.62.  This fixes the search...
> - But now we do not have facets.
> - So we set <use_zebra_facets>0</use_zebra_facets>
> - And now we have facets.  But this feels... wrong?
>
> My dummy question: what is the supposedly correct version of Zebra on
> Jessie?
> And were we correct in setting the config to 0?
>
> Thanks
> Blou
> _______________________________________________
> Koha-devel mailing list
> Koha-devel at lists.koha-community.org
> http://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
> website : http://www.koha-community.org/
> git : http://git.koha-community.org/
> bugs : http://bugs.koha-community.org/
>
> --
>
> Tomás Cohen Arazi
>
> Theke Solutions (https://theke.io)
> ✆ +54 9351 3513384
> GPG: B2F3C15F

