[Koha-devel] 16.05, zebra and jessie
David Cook
dcook at prosentient.com.au
Wed Aug 31 01:42:52 CEST 2016
I don't doubt that I've missed a ton of words! Between baby and competing projects, I haven't had as much time to keep up with the bleeding edge.
I want to be one of those people using and fixing it, although - as you say - everyone has their own priorities.
It's the search engine that Koha deserves, but I reckon many of us are busy right now :/.
Thanks for the update though, Chris.
David Cook
Systems Librarian
Prosentient Systems
72/330 Wattle St
Ultimo, NSW 2007
Australia
Office: 02 9212 0899
Direct: 02 8005 0595
> -----Original Message-----
> From: Chris Cormack [mailto:chrisc at catalyst.net.nz]
> Sent: Wednesday, 31 August 2016 9:28 AM
> To: David Cook <dcook at prosentient.com.au>
> Cc: 'Tomas Cohen Arazi' <tomascohen at gmail.com>; 'Barton Chittenden'
> <barton at bywatersolutions.com>; 'Jonathan Druart'
> <jonathan.druart at bugs.koha-community.org>;
> koha-devel at lists.koha-community.org
> Subject: Re: [Koha-devel] 16.05, zebra and jessie
>
> * David Cook (dcook at prosentient.com.au) wrote:
> >
> >
> >
> > I suppose Adam at IndexData has been busy with the FOLIO project, so I
> > doubt he has time to work on Zebra these days, even if we did have a
> > patch available.
> >
> >
> >
> > Is ElasticSearch usable with Koha at this point? I heard a lot in
> > 2015, but after Robin left I haven’t heard a word other than rumours
> > that the patches had been pushed?
>
> You missed tons of words then :)
>
> Yes, it is in 16.05, marked experimental, it works, mostly. But it will only get
> better with more people using and fixing it.
> The next task is to update the version it works with to Elastic 2. That isn't a
> huge amount of work, but everyone has their own priorities and a lot of us
> have to work on what users ask for (not what users need ;))
>
> Chris
>
> >
> > Subject: Re: [Koha-devel] 16.05, zebra and jessie
> >
> >
> >
> > I have seen use_zebra_facets=1 cause no facets to be rendered when GRS-1
> > configuration files are kept during upgrades up to where GRS-1 got
> > deprecated (3.20?). Is that the case here? What does the About > System
> > information page say about your config?
> >
> >
> >
> > The slowness is not in zebra per se, but in the way we retrieve the
> > facets from it (so Koha/Perl side). We retrieve one facet at a time
> > instead of fetching them all in one call. And they come in XML format,
> > so they need to be parsed. So, if anyone is willing to improve it, they
> > just need to optimize this function (read the TODO):
> >
> >
> >
> > sub _get_facet_from_result_set {
> >     my $facet_idx = shift;
> >     my $rs        = shift;
> >     my $sep       = shift;
> >
> >     my $internal_sep  = '<*>';
> >     my $facetMaxCount = C4::Context->preference('FacetMaxCount') // 20;
> >
> >     return if ( ! defined $facet_idx || ! defined $rs );
> >
> >     # zebra's facet element, untokenized index
> >     my $facet_element = 'zebra::facet::' . $facet_idx . ':0:' . $facetMaxCount;
> >     # configure zebra results for retrieving the desired facet
> >     $rs->option( elementSetName => $facet_element );
> >     # get the facet record from result set
> >     my $facet = $rs->record( 0 )->raw;
> >     # if the facet has no results...
> >     return if !defined $facet;
> >
> >     # TODO: benchmark DOM vs. SAX performance
> >     my $facet_dom = XML::LibXML->load_xml(
> >         string => ($facet)
> >     );
> >     my @terms = $facet_dom->getElementsByTagName('term');
> >     return if ! @terms;
> >
> >     my $facets = {};
> >     foreach my $term ( @terms ) {
> >         my $facet_value = $term->textContent;
> >         $facet_value =~ s/\Q$internal_sep\E/$sep/ if defined $sep;
> >         $facets->{ $facet_value } = $term->getAttribute( 'occur' );
> >     }
> >
> >     return $facets;
> > }
> >
> >
> >
> > Another option would be to make _get_facets_from_zebra build the
> > element set containing all facets so they are read in one call
> > (comma-separate all elements). The problem is that Zebra returns zero
> > if one of the elements is empty. Jared proposed to create a ghost
> > record with all facet fields. I didn't manage to make it work. Another
> > option is to patch Zebra. I started that, but abandoned once the ES code
> > got pushed.
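
For anyone wanting to experiment with the "one call" idea above, here is a minimal sketch of building a single element set name for several facet indexes. The helper name build_facet_element_set and the index names are hypothetical, and the exact multi-facet syntax (prefix once, then comma-separated index:type:count entries) is an assumption that should be checked against the Zebra documentation. It does nothing about the "zero if one of the elements is empty" problem.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Koha. The combined syntax below is an
# assumption: one 'zebra::facet::' prefix, then comma-separated
# index:type:count entries, mirroring the single-facet string used in
# _get_facet_from_result_set.
sub build_facet_element_set {
    my ( $max_count, @indexes ) = @_;
    return if !@indexes;    # nothing to ask for
    my @parts = map { join( ':', $_, 0, $max_count ) } @indexes;
    return 'zebra::facet::' . join( ',', @parts );
}

# Illustrative index names only.
my $element_set = build_facet_element_set( 20, qw( au su-to itype ) );
print "$element_set\n";
```

The resulting string would be passed to $rs->option( elementSetName => ... ) once, instead of once per facet.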
> >
> >
> >
> > So, if use_zebra_facets=0 is good enough, maybe it should be
> > recommended. The problem is that these are not real facets, just the
> > values extracted from the fields of the first x records.
> >
> > As I said, it could be good enough anyway.
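
To illustrate why facets taken from the first x records are not real facets, here is a rough sketch. It is not Koha's actual code: the helper facets_from_first_records and the record layout (a hash of field name to list of values) are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: count field values, but only in the first $limit
# records of the result set, the way a non-Zebra facet extraction would.
sub facets_from_first_records {
    my ( $records, $limit ) = @_;
    my %facets;
    my $seen = 0;
    foreach my $record ( @{$records} ) {
        last if $seen++ >= $limit;
        foreach my $field ( keys %{$record} ) {
            $facets{$field}{$_}++ for @{ $record->{$field} };
        }
    }
    return \%facets;
}

# Made-up result set of three records.
my @records = (
    { itype => ['BK'],    location => ['MAIN'] },
    { itype => ['BK'],    location => ['BRANCH'] },
    { itype => ['A-DOC'], location => ['MAIN'] },
);

my $facets = facets_from_first_records( \@records, 2 );
printf "%s: %d\n", $_, $facets->{itype}{$_} for sort keys %{ $facets->{itype} };
```

With a limit of 2, the 'A-DOC' item type in the third record never shows up as a facet, and the counts only describe the records that were read, not the whole result set.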
> >
> >
> >
> > Regards
> >
> >
> >
> >
> >
> > On Tue, 23 Aug 2016 at 10:21, Barton Chittenden (<
> > barton at bywatersolutions.com>) wrote:
> >
> > Zebra tends to be I/O bound -- we've seen it write enormous .zrs files to
> > disk (~16G/query on large libraries). Bug 13665 mentions that searches
> > could be taking upwards of 40 seconds to complete -- I think that we've
> > seen searches time out and return no results at about 1 minute.
> >
> >
> >
> > Is it possible to tune Zebra's space/time optimizations in any way so that
> > it doesn't write such large files to disk?
> >
> >
> >
> > On Tue, Aug 23, 2016 at 5:38 AM, Jonathan Druart <
> > jonathan.druart at bugs.koha-community.org> wrote:
> >
> > See bug 13665 - Retrieve facets from zebra is slow
> > to understand why and when use_zebra_facets=1 is slow
> >
> >
> > 2016-08-22 21:31 GMT+01:00 Barton Chittenden <
> > barton at bywatersolutions.com>:
> > > I haven't run into the issue with the dashes in idzebra-2.0 2.0.59,
> > > but I have run into this, when using ICU-Chains:
> > >
> > > Bug 16581 : ICU tokenization bug in idzebra-2.0 2.0.59-1
> > > URL : https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=16581
> > > Priority : P5 - low
> > > Urgency : enhancement
> > > Status : NEW
> > >
> > > I also know that when use_zebra_facets was first introduced, it was
> > > *very* slow -- I can't find any bugs about that though. It's possible
> > > that it got so slow under idzebra-2.0 2.0.61 that the searches are
> > > timing out.
> > >
> > > It should be possible to set the logging for zebra so that you can
> > > see the PQF queries:
> > >
> > > See
> > >
> > > Bug 15714 : Remove zebra.log from debian scripts and add optional log
> > > levels
> > > URL : https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=15714
> > > Priority : P5 - low
> > > Urgency : enhancement
> > > Status : RESOLVED
> > >
> > > For setting the log levels
> > >
> > > And
> > > http://koha.1045719.n5.nabble.com/Improving-Zebra-logging-td5861827.html
> > >
> > > For a general discussion of how to use them.
> > >
> > > ... This should give you some idea of what's failing, both in terms
> > > of the dashes in 2.0.59 and the non-functional zebra facets under
> > > 2.0.61.
> > >
> > > My general feeling is that 2.0.59 is irredeemably broken by bug
> > > 16581, and we need at least 2.0.60, but I don't have any experience
> > > with zebra facets.
> > >
> > > --Barton
> > >
> > >
> > >
> > > On Mon, Aug 22, 2016 at 2:49 PM, Mark Tompsett <mtompset at hotmail.com>
> > > wrote:
> > >>
> > >> Greetings,
> > >>
> > >> Similar problem. I hope someone has a better solution than setting
> > >> it to 0.
> > >>
> > >> GPML,
> > >> Mark Tompsett
> > >>
> > >> -----Original Message-----
> > >> From: Philippe Blouin
> > >> Sent: Monday, August 22, 2016 2:40 PM
> > >> To: koha-devel at lists.koha-community.org
> > >> Subject: [Koha-devel] 16.05, zebra and jessie
> > >>
> > >> Hello!
> > >>
> > >> We're trying to find the correct combination. We're new on Jessie,
> > >> so we still have some tweaking needed...
> > >>
> > >> - By default, we get zebra 2.00.59 installed on Jessie through the
> > >> packages.
> > >> - On 16.05, we get some very bad results in the search when the
> > >> itemtype contains a hyphen (-), like 'A-DOC'.
> > >> - So we installed zebra 2.00.62. This fixes the search...
> > >> - But now we do not have facets.
> > >> - So we set <use_zebra_facets>0</use_zebra_facets>
> > >> - And now we have facets. But this feels... wrong?
> > >>
> > >> My dummy question: what is the supposedly correct version of Zebra
> > >> on Jessie ?
> > >> And were we correct in setting the config to 0 ?
> > >>
> > >> Thanks
> > >> Blou
> > >> _______________________________________________
> > >> Koha-devel mailing list
> > >> Koha-devel at lists.koha-community.org
> > >> http://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
> > >> website : http://www.koha-community.org/
> > >> git : http://git.koha-community.org/
> > >> bugs : http://bugs.koha-community.org/
> > >>
> > >
> > >
> > >
> >
> >
> >
> >
> > --
> >
> > Tomás Cohen Arazi
> >
> > Theke Solutions (https://theke.io)
> > ✆ +54 9351 3513384
> > GPG: B2F3C15F
> >
>
>
>
> --
> Chris Cormack
> Catalyst IT Ltd.
> +64 4 803 2238
> PO Box 11-053, Manners St, Wellington 6142, New Zealand