[Koha-devel] biblioitems.marcxml & biblioitems.marc / HUGE performance issue !

Nicolas Legrand nicolas.legrand at bulac.fr
Wed Aug 31 09:46:34 CEST 2016


also, as a more than 1M biblios library, hurray for Jonathan :)

2016-08-31 0:58 GMT+02:00 David Cook <dcook at prosentient.com.au>:

> Excitement! I've been looking forward to patches for those bugs for years.
> I don't have the time to test them now, but hoping that someone does soon!
>
> Of course, moving biblioitems.marcxml will break a lot of SQL reports I'm
> sure, but I think it's worth it. The broken reports should generate noisy
> errors, so it's not like they would silently fail. And there comes a time
> where you just need to move on to something new.
>
>  I'm curious about our "table","deletedtable" methodology. I suppose by
> moving data to "deletedtable" for historical purposes, we're theoretically
> improving performance for "table" and making queries simple for active
> records. I think we lose data integrity sometimes (more so with issues and
> reserves than records) by moving data to historical tables. I suppose we're
> entrenched in that design now though, so discussion would be a bit moot.
>
> Anyway, hurray for Jonathan adding patches!
>
> David Cook
> Systems Librarian
> Prosentient Systems
> 72/330 Wattle St
> Ultimo, NSW 2007
> Australia
>
> Office: 02 9212 0899
> Direct: 02 8005 0595
>
>
> > -----Original Message-----
> > From: koha-devel-bounces at lists.koha-community.org [mailto:koha-devel-
> > bounces at lists.koha-community.org] On Behalf Of Jonathan Druart
> > Sent: Friday, 26 August 2016 7:39 PM
> > To: koha-devel at lists.koha-community.org
> > Subject: Re: [Koha-devel] biblioitems.marcxml & biblioitems.marc / HUGE
> > performance issue !
> >
> > For the record:
> >   Bug 10455 - remove redundant 'biblioitems.marc' field and
> >   Bug 17196 - Move marcxml out of the biblioitems table have now patches!
> >
> > 2016-07-12 17:43 GMT+01:00 Paul Poulain <paul.poulain at biblibre.com>:
> > > Hi all,
> > >
> > > Those days, we're working on a pretty large DB ( >1M biblio), for a
> > > customer that want to do many statistics on some fields.
> > > We discovered that something "simple" like:
> > > SELECT publicationyear, count(publicationyear) FROM biblioitems GROUP
> > > BY publicationyear;
> > >
> > > was giving no result in 10mn.
> > > This is a test DB, not optimized, but we were surprised by the results.
> > > After investigating we had the idea to create a biblioitems2 table
> > > with the same structure EXCEPT MARCXML and MARC fields
> > >
> > > launch the same SQL query : result in 3seconds !
> > > This could be reproduced on any query (on fields without index).
> > >
> > > I think it's because the innoDB is storing each line in one "object",
> > > so, even if you need only one column, you have to read everything.
> > > In our case, that was 12GB+ of data to read.
> > > biblioitems2 is just a few dozen MB.
> > > (all caching values are minimum and there's no index, so not involved
> > > in the
> > > results)
> > >
> > > MY CONCLUSIONS:
> > >  * the biblioitems.marc field must be removed quickly: it's useless
> > > since years, and is only resulting in slowing things
> > >  * the bilbioitems.marcxml field should be moved outside from this
> table.
> > > Something like biblio_blob, with biblionumber, biblioitemnumber and
> > marcxml.
> > > When we need it, just join the tables.
> > >
> > > I'm almost sure it would have an important impact on Koha, as
> > > biblioitems table is called and used "everywhere".
> > >
> > > any opinion ?
> > >
> > > --
> > > Paul Poulain, Associé-gérant / co-owner BibLibre, Services en
> > > logiciels libres pour les bibliothèques BibLibre, Open Source software
> > > and services for libraries
> > >
> > >
> > > _______________________________________________
> > > Koha-devel mailing list
> > > Koha-devel at lists.koha-community.org
> > > http://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
> > > website : http://www.koha-community.org/ git :
> > > http://git.koha-community.org/ bugs : http://bugs.koha-community.org/
> > _______________________________________________
> > Koha-devel mailing list
> > Koha-devel at lists.koha-community.org
> > http://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
> > website : http://www.koha-community.org/ git : http://git.koha-
> > community.org/ bugs : http://bugs.koha-community.org/
>
>
> _______________________________________________
> Koha-devel mailing list
> Koha-devel at lists.koha-community.org
> http://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
> website : http://www.koha-community.org/
> git : http://git.koha-community.org/
> bugs : http://bugs.koha-community.org/
>



-- 

*Nicolas Legrand*
Administration technique et développements du système de gestion de la
bibliothèque

[image: Logo BULAC]

Bibliothèque universitaire
des langues et civilisations

65 rue des Grands Moulins
F-75013 PARIS
T +33 1 81 69 *18 22*
www.bulac.fr
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.koha-community.org/pipermail/koha-devel/attachments/20160831/511b6de4/attachment.html>


More information about the Koha-devel mailing list