<html>

  <head>


    <meta http-equiv="content-type" content="text/html; charset=utf-8">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    Hi all,<br>

    <br>

    Those days, we're working on a pretty large DB ( >1M biblio), for

    a customer that want to do many statistics on some fields.<br>

    We discovered that something "simple" like:<br>

    SELECT publicationyear, count(publicationyear) FROM biblioitems

    GROUP BY publicationyear;<br>

    <br>

    was giving <b>no result in 10mn</b>.<br>

    This is a test DB, not optimized, but we were surprised by the

    results.<br>

    After investigating we had the idea to create a biblioitems2 table

    with the same structure <b>EXCEPT MARCXML and MARC fields<br>

      <br>

    </b>launch the same SQL query : <b>result in 3seconds</b> !<b><br>

    </b>This could be reproduced on any query (on fields without index).<br>

    <br>

    I think it's because the innoDB is storing each line in one

    "object", so, even if you need only one column, you have to read

    everything.<br>

    In our case, that was 12GB+ of data to read.<br>

    biblioitems2 is just a few dozen MB.<br>

    (all caching values are minimum and there's no index, so not

    involved in the results)<br>

    <br>

    MY CONCLUSIONS:<br>

     * the biblioitems.marc field must be removed quickly: it's useless

    since years, and is only resulting in slowing things<br>

     * the bilbioitems.marcxml field should be moved outside from this

    table. Something like biblio_blob, with biblionumber,

    biblioitemnumber and marcxml. When we need it, just join the tables.<br>

    <br>

    I'm almost sure it would have an important impact on Koha, as

    biblioitems table is called and used "everywhere".<br>

    <br>

    any opinion ?<br>

    <b></b>

    <pre class="moz-signature" cols="72">-- 

Paul Poulain, Associé-gérant / co-owner

BibLibre, Services en logiciels libres pour les bibliothèques

BibLibre, Open Source software and services for libraries</pre>

  </body>

</html>