[Koha-devel] Restructuring C4

Paul Poulain paul.poulain at biblibre.com
Wed Sep 14 22:54:51 CEST 2011


(This mail was sent 10 hours ago, but it seems it was too large. I am
sending it again without the attached docs.)

On 30/03/2011 16:47, Ian Walls wrote:
> Fellow Developers,
Hello everybody,
> Last night, I stayed up late running circ/circulation.pl
> through NYTProf, to get an idea where we may
> be able to optimize circulation for speed.  After much frustration
> (darn session IDs...), I was able to get a report.  The results
> were... not exactly what I expected.

Those of you who have used or developed on Koha 2.x have probably
noticed that Koha 3.x is much slower than Koha 2.x.
So, yesterday and tonight I took my laptop and ran a lot of tests.

First of all, I will be forever grateful to ... The New York Times ...
Why, you may ask? Because of
http://search.cpan.org/~adamk/Aspect-Library-NYTProf-1.00/lib/Aspect/Library/NYTProf.pm

This tool is really amazing. You run
perl -d:NYTProf mainpage.pl ; nytprofhtml --open
and you get a highly detailed HTML report with the time spent in each
line/sub/module while running mainpage.pl.
Here is the result: http://depot.biblibre.com/ppoulain/nytprof.ini/
(the .ini suffix marks the initial profile, taken before anything was
optimized)

Note: the tests were run on my laptop, which is not a server.
1st test with master:
Profile of mainpage.pl for 1.73s (of 2.18s), executing 155108 statements
and 41041 subroutine calls in 249 source files and 64 string evals.

I have looked into the longest timings to see whether we can do
something about them.
(Note: all the following improvements were applied incrementally.)

First of all, C4::Context and C4::Koha are loaded on every page, so they
are worth investigating.

== C4::Context ==
* 185ms is spent just reading the config file. Ouch! That's almost 10%
of the time!
(http://depot.biblibre.com/ppoulain/nytprof.ini/C4-Context-pm-13-line.html#230)
I tried hardcoding the hashref instead of XMLin-ing the config file,
and the results are astonishing:
after = Profile of mainpage.pl for 1.59s (of 1.93s), executing 117399
statements and 26939 subroutine calls in 241 source files and 63 string
evals.
=> PROPOSAL = add a YAML version of the XML config file that we could
use in parallel
=> PROPOSAL 2 = memcache the config file
(hint: the config file in Koha 2 was not XML!)

* C4::Context, sub db_scheme2dbi: the i flag in /mysql/i costs 84ms!
With /i the duration is 84.4ms; without it, 10µs!
after = Profile of mainpage.pl for 1.55s (of 1.91s), executing 117398
statements and 26938 subroutine calls in 241 source files and 63 string
evals.
=> PROPOSAL = just 'return "mysql"' at the beginning of this sub, as
nothing else is working anyway!
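
If support for other schemes is ever wanted, a cheaper check than the
case-insensitive regex could look like the sketch below. Only the sub
name db_scheme2dbi and the "mysql" value come from the findings above;
the body and the extra mapping entries are assumptions for illustration,
not the actual Koha code. Note that a hash lookup matches the whole
scheme name, whereas /mysql/i matches any string containing it.

```perl
use strict;
use warnings;

# Illustrative mapping from lower-cased scheme names to DBI driver names.
# Only "mysql" comes from the discussion above; the others are assumptions.
my %DRIVER_FOR = (
    mysql    => 'mysql',
    pg       => 'Pg',
    postgres => 'Pg',
);

# Sketch of a db_scheme2dbi that uses lc() and a hash lookup
# instead of a /mysql/i regex match.
sub db_scheme2dbi {
    my ($scheme) = @_;
    return 'mysql' unless defined $scheme;    # default, as in the proposal
    return $DRIVER_FOR{ lc $scheme } // 'mysql';
}

print db_scheme2dbi('MySQL'), "\n";
```

Whether this is measurably faster than the plain /mysql/ match would
need to be checked with the same NYTProf run.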

== C4::Languages ==
* C4::Languages::getAllLanguages 43ms => badly under-optimized MySQL
query: it returns all languages from language_subtag_registry and
language_descriptions even if only a few localisations are installed.
* C4::Languages::_build_languages_arrayref 15ms => same as for
getAllLanguages
I know there is a patch (from hdl) pending to improve this sub; I
haven't tested it yet
(http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=6000)
(hint: this sub appeared in 3.0, it was not in 2.2!)

== Other packages ==
* C4::Koha.pm: 15ms for use Business::ISBN, which is used in only 1 sub,
itself used in only a few pages (and in none if no cover images are
enabled, unless I'm wrong)
* C4::Letters: use MIME::Lite takes 26ms, which is useless 90% of the
time. (Why is C4::Letters loaded? Answer: because C4::Reserves is
loaded by Items.pm, which is loaded by Biblio.pm, which is loaded by
AuthoritiesMarc.pm, which is loaded (uselessly) by mainpage.pl.)
* CGI/Session/Driver/mysql.pm needs 79ms. I tested switching to
driver::File::Serializer (dropped to 21ms). This seems to work, but
probably requires more testing than I did.
* Date::Manip needs 46ms and is useless most of the time.
after = Profile of mainpage.pl for 1.31s (of 1.60s), executing 103643
statements and 25380 subroutine calls in 215 source files and 55 string
evals.
=> PROPOSAL = where it is worthwhile, we could replace the use XXXX by a
require XXXX just before the lines that need the package. A require is
executed at run time, so the package won't be loaded when it is not needed
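
A minimal sketch of this use-to-require change. The helper split_keywords
is hypothetical, and Text::ParseWords (a core module) only stands in for
a heavier package such as Business::ISBN or MIME::Lite so that the
example runs anywhere.

```perl
use strict;
use warnings;

# Hypothetical helper that needs a module on this code path only.
# Text::ParseWords stands in for a heavier, rarely needed package.
sub split_keywords {
    my ($line) = @_;
    require Text::ParseWords;    # loaded at run time, on the first call
    # require does not import anything, so use the fully qualified name.
    return Text::ParseWords::shellwords($line);
}

# %INC records every file that has been loaded so far.
sub is_loaded { return exists $INC{'Text/ParseWords.pm'} ? 'yes' : 'no' }

print "loaded before the call: ", is_loaded(), "\n";
my @words = split_keywords('koha "integrated library" system');
print "loaded after the call:  ", is_loaded(), "\n";
print scalar(@words), " words\n";
```

The trade-off is that require skips the import step that use performs,
so subs from the package must be called with their full name (or
imported by hand) after the switch.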

== mainpage.pl ==
mainpage.pl loads some authorities code but does nothing with it.
Commenting out the use C4::Authorities and all related code results in:
after = Profile of mainpage.pl for 1.27s (of 1.57s), executing 103000
statements and 25194 subroutine calls in 212 source files and 55 string
evals.

== Packages nested ==
* I also noticed that Auth.pm loads Members.pm (to retrieve patron
information and display it at the top right). Members.pm loads
Reserves, which loads Biblio.pm, Circulation.pm and Items.pm. Those
are the biggest packages to load.
I've written a script that displays, for each file in C4, which packages
are loaded and which subs of each loaded package are used. It shows, for
example, that there is a useless "use Koha.pm;" in AuthoritiesMarc.pm.
It also shows which subs are used, and how many times.
For example, Circulation.pm calls "use C4::Biblio" and:
        In Circulation.pm the Biblio.pm, sub GetBiblioItemData is used 1 times
        In Circulation.pm the Biblio.pm, sub GetBiblioFromItemNumber is used 4 times
That will be handy to find what can be cleaned up & removed.
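
The idea behind such a report can be sketched as follows. This is not
the attached check_subs.pl: the in-memory sources, the helper
count_sub_usage and the regexes are assumptions for illustration, and a
plain text match like this one also counts sub names that appear in
comments.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-ins for two files under C4/
# (not real Koha code).
my %source;
$source{Biblio} = <<'END_BIBLIO';
package C4::Biblio;
sub GetBiblioItemData { }
sub GetBiblioFromItemNumber { }
sub GetMarcBiblio { }
1;
END_BIBLIO
$source{Circulation} = <<'END_CIRCULATION';
package C4::Circulation;
use C4::Biblio;
sub AddIssue {
    my $item = GetBiblioFromItemNumber(1);
    my $data = GetBiblioItemData(2);
    return GetBiblioFromItemNumber(3);
}
1;
END_CIRCULATION

# For one package, list each C4 package it loads with "use" and count
# how often each sub defined in that loaded package appears in its source.
sub count_sub_usage {
    my ($name, $sources) = @_;
    my $code = $sources->{$name};
    my %report;
    for my $used ($code =~ /^\s*use\s+C4::(\w+)/mg) {
        next unless exists $sources->{$used};
        for my $sub ($sources->{$used} =~ /^\s*sub\s+(\w+)/mg) {
            my $count = () = $code =~ /\b\Q$sub\E\b/g;
            $report{$used}{$sub} = $count;
        }
    }
    return \%report;
}

my $report = count_sub_usage('Circulation', \%source);
for my $used (sort keys %$report) {
    for my $sub (sort keys %{ $report->{$used} }) {
        print "In Circulation.pm the $used.pm, sub $sub is used ",
            "$report->{$used}{$sub} times\n";
    }
}
```

A sub with a count of 0 in every caller is a candidate for removal, and
a "use" line whose subs all count 0 is a candidate useless load.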

I've attached the script and the result. If you want to run it yourself,
just export PERL5LIB and KOHA_CONF, put the script in the $KOHA
directory, and run it!

== Other ==
* the packages that use the most CPU are not C4 packages:
  * utf8_heavy.pl is the most CPU-consuming thing: 132ms. I don't know
if we can do something about it
  * Template::Parser needs 123ms, not sure we can do something about it
  * all the XML stuff is expensive: loading MARC::File::XML requires
47ms in Biblio.pm, and XML::Sax::Base requires 40ms to be loaded
=> definitely, the best way to reduce CPU time is by pre-loading things;
we already knew that

== Next step ==
* I'll file a bug for the various improvements that can be made easily
* I'll continue investigating & report my findings
I'm aware this is only a first step, and that the long-term road is to
reach data & code persistency. But cleaning up what we already have may
ease the next big step. And any speed improvement is a good thing, isn't
it?

HTH

-- 
Paul POULAIN
http://www.biblibre.com
Free Software expert for information and documentation
Tel : (33) 4 91 81 35 08
-------------- next part --------------
A non-text attachment was scrubbed...
Name: check_subs.pl
Type: application/x-perl-module
Size: 3072 bytes
Desc: not available
URL: </pipermail/koha-devel/attachments/20110914/d0a24a31/attachment-0001.bin>

