[Koha-zebra] Re: Import Speed

Joshua Ferraro jmf at liblime.com
Thu Mar 2 22:29:54 CET 2006


On Thu, Mar 02, 2006 at 04:40:16PM +0000, Mike Taylor wrote:
> > Date: Thu, 2 Mar 2006 07:44:22 -0800
> > From: Joshua Ferraro <jmf at liblime.com>
> > 
> There's your culprit, then.  You're spending 39751 of your 40604
> seconds doing needless searches, and 853 seconds (14 minutes) doing
> the actual updates.  Rip out the searches and you should get a 47-fold
> speed increase.
> 
> Why are you doing the search?  So far as I can see, it's just a probe to
> see whether the connection is still alive.  But you don't need to do
> that: just go ahead and submit the update request, you'll find out
> soon enough if the connection's dead and you can re-forge it then if
> necessary.

Here's what the connection manager looks like now:

        # Reuse the cached connection if there is one ...
        if (defined($context->{"Zconn"})) {
                $Zconn = $context->{"Zconn"};
                return $context->{"Zconn"};
        } else {
                # ... otherwise open a new one and cache it.
                $context->{"Zconn"} = &new_Zconn();
                return $context->{"Zconn"};
        }

So there is no search any more: if a connection is defined it is
simply returned, and if it is no longer alive I assume the app will
just crash (there is no fault tolerance built into the script).
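
For what it's worth, the "re-forge it then if necessary" idea could be
sketched roughly like this. This is only an illustration, not what is in
the script: get_zconn, with_zconn, and the $connect / $action callbacks
are hypothetical names, and %context stands in for the real $context.
It assumes a dead connection shows up as an exception (die) from the
update call.

```perl
use strict;
use warnings;

# Stand-in for the real $context hash (assumption for this sketch).
my %context;

# Return the cached connection, creating it on first use.
# $connect is a hypothetical factory, playing the role of new_Zconn().
sub get_zconn {
    my ($connect) = @_;
    $context{Zconn} = $connect->() unless defined $context{Zconn};
    return $context{Zconn};
}

# Run $action with the cached connection. If it dies, drop the cached
# connection, reconnect once, and retry. A second failure propagates.
sub with_zconn {
    my ($connect, $action) = @_;
    my $result;
    my $ok = eval { $result = $action->(get_zconn($connect)); 1 };
    return $result if $ok;
    warn "connection failed ($@), reconnecting\n";
    delete $context{Zconn};
    return $action->(get_zconn($connect));
}
```

The point is that nothing is probed up front: the update itself is the
liveness check, and the reconnect cost is only paid when it fails.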

And here's the new benchmark for those 5000 records:

5000 MARC records imported in 7727.84231996536 seconds

dprofpp tmon.out
Exporter::export_ok_tags has -1 unstacked calls in outer
AutoLoader::AUTOLOAD has -1 unstacked calls in outer
Exporter::Heavy::heavy_export has 12 unstacked calls in outer
bytes::AUTOLOAD has -1 unstacked calls in outer
Exporter::Heavy::heavy_export_ok_tags has 1 unstacked calls in outer
POSIX::__ANON__ has 1 unstacked calls in outer
POSIX::load_imports has 1 unstacked calls in outer
Exporter::export has -12 unstacked calls in outer
utf8::AUTOLOAD has -1 unstacked calls in outer
utf8::SWASHNEW has 1 unstacked calls in outer
Storable::thaw has 1 unstacked calls in outer
bytes::length has 1 unstacked calls in outer
POSIX::AUTOLOAD has -2 unstacked calls in outer
Total Elapsed Time = 6617.861 Seconds
  User+System Time = 706.1013 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 21.4   151.3 817.46 103492   0.0001 0.0008  MARC::Charset::marc8_to_utf8
 18.0   127.3 416.36 126313   0.0000 0.0000  MARC::Charset::Table::get_code
 17.1   121.0 121.08 126295   0.0000 0.0000  Storable::mretrieve
 10.9   77.27  0.000 126295   0.0000 0.0000  Storable::thaw
 10.1   71.52 71.521 126313   0.0000 0.0000  SDBM_File::FETCH
 8.42   59.48 117.80 252590   0.0000 0.0000  Class::Accessor::__ANON__
 8.26   58.31 58.317 252590   0.0000 0.0000  Class::Accessor::get
 7.21   50.88 467.251 126313  0.0000 0.0000  MARC::Charset::Table::lookup_by_marc8
 6.15   43.39 97.718 126295   0.0000 0.0000  MARC::Charset::Code::char_value
 4.87   34.35 34.354 126295   0.0000 0.0000  MARC::Charset::_process_escape
 2.71   19.10 19.101 126313   0.0000 0.0000  MARC::Charset::Table::db
 2.26   15.98 30.245 728288   0.0000 0.0000  MARC::Record::field
 2.10   14.79 14.794 802346   0.0000 0.0000  MARC::Field::tag
 1.94   13.69 857.27  25241   0.0005 0.0340  MARC::File::XML::record
 1.44   10.15 11.456 714137   0.0000 0.0000  MARC::Field::subfields

So it's definitely better without the search, but there is still
the question of XML: the profile is dominated by
MARC::Charset::marc8_to_utf8 and the table lookups underneath it.
Being able to import raw MARC (which would only take a few seconds)
would be really nice ...
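
To put the numbers above side by side, here is a small back-of-envelope
calculation. It only uses figures already quoted in this thread (the
40604-second run from the earlier message, the new wall-clock time, and
the two dprofpp totals); the variable names are just for illustration.

```perl
use strict;
use warnings;

my $before_secs  = 40604;             # earlier run, with the probe searches
my $after_secs   = 7727.84231996536;  # this run, searches removed
my $elapsed_secs = 6617.861;          # dprofpp "Total Elapsed Time"
my $cpu_secs     = 706.1013;          # dprofpp "User+System Time"

# Measured improvement from dropping the searches.
my $speedup = $before_secs / $after_secs;

# Share of the profiled elapsed time that was actually spent on CPU.
my $cpu_pct = 100 * $cpu_secs / $elapsed_secs;

printf "speedup: %.1fx\n", $speedup;
printf "CPU share of elapsed time: %.1f%%\n", $cpu_pct;
```

So the measured gain is roughly 5x rather than 47x, and only about a
tenth of the elapsed time shows up as User+System time in the profile;
the rest is presumably spent waiting on something outside the script.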

> (Mind you, 14 minutes still seems very slow for 5000 poxy records.  I
> think there are bulk-update cache issues going on here as well.)

-- 
Joshua Ferraro               VENDOR SERVICES FOR OPEN-SOURCE SOFTWARE
President, Technology       migration, training, maintenance, support
LibLime                                Featuring Koha Open-Source ILS
jmf at liblime.com |Full Demos at http://liblime.com/koha |1(888)KohaILS




