[Koha-zebra] Re: Import Speed

Mike Taylor mike at miketaylor.org.uk
Thu Mar 2 17:29:32 CET 2006


> Date: Thu, 02 Mar 2006 11:05:44 -0500
> From: Sebastian Hammer <quinn at indexdata.com>
> 
> Importing records one at a time when first building a database, or
> when doing a batch update that is a substantial percentage of the
> size of the database is not a good idea. The software has no way to
> optimize the layout of the index files, so for each record update,
> things get shuffled around, resulting in very sluggish update
> performance and a less-than-ideal layout inside the index files.

Sure, but ...

> It would be highly advisable to do at least the initial import from
> the command-line. I think it would make a lot of sense if this could
> be done well from the protocol, but AFAIK, the extended service
> interface at the moment only allows you to insert one record at a
> time.

But -- ??  What magic does the command-line import have access to that
ZOOM update doesn't?  Clearly it's using some kind of in-memory
caching to hugely reduce the frequency of disk-writes, but why
shouldn't that also be used when doing a ZOOM update?  Isn't that (part
of) the purpose of delaying the "commit" call?  If not, then we need
to add $conn->option("updateCacheSize" => 100*1024*1024);
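
To make the caching argument concrete, here is a toy model in Python.
It is only a sketch, not Zebra's or ZOOM's actual code: count_flushes
and cache_limit are made-up names, and the cache is measured in
records rather than bytes.  It counts how many disk flushes happen
when updates are buffered in memory and written out only when the
buffer fills, or at the final "commit".

```python
def count_flushes(num_records, cache_limit):
    """Count simulated disk flushes for num_records updates.

    cache_limit is the (hypothetical) number of records held in
    memory before they are written out in one go.
    """
    if cache_limit < 1:
        raise ValueError("cache_limit must be at least 1")

    flushes = 0
    buffered = 0
    for _ in range(num_records):
        buffered += 1                # record goes into the in-memory cache
        if buffered >= cache_limit:  # cache full: write it out once
            flushes += 1
            buffered = 0

    if buffered:                     # the "commit" flushes whatever is left
        flushes += 1
    return flushes


if __name__ == "__main__":
    for limit in (1, 100):
        print(limit, count_flushes(1000, limit))
```

With a limit of 1 every record costs a flush, which is roughly the
one-at-a-time behaviour described above; a larger limit amortises the
cost, which is what I would hope a delayed commit could do.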

 _/|_	 ___________________________________________________________________
/o ) \/  Mike Taylor  <mike at miketaylor.org.uk>  http://www.miketaylor.org.uk
)_v__/\  "If I write in C++ I probably don't use even 10% of the language,
	 and in fact the other 90% I don't think I understand" -- Brian
	 W. Kernighan.
