[Koha-zebra] A few Zebra Questions

Thu Dec 29 21:55:10 CET 2005

Joshua Ferraro wrote:

>Hi guys,
>
>Chris and I are brainstorming today over a few cups of coffee and
>a giant dry-erase board and we've come up with a few Zebra questions
>that will help out as we plan our next moves for the migration. So 
>here goes:
>
>Does Zebra allow any kind of database replication, master/slave or
>master/master relationships? (or alternatively, are there methods to
>communicate between two or more Zebra servers?)
>  
>
Oooh.. serious dream project. I have fantasized about this for years. 
But no, there is nothing built into Zebra today.

One somewhat obvious way to approach this would be using OAI-PMH.. the 
LoC is presently contemplating awarding us a little money to support an 
OAI server function in Zebra. It wouldn't require much, really, 
primarily a mechanism to select records based on an update time stamp, 
and the ability to preserve and optionally recall deleted records. 
Anything else could be written in a simple Perl or PHP frontend..

With that in hand, we would have a pretty simple basis for replicating 
databases in a master/slave relationship. Another option would involve 
using Z39.50 (or SRW) Update to actively update remote databases when 
new stuff comes in. That could be done instantaneously, and could even 
be built into the update/commit cycle so you really enforced a high 
level of synchronization.. the OAI approach might be a little more 
robust, though.
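
To make the time stamp idea concrete, here is a rough sketch in Python of how a slave could pull changes from a master. Nothing here is Zebra or OAI-PMH code: the Master and Slave classes, the changes_since() method and the integer clock are all hypothetical stand-ins for "select records by update time stamp" and "recall deleted records".

```python
from dataclasses import dataclass


@dataclass
class Record:
    record_id: str
    data: str
    updated: int            # logical update time stamp (assumption: a simple counter)
    deleted: bool = False   # tombstone, kept so slaves can learn about deletions


class Master:
    """Hypothetical master database that stamps every change."""

    def __init__(self):
        self.records = {}
        self.clock = 0

    def put(self, record_id, data):
        self.clock += 1
        self.records[record_id] = Record(record_id, data, self.clock)

    def delete(self, record_id):
        # Keep a tombstone instead of dropping the record outright.
        self.clock += 1
        self.records[record_id] = Record(record_id, "", self.clock, deleted=True)

    def changes_since(self, stamp):
        # Select records by update time stamp, oldest change first.
        changed = [r for r in self.records.values() if r.updated > stamp]
        return sorted(changed, key=lambda r: r.updated)


class Slave:
    """Hypothetical slave that harvests changes incrementally."""

    def __init__(self):
        self.records = {}
        self.last_seen = 0

    def sync(self, master):
        changes = master.changes_since(self.last_seen)
        for rec in changes:
            if rec.deleted:
                self.records.pop(rec.record_id, None)
            else:
                self.records[rec.record_id] = rec.data
            self.last_seen = rec.updated
        return len(changes)
```

The slave only remembers the last time stamp it saw, so a harvest that is interrupted can simply be run again, which is part of why the pull approach might be the more robust of the two.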

>When updating a database with general record IDs does (can) Zebra
>also create a MARC file for that record?
>  
>
I think you're asking if Zebra can maintain a redundant external copy of 
a database in MARC format -- or at least a transaction log of sorts.. at 
present, there is no such option, but it would be possible to add.

>Is there any foreseeable way to get around the speed issues with 
>updating that would make it feasible to store status data quickly?
>For instance, Chris wondered if we could delay the actual indexing
>process, but get the data in there for when the record is actually
>retrieved (then index in batch say every minute or so)?
>  
>
There are lots of ways to work around that, I am sure.. different 
complexities, different challenges, but I don't think there are any 
insurmountable problems. I have to be careful not to entirely drop my 
programmer's hat in favor of my marketing hat, but really, it is 
primarily a matter of time.
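
As one illustration of the delayed-indexing idea from the question, here is a small Python sketch. It is not how Zebra works today: DeferredIndexStore and its methods are hypothetical names, and the "index" is just a dictionary from status to record IDs. The point is only that a record can be stored and retrieved right away while the index catches up in batches.

```python
class DeferredIndexStore:
    """Hypothetical store: writes land immediately, indexing is batched."""

    def __init__(self):
        self.records = {}          # record id -> current status
        self.index = {}            # status -> set of record ids (searchable view)
        self.indexed_status = {}   # record id -> status currently in the index
        self.pending = set()       # record ids changed since the last flush

    def update_status(self, record_id, status):
        # Cheap write: store the data and queue the record for indexing.
        self.records[record_id] = status
        self.pending.add(record_id)

    def retrieve(self, record_id):
        # Retrieval always sees the latest data, indexed or not.
        return self.records[record_id]

    def search(self, status):
        # Searching only sees what the last batch indexed.
        return sorted(self.index.get(status, set()))

    def flush(self):
        # Batch step, e.g. run every minute or so by a timer.
        count = len(self.pending)
        for rid in self.pending:
            old = self.indexed_status.get(rid)
            if old is not None:
                self.index[old].discard(rid)
            new = self.records[rid]
            self.index.setdefault(new, set()).add(rid)
            self.indexed_status[rid] = new
        self.pending.clear()
        return count
```

The trade-off is that searches filtered on status can be stale for up to one batch interval, while retrieving a record always shows the current status.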

What is the issue here -- the ability to use circulation status as a 
filter in bibliographic searching, or what?

--Sebastian

-- 
Sebastian Hammer, Index Data
quinn at indexdata.com   www.indexdata.com
Ph: (603) 209-6853