[Koha-zebra] Updating zebradb

Tümer Garip tgarip at neu.edu.tr
Wed Mar 22 16:50:51 CET 2006


Hi Sebastian,
Thanks for the concern.
Well, I tried to recreate the problem last night and managed to crash the
db 2 times.
Here is what I did: I started 2 separate zebraidx processes from
the command line, each reading lots of MARC records and committing at the
end. Also, from KOHA I updated single records, which is always an update
and commit on each record.

On both occasions, starting from a clean db, I managed to crash the
update process, with the zebraidx logs giving fatal - inconsistent
register errors and ZOOM stopping the zebrasrv as well.

I am on a Windows 2003 platform using zebra 1.3.34 yaz 1.1.14.
Now I changed that to zebra 1.4 yaz-2.1.5 and tried just now with 2
command line zebraidx processes feeding about 20K MARC records with no
commit, and another command line process doing random commits. Zebra 1.4
seems to handle things better, but if you try hard enough it crashes as
well.
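
For anyone who wants to repeat this kind of test, the setup can be scripted
roughly as below. This is only a sketch under assumptions: zebraidx is taken
to be on the PATH, and the config name zebra.cfg and the directory names
batch1 and batch2 are placeholders, not the files used in the test above. It
uses only the zebraidx "update" and "commit" commands and the -c option.

```python
import subprocess

ZEBRAIDX = "zebraidx"  # assumption: the binary is on the PATH


def update_cmd(config, record_dir):
    """Command line for indexing every record found under record_dir."""
    return [ZEBRAIDX, "-c", config, "update", record_dir]


def commit_cmd(config):
    """Command line for committing shadow-file changes to the register."""
    return [ZEBRAIDX, "-c", config, "commit"]


def run_concurrently(commands):
    """Start all commands at once, wait for all, return their exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]


# Hypothetical usage (placeholder names, not run here):
#   codes = run_concurrently([update_cmd("zebra.cfg", "batch1"),
#                             update_cmd("zebra.cfg", "batch2")])
#   codes.append(subprocess.call(commit_cmd("zebra.cfg")))
```

A non-zero exit code from any of the processes is the point at which the
zebraidx log is worth looking at.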

Of course this is not a real life situation for me. In real life every
update is update-commit. And in real life it has to do with ZOOM, Z39.50,
my scripts and Windows.

For the last 16 hrs I have had an online system with zebra 1.4
yaz-2.1.15, no shadow files, lots of error checking in the scripts, and
a script to restart the server if it goes down due to an error. The
system also writes out the record that was being updated when the server
stopped, so that I can try to find out whether it has to do with my
records. Out of 1000 updates I have 3 records that I am analyzing to see
what's wrong. If I can pinpoint anything I'll send it to you.
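
The two pieces described above (keeping a note of the record that failed, and
restarting the server when it is down) can be sketched as follows. This is an
illustration only, not the script running here: update_fn, is_running and
restart are hypothetical callables that stand in for the real ZOOM update
call and the real server check and start commands.

```python
import time


def apply_updates(records, update_fn, failed_log_path):
    """Run update_fn on each (record_id, record) pair.

    Any record whose update raises is appended to failed_log_path
    so it can be inspected later. Returns the list of failed ids.
    """
    failed = []
    for record_id, record in records:
        try:
            update_fn(record_id, record)
        except Exception as exc:
            failed.append(record_id)
            with open(failed_log_path, "a", encoding="utf-8") as log:
                log.write(f"{record_id}\t{exc}\n")
    return failed


def keep_alive(is_running, restart, checks, interval=0.0):
    """Poll is_running() 'checks' times and call restart() whenever it is False.

    Returns how many restarts were needed.
    """
    restarts = 0
    for _ in range(checks):
        if not is_running():
            restart()
            restarts += 1
        time.sleep(interval)
    return restarts
```

The log of failed ids is what makes it possible to go back and look at the
few records that coincided with a crash.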

Well, in general it looks as if it's actually working. Zebra 1.4 seems to
have better memory management and to be less CPU demanding.
Since in real life situations our updates have to be online (I mean
single record update-commit), do we really need the shadow files as long
as the zebra database is sturdy?

I will continue to test without shadow files and see what happens.

Thanks
Tumer

-----Original Message-----
From: Sebastian Hammer [mailto:quinn at indexdata.com] 
Sent: Wednesday, March 22, 2006 4:08 PM
To: Tümer Garip
Cc: koha-zebra at nongnu.org; Adam Dickmeiss; Mike Taylor
Subject: Re: [Koha-zebra] Updating zebradb


Hi Tümer,

We've now put some considerable time into attempting to recreate the 
problem with the updates, with no luck. It would be a great help if you 
could provide *any* additional information about what you do to cause 
this -- ideally a small test script that illustrates the problem. We've 
run 200 scripts in parallel running updates to the DB with no problem.

In particular, it would be helpful to know what the pattern of updates 
is: do you always commit immediately as part of an update, or do you 
sometimes do multiple updates between commits?
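
One way to rule concurrency in or out while testing is to funnel every update
through a single queue, so that only one update-commit ever runs at a time.
The sketch below shows that idea only in outline. It is not part of Zebra or
Koha, and update_and_commit is a hypothetical callable standing in for
whatever performs the real update and commit.

```python
import queue
import threading


def start_single_writer(update_and_commit):
    """Start one worker thread that applies queued records one at a time.

    Returns (jobs, errors, thread). Put records on jobs from any number
    of clients; put None to stop the worker. Failures are collected in
    errors as (record, exception) pairs instead of killing the worker.
    """
    jobs = queue.Queue()
    errors = []

    def worker():
        while True:
            record = jobs.get()
            if record is None:  # sentinel: stop the worker
                jobs.task_done()
                return
            try:
                update_and_commit(record)
            except Exception as exc:
                errors.append((record, exc))
            finally:
                jobs.task_done()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return jobs, errors, thread
```

If the crashes stop once all writers go through such a queue, that points at
concurrent updates rather than at the records themselves.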

Cheers,

--Sebastian

Tümer Garip wrote:

>Hi,
>
>My hunch is that it is the multiclient usage that is causing the
>problem. You can have one single process doing thousands of updates one
>after the other and you do not get any problems. We already know that,
>as when we first build a zebradb we do not get any problems even if we
>build 150K records as xml read from KOHA. Even if you have searches
>going on at the same time, updating is blocking, so the search takes a
>long time, which is normal.
>
>I noticed that it's actually the server crashing in the middle of an
>update if you have different processes updating the database, and it is
>most probably that which leaves the database in an unstable state. I
>just realised that my zebraserver is set to automatically restart after
>a failure.
>
>I think rather than trying to reproduce an update/commit sequence that
>causes this crash it will be easier for you to try and use different
>processes trying to update the database with a set of different records
>all at once and see what happens.
>
>I myself just set up a system like that and started trying. At this
>preliminary stage my findings are that the server crashes and stops if
>you have multiple clients running in parallel. If I keep doing this I
>will probably crash the database at some stage.
>
>Let's keep trying,
>Tumer
>
>
>
>-----Original Message-----
>From: Sebastian Hammer [mailto:quinn at indexdata.com]
>Sent: Tuesday, March 21, 2006 6:33 PM
>To: Tümer Garip
>Cc: koha-zebra at nongnu.org
>Subject: Re: [Koha-zebra] Updating zebradb
>
>
>Tümer,
>
>Is there some sequence of updates/commits that will reliably cause the
>crash that you've seen?
>
>If we could come up with a Perl script, say, which performed a certain
>sequence of updates and commits and caused the problem to reappear, it 
>would be a lot easier to trace. Zebra should take care of locking 
>between commits and updates, but there might be a condition that has 
>been missed which leads to the corrupted database. Whatever it is, it 
>sounds like a bug and it's something we're keen to have fixed.
>
>I strongly suspect that the challenge here is that you have multiple
>processes (i.e. circulation desks) doing concurrent updates..  many 
>applications I have run into would have updates funnelled through one 
>queue...  but we're pretty keen to isolate this problem.
>
>--Seb
>
>Tümer Garip wrote:
>
>>Hi,
>>
>>This is the continuation of building zebradb thread.
>>
>>As suggested by Sebastian we moved to zebra 1.4 but our problems have
>>not gone away. Version 1.4 not being a release version, we moved back
>>to 1.3.34 to do further tests.
>>
>>Updating zebradb with just one process is not a problem, i.e. a script
>>that reads 2000 random records, modifies a flag and updates the zebradb
>>works OK. If you do not have shadow files search response slows down
>>drastically, but if you update with shadow files and do a single commit
>>everything looks fine.
>>
>>But that is not a real life situation and in real life everything is
>>still hectic. In real life you have about 20 searches going on and
>>about 10-20 update requests in queue. In those situations zebra cannot
>>handle things. If you use shadow files we had a crashed db. I don't
>>know the mechanics behind zebra but when you have:
>>
>>Update
>>|
>>|
>>|          Update
>>|            |
>>Commit       |
>>             |
>>             |
>>           commit
>>
>>With processes like this going on, is it safe? Does each commit know
>>what to commit? I cannot comment on this but can just say that it does
>>not work.
>>
>>So we changed file handling not to use shadow files. Now we get lots
>>of TIMEOUT errors from updates and then zebraserver crashes. Well, at
>>least the zebradb stayed intact.
>>
>>We tried all sorts of other things (like looping the process until it
>>updates) but all in vain.
>>
>>I can definitely say that we have not managed to get zebra updates
>>coming from multiple clients to work, whatever we tried. Either ZEBRA
>>does not like this sort of updating or we have a long way to go in
>>debugging.
>>
>>A little bit disappointed, but we have not lost hope. We'll keep on
>>trying and reporting.
>>
>>Regards
>>Tumer
>>NEU Grand Library
>>CYPRUS
>>
>>
>>
>>_______________________________________________
>>Koha-zebra mailing list
>>Koha-zebra at nongnu.org
>>http://lists.nongnu.org/mailman/listinfo/koha-zebra

-- 
Sebastian Hammer, Index Data
quinn at indexdata.com   www.indexdata.com
Ph: (603) 209-6853






