[Koha-devel] FW: RDBMS like behaviour or merged result sets from ZEBRA-ZOOM

Tue Jul 25 11:58:53 CEST 2006

-----Original Message-----
From: Mike Taylor [mailto:mike at miketaylor.org.uk] 
Sent: Tuesday, July 25, 2006 12:43 PM
To: Tümer Garip
Cc: support at indexdata.dk; 'Joshua Ferraro'
Subject: RDBMS like behaviour or merged result sets from ZEBRA-ZOOM

Tümer Garip writes:
 > We have come a long way in integrating ZEBRA and ZOOM into KOHA and
 > first examples are already running in some libraries.

That's great news.

 > However we now have a problem that we could not solve and wonder
 > whether it's us or the capabilities of ZEBRA or ZOOM (probably all)
 > that's preventing further development.
 >
 > We have two different types of MARC records. Namely bibliographic
 > and holdings. These records are related to each other with unique
 > ID numbers. A bibliographic record can have multiple holdings
 > records related to it.
 >
 > We are having a problem searching these records in RDBMS manner, i.e.
 > searching for say title which exists in the bibliographic record and
 > limiting that search to a specific branch which exists in the
 > holdings record.

Let me make sure I understand what you're trying to do here.  It's a
two-stage search, in which you first want to search the list of works
for books with a certain title, then find the holdings records for
nearby libraries that have the books of that title?  In a relational
database, it would be something like this:

	select * from works,holdings
		where works.title = 'Harry Potter'
		and holdings.work_id = works.id
		and holdings.library_id in (45,76,102);

Correct?

If so, you simply cannot do this with Zebra -- it's a relational query
and Zebra is not a relational database.  You can't even express such a
query in the Z39.50 Type-1 or SRU CQL query languages.

 > We tried putting all the records in the same database but could not
 > do it.  We then tried putting them in different databases (even
 > different servers) but could not find a way of telling zebra to
 > merge the results together.
 >
 > For a small resultset we can do a recursive search of IDs on the
 > other database and get a merged resultset but library operations
 > work on large numbers and that is impossible when the small set is
 > around 1800 records.

I think you mean code like this?  (WARNING: UNTESTED)

	my @holdings_records;
	my $rs = $works_conn->search('title="harry potter"');
	foreach my $i (1 .. $rs->size()) {
	    my $rec = $rs->record($i);
	    my $id = extract_identifier_from_record($rec);
	    # Parenthesise the ORs so the id restriction applies to all three.
	    my $rs2 = $holdings_conn->search("id=$id and (library_id=45"
		. " or library_id=76 or library_id=102)");
	    push @holdings_records, records_extracted_from($rs2);
	}

That approach is the obvious one to take; but, no, it doesn't scale well
when the initial result set is large.
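
One way to soften the round-trip cost, short of a real join, might be to
send the identifiers in batches rather than one at a time.  The sketch
below only builds the query strings; the function name
batched_holdings_queries is hypothetical, and "id" and "library_id" are
the same illustrative index names as in the loop above, not real Zebra
indexes.  Whether the server copes well with long OR lists is an
assumption that would need testing.

```perl
use strict;
use warnings;

# Hypothetical helper: build one query string per batch of work IDs,
# each restricted to the same set of libraries.
sub batched_holdings_queries {
    my ($ids, $libraries, $batch_size) = @_;
    my $lib_clause = join ' or ', map { "library_id=$_" } @$libraries;
    my @pending = @$ids;    # copy, so the caller's list is left untouched
    my @queries;
    # splice() returns an empty list once @pending is exhausted,
    # which ends the loop.
    while (my @batch = splice(@pending, 0, $batch_size)) {
        my $id_clause = join ' or ', map { "id=$_" } @batch;
        push @queries, "($id_clause) and ($lib_clause)";
    }
    return @queries;
}

# Five made-up work IDs, two per batch, three libraries.
my @queries = batched_holdings_queries([101 .. 105], [45, 76, 102], 2);
print "$_\n" for @queries;
```

This cuts the number of searches from one per work to one per batch, but
the total work still grows with the size of the initial result set.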

 > Is there any way we can actually do this with the capabilities of
 > ZEBRA&ZOOM?
 > Something like @attr 1=52 (@resultset1 @attr 1=52)?
 > Or maybe a new z:index called linkid which joins two databases or
 > different records within a database.

What you're talking about is certainly in the realm of the possible, but
it would be a _significant_ research, design and implementation task --
one, I think, that would be beyond the Koha consortium's ability to fund.

I am sorry to disappoint, but I think you're going to have to work round
this one.

 > Any idea is appreciated as the whole database design is currently
 > stuck with this problem.

Have you considered adding redundant data, such as author and title, to
the holdings records?  Then you could search them directly.  Not pretty,
but it may be pragmatic.
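
As a rough illustration of that redundancy, the sketch below copies the
title and author from each work into its holdings records before they
are indexed.  It uses plain hashes rather than real MARC records, and
every name in it (denormalise_holdings, id, work_id, title, author,
library_id) is invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: return copies of the holdings records with the
# parent work's title and author added.  Inputs are array refs of hashes.
sub denormalise_holdings {
    my ($works, $holdings) = @_;
    my %work_by_id;
    $work_by_id{ $_->{id} } = $_ for @$works;
    my @out;
    for my $h (@$holdings) {
        # Skip holdings whose parent work is missing.
        my $w = $work_by_id{ $h->{work_id} } or next;
        push @out, { %$h, title => $w->{title}, author => $w->{author} };
    }
    return \@out;
}

# Made-up sample data: one work, one matching holding, one orphan.
my $works    = [ { id => 1, title => 'Harry Potter', author => 'Rowling' } ];
my $holdings = [ { work_id => 1, library_id => 45 },
                 { work_id => 9, library_id => 76 } ];
my $merged = denormalise_holdings($works, $holdings);
print "$_->{title} held at library $_->{library_id}\n" for @$merged;
```

The cost is that every change to a work's title or author has to be
propagated to all of its holdings records and those records reindexed.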

 _/|_    ___________________________________________________________________
/o ) \/  Mike Taylor  <mike at miketaylor.org.uk>  http://www.miketaylor.org.uk
)_v__/\  The Unix command to kill processes is not "kill -9".  The Unix
         command to make a tape archive is not "tar cv".  Options are
         optional.