[Koha-bugs] [Bug 10662] Build OAI-PMH Harvesting Client

bugzilla-daemon at bugs.koha-community.org
Mon Apr 11 01:48:41 CEST 2016


https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=10662

--- Comment #74 from David Cook <dcook at prosentient.com.au> ---
(In reply to Mirko Tietgen from comment #70)
> - Would it make sense for you to use Catmandu::OAI instead of IO::*? We will
> use Catmandu for Elasticsearch, it would probably make sense here too?
> 

I'd be open to that. HTTP::OAI is already a dependency of Koha, so I used that,
but I think it's a bit rubbish, so I suspect I'd be happy to use something else
like Catmandu::OAI.

> - Task type is set on separate page for add and edit of tasks. Should be on
> the same page as the rest of the config.
> 

It's on a separate page, as changing it will change the template for the rest
of the config. I suppose this could be done with AJAX to make it prettier, but
at the moment I'm going for function over everything else. 

> - I can add a task, I can start a task -- but I can't stop a task, just
> remove.
> 

Yeah, that's on my TODO list. Ideally, I'd add "pause" and "stop". Maybe even
"edit", which would require a "stop" first. 

> - Task numbering always starts at 2. There is no task 1?
> 

If you mean the active tasks, that's an artifact of POE. I suppose that could
be changed, although I never thought the order would matter much.

> - Tasks should be sorted by task number
> 

Honestly, I've thought about doing away with task numbers, and using task names
instead, as that would probably be more useful. 

> - I can send a single task to Icarus multiple times. Is that intended?
> 

Mmm, I know that it does this, but it's unintentional. I have thought about
adding safeguards, but I've been focusing on core functionality first. 

> - "Send to Icarus" leads to empty page if Icarus is not running
> 

Ahhh, I'd heard of the blank page, but not the cause. Cool. I'll look at fixing
that. 

> - Permissions for the OAI user? Even with superlibrarian I get several auth
> errors, and I would not want to give it superlibrarian permissions anyway.
> 

What do you mean by "OAI user"? Do you mean the user for the /svc/import_oai
API? Those auth errors are misleading. Like /svc/import_bib, it tries to do the
import first before doing any auth, so you'll get a 403 error (and it'll
probably show up twice because of bad logging). On the second try, it should
work. I think all you need is "catalogue edit" permissions for that user (like
with /svc/import_bib). 
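
To illustrate the sequence described above (the import is attempted first, a
403 comes back, authentication happens, and the second attempt succeeds), here
is a minimal Python sketch. It is not the actual Icarus or Koha code:
`import_with_retry`, `send_import`, and `authenticate` are hypothetical names
standing in for whatever the client really does.

```python
from typing import Callable, List


def import_with_retry(
    send_import: Callable[[], int],
    authenticate: Callable[[], None],
) -> List[int]:
    """Try the import; on a 403, authenticate and try exactly once more.

    send_import and authenticate are hypothetical callables (assumptions,
    not Koha APIs). Returns every HTTP status code seen, in order.
    """
    statuses = [send_import()]
    if statuses[0] == 403:
        # The first attempt was made without a session, so log in and retry.
        authenticate()
        statuses.append(send_import())
    return statuses
```

Seen this way, a single 403 followed by a success is the expected pattern in
the log rather than a sign of a permissions problem.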

> - Log should display something more useful than [server 1], like name or IP
> 

It's much of a muchness. The name or IP would be localhost/127.0.0.1. [server
1] refers to the Icarus listener.

> - Log shows lots of "Connection n started.1" and "Connection n failed or
> ended" but there is no hint what that actually means. It does not seem to be
> relevant for fulfilling the task
> 

True. It's mostly for debugging. I'll be removing a lot of logging before I'm
ready for a sign off. 

> - Enqueue needs an identifier to work. What if I want to get more than one
> record? Using just the prefix does not work.
> 

It only needs an identifier to work if you're using the GetRecord verb. You
don't need it for ListRecords. I could use JavaScript to make that easier in
the UI. As mentioned above, I'm still just at a barebones level with this
feature.
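
As a rough illustration of the difference between the two verbs, here is a
small Python sketch that builds the request URLs. It follows the OAI-PMH
rules (GetRecord takes an identifier plus a metadataPrefix, ListRecords takes
a metadataPrefix and optionally a set), but it is not how the harvester itself
is written. The function name `build_oai_request`, the example base URL, and
the "marcxml" prefix are placeholders I've made up for the example.

```python
from typing import Optional
from urllib.parse import urlencode


def build_oai_request(
    base_url: str,
    verb: str,
    metadata_prefix: str,
    identifier: Optional[str] = None,
    set_spec: Optional[str] = None,
) -> str:
    """Build an OAI-PMH request URL for GetRecord or ListRecords."""
    if verb == "GetRecord":
        # GetRecord fetches exactly one record, so an identifier is mandatory.
        if not identifier:
            raise ValueError("GetRecord requires an identifier")
        params = {
            "verb": verb,
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    elif verb == "ListRecords":
        # ListRecords harvests many records and does not accept an identifier.
        if identifier:
            raise ValueError("ListRecords does not take an identifier")
        params = {"verb": verb, "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    else:
        raise ValueError(f"unsupported verb: {verb}")
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository URL and prefix, for illustration only.
print(build_oai_request("https://example.org/oai", "ListRecords", "marcxml"))
```

So to get more than one record, the task would use ListRecords and leave the
identifier empty.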

> - Enqueue seems to work so far, I downloaded a record.
> 
> - Dequeue does not work for me. Several auth errors, then a working auth. A
> record is created, but it only contains a (broken) leader.
> 

As above, the auth errors are the same as you'd get with /svc/import_bib. I
have ideas about how to improve that, but they're more optimisations than
anything. 

Is this when you were using oai_dc? I'm amazed that anything was created...
that's probably a bug. I would've expected it to fail...

> - Have not tested matching yet.

That'll probably be the hardest/most interesting bit :p.

For that to work correctly, you'll need to have applied the other Bugzilla
dependencies, and have Zebra indexing rapidly. By default, I think the Debian
packages are going to be too slow, as I believe they only process updates
every 5 minutes. We update Zebra every 5 seconds, so I haven't noticed any
matching problems to date, when everything is configured correctly...

---

Thanks for the feedback, Mirko! I have another update that's almost ready to go
out. Just juggling a couple of projects atm.
