[Koha-devel] Sparql and koha opac (wikidata and bnf to try things)

David Cook dcook at prosentient.com.au
Mon Jul 16 03:10:34 CEST 2018


Actually, I’ll reply here instead as it allows for a broader audience.

 

That’s awesome that you’re looking to learn more about the Semantic Web and SPARQL. Bon courage!

 

Hopefully some libraries reply to you here and find that the work is relevant. Since I work for a software company, it’s not relevant to me per se as my work is directed by clients for the most part, but I hope someone finds it relevant. Petter Goksøyr Åsen might like to talk about it, as might Magnus Enger. 

 

I have done work on Linked Data in the community, but no one really seemed very interested in it, so it’s been dormant for a while now. I’m still looking for testers ;). Although to be honest my understanding of Linked Data has grown a lot since I first posted my patches, so I think my original work isn’t good enough anymore. If you want to learn more about how RDF can be used in a library system, I’d suggest you look at Fedora Commons 4.x and IIIF. Both use RDF natively for their data. (Actually, if you really want to chat about this, I would be happy to talk for ages and ages about this topic. I don’t have commercial reasons to keep working in Linked Data, but I do still find it interesting professionally.)

 

I notice in your patch that you’re currently using biblio.abstract to store your external URIs, but you might want to look at https://www.loc.gov/marc/mac/2017/2017-08.html and https://www.loc.gov/aba/pcc/bibframe/TaskGroups/URI%20FAQs.pdf. I was at a conference earlier this year where there was talk about using the $0 and $1 subfields for embedding RDF URIs in MARC. This seems like it might be relevant for you. 
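To make that $0/$1 idea concrete, here is a minimal sketch (not Koha code; the field data and values are illustrative) of pulling authority URIs out of subfield $0 and real-world-object URIs out of $1, using plain Python structures rather than a MARC library:

```python
# Minimal sketch: extracting linked-data URIs from MARC subfields $0 and $1.
# The field data here is illustrative, not taken from a real record.

def extract_uris(field_subfields):
    """Return (authority_uris, rwo_uris) found in $0 and $1 subfields.

    `field_subfields` is a list of (code, value) pairs for one MARC field.
    Per the PCC URI guidelines, $0 points at the authority/description,
    $1 at the real-world object itself.
    """
    authority = [v for c, v in field_subfields if c == "0" and v.startswith("http")]
    rwo = [v for c, v in field_subfields if c == "1" and v.startswith("http")]
    return authority, rwo

# Example: a 100 (main entry) field carrying both kinds of URI.
field_100 = [
    ("a", "Hugo, Victor,"),
    ("d", "1802-1885."),
    ("0", "http://id.loc.gov/authorities/names/n79091479"),
    ("1", "http://www.wikidata.org/entity/Q535"),
]

authority_uris, rwo_uris = extract_uris(field_100)
print(authority_uris)  # authority record URI(s) from $0
print(rwo_uris)        # real-world-object URI(s) from $1
```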

 

Not sure if I have much else to add. In terms of code, I like the idea of libraries being able to define their own target endpoints and SPARQL queries. The downside is that it would be easy to break, but the upside is that you could add any source that you want.

 

In terms of performance, I’d say caching locally makes sense. I’ve worked a fair bit with Apache Fuseki. It’s not the greatest piece of technology, but it could do what you want, although I’d suggest either batching all your SPARQL update queries into one HTTP request or just using the built-in REST API. (I’d also use named graphs; otherwise it’ll be a nightmare trying to manage all the RDF triples “per record”.) I have some tips and tricks for Fuseki if you’re interested. If you are using Fuseki for caching, I’d say don’t use a disk-based data store; use something in memory. Alternatively, use some other in-memory cache.

 

Another idea would be to create a daemon that can fork child processes to do the actual HTTP requests, so that you can run N SPARQL queries in parallel. Have the opac-detail.pl page load and then query that daemon asynchronously, so that the Koha page loads quickly and you can show a little “Loading…” graphic until the background daemon has prepared the data from the external source(s). (Or, if the data is cached, you don’t even need to call out to the daemon; you can just fetch the cached data from your local cache.) I think it would be a nicer user experience, and you’d save time/have better performance. 
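The caching and parallel-query ideas above could be sketched roughly like this (the class, names, and structure are my own invention, not Koha code; a real fetcher would POST the SPARQL query to the endpoint over HTTP, but here the fetch function is injectable so the caching and parallelism are visible without network access):

```python
# Sketch of an in-memory cache plus parallel SPARQL fetches.
# All names here are hypothetical; the endpoints are real public ones
# but the stub fetcher below never actually contacts them.

import time
from concurrent.futures import ThreadPoolExecutor

class SparqlCache:
    """Tiny in-memory cache keyed by (endpoint, query), with a TTL."""

    def __init__(self, fetch, ttl=3600, max_workers=4):
        self._fetch = fetch          # callable: (endpoint, query) -> result
        self._ttl = ttl
        self._store = {}             # key -> (timestamp, result)
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def _get_one(self, endpoint, query):
        key = (endpoint, query)
        hit = self._store.get(key)
        if hit and time.time() - hit[0] < self._ttl:
            return hit[1]            # fresh cache hit: no external call
        result = self._fetch(endpoint, query)
        self._store[key] = (time.time(), result)
        return result

    def get_many(self, requests):
        """Run N (endpoint, query) pairs in parallel, like the daemon idea."""
        futures = [self._pool.submit(self._get_one, ep, q) for ep, q in requests]
        return [f.result() for f in futures]

# Stub fetcher standing in for real HTTP SPARQL requests.
calls = []
def fake_fetch(endpoint, query):
    calls.append(endpoint)
    return {"endpoint": endpoint, "bindings": []}

cache = SparqlCache(fake_fetch, ttl=60)
reqs = [("https://query.wikidata.org/sparql", "SELECT * WHERE { ?s ?p ?o } LIMIT 1"),
        ("https://data.bnf.fr/sparql", "SELECT * WHERE { ?s ?p ?o } LIMIT 1")]
cache.get_many(reqs)   # two external calls, run in parallel
cache.get_many(reqs)   # served from the cache: no new calls
print(len(calls))      # -> 2
```

In a real deployment the fetch-and-cache part would live in the background daemon, and opac-detail.pl would only ever read from the cache.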

 

I wish you luck : ).

 

David Cook

Systems Librarian

Prosentient Systems

72/330 Wattle St

Ultimo, NSW 2007

Australia

 

Office: 02 9212 0899

Direct: 02 8005 0595

 

From: koha-devel-bounces at lists.koha-community.org [mailto:koha-devel-bounces at lists.koha-community.org] On Behalf Of Claire Hernandez
Sent: Saturday, 14 July 2018 12:10 AM
To: Koha-devel at lists.koha-community.org
Subject: [Koha-devel] Sparql and koha opac (wikidata and bnf to try things)

 


Hi,

I wanted to try and learn about the semantic web and SPARQL, so I gave it a try in an opac-detail view. This patch is just a proof of concept and really not for production. I wanted to know if this type of work is relevant to you. I know previous work was done in the community, but maybe you can share where you are now with linked data?

 

Just two words about the dev:

    * A simple syspref called "Explore", with three possibilities: nothing, BnF (the national library of France), or Wikidata

    * Some data is fetched; I assume there is an ARK identifier in my biblio record

    * I display "simple information"; there is no complicated query
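As an illustration of the kind of "simple information" lookup described above (a generic sketch, not the patch's actual code; the entity ID is illustrative), here is how a request for an entity's label and description could be built for the Wikidata SPARQL endpoint:

```python
# Generic sketch (not the patch's code): building a GET request for the
# Wikidata SPARQL endpoint. The wd:/rdfs:/schema: prefixes are predefined
# on that endpoint; the entity ID used here is illustrative.
from urllib.parse import urlencode

WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"

def build_label_query(entity_id, lang="en"):
    """SPARQL asking for the label and description of one entity."""
    return f"""
    SELECT ?label ?description WHERE {{
      wd:{entity_id} rdfs:label ?label ;
                     schema:description ?description .
      FILTER(LANG(?label) = "{lang}" && LANG(?description) = "{lang}")
    }}
    """

def build_request_url(entity_id, lang="en"):
    params = urlencode({"query": build_label_query(entity_id, lang),
                        "format": "json"})
    return f"{WIKIDATA_ENDPOINT}?{params}"

url = build_request_url("Q535")   # Q535: an example Wikidata entity
print(url.startswith(WIKIDATA_ENDPOINT))
```

Fetching that URL (with an Accept header of application/sparql-results+json) would return the bindings ready to display on the detail page.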

    

Possible "next" steps:

  * Not just one source (BnF or Wikidata), but cross-fetching between sources (e.g. BnF + Wikidata + Europeana + GeoNames + whatever ;)

  * Having a "local triple store", or skipping real-time queries, so that information displays more quickly

  * More complicated queries, to display a timeline of dates, explore a period, or explore "notable works", "authors collaborated with", etc.
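For the "local triple store" idea, one approach (my own invention, not part of the patch; the graph URI scheme is illustrative) is to feed the store with one SPARQL Update per biblio record, keeping each record's triples in their own named graph so they can be refreshed or dropped independently:

```python
# Sketch of feeding a local triple store one named graph per biblio record.
# The graph URI scheme (example.org) is illustrative, not a real convention.

def build_update(biblionumber, triples):
    """Build a SPARQL Update that refreshes one record's named graph.

    `triples` is a list of (subject, predicate, object) strings already in
    valid SPARQL/Turtle term syntax. DROP SILENT discards any stale copy
    of the graph before INSERT DATA repopulates it.
    """
    graph = f"<http://example.org/koha/record/{biblionumber}>"
    body = " .\n    ".join(f"{s} {p} {o}" for s, p, o in triples)
    return (
        f"DROP SILENT GRAPH {graph} ;\n"
        f"INSERT DATA {{\n"
        f"  GRAPH {graph} {{\n"
        f"    {body} .\n"
        f"  }}\n"
        f"}}"
    )

update = build_update(42, [
    ("<http://www.wikidata.org/entity/Q535>",
     "<http://www.w3.org/2000/01/rdf-schema#label>",
     '"Victor Hugo"@en'),
])
print(update)
```

The whole string would be sent as a single HTTP request to the store's update endpoint, which also matches the "batch your updates into one request" advice earlier in the thread.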

  

The dev is probably not as clean as you would wish, but I prefer to share something to talk about, even if not perfect, rather than nothing. I plan to write a blog post about what I learn. Tell me if you are interested; it could encourage me to do it :p

Claire.
