[Koha-devel] Finding invalid XML characters in Koha data via SQL

David Cook dcook at prosentient.com.au
Fri Apr 12 08:23:51 CEST 2024


Yeah, I was thinking it would be good to add to the SQL Report Library. I've just been flat out today... (but wanted to make sure I shared it with you folk at least)

David Cook
Senior Software Engineer
Prosentient Systems
Suite 7.03
6a Glen St
Milsons Point NSW 2061
Australia

Office: 02 9212 0899
Online: 02 8005 0595

-----Original Message-----
From: Koha-devel <koha-devel-bounces at lists.koha-community.org> On Behalf Of Magnus Enger via Koha-devel
Sent: Friday, 12 April 2024 4:06 PM
To: koha-devel at lists.koha-community.org
Subject: Re: [Koha-devel] Finding invalid XML characters in Koha data via SQL

Hi!

On 12.04.2024 03:36, David Cook via Koha-devel wrote:
> Hi all,
> 
> I just wanted to share a (MariaDB) SQL report that I wrote for finding 
> bib records with invalid XML characters:
> 
> select biblionumber from biblio_metadata where metadata REGEXP
> '[^\\x{0009}\\x{000A}\\x{000D}\\x{0020}-\\x{D7FF}\\x{E000}-\\x{FFFD}\\x{10000}-\\x{10FFFF}]+';
> 
> Newer versions of Koha strip invalid characters from the XML so that 
> you can fix your records. I figure this report is very valuable when 
> coupled with that functionality. In fact, I just advised a library 
> today to use them together to fix up some bad data in their catalogue.
> 
> --
> 
> On a related note, I’ve noticed that you can have a record with good 
> bib XML but invalid item XML, and you won’t notice until your record 
> fails to be indexed. So I’m planning on writing a report for that too.
> 
> I’m thinking it might be good to add these reports to core Koha, so 
> that people can find and fix their own metadata problems. What do people think?

Sounds like an excellent idea! Sounds kind of similar to "MARC bibliographic framework test" at /cgi-bin/koha/admin/checkmarc.pl

The report could also be added to
https://wiki.koha-community.org/wiki/SQL_Reports_Library so that it is immediately useful, including on older Koha versions.
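
For anyone who wants to check the character class from the quoted report outside the database (for example on an exported record), here is a minimal Python sketch. It assumes the REGEXP is meant to match anything outside the XML 1.0 "Char" production, which is what its ranges correspond to. The names INVALID_XML_CHARS, find_invalid_xml_chars and strip_invalid_xml_chars are made up for illustration, and this is not a description of how Koha itself strips characters.

```python
import re

# Complement of the XML 1.0 "Char" production: tab, LF, CR,
# U+0020-U+D7FF, U+E000-U+FFFD and U+10000-U+10FFFF are allowed.
# These are the same ranges as in the quoted MariaDB REGEXP.
INVALID_XML_CHARS = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text):
    """Return (offset, code point) pairs for characters XML 1.0 forbids."""
    return [
        (match.start(), "U+%04X" % ord(match.group()))
        for match in INVALID_XML_CHARS.finditer(text)
    ]


def strip_invalid_xml_chars(text):
    """Return a copy of text with the forbidden characters removed."""
    return INVALID_XML_CHARS.sub("", text)


if __name__ == "__main__":
    # Hypothetical sample: an ESC control character inside a subfield value.
    sample = '<subfield code="a">Bad\x1btitle</subfield>'
    print(find_invalid_xml_chars(sample))
    print(strip_invalid_xml_chars(sample))
```

Reporting the offset and code point, rather than only a yes/no match as the SQL report does, makes it easier to see which character needs fixing in a given record.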

Best regards,
Magnus