[Koha-devel] Testing XML parsers

Chris Cormack chrisc at catalyst.net.nz
Wed Apr 28 21:50:01 CEST 2010


* Clay Fouts (cfouts at ptfs.com) wrote:
>    [Moving to the more topical koha-devel]
>    For a bit of anecdotal evidence, we've been testing with the non-Parser
>    version for over six months with no ill effects. However, I definitely
>    encourage further, more rigorous compatibility tests before making a
>    recommendation. How do we go about testing that formally?
>    Also, even though $KOHA_CONF is parsed in each and every script, the time
>    for a single snarfing of that file is immaterial compared to that required
>    for parsing the 20 or so MARCXML records and corresponding XSLT templates
>    that pop up in single OPAC search results page.
>    Clay

Certainly, all sounds good for 3.4, 
However this requirement to edit parserdetails.ini has been there for
nearly 3 years now. 

This removes that need for the 3.2 release, you are of course welcome to
grepa and change the line if you feel like it is safe for your clients
to do so.

All I was attempting to do was to remove the need to edit a systemwide
preference while maintaining the current behaviour (we are in feature
freeze after all). 

Feel free to send counter patches, or to wait until 3.4.0 and shift
all the marcxml handling to C4::Record where it belongs.

Chris
> 
>    On Tue, Apr 27, 2010 at 4:47 PM, Chris Cormack <chrisc at catalyst.net.nz>
>    wrote:
> 
>      * Clay Fouts (cfouts at ptfs.com) wrote:
>      >    I agree ParserDetails.ini is also not a good place to define it
>      since it
>      >    forces a system-wide preference. But wouldn't adding it to
>      C4/Context.pm
>      >    be sufficient since everything else uses that?
>      >    Also, can anyone point to a verifiable difference in correct
>      functionality
>      >    between the Parser and non-Parser LibXML::SAX modules? They're
>      built off
>      >    of the same code, the only difference being that the Parser takes
>      extra
>      >    time to build a DOM tree, which Koha doesn't use. The expense of
>      building
>      >    that tree is considerable.
> 
>      I'll have to defer to the people who specified the dependency on that.
> 
>      Well actually I'll to talk to some others, because that person threw his
>      toys
>      out of the sandpit a while ago.  The dependency has been their since
>      2007, it may well be that it's time to revisit it. However this close to
>      3.2 release that probably will be needed to be backburnered until 3.4.
>      My main desire was to not have the 3.2.0 release needing to change the
>      parserdetails.ini file like the 3.0.x ones have.
> 
>      C4::Context would override it everywhere in Koha, not just where we
>      throw around MARCXML. So as it stands in the files that actually use
>      MARC::File::XML, it does mean that you use LibXML::SAX non parser to
>      parse the config file (faster) and the one that we think is the only
>      safe one, to parse the MARCXML
> 
>      By removing it from the parserdetails.ini and allowing C4::Context to
>      parse koha-conf.xml with a faster parser we do get a performance win on
>      every single page.
> 
>      Sorry I should have explained better.
>      Chris
>      >
>      >    On Tue, Apr 27, 2010 at 4:04 PM, Chris Cormack
>      <chrisc at catalyst.net.nz>
>      >    wrote:
>      >
>      >      * Clay Fouts (cfouts at ptfs.com) wrote:
>      >      >    Hi, Chris.
>      >      >    Wouldn't it make more sense to define this in a single
>      place? If a
>      >      user
>      >      >    decides at some point in the future to swap parsers, they'd
>      have to
>      >      change
>      >      >    dozens of files to remain consistent.
>      >      >    Also, not everyone uses XML::LibXML::SAX::Parser, so it
>      seems
>      >      >    inappropriate to hard code that into the source. For
>      example,
>      >      >    XML::LibXML::SAX is much faster and functionally equitable
>      for
>      >      Koha's
>      >      >    purposes.
>      >      >    Cheers,
>      >      >    Clay
>      >
>      >      Hi Clay
>      >
>      >      The MARC editing Koha does, does depend on
>      XML::LibXML::SAX::Parser, the
>      >      current situation means people have to edit parserdetails.ini  to
>      make
>      >      that the default. This means that now everything on the system
>      uses it.
>      >      Bad ... In the packaging work Lars is doing, there is no way that
>      doing
>      >      it that way that will make it into debian.
>      >
>      >      No one should use Koha with other than XML::LibXML::SAX::Parser
>      because
>      >      although the others maybe functionally equitable, things break
>      with
>      >      them, specifically in MARC::File::XML
>      >
>      >      So since Koha depends on this parser, it makes more sense to me
>      to
>      >      define it in the files that depend on it. Thus allowing the user
>      to use
>      >      whatever else Parse they look for the rest of their system.
>      >
>      >      It also allows the debian packaging to continue.
>      >
>      >      The other option is of course to add this to MARC::File::XML
>      instead.
>      >
>      >      Hope this helps explain
>      >
>      >      Chris
>      >      >
>      >      >    On Tue, Apr 27, 2010 at 2:34 PM, Chris Cormack
>      >      <chrisc at catalyst.net.nz>
>      >      >    wrote:
>      >      >
>      >      >      ---
>      >      >       C4/Biblio.pm                                       |    2
>      ++
>      >      >       C4/Record.pm                                       |    4
>      ++++
>      >      >       cataloguing/addbiblio.pl                           |    3
>      +++
>      >      >       cataloguing/additem.pl                             |    4
>      ++++
>      >      >       cataloguing/moveitem.pl                            |    2
>      ++
>      >      >       installer/data/mysql/update22to30.pl               |    1
>      +
>      >      >       installer/data/mysql/updatedatabase.pl             |    2
>      ++
>      >      >       misc/batchImportMARCWithBiblionumbers.pl           |   17
>      >      >      +++++++++++++++++
>      >      >       misc/batchupdateISBNs.pl                           |    2
>      ++
>      >      >       misc/maintenance/MARC21_utf8_flag_fix.pl           |    3
>      +++
>      >      >       .../migration_tools/22_to_30/export_Authorities.pl |    2
>      ++
>      >      >       .../22_to_30/export_Authorities_xml.pl             |    2
>      ++
>      >      >       .../22_to_30/move_marc_to_biblioitems.pl           |    2
>      ++
>      >      >       misc/migration_tools/bulkauthimport.pl             |    2
>      ++
>      >      >       misc/migration_tools/bulkmarcimport.pl             |    2
>      ++
>      >      >       misc/sax_parser_print.pl                           |    1
>      +
>      >      >       misc/sax_parser_test.pl                            |    2
>      ++
>      >      >       17 files changed, 53 insertions(+), 0 deletions(-)
>      >      >
>      >      >      diff --git a/C4/Biblio.pm b/C4/Biblio.pm
>      >      >      index e24bee8..24ba2f1 100644
>      >      >      --- a/C4/Biblio.pm
>      >      >      +++ b/C4/Biblio.pm
>      >      >      @@ -17,6 +17,8 @@ package C4::Biblio;
>      >      >       # with Koha; if not, write to the Free Software
>      Foundation,
>      >      Inc.,
>      >      >       # 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301
>      USA.
>      >      >
>      >      >      +BEGIN { $XML::SAX::ParserPackage =
>      "XML::LibXML::SAX::Parser"; }
>      >      # Make
>      >      >      sure MARC::File::XML uses the correct parser
>      >      >      +
>      >      >       use strict;
>      >      >       use warnings;
>      >      >
>      >      >      diff --git a/C4/Record.pm b/C4/Record.pm
>      >      >      index 233fa6e..28903ee 100644
>      >      >      --- a/C4/Record.pm
>      >      >      +++ b/C4/Record.pm
>      >      >      @@ -2,6 +2,7 @@ package C4::Record;
>      >      >       #
>      >      >       # Copyright 2006 (C) LibLime
>      >      >       # Joshua Ferraro <jmf at liblime.com>
>      >      >      +# Copyright 2010 Catalyst IT Limited
>      >      >       #
>      >      >       # This file is part of Koha.
>      >      >       #
>      >      >      @@ -19,6 +20,9 @@ package C4::Record;
>      >      >       # 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301
>      USA.
>      >      >       #
>      >      >       #
>      >      >      +
>      >      >      +BEGIN { $XML::SAX::ParserPackage =
>      "XML::LibXML::SAX::Parser"; }
>      >      # Make
>      >      >      sure MARC::File::XML uses the correct parser
>      >      >      +
>      >      >       use strict;
>      >      >       #use warnings; FIXME - Bug 2505
>      >      >
>      >      >      diff --git a/cataloguing/addbiblio.pl
>      b/cataloguing/addbiblio.pl
>      >      >      index 7a1ed65..af73822 100755
>      >      >      --- a/cataloguing/addbiblio.pl
>      >      >      +++ b/cataloguing/addbiblio.pl
>      >      >      @@ -2,6 +2,7 @@
>      >      >
>      >      >       # Copyright 2000-2002 Katipo Communications
>      >      >      +# Copyright 2010 Catalyst IT Limited
>      >      >       #
>      >      >       # This file is part of Koha.
>      >      >       #
>      >      >      @@ -18,6 +19,8 @@
>      >      >       # with Koha; if not, write to the Free Software
>      Foundation,
>      >      Inc.,
>      >      >       # 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301
>      USA.
>      >      >
>      >      >      +BEGIN { $XML::SAX::ParserPackage =
>      "XML::LibXML::SAX::Parser"; }
>      >      # Make
>      >      >      sure MARC::File::XML uses the correct parser
>      >      >      +
>      >      >       use strict;
>      >      >       #use warnings; FIXME - Bug 2505
>      >      >       use CGI;
>      >      >      diff --git a/cataloguing/additem.pl
>      b/cataloguing/additem.pl
>      >      >      index bd3876e..40ccba9 100755
>      >      >      --- a/cataloguing/additem.pl
>      >      >      +++ b/cataloguing/additem.pl
>      >      >      @@ -2,6 +2,7 @@
>      >      >
>      >      >       # Copyright 2000-2002 Katipo Communications
>      >      >      +# Copyright 2010 Catalyst IT Limited
>      >      >       #
>      >      >       # This file is part of Koha.
>      >      >       #
>      >      >      @@ -18,6 +19,9 @@
>      >      >       # with Koha; if not, write to the Free Software
>      Foundation,
>      >      Inc.,
>      >      >       # 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301
>      USA.
>      >      >
>      >      >      +
>      >      >      +BEGIN { $XML::SAX::ParserPackage =
>      "XML::LibXML::SAX::Parser"; }
>      >      # Make
>      >      >      sure MARC::File::XML uses the correct parser
>      >      >      +
>      >      >       use strict;
>      >      >       #use warnings; FIXME - Bug 2505
>      >      >       use CGI;
>      >      >      diff --git a/cataloguing/moveitem.pl
>      b/cataloguing/moveitem.pl
>      >      >      index e0ab551..31b0093 100755
>      >      >      --- a/cataloguing/moveitem.pl
>      >      >      +++ b/cataloguing/moveitem.pl
>      >      >      @@ -19,6 +19,8 @@
>      >      >       # with Koha; if not, write to the Free Software
>      Foundation,
>      >      Inc.,
>      >      >       # 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301
>      USA.
>      >      >
>      >      >      +BEGIN { $XML::SAX::ParserPackage =
>      "XML::LibXML::SAX::Parser"; }
>      >      # Make
>      >      >      sure MARC::File::XML uses the correct parser
>      >      >      +
>      >      >       use strict;
>      >      >       #use warnings; FIXME - Bug 2505
>      >      >       use CGI;
>      >      >      diff --git a/installer/data/mysql/update22to30.pl
>      >      >      b/installer/data/mysql/update22to30.pl
>      >      >      index cb4379b..f314749 100755
>      >      >      --- a/installer/data/mysql/update22to30.pl
>      >      >      +++ b/installer/data/mysql/update22to30.pl
>      >      >      @@ -11,6 +11,7 @@
>      >      >       # - Would also be a good idea to offer to do a backup at
>      this
>      >      time...
>      >      >
>      >      >       # NOTE:  If you do something more than once in here, make
>      it
>      >      table
>      >      >      driven.
>      >      >      +BEGIN { $XML::SAX::ParserPackage =
>      "XML::LibXML::SAX::Parser"; }
>      >      # Make
>      >      >      sure MARC::File::XML uses the correct parser
>      >      >       use strict;
>      >      >       #use warnings; FIXME - Bug 2505
>      >      >
>      >      >      diff --git a/installer/data/mysql/updatedatabase.pl
>      >      >      b/installer/data/mysql/updatedatabase.pl
>      >      >      index 201e30c..8a2c15d 100755
>      >      >      --- a/installer/data/mysql/updatedatabase.pl
>      >      >      +++ b/installer/data/mysql/updatedatabase.pl
>      >      >      @@ -14,6 +14,8 @@
>      >      >
>      >      >       # NOTE: Please keep the version in kohaversion.pl
>      up-to-date!
>      >      >
>      >      >      +BEGIN { $XML::SAX::ParserPackage =
>      "XML::LibXML::SAX::Parser"; }
>      >      # Make
>      >      >      sure MARC::File::XML uses the correct parser
>      >      >      +
>      >      >       use strict;
>      >      >       use warnings;
>      >      >
>      >      >      diff --git a/misc/batchImportMARCWithBiblionumbers.pl
>      >      >      b/misc/batchImportMARCWithBiblionumbers.pl
>      >      >      index 4acc02b..8ca8a34 100755
>      >      >      --- a/misc/batchImportMARCWithBiblionumbers.pl
>      >      >      +++ b/misc/batchImportMARCWithBiblionumbers.pl
>      >      >      @@ -1,6 +1,22 @@
>      >      >       #!/usr/bin/perl
>      >      >       # load records that already have biblionumber set into a
>      koha
>      >      system
>      >      >       # Written by TG on 10/04/2006
>      >      >      +
>      >      >      +# This file is part of Koha.
>      >      >      +#
>      >      >      +# Koha is free software; you can redistribute it and/or
>      modify
>      >      it under
>      >      >      the
>      >      >      +# terms of the GNU General Public License as published by
>      the
>      >      Free
>      >      >      Software
>      >      >      +# Foundation; either version 2 of the License, or (at
>      your
>      >      option) any
>      >      >      later
>      >      >      +# version.
>      >      >      +#
>      >      >      +# Koha is distributed in the hope that it will be useful,
>      but
>      >      WITHOUT
>      >      >      ANY
>      >      >      +# WARRANTY; without even the implied warranty of
>      MERCHANTABILITY
>      >      or
>      >      >      FITNESS FOR
>      >      >      +# A PARTICULAR PURPOSE.  See the GNU General Public
>      License for
>      >      more
>      >      >      details.
>      >      >      +#
>      >      >      +# You should have received a copy of the GNU General
>      Public
>      >      License
>      >      >      along
>      >      >      +# with Koha; if not, write to the Free Software
>      Foundation,
>      >      Inc.,
>      >      >      +# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301
>      USA.
>      >      >      +
>      >      >       use strict;
>      >      >       #use warnings; FIXME - Bug 2505
>      >      >       BEGIN {
>      >      >      @@ -8,6 +24,7 @@ BEGIN {
>      >      >          # test carefully before changing this
>      >      >          use FindBin;
>      >      >          eval { require "$FindBin::Bin/kohalib.pl" };
>      >      >      +    $XML::SAX::ParserPackage =
>      "XML::LibXML::SAX::Parser";
>      >      >       }
>      >      >
>      >      >       # Koha modules used
>      >      >      diff --git a/misc/batchupdateISBNs.pl
>      b/misc/batchupdateISBNs.pl
>      >      >      index 8cc8c6a..6041b80 100755
>      >      >      --- a/misc/batchupdateISBNs.pl
>      >      >      +++ b/misc/batchupdateISBNs.pl
>      >      >      @@ -30,6 +30,8 @@ BEGIN {
>      >      >          # test carefully before changing this
>      >      >          use FindBin;
>      >      >          eval { require "$FindBin::Bin/kohalib.pl" };
>      >      >      +    $XML::SAX::ParserPackage =
>      "XML::LibXML::SAX::Parser";
>      >      >      +
>      >      >       }
>      >      >       use C4::Context;
>      >      >       use MARC::File::XML;
>      >      >      diff --git a/misc/maintenance/MARC21_utf8_flag_fix.pl
>      >      >      b/misc/maintenance/MARC21_utf8_flag_fix.pl
>      >      >      index e7e9156..fbff241 100755
>      >      >      --- a/misc/maintenance/MARC21_utf8_flag_fix.pl
>      >      >      +++ b/misc/maintenance/MARC21_utf8_flag_fix.pl
>      >      >      @@ -1,6 +1,7 @@
>      >      >       #!/usr/bin/perl
>      >      >       #
>      >      >       # Copyright 2009 Liblime
>      >      >      +# Copyright 2010 Catalyst IT
>      >      >       #
>      >      >       # This file is part of Koha.
>      >      >       #
>      >      >      @@ -17,6 +18,8 @@
>      >      >       # with Koha; if not, write to the Free Software
>      Foundation,
>      >      Inc.,
>      >      >       # 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301
>      USA.
>      >      >
>      >      >      +BEGIN { $XML::SAX::ParserPackage =
>      "XML::LibXML::SAX::Parser"; }
>      >      # Make
>      >      >      sure correct sax parser is used
>      >      >      +
>      >      >       use strict;
>      >      >       use warnings;
>      >      >
>      >      >      diff --git
>      a/misc/migration_tools/22_to_30/export_Authorities.pl
>      >      >      b/misc/migration_tools/22_to_30/export_Authorities.pl
>      >      >      index aeed3f2..cd24a7d 100755
>      >      >      --- a/misc/migration_tools/22_to_30/export_Authorities.pl
>      >      >      +++ b/misc/migration_tools/22_to_30/export_Authorities.pl
>      >      >      @@ -6,6 +6,8 @@ BEGIN {
>      >      >          # test carefully before changing this
>      >      >          use FindBin;
>      >      >          eval { require "$FindBin::Bin/../../kohalib.pl" };
>      >      >      +    $XML::SAX::ParserPackage =
>      "XML::LibXML::SAX::Parser"; #
>      >      Make sure
>      >      >      correct sax parser is used
>      >      >      +
>      >      >       }
>      >      >       use C4::Context;
>      >      >       #use MARC::File::XML(BinaryEncoding=>"utf8");
>      >      >      diff --git
>      >      a/misc/migration_tools/22_to_30/export_Authorities_xml.pl
>      >      >      b/misc/migration_tools/22_to_30/export_Authorities_xml.pl
>      >      >      index 70647ee..bd871c4 100755
>      >      >      ---
>      a/misc/migration_tools/22_to_30/export_Authorities_xml.pl
>      >      >      +++
>      b/misc/migration_tools/22_to_30/export_Authorities_xml.pl
>      >      >      @@ -6,6 +6,8 @@ BEGIN {
>      >      >          # test carefully before changing this
>      >      >          use FindBin;
>      >      >          eval { require "$FindBin::Bin/../../kohalib.pl" };
>      >      >      +    $XML::SAX::ParserPackage =
>      "XML::LibXML::SAX::Parser";  #
>      >      Make sure
>      >      >      correct sax parser is used
>      >      >      +
>      >      >       }
>      >      >       use C4::Context;
>      >      >       use MARC::File::XML(BinaryEncoding=>"utf8");
>      >      >      diff --git
>      >      a/misc/migration_tools/22_to_30/move_marc_to_biblioitems.pl
>      >      >    
>       b/misc/migration_tools/22_to_30/move_marc_to_biblioitems.pl
>      >      >      index 3635dbc..051d93b 100755
>      >      >      ---
>      a/misc/migration_tools/22_to_30/move_marc_to_biblioitems.pl
>      >      >      +++
>      b/misc/migration_tools/22_to_30/move_marc_to_biblioitems.pl
>      >      >      @@ -8,6 +8,8 @@ BEGIN {
>      >      >          # test carefully before changing this
>      >      >          use FindBin;
>      >      >          eval { require "$FindBin::Bin/../../kohalib.pl" };
>      >      >      +    $XML::SAX::ParserPackage =
>      "XML::LibXML::SAX::Parser"; #
>      >      Make sure
>      >      >      correct sax parser is used
>      >      >      +
>      >      >       }
>      >      >       use C4::Context;
>      >      >       use C4::Biblio;
>      >      >      diff --git a/misc/migration_tools/bulkauthimport.pl
>      >      >      b/misc/migration_tools/bulkauthimport.pl
>      >      >      index 2709db9..5f859a1 100755
>      >      >      --- a/misc/migration_tools/bulkauthimport.pl
>      >      >      +++ b/misc/migration_tools/bulkauthimport.pl
>      >      >      @@ -8,6 +8,8 @@ BEGIN {
>      >      >          # test carefully before changing this
>      >      >          use FindBin;
>      >      >          eval { require "$FindBin::Bin/kohalib.pl" };
>      >      >      +    $XML::SAX::ParserPackage =
>      "XML::LibXML::SAX::Parser"; #
>      >      Make sure
>      >      >      correct sax parser is used
>      >      >      +
>      >      >       }
>      >      >
>      >      >       # Koha modules used
>      >      >      diff --git a/misc/migration_tools/bulkmarcimport.pl
>      >      >      b/misc/migration_tools/bulkmarcimport.pl
>      >      >      index da0ebf7..5b1d823 100755
>      >      >      --- a/misc/migration_tools/bulkmarcimport.pl
>      >      >      +++ b/misc/migration_tools/bulkmarcimport.pl
>      >      >      @@ -9,6 +9,8 @@ BEGIN {
>      >      >          # test carefully before changing this
>      >      >          use FindBin;
>      >      >          eval { require "$FindBin::Bin/../kohalib.pl" };
>      >      >      +    $XML::SAX::ParserPackage =
>      "XML::LibXML::SAX::Parser"; #
>      >      Make sure
>      >      >      correct sax parser is used
>      >      >      +
>      >      >       }
>      >      >
>      >      >       # Koha modules used
>      >      >      diff --git a/misc/sax_parser_print.pl
>      b/misc/sax_parser_print.pl
>      >      >      index d206a5e..ab5d1ff 100755
>      >      >      --- a/misc/sax_parser_print.pl
>      >      >      +++ b/misc/sax_parser_print.pl
>      >      >      @@ -3,6 +3,7 @@
>      >      >
>      >      >       use strict;
>      >      >       use warnings;
>      >      >      +BEGIN { $XML::SAX::ParserPackage =
>      "XML::LibXML::SAX::Parser"; }
>      >      >       use XML::SAX::ParserFactory;
>      >      >
>      >      >       my $parser = XML::SAX::ParserFactory->parser();
>      >      >      diff --git a/misc/sax_parser_test.pl
>      b/misc/sax_parser_test.pl
>      >      >      index b2e5974..359bb45 100755
>      >      >      --- a/misc/sax_parser_test.pl
>      >      >      +++ b/misc/sax_parser_test.pl
>      >      >      @@ -1,5 +1,7 @@
>      >      >       #!/usr/bin/perl
>      >      >
>      >      >      +BEGIN { $XML::SAX::ParserPackage = "XML::LibXML::SAX"; }
>      >      >      +
>      >      >       use strict;
>      >      >       use warnings;
>      >      >      --
>      >      >      1.6.3.3
>      >      >
>      >      >      _______________________________________________
>      >      >      Koha-patches mailing list
>      >      >      Koha-patches at lists.koha.org
>      >      >      http://lists.koha.org/mailman/listinfo/koha-patches
>      >
>      >      --
>      >      Chris Cormack
>      >      Catalyst IT Ltd.
>      >      +64 4 803 2238
>      >      PO Box 11-053, Manners St, Wellington 6142, New Zealand
> 
>      --
>      Chris Cormack
>      Catalyst IT Ltd.
>      +64 4 803 2238
>      PO Box 11-053, Manners St, Wellington 6142, New Zealand

-- 
Chris Cormack
Catalyst IT Ltd.
+64 4 803 2238
PO Box 11-053, Manners St, Wellington 6142, New Zealand



More information about the Koha-devel mailing list