[Koha-devel] Testing XML parsers
Clay Fouts
cfouts at ptfs.com
Wed Apr 28 22:22:34 CEST 2010
Chris,
What about Michael's suggestion of "require"-ing a file that defines that
BEGIN block, instead? Or, more broadly, define it once in C4/Context.pm? The
former keeps the same functionality you're attempting to apply, and the user
only need modify a single file to change it. The latter, while it would
define a single parser for all of Koha instead of allowing it to vary by
context, is not functionally different from the current behavior where it's
set in ParserDetails.ini.
I understand and agree with your reasoning behind implementing this change.
However, forcing a user to modify more than a dozen source files in order to
change the parser is not an inviting solution to this issue.
Clay
On Wed, Apr 28, 2010 at 12:50 PM, Chris Cormack <chrisc at catalyst.net.nz>wrote:
> * Clay Fouts (cfouts at ptfs.com) wrote:
> > [Moving to the more topical koha-devel]
> > For a bit of anecdotal evidence, we've been testing with the
> non-Parser
> > version for over six months with no ill effects. However, I definitely
> > encourage further, more rigorous compatibility tests before making a
> > recommendation. How do we go about testing that formally?
> > Also, even though $KOHA_CONF is parsed in each and every script, the
> time
> > for a single snarfing of that file is immaterial compared to that
> required
> > for parsing the 20 or so MARCXML records and corresponding XSLT
> templates
> > that pop up in single OPAC search results page.
> > Clay
>
> Certainly, all sounds good for 3.4,
> However this requirement to edit parserdetails.ini has been there for
> nearly 3 years now.
>
> This removes that need for the 3.2 release, you are of course welcome to
> grepa and change the line if you feel like it is safe for your clients
> to do so.
>
> All I was attempting to do was to remove the need to edit a systemwide
> preference while maintaining the current behaviour (we are in feature
> freeze after all).
>
> Feel free to send counter patches, or to wait until 3.4.0 and shift
> all the marcxml handling to C4::Record where it belongs.
>
> Chris
> >
> > On Tue, Apr 27, 2010 at 4:47 PM, Chris Cormack <
> chrisc at catalyst.net.nz>
> > wrote:
> >
> > * Clay Fouts (cfouts at ptfs.com) wrote:
> > > I agree ParserDetails.ini is also not a good place to define it
> > since it
> > > forces a system-wide preference. But wouldn't adding it to
> > C4/Context.pm
> > > be sufficient since everything else uses that?
> > > Also, can anyone point to a verifiable difference in correct
> > functionality
> > > between the Parser and non-Parser LibXML::SAX modules? They're
> > built off
> > > of the same code, the only difference being that the Parser
> takes
> > extra
> > > time to build a DOM tree, which Koha doesn't use. The expense
> of
> > building
> > > that tree is considerable.
> >
> > I'll have to defer to the people who specified the dependency on
> that.
> >
> > Well actually I'll to talk to some others, because that person threw
> his
> > toys
> > out of the sandpit a while ago. The dependency has been their since
> > 2007, it may well be that it's time to revisit it. However this
> close to
> > 3.2 release that probably will be needed to be backburnered until
> 3.4.
> > My main desire was to not have the 3.2.0 release needing to change
> the
> > parserdetails.ini file like the 3.0.x ones have.
> >
> > C4::Context would override it everywhere in Koha, not just where we
> > throw around MARCXML. So as it stands in the files that actually use
> > MARC::File::XML, it does mean that you use LibXML::SAX non parser to
> > parse the config file (faster) and the one that we think is the only
> > safe one, to parse the MARCXML
> >
> > By removing it from the parserdetails.ini and allowing C4::Context
> to
> > parse koha-conf.xml with a faster parser we do get a performance win
> on
> > every single page.
> >
> > Sorry I should have explained better.
> > Chris
> > >
> > > On Tue, Apr 27, 2010 at 4:04 PM, Chris Cormack
> > <chrisc at catalyst.net.nz>
> > > wrote:
> > >
> > > * Clay Fouts (cfouts at ptfs.com) wrote:
> > > > Hi, Chris.
> > > > Wouldn't it make more sense to define this in a single
> > place? If a
> > > user
> > > > decides at some point in the future to swap parsers,
> they'd
> > have to
> > > change
> > > > dozens of files to remain consistent.
> > > > Also, not everyone uses XML::LibXML::SAX::Parser, so it
> > seems
> > > > inappropriate to hard code that into the source. For
> > example,
> > > > XML::LibXML::SAX is much faster and functionally
> equitable
> > for
> > > Koha's
> > > > purposes.
> > > > Cheers,
> > > > Clay
> > >
> > > Hi Clay
> > >
> > > The MARC editing Koha does, does depend on
> > XML::LibXML::SAX::Parser, the
> > > current situation means people have to edit parserdetails.ini
> to
> > make
> > > that the default. This means that now everything on the
> system
> > uses it.
> > > Bad ... In the packaging work Lars is doing, there is no way
> that
> > doing
> > > it that way that will make it into debian.
> > >
> > > No one should use Koha with other than
> XML::LibXML::SAX::Parser
> > because
> > > although the others maybe functionally equitable, things
> break
> > with
> > > them, specifically in MARC::File::XML
> > >
> > > So since Koha depends on this parser, it makes more sense to
> me
> > to
> > > define it in the files that depend on it. Thus allowing the
> user
> > to use
> > > whatever else Parse they look for the rest of their system.
> > >
> > > It also allows the debian packaging to continue.
> > >
> > > The other option is of course to add this to MARC::File::XML
> > instead.
> > >
> > > Hope this helps explain
> > >
> > > Chris
> > > >
> > > > On Tue, Apr 27, 2010 at 2:34 PM, Chris Cormack
> > > <chrisc at catalyst.net.nz>
> > > > wrote:
> > > >
> > > > ---
> > > > C4/Biblio.pm |
> 2
> > ++
> > > > C4/Record.pm |
> 4
> > ++++
> > > > cataloguing/addbiblio.pl |
> 3
> > +++
> > > > cataloguing/additem.pl |
> 4
> > ++++
> > > > cataloguing/moveitem.pl |
> 2
> > ++
> > > > installer/data/mysql/update22to30.pl |
> 1
> > +
> > > > installer/data/mysql/updatedatabase.pl |
> 2
> > ++
> > > > misc/batchImportMARCWithBiblionumbers.pl |
> 17
> > > > +++++++++++++++++
> > > > misc/batchupdateISBNs.pl |
> 2
> > ++
> > > > misc/maintenance/MARC21_utf8_flag_fix.pl |
> 3
> > +++
> > > > .../migration_tools/22_to_30/export_Authorities.pl |
> 2
> > ++
> > > > .../22_to_30/export_Authorities_xml.pl |
> 2
> > ++
> > > > .../22_to_30/move_marc_to_biblioitems.pl |
> 2
> > ++
> > > > misc/migration_tools/bulkauthimport.pl |
> 2
> > ++
> > > > misc/migration_tools/bulkmarcimport.pl |
> 2
> > ++
> > > > misc/sax_parser_print.pl |
> 1
> > +
> > > > misc/sax_parser_test.pl |
> 2
> > ++
> > > > 17 files changed, 53 insertions(+), 0 deletions(-)
> > > >
> > > > diff --git a/C4/Biblio.pm b/C4/Biblio.pm
> > > > index e24bee8..24ba2f1 100644
> > > > --- a/C4/Biblio.pm
> > > > +++ b/C4/Biblio.pm
> > > > @@ -17,6 +17,8 @@ package C4::Biblio;
> > > > # with Koha; if not, write to the Free Software
> > Foundation,
> > > Inc.,
> > > > # 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301
> > USA.
> > > >
> > > > +BEGIN { $XML::SAX::ParserPackage =
> > "XML::LibXML::SAX::Parser"; }
> > > # Make
> > > > sure MARC::File::XML uses the correct parser
> > > > +
> > > > use strict;
> > > > use warnings;
> > > >
> > > > diff --git a/C4/Record.pm b/C4/Record.pm
> > > > index 233fa6e..28903ee 100644
> > > > --- a/C4/Record.pm
> > > > +++ b/C4/Record.pm
> > > > @@ -2,6 +2,7 @@ package C4::Record;
> > > > #
> > > > # Copyright 2006 (C) LibLime
> > > > # Joshua Ferraro <jmf at liblime.com>
> > > > +# Copyright 2010 Catalyst IT Limited
> > > > #
> > > > # This file is part of Koha.
> > > > #
> > > > @@ -19,6 +20,9 @@ package C4::Record;
> > > > # 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301
> > USA.
> > > > #
> > > > #
> > > > +
> > > > +BEGIN { $XML::SAX::ParserPackage =
> > "XML::LibXML::SAX::Parser"; }
> > > # Make
> > > > sure MARC::File::XML uses the correct parser
> > > > +
> > > > use strict;
> > > > #use warnings; FIXME - Bug 2505
> > > >
> > > > diff --git a/cataloguing/addbiblio.pl
> > b/cataloguing/addbiblio.pl
> > > > index 7a1ed65..af73822 100755
> > > > --- a/cataloguing/addbiblio.pl
> > > > +++ b/cataloguing/addbiblio.pl
> > > > @@ -2,6 +2,7 @@
> > > >
> > > > # Copyright 2000-2002 Katipo Communications
> > > > +# Copyright 2010 Catalyst IT Limited
> > > > #
> > > > # This file is part of Koha.
> > > > #
> > > > @@ -18,6 +19,8 @@
> > > > # with Koha; if not, write to the Free Software
> > Foundation,
> > > Inc.,
> > > > # 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301
> > USA.
> > > >
> > > > +BEGIN { $XML::SAX::ParserPackage =
> > "XML::LibXML::SAX::Parser"; }
> > > # Make
> > > > sure MARC::File::XML uses the correct parser
> > > > +
> > > > use strict;
> > > > #use warnings; FIXME - Bug 2505
> > > > use CGI;
> > > > diff --git a/cataloguing/additem.pl
> > b/cataloguing/additem.pl
> > > > index bd3876e..40ccba9 100755
> > > > --- a/cataloguing/additem.pl
> > > > +++ b/cataloguing/additem.pl
> > > > @@ -2,6 +2,7 @@
> > > >
> > > > # Copyright 2000-2002 Katipo Communications
> > > > +# Copyright 2010 Catalyst IT Limited
> > > > #
> > > > # This file is part of Koha.
> > > > #
> > > > @@ -18,6 +19,9 @@
> > > > # with Koha; if not, write to the Free Software
> > Foundation,
> > > Inc.,
> > > > # 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301
> > USA.
> > > >
> > > > +
> > > > +BEGIN { $XML::SAX::ParserPackage =
> > "XML::LibXML::SAX::Parser"; }
> > > # Make
> > > > sure MARC::File::XML uses the correct parser
> > > > +
> > > > use strict;
> > > > #use warnings; FIXME - Bug 2505
> > > > use CGI;
> > > > diff --git a/cataloguing/moveitem.pl
> > b/cataloguing/moveitem.pl
> > > > index e0ab551..31b0093 100755
> > > > --- a/cataloguing/moveitem.pl
> > > > +++ b/cataloguing/moveitem.pl
> > > > @@ -19,6 +19,8 @@
> > > > # with Koha; if not, write to the Free Software
> > Foundation,
> > > Inc.,
> > > > # 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301
> > USA.
> > > >
> > > > +BEGIN { $XML::SAX::ParserPackage =
> > "XML::LibXML::SAX::Parser"; }
> > > # Make
> > > > sure MARC::File::XML uses the correct parser
> > > > +
> > > > use strict;
> > > > #use warnings; FIXME - Bug 2505
> > > > use CGI;
> > > > diff --git a/installer/data/mysql/update22to30.pl
> > > > b/installer/data/mysql/update22to30.pl
> > > > index cb4379b..f314749 100755
> > > > --- a/installer/data/mysql/update22to30.pl
> > > > +++ b/installer/data/mysql/update22to30.pl
> > > > @@ -11,6 +11,7 @@
> > > > # - Would also be a good idea to offer to do a backup
> at
> > this
> > > time...
> > > >
> > > > # NOTE: If you do something more than once in here,
> make
> > it
> > > table
> > > > driven.
> > > > +BEGIN { $XML::SAX::ParserPackage =
> > "XML::LibXML::SAX::Parser"; }
> > > # Make
> > > > sure MARC::File::XML uses the correct parser
> > > > use strict;
> > > > #use warnings; FIXME - Bug 2505
> > > >
> > > > diff --git a/installer/data/mysql/updatedatabase.pl
> > > > b/installer/data/mysql/updatedatabase.pl
> > > > index 201e30c..8a2c15d 100755
> > > > --- a/installer/data/mysql/updatedatabase.pl
> > > > +++ b/installer/data/mysql/updatedatabase.pl
> > > > @@ -14,6 +14,8 @@
> > > >
> > > > # NOTE: Please keep the version in kohaversion.pl
> > up-to-date!
> > > >
> > > > +BEGIN { $XML::SAX::ParserPackage =
> > "XML::LibXML::SAX::Parser"; }
> > > # Make
> > > > sure MARC::File::XML uses the correct parser
> > > > +
> > > > use strict;
> > > > use warnings;
> > > >
> > > > diff --git a/misc/batchImportMARCWithBiblionumbers.pl
> > > > b/misc/batchImportMARCWithBiblionumbers.pl
> > > > index 4acc02b..8ca8a34 100755
> > > > --- a/misc/batchImportMARCWithBiblionumbers.pl
> > > > +++ b/misc/batchImportMARCWithBiblionumbers.pl
> > > > @@ -1,6 +1,22 @@
> > > > #!/usr/bin/perl
> > > > # load records that already have biblionumber set
> into a
> > koha
> > > system
> > > > # Written by TG on 10/04/2006
> > > > +
> > > > +# This file is part of Koha.
> > > > +#
> > > > +# Koha is free software; you can redistribute it
> and/or
> > modify
> > > it under
> > > > the
> > > > +# terms of the GNU General Public License as
> published by
> > the
> > > Free
> > > > Software
> > > > +# Foundation; either version 2 of the License, or (at
> > your
> > > option) any
> > > > later
> > > > +# version.
> > > > +#
> > > > +# Koha is distributed in the hope that it will be
> useful,
> > but
> > > WITHOUT
> > > > ANY
> > > > +# WARRANTY; without even the implied warranty of
> > MERCHANTABILITY
> > > or
> > > > FITNESS FOR
> > > > +# A PARTICULAR PURPOSE. See the GNU General Public
> > License for
> > > more
> > > > details.
> > > > +#
> > > > +# You should have received a copy of the GNU General
> > Public
> > > License
> > > > along
> > > > +# with Koha; if not, write to the Free Software
> > Foundation,
> > > Inc.,
> > > > +# 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301
> > USA.
> > > > +
> > > > use strict;
> > > > #use warnings; FIXME - Bug 2505
> > > > BEGIN {
> > > > @@ -8,6 +24,7 @@ BEGIN {
> > > > # test carefully before changing this
> > > > use FindBin;
> > > > eval { require "$FindBin::Bin/kohalib.pl" };
> > > > + $XML::SAX::ParserPackage =
> > "XML::LibXML::SAX::Parser";
> > > > }
> > > >
> > > > # Koha modules used
> > > > diff --git a/misc/batchupdateISBNs.pl
> > b/misc/batchupdateISBNs.pl
> > > > index 8cc8c6a..6041b80 100755
> > > > --- a/misc/batchupdateISBNs.pl
> > > > +++ b/misc/batchupdateISBNs.pl
> > > > @@ -30,6 +30,8 @@ BEGIN {
> > > > # test carefully before changing this
> > > > use FindBin;
> > > > eval { require "$FindBin::Bin/kohalib.pl" };
> > > > + $XML::SAX::ParserPackage =
> > "XML::LibXML::SAX::Parser";
> > > > +
> > > > }
> > > > use C4::Context;
> > > > use MARC::File::XML;
> > > > diff --git a/misc/maintenance/MARC21_utf8_flag_fix.pl
> > > > b/misc/maintenance/MARC21_utf8_flag_fix.pl
> > > > index e7e9156..fbff241 100755
> > > > --- a/misc/maintenance/MARC21_utf8_flag_fix.pl
> > > > +++ b/misc/maintenance/MARC21_utf8_flag_fix.pl
> > > > @@ -1,6 +1,7 @@
> > > > #!/usr/bin/perl
> > > > #
> > > > # Copyright 2009 Liblime
> > > > +# Copyright 2010 Catalyst IT
> > > > #
> > > > # This file is part of Koha.
> > > > #
> > > > @@ -17,6 +18,8 @@
> > > > # with Koha; if not, write to the Free Software
> > Foundation,
> > > Inc.,
> > > > # 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301
> > USA.
> > > >
> > > > +BEGIN { $XML::SAX::ParserPackage =
> > "XML::LibXML::SAX::Parser"; }
> > > # Make
> > > > sure correct sax parser is used
> > > > +
> > > > use strict;
> > > > use warnings;
> > > >
> > > > diff --git
> > a/misc/migration_tools/22_to_30/export_Authorities.pl
> > > > b/misc/migration_tools/22_to_30/export_Authorities.pl
> > > > index aeed3f2..cd24a7d 100755
> > > > ---
> a/misc/migration_tools/22_to_30/export_Authorities.pl
> > > > +++
> b/misc/migration_tools/22_to_30/export_Authorities.pl
> > > > @@ -6,6 +6,8 @@ BEGIN {
> > > > # test carefully before changing this
> > > > use FindBin;
> > > > eval { require "$FindBin::Bin/../../kohalib.pl"
> };
> > > > + $XML::SAX::ParserPackage =
> > "XML::LibXML::SAX::Parser"; #
> > > Make sure
> > > > correct sax parser is used
> > > > +
> > > > }
> > > > use C4::Context;
> > > > #use MARC::File::XML(BinaryEncoding=>"utf8");
> > > > diff --git
> > > a/misc/migration_tools/22_to_30/export_Authorities_xml.pl
> > > >
> b/misc/migration_tools/22_to_30/export_Authorities_xml.pl
> > > > index 70647ee..bd871c4 100755
> > > > ---
> > a/misc/migration_tools/22_to_30/export_Authorities_xml.pl
> > > > +++
> > b/misc/migration_tools/22_to_30/export_Authorities_xml.pl
> > > > @@ -6,6 +6,8 @@ BEGIN {
> > > > # test carefully before changing this
> > > > use FindBin;
> > > > eval { require "$FindBin::Bin/../../kohalib.pl"
> };
> > > > + $XML::SAX::ParserPackage =
> > "XML::LibXML::SAX::Parser"; #
> > > Make sure
> > > > correct sax parser is used
> > > > +
> > > > }
> > > > use C4::Context;
> > > > use MARC::File::XML(BinaryEncoding=>"utf8");
> > > > diff --git
> > > a/misc/migration_tools/22_to_30/move_marc_to_biblioitems.pl
> > > >
> > b/misc/migration_tools/22_to_30/move_marc_to_biblioitems.pl
> > > > index 3635dbc..051d93b 100755
> > > > ---
> > a/misc/migration_tools/22_to_30/move_marc_to_biblioitems.pl
> > > > +++
> > b/misc/migration_tools/22_to_30/move_marc_to_biblioitems.pl
> > > > @@ -8,6 +8,8 @@ BEGIN {
> > > > # test carefully before changing this
> > > > use FindBin;
> > > > eval { require "$FindBin::Bin/../../kohalib.pl"
> };
> > > > + $XML::SAX::ParserPackage =
> > "XML::LibXML::SAX::Parser"; #
> > > Make sure
> > > > correct sax parser is used
> > > > +
> > > > }
> > > > use C4::Context;
> > > > use C4::Biblio;
> > > > diff --git a/misc/migration_tools/bulkauthimport.pl
> > > > b/misc/migration_tools/bulkauthimport.pl
> > > > index 2709db9..5f859a1 100755
> > > > --- a/misc/migration_tools/bulkauthimport.pl
> > > > +++ b/misc/migration_tools/bulkauthimport.pl
> > > > @@ -8,6 +8,8 @@ BEGIN {
> > > > # test carefully before changing this
> > > > use FindBin;
> > > > eval { require "$FindBin::Bin/kohalib.pl" };
> > > > + $XML::SAX::ParserPackage =
> > "XML::LibXML::SAX::Parser"; #
> > > Make sure
> > > > correct sax parser is used
> > > > +
> > > > }
> > > >
> > > > # Koha modules used
> > > > diff --git a/misc/migration_tools/bulkmarcimport.pl
> > > > b/misc/migration_tools/bulkmarcimport.pl
> > > > index da0ebf7..5b1d823 100755
> > > > --- a/misc/migration_tools/bulkmarcimport.pl
> > > > +++ b/misc/migration_tools/bulkmarcimport.pl
> > > > @@ -9,6 +9,8 @@ BEGIN {
> > > > # test carefully before changing this
> > > > use FindBin;
> > > > eval { require "$FindBin::Bin/../kohalib.pl" };
> > > > + $XML::SAX::ParserPackage =
> > "XML::LibXML::SAX::Parser"; #
> > > Make sure
> > > > correct sax parser is used
> > > > +
> > > > }
> > > >
> > > > # Koha modules used
> > > > diff --git a/misc/sax_parser_print.pl
> > b/misc/sax_parser_print.pl
> > > > index d206a5e..ab5d1ff 100755
> > > > --- a/misc/sax_parser_print.pl
> > > > +++ b/misc/sax_parser_print.pl
> > > > @@ -3,6 +3,7 @@
> > > >
> > > > use strict;
> > > > use warnings;
> > > > +BEGIN { $XML::SAX::ParserPackage =
> > "XML::LibXML::SAX::Parser"; }
> > > > use XML::SAX::ParserFactory;
> > > >
> > > > my $parser = XML::SAX::ParserFactory->parser();
> > > > diff --git a/misc/sax_parser_test.pl
> > b/misc/sax_parser_test.pl
> > > > index b2e5974..359bb45 100755
> > > > --- a/misc/sax_parser_test.pl
> > > > +++ b/misc/sax_parser_test.pl
> > > > @@ -1,5 +1,7 @@
> > > > #!/usr/bin/perl
> > > >
> > > > +BEGIN { $XML::SAX::ParserPackage =
> "XML::LibXML::SAX"; }
> > > > +
> > > > use strict;
> > > > use warnings;
> > > > --
> > > > 1.6.3.3
> > > >
> > > > _______________________________________________
> > > > Koha-patches mailing list
> > > > Koha-patches at lists.koha.org
> > > > http://lists.koha.org/mailman/listinfo/koha-patches
> > >
> > > --
> > > Chris Cormack
> > > Catalyst IT Ltd.
> > > +64 4 803 2238
> > > PO Box 11-053, Manners St, Wellington 6142, New Zealand
> >
> > --
> > Chris Cormack
> > Catalyst IT Ltd.
> > +64 4 803 2238
> > PO Box 11-053, Manners St, Wellington 6142, New Zealand
>
> --
> Chris Cormack
> Catalyst IT Ltd.
> +64 4 803 2238
> PO Box 11-053, Manners St, Wellington 6142, New Zealand
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: </pipermail/koha-devel/attachments/20100428/930d23fe/attachment-0002.htm>
More information about the Koha-devel
mailing list