[Koha-patches] [PATCH] Bug 9114: Make frameworks import/export routines correctly use UTF-8

Tomas Cohen Arazi tomascohen at gmail.com
Wed Jan 22 19:46:23 CET 2014


Currently the import_export_framework.pl script outputs data with
Perl's default encoding, ISO-8859. This patch sets the binmode to
UTF-8 when exporting SQL and CSV files, using the PerlIO layer
(":encoding(UTF-8)") for STDOUT.
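
For reference, a minimal standalone sketch of what the encoding layer changes.
It is not part of the patch: the sample string and the in-memory filehandles
are assumptions made only for illustration, standing in for the framework data
and for STDOUT.

```perl
use strict;
use warnings;

# Hypothetical sample value with one non-ASCII character (U+00ED).
my $text = "T\x{ED}tulo";

# No encoding layer: the character is written as the single byte 0xED.
my $raw = '';
open(my $fh_raw, '>', \$raw) or die "open failed: $!";
print {$fh_raw} $text;
close($fh_raw);

# Same layer the patch sets on STDOUT: the character becomes 0xC3 0xAD.
my $encoded = '';
open(my $fh_utf8, '>', \$encoded) or die "open failed: $!";
binmode($fh_utf8, ':encoding(UTF-8)');
print {$fh_utf8} $text;
close($fh_utf8);

print "no layer: ", unpack('H*', $raw),     "\n";   # 54ed74756c6f
print "UTF-8:    ", unpack('H*', $encoded), "\n";   # 54c3ad74756c6f
```

The first byte sequence is what "file" reports as ISO-8859 text, the second
as UTF-8.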

To test:

Export step test
- Use some non-ASCII character(s) (e.g. letters with diacritics) in some field description in a chosen framework.
- Export the framework at Administration > MARC frameworks
- Run this to check the file is ISO-8859 encoded:
 $ file export_XXX.csv
 export_XXX.csv: ISO-8859 text, with very long lines
 (Note: try SQL and the other output formats too, except ODS.)
- Apply the patch
- Export the framework again (change the name), and test encoding:
 $ file export_XXX_2.csv
 export_XXX_2.csv: UTF-8 Unicode text

Import step test
I assume you have two files, export_XXX.csv (ISO-8859 encoded) and export_XXX_2.csv (UTF-8 encoded).
XXX will depend on your framework's code.
- Reset your testing branch to master
- Import export_XXX.csv
- The string with non-ASCII chars is truncated at the first non-ASCII char's position
  (Note: this is the current behaviour).
- Import export_XXX_2.csv
- The non-ASCII chars are broken, and the logs show errors about non-Unicode chars.
  (Note: even though UTF-8 is the expected encoding, the file is treated as ISO-8859).
- Apply the patch
- Import the good (UTF-8 encoded) file, export_XXX_2.csv, and check that everything works as expected.

No double encoding should occur with either combination of formats.

Regards
To+

Sponsored-by: Universidad Nacional de Cordoba
---
 admin/import_export_framework.pl | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/admin/import_export_framework.pl b/admin/import_export_framework.pl
index 8674ebf..8f8bd7c 100755
--- a/admin/import_export_framework.pl
+++ b/admin/import_export_framework.pl
@@ -34,6 +34,13 @@ if ($action eq 'export' && $input->request_method() eq 'GET') {
     my $strXml = '';
     my $format = $input->param('type_export_' . $frameworkcode);
     ExportFramework($frameworkcode, \$strXml, $format);
+
+    # Correctly set the encoding to output plain text in UTF-8
+    if ( $format eq 'csv' ||
+         $format eq 'sql' ) {
+        binmode(STDOUT,':encoding(UTF-8)');
+    }
+
     if ($format eq 'csv') {
         # CSV file
         print $input->header(-type => 'application/vnd.ms-excel', -attachment => 'export_' . $frameworkcode . '.csv');
-- 
1.8.3.2
