[Koha-devel] Strange characters
dcook at prosentient.com.au
Thu Dec 17 03:30:33 CET 2020
So EF BF BD is the box with a question mark �.
If you look at commit 100e6a9808ead4ee8d951da59ead1550e75bb4c3, you'll see the following:
-# Casta\361eda, Carlos Sebastian - seba3c at yahoo.com.ar - Physics Library UNLP Argentina
+# Casta�eda, Carlos Sebastian - seba3c at yahoo.com.ar - Physics Library UNLP Argentina
361 is the octal value of the byte 0xF1, which is ñ in Latin-1 (ISO-8859-1).
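To make the byte arithmetic concrete, here is a small Python sketch (not anything from the Koha tree) of how a Latin-1 ñ can end up as EF BF BD. It assumes the file was at some point read as UTF-8 by a tool that replaces invalid bytes rather than failing, which is a guess about what happened in that commit.

```python
# Latin-1 (ISO-8859-1) stores ñ (U+00F1) as the single byte 0xF1, octal 361.
latin1_bytes = "Casta\u00f1eda".encode("latin-1")
assert latin1_bytes == b"Casta\361eda"  # \361 is an octal escape for 0xF1

# A lone 0xF1 is not valid UTF-8, so a lenient decoder substitutes
# U+FFFD REPLACEMENT CHARACTER for it.
mangled = latin1_bytes.decode("utf-8", errors="replace")
assert mangled == "Casta\ufffdeda"

# Writing the mangled text back out as UTF-8 yields the bytes EF BF BD.
print(mangled.encode("utf-8").hex(" "))
# prints: 43 61 73 74 61 ef bf bd 65 64 61
```

Once that has happened the original byte is gone, so the ñ has to be restored by hand.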
I wrote this Perl one-liner to find these:
find . -not -path '*/\.git/*' -type f -exec perl -lne 'print $ARGV if /\xef\xbf\xbd/' {} \; | sort -u
I found the following in 19.11:
./koha-tmpl/intranet-tmpl/lib/datatables/datatables.js
Seemingly intentional...
./koha-tmpl/intranet-tmpl/lib/datatables/datatables.min.js
Seemingly intentional
./koha-tmpl/intranet-tmpl/lib/yui/plugins/loading-min.js
Looks like a typo in a name like koha-news.pl
./misc/migration_tools/buildEDITORS.pl
Lots in a commented-out section which should probably just be deleted...
./misc/release_notes/release_notes_19_11_02.html
Input error in an organisation name?
./misc/release_notes/release_notes_19_11_02.md
Input error in an organisation name?
./t/db_dependent/data/marc21/zebraexport/biblio/exported_records
In record data?
./tools/koha-news.pl
You already know this one
Using vim, you can just search for �.
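If Perl isn't handy, roughly the same search as the find/perl one-liner above can be sketched in Python. The function name files_with_replacement_char is just made up for this example, and it reads each file fully into memory, so treat it as a sketch rather than a polished tool.

```python
import os

REPLACEMENT_BYTES = b"\xef\xbf\xbd"  # UTF-8 encoding of U+FFFD


def files_with_replacement_char(root="."):
    """Return sorted paths of regular files under root containing EF BF BD.

    Hypothetical helper for illustration; it skips any .git directory,
    like the -not -path '*/\\.git/*' filter in the find command.
    """
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Only look at regular files, similar to find's -type f.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    data = handle.read()
            except OSError:
                continue  # unreadable file, skip it
            if REPLACEMENT_BYTES in data:
                hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    for hit in files_with_replacement_char("."):
        print(hit)
```

Run from the top of a checkout, it should print one path per line, much like the sort -u output.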
David Cook
Software Engineer
Prosentient Systems
Suite 7.03
6a Glen St
Milsons Point NSW 2061
Australia
Office: 02 9212 0899
Online: 02 8005 0595
-----Original Message-----
From: Koha-devel <koha-devel-bounces at lists.koha-community.org> On Behalf Of Didier Gautheron
Sent: Thursday, 17 December 2020 3:03 AM
To: koha-devel <koha-devel at lists.koha-community.org>
Subject: Re: [Koha-devel] Strange characters
Hi,
On 16 December 2020 at 16:23, "Fridolin SOMERS" <fridolin.somers at biblibre.com> wrote:
> Hi,
>
> I found some strange characters in sources :
>
> https://git.koha-community.org/Koha-community/Koha/src/branch/master/tools/koha-news.pl#L7
>
> I see a <?> :
> Casta?eda, Carlos Sebastian
>
> Do you see that ?
It seems to be valid UTF-8:
ef bf bd
Character name REPLACEMENT CHARACTER
Likely from an old Windows file, with ñ being the culprit.
> Is this non-UTF8 ?
> Can we build a command to find them all ?
> I've tried with 'grep -P' but impossible.
git grep
finds them, with false positives; or use iconv?
iconv -f utf8 -t utf8
should complain if there are invalid sequences
eg:
LANG=C iconv -f utf8 -t utf8 ./misc/cronjobs/automatic_renewals.pl > /dev/null
iconv: illegal input sequence at position 81
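For comparison, a strict UTF-8 check along the lines of the iconv round trip can be sketched in Python. The helper name first_invalid_utf8_offset is invented for this example, and the offsets it returns are Python's, which may not match what iconv reports.

```python
def first_invalid_utf8_offset(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None.

    Hypothetical helper for illustration: it relies on Python's strict
    UTF-8 decoder, which raises UnicodeDecodeError on malformed input.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


# A raw Latin-1 ñ (byte 0xF1) is rejected at offset 5.
print(first_invalid_utf8_offset(b"Casta\xf1eda"))          # prints: 5

# EF BF BD is a well-formed encoding of U+FFFD, so nothing is reported.
print(first_invalid_utf8_offset(b"Casta\xef\xbf\xbdeda"))  # prints: None
```

So this kind of check catches files that are still in a legacy encoding, but not ones where the damage has already been saved as a replacement character; those need a search for EF BF BD.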