[Koha-devel] kohastructure.sql : columns with COLLATE

Paul A paul.a at navalmarinearchive.com
Sat Sep 28 01:50:16 CEST 2019


On 2019-09-27 10:27 a.m., Fridolin SOMERS wrote:
> Hi
> 
> I'm working on DB structure compare.
> I note that in installer/data/mysql/kohastructure.sql, a few columns 
> define CHARACTER SET and/or COLLATE, with same values as table.
> 
> Is this usefull or should we remove it ?

Might I assume that this is in a context of upgrade rather than new 
install? If not, the following is perhaps irrelevant.

In the "old days", (+/- 5 years ago, Debian 7, Ubuntu 14; MySQL 5.7 -- 
all "thereabouts"), my[sql].cnf in its various formats and paths used to 
include:
    # ensure charsets.
    character-set-server=utf8
    collation-server=utf8_general_ci

This has been dropped in more recent releases, corresponding to 
introduction dates of utf8mb4 (most importantly 3- to 4-bit encoding 
which can seriously impact some field lengths) -- core functionality is 
perhaps presumed.

Specificity may well be useful for 'international' and 'backwards 
compatibility' reasons -- both of which are important to Koha where 
admins may be very late in upgrading os/MySQL/Koha versions. The use of 
COLLATE with a specific CHARACTER SET should|could|might find (various 
find|sorting algorithms) existing, relevant data -- despite the fact 
that Koha appears to respect a system (Debian, Ubuntu) wide 'plain' utf8.

However, MySQL (8+) and InnoDB may not be totally aligned (I have no 
practical knowledge of MariaDB), and the char sets utf8, utf8mb4, 
utf8mb4_general_ci, utf8mb4_unicode_ci (maybe more) are an ongoing 
source of improvement and discussion.

Surely Koha should not remove this [not-neccessarily-redundant] back-stop.

Best -- Paul
> 
> $ grep  'COLLATE' installer/data/mysql/kohastructure.sql  | grep -v 
> ENGIN | grep 'utf8mb4_unicode_ci'
>    `category_name` varchar(32) COLLATE utf8mb4_unicode_ci NOT NULL 
> DEFAULT '',
>    `font` char(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT 
> NULL DEFAULT 'TR',
>    `text_justify` char(1) CHARACTER SET utf8mb4 COLLATE 
> utf8mb4_unicode_ci NOT NULL DEFAULT 'L',
>    `auto_renew_error` varchar(32) COLLATE utf8mb4_unicode_ci DEFAULT 
> NULL, -- automatic renewal error
>    `auto_renew_error` varchar(32) COLLATE utf8mb4_unicode_ci DEFAULT 
> NULL, -- automatic renewal error
>    `suppliers_report` MEDIUMTEXT COLLATE utf8mb4_unicode_ci, -- reports 
> received from suppliers
>        entity varchar(16) COLLATE utf8mb4_unicode_ci NOT NULL,
>        code varchar(64) COLLATE utf8mb4_unicode_ci NOT NULL,
>        lang varchar(25) COLLATE utf8mb4_unicode_ci NOT NULL, -- could be 
> a foreign key
>        translation MEDIUMTEXT COLLATE utf8mb4_unicode_ci,
>    `branchcode` varchar(10) CHARACTER SET utf8mb4 COLLATE 
> utf8mb4_unicode_ci DEFAULT NULL,
> 



More information about the Koha-devel mailing list