[Koha-devel] Diacriticals, Unicode, and PDF's
Nathan Gray
kolibrie at graystudios.org
Tue Sep 29 03:44:20 CEST 2009
On Mon, Sep 28, 2009 at 09:21:39PM -0400, Chris Nighswonger wrote:
> The UTF to PDF conversion issue appears to be primarily caused by the
> fact that the PDF stream uses glyph IDs rather than Unicode to display
> strings. Thus there is not a direct, one-to-one Unicode-to-glyph-ID
> relationship. The reason that *some* Unicode chars come across ok is
> more ascribable to chance than to design. This happens when the
> Unicode value *happens* to match the font glyph ID. What really should be
> happening is that there should be a "ToUnicode" table built and
> embedded in the PDF file so that the relationship from Unicode to
> glyph ID may be properly defined.
[snip]
> Any thoughts, information, suggestions, etc. are most gratefully appreciated.
The cairographics project has done a lot of work on PDFs and
text-to-glyph translation, if I remember correctly.
http://cairographics.org
A Google search with these terms is a good start:
cairo graphics pdf text to glyph
It looks like they rely on the Pango libraries (something called
pangocairo in particular).
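To make the "ToUnicode" idea from the quoted message a bit more concrete,
here is a rough Python sketch of what building such a table could look
like. It is not Koha code and not how cairo does it. It only writes the
text of a ToUnicode CMap from a glyph ID to Unicode mapping. The function
name build_tounicode_cmap and the example glyph IDs are made up for
illustration, and it assumes the font is embedded with two-byte glyph IDs
(the usual Identity-H setup).

```python
def build_tounicode_cmap(glyph_to_unicode):
    """Return the text of a ToUnicode CMap for two-byte glyph IDs.

    glyph_to_unicode: dict mapping glyph ID (int) -> Unicode string.
    Hypothetical helper for illustration only.
    """
    def utf16_hex(text):
        # ToUnicode values are written as UTF-16BE hex strings.
        return text.encode("utf-16-be").hex().upper()

    entries = [
        "<%04X> <%s>" % (gid, utf16_hex(text))
        for gid, text in sorted(glyph_to_unicode.items())
    ]
    lines = [
        "/CIDInit /ProcSet findresource begin",
        "12 dict begin",
        "begincmap",
        "/CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def",
        "/CMapName /Adobe-Identity-UCS def",
        "/CMapType 2 def",
        "1 begincodespacerange",
        "<0000> <FFFF>",
        "endcodespacerange",
    ]
    # A single bfchar block may hold at most 100 entries, so split them up.
    for start in range(0, len(entries), 100):
        chunk = entries[start:start + 100]
        lines.append("%d beginbfchar" % len(chunk))
        lines.extend(chunk)
        lines.append("endbfchar")
    lines += [
        "endcmap",
        "CMapName currentdict /CMap defineresource pop",
        "end",
        "end",
    ]
    return "\n".join(lines) + "\n"


# Assumed glyph IDs; real ones have to come from the embedded font itself.
example = {3: " ", 36: "A", 112: "\u00e9"}
cmap_text = build_tounicode_cmap(example)
```

The resulting text would still have to be stored as a stream object and
referenced from the font dictionary's /ToUnicode entry. How that is done
depends on whichever PDF library is generating the file, so it is left out
here.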
-kolibrie