[OS X TeX] converting ligatures into text
William F. Adams
wadams at atlis.com
Fri Apr 22 09:48:28 EDT 2005
On Apr 22, 2005, at 9:37 AM, Lawrence Paulson wrote:
> I have to extract text from a large number of PDF documents produced
> using TeX. Because (I presume) of TeX's non-standard font encodings,
> cut and paste often goes wrong. In particular, ligatures get garbled:
> I get di±cult instead of difficult.
> Does anybody know of a program (or of a definitive set of replacements
> that could be given to Perl) for cleaning up such text?
Marcel Weiher's TextLightning.app has explicit support for TeX ligature
encodings: http://www.metaobject.com. It is shareware and well worth the
money. (Obligatory disclaimer: I was a beta tester.)
I'm surprised you're having this difficulty though --- what program are
you using? Isn't there a pdftotext program as part of xpdf?
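If a replacement table is still wanted, here is a minimal sketch in Python
rather than Perl. The mapping is an assumption, not a definitive list: it
supposes the PDFs use Computer Modern Type 1 fonts whose ligature slots
11-15 are also reachable at positions 174-178, so the ligatures come out
as the Latin-1 characters in that range (which matches the di±cult example
quoted above). The names LIGATURE_FIXES and fix_ligatures are made up for
illustration.

```python
# Assumed mapping from garbled characters to the ligatures they stand for.
# Verify it against a sample of your own PDFs before relying on it.
LIGATURE_FIXES = {
    "\u00ae": "ff",   # registered sign, seen as e®ect for "effect"
    "\u00af": "fi",   # macron, seen as ¯rst for "first"
    "\u00b0": "fl",   # degree sign
    "\u00b1": "ffi",  # plus-minus sign, seen as di±cult for "difficult"
    "\u00b2": "ffl",  # superscript two
    # Unicode's own ligature code points, in case the extractor emits them.
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
}

# str.maketrans accepts a dict of single characters to replacement strings.
_TABLE = str.maketrans(LIGATURE_FIXES)


def fix_ligatures(text: str) -> str:
    """Replace every character listed in LIGATURE_FIXES with plain letters."""
    return text.translate(_TABLE)
```

For example, fix_ligatures("di±cult") returns "difficult". Note that this
replaces the characters blindly, so a genuine ±, °, ® or ² in the text would
be turned into letters as well; restricting the substitution to positions
next to letters may be safer for some documents.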
William Adams, publishing specialist
voice - 717-731-6707 | Fax - 717-731-6708
--------------------- Info ---------------------
Mac-TeX Website: http://www.esm.psu.edu/mac-tex/
& FAQ: http://latex.yauh.de/faq/
TeX FAQ: http://www.tex.ac.uk/faq
List Post: <mailto:MacOSX-TeX at email.esm.psu.edu>