[OS X TeX] converting ligatures into text

William F. Adams wadams at atlis.com
Fri Apr 22 09:48:28 EDT 2005


On Apr 22, 2005, at 9:37 AM, Lawrence Paulson wrote:

>  I have to extract text from a large number of PDF documents produced 
> using TeX. Because (I presume) of TeX's non-standard font encodings, 
> cut and paste often goes wrong. In particular, ligatures get garbled: 
> I get di±cult instead of difficult.
>
> Does anybody know of a program (or of a definitive set of replacements 
> that could be given to Perl) for cleaning up such text?

Marcel Weiher's TextLightning.app (http://www.metaobject.com) has 
explicit support for TeX ligature encodings. It is shareware and well 
worth the money. (Obligatory disclosure: I was a beta tester.)

I'm surprised you're having this difficulty, though. What program are 
you using? Isn't there a pdftotext program as part of xpdf?
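
On the question of a set of replacements: here is a minimal Python 
sketch rather than a definitive table. Only the mapping of ± to ffi is 
confirmed by the example quoted above. The other four entries are an 
assumption, based on Type 1 Computer Modern fonts that duplicate the 
ligature slots 11-15 at codes 174-178. The name fix_ligatures is 
hypothetical, and the table should be checked against the actual PDFs 
before it is trusted.

```python
import re

# Garbled character -> ligature it most likely stands for.
# Only the "ffi" entry is confirmed by the example in this thread;
# the remaining entries are assumptions about the font encoding.
LIGATURES = {
    "\u00ae": "ff",   # ® (assumed)
    "\u00af": "fi",   # ¯ (assumed)
    "\u00b0": "fl",   # ° (assumed)
    "\u00b1": "ffi",  # ± (seen in "di±cult")
    "\u00b2": "ffl",  # ² (assumed)
}

# Match a candidate character only when it touches an ASCII letter,
# so that free-standing symbols such as "5 ± 2" are left untouched.
_PATTERN = re.compile(
    r"(?<=[A-Za-z])[\u00ae-\u00b2]|[\u00ae-\u00b2](?=[A-Za-z])"
)


def fix_ligatures(text):
    """Replace garbled ligature characters with their letter sequences."""
    return _PATTERN.sub(lambda m: LIGATURES[m.group(0)], text)


if __name__ == "__main__":
    print(fix_ligatures("I get di\u00b1cult instead of difficult."))
```

The adjacency check is only a heuristic: a genuine symbol that directly 
follows a letter, such as the superscript in "x²", would still be 
rewritten, so the output needs a spot check.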

William

-- 
William Adams, publishing specialist
voice - 717-731-6707 | Fax - 717-731-6708
www.atlis.com
