[OS X TeX] Documents less legible with T1 font encoding

Bruno Voisin bvoisin at mac.com
Fri Aug 5 13:48:26 EDT 2005


On 5 Aug 2005, at 19:27, Armin Goralczyk wrote:

> So to get this clear (I am a little bit confused): T1 hyphenates
> words with accents or umlauts correctly, but CM-Super and Latin
> Modern differ?! Why is that? Does that mean CM-Super and/or Latin  
> Modern hyphenation is incorrect or not optimal?

I do not think the rules of hyphenation differ between CM-Super and
Latin Modern; they do not depend on the font directly.

With the standard CM fonts, accented characters do not exist as
such: they are composites, built when the TeX document is typeset
by combining an accent character with an unaccented letter. For
example, é is in fact the composite e + ´. It is input through the
command \'e, so that the French word anémie (anaemia) would be
entered as an\'emie. Alas, TeX cannot hyphenate words containing
such accent commands; as a consequence, a word containing an
accented letter cannot be hyphenated at all as long as CM fonts
are used.
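
To see this for yourself, here is a minimal test file (only a sketch;
the file is mine, not something taken from a package manual). It uses
no fontenc, so the default OT1 encoding and the CM fonts are in
effect, and asks TeX to report the break points it finds. The default
language is English here, so the patterns are not the French ones;
the point is only to look at which break points, if any, get reported
around the accent.

```latex
\documentclass{article}
% No fontenc package: default OT1 encoding with the CM fonts,
% so \'e is typeset as a composite of an accent and the letter e.
\begin{document}
% \showhyphens writes the hyphenation points TeX finds
% to the terminal and to the .log file.
\showhyphens{an\'emie}
\end{document}
```

Typeset it and look in the .log file for the line following the
"Underfull \hbox" message.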

However, the babel package introduces some trickery, such that a word
containing an accent command is split into two parts, each of which
can be hyphenated individually, yielding possible hyphenation either
before or after the accent. This is better, but still not optimal, as
not all possible break points will be found, and those that are found
are not necessarily correct.

With T1 fonts, by contrast, and assuming you use the inputenc
package with the correct encoding, you are both (1) entering accented
letters as letters on your keyboard (like é), thanks to the inputenc
package, with no need for commands, and (2) getting output in which
the accented letters are individual characters (like é), thanks to
the fontenc package and to the installation of the appropriate
fonts. And your hyphenation points are finally correct!
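
A minimal preamble for this setup could look like the following.
This is a sketch under two assumptions of mine: that the source file
is saved as UTF-8 (replace utf8 by applemac, latin1, etc. to match
what your editor actually writes), and that the French hyphenation
patterns are installed so that babel can select them.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}  % assumption: source file saved as UTF-8
\usepackage[T1]{fontenc}     % accented letters are real glyphs in the font
\usepackage[french]{babel}   % assumption: French patterns are installed
\begin{document}
% The accented letter is typed directly; TeX sees a single word.
\showhyphens{anémie}
\end{document}
```

As before, the break points found are reported in the .log file.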

Regarding CM-Super and Latin Modern, the difference has another
origin: the metrics of the two font sets are very subtly different.
This means that the width of each letter and the amount of white
space between letters are very slightly different between the two
sets. As a consequence, the word that falls at the end of a line in
a long paragraph, and thus has to be hyphenated, may be different
depending on which set is used, yielding different hyphenated words.
However, the hyphenation rules and algorithm should be exactly the
same in the two cases.
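
If you want to compare the two sets on the same text, something like
the following should do (again only a sketch, with the same
assumptions about UTF-8 input and installed French patterns). With
T1 and no further font package, LaTeX uses the EC fonts, for which
CM-Super supplies the Type 1 outlines when it is installed; loading
lmodern switches to Latin Modern instead.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}  % assumption: source file saved as UTF-8
\usepackage[T1]{fontenc}
\usepackage[french]{babel}   % assumption: French patterns are installed
% Variant A: leave the next line commented out (EC fonts / CM-Super).
% Variant B: uncomment it to switch to Latin Modern.
%\usepackage{lmodern}
\begin{document}
% Replace this with a long paragraph of your own.
Texte du paragraphe à comparer.
\end{document}
```

Typeset the file once for each variant and compare where the lines
break.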

Bruno Voisin