Re: [OS X TeX] utf8 problem and one TeXShop bug

Peter Dyballa Peter_Dyballa at Web.DE
Thu Apr 19 05:55:42 EDT 2007

On 19.04.2007 at 03:49, Chabot Denis wrote:

> But in this example, I took care of setting the encoding to utf8 in  
> preferences before opening this file, and all the accented vowels  
> and degree signs etc. display just fine in TeXShop.

That's not the question: what you see can look right, but it can be
encoded in a zillion ways, and LaTeX supports possibly a dozen of
them (fewer than TeXShop). An editor like GNU Emacs can tell you
exactly which encoding is used to present the file's contents (these
are two different things: the actual contents of a file, and the way
these contents are presented to you by some computer programme), and
it lets you state the encoding in which the buffer's contents are to
be saved to a file.
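
To make the difference between contents and presentation concrete,
here is a small sketch for the shell. It writes the same letter é
once as ISO Latin-1 and once as UTF-8 and dumps the raw bytes. The
helper name show_bytes is made up for this illustration; od, xargs
and iconv are the standard tools.

```shell
#!/bin/sh
# show_bytes (hypothetical helper): print the bytes read from standard
# input as hex values, separated by single spaces.
show_bytes() {
    od -An -tx1 | xargs echo
}

# "é" in ISO Latin-1 is the single byte 0xE9 (octal 351).
printf '\351' | show_bytes                                  # prints: e9

# The same "é" in UTF-8 is the two bytes 0xC3 0xA9 (octal 303 251).
printf '\303\251' | show_bytes                              # prints: c3 a9

# iconv turns the first representation into the second.
printf '\351' | iconv -f ISO-8859-1 -t UTF-8 | show_bytes   # prints: c3 a9
```

Both byte sequences can look identical on screen, provided the
programme showing them guesses the encoding correctly.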

With TeXShop nothing is clear or obvious; if you like, you can
/believe/ something.

> For now I'll keep this document in iso Latin 1, it is just more  
> confusing when some of my files are in one encoding and others are  
> in another. With R and Sweave (which allow me to combine R commands  
> within a LaTeX document), I need UTF8 and I thought I'd standardize  
> on this. I guess I can't do this until I understand better what  
> fails with my example.

You should be able to work with UTF-8 text encoding and TeXShop,
provided you do two or three things:

	set the encoding in TeXShop's preferences to UTF-8, save them, and quit TeXShop
	make your UTF-8 files start with %%!TEX encoding = UTF-8 Unicode
	whenever TeXShop tells you it can't decipher an encoding, don't start to work on this file

In the last case a forced quit (pressing the Alt key while choosing  
Quit in the Dock) might help to keep the integrity of the file.
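
Whether a file already starts with that encoding line can be checked
from Terminal. The following is only a sketch: the helper name
has_utf8_header is made up, and it looks at the first line only,
which is the placement recommended above.

```shell
#!/bin/sh
# has_utf8_header (hypothetical helper): succeed if the first line of
# the given file contains the TeXShop UTF-8 encoding line.
has_utf8_header() {
    head -n 1 "$1" | grep -q 'TEX encoding = UTF-8 Unicode'
}

# Report every .tex file in the current directory that lacks the line.
for f in ./*.tex; do
    [ -e "$f" ] || continue    # no .tex files: the pattern stays literal
    has_utf8_header "$f" || echo "no UTF-8 header: $f"
done
```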

On the command line you have at least one option to convert your ISO
Latin-1 LaTeX file into a correctly UTF-8 encoded LaTeX file. This
can be done with the iconv programme, but before it starts to do its
job you must make sure that the new file will be recognised by
TeXShop as a UTF-8 encoded file. And because tcsh and bash are both a
bit too curious about the exclamation mark (they take it for a
history reference), it has to be protected. In bash single quotes do
that; in tcsh put a backslash in front of it as well (\!). Don't use
a backslash inside double quotes in bash, since bash would then write
the backslash into the file:

	cd «to where the file is»
	echo '%%!TEX encoding = UTF-8 Unicode' > UTF-8_file.tex
	cat «old file name.tex» | grep -v "TEX encoding =" | iconv -f ISO-8859-1 -t UTF-8 >> UTF-8_file.tex

• In the first line you change the working directory of the shell
running in Terminal to the place where your LaTeX file resides.
• Then you create the new UTF-8 encoded file by writing a single line
into it, the header line ``%%!TEX encoding = UTF-8 Unicode´´.
• Finally the cat command reads the contents of the ISO Latin-1
encoded file as is and passes it through a UNIX pipe to the grep
command, which strips any TeXShop file encoding header line(s). This
works only if the header line is written exactly as in the argument
for grep; if not, adapt the argument to match. Whether or not such a
line exists, the cleaned result is passed through another pipe to
iconv, which interprets the input stream according to the -f(rom)
encoding and converts it into a UTF-8 encoded output stream according
to the -t(o) encoding. This data is then "redirected" from "standard
output" to the newly created file. By using ``>>´´ instead of the
simple ``>´´, the output of iconv is *added* to the previous
contents; with ``>´´ it would overwrite the header line written in
the previous step.
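
The same steps can be wrapped into a small script for repeated use.
This is a sketch under assumptions, not a tested recipe for every
system: the function name latin1_to_utf8 and the file names are made
up, the check against overwriting is an addition, and grep is run in
the C locale (also an addition) so that it treats the Latin-1 bytes
as plain bytes instead of possibly declaring the input binary.

```shell
#!/bin/sh
# latin1_to_utf8 (hypothetical helper): convert OLD (ISO Latin-1) into
# NEW (UTF-8), with the TeXShop encoding line as the first line of NEW.
latin1_to_utf8() {
    old=$1
    new=$2
    if [ -e "$new" ]; then
        echo "refusing to overwrite $new" >&2
        return 1
    fi
    # Step 1: create the new file containing only the header line.
    # A script does no history expansion, so "!" needs no escaping here.
    printf '%s\n' '%%!TEX encoding = UTF-8 Unicode' > "$new"
    # Step 2: drop any old header line, convert, and append (>>).
    LC_ALL=C grep -v 'TEX encoding =' "$old" |
        iconv -f ISO-8859-1 -t UTF-8 >> "$new"
}

# Usage (placeholder file names):
#   latin1_to_utf8 old.tex UTF-8_file.tex
```

The function returns iconv's exit status, so a failed conversion can
be detected by the caller.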

Iconv might output messages about characters it can't convert. That
would indicate that the file's contents are already polluted by
TeXShop, and it would explain why LaTeX cannot finish.
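
iconv's complaints can also be used the other way round, as a quick
check whether a file really is valid UTF-8 before it is handed to
LaTeX. A minimal sketch, assuming that the installed iconv exits with
a non-zero status on invalid input (GNU iconv and libiconv do); the
helper name is_valid_utf8 is made up:

```shell
#!/bin/sh
# is_valid_utf8 (hypothetical helper): succeed if the given file
# decodes cleanly as UTF-8. The converted output is thrown away;
# only iconv's exit status matters.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Usage (placeholder file name):
#   is_valid_utf8 UTF-8_file.tex || echo "not valid UTF-8"
```

A file that passes is not necessarily correct, since it may have been
converted twice, but a file that fails is certainly not UTF-8.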

Now let's see whether TeXShop can handle this!



If you're not confused, you're not paying attention.

