Re: [OS X TeX] utf8 problem and one TeXShop bug
Peter Dyballa
Peter_Dyballa at Web.DE
Thu Apr 19 05:55:42 EDT 2007
On 19.04.2007 at 03:49, Chabot Denis wrote:
> But in this example, I took care of setting the encoding to utf8 in
> preferences before opening this file, and all the accented vowels
> and degree signs etc. display just fine in TeXShop.
That's not the question: what you see can look right, yet the same text
can be encoded in a zillion ways, of which LaTeX supports perhaps a dozen
(fewer than TeXShop). An editor like GNU Emacs can tell you exactly which
encoding it used to present the file's contents (these are two different
things: the actual contents of a file and the way those contents are
presented to you by some computer programme), and it lets you specify the
encoding in which the buffer's contents are to be saved to a file.
With TeXShop nothing is clear or obvious; if you like, you can
/believe/ something.
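If you'd rather not start Emacs just for this, the shell can at least tell you whether a file's bytes form valid UTF-8. What follows is only a rough sketch: is_utf8 is a name I made up, and the idea is simply to let iconv decode the file as UTF-8 and see whether it complains. A pure ASCII file passes too, and a failure tells you only that the file is not UTF-8, not which encoding it really has.

```shell
# is_utf8 is a hypothetical helper name, not part of iconv or TeXShop.
# It succeeds (exit status 0) if the whole file decodes cleanly as UTF-8.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}
```

In bash you could then type, for example (the file name is a placeholder):

    is_utf8 myfile.tex && echo "valid UTF-8" || echo "not UTF-8"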
>
> For now I'll keep this document in iso Latin 1, it is just more
> confusing when some of my files are in one encoding and others are
> in another. With R and Sweave (which allow me to combine R commands
> within a LaTeX document), I need UTF8 and I thought I'd standardize
> on this. I guess I can't do this until I understand better what
> fails with my example.
You should be able to work with UTF-8 text encoding and TeXShop,
provided you do these three things:
• Change TeXShop's preferences, save them, and quit TeXShop.
• Make your UTF-8 files start with the line
%%!TEX encoding = UTF-8 Unicode
• Whenever TeXShop tells you it can't decipher an encoding, don't start
to work on that file.
In the last case a forced quit (pressing the Alt key while choosing
Quit in the Dock) might help to keep the integrity of the file.
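To see from the command line whether a file already starts with that header, something like the following might do. It is a small sketch only: has_utf8_header is a hypothetical name, and it checks just the very first line of the file.

```shell
# has_utf8_header is a hypothetical helper name.
# It succeeds if the first line of the file is the TeXShop UTF-8 header.
has_utf8_header() {
    head -n 1 "$1" | LC_ALL=C grep -q '^%%!TEX encoding = UTF-8 Unicode'
}
```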
On the command line you have at least one option to convert your ISO
Latin-1 LaTeX file into a correctly UTF-8 encoded LaTeX file. This can be
done with the iconv programme, but before it can do its job you must
make sure that the new file will be recognised by TeXShop as a UTF-8
encoded file. And because tcsh and bash are both a bit too curious (they
treat the exclamation mark as a history character), we need to escape it:
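cd «to where the file is»
echo %%\!TEX encoding = UTF-8 Unicode > UTF-8_file.tex
cat «old file name.tex» | grep -v "TEX encoding =" | iconv -f ISO-8859-1 -t UTF-8 >> UTF-8_file.tex
(The echo argument is left unquoted on purpose: inside double quotes bash
would write the backslash into the file as well.)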
• The first line changes the working directory of the shell running in
Terminal to the place where your LaTeX file resides.
• The second line creates the new UTF-8 encoded file by writing a single
line into it, the header "%%!TEX encoding = UTF-8 Unicode".
• Finally, the cat command reads the contents of the ISO Latin-1 encoded
file as is and passes it through a UNIX pipe to the grep command, which
strips any TeXShop file encoding header line. (If the header in your file
is not written exactly as in the argument for grep, adapt the argument
accordingly.) Whether or not such a line exists, the cleaned result is
passed through another pipe to iconv, which interprets the input stream
according to the -f ("from") encoding and converts it into a UTF-8
encoded output stream according to the -t ("to") encoding. This output
is then redirected from standard output to the newly created file.
Because ">>" is used instead of a simple ">", the output of iconv is
*appended* to the header line already in the file; a simple ">" would
overwrite it.
iconv might print messages about characters it can't convert. That
would be an indication that the file's contents have already been
polluted by TeXShop, and would explain why LaTeX cannot finish.
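If you have more than one file to convert, the steps above could be wrapped into a small shell function. Take this as a sketch under assumptions, not as a tested tool: it needs bash or sh (tcsh has no functions), latin1_to_utf8 and its two arguments are names I invented, and the LC_ALL=C and -a on grep are additions of mine so that a grep running in a UTF-8 locale does not mistake the Latin-1 bytes for binary data. The function returns iconv's exit status, so a conversion problem can be detected.

```shell
# latin1_to_utf8 is a hypothetical helper; name and arguments are assumptions.
# Usage: latin1_to_utf8 old.tex new.tex
latin1_to_utf8() {
    old=$1
    new=$2
    # Refuse to run if the input is unreadable or is also the output:
    # the ">" below would truncate the input before it is read.
    if [ ! -r "$old" ] || [ "$old" = "$new" ]; then
        echo "need a readable input file and a different output file" >&2
        return 2
    fi
    # Step 1: start the new file with the TeXShop encoding header.
    printf '%s\n' '%%!TEX encoding = UTF-8 Unicode' > "$new" || return 1
    # Step 2: drop any old header line, convert, and append to the new file.
    LC_ALL=C grep -a -v 'TEX encoding =' "$old" |
        iconv -f ISO-8859-1 -t UTF-8 >> "$new"
}
```

A call could look like this (file names are placeholders):

    latin1_to_utf8 old.tex UTF-8_file.tex || echo "iconv reported a problem"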
Now let's see whether TeXShop can handle this!
--
Greetings
Pete
If you're not confused, you're not paying attention.