[OS X TeX] bug with the cyrillic fonts
v.yu.shavrukov at gmail.com
Thu Sep 25 00:41:56 EDT 2014
On Sep 25, 2014, at 06:19, juan tolosa <juantolo at me.com> wrote:
> One baffling thing I noticed, though, is that if, after typesetting, I change the encoding back to the default
> Roman (Mac OS Roman), then the original .tex file that you sent me gets totally corrupted!
> Maybe this is as it should be?
Have not really tried playing with source encoding changes. Just chose UTF-8 on day one as likely to cover everything I may need in the future.
The simplest way for an application to implement this is to say, right, all the files I make are in this encoding, and all the files I open are interpreted as though they were made in that same encoding. (There are ways to be more sophisticated than that though.)
If you view a UTF-8 encoded file as though it were in Mac Roman, then every non-ASCII character will show up as two or more seemingly random characters: UTF-8 agrees with ASCII on a basic set (basic Latin, digits, etc.) and uses two to four bytes to represent everything else (two for Cyrillic letters). Mac Roman is a one-byte-per-character encoding, and it agrees with ASCII on those same basics, so only the non-ASCII text gets garbled.
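To make the effect concrete, here is a small Python sketch of what happens when UTF-8 bytes are read as Mac Roman. The helper name view_as and the sample word are made up for illustration, and this only mimics how an editor might reinterpret a file; it is not what TeXShop or any particular editor does internally.

```python
# Hypothetical helper (not from any editor): take the bytes a UTF-8 file
# would contain and read them back as though they were in another encoding.
def view_as(text: str, wrong_encoding: str = "mac_roman") -> str:
    raw = text.encode("utf-8")          # the bytes actually on disk
    return raw.decode(wrong_encoding)   # what the misconfigured viewer shows

cyrillic = "Привет"                     # sample word, 6 Cyrillic letters
garbled = view_as(cyrillic)

# Each Cyrillic letter is 2 bytes in UTF-8, and Mac Roman shows one
# character per byte, so the garbled text is twice as long.
print(len(cyrillic), len(garbled))

# Plain ASCII text comes through unchanged, since both encodings agree on it.
print(view_as("hello") == "hello")

# As long as the bytes are not modified, the damage is only in the display:
# reading the same bytes as UTF-8 again restores the original text.
restored = garbled.encode("mac_roman").decode("utf-8")
print(restored == cyrillic)
```

The last step suggests that merely viewing the file under the wrong encoding does not destroy it; trouble starts if the garbled text is edited or saved under a different encoding.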
On a side note, plain text files are one of the oldest types of computer file, and their anatomy is such that it leaves no room for specifying the encoding, for in the beginning there was just one encoding and the need for more may not have been quite obvious. Nowadays one could specify the encoding in the metadata, but there is _no_ standard or universally supported way for any metadata to travel across platforms and filesystems.