[OS X TeX] Strangeness re IsoLatin9 and UTF-8 encodings

Richard Koch koch at uoregon.edu
Sat Feb 24 19:53:58 EST 2024


Bill,

TeXShop 5.27 comes with a Manual, available under the TeXShop Help Menu. The manual has about 38 chapters and chapter 7 is all about encoding. I recommend that you read it.

A computer file is just a long stream of bytes, where each byte is an integer between 0 and 255. For most encodings other than UTF-8, any random stream of bytes is a legal file. Usually the first 128 byte values (0 through 127) represent standard ASCII characters, but the upper 128 values (128 through 255) can refer to many different characters, depending on the encoding. Sadly, the file does not have a special code at the start indicating the encoding.
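To make this concrete, here is a small Python sketch (an illustration only, it has nothing to do with TeXShop's own code). It takes a single byte value above 127 and decodes it under three single-byte encodings, using Python's standard codec names. The choice of byte 0xA4 and of these three encodings is just an assumption made for the example.

```python
# One byte value above 127, interpreted under three single-byte encodings.
raw = bytes([0xA4])

as_latin1 = raw.decode("latin_1")       # ISO Latin 1
as_latin9 = raw.decode("iso8859_15")    # ISO Latin 9
as_macroman = raw.decode("mac_roman")   # classic Mac OS Roman

for name, ch in [("latin_1", as_latin1),
                 ("iso8859_15", as_latin9),
                 ("mac_roman", as_macroman)]:
    # Show the Unicode code point each encoding assigns to the same byte.
    print(f"{name}: U+{ord(ch):04X} {ch}")

# Bytes below 128 are plain ASCII and decode identically in all three.
ascii_bytes = b"TeXShop"
same_everywhere = {ascii_bytes.decode(enc)
                   for enc in ("latin_1", "iso8859_15", "mac_roman")}
print(same_everywhere)
```

The same byte comes out as a different character each time, and nothing in the byte itself says which reading was intended.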

Files encoded with UTF-8 Unicode are different. These files can contain any Unicode character, and thus can display a combination of English, Cyrillic, Arabic, Chinese, Japanese, and so forth. The characters are ENCODED in a special way in these files, so random strings of bytes are usually illegal. Thus when TeXShop or any other editor reads a file that claims to be UTF-8 but isn't, the computer will almost surely find illegal code in the file and refuse to read it. In that case, the dialog you show will appear. TeXShop will then offer to read the file in IsoLatin 9 because that is the most common other encoding for TeX. This may not be the correct encoding, but at least there will be no error reading the file because any random stream of bytes is legal in IsoLatin 9.

This explains why TeXShop put up the message you show.
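The behavior described above can be imitated in a few lines of Python. This is only a sketch of the general idea, not TeXShop's actual implementation; the function name read_with_fallback and its fallback parameter are hypothetical names invented for this example.

```python
def read_with_fallback(data: bytes, fallback: str = "iso8859_15") -> tuple[str, str]:
    """Decode data as UTF-8 if it is valid, otherwise use a single-byte fallback.

    Returns the decoded text together with the name of the encoding used.
    (Hypothetical helper for illustration; not taken from TeXShop.)
    """
    try:
        # Strict UTF-8 decoding raises an error on any illegal byte sequence.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Every byte value has a meaning in ISO Latin 9, so this cannot fail,
        # although the resulting characters may not be what the author typed.
        return data.decode(fallback), fallback


# A file that really is UTF-8 is read as UTF-8.
print(read_with_fallback("café".encode("utf-8")))

# The same word saved in a Latin encoding is not legal UTF-8,
# so the fallback is used instead.
print(read_with_fallback(b"caf\xe9"))
```

Note that the fallback only guarantees the file can be read. If the file was actually written in some other single-byte encoding, the accented characters will be displayed incorrectly.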

As for the sequence of events leading up to this dialog, there is too little information to guess why this happened.

Dick Koch



