[OS X Emacs] how to determine file encoding

Doug McNutt douglist at macnauchtan.com
Wed Nov 14 15:58:08 EST 2007


At 20:50 +0100 11/14/07, Peter Dyballa wrote:
>On 14.11.2007 at 16:45, Dirk Schlimm wrote:
>
>>I received some text files and now I'd like to know how they are
>>encoded (UTF-8, Latin9, etc.). Is there a simple way of finding this
>>out?
>
>Yes: visit the files and change the encoding as long as you can see rubbish. When the text appears clean and readable, then you've found a useful encoding.
>
>Putting it in other words: there is no reliable way. The UNIX command file can help a bit, the file name's extension can be helpful, too.

There are a couple of things that are worth looking for but they're not required.

The file might begin with a Doctype line, which can be quite helpful.

It also might begin with a byte order mark in UTF fashion. The mark is the character U+FEFF: in 16 bit UTF the bytes FE FF mean big-endian ordering and FF FE mean little-endian. If the machine you're reading with assumes the wrong byte order, the mark shows up as FFFE instead of FEFF. Sometimes the three-byte UTF-8 encoding of the mark, EF BB BF, shows up at the start of a UTF-8 encoded file.
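
A minimal sketch of the byte order mark check in Python, for anyone who wants to script it. The function names sniff_bom and sniff_file are made up for illustration, and the UTF-32 entries are an addition not discussed above; they are checked first only because the UTF-32 little-endian mark begins with the same two bytes as the UTF-16 one. A file with no mark returns None, which tells you nothing about its encoding.

```python
import codecs

# Candidate marks, longest first. The codec names are Python's own;
# "utf-8-sig" is the codec that strips the mark when decoding.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),  # FF FE 00 00
    (codecs.BOM_UTF32_BE, "utf-32-be"),  # 00 00 FE FF
    (codecs.BOM_UTF8, "utf-8-sig"),      # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]


def sniff_bom(data):
    """Return a codec name if data starts with a known mark, else None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def sniff_file(path):
    """Read the first four bytes of a file and check them for a mark."""
    with open(path, "rb") as handle:
        return sniff_bom(handle.read(4))
```

Calling sniff_file on a text file gives a starting guess to try in Emacs; most files carry no mark, so this is a hint rather than a reliable test.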

Peter Dyballa continued:
>Without vi there is only GNU Emacs

There once was MPW - the Macintosh Programmer's Workshop - which IMHO was better than either of the above. Apple has killed it.  Sigh.

-- 

Applescript syntax is like English spelling:
Roughly, though not thoroughly, thought through.


