[OS X Emacs] M-x shell and file names with umlauts

Ian Eure ian at digg.com
Tue Oct 7 16:40:05 EDT 2008

On Oct 7, 2008, at 11:58 AM, David Reitter wrote:

> On 7 Oct 2008, at 14:49, Ian Eure wrote:
>>> then, suddenly, "ls" does the right thing!  I'm not sure what the  
>>> correct setting for the language bit would be - en_US.UTF-8 works  
>>> just as well.
>> The format is language_TERRITORY.encoding,
>> e.g. en_US.UTF-8 is US English encoded in UTF-8.
> Well yeah, I tried de_DE first because my test file name had a  
> German language umlaut, but the language doesn't matter for the  
> coding when it's UTF-8.  The real question is how I would detect the  
> right language from the user's settings.  You address this question:
I'm sure there's some API you can get it from, as well. This is  
definitely not what you want, though. I don't know if you tried this,  
but give it a shot:

LANG=de_DE.UTF-8 ls --help

>> I don't know if Emacs even has the ability to set environment  
>> variables for inferior processes. I think this is the job of your  
>> shell or system init files. I'd suggest adding this to your  
>> ~/.profile:
>> eval `locale | sed 's/^/export /'`
> Well I wouldn't ever want to alter the user's .profile from  
> Aquamacs, but of course I can set an env variable from within Emacs,  
> and inferior processes should inherit that.
> (M-x shell does not seem to start a login shell, by the way - just a  
> shell).
> The other problem is that from M-x shell, this is my locale:
> ~$ locale
> and from a login shell opened in iTerm, I get
> ~$ locale
> Neither of them would be sufficient to display non-ASCII file names.
Hm, yeah, locale sets LC_* based on LANG. If it's not set, you get  
"C", which is the legacy UNIX locale. In Terminal's preferences,  
there's a "Set LANG environment variable on startup" checkbox, so I  
assume it has some way to query the system-wide preferences to get  
that. I think you need to do the same thing in Aquamacs.
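
To illustrate the fallback, here is a rough sketch of the lookup order as I understand it from POSIX: LC_ALL wins, then the specific LC_* variable, then LANG, and "C" if none is set. The function name effective_ctype is made up for this example, and real libc implementations may differ in details.

```shell
# Print the locale name that would govern character handling (LC_CTYPE).
# Assumed precedence (POSIX): LC_ALL, then LC_CTYPE, then LANG, else "C".
# effective_ctype is a hypothetical helper, not a system command.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        # Nothing set: fall back to the legacy "C" locale.
        printf 'C\n'
    fi
}
```

Running it from M-x shell and from a terminal with LANG set should show the difference between the two environments.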

> Isn't at least the coding part of the locale up to the terminal that  
> is being used, i.e. Emacs with its shell buffer?
Well, you can't set just the encoding part. You have to set the full  
locale, so you need to know what that is first.
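
A small sketch of why the pieces belong together: the locale name is a single string, and the encoding is only one field of it. The helper below (split_locale is a name invented for this example) pulls a name apart using plain parameter expansion, assuming the usual language_TERRITORY.encoding layout with an optional @modifier.

```shell
# Split a locale name into language, territory, and codeset.
# Assumes the layout language[_TERRITORY][.codeset][@modifier].
# split_locale is a hypothetical helper for illustration only.
split_locale() {
    name=${1%%@*}                 # drop an optional @modifier
    case "$name" in
        *.*) codeset=${name#*.} ;;
        *)   codeset= ;;
    esac
    base=${name%%.*}              # language[_TERRITORY]
    language=${base%%_*}
    case "$base" in
        *_*) territory=${base#*_} ;;
        *)   territory= ;;
    esac
    printf '%s %s %s\n' "$language" "$territory" "$codeset"
}
```

For example, split_locale de_DE.UTF-8 prints the three fields "de DE UTF-8" on one line.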

>> Setting the coding-system for process I/O like you did (C-x RET p)  
>> is a part of setting your language environment. I'd suggest you set  
>> it to UTF-8, which can be accomplished by going to Options ->  
>> Language -> Set Language Environment -> UTF-8.
> So, just to make it clear, you're suggesting that we set the default  
> language environment that an Aquamacs user gets, to UTF-8?
> Sounds to me like this has some grave implications.  Originally I  
> was thinking of just modifying what M-x shell does, but of course  
> one could consider this.
Yeah, I don't think you should do that. I was just saying that it  
would be a good idea for you personally, not as a default for  
Aquamacs. It will make sense at some point in the (hopefully not too  
distant) future, as UTF-8 will replace other encodings.

I see nothing wrong with:

  - Determining the user’s locale
  - Determining the preferred encoding (from the chosen language
    environment)
  - Setting $LANG to that locale/encoding, if it's not already set.
    It would probably be a good idea to warn if the chosen language
    environment clashes with $LANG.
  - Setting process I/O to match that encoding
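
The first three steps could look roughly like this in shell. This is only a sketch under assumptions: make_locale and ensure_lang are names invented here, the base locale is assumed to carry no @modifier, and where the base locale and encoding come from is left to the caller. The last step (process I/O coding) happens inside Emacs and is not covered.

```shell
# Hypothetical helper: build a full locale name from a base locale
# (e.g. de_DE) and a codeset (e.g. UTF-8). Any codeset already present
# on the base is dropped first.
make_locale() {
    printf '%s.%s\n' "${1%%.*}" "$2"
}

# Hypothetical helper: set LANG only when it is unset or empty.
# If LANG is already set but uses a different codeset, leave it
# alone and print a warning on stderr.
ensure_lang() {
    wanted=$(make_locale "$1" "$2")
    if [ -z "${LANG:-}" ]; then
        LANG=$wanted
        export LANG
    else
        current=${LANG#*.}        # text after the first dot, if any
        current=${current%%@*}    # drop an optional @modifier
        if [ "$current" != "$2" ]; then
            printf 'warning: LANG=%s does not use %s\n' "$LANG" "$2" >&2
        fi
    fi
}
```

Usage would be something like ensure_lang de_DE UTF-8. On OS X, one possible source for the base locale is the output of "defaults read -g AppleLocale", though I have not checked how well that value maps onto the installed locale names.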

  - Ian

