[OS X Emacs] Re: Return of dired mode problems like in Aquaemacs 1.7

Peter Dyballa Peter_Dyballa at Web.DE
Sat Aug 15 22:42:23 EDT 2009

On 16.08.2009 at 02:57, David Reitter wrote:

> I'd still like to know how to trigger the basic problem showing a  
> directory in dired in Emacs 23.1 (and none of the older Emacs  
> variants are relevant when it comes to fixing this bug).  That  
> would be VERY helpful now.

With the 8-bit ISO Latin locales, Emacs.app in dired obviously
receives 8-bit data from (g)ls, for example in the date's "Mär"
string. The X client just switches to an octal representation
(why? *1*), while Cocoa has no such fall-back, so it fails completely
and maybe emits an extra error code which the Emacs core does not
expect.
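
A minimal sketch in Python (not what Emacs does internally) of why the
same byte can show up either as an umlaut or as an octal escape. The
helper octal_escape is hypothetical and only imitates the \344 style of
display; the byte string stands in for what (g)ls emits under an 8-bit
ISO Latin locale.

```python
# Bytes standing in for the month "Mär" as emitted under an 8-bit
# ISO Latin locale: the umlaut is the single byte 0xE4.
raw = b"M\xe4r"

# Decoded with a matching 8-bit charset, the byte is an ordinary letter.
latin1 = raw.decode("iso-8859-1")
latin9 = raw.decode("iso-8859-15")

# Interpreted as UTF-8, the lone byte 0xE4 is not a valid sequence.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False


def octal_escape(data: bytes) -> str:
    """Hypothetical fall-back: show bytes >= 0x80 as backslash-octal."""
    return "".join(chr(b) if b < 0x80 else "\\%03o" % b for b in data)


print(ascii(latin1), ascii(latin9), utf8_ok, octal_escape(raw))
```

A client that has such a fall-back can still show something for an
undecoded byte; one without it has nothing to display.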

The X client's *shell* buffer is switched to ISO Latin, so the output
of (g)ls is displayed correctly, with the umlaut. In dired, 8-bit data
is again received from (g)ls and easily handled, but it is terribly
displayed because it is formatted by the internal (butcher's)
machinery. It's easy to follow into dïräçtörÿ. (GNU Emacsen 22.x have
one big display problem:
character:  (332488, #o1211310, #x512c8, U+0308) is the diaeresis of
the ü in a file name. In Carbon Emacs and Aquamacs Emacs, both based
on the 22.x source, the problem is solved by using the UTF-8m
file-name encoding. GNU Emacsen 22.x have no problem finding a file or
traversing into a "dïräçtörÿ"; it just does not look so good with
OPEN BOX characters in the path or file name.)

In the UTF-8 environment (g)ls obviously emits valid UTF-8. For file
names, the underlying Apple file systems (HFS and UFS) return only
de-composed characters, so (g)ls should learn to compose when GNU
Emacs can't. This would solve the problem in dired, *shell*, etc., but
not with C-x C-f, i.e., dired-x-find-file. Entering
C-x C-f /path/to/dïräçtörÿ/file obviously works, but this file comes
from nirvana, not from /path/to/dïräçtörÿ ("Use M-x make-directory
RET RET to create the directory and its parents"), and file name
completion does not work at all. Can it be that a "composed"
character gets de-composed before "something" does file name
completion? The tcsh I use in Terminal has a very special
presentation of "dïräçtörÿ", and I cannot invoke:

	ls -lw /path/to/<dïräçtörÿ presentation> RET
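
A minimal sketch in Python of how a composed name typed by the user can
miss a de-composed name returned by the file system, assuming plain
Unicode NFD is a good enough stand-in for Apple's decomposition (it is
close, but not identical). The helpers same_name and complete are
hypothetical and are not Emacs functions.

```python
import unicodedata

typed = "d\u00efr"        # "dïr" as typed: one precomposed code point for ï
on_disk = "di\u0308r"     # "dïr" as returned: i followed by U+0308

# Compared code point by code point, the two names differ.
naive_equal = typed == on_disk


def same_name(a: str, b: str) -> bool:
    """Compare two names after bringing both to the same normal form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def complete(prefix: str, names: list) -> list:
    """Prefix completion that normalizes both sides to NFD first."""
    p = unicodedata.normalize("NFD", prefix)
    return [n for n in names if unicodedata.normalize("NFD", n).startswith(p)]


entries = [on_disk, "other"]
naive_matches = [n for n in entries if n.startswith("d\u00ef")]
normalized_matches = complete("d\u00ef", entries)

print(naive_equal, same_name(typed, on_disk))
print(len(naive_matches), len(normalized_matches))
```

If completion compares the typed, composed prefix against de-composed
directory entries without such a step, it finds nothing, which would
match what I see.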

*1* The ä in the Mär month date is described as

	        character: \344 (4194276, #o17777744, #x3fffe4)
	preferred charset: eight-bit (Raw bytes 128-255)
	       code point: 0xE4
	           syntax: w 	which means: word
	      buffer code: #xE4
	        file code: not encodable by coding system iso-latin-9-unix
	          display: no font available

which is certainly wrong with respect to the ISO Latin encoding in the
dired buffer, because LATIN SMALL LETTER A WITH DIAERESIS is
octal 344 = decimal 228 = hex E4 = U+00E4. It is valid data, nothing
raw. Maybe this is the cause of one failure. It's correct in *shell*,
and therefore I sent
a bug report: http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi? 
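
The numbers in the listing can be checked with a little arithmetic,
sketched here in Python. The constant RAW_BYTE_OFFSET is my assumption
about how Emacs 23 represents an undecoded byte as a character (byte
value plus 0x3FFF00); the values shown by describe-char are consistent
with it.

```python
# Octal 344, decimal 228 and hex E4 are the same number, and it is
# the code point of LATIN SMALL LETTER A WITH DIAERESIS.
a_diaeresis = ord("\u00e4")
same_number = (0o344 == 228 == 0xE4 == a_diaeresis)

# Assumption: an undecoded ("eight-bit") byte b is stored as the
# character b + 0x3FFF00.
RAW_BYTE_OFFSET = 0x3FFF00
raw_char = RAW_BYTE_OFFSET + 0xE4

print(same_number)
print(raw_char, oct(raw_char), hex(raw_char))
# prints: 4194276 0o17777744 0x3fffe4
```

So the "character" shown in dired is the raw byte 0xE4 kept as raw,
not the letter at U+00E4.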

In a UTF-8 environment the "ä" in the date field is the same in both
*shell* and dired:

	        character: ä (228, #o344, #xe4)
	preferred charset: iso-8859-1 (Latin-1 (ISO/IEC 8859-1))
	       code point: 0xE4
	           syntax: w 	which means: word
	         category: .:Base, j:Japanese, l:Latin
	      buffer code: #xC3 #xA4
	        file code: #xC3 #xA4 (encoded by coding system utf-8-unix)
	          display: by this font (glyph code)
iso10646-1 (#xE4)

just as in the 22.x *shell*. In the file name it is, in both buffers:

	        character: a (97, #o141, #x61)
	preferred charset: ascii (ASCII (ISO646 IRV))
	       code point: 0x61
	           syntax: w 	which means: word
	         category: .:Base, a:ASCII, l:Latin, r:Roman
	      buffer code: #x61
	        file code: #x61 (encoded by coding system utf-8-unix)
	          display: composed to form "ä" (see below)
	Composed with the following character(s) "" using this font:
	by these glyphs:
	  [0 1 97 97 7 1 7 7 0 nil]
	  [0 1 776 776 0 0 5 14 -12 [-5 3 0]]

which also shows that the ¨ character went missing.
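
The glyph codes 97 and 776 in the two vectors, and the byte sequences
in the listings, can be reproduced with a short Python sketch. It uses
Unicode NFD from the standard library as an approximation of what the
file system returns; that is an assumption, not something taken from
the Emacs sources.

```python
import unicodedata

composed = "\u00e4"                                  # as in the date field
decomposed = unicodedata.normalize("NFD", composed)  # as in the file name

# Code points of the de-composed form: base letter plus combining mark.
code_points = [ord(c) for c in decomposed]
char_names = [unicodedata.name(c) for c in decomposed]

# UTF-8 bytes of both forms, as hex strings.
composed_utf8 = composed.encode("utf-8").hex()
decomposed_utf8 = decomposed.encode("utf-8").hex()

print(code_points, char_names)
print(composed_utf8, decomposed_utf8)
```

776 is decimal for U+0308, so the second glyph vector is the combining
diaeresis that belongs between the empty quotes.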

With peaceful greetings

   Pete       (:
         _    / __    -    -
       _/ \__/_/        -     -
      (´`)      (´`)   -    -
       `´        `´
