[OS X TeX] Invisible character
jonathan_kew at sil.org
Mon Jun 26 03:39:13 EDT 2006
On 26 Jun 2006, at 1:18 am, Ross Moore wrote:
> Does TeX need to have the \catcode idea extended
> to have flexibility with more characters?
> With 32-bit and 64-bit machines now quite common (indeed standard),
> it shouldn't be too hard to implement this.
> Certainly it would need a new primitive, \UTFcatcode say,
> that would consider multiple bytes on input, and either set flags
> within the extra (currently unused) bytes, or adjust the
> normal \catcode of each byte in some appropriate way.
> Interesting concept.

Forget the bytes; think in terms of Unicode characters. And then set
the \catcode for a *character*, whether that character was
represented in the input as a single (ASCII) byte or a multi-byte
UTF-8 sequence (or a UTF-16 value, for that matter).
So you can say \catcode`\क = 11 or \catcode`\你 = 12 or whatever,
and it works.
Which happens to be how xetex does it. :)
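
A minimal sketch of the point above, assuming a UTF-8 source file processed with plain xetex. The macro names \xक and \x are made up for illustration and do not come from the original message; the catcode values are the ones mentioned above.

```tex
% Plain XeTeX sketch: save this file as UTF-8 and run it with `xetex`.
% \catcode is assigned per Unicode character, not per input byte.
\catcode`\क=11  % treat U+0915 (DEVANAGARI LETTER KA) as a letter
\catcode`\你=12  % treat U+4F60 as an "other" character

% Hypothetical macros, used only to make the effect of the catcodes visible.
\def\x{[x]}
\def\xक{[x-ka]}

% Because क is a letter, \xक is read as one control word made of x and क.
\message{\xक}

% Because 你 is not a letter, the name stops after x:
% this is \x followed by the character 你.
\message{\x你}

% Report the catcodes that were just assigned.
\message{\the\catcode`\क, \the\catcode`\你}
\bye
```

Each of क and 你 occupies three bytes in UTF-8, yet every \catcode assignment here names a single character, and the tokenizer decides where a control word ends by looking at that character's category.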
> One day we'll want to move to UTF-16 input as well.
> Thus TeX's method of tokenisation really will need
> to be changed to accommodate this.

xetex reads UTF-16 as well as UTF-8, and it makes no difference at
all to macro processing, catcodes, etc., as everything works in terms
of the Unicode characters.