[OS X TeX] counting words in 2010
cfrees at imapmail.org
Thu Oct 28 21:12:35 EDT 2010
On Thu 28th Oct, 2010 at 19:59, David Derbes seems to have written:
> I'm not sure if this is a solution, but Excalibur counts words as it spell-checks.
Hmm... I didn't know that. How accurate is it?
> There is also WordService from Devon Technologies that works with many programs; it has a word count feature.
I use this for paragraphs but it is no use where there's a lot of
markup or for entire documents because it doesn't filter out the TeX
stuff at all. But you are right that it is a very useful service to
> David Derbes
> U of Chicago Laboratory Schools
> On Oct 28, 2010, at 6:56 PM, Dr. Clea F. Rees wrote:
>> As I understand it, TeXShop uses /usr/texbin/detex to calculate
>> document statistics (words, characters, lines). Specifically, it calls
>> detex via a wrapper script included in the application's resources. (In
>> my case, the wrapper has been "tweaked" but this is not relevant here.)
>> The problem I'm seeing is with /usr/texbin/detex as supplied with TeX
>> Live 2010 as opposed to the versions supplied with TeX Live 2008 and
>> 2009. Essentially, I'm getting much lower word counts than I should
>> because detex is stripping out text which it really shouldn't. The
>> things I'm certain about include footnote text and italicised text but
>> I suspect these are just a part of the problem.
>> I'm hoping this isn't intended to be a feature. Does anybody know:
>> - if this is a known (or unknown) bug?
>> - if there is any way of working around it? (I'm currently using the
>> 2009 issue of detex but that's a bit messy.)
>> - if there is a better way of getting document statistics?
>> Specifically, I need word counts which are as accurate as possible. But
>> if there is to be inaccuracy, it is generally better if the count is
>> reported as slightly higher than it really is rather than lower because
>> I'm typically trying to write stuff which does not exceed a given limit.
>> This makes the current detex almost useless.
>> I know detex is used for more than word counts but can't imagine what
>> purpose is served by stripping out italic text, for example. Please,
>> this isn't supposed to be a feature, is it? Please?!
>> This is also intended to alert people who rely on TeXShop's statistics
>> (or detex | wc) that the results may be unreliable with TeX Live 2010.
>> Perhaps I missed it, but I don't recall seeing any warnings to this
>> effect or information about changes to the current version of detex.
>> (If anybody saw such and can send me a pointer, that'd be great.)
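One possible workaround for the undercounting described above, sketched in Python rather than taken from detex or TeXShop: strip only the control sequences themselves and keep their arguments, so that text inside \emph{...} or \footnote{...} still counts. The function name count_words_conservative and the regular expressions are illustrative assumptions. The sketch deliberately errs on the high side (for instance, it counts "document" in \begin{document} as a word), which matches the stated preference for overcounting when writing to a limit.

```python
import re


def count_words_conservative(tex: str) -> int:
    """Rough word count for LaTeX source that errs on the high side.

    Assumption: overcounting is preferable to undercounting, so command
    arguments (footnotes, emphasised text, even environment names) are
    kept and counted as ordinary words.
    """
    # Drop comments: an unescaped % up to the end of the line.
    text = re.sub(r"(?<!\\)%.*", "", tex)
    # Drop control sequences such as \emph, \footnote, \\ or \%,
    # but leave whatever follows them (their arguments) in place.
    text = re.sub(r"\\(?:[A-Za-z]+\*?|.)", " ", text)
    # Turn grouping and special characters into plain whitespace.
    text = re.sub(r"[{}\[\]$~&^_]", " ", text)
    # Count whitespace-separated tokens containing a letter or digit.
    return sum(1 for tok in text.split() if any(ch.isalnum() for ch in tok))


if __name__ == "__main__":
    sample = r"Some \emph{italic text} here.\footnote{A short note.} % ignored"
    print(count_words_conservative(sample))
```

Comparing this figure with the output of detex | wc -w on the same file would give a rough idea of how much text the installed detex is dropping.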