[NTLUG:Discuss] Counting key presses in a file...

Mon Aug 27 18:55:29 CDT 2007

Richard Geoffrion wrote:
> Robert Citek wrote:
>> On 08/27/2007 03:13 PM, Richard Geoffrion wrote:
>>   
>>> I know that 'wc' can count words and lines, but how would one count individual keystrokes?
>>> While 'a' would constitute ONE keystroke, 'A' would constitute two. 
>>> (Shift + a).
>>>     
>> Not necessarily.  If you have CapsLock on, then the reverse would be
>> true. That is, 'A' would count as one and 'a' as two.  What would happen if it was a string of 'A's, as in AAA?  <snip>
>>> Carriage returns would count as one while bolding text
>>> would constitute four keystrokes as one would need a CTRL-B to turn 
>>> bolding on and another to turn bolding off.
>>>     
>> What if I type 'a^Hah'?  What if I type 'a{left arrow}h'?  What if I
>> type 'ah'?  They all produce 'ah' but are a different number of
>> key-strokes.  Lastly, what about cut and paste?
>>
>>   
>>> How would one go about <snip> converting [a document] into something that can be parsed and counted?
>>>     
>> If you can't use Word, which has a built-in counter, then open the
>> document in OpenOffice Writer, press Ctrl-A to highlight everything,
>> press Ctrl-C to copy all the text to the clipboard, open gedit, press
>> Ctrl-V to paste the text into gedit, and save it as the text file
>> "foobar.txt".  Then open a terminal and type 'wc foobar.txt'.
>>
>> Am I even close to answering your question?
>>
>>   
> These are all VERY good points and maybe I should state a more 
> end-product kind of thing.  The end result would be an automated script 
> that could evaluate a set of files to calculate a $$ value for a 
> document based on the number of characters typed.   (One would have to 
> assign certain values to CAPS, *bold*, _underline_  et al.)

You know, since this is going to vary greatly and probably be
very error prone... perhaps something more probabilistic is warranted.
Maybe just assume a percentage based on normal text.

Of course, I realize you are trying to be exact, but there are
way too many variables.
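
As a minimal sketch of the percentage idea in Python: take the plain
character count and add a flat allowance on top.  The function name and
the 10% default are placeholders of mine, not measured values.

```python
def estimate_keystrokes(text, overhead_percent=10):
    """Estimate keystrokes as the character count plus a flat percentage.

    overhead_percent is a hypothetical allowance for shift, formatting,
    and similar extra key presses; 10 is a placeholder, not a measured value.
    """
    # Integer arithmetic avoids floating-point rounding surprises.
    return len(text) * (100 + overhead_percent) // 100
```

The percentage would have to be calibrated against a sample of real
documents before it meant anything.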

> 
> So, no matter how quickly or efficiently Bob can type a document, he 
> only gets paid a set value for the document based on pre established 
> rules.  If Bob can figure out shortcuts to transcribing his documents, 
> then fine -- he makes extra money.  If Bob has to correct 1/3 of the 
> text he types or chooses to use the mouse to do formatting -- well then 
> that's his problem for being woefully inaccurate or inefficient.
> 
> So with set values assigned for each character type and formatting 
> class.....any words of wisdom to solve this issue?  I've been on 
> sourceforge and freshmeat but no luck so far.  I've seen a couple of 
> commercial win32 packages, but win32 apps don't lend themselves too well 
> to automation in a linux cron job. :)

Those win32 apps HAVE to be making some huge assumptions.  Perhaps
I missed it.  Have you defined things down to a particular app,
particular keyboard type, particular language?

> 
> If one *DID* try to parse a file manually... what would one need to do?  
> Lots of grepping, counting and stripping of control characters? Counting 
> and stripping higher cost characters that have ascii values > 127 
> (typically upper case and foreign characters)?
> 
> I'm exploring the OO macro scene. Hopefully there is a similar project 
> somewhere.
> 

Ok.. so we're assuming that what is being typed is an OOo document
of some sort.  I'm still not sure that narrows the variables down
enough, though.
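
If the text has already been exported to a plain text file, the manual
parse could look roughly like the Python sketch below.  Everything in it
is an assumption: a US keyboard, CapsLock off, Shift counted as one
extra key press, and a made-up flat cost for non-ASCII characters.
Formatting such as bold is not visible in plain text, so it is not
counted here.  Note that ASCII upper case letters are 65-90, below 128,
so they need their own check rather than the "> 127" test.

```python
# Hypothetical costs; none of these are established rules.
SHIFTED_SYMBOLS = set('~!@#$%^&*()_+{}|:"<>?')  # assumes a US layout
NON_ASCII_COST = 3  # assumed flat cost for code points above 127


def keystroke_cost(ch):
    """Return the assumed number of key presses needed to type ch."""
    if ord(ch) > 127:
        return NON_ASCII_COST
    if ch in "\n\t ":
        return 1  # Enter, Tab, and space are one key each
    if ch.isupper() or ch in SHIFTED_SYMBOLS:
        return 2  # Shift plus the key
    if ch.isprintable():
        return 1
    return 0  # remaining control characters are ignored


def count_keystrokes(text):
    """Sum the per-character costs over the whole text."""
    return sum(keystroke_cost(ch) for ch in text)
```

For a file, something like
count_keystrokes(open("foobar.txt", encoding="utf-8").read())
would give the total, which could then be multiplied by a rate.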

I'm not sure that your "motivation" paragraph has sold me on
the benefits of this (not that you have to sell me on the idea
of course).  You're wanting to be very precise... are we
talking about VERY high volumes or something?  You don't have
to answer that... just wondering why such high precision is
needed.

???