Word Count using TPIPE

There was a recent post asking for a Word Count command.

Using TPIPE, I can count the number of lines in a file. For example:
Code:
echo %@execstr[tpipe /input=mytextfile.txt /grep=5,0,0,0,1,0,0,0,"[^ \t\r\n]*"] lines

Is it possible to use TPIPE to count the number of words in a file?

I was looking at the /simple=40 option, which creates a word list, but when I ran:
Code:
%@execstr[tpipe /input=mytextfile.txt /simple=40]
it just gave me the first word in the text file.

I was thinking that the list could be output to a temporary file, and then @lines[temporaryfile] could be used to get the word count. I'm sure that TPIPE can somehow do all of this, I'm just not sure how.

Joe
 
...and here's a quick little batch file that obtains the word count of a text file and stores it in the _wc environment variable:
Code:
:: wc.btm
@setlocal
@echo off
iff %# eq 0 then
  echo USAGE: wc.btm yourtextfile.txt
  quit
endiff
iff exist %1 then
  set output=%@unique[]
  tpipe /input=%1 /simple=40 /output=%output
  set _wc=%@inc[%@lines[%output]]
  if exist %output del /q %output
  echo The word count of %1 is in the variable _wc
else
  echo %1 does not exist
endiff
endlocal _wc

When I ran wc.exe in my Cygwin terminal, it gave a word count of 914 on the text file that I used for testing.

The wc.btm file returned a word count of 886 words.

Does TPIPE count words differently than wc.exe?

Joe
 
One of these days, someone has to create a .CHM file of all the plugins available....

Running words from your plugin on my test file reported:
Code:
  873 words total, 392 unique, 88 proper.  914 runs of non-blanks.
  11 sentences total:  10.  0!  1?  Average sentence 14.8 words.
  9 paragraphs, 112 titles.  Average paragraph 1.2 sentences.
  364 lines total, 201 not blank; the longest had 78 characters.
  9521 characters in 9521 bytes (OEM, prewrapped).

I like the _words internal variable, which, of course, returned 873.

If I use the Word/Line Count option from within VIEW, I get 903 words.

So, four different ways to count words in the same file, and four different values returned:
Code:
wc.btm - 886
wc.exe - 914
_words - 873
view - 903

Joe
 

Charles Dye

Super Moderator
Staff member
May 20, 2008
4,461
88
Albuquerque, NM
prospero.unm.edu
The _WC internal variable (displayed as "runs of non-blanks" in the command's output) is intended to return the same value as the Unix wc command. (You can use it in loo of that utility, yuk yuk.)
 
Your _wc gave me the same results as my wc.btm. Then it dawned on me...

I had to unset the _wc from my wc.btm so that I could use the _wc from your plugin.

Well, your _wc returns 914, the same as the Cygwin wc.exe.

Winner!

Joe
 
Why not TPIPE with both?

Code:
/simple=40 /grep=5,0,0,0,1,0,0,0,"[^ \t\r\n]*"

It ought to work but I couldn't get an accurate count on a file.

Code:
v:\> echo My dog has fleas.^nMy cat has fleas. | tpipe /simple=40 /grep=5,0,0,0,1,0,0,0,"[^\s\t\r\n]*"
8
 
Thanks, Vince. That makes my batch file cleaner:
Code:
:: wc.btm
@setlocal
@echo off
iff %# eq 0 then
  echo USAGE: wc.btm yourtextfile.txt
  quit
endiff
iff exist %1 then
  set _wc=%@execstr[tpipe /input=%1 /simple=40 /grep=5,0,0,0,1,0,0,0,"[^\s\t\r\n]*"]
  echo The word count of %1 is in the variable _wc
else
  echo %1 does not exist
endiff
endlocal _wc

It still returns a word count of 886. The plugin that Charles provided returns the same result as the Cygwin wc.exe, so I'm going to stick with that.

That is, unless TPIPE can be made to return a word count of 914 on my test file.

Joe
 
After more experimenting, it seems that TPIPE's "/simple=40" is totally inappropriate (IMHO, anyway). Apparently, it considers only a-z, A-Z, 0-9, and '-' as "word characters". That will rarely agree with any version of WC.EXE that I've seen.
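
To illustrate why the two counts can disagree, here is a minimal sketch in Python (not TCC, and not TPIPE's actual implementation). It assumes, per the observation above, that /simple=40 treats only a-z, A-Z, 0-9, and '-' as word characters, and compares that against the wc convention of counting runs of non-whitespace. The function names and the sample text are made up for the example.

```python
import re


def count_wc_style(text: str) -> int:
    """Count runs of non-whitespace characters, as wc -w does."""
    return len(text.split())


def count_word_char_runs(text: str) -> int:
    """Count runs of a-z, A-Z, 0-9 and '-'.

    ASSUMPTION: this models the /simple=40 behaviour described in the
    post above; it is not taken from TPIPE's documentation or source.
    """
    return len(re.findall(r"[A-Za-z0-9-]+", text))


# Plain words separated by spaces: both definitions agree.
plain = "My dog has fleas.\nMy cat has fleas."
print(count_wc_style(plain), count_word_char_runs(plain))    # prints: 8 8

# Apostrophes, decimals, and abbreviations split into extra "words",
# while a punctuation-only run such as "..." is not counted at all.
messy = "My dog has fleas.\nIt's 3.5 kg ... e.g. a co-op dog!"
print(count_wc_style(messy), count_word_char_runs(messy))    # prints: 12 14
```

Under this model the character-class count can land above or below the wc count depending on the file: punctuation inside a token adds words, and punctuation-only tokens remove them. That would be consistent with 886 versus 914 on the same file, though the sketch does not prove that this is what TPIPE does.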
 