TCC Grep

#1
TCC has ffind, but it can't replace. Unless I've missed it, TCC doesn't have grep?

And I've done a fair bit of searching, but I cannot find a decent stand-alone command-line grep utility for Windows. Can anyone recommend one? I'd rather not need to install an entire utility suite, but if that's what it takes, which would you recommend?
 
#2
FFIND does regular expression search, see the /E"xx" option.

AFAIK grep never did replacement, only sed and awk did. However, Vince Fatica's 4utils.dll includes the @xreplace function, which does regular expression replacement.
 
Feb 26, 2013
#3
Have you tried the TPIPE command in TCC? The syntax is somewhat arcane but it should do what you need and can be wrapped in an alias to make it simpler to use.
 
#5
Ah, perhaps I am mistaken about grep alone doing replacements (I am the most neophyte of Linux users). So my question really should be, what stand-alone command-line utility (or combination of utilities), whether internal or external to TCC, can be used in Windows to search a file and make replacements (regular expressions not necessarily required)?
 
#8
Many TCC users use sed (stream editor; see Tim Butterfield's post above where you can get a free copy). It processes a file line by line, and replaces matches to a regular expression or copies unmatched lines. It is a utility imported from Unix and its descendants. This seems to fill your need.
 
#11
Thanks, Tim and Steve. SED looks like the tool I need; I'm still trying to learn whether I can download/install just this one utility, or whether I need to install the entire package--and what might be involved in that.
 
#15
You might look for UnxUtils.zip. The package describes itself like this:
This are some ports of common GNU utilities to native Win32. In this context, native means the executables do only depend on the Microsoft C-runtime (msvcrt.dll) and not an emulation layer like that provided by Cygwin tools.
 
#16
GnuWin32 and UnxUtils seem to cover similar ground; there is some overlap, but still something of value in each. I downloaded sed from GnuWin32, and it does do the job.

But there is one problem (sorry for getting slightly off topic here, as this no longer has to do with TCC): it doesn't appear that sed works with Unicode files. Or am I just missing something?

Can one cause Windows to use ASCII files instead of Unicode files? For example, when exporting a task from Task Scheduler, the resulting Unicode file can't be massaged by sed.
 
#17
Here, Eric Pement (didn't he once hang around in these forums?) has tested several versions of SED.EXE. He claims the UnxUtils one handles Unicode newlines correctly (being in binary mode by default) and otherwise works correctly if you literally specify the Unicode characters. He gives this example of changing 'e' to 'T' in a UTF-16LE file.

Code:
sed "s/e\x00e/T\x00T/g" utf16.txt >new_utf16.txt
 
#18
I tested this. It does not work with the SED in UnxUtils.zip. But it does work with the updated SED in
http://unxutils.sourceforge.net/UnxUpdates.zip which claims to be "GNU sed version 4.0.7".
 
#21
Thanks for testing, Vince. I did have the same thought, though it would be incredibly tedious.

Following your post, I did test on GnuWin32 sed, and it worked fine, using this example to replace "Gr\" with "Pa\":
Code:
sed -b -es/G\x00r\x00\\/P\x00a\x00\\/i

But I'm afraid I don't quite follow your example, which looks like it is replacing "e\00e" with "T\00T"? What is the purpose of having the 'e' (and the 'T') on either side of the hex 0 byte? Something to do with big-endian vs little-endian; but if so, how does it work?

Also, this would be quite difficult if the substitution is not a literal string. For example, if one wanted to replace "Grandpa\burger" with "%COMPUTERNAME%\burger", where %COMPUTERNAME% might be "Papa", "Momma", or "Teen". Back on topic for TCC, can someone provide an example of a batch file that could do this?
 
#22
That option is already unchecked; but I'm not sure how that would affect Windows Task Manager?
Sorry, apparently I did not read your post carefully enough. Obviously any options in TCC can affect only TCC's own output, and my answer was irrelevant. I have no idea whether it is possible to force Windows to be directly compatible with ASCII. However, piping the output of a Windows action, including Task Manager, to a TCC instance with ASCII mode set will produce an ASCII file. An alternative is to save Windows' Unicode output in a file, and use TCC's TYPE command to translate it to ASCII. I have done this successfully with registry dumps.
 
#23
Sorry! I copied Eric Pement's example, but didn't describe it accurately (he did describe it correctly). There's nothing mystical about it; he simply wanted to change "ee" into "TT".

Were you testing Gnu SED 4.0.7? I'm not sure whether an earlier version will get the newlines, which you want to be 0D000A00, correct.
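
For anyone else puzzled by the \x00 bytes: in UTF-16LE each ASCII character is stored as its own byte followed by a zero byte, so "ee" is the byte sequence 65 00 65 00, and a pattern of e\x00e covers the first three of those bytes. Below is a minimal Python sketch of the same byte-level substitution. Python is not part of the toolchain discussed in this thread, and the sample text is made up for illustration.

```python
# Illustration only: mimic  s/e\x00e/T\x00T/g  on UTF-16LE bytes.
# The sample text is hypothetical, chosen so that it contains "ee" twice.
sample = "see the tree\r\n".encode("utf-16-le")

# "ee" occupies four bytes in UTF-16LE; the sed pattern matches the first three.
ee_bytes = "ee".encode("utf-16-le")

pattern = b"e\x00e"
replacement = b"T\x00T"
result = sample.replace(pattern, replacement)

decoded = result.decode("utf-16-le")
print(ee_bytes.hex())     # 65006500
print(repr(decoded))      # 'sTT the trTT\r\n'
print(result[-4:].hex())  # 0d000a00, the UTF-16LE line ending
```

The trailing zero byte of the second "e" is left untouched and simply becomes the high byte of the second "T", which is why the pattern does not need a final \x00.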
 
#24
Steve, thanks, that's a good workaround; better than trying to force Windows to output ASCII. I tried the TYPE command, and it worked fine for my purpose of an exported task xml file, even removing the BOM. I'm not sure how one would pipe the Task Scheduler Export function output to TCC?
 
#26
Ha! Thank you for clarifying. I am testing Gnu sed 4.2.1.

Interestingly, the -b (binary) option affected line endings differently depending on the file. When replacing in a Unicode file, I had to use -b to get the line endings correct. When replacing in an ASCII file, I had to leave off -b to get the line endings correct, but only when doing the replace in-place (i.e. replacing the input file with the output file). When not replacing in-place (i.e. output to standard out), the output was correct with or without -b, whether viewed on the terminal or redirected to a file.
 
#27
It wouldn't surprise me if the scheduled task applet could import a (properly identified) ANSI XML file (and I wouldn't be surprised if it couldn't). Notepad could save an exported task as ANSI text, and apparently, a new file name would mean a new task name. I'd test it but I don't know what the encoding on the first line (below) should be changed to (because I know darn little about XML).
Code:
<?xml version="1.0" encoding="UTF-16"?>
 
#28
Indeed, manually adding \x00 wherever needed would be cumbersome. But there are some tricks you can do with TCC. I don't know if your version of TCC has @REREPLACE, but if it does, one such trick would be
Code:
v:\> set var=grandpa

v:\> echo %@rereplace[(.),\1\\x00,%var]
g\x00r\x00a\x00n\x00d\x00p\x00a\x00
Without @REREPLACE, my 4UTILS plugin offers @XREPLACE which does the same thing.
Code:
v:\> set var=grandpa

v:\> echo %@xreplace[(.),\1\\x00,%var]
g\x00r\x00a\x00n\x00d\x00p\x00a\x00
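
For comparison, the same "append \x00 to every character" idea can be sketched in Python. This is an illustration rather than a substitute for @REREPLACE or @XREPLACE. The function names `utf16le_sed_text` and `build_sed_script` are hypothetical, and the sketch escapes only backslashes and slashes, so other sed metacharacters (such as `.`, `*`, or `&`) in the input would still need attention.

```python
def utf16le_sed_text(text: str) -> str:
    # Turn plain ASCII text into sed text with \x00 after every character.
    pieces = []
    for ch in text:
        if ord(ch) > 127:
            raise ValueError("this sketch handles ASCII input only")
        # Escape the two characters that would break an s/old/new/ script.
        if ch in "\\/":
            ch = "\\" + ch
        pieces.append(ch + "\\x00")
    return "".join(pieces)


def build_sed_script(old: str, new: str) -> str:
    # Assemble a global substitution for use on a UTF-16LE file.
    return "s/{}/{}/g".format(utf16le_sed_text(old), utf16le_sed_text(new))


print(utf16le_sed_text("grandpa"))
print(build_sed_script("Grandpa\\burger", "Papa\\burger"))
```

The first line printed is the same g\x00r\x00a\x00n\x00d\x00p\x00a\x00 string shown above. The second call shows how a search string and a replacement string could both be converted and combined into one script, with "Papa" standing in for whatever %COMPUTERNAME% expands to.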
 
#29
You are correct: after testing sed on the Unicode task file, and then testing TYPE to convert it to an ASCII file, I imported it with Task Scheduler, and it worked just fine. I forgot to change the encoding specification, which still said UTF-16, but Windows figured it out. In practice, I suspect changing the encoding to UTF-8 would be appropriate, since the first 128 code points of UTF-8 are encoded identically to ASCII. Of course, one could also use TYPE first and then use sed on the ASCII file.
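
A rough Python sketch of that convert-first approach, for anyone without TCC's TYPE at hand. The function name `utf16_xml_to_utf8` and the sample XML are hypothetical, and whether Task Scheduler accepts a file declared as UTF-8 is an assumption I have not verified.

```python
def utf16_xml_to_utf8(raw: bytes) -> bytes:
    # The "utf-16" codec reads the BOM to pick the byte order, then drops it.
    text = raw.decode("utf-16")
    # Make the XML declaration agree with the new encoding (first match only).
    text = text.replace('encoding="UTF-16"', 'encoding="UTF-8"', 1)
    return text.encode("utf-8")


# Hypothetical stand-in for an exported task file (BOM included).
exported = '<?xml version="1.0" encoding="UTF-16"?>\r\n<Task />\r\n'.encode("utf-16")
converted = utf16_xml_to_utf8(exported)
print(converted.decode("utf-8"))
```

Once the file is single-byte text, an ordinary sed substitution with no \x00 padding should apply.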
 
#30
Vince, thanks for the code. One could use that for both the search string and the replace string, which would ease the task considerably. It seems that TCC 13 doesn't have @rereplace, so @xreplace would be very helpful--thank you for making 4UTILS available!