How to search and replace ASCII 160 with ASCII 32?

oph

Jun 28, 2008
29
0
#1
I receive some files containing ASCII 160 characters (shown as a-acute (á), or as the non-breaking space of Excel). I want to convert them to spaces with TPIPE. Is it possible?

Thanks.

OPH
 

Charles Dye

Super Moderator
Staff member
May 20, 2008
3,556
46
Albuquerque, NM
prospero.unm.edu
#2
Possibly something like this?

Code:
tpipe /input=infile.txt /output=outfile.txt /replace=0,0,0,0,0,0,0,0,0,"%@char[160]","%@char[32]"
Nine, count 'em, nine zeros before the search and replace strings. You may also need a /UNICODE= option if either the input file or the output file uses UTF-8 or UTF-16.
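
For anyone who wants to do the same replacement outside TCC, here is a minimal Python sketch of the idea. It is not TPIPE's implementation; the function names `replace_byte_160` and `convert_file` are made up for illustration. It assumes the file uses a single-byte encoding, so that byte 160 is always a whole character. In a UTF-8 file, byte 0xA0 can be the second half of a multi-byte character, and a raw byte replacement would corrupt it.

```python
from pathlib import Path

NBSP_BYTE = b"\xa0"   # decimal 160
SPACE_BYTE = b"\x20"  # decimal 32


def replace_byte_160(data: bytes) -> bytes:
    """Return a copy of data with every byte 160 replaced by byte 32."""
    return data.replace(NBSP_BYTE, SPACE_BYTE)


def convert_file(src: str, dst: str) -> int:
    """Copy src to dst, replacing byte 160 with byte 32.

    Returns the number of bytes that were replaced.
    Assumes a single-byte encoding (hypothetical helper, not part of TPIPE).
    """
    raw = Path(src).read_bytes()
    Path(dst).write_bytes(replace_byte_160(raw))
    return raw.count(NBSP_BYTE)
```

Calling `convert_file("infile.txt", "outfile.txt")` would mirror the file names used in the TPIPE command above.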
 

oph

#3
Thank you.

I tried

Code:
tpipe /input=infile.txt /output=outfile.txt /replace=0,0,0,0,0,0,0,0,0,"á"," "
but it doesn't work.
 

Charles Dye

#4
Does it work as expected if you use %@CHAR[160] instead of the accented letter? Because I suspect you may be getting bitten by OEM-to-Unicode conversion, somewhere along the line.

(TCC uses Unicode internally, but characters from a batch file, or e.g. copied from the clipboard, can be in an OEM character set. And the conversions don't always work as you might expect.... And to muddy the waters further, console programs like TCC often use a different OEM character set than graphical programs like the text editor you use to write your batch file! Windows is a mess.)
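
The code page mismatch described above can be seen by decoding the same byte under different code pages. The following Python sketch is only an illustration: which code pages are actually in use on a given machine is an assumption here (cp437 and cp850 are common OEM console code pages, cp1252 a common ANSI code page for graphical programs).

```python
raw = bytes([160])  # the single byte 0xA0

# OEM code pages often used by Windows console programs (assumed here)
as_cp437 = raw.decode("cp437")
as_cp850 = raw.decode("cp850")

# ANSI code page often used by Windows graphical programs (assumed here)
as_cp1252 = raw.decode("cp1252")

for name, text in [("cp437", as_cp437), ("cp850", as_cp850), ("cp1252", as_cp1252)]:
    # Print the Unicode code point that the byte maps to under each code page
    print(f"{name}: U+{ord(text):04X}")

# cp437: U+00E1   (a-acute)
# cp850: U+00E1   (a-acute)
# cp1252: U+00A0  (non-breaking space)
```

So an "á" typed in one program and a byte 160 read by another need not refer to the same character, which is why spelling the character out as %@CHAR[160] avoids the ambiguity.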
 

oph

#5
Yes, your solution works. Thank you.

CHAR[160] can be displayed as "á", as a hard space, or as a Beta (the German double s).

By the way, I guess the files are OEM; they are the text of emails, surely distorted by the email program.

A hexadecimal sequence of

20 20 20 20 20 20.....

was changed to

20 A0 20 A0 20 A0.....

Thank you.
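
The byte pattern reported in post #5 can be reproduced and repaired in a few lines of Python. This is only a sketch on the six sample bytes quoted in that post, not on the real files; the variable names are illustrative.

```python
# The damaged sample from the post: every second space became byte A0
damaged = bytes.fromhex("20 A0 20 A0 20 A0")

# Replace byte 160 (0xA0) with byte 32 (0x20)
repaired = damaged.replace(b"\xa0", b"\x20")

print(damaged.hex(" "))   # 20 a0 20 a0 20 a0
print(repaired.hex(" "))  # 20 20 20 20 20 20
```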