How to? How do I read a Unicode file through standard-input?

May 24, 2010
855
0
Northlake, Il
I'm sure I'm missing something here, but I don't quite know what I'm missing or where to find it (and I've spent some time looking). Specifically, I wanted to read a Unicode file through "standard input", i.e. <"A Unicode File.txt", and I tried to read it using the "usual" technique, i.e. "Set Line=%@SafeExp[@Line[CON,0]]". (Does "@SafeExp" have something to do with it?) The problem is that the program completely fails when the input file is a Unicode file, and works as expected when it is an ANSI (ASCII) file. In the short term I fixed the problem by creating an ANSI version of the input file I want to process, but I'd really rather be able to read a Unicode file "natively". So how do I do that?

- Dan
 

Charles Dye

Super Moderator
Staff member
May 20, 2008
4,446
88
Albuquerque, NM
prospero.unm.edu
You can try it without the @SAFEEXP, but I don't think that CON expects to receive anything but 8-bit text. You'll probably have to rewrite your batch file to use @FILEOPEN, @FILECLOSE, and @FILEREAD or @SAFEREAD.
 
May 24, 2010
855
0
Northlake, Il
Thank you, Charles. That's kind of what I expected, although I find it quite surprising, because I thought that TCC's native "language" was Unicode and that ANSI/ASCII was only there as a kind of "concession". It matters to me because most of the batch files I write are "filters" of one kind or another: they read data from standard input, process it in some manner, and write it back out to standard output. It's really not unusual for me to have two or three or even more of these "filters" piped one after the other on the command line. (And this kind of thing cannot be done using temporary files that I have to write and read.) So the inability to read Unicode from standard input basically means that I will have to uncheck the "Unicode Output" option. I had kept it unchecked for probably as long as it has existed, but recently checked it because I've read a couple of things lately that all basically said "The world is switching to Unicode, and you'll be left behind if you don't switch too."

(I really doubt that "@SafeExp" has anything at all to do with it, and I really don't want to live without it. In particular, I make heavy use of very long file names (as a memory aid to tell me exactly what is in a file, given my bad memory), I like to use "&" instead of "and", and I make heavy use of commas (for which I have to "UnSafe /E:, >NUL:", of course). Because batch files couldn't handle these things before your "@Safe" routines (and, as always, thank you very much!), I was forced to write C++ programs, which, because of my drastically declining programming skills due to my bad memory, was getting more and more impractical. So, thank you again!)

- Dan
 

Charles Dye

Super Moderator
Staff member
May 20, 2008
4,446
88
Albuquerque, NM
prospero.unm.edu
And this is kind of surprising to me because the primary kinds of batch files that I write are "filters" of some kind or another; they read some data in from standard input, process it in some manner, and write it back out to standard output. And it's really not too unusual for me to have two or three or even more of these "filters" "piped" one after the other on the command line. (And this kind of thing cannot be done using temporary files that I have to write and read.)

Well, perhaps you could add just one more filter to your command line, to translate Unicode to OEM before sending it on to your batch file. If you'd like to try one of mine, Xcode will output 8-bit text if you give it the /A option.
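
For anyone wondering what such a translation filter does, here is a rough sketch in Python. It is not how Xcode works (its internals aren't described in this thread), and it makes several assumptions: the input carries a byte-order mark, code page 437 is the OEM code page you want, and characters with no 8-bit equivalent can simply become "?". The names `to_8bit` and `main` are made up for the illustration.

```python
import codecs
import sys


def to_8bit(data: bytes, target: str = "cp437") -> bytes:
    """Re-encode BOM-marked Unicode bytes as 8-bit text.

    Assumption: input without a BOM is already 8-bit and is passed
    through unchanged. UTF-32 input is not handled in this sketch.
    """
    if data.startswith(codecs.BOM_UTF8):
        # Strip the UTF-8 BOM by hand, then decode the rest.
        text = data[len(codecs.BOM_UTF8):].decode("utf-8")
    elif data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to pick the byte order
        # and removes it from the decoded text.
        text = data.decode("utf-16")
    else:
        return data
    # Characters the target code page lacks are replaced with "?".
    return text.encode(target, errors="replace")


def main() -> None:
    # Work on the raw byte streams so Python performs no decoding
    # or newline translation of its own.
    sys.stdout.buffer.write(to_8bit(sys.stdin.buffer.read()))
```

With a call to `main()` added at the bottom and the script saved under some name of your choosing, it could sit at the front of a pipeline, so that the batch filters after it only ever see 8-bit text. It reads all of its input before writing anything, which is fine for files but would not suit an endless stream.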

It ought to be possible to add a function to SafeChars to slurp a line from stdin; I'll take a look at it sometime next week.
 
May 24, 2010
855
0
Northlake, Il
Thank you, Charles, I'll look into XCode. And while adding a function to "slurp lines from stdin" might be a nice addition (only if it is relatively easy for you to do!), I tend to think that, in my circumstances, just unchecking the "Unicode Output" option is a better idea because, as far as I know, I really don't need Unicode anyway.

- Dan
 