Unicode ... I don't understand

May 20, 2008
Now I get the same output with LOADBTM on or off. But I just don't understand how it works. This is (ASCII) TEST.BTM (as I see it with TYPE, LIST, VIEW, or in an editor).
Code:
echo ² is character %@ascii[²]
echo @CHAR[%@ASCII[²]] is %@CHAR[%@ASCII[²]]
echo ²²²²²²²²²²²²²²²²²²²²
echo %@repeat[²,20]
echo %@repeat[%@char[178],20]
In the file, that superscript 2 is 0xB2 (178). When I run the BTM I see
Code:
▓ is character 9619
@CHAR[9619] is ▓
▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓
▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓
²²²²²²²²²²²²²²²²²²²²
I get a different result at the command line.
Code:
v:\> echo ² is character %@ascii[²] & echo @CHAR[%@ASCII[²]] is %@CHAR[%@ASCII[²]] & echo ²²²²²²²²²²²²²²²²²²²² & echo %@repeat[²,20] & echo %@repeat[%@char[178],20]
² is character 178
@CHAR[178] is ²
²²²²²²²²²²²²²²²²²²²²
²²²²²²²²²²²²²²²²²²²²
²²²²²²²²²²²²²²²²²²²²
Why are they different and where's character 9619 coming from?
 

Charles Dye

May 20, 2008
Different character sets. Your 8-bit file is interpreted according to your current (console or "OEM") code page, most likely code page 437. If you Google code page 437, the first hit is a Wikipedia page with a little table giving you not just pictures of the various characters, but also Unicode equivalents. Character 0xB2 in code page 437 is a graphics character, mapping to Unicode U+2593, or 9619 decimal. That's where your 9619 comes from.
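If it helps to see that mapping outside of TCC, here is a minimal Python sketch. It assumes the console code page really is 437 (a guess on my part, not something the thread confirms) and uses Python's built-in "cp437" codec to stand in for what the batch-file reader does with the byte. It only prints ASCII, so the output does not depend on the console's own encoding.

```python
# The single byte stored in the 8-bit batch file for the "superscript 2"
raw = bytes([0xB2])

# Interpret it as OEM code page 437 (assumed; substitute your own code page)
ch = raw.decode("cp437")

# The resulting Unicode code point, in hex and decimal
code_point = ord(ch)
print(f"U+{code_point:04X} = {code_point}")
```

Under that assumption the code point comes out as U+2593, which is 9619 decimal, the same number %@ascii reported inside the BTM.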

TCC uses Unicode internally. All the stuff you type at the command line, or paste in from another program, is Unicode. And the superscripted 2 just happens to be Unicode character U+00B2.

Just to add to the confusion, most Windows programs (the non-console ones) use yet another character set, the Windows code page. (Even more confusingly, that one is also called the "ANSI code page" even though the American National Standards Institute had nothing to do with it.) So if you open that batch file in, say, Notepad, you might see a third character, like an ogonek if you happen to be using code page 1250.
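To put the three interpretations side by side, here is a rough Python sketch that decodes the same byte under one OEM code page and two Windows code pages. The choice of cp437, cp1252, and cp1250 is illustrative only; your system may use others. The `results` dictionary is just a name made up for this example.

```python
import unicodedata

# The same byte from the batch file, read under three different code pages
raw = bytes([0xB2])

results = {}
for codec in ("cp437", "cp1252", "cp1250"):
    ch = raw.decode(codec)
    # Record the code point and the official Unicode character name
    results[codec] = (ord(ch), unicodedata.name(ch))
    print(f"{codec}: U+{ord(ch):04X} ({ord(ch)}) {unicodedata.name(ch)}")
```

With these codecs the byte comes out as DARK SHADE (9619) under cp437, SUPERSCRIPT TWO (178) under cp1252, and OGONEK (731) under cp1250: one byte, three different characters, depending on who is reading the file.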

Code pages are an idea whose time has come and long since gone. Unfortunately for Rex, compatibility with CMD.EXE requires that he continue to support the nasty things.
 