Pasting Unicode data has different behavior on TCC and CMD

Aug 21, 2014
17
1
I just read about Microsoft's enhancements to the low-level Unicode handling in command prompts in Windows 10 version 1809, so I decided to try it. I copied some Unicode text that's not in Code Page 437 (example: ░▒▓) and tried pasting it into TCC.

On TCC running through Take Command, I get "°±²".
On TCC running as a command prompt, I also get "°±²".
On CMD, I get the correct value "░▒▓"

This happens whether I press Ctrl+V or paste via the system menu. TCC and CMD seem to handle pasted Unicode text differently.

TCC does have UTF-8 support enabled (via OPTION).
 

rconn

Administrator
Staff member
May 14, 2008
11,957
133
That's not quite what's happening -- none of those characters are Unicode. They're extended ASCII characters (8 bit) and their representation depends on the font & character set selected.

If you select a Unicode font and the Unicode character set, you will get °±². If you select a Unicode font and a DOS character set, you will get ░▒▓ . Which one is "correct"? (The characters are the same in either case - 0xB0, 0xB1, 0xB2.)
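To make the byte-level point concrete, here is a minimal Python sketch that decodes the same three bytes under two code pages. It assumes that the "DOS character set" corresponds to code page 437 and that the other character set behaves like Windows-1252 for these bytes; neither mapping is stated in the thread.

```python
# The three byte values named in the post above.
raw = bytes([0xB0, 0xB1, 0xB2])

# Assumption: the "DOS character set" is OEM code page 437.
as_oem = raw.decode("cp437")

# Assumption: the other character set matches Windows-1252 for these bytes.
as_ansi = raw.decode("cp1252")

def code_points(text):
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]

# Print code points only, so the output survives any console encoding.
print("cp437 :", code_points(as_oem))   # the shade block characters
print("cp1252:", code_points(as_ansi))  # degree, plus-minus, superscript two
```

The bytes are identical in both cases; only the code page used to interpret them changes which characters appear.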

CMD is "sort of Unicode", except for file handling and some keyboard I/O. TCC is fully Unicode.
 
Aug 21, 2014
17
1
Interesting. I just checked other random Unicode characters that are represented in my console font (Consolas), and they do seem to work. The problem seems limited to the characters that are in the classic DOS code page but not 437.

I copied those characters from the "Character Map" tool. I verified that the clipboard data (CF_UNICODETEXT) is U+2591, U+2592, U+2593 and the CF_TEXT is "???" since none of those characters are in my code page (437).
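The clipboard observation can be approximated outside the clipboard. This is a hedged Python sketch, assuming the CF_TEXT rendering is produced with an ANSI code page (taken here to be Windows-1252, which the thread does not state) and comparing it with code page 437. The variable names are illustrative only.

```python
# The code points reported for CF_UNICODETEXT.
text = "\u2591\u2592\u2593"

# Assumption: the 8-bit clipboard text is produced with Windows-1252.
# Characters with no mapping become '?'.
ansi_bytes = text.encode("cp1252", errors="replace")

# The same characters do have single-byte encodings in code page 437.
oem_bytes = text.encode("cp437")

# Reading those code page 437 bytes as Windows-1252 yields different characters.
reinterpreted = oem_bytes.decode("cp1252")

print(ansi_bytes)                                   # b'???'
print(oem_bytes.hex())                              # b0b1b2
print([f"U+{ord(ch):04X}" for ch in reinterpreted])  # ['U+00B0', 'U+00B1', 'U+00B2']
```

The last line shows that the three bytes, read as Windows-1252, give the characters reported for TCC. Whether TCC actually takes that path is not established in the thread.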