Pasting Unicode data has different behavior on TCC and CMD

I just read about Microsoft's enhancements to the low-level Unicode handling in command prompts in Windows 10 version 1809, so I decided to try it. I copied some Unicode text that's not in Code Page 437 (example: ░▒▓) and tried pasting it into TCC.

On TCC running through Take Command, I get "°±²".
On TCC running as a command prompt, I also get "°±²".
On CMD, I get the correct value "░▒▓".

This happens whether I use Ctrl+V or paste via the system menu, so TCC and CMD appear to handle pasted Unicode text differently.

TCC does have UTF-8 support enabled (via OPTION).
 
That's not quite what's happening -- none of those characters are Unicode. They're extended ASCII characters (8 bit) and their representation depends on the font & character set selected.

If you select a Unicode font and the Unicode character set, you will get °±². If you select a Unicode font and a DOS character set, you will get ░▒▓. Which one is "correct"? (The characters are the same in either case - 0xB0, 0xB1, 0xB2.)
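The point about the same three byte values rendering differently can be checked outside the console. Below is a minimal Python sketch, assuming that "DOS character set" corresponds to Python's cp437 codec and that the "°±²" result corresponds to a Windows-1252 style mapping; the helper names `decode_with` and `describe` are made up for this illustration and are not part of TCC or CMD.

```python
import unicodedata

# The three byte values mentioned above.
RAW = bytes([0xB0, 0xB1, 0xB2])


def decode_with(code_page: str) -> str:
    """Decode the same three bytes using the named Python codec."""
    return RAW.decode(code_page)


def describe(text: str) -> list[str]:
    """Return 'U+XXXX NAME' for each character, using ASCII-only output."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


if __name__ == "__main__":
    # cp437 and cp1252 are assumptions standing in for the "DOS" and
    # "Unicode/Windows" character set choices described in the post.
    for code_page in ("cp437", "cp1252"):
        print(code_page)
        for line in describe(decode_with(code_page)):
            print("  " + line)
```

Under these assumptions the bytes are identical in both runs; only the table used to turn them into characters changes.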

CMD is "sort of Unicode", except for file handling and some keyboard I/O. TCC is fully Unicode.
 
Interesting. I just checked other random Unicode characters that are represented in my console font (Consolas) and they do seem to work. It only seems like a problem with the characters that are in the classic DOS code page but not 437.

I copied those characters from the "Character Map" tool. I verified that the clipboard data for CF_UNICODETEXT is U+2591, U+2592, U+2593, and that the CF_TEXT data is "???", since none of those characters are in my code page (437).
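For anyone who wants to reproduce the clipboard observation without a clipboard viewer, here is a rough Python sketch. It does not read the Windows clipboard; it only shows what the three code points look like when forced through an 8-bit code page. Using `errors="replace"` as a stand-in for how the CF_TEXT data ends up as "???" is an assumption, as is the choice of cp1252 as an example of a code page that lacks these characters. The names `code_points` and `to_code_page` are hypothetical helpers.

```python
# The three characters copied from Character Map, written as escapes
# so the source file itself stays ASCII-only.
TEXT = "\u2591\u2592\u2593"


def code_points(text: str) -> list[str]:
    """List the code points, as they would appear in the CF_UNICODETEXT data."""
    return [f"U+{ord(ch):04X}" for ch in text]


def to_code_page(text: str, code_page: str) -> bytes:
    """Encode to an 8-bit code page, substituting '?' for unmappable characters."""
    return text.encode(code_page, errors="replace")


if __name__ == "__main__":
    print(code_points(TEXT))
    # Assumed example of a code page without these characters.
    print(to_code_page(TEXT, "cp1252"))
    # Python's cp437 codec, for comparison with the 0xB0-0xB2 values
    # mentioned in the earlier reply.
    print(to_code_page(TEXT, "cp437"))
```

Comparing the two encoded results shows which code page a "???" conversion is consistent with.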
 