Display problem with Unicode/UTF-8 Characters

Hi,

I've hit a problem when working with Unicode/UTF-8 characters in filenames. In a TCC window using the Consolas font, but defaulting (I think) to an ANSI code page for foreign characters, I get displays like this:

1641516993370.png


The (in this case) Japanese characters display as boxed question marks, and take up two character positions.

In TakeCommand, when I have UTF-8 support turned on I get this:

1641517131315.png


Now the Japanese characters appear correctly, but I get a white bar on the right hand side, equal in size to the number of foreign letters, i.e. the double width characters are displayed correctly in a single character space, but this results in the whole line being shrunk.

That's a cosmetic issue and liveable, but the real problem is if I hit the bottom of the window with this sort of display glitch occurring it seems to stuff up the colours in general. I find myself typing in black-on-black, even when using the up and down arrows to go through my command history. The cursor jumps as the command line changes, but the text is not displayed at all. I have to perform a "cls" to reset things and get my command line to display again.

Is there a known setting that can work around this?

Cheers

Michael
 
If it's a console (stand-alone) window, Windows is responsible for the character display.

If it's a Take Command tab window, the double-wide characters will only be displayed correctly if you're using the appropriate code page. (Take Command does not do the double-wide character check for non-DBCS code pages because it slows down everything substantially for the 99.9999% of users who *don't* want to occasionally display double-wide characters.)
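For anyone wondering what a "double-wide character check" involves, here is a minimal Python sketch. It is only an illustration and not how Take Command does it: the `display_width` helper is a made-up name, and it assumes a character takes two console cells when Unicode marks it as East Asian Wide or Fullwidth.

```python
import unicodedata


def display_width(text: str) -> int:
    """Estimate console cells used by text (hypothetical helper, not a TCC function)."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn over the previous character.
            continue
        # "W" (Wide) and "F" (Fullwidth) characters are assumed to take two cells.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


name = "日本語.txt"
print(len(name), display_width(name))
```

The character count and the cell count differ for the Japanese filename. A renderer that assumes one cell per character would come up short by that difference, which would be consistent with the line appearing shrunk. Doing this lookup for every character written is extra work on every line of output.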
 
G'day Rex,

Thanks for the quick reply. The first screen shot was a TCC console window. As you say, with Windows being responsible for that display I can't even find a setting that can make all characters display all the time, because it's fixed to using the system's default Code Page. (Yes, you could change the Code Page somehow probably, but you'd only ever have a reduced character set.)

The second screen shot was from a Take Command tab. I got excited when I was able to get actual Japanese characters to display! At least now I can see which files I'm actually working on a bit more reliably! As I say, a bit of a glitch on the right hand margin is not the end of the world, it was the black-on-black command lines that are causing me problems.

I just had a thought about using the colour settings for the command line that I haven't used in a long time. (Probably not since 4DOS!) This has given me a partial work-around. I set these TCC settings:

1641522889506.png


Now when I start up a Take Command window things are as you'd expect - stuff I type appears in yellow:

1641523001307.png


(The white bar appears automatically because in this case the prompt includes double-width characters. Very similar to the right-hand margin glitch, and I find it pretty easy to ignore.)

Now if I do a long DIR display with lots of foreign characters in the file names, whereas before I'd end up with black-on-black command text and things were unworkable, I now get this:

1641523192583.png


I've lost my yellow, but it's still readable! Unfortunately, the white block in this case is my cursor, so its positioning is now out of sync with where text actually appears when you type. (i.e. the cursor above should be positioned right next to the "r" in "dir", and it behaves as if it is.) Still, a lot better than being completely blind!

I appreciate you don't want to slow things down massively for the 99.999%, but perhaps there are just one or two variables that could get "reset" each time a new prompt is displayed? As I say, a "CLS" command always fixes things, so hopefully it would just be a case of working out what tiny bit of the "CLS" command also needs to be reset afresh for each command prompt?

Cheers

Michael
 
Interesting! I had no idea there was a Code Page for Unicode. (Seems a bit counter-intuitive, since Code Pages were how we worked around these things before Unicode!)

The AutoRun for CMD.EXE looks like a terrible thing - a big vector for trojans and viruses. And I imagine TCC doesn't pay attention to the setting anyway, and it's TCC I'm trying to "fix", not CMD.EXE. But I could use the 4Start.BTM instead.

Thanks

Michael
 
TCC uses AutoRun by default. Disable it here ... in OPTION\StartUp.

1642578490816.png
 
I've changed my code page with chcp to 65001 and it doesn't change the rendering of Unicode characters in TCC in the slightest. What gives?
 
You need to be using a Unicode font. If you're running TCC in a console window, click on the icon in the upper left corner of the window, select Properties, then the Font tab. If you're running TCC in a Take Command tab window, click on the Options menu, then Tabs, and select a font.

My own preference is for Cascadia Code or Consolas.
 
chcp does not affect _rendering_; it affects the behavior of applications that recognize the current console code page.
For example, compare the output of `netsh int ipv4 show addr` before and after switching the console CP. (If you have a non-English localized Windows version, the difference will be immediately apparent.)
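
To illustrate what a code page changes, here is a small Python sketch. It does not touch the console at all; it just decodes the same bytes under different code pages, using Python's built-in `utf-8`, `cp1252` and `cp932` codecs. The sample text is an arbitrary choice for the example.

```python
text = "日本語"

# The bytes an application would write if it encoded its output as UTF-8.
raw = text.encode("utf-8")

# Read back as UTF-8 (what code page 65001 means), the text survives.
as_utf8 = raw.decode("utf-8")

# Read back as Windows-1252, every byte becomes its own unrelated character.
as_cp1252 = raw.decode("cp1252", errors="replace")

# A DBCS code page such as 932 (Japanese) stores each of these kanji in two bytes.
as_cp932_bytes = text.encode("cp932")

print(len(raw), len(as_cp1252), len(as_cp932_bytes))
print(as_utf8 == text, as_cp1252 == text)
```

The code page decides how bytes map to characters. Whether the resulting characters can actually be drawn is then up to the font and the renderer.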
 
So, for me, the solution was to install the new-ish Windows Terminal and run TCC under that.

However, this introduced a new bug which I posted about here:
 
