Display problem with Unicode/UTF-8 Characters

Dec 8, 2019
3
0
Hi,

I've hit a problem when working with Unicode/UTF-8 characters in filenames. In a TCC window using the Consolas font, but defaulting (I think) to an ANSI code page for foreign characters, I get displays like this:

1641516993370.png


The (in this case) Japanese characters display as boxed question marks, and take up two character positions.

In Take Command, with UTF-8 support turned on, I get this:

1641517131315.png


Now the Japanese characters appear correctly, but I get a white bar on the right-hand side, equal in width to the number of foreign letters, i.e. the double-width characters are displayed correctly, each in a single character space, but this results in the whole line being shrunk.

That's a cosmetic issue and liveable, but the real problem is that if I hit the bottom of the window while this sort of display glitch is occurring, it seems to stuff up the colours in general. I find myself typing in black-on-black, even when using the up and down arrows to go through my command history. The cursor jumps as the command line changes, but the text is not displayed at all. I have to perform a "cls" to reset things and get my command line to display again.

Is there a known setting that can work around this?

Cheers

Michael
 

rconn

Administrator
Staff member
May 14, 2008
12,425
153
If it's a console (stand-alone) window, Windows is responsible for the character display.

If it's a Take Command tab window, the double-wide characters will only be displayed correctly if you're using the appropriate code page. (Take Command does not do the double-wide character check for non-DBCS code pages because it slows down everything substantially for the 99.9999% of users who *don't* want to occasionally display double-wide characters.)
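
For anyone following along, here is a rough Python sketch of what a double-wide character check involves. It is not Take Command's actual code, just an illustration under the assumption that "double-wide" means the Unicode East Asian Width classes W and F. The helper name `display_width` is made up for this example.

```python
import unicodedata


def display_width(text: str) -> int:
    """Count screen columns, treating East Asian Wide/Fullwidth characters as two."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks occupy no column of their own
        # "W" (Wide) and "F" (Fullwidth) characters take two columns in a terminal grid.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


name = "日本語.txt"
# len() counts characters; display_width() counts the columns they need on screen.
print(len(name), display_width(name))
```

For that filename the character count is 7 but the column count is 10. A renderer that lays out text by character count alone will be off by one column per wide character, and doing the per-character lookup on every line is the kind of extra work described above.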
 
Dec 8, 2019
3
0
G'day Rex,

Thanks for the quick reply. The first screen shot was a TCC console window. As you say, with Windows being responsible for that display, I can't even find a setting that makes all characters display all the time, because it's fixed to the system's default Code Page. (Yes, you could probably change the Code Page somehow, but you'd only ever have a reduced character set.)

The second screen shot was from a Take Command tab. I got excited when I was able to get actual Japanese characters to display! At least now I can see which files I'm actually working on a bit more reliably! As I say, a bit of a glitch on the right-hand margin is not the end of the world; it's the black-on-black command lines that are causing me problems.
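
To illustrate the "reduced character set" point, here is a small Python sketch. It only shows how text encodes under different code pages, not how the Windows console renders it, and the choice of cp1252 and cp932 as sample code pages is my own assumption for the example.

```python
name = "日本語"

# A single-byte Western code page has no slot for these characters,
# so each one degrades to a replacement "?".
western = name.encode("cp1252", errors="replace")

# A DBCS code page (here Japanese Shift-JIS, cp932) stores each one in two bytes.
japanese = name.encode("cp932")

# UTF-8 (code page 65001 on Windows) covers all of Unicode;
# these particular characters take three bytes each.
utf8 = name.encode("utf-8")

print(western, len(japanese), len(utf8))
```

Whichever legacy code page is active, characters outside it are lost, whereas UTF-8 can represent all of them at the cost of variable-length sequences.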

I just had a thought about using the colour settings for the command line that I haven't used in a long time. (Probably not since 4DOS!) This has given me a partial work-around. I set these TCC settings:

1641522889506.png


Now when I start up a Take Command window things are as you'd expect - stuff I type appears in yellow:

1641523001307.png


(The white bar appears automatically because in this case the prompt includes double-width characters. Very similar to the right-hand margin glitch, and I find it pretty easy to ignore.)

Now if I do a long DIR display with lots of foreign characters in the file names, whereas before I'd end up with black-on-black command text and things were unworkable, I now get this:

1641523192583.png


I've lost my yellow, but it's still readable! Unfortunately, the white block in this case is my cursor, so its positioning is now out of sync with where text actually appears when you type. (i.e. the cursor above should be positioned right next to the "r" in "dir", and it behaves as if it is.) Still, a lot better than being completely blind!

I appreciate you don't want to slow things down massively for the 99.999%, but perhaps there are just one or two variables that could get "reset" each time a new prompt is displayed? As I say, a "CLS" command always fixes things, so hopefully it would just be a case of working out which tiny bit of the "CLS" command also needs to be reset afresh for each command prompt?

Cheers

Michael
 
Dec 8, 2019
3
0
Interesting! I had no idea there was a Code Page for Unicode. (Seems a bit counter-intuitive, since Code Pages were how we worked around these things before Unicode!)

The AutoRun for CMD.EXE looks like a terrible thing - a big vector for trojans and viruses. And I imagine TCC doesn't pay attention to the setting anyway, and it's TCC I'm trying to "fix", not CMD.EXE. But I could use the 4Start.BTM instead.

Thanks

Michael
 
