Display problem with Unicode/UTF-8 Characters

MichaelH · Jan 6, 2022

Hi,

I've hit a problem when working with Unicode/UTF-8 characters in filenames. In a TCC window using the Consolas font, but defaulting (I think) to an ANSI/code page system for foreign characters, I get displays like this:

The (in this case) Japanese characters display as boxed question marks, and take up two character positions.

In TakeCommand, when I have UTF-8 support turned on I get this:

Now the Japanese characters appear correctly, but I get a white bar on the right-hand side, equal in width to the number of foreign letters. That is, the double-width characters are displayed correctly, each in a single character space, but this shrinks the whole line.

That's a cosmetic issue and liveable, but the real problem is if I hit the bottom of the window with this sort of display glitch occurring it seems to stuff up the colours in general. I find myself typing in black-on-black, even when using the up and down arrows to go through my command history. The cursor jumps as the command line changes, but the text is not displayed at all. I have to perform a "cls" to reset things and get my command line to display again.

Is there a known setting that can work around this?

Cheers

Michael

rconn · Jan 6, 2022

If it's a console (stand-alone) window, Windows is responsible for the character display.

If it's a Take Command tab window, the double-wide characters will only be displayed correctly if you're using the appropriate code page. (Take Command does not do the double-wide character check for non-DBCS code pages because it slows down everything substantially for the 99.9999% of users who *don't* want to occasionally display double-wide characters.)
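To make the "double-wide character check" concrete, here is a minimal Python sketch of the general idea: classify each character by its Unicode East Asian Width property and count two columns for wide ones. This is only an illustration of the technique, not how Take Command does it; `display_width` is a hypothetical helper name.

```python
import unicodedata


def display_width(text: str) -> int:
    """Estimate how many terminal columns `text` occupies.

    Characters whose East Asian Width is Wide (W) or Fullwidth (F)
    are counted as two columns; combining marks as zero; all others as one.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks sit on the previous character
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


# A per-character lookup like this on every line is the kind of extra
# work that a renderer might reasonably skip for non-DBCS code pages.
print(display_width("dir"))         # plain ASCII: one column each
print(display_width("日本語.txt"))   # three wide characters plus ".txt"
```

If a renderer assumes one column per character, a line containing wide characters will be laid out narrower than it is drawn, which would be consistent with the margin and cursor offsets described in this thread.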

MichaelH · Jan 6, 2022

G'day Rex,

Thanks for the quick reply. The first screenshot was a TCC console window. As you say, with Windows being responsible for that display, I can't find a setting that makes all characters display all the time, because it's fixed to the system's default code page. (Yes, you could probably change the code page somehow, but you'd only ever have a reduced character set.)
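The "reduced character set" point can be sketched in Python, assuming its built-in codecs `cp1252` (Western European ANSI) and `cp932` (Japanese) as stand-ins for whatever code page a given system uses. `encodable` is a hypothetical helper, and the question marks produced here come from Python's `errors="replace"` handling, which may or may not be the same mechanism behind the boxed question marks in the console.

```python
def encodable(text: str, codec: str) -> bool:
    """Return True if every character in `text` exists in `codec`."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


name = "日本語.txt"

# A legacy code page covers only part of Unicode; UTF-8 covers all of it.
for codec in ("cp1252", "cp932", "utf-8"):
    print(codec, encodable(name, codec))

# Characters missing from the code page are substituted when replacement
# is requested, so the original name can no longer be recovered.
print(name.encode("cp1252", errors="replace"))
```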

The second screenshot was from a Take Command tab. I got excited when I was able to get actual Japanese characters to display! At least now I can see which files I'm actually working on a bit more reliably! As I say, a bit of a glitch on the right-hand margin is not the end of the world; it's the black-on-black command lines that are causing me problems.

I just had a thought about using the colour settings for the command line that I haven't used in a long time. (Probably not since 4DOS!) This has given me a partial work-around. I set these TCC settings:

Now when I start up a Take Command window things are as you'd expect - stuff I type appears in yellow:

(The white bar appears automatically because in this case the prompt includes double-width characters. Very similar to the right-hand margin glitch, and I find it pretty easy to ignore.)

Now if I do a long DIR display with lots of foreign characters in the file names, whereas before I'd end up with black-on-black command text and things were unworkable, I now get this:

I've lost my yellow, but it's still readable! Unfortunately, the white block in this case is my cursor, so its positioning is now out of sync with where text actually appears when you type. (i.e. the cursor above should be positioned right next to the "r" in "dir", and it behaves as if it is.) Still, a lot better than being completely blind!

I appreciate you don't want to slow things down massively for the 99.999%, but perhaps there's just one or two variables that could get "reset" each time a new prompt is displayed? As I say, a "CLS" command always fixes things, so hopefully it would just be a case of working out what tiny bit of the "CLS" command also needs to be reset afresh for each command prompt?

Cheers

Michael

AnrDaemon · Jan 17, 2022

Newer Windows versions have an option to permanently change the console CP to 65001 (UTF-8). And you could always use Command Processor's AutoRun.

MichaelH · Jan 18, 2022

Interesting! I had no idea there was a Code Page for Unicode. (Seems a bit counter-intuitive, since Code Pages were how we worked around these things before Unicode!)

The AutoRun for CMD.EXE looks like a terrible thing - a big vector for trojans and viruses. And I imagine TCC doesn't pay attention to the setting anyway, and it's TCC I'm trying to "fix", not CMD.EXE. But I could use the 4Start.BTM instead.

Thanks

Michael

vefatica · Jan 19, 2022

TCC uses AutoRun by default. Disable it here ... in OPTION\StartUp.

AnrDaemon · Jan 24, 2022

MichaelH said:
The AutoRun for CMD.EXE looks like a terrible thing
…
But I could use the 4Start.BTM instead.

How is one better than the other?

vefatica · Jan 24, 2022

AnrDaemon said:
How is one better than the other?

One enhances compatibility with CMD; the other doesn't.

ClioCJS · May 19, 2023

I've changed my code page to 65001 with chcp, and it doesn't change the rendering of Unicode characters in TCC in the slightest. What gives?

rconn · May 19, 2023

You need to be using a Unicode font. If you're running TCC in a console window, click on the icon in the upper left corner of the window, select Properties, then the Font tab. If you're running TCC in a Take Command tab window, click on the Options menu, then Tabs, and select a font.

My own preference is for Cascadia Code or Consolas.

AnrDaemon · May 21, 2023

chcp does not affect _rendering_; it affects the behavior of applications that are able to recognize the current console code page.
For example, compare the output of `netsh int ipv4 show addr` before and after switching the console CP. (If you have a non-English localized Windows version, the difference will be immediately apparent.)
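A rough Python sketch of why the code page matters to applications rather than to the renderer: the same bytes mean different text depending on which code page the reader assumes. The codec names `utf-8` and `cp437` are Python's, used here as stand-ins for console CP 65001 and a legacy OEM code page; `decode_as` is a hypothetical helper.

```python
def decode_as(data: bytes, codec: str) -> str:
    """Interpret raw bytes the way a program assuming `codec` would."""
    return data.decode(codec)


# Bytes a program might write when it believes the console is UTF-8.
raw = "日本語".encode("utf-8")

as_utf8 = decode_as(raw, "utf-8")   # reader agrees with the writer
as_oem = decode_as(raw, "cp437")    # reader assumes a legacy OEM page

print(len(raw))                      # number of bytes written
print(as_utf8 == "日本語")            # round-trips when both sides agree
print(len(as_oem), as_oem == "日本語")  # one character per byte, wrong text
```

Whether the resulting characters can then be drawn is a separate question that depends on the font and the renderer.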

ClioCJS · Jun 7, 2023

So, for me, the solution was to install the new-ish Windows Terminal and run TCC under that.

However, this introduced a new bug which I posted about here:

TCC window completely disappearing(!!!), sometimes(?) crashing, if Windows Terminal is running (or maybe even, if it's simply been run)

I spent way too long wondering why "import openai" was crashing python so hard that the calling window disappeared, and how that was even possible. Then I realized. TCC is disappearing. And sometimes reappearing. But usually disappearing. For good. Only after I installed windows terminal...

jpsoft.com
