How to use TCC in UTF-8 mode?

Feb 14, 2015
#1
From what I have read in previous threads, this is not possible. Is that really so? Windows 7 Notepad seems to save in ANSI by default. I'm not sure what encoding most text files use in general, but mine are mostly UTF-8. What I'm asking for is support in TCC (and why not TCCLE, 4NT and even cmd.exe) for commands such as "type" to display the umlauts in my text files correctly. Just the umlauts would be enough; displaying other special characters correctly is less crucial, though that would be splendid too.

I can only get umlauts to display correctly if the text file was saved as Unicode. How can I fix this?
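
The symptom can be reproduced outside TCC. The following is a minimal Python sketch, not something TCC itself runs. It assumes the ANSI code page is Windows-1252 (common on Western European systems, but an assumption here) and shows what happens when the bytes of a UTF-8 file are interpreted as ANSI.

```python
# Assumption: the system ANSI code page is Windows-1252 (cp1252).
ANSI_CODEPAGE = "cp1252"

# Three umlauts (a, o and u with diaeresis), written as escapes
# so this script itself stays pure ASCII.
text = "\u00e4\u00f6\u00fc"

# The bytes as they would be stored in a UTF-8 file (two bytes per umlaut).
utf8_bytes = text.encode("utf-8")

# Reading those bytes as ANSI turns every umlaut into two wrong characters.
misread = utf8_bytes.decode(ANSI_CODEPAGE)

# Reading them as UTF-8 gives the original text back.
correct = utf8_bytes.decode("utf-8")

# Character counts: original text versus the misread version.
print(len(text), len(misread))
```

Each umlaut occupies two bytes in UTF-8, and an ANSI reader shows each byte as its own character, which is why the misread text is twice as long as the original.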
 
#4
I have this alias defined:
Code:
alias utf8toansi tpipe /unicode=utf-8,ansi
I can then run
Code:
myutf8command | utf8toansi
utf8toansi /input=myutf8file
Also,
Code:
%@UTF8DECODE[s,string]
decodes a string from UTF-8 to the current code page, and
Code:
%@UTF8ENCODE[s,string]
encodes from the current code page to UTF-8. The latter is not documented in the help, however.

For example
Code:
ffind /s /t"%@UTF8ENCODE[s,string]" files… | utf8toansi
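
Conceptually, the utf8toansi alias above converts UTF-8 bytes into the ANSI code page. A rough Python sketch of that idea follows. It is not how TPIPE is implemented; the function name utf8_to_ansi is hypothetical, and Windows-1252 is assumed as the ANSI code page.

```python
def utf8_to_ansi(data: bytes, codepage: str = "cp1252") -> bytes:
    """Decode UTF-8 bytes and re-encode them in an ANSI code page.

    Assumption: cp1252 stands in for the system ANSI code page.
    Characters the code page cannot represent are replaced with '?'.
    """
    text = data.decode("utf-8")
    return text.encode(codepage, errors="replace")


# A German word with two non-ASCII letters (u with diaeresis, sharp s),
# written with escapes and encoded as UTF-8.
sample = "Gr\u00fc\u00dfe".encode("utf-8")
converted = utf8_to_ansi(sample)

# Byte counts before and after conversion.
print(len(sample), len(converted))
```

The converted data is shorter because each of the two non-ASCII letters takes two bytes in UTF-8 but only one in Windows-1252. Anything outside the target code page is lost in this direction, so the conversion is only safe for text the ANSI code page can represent.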
 

rconn

Administrator
Staff member
May 14, 2008
#5
The UTF8 support in TCC / TCMD (which is waaay more than anything from Microsoft) is fundamentally limited due to the lack of any significant UTF8 support in Windows itself. What UTF8 support there is in TCC consists primarily of hacks to work around the limitations in the Windows APIs.

Microsoft is committed to UTF16 -- I don't think it's likely they're going to scrap it (and break most existing Windows apps) in order to support native UTF8.