
How to use TCC in UTF-8 mode?

Discussion in 'Support' started by distill, Feb 14, 2015.

  1. distill

    Joined:
    Feb 14, 2015
    Messages:
    7
    Likes Received:
    0
    From what I've read in previous threads, this is not possible. Is that really so? Windows 7 Notepad seems to save in ANSI by default. I'm not sure what encoding most text files use in general, but mine are mostly UTF-8. What I'm asking for is support in TCC (and why not TCCLE, 4NT and even cmd.exe) for commands such as "type" to display the umlauts in my text files correctly. Just the umlauts would be enough (it's not as crucial for other special characters to display correctly, though that would be splendid too).

    I can only get umlauts to display correctly if the text file was saved as Unicode. How can I fix this?
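
A minimal Python sketch of why this happens, assuming the viewer decodes the file's bytes with Windows-1252 (the ANSI or OEM code page on a given machine may differ). In UTF-8 each umlaut is two bytes, and a single-byte code page shows every byte as its own character. Notepad's "Unicode" option writes UTF-16 with a byte order mark, which tools can detect more easily. The function names are illustrative only.

```python
import codecs

def misread_utf8(text: str, codepage: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode the bytes as a single-byte code page."""
    # errors="replace" guards against byte values the code page leaves undefined.
    return text.encode("utf-8").decode(codepage, errors="replace")

def looks_like_utf16_le(data: bytes) -> bool:
    """True if data starts with the UTF-16 little-endian byte order mark (FF FE)."""
    return data.startswith(codecs.BOM_UTF16_LE)

umlauts = "\u00e4\u00f6\u00fc"            # a-umlaut, o-umlaut, u-umlaut
utf8_bytes = umlauts.encode("utf-8")      # b'\xc3\xa4\xc3\xb6\xc3\xbc': two bytes each
garbled = misread_utf8(umlauts)           # six characters instead of three
utf16_file = codecs.BOM_UTF16_LE + umlauts.encode("utf-16-le")
```

Here `garbled` holds the familiar "Ã¤Ã¶Ã¼" pattern, while `looks_like_utf16_le(utf16_file)` is true and `looks_like_utf16_le(utf8_bytes)` is false.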
     
  2. distill

    Joined:
    Feb 14, 2015
    Messages:
    7
    Likes Received:
    0
    By the way, Google Docs seems to save plain .txt files in UTF-8. That is one reason why I think UTF-8 is important.
     
  3. Charles Dye

    Charles Dye Super Moderator
    Staff Member

    Joined:
    May 20, 2008
    Messages:
    3,300
    Likes Received:
    39
  4. Christian Albaret

    Joined:
    Jul 1, 2008
    Messages:
    154
    Likes Received:
    1
    I have
    Code:
    alias utf8toansi tpipe /unicode=utf-8,ansi
    
    I can then run
    Code:
    myutf8command | utf8toansi
    utf8toansi /input=myutf8file
    
    Also
    Code:
    %@UTF8DECODE[s,string]
    decodes a string from UTF-8 to the current code page,
    Code:
    %@UTF8ENCODE[s,string]
    encodes from the current code page to UTF-8. The latter is not documented in the help, however.

    For example
    Code:
    ffind /s /t"%@UTF8ENCODE[s,string]" files… | utf8toansi
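
For readers without TPIPE at hand, the kind of conversion the alias above asks for can be sketched in Python. This only illustrates the idea (decode as UTF-8, re-encode in a single-byte code page). It is not how TPIPE or %@UTF8DECODE are implemented, and the function names and the choice of Windows-1252 as the "ANSI" code page are assumptions.

```python
def utf8_to_ansi(data: bytes, codepage: str = "cp1252") -> bytes:
    """Re-encode UTF-8 bytes in a single-byte code page (assumed: Windows-1252)."""
    # "utf-8-sig" also strips a leading UTF-8 byte order mark if one is present.
    text = data.decode("utf-8-sig")
    # Characters the code page cannot represent become "?" rather than raising.
    return text.encode(codepage, errors="replace")

def ansi_to_utf8(data: bytes, codepage: str = "cp1252") -> bytes:
    """The reverse direction: code-page bytes to UTF-8.

    Raises UnicodeDecodeError for byte values the code page leaves undefined.
    """
    return data.decode(codepage).encode("utf-8")
```

For instance, `utf8_to_ansi(b"\xc3\xa4")` returns `b"\xe4"`, the single-byte form of ä in Windows-1252. Anything outside the target code page is lost in this direction, which is why the conversion only helps for text such as umlauts that the code page can represent.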
     
  5. rconn

    rconn Administrator
    Staff Member

    Joined:
    May 14, 2008
    Messages:
    9,854
    Likes Received:
    83
    The UTF8 support in TCC / TCMD (which is waaay more than anything from Microsoft) is fundamentally limited due to the lack of any significant UTF8 support in Windows itself. What UTF8 support there is in TCC consists primarily of hacks to work around the limitations in the Windows APIs.

    Microsoft is committed to UTF16 -- I don't think it's likely they're going to scrap it (and break most existing Windows apps) in order to support native UTF8.
     
