
TCC Unicode support?

Discussion in 'Support' started by myarmor, Apr 24, 2010.

  1. myarmor

    Joined:
    May 30, 2008
    Messages:
    65
    Likes Received:
    1
    Does TCC (command line) support Unicode .btm files?

    If so, does it require the BOM, or can it determine the encoding on its own?
    Are there any other encodings it supports (UTF-8, etc.)?

    I'm not sure whether CMD.EXE (in Windows 7) supports it, and I don't dare try: without Unicode support it would be executing random symbols, which doesn't sound too good.
     
  2. rconn

    rconn Administrator
    Staff Member

    Joined:
    May 14, 2008
    Messages:
    9,870
    Likes Received:
    83
    Yes.


    The BOM is helpful (and faster), but provided the file is more than a few
    bytes long, TCC can determine its type.

    TCC does not support UTF-8 batch files (and I cannot think of a reason why
    you would want to use them!)

    Rex Conn
    JP Software
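
For readers following along, here is a minimal Python sketch of what "checking for a BOM" amounts to. It is not TCC's actual code; the function name `detect_bom` and the encoding labels are assumptions chosen for illustration. It only inspects the first bytes of a buffer and ignores UTF-32, whose little-endian BOM also begins with FF FE.

```python
import codecs

# Known byte-order marks, paired with an encoding label (labels are illustrative).
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]

def detect_bom(data: bytes):
    """Return the encoding label implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None

if __name__ == "__main__":
    print(detect_bom(codecs.BOM_UTF16_LE + "echo foo\r\n".encode("utf-16-le")))
    print(detect_bom(b"echo foo\r\n"))
```

A check like this is cheap because it reads at most three bytes, which is consistent with the BOM route being the faster one.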
     
  3. vefatica

    Joined:
    May 20, 2008
    Messages:
    7,972
    Likes Received:
    30
    On Sat, 24 Apr 2010 06:54:28 -0400, myarmor wrote:

    |Does TCC (command line) support Unicode .btm files?
    |
    |If so, does it require the BOM, or can it determine the encoding on its own?
    |Are there any other encodings it supports (UTF-8, etc.)?

    It's OK with a BOM.

    It apparently doesn't work without a BOM.

    v:\> ver

    TCC 11.00.48 Windows XP [Version 5.1.2600]

    v:\> type /x ucode.bat
    0000 0000 65 00 63 00 68 00 6f 00 20 00 66 00 6f 00 6f 00 e.c.h.o. .f.o.o.
    0000 0010 0d 00 0a 00 ....

    v:\> ucode.bat
    TCC: V:\ucode.bat [1] Unknown command "e"

    (XP's) CMD cannot deal with it, with or without a BOM:

    Microsoft Windows XP [Version 5.1.2600]
    (C) Copyright 1985-2001 Microsoft Corp.

    v:\> ucode.bat (with BOM)

    v:\> ÿþe
    'ÿþe' is not recognized as an internal or external command,
    operable program or batch file.

    v:\> ucode.bat (without BOM)

    v:\> e
    'e' is not recognized as an internal or external command,
    operable program or batch file.
    --
    - Vince
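
The two test files in the transcript above can be reproduced with a short Python sketch. The helper name `hex_dump` is an assumption for illustration; the line content `echo foo` followed by CR LF is taken from the `type /x` dump. Decoding the first three bytes as Latin-1 is only meant to illustrate why a tool that reads the file as 8-bit text might show the BOM as two stray characters in front of `e`.

```python
import codecs

def hex_dump(data: bytes) -> str:
    """Format bytes as space-separated lowercase hex pairs."""
    return " ".join(f"{b:02x}" for b in data)

line = "echo foo\r\n"

# UTF-16 LE without a BOM: every ASCII character is followed by a 00 byte.
no_bom = line.encode("utf-16-le")

# The same content with the UTF-16 LE BOM (FF FE) in front.
with_bom = codecs.BOM_UTF16_LE + no_bom

if __name__ == "__main__":
    print(hex_dump(no_bom))
    print(hex_dump(with_bom))
    # What the first three bytes look like when read as 8-bit Latin-1 text.
    print(with_bom[:3].decode("latin-1"))
```

Writing `no_bom` or `with_bom` to a file with `open(path, "wb")` gives the two variants of `ucode.bat` used in the test.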
     
  4. myarmor

    Joined:
    May 30, 2008
    Messages:
    65
    Likes Received:
    1
    Thanks for the info.
    With a BOM it works as expected.

    vefatica seems to be right, though: it apparently handles files without a BOM poorly (UTF-16 LE, to be exact).
    I tested with two ECHO lines, the first with European text and the second with Japanese, in a Unicode file without a BOM.

    If the first character in the Unicode file is @, though, it neither complains nor runs.

    However, now that it's known, I know what to do :)

    Finding a console font that supports Unicode is another matter (W7 Pro lists only Consolas, Lucida Console and Raster Fonts, and none of them seems to have much support for it).
     
  5. rconn

    rconn Administrator
    Staff Member

    Joined:
    May 14, 2008
    Messages:
    9,870
    Likes Received:
    83
    We've had this discussion before - Windows (not TCC) needs more than 10
    bytes to determine whether a string is Unicode. If you want to write really
    small Unicode batch files, you're going to have to insert the BOM.

    Rex Conn
    JP Software
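
If the practical answer is to insert the BOM, that step is easy to script. The following is a minimal Python sketch under one loud assumption: the input is already UTF-16 LE. The names `add_utf16le_bom` and `add_bom_to_file` are hypothetical helpers, not part of TCC or Take Command.

```python
import codecs
from pathlib import Path

def add_utf16le_bom(data: bytes) -> bytes:
    """Return data with a UTF-16 LE BOM (FF FE) in front, unless one is already there.

    Assumption: data is already UTF-16 LE encoded; nothing is re-encoded here.
    """
    if data.startswith(codecs.BOM_UTF16_LE):
        return data
    return codecs.BOM_UTF16_LE + data

def add_bom_to_file(path: str) -> bool:
    """Rewrite the file in place with a BOM. Returns True if the file was changed."""
    p = Path(path)
    original = p.read_bytes()
    updated = add_utf16le_bom(original)
    if updated == original:
        return False
    p.write_bytes(updated)
    return True
```

Calling `add_bom_to_file` on a file that already starts with FF FE leaves it untouched, so the helper can be run repeatedly over a directory of batch files.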
     
  6. myarmor

    Joined:
    May 30, 2008
    Messages:
    65
    Likes Received:
    1
    I forgot to mention that this was run on Windows 7 x64 Pro with the newest version/update of Take Command (I tend to use only TCC from that package).

    My test contained:
    @echo off
    echo This is a test of a .btm file without BOM
    echo (15 Japanese characters go here; I don't include them in this post).

    In other words, it was a bit more than 10 characters, and 3 lines in total.
    I'm not saying it is your fault or anything, as you use the Windows APIs to determine it; I'm just mentioning it.

    However, as long as it's known, it doesn't really bother me that much.
     
  7. vefatica

    Joined:
    May 20, 2008
    Messages:
    7,972
    Likes Received:
    30
    On Sun, 25 Apr 2010 12:42:24 -0400, myarmor wrote:

    |My test contained:
    |@echo off
    |echo This is a test of a .btm file without BOM
    |echo (15 Japanese characters go here; I don't include them in this post).
    |
    |In other words, it was a bit more than 10 characters, and 3 lines in total.
    |I'm not saying it is your fault or anything, as you use the Windows APIs to determine it; I'm just mentioning it.

    I'd recommend that Rex use IS_TEXT_UNICODE_STATISTICS in addition to the current
    tests. It works better for me (it recognizes short WCHAR strings, including
    L"echo foo").
    --
    - Vince
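
To give a feel for what a statistical test on BOM-less data can look like, here is a deliberately simple Python sketch. It is not the algorithm behind IS_TEXT_UNICODE_STATISTICS; the function name `looks_like_utf16le`, the 0.4 threshold, and the rule itself (count NUL bytes in the high-byte positions) are all assumptions made for illustration.

```python
def looks_like_utf16le(data: bytes, threshold: float = 0.4) -> bool:
    """Guess whether data is BOM-less UTF-16 LE text.

    Illustrative rule (an assumption, not the Windows heuristic): in mostly
    ASCII UTF-16 LE text, the odd-indexed (high) bytes are NUL and the
    even-indexed (low) bytes never are.
    """
    if len(data) < 2 or len(data) % 2:
        return False
    low = data[0::2]   # low byte of each 16-bit unit
    high = data[1::2]  # high byte of each 16-bit unit
    if low.count(0) > 0:
        return False
    return high.count(0) / len(high) >= threshold

if __name__ == "__main__":
    print(looks_like_utf16le("echo foo\r\n".encode("utf-16-le")))  # short ASCII text
    print(looks_like_utf16le(b"echo foo\r\n"))                     # 8-bit text
    print(looks_like_utf16le("日本語".encode("utf-16-le")))         # no NUL high bytes
    print(looks_like_utf16le(bytes([1, 0, 2, 0, 3, 0])))           # small 16-bit integers
```

The last two calls show both ways a rule like this can misjudge: pure Japanese text has no NUL high bytes and is rejected, while a run of small 16-bit integers is accepted even though it is not text.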
     
  8. rconn

    rconn Administrator
    Staff Member

    Joined:
    May 14, 2008
    Messages:
    9,870
    Likes Received:
    83
    Tried and abandoned long ago, because of an exceptionally high number of
    false positives.

    Rex Conn
    JP Software
     
