UnicodeOutput question

Discussion in 'Support' started by nikbackm, Dec 15, 2010.

  1. nikbackm

    Joined:
    May 30, 2008
    Messages:
    194
    Likes Received:
    1
    I have this command sequence:

    (echo commandSequence | externalProgram) > file.txt


    I would like "file.txt" to be Unicode, so it seems this should work:

    OPTION //UnicodeOutput=yes
    (echo commandSequence | externalProgram) > file.txt
    OPTION //UnicodeOutput=no

    However, this also causes externalProgram to receive its input in Unicode, which it does not handle well.


    This works as I want, but it requires a temporary file:

    (echo commandSequence | externalProgram) > tmpfile.txt
    OPTION //UnicodeOutput=yes
    type tmpfile.txt > file.txt
    OPTION //UnicodeOutput=no


    Is there a way to change the UnicodeOutput option after the pipe, but before the redirection to the result file?

    I tried this:

    ((echo commandSequence | externalProgram) & (OPTION //UnicodeOutput=yes)) > file.txt

    It produced a file, but it was not in Unicode format.

    Should it work, and in that case, how?
     
  2. Jim Cook

    Joined:
    May 20, 2008
    Messages:
    604
    Likes Received:
    0
    I believe the Unicode output only applies to internal commands.

     
  3. nikbackm

    Joined:
    May 30, 2008
    Messages:
    194
    Likes Received:
    1
    :o You're right of course.

    I guess I expected UnicodeOutput to do some magic. It should be up to externalProgram whether to output Unicode or not; TCC (or any other command processor) would likely not interfere when its output is redirected to a file.

    But TYPE seems to handle the job, so all's good.
     
  4. Steve Fabian

    Joined:
    May 20, 2008
    Messages:
    3,520
    Likes Received:
    4
    You are trying to feed ASCII input to a program and save its output in Unicode. Unless that program is prepared to do that, you need to work around it. AFAIK the only way to achieve your goal is the one you already found: writing a temporary ASCII file and translating it to Unicode.
    --
    Steve
     
  5. jabelli

    Joined:
    Oct 29, 2008
    Messages:
    83
    Likes Received:
    0
    Or use iconv, if you don't insist on a TCC-only solution. You can make aliases to use in pipes, e.g.
    Code:
    utf8=iconv --binary -f UTF-16LE -t UTF-8
    utf16=iconv --binary -f UTF-8 -t UTF-16LE
    then your batch file becomes
    Code:
    OPTION //UnicodeOutput=yes
    (echo commandSequence | utf8 | externalProgram | utf16) > file.txt
    OPTION //UnicodeOutput=no
    which should do what you want.
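
    For anyone without iconv at hand, the same "convert in, run, convert out" pattern can be sketched in Python. This is only an illustration of the idea, not part of TCC or iconv: the function names transcode and run_filtered are made up here, and it assumes the external program reads and writes UTF-8 on its standard streams.

```python
import subprocess


def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode raw bytes from encoding `src` to encoding `dst`."""
    return data.decode(src).encode(dst)


def run_filtered(command: list, text_utf16: bytes) -> bytes:
    """Feed UTF-16LE input to a UTF-8-only program, return UTF-16LE output.

    `command` is the external program and its arguments (hypothetical here).
    """
    # Equivalent of the "utf8" alias: UTF-16LE -> UTF-8 before the program.
    utf8_in = transcode(text_utf16, "utf-16-le", "utf-8")
    result = subprocess.run(
        command, input=utf8_in, stdout=subprocess.PIPE, check=True
    )
    # Equivalent of the "utf16" alias: UTF-8 -> UTF-16LE after the program.
    return transcode(result.stdout, "utf-8", "utf-16-le")
```

    The bytes returned by run_filtered would then be written to file.txt in binary mode. Note that, like iconv with UTF-16LE as the target, this does not add a byte order mark.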
     
  6. nikbackm

    Joined:
    May 30, 2008
    Messages:
    194
    Likes Received:
    1
    Thank you!

    That does work very well indeed.

    It also seems to be more reliable overall than using CHCP 65001, which does not always display UTF-8 output from external programs correctly. The same output formatted in a different way causes it to produce mangled characters in some cases.

    "ICONV -f UTF-8 -t CP1252" handles this better for some reason. Using codepage 65001 in the console is probably a hack that is not fully supported by Microsoft, or at least I've heard so a few times.
     