Take Command console tab AND TCC are not Chinese friendly

I downloaded the Take Command 11 trial today and found that the Take Command console tab and TCC are not friendly to wide characters (Chinese, Japanese, Korean, etc.).

(The forum's upload function is too restrictive, so I used xs.to instead.)

In the Take Command console tab, Chinese characters are only half displayed and the alignment is a mess:
image-5851_4BB5854F.jpg


The console window works better:
image-4741_4BB5854F.jpg


But redirecting or piping DIR output to an external program or a file is broken:
image-2373_4BB586D5.jpg

Opening the output text file in a hex editor, you can see that the redirected output uses the Unicode character count to truncate the ANSI string, resulting in truncated ANSI output. (This issue also seems to apply to TCC/LE 11.00.44.)

The TYPE command has the inverse problem: it uses the ANSI byte count for Unicode output, so it accesses uninitialized memory. (This issue does *NOT* seem to apply to TCC/LE 11.00.44.)
image-DCFD_4BB5896B.jpg
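
The two length mix-ups described above can be reproduced in a few lines. This is only a sketch of the arithmetic, not TCC's actual code; the file name is a made-up example, and Python's cp950 codec stands in for the Big5 ANSI code page.

```python
# Hypothetical file name; any name containing Big5 double-byte characters
# behaves the same way.
name = "中文檔名.txt"

ansi = name.encode("cp950")   # Big5: 2 bytes per Chinese character
char_count = len(name)        # UTF-16 code units (all characters are in the BMP)
byte_count = len(ansi)        # bytes in the ANSI form

# Redirected DIR, as described: the ANSI bytes are cut at the character count,
# so the tail of the name is lost.
truncated = ansi[:char_count]
visible = truncated.decode("cp950")   # the ".txt" part is gone

# TYPE, as described: the byte count is used as the length of the Unicode
# output, so the write runs past the converted text by this many characters.
overrun = byte_count - char_count
```

Here the name is 8 characters but 12 bytes, so the redirected output keeps only the first 8 bytes, and the inverse mistake reads 4 characters beyond the valid data.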
 
> Take Command does not currently support wide character sets.
>
> Rex Conn
> JP Software

But what about TCC and TCC/LE?
I can just use the console window instead, but the DIR and TYPE commands (and maybe all internal commands) corrupt their output. This REALLY NEEDS fixing as soon as possible, because it seriously hinders Asian users from using TCC and TCC/LE.
 
> ---Quote (Originally by rconn)---
> Take Command does not currently support wide character sets.
>
> Rex Conn
> JP Software
> ---End Quote---
> But what about TCC and TCC/LE?
> I can just use the console window instead, but dir/type commands(maybe
> this issue affect all internal commands) corrupt the output which
> REALLY NEED fixing as soon as possible because they affect Asians from
> using TCC and TCC/LE very much.

We cannot support that environment, as we do not have any in-house expertise
with wide character sets, nor do we have help available in anything other
than English, French, and German.

Rex Conn
JP Software
 
> but the redirection/pipe from dir to external program or file is
> broken:
>
> http://xs.to/image-2373_4BB586D5.jpg
>
> Opening the output text file in hex editor, you can see that the
> redirected output use Unicode character count for truncating ANSI
> string, resulting truncated ANSI output. (This issue seems applied in
> TCC/LE 11.00.44)

Did you specify Unicode output (OPTION / Startup)? The default is ASCII
output, and there's no way this would work with ASCII.


> and the _TYPE_ command is just inverted situation. Type command use
> ANSI byte count for Unicode output, resulting buffer overflow. (This
> issue seems **NOT** applied in TCC/LE 11.00.44)

Can you email me that text file? (And more details on your environment,
such as the code page & font you're using.)

Rex Conn
JP Software
 
> and the _TYPE_ command is just inverted situation. Type command use
> ANSI byte count for Unicode output, resulting buffer overflow. (This
> issue seems **NOT** applied in TCC/LE 11.00.44)
>
> http://xs.to/image-DCFD_4BB5896B.jpg

Is this a Unicode (UTF-16) file (which should work) or a DBCS file (which
definitely will not)?

Rex Conn
JP Software
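
One rough way to answer the UTF-16-or-DBCS question for a given file is to look for a byte order mark. The sketch below is only a heuristic under that assumption: a UTF-16 file may have no BOM, and the helper name `has_utf16_bom` is made up for illustration.

```python
import codecs

def has_utf16_bom(data: bytes) -> bool:
    """Return True if the data starts with a UTF-16 byte order mark (FF FE or FE FF)."""
    return data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))

# A file saved as "Unicode" normally starts with a BOM; a Big5 (CP 950) file does not.
utf16_sample = "中文".encode("utf-16")   # Python prepends a BOM for the plain "utf-16" codec
big5_sample = "中文".encode("cp950")
```

In practice the bytes would come from the first two bytes of the file, read in binary mode.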
 
> Did you specify Unicode output (OPTION / Startup)? The default is ASCII
> output, and there's no way this would work with ASCII.
>
> Can you email me that text file? (And more details on your environment,
> such as the code page & font you're using.)
>
> Rex Conn
> JP Software

image-0B53_4BB82BCF.jpg

File:
http://www.jpsoft.com/forums/attachment.php?attachmentid=125&stc=1&d=1270361530
This is a Big5 (ANSI/CP 950) encoded file.
As I need compatibility with cmd.exe, I cannot use Unicode output.

In cmd.exe, I can use Big5 (ANSI/CP 950) and UTF-8 (CP 65001), but in TCC and TCC/LE, UTF-8 DIR output is also broken.
 


> This is a Big5(ANSI/CP 950) encoded file.
> As I need compatibility with cmd.exe, I cannot use Unicode output.

CMD.EXE also supports Unicode output.


> In cmd.exe, I can use Big5(ANSI/CP 950) and UTF-8(CP 65001). But in TCC
> and TCC/LE, UTF-8 dir output is also broken.

TCC does not support ANSI DBCS or UTF-8 (though I'm unaware of support for
UTF-8 in CMD either).

Rex Conn
JP Software
 
> CMD.EXE also supports Unicode output.
> TCC does not support ANSI DBCS or UTF-8 (though I'm unaware of support for UTF-8 in CMD either).

Related or not, the following behavior is odd. After "CHCP 65001", I see the following (these are not edited):

Code:
v:\> for /l %i in (125,1,135) echo %i %@char[%i]
125 }
126 ~
127 
135 ‡
v:\> for /l %i in (125,1,136) echo %i %@char[%i]
125 }
126 ~
127 
136 ˆ
The counted FOR loop stops at 127 and later prints the last number of the specified range.

There is no problem echoing the missing ones separately:

Code:
v:\> echo 128 %@char[128]
128 €
v:\> echo 129 %@char[129]
129 Â
v:\> echo 130 %@char[130]
130 ‚
 
I see what's happening (I think). For some reason (double-byte character?) the newline is not echoed. So each line overwrites the previous one and I see only the last one.

 
> CMD.EXE also supports Unicode output.
>
> TCC does not support ANSI DBCS or UTF-8 (though I'm unaware of support for
> UTF-8 in CMD either).
>
> Rex Conn
> JP Software

But the external programs don't support Unicode I/O, so I have to use ANSI.
 
> Related or not, the following behavior is odd. After "CHCP 65001", I
> see the following (these are not edited):
>
> Code:
> ---------
> v:\> for /l %i in (125,1,135) echo %i %@char[%i]
> 125 }
> 126 ~
> 127 
> 135 ‡
> v:\> for /l %i in (125,1,136) echo %i %@char[%i]
> 125 }
> 126 ~
> 127 
> 136 ˆ
> ---------
> The counted FOR loop stops at 127 and later prints the last number of
> the specified range.

This has nothing to do with FOR, ECHO, %@CHAR, or TCC.

The string is generated correctly, but the Windows WriteConsole API is
choking (returning an "ERROR_GEN_FAILURE") when it contains those characters.

Rex Conn
JP Software
 
> We cannot support that environment, as we do not have any in-house expertise
> with wide character sets, nor do we have help available in anything other
> than English, French, and German.
>
> Rex Conn
> JP Software

Actually, properly supporting so-called "wide characters" is easy: take care of the byte count after converting to the target code page.

There is a very good example posted on stackoverflow.com:
http://stackoverflow.com/questions/215963/how-do-you-properly-use-widechartomultibyte

For example, create a file called "«®©».txt". DIR output in UTF-8 (CP 65001) will show it as "Â«Â®Â©Â».txt" when viewed as Western ANSI (CP 1252).
The original file name string is 8 wchar_t long, but after conversion to UTF-8 it expands to 12 chars.
"—" (U+2014 EM DASH, 1 wchar_t long) is represented by 3 chars (bytes E2 80 94).

So before writing to the pipe or file, allocate (or resize) the output buffer to the length of the converted string. That's it.

The same goes for the TYPE command: do the same conversion, and resize or truncate the output buffer with L'\0' before writing to the pipe or file. This change also fixes an uninitialized memory access vulnerability when using TYPE on an MBCS file.
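
The idea can be sketched as follows: convert first, then let the converted data define how many bytes are written. This is an illustration in Python rather than the Windows code itself; `encode_for_output` is a made-up helper, and an in-memory stream stands in for the pipe or file. On Windows the conversion would go through WideCharToMultiByte, which returns the required size when the output buffer size is passed as 0.

```python
import io

def encode_for_output(text: str, codepage: str) -> bytes:
    """Convert to the target code page; the result carries its own byte length."""
    return text.encode(codepage)

name = "«®©».txt"
out = encode_for_output(name, "utf-8")       # 8 characters become 12 bytes
dash = encode_for_output("\u2014", "utf-8")  # EM DASH: 1 character becomes 3 bytes

# Write exactly len(out) bytes, never len(name) bytes.
sink = io.BytesIO()                          # stand-in for the redirected file or pipe
written = sink.write(out)
```

Because the write length comes from the converted bytes, the output is neither truncated nor padded with data from beyond the converted string.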

I'm sure JP Software can make a security fix like this in a future version.
 
> Related or not, the following behavior is odd. After "CHCP 65001", I see the following (these are not edited):
>
> Code:
> v:\> for /l %i in (125,1,135) echo %i %@char[%i]
> 125 }
> 126 ~
> 127 
> 135 ‡
> v:\> for /l %i in (125,1,136) echo %i %@char[%i]
> 125 }
> 126 ~
> 127 
> 136 ˆ
> The counted FOR loop stops at 127 and later prints the last number of the specified range.
>
> There is no problem echoing the missing ones separately:
>
> Code:
> v:\> echo 128 %@char[128]
> 128 €
> v:\> echo 129 %@char[129]
> 129 Â
> v:\> echo 130 %@char[130]
> 130 ‚

If you use a TrueType font rather than a bitmap font, the loop runs to completion.

Code:
[F:\TCCLE]for /l %i in (125,1,135) echo %i %@char[%i]
125 }
126 ~
127 
128 €
129 
130 ‚
131 ƒ
132 „
133 …
134 †
135 ‡

[F:\TCCLE]
 