WAD FOR reads Text in ASCII !??!?

Peter Murschall · Jul 3, 2015

If I read a line from a listfile

R:\_SyncDirs_\5_DOWN\Down\Books, Documents and Manuals\Schnäppchen.pdf

with
FOR %line% in (@FileList) DO Echo Line:%line%
then the result is
Line:R:\_SyncDirs_\5_DOWN\Down\Books, Documents and Manuals\Schnõppchen.pdf

This always leads to an error, because in the next step the COPY command cannot find the file.
The list is created by FFIND .....

How can I avoid this?

Charles Dye · Jul 3, 2015

You're creating the file by redirecting the output of FFIND? Make it Unicode by doing OPTION //UNICODEOUTPUT=YES before doing the redirection.

vefatica · Jul 3, 2015

Shouldn't that work seamlessly if one is properly set up for one's own (European) language? It even works here with ASCII output if I merely CHCP to 1252.

Code:

v:\> chcp 1252
Active code page: 1252

v:\> ffind /b schn* > filelist

v:\> for %line in (@filelist) do copy %line empty\
V:\Schnõppchen.pdf => V:\empty\Schnõppchen.pdf
  1 file copied
V:\Schnõppchen.pdf.2 => V:\empty\Schnõppchen.pdf.2
  1 file copied

The spaces in Peter's path could be a problem if he's not quoting %line%.

Peter Murschall · Jul 3, 2015

Charles Dye said:
You're creating the file by redirecting the output of FFIND? Make it Unicode by doing OPTION //UNICODEOUTPUT=YES before doing the redirection.

I did it, Charles, as I saw in your reply to vefatica's DO problem, and it works!

But I must immediately switch it back after the redirection, because I don't want the subsequent output in Unicode.

@vefatica: Don't worry, the rest of the action continues with

Copy "%line" "%target%" /HEVX

Charles Dye · Jul 3, 2015

vefatica said:
Shouldn't that work seamlessly if one is properly set up for one's own (European) language? It even works here with ASCII output if I merely CHCP to 1252.

It sounds like he's writing text as one code page, and reading it back as another. I don't know exactly how it happens, but I do know that character 0xE4 is lowercase A with umlaut in Windows ("ANSI") code pages 1250 and 1252, but lowercase O with tilde in OEM code pages 850 and 858.
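
The byte-level claim above can be checked with a short sketch. Python is used here only as a neutral way to decode one byte under several code pages; it is not part of TCC, and the codec names (`cp1250`, `cp850`, and so on) are Python's names for those code pages.

```python
import unicodedata

# Decode the single byte 0xE4 under the four code pages named above.
raw = bytes([0xE4])
decoded = {cp: raw.decode(cp) for cp in ("cp1250", "cp1252", "cp850", "cp858")}

# Print the code point and Unicode name rather than the character itself,
# so the output does not depend on the console's own code page.
for cp, ch in decoded.items():
    print(f"{cp}: U+{ord(ch):04X} {unicodedata.name(ch)}")
```

The two Windows code pages give the a-umlaut and the two OEM code pages give the o-tilde, which matches the ä to õ change seen in the file list.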

I'd agitate for removing all support for 8-bit code pages, but that would pretty well scuttle backwards compatibility.

Peter Murschall · Jul 4, 2015

Charles Dye said:
It sounds like he's writing text as one code page, and reading it back as another. I don't know exactly how it happens, but I do know that character 0xE4 is lowercase A with umlaut in Windows ("ANSI") code pages 1250 and 1252, but lowercase O with tilde in OEM code pages 850 and 858.

I'd agitate for removing all support for 8-bit code pages, but that would pretty well scuttle backwards compatibility.

For clarification: both actions are in one script! First I search with FFIND (on a USB drive) for files, and if any are found, I try to copy them with the FOR loop. I never change the code page (why should I do such a thing?).
So only TCC can be the bad guy ..

Alpengreis · Jul 5, 2015

Peter Murschall said:
For clarification: both actions are in one script! First I search with FFIND (on a USB drive) for files, and if any are found, I try to copy them with the FOR loop. I never change the code page (why should I do such a thing?).
So only TCC can be the bad guy ..

If you have umlauts IN the script itself (I don't know your exact script), then the script itself can be the problem, for example if it is saved as UTF instead of ANSI (code page 1252) ... and NOT TCC, IMHO.

So make sure that your script is saved in the correct code page. If you use Notepad++ as your editor, for example, the default is UTF-8 (without BOM) and NOT ANSI (I believe), so you would have to convert it and save it as ANSI!

Edit: Even if your FFIND command contains no umlauts, the script file's code page could be relevant ... I'll run some tests ...

Edit 2: Maybe I have it :-) It could be that your Windows GUI is set to CP1252 and your console, including TCC, is set to CP850 (this is the default on such systems), so FFIND may produce a "false umlaut" in the file list ...
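
One way to test this kind of guess is to try every "written as X, read back as Y" combination and see which ones reproduce the garbled name. The following is only a diagnostic sketch in Python, not something TCC provides; the function name `explain_garbling` and the candidate list are assumptions made for illustration.

```python
from itertools import product

# Candidate code pages to try; this list is an assumption, not taken from TCC.
CANDIDATES = ("cp437", "cp850", "cp858", "cp1250", "cp1252")

def explain_garbling(expected, observed, candidates=CANDIDATES):
    """Return (written_as, read_as) pairs that turn `expected` into `observed`."""
    pairs = []
    for written_as, read_as in product(candidates, repeat=2):
        try:
            garbled = expected.encode(written_as).decode(read_as)
        except UnicodeError:
            continue  # text not representable, or byte undefined in that code page
        if garbled == observed:
            pairs.append((written_as, read_as))
    return pairs

# Expected name (with a-umlaut) versus the name seen in the FOR loop (with o-tilde).
pairs = explain_garbling("Schn\u00e4ppchen.pdf", "Schn\u00f5ppchen.pdf")
for written_as, read_as in pairs:
    print(f"written as {written_as}, read back as {read_as}")
```

Under these assumptions, only combinations of a Windows code page on the writing side and an OEM code page (850 or 858) on the reading side reproduce the name from the original post.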

Alpengreis · Jul 5, 2015

So, here are the detailed tests and a possible solution (using TCMD 18.00.27) ...

Okay, I was successful with the following steps:

1. Same codepage for Windows GUI and Windows Console

I have set the same code page for the Windows GUI and the Windows console (meaning the DOS prompt AND the TCC/TCMD console) as follows:

In the registry key [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage]

I changed the value "OEMCP" to the same code page that the Windows GUI uses. For example: on a US-English system, the default Windows code page is probably 1252 and OEMCP is probably 437; on a Swiss-German system (as here), the default Windows code page is probably 1252 and the default OEMCP is 850.

So I had to change to "OEMCP"="1252".

2. Reboot the system

3. Ensure that the filename with umlauts on your USB drive shows the correct umlauts in the Windows GUI.

4. Ensure that your TCMD/TCC is NOT set to Unicode (NOT set!)

5. Then I got the following results with manually typed commands within TCC:

5. Create your batch

After that you can create your batch/script. SAVE IT IN ANSI 1252, NOT in UTF or similar! If you edit your script, you may have to convert it and save it as ANSI 1252 first!

I have created this test.btm ...

Code:

@echo off
cls

w:
cd "\_SyncDirs_\5_DOWN\Down\Books, Documents and Manuals"

ffind /b schnäppchen.pdf > filelist
rem for %line in (@filelist) do echo "%line"
for %line in (@filelist) do copy "%line" testfile.txt

pause
exit

6. Test it

With this solution, switching to Unicode (and back) should NOT be necessary!

Greetings from Switzerland!

Alpengreis
