Read Cyrillic text from a file

Local Body · Sep 15, 2016

Hi from Belarus!
I installed the Russian version of Windows 10 x64. The OEM code page is 866.
When I write Russian text to a file with >>!, the text is written correctly, i.e. it can be read in any text editor, for example FAR Manager.
But when I read it line by line with "%@line", unreadable text is shown on the screen.
Is there any way to solve this problem?
TCC version 20

MaartenG · Sep 15, 2016

It's a little hard to reproduce (I think) here on an English Windows version, but isn't this about the FONT you are using in TCC?
(Alt-spacebar > Properties (which will be called something else on a Russian Windows version, I guess...) > Tab Fonts to change it for the current session; "Defaults" to change it for well .. default :-)

rconn · Sep 15, 2016

Remember that Windows GUI apps and Windows console apps use different codepages & fonts, so you need to make sure you've got the same settings in both environments.

If the file is Unicode UTF-16 and you have the correct font selected for the console session, you shouldn't have any issues. If the file is non-UTF16 (i.e., UTF-8 or 8-bit extended ASCII) then it gets much more problematic for TCC to identify.
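
As a rough illustration of why this is hard (a Python sketch, not how TCC actually detects encodings; `sniff_bom` is a hypothetical helper): a UTF-16 file normally starts with a byte order mark that identifies it, while a code page 866 file is just bytes that decode without error under several 8-bit code pages.

```python
import codecs

def sniff_bom(data: bytes):
    """Return an encoding name if data starts with a UTF-8 or UTF-16 BOM, else None.

    Deliberately simplistic: it does not try to distinguish UTF-32.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith(codecs.BOM_UTF16_LE) or data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16"
    return None  # no marker: the encoding has to be guessed or configured

text = "Проверка"
utf16_file = codecs.BOM_UTF16_LE + text.encode("utf-16-le")  # file saved as UTF-16 with BOM
oem_file = text.encode("cp866")                              # file saved in OEM code page 866

print(sniff_bom(utf16_file))  # the BOM identifies the encoding
print(sniff_bom(oem_file))    # nothing in the bytes says "866"

# The same 8-bit bytes decode "successfully" under more than one code page,
# but only one of the results is the original text.
as_oem = oem_file.decode("cp866")
as_ansi = oem_file.decode("cp1251")
print(as_oem == text, as_ansi == text)
```

Without a marker in the file, the reader has to be told which code page to use or guess it.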

If you think you have everything configured correctly and still can't display the file correctly, please post an example file here for us to test.

MaartenG · Sep 15, 2016

Oh, I read your post wrong: you are using Windows 10 (somehow I managed to read Windows XP...).

Behaviour of "Properties" has changed between WinXP and Win7/10:
From Win7 on, if you change settings under "Properties", they will be put in the shortcut you used to start the program with.
So in your case (Win10): these settings will be saved.
Sorry for any confusion....

Local Body · Sep 16, 2016

MaartenG said:
Oh, I read your post wrong: you are using Windows 10 (somehow I managed to read Windows XP...).

Behaviour of "Properties" has changed between WinXP and Win7/10:
From Win7 on, if you change settings under "Properties", they will be put in the shortcut you used to start the program with.
So in your case (Win10): these settings will be saved.
Sorry for any confusion....

OK, thank you! But the problem is not solved.

Local Body · Sep 16, 2016

rconn said:
Remember that Windows GUI apps and Windows console apps use different codepages & fonts, so you need to make sure you've got the same settings in both environments.

If the file is Unicode UTF-16 and you have the correct font selected for the console session, you shouldn't have any issues. If the file is non-UTF16 (i.e., UTF-8 or 8-bit extended ASCII) then it gets much more problematic for TCC to identify.

If you think you have everything configured correctly and still can't display the file correctly, please post an example file here for us to test.

The script is saved in code page 866. All messages are displayed correctly on the screen, except lines read from a file. Here is an example.

echo off
echo.
echo Проверка > test.txt
echo check >>! test.txt
echo %@line[test.txt,0]
echo %@line[test.txt,1]
pause

The first line is displayed as unreadable text. The second is displayed correctly.
The option that MaartenG suggested made no difference.
I tried code page 65001 and others, but the problem remained.
Text displayed with "echo", "scrput" or "screen" appears correctly.

Using the FAR Manager editor, I created a file encoded in code page 1200. The lines of this file are read correctly. But when I try to switch to code page 1200 (chcp 1200), I get the message "The specified code page is invalid".
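
Code page 1200 is the Windows identifier for UTF-16LE, which uses 16-bit units and so is a file encoding rather than a byte-oriented console code page; that would explain why chcp rejects it. A small Python sketch (for illustration only, not TCC code) of how the same word looks in the two encodings:

```python
import codecs

text = "Проверка"

# Code page 866: one byte per character, no marker at the start of the file.
oem_bytes = text.encode("cp866")

# "Code page 1200" (UTF-16LE): two bytes per character here, plus a 2-byte BOM.
utf16_bytes = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

print(len(oem_bytes))    # 8
print(len(utf16_bytes))  # 18

# Decoding with "utf-16" consumes the BOM and restores the original text.
print(utf16_bytes.decode("utf-16") == text)
```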

In version 15, it works correctly on all systems: XP, 7, 8, 8.1, 10.

The problem appears in version 16 and later.

p.s. Sorry for my English :-)

MaartenG · Sep 16, 2016

WinLanEm said:
p.s. Sorry for my English :-)

No problem at all! It was very easy to read (and must be 1000 times better than me trying to speak/understand Russian :-)

I tried your testscript and you are absolutely right. Same problem here.

I simplified your test even further and here are some results
(in Print-screen format, to prevent "translations" between my system and this forum)

Codepage = 866 and Console font = Lucida Console.

TCC 20 :

In the second output you will notice there are only 7 characters. That is because %@char[160] is actually the space character. Just a coincidence.
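
One way to see where 160 comes from, sketched in Python rather than TCC. The assumption here (not confirmed in this thread) is that the bytes were taken as character codes without a code page conversion:

```python
word = "Проверка"

# In code page 866 the final letter "а" is stored as byte 0xA0, i.e. 160.
oem_bytes = word.encode("cp866")
last = oem_bytes[-1]
print(last)  # 160

# Assumption: each byte is widened to the code point with the same number.
# Latin-1 decoding does exactly that, so it is used here as a stand-in.
widened = oem_bytes.decode("latin-1")

# Code point 160 is the no-break space, which looks like an ordinary space.
print(widened[-1] == "\u00a0")
print(widened[-1].isspace())
```

So the last of the 8 characters is still there, it just shows up as a blank.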

In TCC/LE 14 (the current version):

But it's good to know my computer can speak Russian (never tried that one); now I can change my Matrix-script (somewhere else on this forum) to output Russian. Or Chinese ...
You learn every day on these forums :-)

rconn · Sep 16, 2016

This is a bug in the Microsoft RTL. I added a workaround for it in v20.0.20 (which will be uploaded later today).

MaartenG · Sep 16, 2016

That is really quick!

If this question can be answered simply and easily, I would like to know why there is a difference in results for different TCC versions on the same machine? Don't they call the same library?
Just curious...

("one fool can ask more questions than seven wise men can answer")

rconn · Sep 16, 2016

MaartenG said:
That is really quick!

If this question can be answered simply and easily, I would like to know why there is a difference in results for different TCC versions on the same machine? Don't they call the same library?

They call different versions of the same library. Microsoft changes it; sometimes they fix things, and sometimes they break things.

MaartenG · Sep 16, 2016

OK, I get it. Thanks for explaining!

Local Body · Sep 17, 2016

Thank you very much for your help!!! I look forward to build 20.

Local Body · Sep 18, 2016

Build 20. Great! Everything is working! Problem solved. Thanks again for your help. :-)
