Read Cyrillic text from a file

Sep 13, 2016
35
0
#1
Hi from Belarus!
I installed the Russian version of Windows 10 x64. The OEM code page is 866.
When I write Russian text to a file using the >>! redirection, the text is written correctly, i.e. it can be read with any text editor, for example FAR Manager.
But when I read it back line by line using "%@line", unreadable text is shown on the screen.
Is there any way to solve this problem?
TCC version 20
 
#2
It's a little hard to reproduce (I think) here on an English Windows version, but isn't this about the FONT you are using in TCC?
(Alt-spacebar > Properties (which will be called something else on a Russian Windows version, I guess...) > Tab Fonts to change it for the current session; "Defaults" to change it for, well... the default :-) )
 

rconn

Administrator
Staff member
May 14, 2008
10,504
94
#3
Remember that Windows GUI apps and Windows console apps use different codepages & fonts, so you need to make sure you've got the same settings in both environments.
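
To illustrate why the two environments can disagree, here is a minimal Python sketch (not TCC code). It assumes the console side uses OEM code page 866 and the GUI side uses the Russian ANSI code page 1251; the 1251 part is an assumption, since the thread only mentions 866. The test word is taken from the example script later in the thread.

```python
# Illustration only: the same bytes interpreted under two Windows code pages.
text = "Проверка"

# Bytes as a console program running under OEM code page 866 would write them.
oem_bytes = text.encode("cp866")

# Decoding with the matching code page round-trips the text.
as_oem = oem_bytes.decode("cp866")

# Decoding the same bytes as ANSI code page 1251 (assumed GUI code page)
# yields different characters, i.e. unreadable text.
as_ansi = oem_bytes.decode("cp1251")

print(oem_bytes.hex(" "))   # raw bytes, shown in hex
print(as_oem == text)       # matching code page: text survives
print(as_ansi == text)      # mismatched code page: text is garbled
```

The point of the sketch is only that the bytes in the file are fine in both cases; what changes is the code page used to interpret them.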

If the file is Unicode UTF-16 and you have the correct font selected for the console session, you shouldn't have any issues. If the file is non-UTF16 (i.e., UTF-8 or 8-bit extended ASCII) then it gets much more problematic for TCC to identify.
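
A rough Python sketch of why UTF-16 is the easy case: such files usually begin with a byte order mark (BOM), while 8-bit files carry no marker, so a reader has to guess the code page. The function name `decode_text_file` and the fallback to code page 866 are hypothetical choices for this illustration; this is not how TCC is known to do it.

```python
import codecs


def decode_text_file(data: bytes, fallback: str = "cp866") -> str:
    """Decode file bytes using a BOM if present, else a fallback code page."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # No BOM: nothing in the file says which 8-bit code page was used,
    # so the caller's assumption (here: OEM 866) decides the result.
    return data.decode(fallback)


# A UTF-16 LE file with a BOM is identified reliably.
utf16_file = codecs.BOM_UTF16_LE + "Проверка".encode("utf-16-le")
# An 8-bit file decodes correctly only if the fallback matches.
oem_file = "Проверка".encode("cp866")
```

With the wrong `fallback`, the second file would decode without error but produce the wrong letters, which is the hard-to-identify case described above.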

If you think you have everything configured correctly and still can't display the file correctly, please post an example file here for us to test.
 
#4
Oh, I read your post wrong: you are using Windows 10 (somehow I managed to read Windows XP...).

Behaviour of "Properties" has changed between WinXP and Win7/10:
From Win7 on, if you change settings under "Properties", they will be put in the shortcut you used to start the program with.
So in your case (Win10): these settings will be saved.
Sorry for any confusion....
 
Sep 13, 2016
35
0
#5
OK, thank you! But the problem is not solved.
 
Sep 13, 2016
35
0
#6
The batch file is written in code page 866. All messages are displayed on the screen correctly, except for lines read from a file. Here is an example.

echo off
echo.
echo Проверка > test.txt
echo check >>! test.txt
echo %@line[test.txt,0]
echo %@line[test.txt,1]
pause

The first line is displayed as unreadable text. The second is correct.
The option that MaartenG suggested brought no result.
I tried code page 65001 and others, but the problem remained.
If the text is displayed using "echo", "scrput" or "screen", it is shown correctly.

With the Far Manager editor I created a file encoded in code page 1200 (UTF-16). The lines of this file are read correctly. But when I try to change the code page to 1200 (chcp 1200), I get the message "The specified code page is invalid".

In version 15, it works correctly on all systems: XP, 7, 8, 8.1, 10.

The problem appears in version 16 and later.

p.s. Sorry for my English :-)
 

#7
p.s. Sorry for my English :-)
No problem at all! It was very easy to read (and must be 1000 times better than me trying to speak/understand Russian :-) )

I tried your test script and you are absolutely right. Same problem here.

I simplified your test even further and here are some results
(in Print-screen format, to prevent "translations" between my system and this forum)

Codepage = 866 and Console font = Lucida Console.

TCC 20 :

Capture1.JPG

In the second output you will notice there are only 7 characters. That is because %@char[160] is actually the space character. Just a coincidence.
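
One possible reading of the "7 characters" observation, sketched in Python. It assumes the test word was "Проверка" from the script in post #6 and that the file bytes were treated as character codes rather than decoded as code page 866; the screenshots are not available here, so neither assumption is confirmed.

```python
# Assumed test word (from the script in post #6), stored as code page 866 bytes.
word = "Проверка"
raw = word.encode("cp866")

# The final letter of the word is stored as byte value 160 (0xA0) in cp866.
last_byte = raw[-1]

# Decoded as code page 866, byte 160 is the Cyrillic letter it should be.
as_cp866 = bytes([last_byte]).decode("cp866")

# Taken as a character code instead, 160 is U+00A0 (no-break space),
# which looks like an ordinary space on screen.
as_code_point = chr(last_byte)

print(last_byte)
print(as_code_point.isspace())
```

Under those assumptions the eighth character would still be present, just rendered as a blank, which would match the seven visible characters.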


In TCC/LE 14 (the current version):

Capture2.JPG



But it's good to know my computer can speak Russian (never tried that one); now I can change my Matrix-script (somewhere else on this forum) to output Russian. Or Chinese ...
You learn every day on these forums :-)
 
#9
That is really quick!

If this question can be answered simply and easily, I would like to know why there is a difference in results for different TCC versions on the same machine. Don't they call the same library?
Just curious...

("one fool can ask more questions than seven wise men can answer")
 

rconn

Administrator
Staff member
May 14, 2008
10,504
94
#10
They call different versions of the same library. Microsoft changes it; sometimes they fix things, and sometimes they break things.