vefatica wrote:
| A seemingly knowledgeable gent replied to my newsgroup query thus
| (below). It's beyond me. Does it make sense to you? I can
| accurately get the character under the mouse cursor and reproduce it.
| As for turning it into a **familiar** number (some character code) I
| think I'm SOL.
|
| Quoting:
|
| You have dipped into subject that mixes ancient history and modern
| internationalization.
|
| The original IBM CGA display included fonts in its ROM that had
| glyphs in all 256 places, including the control characters and the
| high 128 characters. The glyph for 0x1B was a left-facing arrow.
|
| Today, this character set lives on as the default 8-bit code page for
| command shells, CP437. The console buffer (essentially a
| virtualization of the CGA text-mode buffer at 0B8000) is an 8-bit
| buffer, so the value that is written is the 8-bit value 0x1B
| (decimal 27).
|
| When you use ReadConsoleOutputW, the system does an ANSI-to-Unicode
| conversion for you, using the CP437 code page. Since 0x1B in CP437 is
| left-pointing-arrow, you read 0x2190.
|
| -It's interesting. If I use ReadConsoleOutputW() on that character,
| -I get
| - CHAR_INFO::Char.UnicodeChar = 8592
| - CHAR_INFO::Char.AsciiChar = 65424 [garbage?]
|
| This would have made more sense if you had looked at this in hex.
|
| 8592 = 0x2190
| 65424 = 0xff90
|
| This is just taking the low-order byte of the Unicode character you
| got, and sign-extending it.
|
| -If I use ReadConsoleOutputCharacterW(), I get 8592.
| -If I use ReadConsoleOutputCharacterA(), I get 27.
| -If I use ReadConsoleOutputA(), I get 27.
| -
| -So the "A" version of the functions is doing some translating (or
| -the "W" version is). WideCharToMultiByte() always failed to
| -translate 8592 correctly, returning 63 ("?", the default
| -un-printable). I wish I understood what's going on.
|
| When YOU call WideCharToMultiByte, you are using some other 8-bit code
| page, and that code page does not have an encoding for "left-pointing
| arrow". If you called WideCharToMultiByte with CP437, you would get
| 0x1B.
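The hex arithmetic in the quoted reply is easy to check. Here is a minimal Python sketch of "take the low-order byte and sign-extend it to 16 bits"; the function name is mine, and it only models the arithmetic, not anything Windows actually does internally:

```python
def sign_extend_low_byte(code_unit: int) -> int:
    """Keep the low byte of a 16-bit value, then sign-extend it back to 16 bits."""
    low = code_unit & 0xFF                        # 0x2190 -> 0x90
    signed = low - 0x100 if low & 0x80 else low   # treat the byte as a signed char
    return signed & 0xFFFF                        # view the result as unsigned 16-bit


print(hex(sign_extend_low_byte(0x2190)))  # 0xff90
```

So 8592 (0x2190) does turn into 65424 (0xFF90) once the top byte is dropped and the remaining 0x90 is read as a signed char.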
It seems that the W functions perform a glyph-based code translation. The
glyph for any octet that is not a printable ASCII character (0x00-0x1F,
0x7F-0xFF) depends on the code page, and is thus translated. For printable
characters within the ASCII range (0x20-0x7E), the W codes should match the
octets. The A functions seem to be fine: they do not translate and simply
return the actual octets.
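To make the idea concrete, here is a rough Python sketch of that glyph-based mapping. It is an illustration only, not how the console is implemented: the table is hand-written and deliberately partial (just the four arrow glyphs), and the helper names are made up. Python's own "cp437" codec is used for everything else; note that it maps 0x1B to the ESC control character rather than to the arrow, which is why the table is needed.

```python
# Partial, hand-written table: CP437 display glyphs for a few control octets.
CP437_CONTROL_GLYPHS = {
    0x18: "\u2191",  # up arrow
    0x19: "\u2193",  # down arrow
    0x1A: "\u2192",  # right arrow
    0x1B: "\u2190",  # left arrow (decimal 27 -> 8592)
}
GLYPH_TO_OCTET = {glyph: octet for octet, glyph in CP437_CONTROL_GLYPHS.items()}


def octet_to_glyph(octet: int) -> str:
    """Roughly what a W read appears to return for a stored 8-bit value."""
    if octet in CP437_CONTROL_GLYPHS:
        return CP437_CONTROL_GLYPHS[octet]
    return bytes([octet]).decode("cp437")


def glyph_to_octet(ch: str) -> int:
    """Reverse mapping: recover the 8-bit value from the Unicode character."""
    if ch in GLYPH_TO_OCTET:
        return GLYPH_TO_OCTET[ch]
    return ch.encode("cp437")[0]


print(ord(octet_to_glyph(27)))    # 8592
print(glyph_to_octet(chr(8592)))  # 27
# A conversion that has no mapping for the arrow falls back to "?" (63):
print(chr(8592).encode("cp437", errors="replace"))  # b'?'
```

With a table like this, the "familiar" number for the character under the cursor is just the reverse lookup of whatever the W function returned.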
PS: Nearly 20 years ago I used the CP437 non-printable codes to display on
the screen the outline drawing of an add-on PC card, showing its proper
jumper settings for the BIOS and other add-in cards in use (to select memory
mapping and port selection).
--
Steve