

Charles Dye

Super Moderator
Staff member
High time these two supported characters outside the BMP, i.e. 0x10000 <= n <= 0x10FFFF.
Since those characters won't fit in UTF-16, I'm not sure what you're suggesting. Did you want UTF-32 support, or to do this with 3- or 4-byte UTF-8?

They can be encoded in UTF-16. Characters above 0xFFFF are encoded as two wchar_ts: 0x10000 is subtracted from the character value, then the first wchar_t encodes the high ten bits of the remaining 20-bit value and the second encodes the low ten bits. That is a "surrogate pair". See e.g. Wikipedia for the details.
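
For anyone who wants to see the arithmetic, here is a rough sketch in Python. The function name `to_surrogate_pair` is made up for illustration; the constants 0xD800 and 0xDC00 are the standard bases of the high and low surrogate ranges.

```python
def to_surrogate_pair(cp):
    """Split a code point above 0xFFFF into a UTF-16 surrogate pair.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point must be in 0x10000..0x10FFFF")
    v = cp - 0x10000                 # leaves a 20-bit value
    high = 0xD800 | (v >> 10)        # high surrogate carries the top ten bits
    low = 0xDC00 | (v & 0x3FF)       # low surrogate carries the bottom ten bits
    return high, low


high, low = to_surrogate_pair(0x1F600)
print(hex(high), hex(low))           # 0xd83d 0xde00
```

The first and last characters outside the BMP, 0x10000 and 0x10FFFF, come out as (0xD800, 0xDC00) and (0xDBFF, 0xDFFF) respectively.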

So, when @CHAR finds a value above 0xFFFF, it should return the surrogate pair for the specified character. And conversely, when @UNICODE finds a surrogate pair in the input string, it should return a single value > 0xFFFF. (Values above 0x10FFFF are illegal, and should give an error message.)
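
To make the request concrete, here is a minimal Python sketch of the behavior I have in mind. It is not how TCC implements @CHAR or @UNICODE; `char_units` and `unicode_values` are hypothetical stand-ins that work on lists of 16-bit code units. Passing an unpaired surrogate through unchanged is my own assumption, since the request above does not cover that case.

```python
def char_units(n):
    """Stand-in for the proposed @CHAR: value -> list of UTF-16 code units."""
    if n < 0 or n > 0x10FFFF:
        raise ValueError("value above 0x10FFFF (or negative) is illegal")
    if n <= 0xFFFF:
        return [n]                   # fits in a single code unit
    v = n - 0x10000
    return [0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)]


def unicode_values(units):
    """Stand-in for the proposed @UNICODE: code units -> list of values."""
    values = []
    i = 0
    while i < len(units):
        u = units[i]
        is_high = 0xD800 <= u <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Combine the pair back into one value above 0xFFFF.
            values.append(0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        else:
            # Assumption: ordinary units and unpaired surrogates pass through.
            values.append(u)
            i += 1
    return values


units = char_units(0x41) + char_units(0x1F600)
print([hex(u) for u in units])                   # ['0x41', '0xd83d', '0xde00']
print([hex(v) for v in unicode_values(units)])   # ['0x41', '0x1f600']
```

The point of the round trip is that the two functions stay inverses of each other for every legal value, inside or outside the BMP.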
This is the current Take Command using Consolas:


But I don't know whether those characters are actually in the Consolas font, or whether Windows is just doing its font-substitution thing. I suspect the latter. Those glyphs look much the same in Lucida Console or Courier New.
I would love to see this too.