Putting @char[160] (nbsp) in the clipboard?

I can't put @char[160] (non-breaking space) in the clipboard. It goes in (and comes out) as 'á'.

Code:
d:\data\tcclibrary> echo **%@char[160]**
** **

d:\data\tcclibrary> echo **%@char[160]** > clip:

d:\data\tcclibrary> type clip:
**á**

d:\data\tcclibrary>
 
It works as expected here. You might try with OPTION //UNICODEOUTPUT=YES.
 
By "works" I mean that the %@CHAR[160] is translated into its CP437 equivalent:
Code:
C:\>ver

TCC  26.01.35 x64   Windows 10 [Version 10.0.17763.914]

C:\>option unicodeoutput
unicodeoutput=No

C:\>echo **%@char[160]** > clip:

C:\>type clip:
** **

C:\>type /x clip:
0000 0000 2a 2a ff 2a 2a 0d 0a                              ** **..

C:\>
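
For anyone checking the dump by hand, here is a minimal Python sketch (Python is used only for illustration; it is not part of TCC) that decodes the bytes from the TYPE /X line above. It assumes Python's built-in cp437 and cp1252 codecs agree with the Windows code pages for these byte values.

```python
# Bytes copied from the TYPE /X dump above.
dump = bytes.fromhex("2a 2a ff 2a 2a 0d 0a")

# Interpreted as CP437, byte 0xFF is the no-break space (U+00A0).
as_cp437 = dump.decode("cp437")

# The same bytes interpreted as CP1252 give a different character (U+00FF).
as_cp1252 = dump.decode("cp1252")

print(repr(as_cp437))
print(repr(as_cp1252))
```

So the 0xFF in the dump is consistent with NBSP having been converted to its CP437 position.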
 
It works as expected here. You might try with OPTION //UNICODEOUTPUT=YES.
That would screw everything up.

I see this.

Code:
v:\> echo **%@char[160]** > clip:

v:\> type /x clip:
0000 0000 2a 2a e1 2a 2a 0d 0a                              **á**..
 
I can reproduce this if I CHCP 1252.

I have no idea why this is happening. NBSP is 0xA0 in both Unicode and CP1252.
 
Same behaviour here with CP1252. That's really weird!

Even with this:

Code:
echo " " > clip:

The space is a copied-and-pasted 0xA0, of course ...

PS: Interesting: I do NOT have this problem in the new MS Terminal (v1.0.1401.0) within a TCC tab. Could it be a problem with the "old" conhost?
 
Here's another good one.

Code:
v:\> echo **%@char[160]** > clip:

v:\> type clip:
**á**

v:\> echo **%@char[160]** >> clip:

v:\> type clip:
**ß**
**á**

v:\> echo **%@char[160]** >> clip:

v:\> type clip:
**¯**
**ß**
**á**
 
It looks like appending to CLIP: is reading the existing text as OEM characters before adding the new text. But for some reason it's using one code page for input and a different code page for output, so that poor unfortunate character gets mangled again, and again....

It looks like we're writing to the clipboard as code page 1252 (NBSP = 0xA0 / CP1252) but reading from it as code page 437 (0xA0 / CP437 = á). And then again: writing á = 0xE1 / CP1252, reading 0xE1 / CP437 = ß. The next mutation should be 0xDF / CP437 = ▀, but yours looks different. Perhaps your current font doesn't contain that graphic?
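
The suspected mismatch can be reproduced outside TCC. The following is a minimal Python sketch, assuming the write side behaves like a CP1252 encode and the read side like a CP437 decode; the function names are made up for illustration, and this is not a claim about what TCC or Windows actually does internally.

```python
def write_1252_read_437(text):
    """One trip through the suspected mismatch: encode as CP1252, decode as CP437."""
    return text.encode("cp1252").decode("cp437")


def mutation_chain(start, steps):
    """Apply the mismatched round trip repeatedly, keeping every intermediate value."""
    chain = [start]
    for _ in range(steps):
        chain.append(write_1252_read_437(chain[-1]))
    return chain


chain = mutation_chain("\u00a0", 3)
for ch in chain:
    print(f"U+{ord(ch):04X} {ch!r}")
```

Under these assumptions the chain is NBSP, á, ß, ▀, which matches the first two mutations seen above. A fourth step cannot be computed this way because ▀ has no CP1252 encoding, so Python raises UnicodeEncodeError at that point.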

Rex, you've been saying for some time now — rightly! — that we should be using Unicode instead of 8-bit character sets. May I suggest that the same principle should apply to clipboard reads and writes? In version 27? CF_TEXT is dead, long live CF_UNICODETEXT!
 
OK, done. Several years ago.

The only time TCC uses CF_TEXT is if that's what's already in the clipboard when you want to read it, or if you're copying ASCII text (like, from a file, or if you're redirecting output to ASCII).
 
Will this one be fixed?

Code:
v:\> echo *%@char[160]*
* *

v:\> echo *%@char[160]* > clip:

v:\> type clip:
*á*
 
Why on earth would you want to put a non-spacing character into the clipboard? (And what do you think you'd get if you did??)

But if you really want to, ask Microsoft, because it's their clipboard code.
I really thought I'd get a NBSP in the clipboard ... because I wanted to paste it (IIRC) into a forum post. Charles seems to have figured out that "> clip:" uses CP_OEMCP. If that's not CP_ACP, oddball things will happen, like

Code:
v:\> echo *%@char[177]* > clip:

v:\> echo *%@char[177]*
*±*

v:\> echo %@line[clip:,0]
*▒*

v:\> type clip:
*¦*

Strangely, if I use a CP that doesn't have the PlusMinusSign ...

Code:
v:\> chcp 437
Active code page: 437

v:\> echo *%@char[177]* > clip:

v:\> echo *%@char[177]*
*±*

v:\> echo %@line[clip:,0]
*±*

v:\> type clip:
*±*

I don't understand why it depends on the code page, and why it works as I want with a code page which doesn't contain the character and fails to work as I want with a code page which does contain the character.
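
One way to look at the code page dependence is to check where each character sits in each code page. This is a small Python sketch, assuming Python's cp437 and cp1252 codecs match the Windows tables; byte_in is a hypothetical helper written for this post.

```python
def byte_in(code_page, ch):
    """Return the single byte for ch in code_page, or None if it is not mapped."""
    try:
        return ch.encode(code_page)[0]
    except UnicodeEncodeError:
        return None


positions = {}
for name, ch in (("NBSP", "\u00a0"), ("PLUS-MINUS", "\u00b1")):
    for code_page in ("cp437", "cp1252"):
        positions[(name, code_page)] = byte_in(code_page, ch)
        print(name, code_page, hex(positions[(name, code_page)]))

# Same code page on both sides: the character survives.
same_page = "\u00b1".encode("cp437").decode("cp437")

# CP1252 on the way in, CP437 on the way out: the character changes.
mixed_pages = "\u00b1".encode("cp1252").decode("cp437")
print(repr(same_page), repr(mixed_pages))
```

In Python's tables, at least, CP437 does map both characters (NBSP at 0xFF and ± at 0xF1), just at different byte values than CP1252 (0xA0 and 0xB1). If the write and the read both go through CP437, the round trip is lossless; if they go through different code pages, ± comes back as ▒, which is what %@line[clip:,0] showed under CP1252.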

Can't the clipboard use CP_ACP and/or UTF-8?

As I said a while back, I just want it to be seamless. One aspect of that is that @CHAR[n] always look the same.
 
There is no way you can get a non-spacing Unicode character in the clipboard when you're using ASCII output.

Switch to UTF-8 or UTF16 output. And then you still won't be able to paste it into a forum post, because it's not a character, it's the *lack* of a character. (But the value will be in the clipboard.)
 
I've pasted non-breaking spaces into the forum from time to time; it's a perfectly valid Unicode character. And I've pasted in stranger things than that. (Sometimes I want to mention CMD‍.EXE....)
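
For reference, the Unicode properties of U+00A0 can be checked with a few lines of Python (a sketch using only the standard unicodedata module; nothing here is TCC-specific):

```python
import unicodedata

nbsp = "\u00a0"

# General category: "Zs" is a space separator; non-spacing marks are "Mn".
category = unicodedata.category(nbsp)
name = unicodedata.name(nbsp)

# NBSP has ordinary encodings in UTF-8 and UTF-16.
utf8_bytes = nbsp.encode("utf-8")
utf16_bytes = nbsp.encode("utf-16-le")

# It cannot be represented in 7-bit ASCII at all.
try:
    nbsp.encode("ascii")
    fits_ascii = True
except UnicodeEncodeError:
    fits_ascii = False

print(name, category, utf8_bytes.hex(), utf16_bytes.hex(), fits_ascii)
```

So NBSP is classified as a space separator rather than a non-spacing mark, and it has a well-defined value in both Unicode encodings; only pure ASCII output has no way to carry it.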
 
There is no way you can get a non-spacing Unicode character in the clipboard when you're using ASCII output.

Switch to UTF-8 or UTF16 output. And then you still won't be able to paste it into a forum post, because it's not a character, it's the *lack* of a character. (But the value will be in the clipboard.)

I don't know what you're talking about. @CHAR[160] (non-breaking space) ... @CHAR[177] (PlusMinus) ... they're normal full width characters. There's no problem (except with VIEW) when I use CP 437 (which does not include either one).

1594505100339.png


If I use CP 1252 (which includes both of them) ...

1594505310855.png


As for pasting them into the forum ... here they are pasted:

After chcp 437 and echoing to clip: ... * ± *

After chcp 1252 and echoing to clip: ... *á▒á*

If I'm not using 437 (CP_OEMCP) they don't go into the clipboard correctly.
 
Whatever Powershell does is equally confusing. Here's powershell. I used Alt+00160 and Alt+00177 to enter the characters. What appears on the set-clipboard command line looks different but I get the same thing out of the clipboard.

1594515505551.png
 
I've been experimenting. It's not just a few characters. It's many high-bit characters that my font (Consolas) can handle.

If I put data in the clipboard like this.

Code:
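    // Room for 106 characters plus the terminating NUL.
    HGLOBAL hMem = GlobalAlloc(GMEM_ZEROINIT|GMEM_MOVEABLE|GMEM_DDESHARE, 107 * sizeof(WCHAR));
    WCHAR *pszChars = (WCHAR *) GlobalLock(hMem);
    // @char[160] ~ @char[265] plus terminating NUL, all supported by Consolas
    for ( INT i=0; i<106; i++ )
        pszChars[i] = (WCHAR) (160 + i);
    pszChars[106] = 0;
    GlobalUnlock(hMem);     // unlock before handing the block to the clipboard
    OpenClipboard(NULL);
    SetClipboardData(CF_UNICODETEXT, hMem);
    CloseClipboard();       // the system owns hMem from here on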

Then echo %@CLIP[0] gets it 100% right and TYPE clip: doesn't, regardless of code page. The difference is in characters 256~265.

In either case, it's better than TCC setting the clipboard data (which seems dependent on code page).

Code:
d:\projects2019\denom\x64\release> echo %@clip[0]
 ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿĀāĂăĄąĆćĈĉ

d:\projects2019\denom\x64\release> type clip:
 ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿAaAaAaCcCc

So I wonder, does (should?) TCC use CF_UNICODETEXT when setting the clipboard data? Any of CF_UNICODETEXT, CF_TEXT, and CF_OEMTEXT gives you the other two automatically, but I imagine CF_UNICODETEXT might not be faithful if it comes from the clipboard translating one of the other formats.

And does TCC use CF_UNICODETEXT (if available) when getting the clipboard data?
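
The AaAaAaCcCc tail in the TYPE clip: output looks like what you get when accented letters are reduced to their base letters on the way to an 8-bit code page. Here is a rough Python sketch of that effect. It strips combining marks after Unicode decomposition, which is only an approximation chosen for illustration and is not the actual Windows best-fit table; strip_marks is a hypothetical helper.

```python
import unicodedata


def strip_marks(text):
    """Decompose each character (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


# Characters 256~265, the range where the two outputs differ.
tail = "".join(chr(cp) for cp in range(256, 266))
flattened = strip_marks(tail)

print(tail)
print(flattened)
```

That the approximation reproduces the observed tail suggests, without proving it, that TYPE clip: is reading a narrow (8-bit) rendering of the text rather than the CF_UNICODETEXT data itself.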
 
More on CLIP:

The help says

Redirection to the clipboard is always done using UTF16 Unicode.

and

>:u Redirected output is UTF16 Unicode

Does that mean that "> CLIP:" and ">:u CLIP:" should behave the same? They don't behave the same.

1594841079494.png
 