Feature request for TEXTUTILS command WRAP

Charles,

The handling of quotes described here:
Code:
Quotes replacement: /Q causes WRAP to replace generic ASCII apostrophes and quote marks ( ' and " ) with Unicode open and close 
quote marks ( ‘ ’ and  “ ” ). The new quote marks may or may not look different from the originals, depending on how they are 
displayed and the font used. If the output is displayed in a non-Unicode font, the curly quotes will be lost or mangled.
works very well, but could use an enhancement.

There are text sources which use two consecutive ASCII apostrophes as a substitute for double quotation marks; e.g.,
Code:
''This is a sample quotation,'' wrote Dave.

It would be useful if /Q would treat two consecutive ASCII apostrophes as if they were a single occurrence of the generic ASCII [double] quotation mark.

(For now, I'm preprocessing the text with
Code:
sed -e `s/''/\"/g`
.)

Tnx,
Dave C.
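
For readers without sed, the preprocessing step above can be sketched in Python. This is only an illustration of the workaround, not of how WRAP /Q behaves internally; the function name is hypothetical.

```python
def double_apostrophes_to_quote(text: str) -> str:
    """Replace each pair of consecutive ASCII apostrophes with one ASCII double quote.

    Mirrors the sed substitution s/''/"/g: pairs are matched left to right
    without overlapping, so a run of three apostrophes becomes a double
    quote followed by one leftover apostrophe.
    """
    return text.replace("''", '"')


if __name__ == "__main__":
    print(double_apostrophes_to_quote("''This is a sample quotation,'' wrote Dave."))
```

The result can then be passed to WRAP /Q, which sees ordinary ASCII double quotes.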
 
Thanks, Charles.

It also occurs to me that there's a writing style which uses accent grave marks (`) as opening quotation marks and apostrophes (') as the corresponding closing quotation marks, ``like this'' or `this'. They are used singly and doubly. Those aren't appearing as part of my current problem, but, while you're there, you might look at resolving those, too.

Thanks,
 
Please beat on it. There are a bunch of internal changes for v17 which could use testing. The handling of internet files, for one thing, had to be hacked up a bit.

You can disable the new behaviors via environment variables.
 
Already testing. Here are four cases:
Code:
``This is a normal quotation.''  --> "This is a normal quotation." 
`This is an inner quotation.'  --> unchanged, should probably be changed to leading and trailing apostrophes (and then reprocessed).
``Dave wrote, `This is an inner quotation,' just above.''  -->   Outside marks as in first example, inners as in second.
``And then he wrote, `This is an edge-case, terminating both quotations together.'''  -->   "And then ... `This...together.'''  (3 apostrophes at end!)
 

I'm not seeing that. It all gets handled properly here, except for the awful triple apostrophe. (That one I'll have to think about; don't know whether I'll ever be able to handle it smartly.)

Can you post a sample text file and the exact command you are typing?
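
On the triple apostrophe: one possible way to disambiguate it is to remember which quotations are currently open and close the innermost one first. The sketch below is an assumption about how that could be done, not a description of WRAP's code. The function name and the rules it applies (a pair of grave accents opens a double quotation, a lone one opens a single quotation, and an apostrophe between two letters or digits is left as an apostrophe) are illustrative choices.

```python
OPEN_DOUBLE, CLOSE_DOUBLE = "\u201c", "\u201d"
OPEN_SINGLE, CLOSE_SINGLE = "\u2018", "\u2019"


def curl_nested_quotes(text: str) -> str:
    """Convert ``double'' and `single' quotations to curly quotes.

    A stack of open quotations decides how each run of apostrophes is split.
    """
    out = []
    stack = []  # open quotations, innermost last: "d" (double) or "s" (single)
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch not in "`'":
            out.append(ch)
            i += 1
            continue
        # Measure the run of identical marks starting at position i.
        j = i
        while j < n and text[j] == ch:
            j += 1
        run = j - i
        if ch == "`":
            # Greedy: each pair opens a double quotation, a leftover opens a single.
            while run >= 2:
                out.append(OPEN_DOUBLE)
                stack.append("d")
                run -= 2
            if run:
                out.append(OPEN_SINGLE)
                stack.append("s")
        else:
            # A lone apostrophe between letters or digits (don't) is not a close quote.
            inside_word = (run == 1 and i > 0 and j < n
                           and text[i - 1].isalnum() and text[j].isalnum())
            while run:
                if stack and stack[-1] == "s" and not inside_word:
                    out.append(CLOSE_SINGLE)
                    stack.pop()
                    run -= 1
                elif stack and stack[-1] == "d" and run >= 2:
                    out.append(CLOSE_DOUBLE)
                    stack.pop()
                    run -= 2
                else:
                    out.append(CLOSE_SINGLE)  # plain apostrophe
                    run -= 1
        i = j
    return "".join(out)


if __name__ == "__main__":
    print(curl_nested_quotes("``And then he wrote, `This is an edge-case.'''"))
```

With these rules the three trailing apostrophes are read as a single close followed by a double close. Text that uses two apostrophes as an opening mark is outside the scope of this sketch and would need the preprocessing step first.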
 
Sure, save the following into a file
Code:
Troublesome quote combinations like those found mostly in unix documentation.

Case 1.   Simple quotation


``This is a unix-like quotation.''


Case 2.   Simple quotation within a quotation, but only the inner
quotation is shown here

`This is a single-quoted simple case.'


Case 3.   Quotation strictly within a quotation, at neither boundary

``The author wrote, `This is a simple case,' but he was wrong.''


Case 4.   Inner quotation ends at end of outer quotation

``The author wrote, `This is an edge case.'''

Case 5.   Like above, but the inner quote begins the outer quote


```This is awful,' he wrote.''
Okay, so I'm not very inventive.

Really, only cases 1, 2, and 3 are likely. Case 1 works right.

Here's the result I get:
Code:
WRAP /Q X.TXT > Y.TXT
Code:
Troublesome quote combinations like those found mostly in unix documentation.

Case 1.   Simple quotation


"This is a unix-like quotation."


Case 2.   Simple quotation within a quotation, but only the inner quotation is shown here

`This is a single-quoted simple case.'


Case 3.   Quotation strictly within a quotation, at neither boundary

"The author wrote, `This is a simple case,' but he was wrong."


Case 4.   Inner quotation ends at end of outer quotation

"The author wrote, `This is an edge case.'''

Case 5.   Like above, but the inner quote begins the outer quote


```This is awful,' he wrote."
 
Try it again with OPTION //UNICODEOUTPUT=YES ....

(I'd kind of like to see OEM output disappear altogether, and //UNICODEOUTPUT with it. But as long as CMD.EXE compatibility remains a selling point, it ain't gonna happen.)
 
Okay! With OPTION //UNICODEOUTPUT=YES, cases 1, 2, and 3 are all correct. Thanks!

Cases 4 and 5 look exactly the same as they do without UNICODEOUTPUT, but I'm not concerned about those.
It's very, very unlikely that an inner quotation would show up at the beginning or end of an outer quotation in my application.

Unfortunately, I'm using WRAP to process the quotes and margins on many small files and appending them to a larger file whose main contents are ASCII. In order to use this version of WRAP, I'll need to recode the entire process to use Unicode output.

I do appreciate the fix.
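
One way to limit the recoding might be to convert each small output file as it is appended. The sketch below assumes, without confirmation from this thread, that the file WRAP writes under //UNICODEOUTPUT=YES is UTF-16, and that the larger file can be treated as UTF-8 (its existing ASCII contents are already valid UTF-8). The function name and file names are hypothetical.

```python
def append_utf16_as_utf8(src_path: str, dest_path: str) -> None:
    """Append the text of a UTF-16 file to dest_path, encoded as UTF-8.

    Assumption: src_path is UTF-16. Python's "utf-16" codec honours a
    byte-order mark if one is present. newline="" keeps the original
    line endings untouched on both the read and the write.
    """
    with open(src_path, "r", encoding="utf-16", newline="") as src:
        text = src.read()
    with open(dest_path, "a", encoding="utf-8", newline="") as dest:
        dest.write(text)


if __name__ == "__main__":
    import sys
    # Hypothetical usage: python append_utf16.py Y.TXT BIG.TXT
    append_utf16_as_utf8(sys.argv[1], sys.argv[2])
```

The combined file is no longer pure ASCII after this, since curly quotes have no ASCII representation, so anything that reads it downstream would need to accept UTF-8.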
 