Feature request for TEXTUTILS command WRAP

May 29, 2008
529
3
Groton, CT
#1
Charles,

The handling of quotes described here:
Code:
Quotes replacement: /Q causes WRAP to replace generic ASCII apostrophes and quote marks ( ' and " ) with Unicode open and close 
quote marks ( ‘ ’ and  “ ” ). The new quote marks may or may not look different from the originals, depending on how they are 
displayed and the font used. If the output is displayed in a non-Unicode font, the curly quotes will be lost or mangled.
works very well, but could use an enhancement.

There are text sources which use two consecutive ASCII apostrophes as a substitute for double quotation marks; e.g.,
Code:
''This is a sample quotation,'' wrote Dave.
It would be useful if /Q would treat two consecutive ASCII apostrophes as if they were a single occurrence of the generic ASCII [double] quotation mark.

(For now, I'm preprocessing the text with
Code:
sed -e `s/''/\"/g`
.)

Tnx,
Dave C.
 
#3
Thanks, Charles.

It also occurs to me that there's a writing style that uses grave accents (`) as opening quotation marks and apostrophes (') as the corresponding closing quotation marks, ``like this'' or `this'. They are used both singly and doubly. Those aren't part of my current problem, but while you're there, you might look at handling those, too.
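As a stopgap along the lines of the sed preprocessing above, the grave-accent style could be folded into generic ASCII quotes before the text reaches WRAP /Q. The following is only a sketch of that idea, not anything WRAP does itself; the function name `normalize_to_ascii_quotes` is made up for illustration.

```python
def normalize_to_ascii_quotes(text: str) -> str:
    """Fold ``...'' and `...' quoting into generic ASCII quote marks."""
    # Handle the doubled forms first, so that a pair is not consumed
    # one character at a time by the single-character rule below.
    text = text.replace("``", '"').replace("''", '"')
    # Any grave accent still left is a lone opening single quote.
    return text.replace("`", "'")


sample = "``Dave wrote, `This is an inner quotation,' just above.''"
print(normalize_to_ascii_quotes(sample))
# prints: "Dave wrote, 'This is an inner quotation,' just above."
```

Because the replacement scans left to right, a run of three apostrophes comes out as a double quote followed by an apostrophe, which is the wrong order when an inner quotation closes together with the outer one.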

Thanks,
 
#6
Already testing. Here are four cases:
Code:
``This is a normal quotation.''  --> "This is a normal quotation." 
`This is an inner quotation.'  --> unchanged, should probably be changed to leading and trailing apostrophes (and then reprocessed).
``Dave wrote, `This is an inner quotation,' just above.''  -->   Outside marks as in first example, inners as in second.
``And then he wrote, `This is an edge-case, terminating both quotations together.'''  -->   "And then ... `This...together.'''  (3 apostrophes at end!)
 

Charles Dye

Super Moderator
Staff member
May 20, 2008
3,575
46
Albuquerque, NM
prospero.unm.edu
#7
Already testing. Here are four cases:
Code:
``This is a normal quotation.''  --> "This is a normal quotation."
`This is an inner quotation.'  --> unchanged, should probably be changed to leading and trailing apostrophes (and then reprocessed).
``Dave wrote, `This is an inner quotation,' just above.''  -->   Outside marks as in first example, inners as in second.
``And then he wrote, `This is an edge-case, terminating both quotations together.'''  -->   "And then ... `This...together.'''  (3 apostrophes at end!)
I'm not seeing that. It all gets handled properly here, except for the awful triple apostrophe. (That one I'll have to think about; don't know whether I'll ever be able to handle it smartly.)
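One possible way to approach the triple apostrophe is to remember which quotations are currently open and close the innermost one first. The sketch below illustrates that idea in Python; it is an assumption about how the ambiguity could be resolved, not a description of how WRAP works, and `convert_tex_quotes` is a hypothetical name.

```python
import re


def convert_tex_quotes(text: str) -> str:
    """Convert `...' and ``...'' quoting to curly quotes, tracking open quotes."""
    out = []
    stack = []  # open quotations, innermost last: "s" = single, "d" = double
    # Split the text into runs of grave accents, runs of apostrophes, and the rest.
    for token in re.findall(r"`+|'+|[^`']+", text):
        n = len(token)
        if token[0] == "`":
            # Opening run: take pairs first, so ``` reads as `` followed by `.
            while n >= 2:
                out.append("\u201c")
                stack.append("d")
                n -= 2
            if n == 1:
                out.append("\u2018")
                stack.append("s")
        elif token[0] == "'":
            while n > 0:
                if stack and stack[-1] == "s":
                    # Innermost open quotation is single: one apostrophe closes it.
                    out.append("\u2019")
                    stack.pop()
                    n -= 1
                elif stack and stack[-1] == "d" and n >= 2:
                    # Innermost open quotation is double: a pair closes it.
                    out.append("\u201d")
                    stack.pop()
                    n -= 2
                else:
                    # Nothing suitable is open: treat it as a plain apostrophe.
                    out.append("\u2019")
                    n -= 1
        else:
            out.append(token)
    return "".join(out)


print(convert_tex_quotes("``The author wrote, `This is an edge case.'''"))
```

A contraction inside a single-quoted passage closes the quotation too early on the stack, but the output is unaffected, because a closing single quote and a typographic apostrophe are the same character (U+2019).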

Can you post a sample text file and the exact command you are typing?
 
#8
Sure, save the following into a file
Code:
Troublesome quote combinations like those found mostly in unix documentation.

Case 1.   Simple quotation


``This is a unix-like quotation.''


Case 2.   Simple quotation within a quotation, but only the inner
quotation is shown here

`This is a single-quoted simple case.'


Case 3.   Quotation strictly within a quotation, at neither boundary

``The author wrote, `This is a simple case,' but he was wrong.''


Case 4.   Inner quotation ends at end of outer quotation

``The author wrote, `This is an edge case.'''

Case 5.   Like above, but the inner quote begins the outer quote


```This is awful,' he wrote.''
Okay, so I'm not very inventive.

Really, only cases 1, 2, and 3 are likely. Case 1 works right.

Here's the result I get:
Code:
WRAP /Q X.TXT > Y.TXT
Code:
Troublesome quote combinations like those found mostly in unix documentation.

Case 1.   Simple quotation


"This is a unix-like quotation."


Case 2.   Simple quotation within a quotation, but only the inner quotation is shown here

`This is a single-quoted simple case.'


Case 3.   Quotation strictly within a quotation, at neither boundary

"The author wrote, `This is a simple case,' but he was wrong."


Case 4.   Inner quotation ends at end of outer quotation

"The author wrote, `This is an edge case.'''

Case 5.   Like above, but the inner quote begins the outer quote


```This is awful,' he wrote."
 
#10
Okay! With OPTION //UNICODEOUTPUT=YES, cases 1, 2, and 3 are all correct. Thanks!

Cases 4 and 5 look exactly the same as they do without UNICODEOUTPUT, but I'm not concerned about those.
It's very, very unlikely that an inner quotation would show up at the beginning or end of an outer quotation in my application.

Unfortunately, I'm using WRAP to process the quotes and margins on many small files, and appending them
to a larger file, whose main contents are ASCII. In order to use this version of WRAP, I'll need to recode the entire process
to use Unicode output.
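One way the recoding might be kept small: plain ASCII is already valid UTF-8, so if each processed piece is re-encoded as UTF-8 before it is appended, the existing ASCII contents of the large file need no conversion. The sketch below assumes the redirected output arrives as UTF-16, which has not been confirmed in this thread; the function name and the `piece_encoding` parameter are made up for illustration.

```python
from pathlib import Path


def append_as_utf8(piece: Path, target: Path, piece_encoding: str = "utf-16") -> None:
    """Append the text of `piece` to `target`, re-encoded as UTF-8."""
    # newline="" keeps the original line endings instead of translating them.
    with piece.open("r", encoding=piece_encoding, newline="") as src:
        text = src.read()
    # Existing ASCII bytes in the target are untouched; only new text is added.
    with target.open("a", encoding="utf-8", newline="") as out:
        out.write(text)
```

The combined file is then UTF-8 rather than pure ASCII, so whatever reads it later has to accept UTF-8 for the curly quotes to survive.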

I do appreciate the fix.