Apostrophes in non-ASCII names (was: A proposed solution for descriptions)

Mark Davis mark.davis at icu-project.org
Thu Jun 29 16:40:27 CEST 2006


Some people have carried this discussion over to the Unicode list. I'll
repeat what I wrote there:

Early in the development of Unicode, we considered separating punctuation
characters by function. Period has the functions of abbreviation,
sentence-termination, decimal separation, thousands separation, and others,
depending on language.

However, we decided against this path. Depending on users to type in N
different, visually identical characters for different functions is doomed to
failure.

If you are working in a closed environment, where you have control over all
the text internally and on entry, and you can enforce the policy you come up
with, it might make sense to try to distinguish them. However, as a broader
policy it is hard enough to get people to distinguish look-alike characters
that are as different as a letter and a punctuation mark.

(For that matter, my personal opinion is that it was a mistake to try to
distinguish between U+02BC and U+2019. We should have just accepted the
reality that the odds of users correctly distinguishing them were extremely
low, and just accepted a unified character with a somewhat ambiguous status
vis-a-vis letter/punctuationhood.)
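
To make the letter/punctuation split concrete, here is a minimal sketch that
asks Python's standard unicodedata module how it classifies the look-alike
characters. The describe() helper and the choice to include U+0027 alongside
the two characters above are my own, for illustration only; the names and
categories come from whatever Unicode data ships with your Python.

```python
import unicodedata

# The apostrophe look-alikes: the ASCII one plus the two discussed above.
CANDIDATES = {
    "U+0027": "\u0027",
    "U+02BC": "\u02bc",
    "U+2019": "\u2019",
}


def describe(ch):
    """Return (character name, general category) for one character."""
    return unicodedata.name(ch), unicodedata.category(ch)


for label, ch in CANDIDATES.items():
    name, category = describe(ch)
    print(label, category, name)
```

U+02BC comes back as a modifier letter (Lm), while U+2019 and U+0027 come back
as punctuation (Pf and Po), even though most fonts draw U+02BC and U+2019
identically.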

On 6/29/06, Caoimhin O Donnaile <caoimhin at smo.uhi.ac.uk> wrote:
>
> Karl said:
>
> > Yes, it will. U+2019 *is* the preferred character for the apostrophe
> > used for contraction and possession. Let me quote the relevant sections
> >      [...]
> >   Punctuation Apostrophe.
> >   U+2019 RIGHT SINGLE QUOTATION MARK is preferred where the character
> >   is to represent a punctuation mark, as for contractions: "We've been
> >   here before."
> >      [...]
> > The semantics of U+2019 are therefore context-dependent. For example, if
> > surrounded by letters or digits on both sides, it behaves as an in-text
> > punctuation character and does not separate words or lines.
>
> I am quite shocked to learn this.  Ciaran is not alone.  It's not at
> all what I would have expected from Unicode.  But if that's the
> way it is...
>
> It looks as if I should avoid using single quotation marks to quote
> text in the future, but rather stick to double quotation marks.
> Or maybe go for the French « »  :-)
>
> Caoimhín