U+303B VERTICAL IDEOGRAPHIC ITERATION MARK

Mark Davis ⌛ mark at macchiato.com
Thu Jul 16 02:42:19 CEST 2009


You made good points; I agree.

Mark


On Wed, Jul 15, 2009 at 16:20, Kenneth Whistler <kenw at sybase.com> wrote:

> Mark suggested:
>
> > If so, we could do this by changing Tables 2.9 to be:
> >
> > 2.9.  Other Exclusions by Property (I)
> >    I: Hangul_Syllable_Type(cp) is in {L, V, T} or
> >       (General_Category(cp) is Lm and Block(cp) = CJK_Symbols_And_Punctuation)
> >
> >    This category consists of all conjoining Hangul Jamo (Leading Jamo,
> >    Vowel Jamo, and Trailing Jamo), plus exclusion of Letter Modifiers
> >    in the CJK_Symbols_And_Punctuation block
> >
> >    Elimination of conjoining Hangul Jamos from the set of PVALID
> >    characters results in restricting the set of Korean PVALID characters
> >    just to preformed, modern Hangul syllable characters.  Old Hangul
> >    syllables, which must be spelled with sequences of conjoining Hangul
> >    Jamos, are not PVALID for IDNs.
> >
> >    These particular letter modifiers are not required in normal
> >    presentation.
>
> I oppose that suggestion.
>
> 1. It dilutes the intent of 2.9, which is currently just focussed
>   on removing Hangul jamo, and turns it into another grab-bag
>   exception category. That is what 2.6 Exceptions (F) is for.
>
> 2. By seeking to provide a property derivation that just happens
>   to fit the list of exceptions in question, it essentially hides
>   the fact that this is none other than an exception list
>   masquerading as a principled filtering by properties.
>   You could do the same thing for everything else in the
>   2.6 Exceptions (F) list.
>
>   The Arabic-Indic digits (both sets):
>
>   (General_Category(cp) = Nd and Block(cp) = Arabic)
>
>   The geresh and gershayim:
>
>   (General_Category(cp) = Po and Block(cp) = Hebrew and Word_Break(cp) = ALetter)
>
>   U+00B7 MIDDLE DOT:
>
>   (General_Category(cp) = Po and Block(cp) = Latin_1 and Word_Break(cp) = MidLetter)
>
>   And so on.
>
> 3. Building such derivations into the rules list in idnabis-tables.txt
>   might seem to be an elegant way to avoid listing exceptions and to
>   gain extensibility at the same time. However, in this case, it does
>   neither.
>
>   a. First of all, the block in question is filled already. No other
>      characters can ever be added to it. So you are gaining no generality
>      whatsoever by writing a "rule" that is restricted to an already
>      closed set.
>
>   b. As opposed to a fixed exception list, you actually *open* the document
>      to a problem should the UTC ever decide that the General_Category of
>      any *other* character in that block should be changed to gc=Lm.
>      Suddenly, by a side effect that nobody will remember at the time,
>      and which will only be reported much later after the fact, that
>      decision could tip a PVALID character into the DISALLOWED
>      category, by virtue of a rule too clever by half.
>
> So just fix the exception list to take care of U+303B.
>
> Then you're done with the topic and can move on.
>
> --Ken
>
>
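
Ken's point 3b, contrasting a property-derived rule with a fixed exception
list, can be illustrated with a minimal Python sketch. Nothing here comes from
idnabis-tables.txt: the function names, the one-entry exception set, and the
simulated General_Category change for U+3006 are assumptions chosen purely for
illustration. The block range U+3000..U+303F is hard-coded because Python's
unicodedata module does not expose the Block property.

```python
import unicodedata

# CJK Symbols and Punctuation block, hard-coded as U+3000..U+303F.
CJK_SYMBOLS_AND_PUNCTUATION = range(0x3000, 0x3040)

# Hypothetical fixed exception list containing only the character under discussion.
EXCEPTION_LIST = frozenset({0x303B})


def general_category(cp, overrides=None):
    """Return gc for cp; `overrides` simulates a future UTC property change."""
    if overrides and cp in overrides:
        return overrides[cp]
    return unicodedata.category(chr(cp))


def rule_disallows(cp, overrides=None):
    """Property-derived rule: gc=Lm and Block=CJK_Symbols_And_Punctuation."""
    return (cp in CJK_SYMBOLS_AND_PUNCTUATION
            and general_category(cp, overrides) == "Lm")


def list_disallows(cp):
    """Fixed exception list: unaffected by later property changes."""
    return cp in EXCEPTION_LIST


if __name__ == "__main__":
    # Everything the rule matches with the Unicode data bundled in this Python.
    for cp in CJK_SYMBOLS_AND_PUNCTUATION:
        if rule_disallows(cp):
            print(f"U+{cp:04X} {unicodedata.name(chr(cp))}")

    # Hypothetical: suppose U+3006 were one day reclassified as gc=Lm.
    changed = {0x3006: "Lm"}
    print("rule, before change:", rule_disallows(0x3006))
    print("rule, after change: ", rule_disallows(0x3006, changed))
    print("list, either way:   ", list_disallows(0x3006))
```

Both approaches agree on U+303B. Only the rule's answer for another character
in the block can flip when a property value changes, which is the side effect
described in 3b. The loop also shows every character the rule would match
today, not just the one that prompted it.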