U+303B VERTICAL IDEOGRAPHIC ITERATION MARK
Mark Davis ⌛
mark at macchiato.com
Thu Jul 16 02:42:19 CEST 2009
You made good points; I agree.
Mark
On Wed, Jul 15, 2009 at 16:20, Kenneth Whistler <kenw at sybase.com> wrote:
> Mark suggested:
>
> > If so, we could do this by changing Tables 2.9 to be:
> >
> > 2.9. Other Exclusions by Property (I)
> > I: Hangul_Syllable_Type(cp) is in {L, V, T} or
> > (General_Category(cp) is Lm and Block(cp) = CJK_Symbols_And_Punctuation)
> >
> > This category consists of all conjoining Hangul Jamo (Leading Jamo,
> > Vowel Jamo, and Trailing Jamo), plus exclusion of Letter Modifiers in
> > the CJK_Symbols_And_Punctuation block.
> >
> > Elimination of conjoining Hangul Jamos from the set of PVALID
> > characters results in restricting the set of Korean PVALID characters
> > just to preformed, modern Hangul syllable characters. Old Hangul
> > syllables, which must be spelled with sequences of conjoining Hangul
> > Jamos, are not PVALID for IDNs.
> >
> > These particular letter modifiers are not required in normal
> > presentation.
>
> I oppose that suggestion.
>
> 1. It dilutes the intent of 2.9, which is currently just focussed
> on removing Hangul jamo, and turns it into another grab-bag
> exception category. That is what 2.6 Exceptions (F) is for.
>
> 2. By seeking to provide a property derivation that just happens
> to fit the list of exceptions in question, it essentially hides
> the fact that this is none other than an exception list
> masquerading as a principled filtering by properties.
> You could do the same thing for everything else in the
> 2.6 Exceptions (F) list.
>
> The Arabic-Indic digits (both sets):
>
> (General_Category(cp) = Nd and Block(cp) = Arabic)
>
> The geresh and gershayim:
>
> (General_Category(cp) = Po and Block(cp) = Hebrew and Word_Break(cp) = ALetter)
>
> U+00B7 MIDDLE DOT:
>
> (General_Category(cp) = Po and Block(cp) = Latin_1 and Word_Break(cp) = MidLetter)
>
> And so on.
>
> 3. Building such derivations into the rules list in idnabis-tables.txt
> might seem to be an elegant way to avoid listing exceptions and to gain
> extensibility at the same time. However, in this case, it does
> neither.
>
> a. First of all, the block in question is filled already. No other
> characters can ever be added to it. So you are gaining no generality
> whatsoever by writing a "rule" that is restricted to an already
> closed set.
>
> b. As opposed to a fixed exception list, you actually *open* the document
> to a problem should the UTC ever decide that the General_Category of
> any *other* character in that block should be changed to gc=Lm.
> Suddenly, by a side effect that nobody will remember at the time,
> and which will only be reported much later after the fact, that
> decision could tip a PVALID character
> into the DISALLOWED category, by virtue of a rule too clever by half.
>
> So just fix the exception list to take care of U+303B.
>
> Then you're done with the topic and can move on.
>
> --Ken
>
>
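Ken's point 3b can be illustrated with a rough Python sketch. It is not part of idnabis-tables; it only shows how a rule of the form "gc = Lm and Block = CJK_Symbols_And_Punctuation" silently picks up a new character if a General_Category value ever changes, whereas a fixed exception list would not. The block range is hard-coded because Python's unicodedata module does not expose the Block property, and the "reclassified" character (U+3006, chosen arbitrarily) is purely hypothetical.

```python
import unicodedata

# CJK Symbols and Punctuation block, U+3000..U+303F. Hard-coded here because
# the standard unicodedata module has no Block lookup.
CJK_SYMBOLS_AND_PUNCTUATION = range(0x3000, 0x3040)


def derived_by_rule(category_of=unicodedata.category):
    """Code points matched by: gc = Lm and Block = CJK_Symbols_And_Punctuation."""
    return {
        cp for cp in CJK_SYMBOLS_AND_PUNCTUATION
        if category_of(chr(cp)) == "Lm"
    }


# What the rule selects with the Unicode data shipped with this Python.
today = derived_by_rule()


def hypothetical_category(ch):
    """Pretend one other character in the block were changed to gc=Lm.

    U+3006 is an arbitrary stand-in for illustration only; nothing in the
    thread suggests such a change is planned.
    """
    return "Lm" if ch == "\u3006" else unicodedata.category(ch)


# What the same rule would select after the hypothetical change.
later = derived_by_rule(hypothetical_category)

# Characters that changed status as a side effect of the rule, with no edit
# to the rule text itself.
print(sorted(f"U+{cp:04X}" for cp in later - today))
```

A fixed exception list corresponds to freezing `today` as a literal set: the hypothetical reclassification would then leave it unchanged.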