You made good points; I agree.<br><br clear="all">Mark<br>

<br><br><div class="gmail_quote">On Wed, Jul 15, 2009 at 16:20, Kenneth Whistler <span dir="ltr">&lt;<a href="mailto:kenw@sybase.com">kenw@sybase.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Mark suggested:<br>

<div class="im"><br>

&gt; If so, we could do this by changing Tables 2.9 to be:<br>

&gt;<br>

&gt; 2.9.  Other Exclusions by Property (I)<br>

&gt;    I: Hangul_Syllable_Type(cp) is in {L, V, T} or<br>

&gt;       (General_Category(cp) is Lm and Block(cp) = CJK_Symbols_And_Punctuation)<br>

&gt;<br>

&gt;    This category consists of all conjoining Hangul Jamo (Leading Jamo,<br>

&gt;    Vowel Jamo, and Trailing Jamo), plus exclusion of Letter Modifiers in the<br>

&gt;    CJK_Symbols_And_Punctuation block<br>

&gt;<br>

&gt;    Elimination of conjoining Hangul Jamos from the set of PVALID<br>

&gt;    characters results in restricting the set of Korean PVALID characters<br>

&gt;    just to preformed, modern Hangul syllable characters.  Old Hangul<br>

&gt;    syllables, which must be spelled with sequences of conjoining Hangul<br>

&gt;    Jamos, are not PVALID for IDNs.<br>

&gt;<br>

&gt;    These particular letter modifiers are not required in normal presentation.<br>

<br>

</div>I oppose that suggestion.<br>

<br>

1. It dilutes the intent of 2.9, which is currently just focussed<br>

   on removing Hangul jamo, and turns it into another grab-bag<br>

   exception category. That is what 2.6 Exceptions (F) is for.<br>

<br>

2. By seeking to provide a property derivation that just happens<br>

   to fit the list of exceptions in question, it essentially hides<br>

   the fact that this is none other than an exception list<br>

   masquerading as a principled filtering by properties.<br>

   You could do the same thing for everything else in the<br>

   2.6 Exceptions (F) list.<br>

<br>

   The Arabic-Indic digits (both sets):<br>

<br>

   (General_Category(cp) = Nd and Block(cp) = Arabic)<br>

<br>

   The geresh and gershayim:<br>

<br>

   (General_Category(cp) = Po and Block(cp) = Hebrew and Word_Break(cp) = ALetter)<br>

<br>

   U+00B7 MIDDLE DOT:<br>

<br>

   (General_Category(cp) = Po and Block(cp) = Latin_1 and Word_Break(cp) = MidLetter)<br>

<br>

   And so on.<br>

<br>

3. Building such derivations into the rules list in idnabis-tables.txt might<br>

   seem to be an elegant way to avoid listing exceptions and to gain<br>

   extensibility at the same time. However, in this case, it does<br>

   neither.<br>

<br>

   a. First of all, the block in question is filled already. No other<br>

      characters can ever be added to it. So you are gaining no generality<br>

      whatsoever by writing a &quot;rule&quot; that is restricted to an already<br>

      closed set.<br>

<br>

   b. As opposed to a fixed exception list, you actually *open* the document<br>

      to a problem should the UTC ever decide that the General_Category of<br>

      any *other* character in that block should be changed to gc=Lm.<br>

      Suddenly, by a side effect that nobody will remember at the time,<br>

      and which will only be reported much later after the fact, that<br>

      decision could tip a PVALID character<br>

      into the DISALLOWED category, by virtue of a rule too clever by half.<br>

<br>

So just fix the exception list to take care of U+303B.<br>

<br>

Then you&#39;re done with the topic and can move on.<br>

<br>

--Ken<br>

<br>

</blockquote></div><br>
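Ken's point 3b can be illustrated with a minimal Python sketch. It compares a property-derived rule (gc=Lm within the CJK_Symbols_And_Punctuation block, U+3000..U+303F) against a fixed exception list. The function names, the one-entry exception list, and the reclassification of U+3006 are hypothetical and chosen only for illustration; the actual rules live in idnabis-tables, and results from `unicodedata` depend on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

# CJK Symbols and Punctuation block: U+3000..U+303F.
CJK_SYMBOLS_AND_PUNCTUATION = range(0x3000, 0x3040)

# Illustrative fixed exception list (hypothetical, not the draft's real list).
FIXED_EXCEPTIONS = frozenset({0x303B})


def rule_matches(cp, gc_overrides=None):
    """Property-derived rule: General_Category is Lm and cp is in the block.

    gc_overrides is a hypothetical hook that simulates a future change to a
    character's General_Category.
    """
    overrides = gc_overrides or {}
    gc = overrides.get(cp, unicodedata.category(chr(cp)))
    return gc == "Lm" and cp in CJK_SYMBOLS_AND_PUNCTUATION


def list_matches(cp):
    """Fixed exception list: unaffected by later property changes."""
    return cp in FIXED_EXCEPTIONS


# Characters the rule catches with today's property values.
before = {cp for cp in CJK_SYMBOLS_AND_PUNCTUATION if rule_matches(cp)}

# Hypothetical scenario: U+3006 (currently gc=Lo) is reclassified as Lm.
hypothetical_change = {0x3006: "Lm"}
after = {
    cp
    for cp in CJK_SYMBOLS_AND_PUNCTUATION
    if rule_matches(cp, hypothetical_change)
}

# Code points silently swept up by the rule after the property change.
newly_caught = after - before

if __name__ == "__main__":
    print("rule catches today:", [f"U+{cp:04X}" for cp in sorted(before)])
    print("newly caught:", [f"U+{cp:04X}" for cp in sorted(newly_caught)])
    print("fixed list catches U+3006:", list_matches(0x3006))
```

Under the simulated change the rule starts matching U+3006 as a side effect, while the fixed list still matches only what was explicitly enumerated.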