U+303B VERTICAL IDEOGRAPHIC ITERATION MARK

Mark Davis ⌛ mark at macchiato.com
Thu Jul 16 00:20:04 CEST 2009


Hmmm. To answer that,

   - I'd look first at the characters with "VERTICAL" and "REPEAT" or
   "ITERATION" in their name:
      - http://unicode.org/cldr/utility/list-unicodeset.jsp?a=\p{name%3D%2FVERTICAL.*%28REPEAT|ITERATION%29%2F}
   - Picking one, we can see the properties:
      - http://unicode.org/cldr/utility/character.jsp?a=3032
   - All of them are in the block [:Block=CJK_Symbols_And_Punctuation:], and
   have General_Category Lm (Modifier_Letter):
      - http://unicode.org/cldr/utility/list-unicodeset.jsp?a=[\p{Block%3DCJK_Symbols_And_Punctuation}%26\p{lm}]
   - The additional character in this set is U+3005 ( 々 ) IDEOGRAPHIC
   ITERATION MARK <http://unicode.org/cldr/utility/character.jsp?a=3005>.
   That is called out specially as a context character in 2.6. Exceptions
   (F), so we don't have to worry about it.
   - So we could use the above set.
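
For anyone who wants to reproduce the name query above without the online
utility, here is a rough sketch using Python's standard unicodedata module.
The function name and the regular expression are mine (the expression just
mirrors the one in the URL), and the result depends on the Unicode version
bundled with the interpreter, so newer versions could in principle return
more characters than the six discussed in this thread.

```python
import re
import unicodedata

# Same pattern as in the list-unicodeset query: names containing
# "VERTICAL" followed later by "REPEAT" or "ITERATION".
PATTERN = re.compile(r"VERTICAL.*(REPEAT|ITERATION)")


def matching_code_points():
    """Return all code points whose character name matches PATTERN."""
    found = []
    for cp in range(0x110000):
        # Unnamed and unassigned code points yield the empty default.
        name = unicodedata.name(chr(cp), "")
        if PATTERN.search(name):
            found.append(cp)
    return found


if __name__ == "__main__":
    for cp in matching_code_points():
        ch = chr(cp)
        print(f"U+{cp:04X} {unicodedata.category(ch)} {unicodedata.name(ch)}")
```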

If so, we could do this by changing Section 2.9 of Tables to read:

2.9.  Other Exclusions by Property (I)
   I: Hangul_Syllable_Type(cp) is in {L, V, T} or
      (General_Category(cp) is Lm and Block(cp) = CJK_Symbols_And_Punctuation)

   This category consists of all conjoining Hangul Jamo (Leading Jamo,
   Vowel Jamo, and Trailing Jamo), plus the modifier letters
   (General_Category Lm) in the CJK_Symbols_And_Punctuation block.

   Elimination of conjoining Hangul Jamos from the set of PVALID
   characters results in restricting the set of Korean PVALID characters
   just to preformed, modern Hangul syllable characters.  Old Hangul
   syllables, which must be spelled with sequences of conjoining Hangul
   Jamos, are not PVALID for IDNs.

   These particular modifier letters are not required in normal presentation.
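
As a sanity check on the proposed rule I, a minimal sketch in Python
follows. Python's unicodedata module exposes neither Hangul_Syllable_Type
nor Block, so the Jamo ranges and the block range below are hand-copied
assumptions (taken from recent versions of the Unicode data files, not from
the Tables draft), and the function names are only illustrative. Note that
the check also matches U+3005; that is fine only because Exceptions (F) is
meant to catch U+3005 before this category is consulted.

```python
import unicodedata

# Assumed ranges for Hangul_Syllable_Type in {L, V, T}; hand-copied,
# because unicodedata does not expose this property.
CONJOINING_JAMO_RANGES = (
    (0x1100, 0x115F),  # L: Hangul Jamo, leading consonants
    (0x1160, 0x11A7),  # V: Hangul Jamo, vowels
    (0x11A8, 0x11FF),  # T: Hangul Jamo, trailing consonants
    (0xA960, 0xA97C),  # L: Hangul Jamo Extended-A
    (0xD7B0, 0xD7C6),  # V: Hangul Jamo Extended-B
    (0xD7CB, 0xD7FB),  # T: Hangul Jamo Extended-B
)

# Assumed extent of the CJK_Symbols_And_Punctuation block.
CJK_SYMBOLS_AND_PUNCTUATION = (0x3000, 0x303F)


def is_conjoining_jamo(cp):
    """True if cp falls in one of the assumed L, V or T Jamo ranges."""
    return any(lo <= cp <= hi for lo, hi in CONJOINING_JAMO_RANGES)


def in_category_i(cp):
    """Sketch of rule I: conjoining Jamo, or Lm in the CJK symbols block."""
    if is_conjoining_jamo(cp):
        return True
    lo, hi = CJK_SYMBOLS_AND_PUNCTUATION
    return lo <= cp <= hi and unicodedata.category(chr(cp)) == "Lm"


if __name__ == "__main__":
    lo, hi = CJK_SYMBOLS_AND_PUNCTUATION
    for cp in range(lo, hi + 1):
        if in_category_i(cp):
            print(f"U+{cp:04X} {unicodedata.name(chr(cp), '<unnamed>')}")
```

Keeping the test property-based, rather than listing the six code points,
is what would let it pick up any similar character added to that block
later.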


Mark


On Wed, Jul 15, 2009 at 14:43, Vint Cerf <vint at google.com> wrote:

> If we can possibly avoid char by char rules that would be very helpful
> in dealing with updates to Unicode.
>
> I gather these characters don't quite fall into a category that would
> permit algorithmic treatment?
>
> vint
>
>
> On Jul 15, 2009, at 5:06 PM, Eric Brunner-Williams wrote:
>
> > Kenneth Whistler wrote:
> >> I agree with Wil Tan about this.
> >>
> >> The Vertical Kana repeat marks (3031..3035) make no sense
> >> in IDN's, particularly since they will certainly be forced
> >> into horizontal display contexts, where they could accomplish
> >> nothing but introduce mischief and confusion.
> >>
> >
> > This, "... since they will certainly be forced into horizontal display
> > contexts ..." is just what I meant when attempting to discuss what I
> > called at the time (SF +/- some) the "linearization" of descending
> > script, Arabic script in particular. I'm also concerned about
> > non-Cyrillic Mongolian, which is vertical, for similar reasons.
> >
> > The point I was attempting to make earlier (SF +/-), circa TATWEEL, is
> > that a requirement for single baseline script doesn't arise from a
> > registrar requirement. It may arise elsewhere, but if we can't state
> > where the requirement comes from, it doesn't exist, and where a vertical
> > script uses vertical character sequence conventions, such as iteration
> > marks, the rationale for action can't be "it doesn't work horizontally".
> >
> > I'm not disagreeing with Wil, and possibly Ken, only noting concern
> > about a preference for display contexts.
> >
> > Eric
> >> As for U+303B VERTICAL IDEOGRAPHIC ITERATION MARK, it is
> >> also useless in IDN's, and I don't think it is helpful or
> >> pertinent to clutter up the CONTEXTO rules in the appendix A
> >> listing trying to come up with an appropriate rule for this.
> >>
> >> As for attempting to stand on principle that IDNA should not
> >> categorize characters as DISALLOWED unless shown to be
> >> harmful, we already crossed that bridge a long time ago
> >> by ruling 1000's of symbols as DISALLOWED on general
> >> principle, even though they are less problematical than
> >> these vertical display characters.
> >>
> >> And finally, there is no good reason whatsoever why U+303B
> >> should be CONTEXTO (and have that stand as some kind of
> >> precedent that we can't reverse to make it DISALLOWED
> >> in the table), when all these other, more problematical
> >> vertical form characters are sitting in the table as PVALID
> >> and not CONTEXTO. So from the point of view of
> >> consistency and minimal confusion for implementers,
> >> the best choice is to make the lot DISALLOWED and be done
> >> with it -- *particularly* if we agree that:
> >>
> >> "Sane registry policy everywhere will still probably set this to
> >> registry-disallowed."
> >>
> >> --Ken
> >>
> >>
> >>> I think the following should be DISALLOWED:
> >>>
> >>> U+3031: Lm: VERTICAL KANA REPEAT MARK
> >>> U+3032: Lm: VERTICAL KANA REPEAT WITH VOICED SOUND MARK
> >>> U+3033: Lm: VERTICAL KANA REPEAT MARK UPPER HALF
> >>> U+3034: Lm: VERTICAL KANA REPEAT WITH VOICED SOUND MARK UPPER HALF
> >>> U+3035: Lm: VERTICAL KANA REPEAT MARK LOWER HALF
> >>> U+303B: Lm: VERTICAL IDEOGRAPHIC ITERATION MARK
> >>>
> >>> Mainly because U+3033 looks like a protocol character (forward slash)
> >>> and is thus harmful IMO. Also because this is a group of characters
> >>> with related usage, and because Yoneya-san, Martin Dürst and John
> >>> suggested that they should be disallowed:
> >>>  http://www.alvestrand.no/pipermail/idna-update/2009-April/004398.html
> >>>
> >>> =wil
> >>>
> >>
> >> _______________________________________________
> >> Idna-update mailing list
> >> Idna-update at alvestrand.no
> >> http://www.alvestrand.no/mailman/listinfo/idna-update