Katakana Middle Dot again (Was: tables-06b.txt: A.5, A.6, A.9)

Vint Cerf vint at google.com
Sat Jul 25 17:00:24 CEST 2009


Kent,
it seemed to me that there was general consensus on banning the verticals.

v

On Jul 25, 2009, at 10:22 AM, Kent Karlsson wrote:

> I really dislike treating ASCII letters specially in any way that is
> not **absolutely necessary** technically. This is far from that.
> In addition, romanised Japanese (maybe not what you were considering)
> does use non-ASCII Latin letters.
>
> Just let this one be valid.
>
> I also don't see any particular reason for
> 3031; DISALLOWED # VERTICAL KANA REPEAT MARK
> 3032; DISALLOWED # VERTICAL KANA REPEAT WITH VOICED SOUND MARK
> 303B; DISALLOWED # VERTICAL IDEOGRAPHIC ITERATION MARK
> to be disallowed. They don't seem harmful or in need of special
> exceptions. I would not worry about the "vertical" in the names. And
> maybe, just maybe, they may get a second life and future more common
> use. Why ban them?
>
>    /kent k
>
>
> On 2009-07-25 15.43, "Wil Tan" <wil at cloudregistry.net> wrote:
>
> On Sat, Jul 25, 2009 at 10:54 PM, Patrik Fältström
> <patrik at frobbit.se> wrote:
>
>> On 25 jul 2009, at 14.34, Wil Tan wrote:
>>
>>> I accidentally left out the U+3005..U+3007 that Yoneya-san proposed.
>>> Therefore, #3 should be:
>>>
>>>  3. That the label contains only
>>>     (Han|Hiragana|Katakana|LDH|U+3005..U+3007) + katakana middle dot.
>>>
>>> It's important to note that having these constraints would rule out:
>>
>> What you say is that you want the following rules:
>
> With the caveat that this will invalidate existing registrations and
> prohibit some classes of Japanese labels that people may expect to be
> able to use.
>
>>  True;
>>  if .not. Script(BeforeChar(cp)) .in. (Han|Hiragana|Katakana) then False;
>
> Not just the character before, but there must be at least one
> Han|Hiragana|Katakana character in one of the preceding characters
> before the katakana middle dot. We might need additional constructs in
> the pseudocode grammar for this. In pseudo-functional-style-python:
>
> # PosOfChar() returns the index of the candidate character within the label
> # CPat() returns the code point at the given index
> if not any([Script(CPat(pos)) in (Han, Hiragana, Katakana)
>             for pos in range(0, PosOfChar())]) then False;
>
>>  For each cp:
>>    if .not. (Script(cp) .in. (Han|Hiragana|Katakana) .or.
>>        cp in {U+002D,U+0030..U+0039,U+0061..U+007A,U+3005..U+3007})
>>    then False;
>
> We'll need to include the candidate character itself, yeah?
>
> For each cp:
>    if .not. (Script(cp) .in. (Han|Hiragana|Katakana) .or.
>        cp in {U+002D,U+0030..U+0039,U+0061..U+007A,U+3005..U+3007,U+30FB})
>    then False;
>
> =wil
> =wil
>
>
>
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update
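
For readers who want to try the rule quoted above, here is a minimal
runnable sketch of Wil's two checks in Python. It is an illustration
only, not text from any draft. The function names are hypothetical.
Python's standard library does not expose the Unicode Script property,
so Script() is approximated by character-name prefixes, which is a
loud assumption: it undercounts (for example, iteration marks and
compatibility ideographs are not recognised as Han/Hiragana/Katakana).
The sketch also assumes the rule applies only to labels that actually
contain U+30FB.

```python
import unicodedata

MIDDLE_DOT = "\u30fb"  # KATAKANA MIDDLE DOT

# ASSUMPTION: rough stand-in for Script(cp) in (Han|Hiragana|Katakana).
# A real implementation would consult the Unicode Scripts.txt data.
_NAME_PREFIXES = ("CJK UNIFIED IDEOGRAPH", "HIRAGANA LETTER", "KATAKANA LETTER")


def is_jp_script(ch):
    """Approximate test for Han, Hiragana or Katakana script."""
    return unicodedata.name(ch, "").startswith(_NAME_PREFIXES)


def is_extra_allowed(ch):
    """Code points listed explicitly in the quoted rule, including U+30FB."""
    cp = ord(ch)
    return (cp == 0x002D
            or 0x0030 <= cp <= 0x0039
            or 0x0061 <= cp <= 0x007A
            or 0x3005 <= cp <= 0x3007
            or cp == 0x30FB)


def middle_dot_label_ok(label):
    """Apply both quoted checks to a label (hypothetical helper)."""
    if MIDDLE_DOT not in label:
        return True  # assumed: the contextual rule is not triggered
    # Check 1: every code point is Han/Hiragana/Katakana or explicitly listed.
    if not all(is_jp_script(ch) or is_extra_allowed(ch) for ch in label):
        return False
    # Check 2: at least one Han/Hiragana/Katakana character must occur
    # somewhere before each middle dot, not necessarily right before it.
    seen_jp = False
    for ch in label:
        if ch == MIDDLE_DOT and not seen_jp:
            return False
        seen_jp = seen_jp or is_jp_script(ch)
    return True
```

Under these assumptions a label such as "abc" + U+30FB + "def" is
rejected, because nothing in Han, Hiragana or Katakana precedes the
dot, while a kanji or kana string followed by the dot and ASCII
letters is accepted. That is the class of labels the caveat about
existing registrations refers to.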


