New version, draft-faltstrom-idnabis-tables-02.txt, available
Mark Davis
mark.davis at icu-project.org
Tue Jun 19 21:35:44 CEST 2007
On 6/13/07, Harald Tveit Alvestrand <harald at alvestrand.no> wrote:
>
> The intent of "MAYBE YES" and "MAYBE NO" was:
>
> - ALWAYS: We guarantee that these codepoints will be permitted in IDNs (at
> this level of the standard).
> - NEVER: We guarantee that these codepoints will never be permitted in
> IDNs
...
This is precisely the kind of information that belongs in Patrik's draft.
Without a model of the intended usage, it is impossible to assess the
structure of the document. At least with this kind of information we can
begin to sensibly discuss the pros and cons of the changes over what we had
last December.
However, it needs even more background information and justification.
Without knowing what you and Patrik meant by "Stable", it is also impossible
to assess how scripts should be assigned to that category. After all, Thai
is just as stable as Latin, if not more so, depending on what is meant, yet
you exclude Thai but keep Latin. I was originally guessing that what you
mean by "stable" is "has no characters that are problematic for IDN", but in
that case, Latin itself is not stable because of the potential confusability
between "1" and "l". Or, if confusability is not the issue, then you really
need to have some examples of what exactly are the problems you are trying
to prevent.
That is, you need to provide more information as to what you intend by
"stable", with specific examples of scripts that you consider stable and
why, and scripts that you consider unstable and why. Failing that, I don't
see why every non-archaic script would not be stable, thus obviating the
need for your new 4 distinctions instead of the 3 that we had up until a few
days ago.
Now, I found a hint of what may be at issue in Patrik's further response:
> Secondly, the ALWAYS and NEVER property values are only allowed on
> unproblematic scripts if we have a rough consensus that the
> codepoints will not move from ALWAYS to NEVER or vice versa given the
> algorithm we have to calculate the property value itself.
If *that* notion of stability is all that is being talked about, then it is
very easy, and we have done it with a number of Unicode properties. Define
the following:
- Grandfathered_Always to be all characters that were Always under any
  previous Unicode version back to some base level (say 5.0)
- Grandfathered_Never to be all characters that were Never under any
  previous Unicode version back to some base level (say 5.0)
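One way to read these two definitions, as a rough sketch only: assume that for each Unicode version from the base level onward we have the Always and Never sets as they were computed for that version. The `history` mapping, its layout, and the function name `grandfathered` are hypothetical and exist only for illustration; the code points are toy values.

```python
def grandfathered(history, category, current_version):
    """Union of the `category` sets over all versions before `current_version`.

    `history` maps a version tuple such as (5, 0) to a dict of the form
    {"Always": set_of_code_points, "Never": set_of_code_points}.
    This layout is an assumption made for the sketch; `history` is
    expected to start at the chosen base level (say 5.0).
    """
    result = set()
    for version, sets in history.items():
        if version < current_version:
            result |= sets[category]
    return result


# Toy data: two earlier versions, evaluated from a hypothetical later one.
history = {
    (5, 0): {"Always": {0x61, 0x62}, "Never": {0x2028}},
    (5, 1): {"Always": {0x61}, "Never": {0x2028, 0x2029}},
}
grandfathered_always = grandfathered(history, "Always", (5, 2))
grandfathered_never = grandfathered(history, "Never", (5, 2))
```

Once a character has been Always (or Never) in any earlier version, it stays in the corresponding grandfathered set from then on, which is the stability property in question.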
Then modify the end of my message of June 13 to be:
Then derive the following sets:
- Always = Grandfathered | (Favored & Functional) | Grandfathered_Always
- Maybe_Yes = !Favored & Functional & !(Always | Grandfathered_Never)
- Maybe_Not = (Archaic | (!Favored & !Functional)) & !(Always | Grandfathered_Never)
- Never = everything else
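The derivation above can be sketched with ordinary set operations. This is a minimal illustration, not a definitive implementation: the input sets (Grandfathered, Favored, Functional, Archaic) are taken as given, the function name `derive_sets` and the `universe` parameter are hypothetical, and small integers stand in for code points.

```python
def derive_sets(universe, grandfathered, favored, functional, archaic,
                grandfathered_always, grandfathered_never):
    """Derive Always / Maybe_Yes / Maybe_Not / Never from the input sets.

    `universe` is the set of all code points under consideration; it is
    needed to express the complements (!Favored, !Functional) and
    "everything else".
    """
    always = grandfathered | (favored & functional) | grandfathered_always
    # Both Maybe sets exclude anything in Always or Grandfathered_Never.
    blocked = always | grandfathered_never
    not_favored = universe - favored
    maybe_yes = (not_favored & functional) - blocked
    maybe_not = (archaic | (not_favored - functional)) - blocked
    never = universe - (always | maybe_yes | maybe_not)
    return {
        "Always": always,
        "Maybe_Yes": maybe_yes,
        "Maybe_Not": maybe_not,
        "Never": never,
    }


# Toy example: integers 0..9 stand in for code points.
result = derive_sets(
    universe=set(range(10)),
    grandfathered={0},
    favored={1, 2, 3},
    functional={1, 2, 4, 5},
    archaic={6},
    grandfathered_always={7},
    grandfathered_never={4, 8},
)
```

In this reading, a code point that appears in both Grandfathered_Always and Grandfathered_Never ends up in Always, because Always is computed first and the Maybe sets exclude it. A code point in Grandfathered_Never that is not in Always falls through to Never.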
Harald said a bit later:
> So far, I've seen a lot of hand-wringing about the list of scripts being
> too short, the list of scripts being Europe-centric, the arguments for
> the list of scripts being too weak, the list of scripts including
> worrisome characters (IPA), but I have NOT seen ANY flat statement
> "script XXX is unproblematic and should be included".
My response would be:
Each script other than the archaic ones is no more problematic overall than
the Latin, Greek, and Cyrillic you have already included. Thus if you
include Latin, Greek, and Cyrillic, you should include all of the following:
Arab Arabic
Armn Armenian
Bali Balinese
Beng Bengali
Bopo Bopomofo
Buhd Buhid
Cans Canadian_Aboriginal
Cher Cherokee
Cyrl Cyrillic
Deva Devanagari
Ethi Ethiopic
Geor Georgian
Grek Greek
Gujr Gujarati
Guru Gurmukhi
Hang Hangul
Hani Han
Hebr Hebrew
Hira Hiragana
Kana Katakana
Khmr Khmer
Knda Kannada
Laoo Lao
Latn Latin
Limb Limbu
Mlym Malayalam
Mong Mongolian
Mymr Myanmar
Nkoo Nko
Orya Oriya
Sinh Sinhala
Tale Tai_Le
Talu New_Tai_Lue
Taml Tamil
Telu Telugu
Tfng Tifinagh
Thaa Thaana
Thai Thai
Tibt Tibetan
Yiii Yi