IDNs and Language definitions and labeling (was: RE: New
version, draft-faltstrom-idnabis-tables-02.txt, available)
Debbie Garside
debbie at ictmarketing.co.uk
Fri Jun 22 12:12:17 CEST 2007
Hi John
Thank you for taking the time to formulate such an informative and useful
in-depth response - I cannot tell you how helpful this is! I have read one or
two of the RFCs but it is good to have the "full set" identified. I am
sure others monitoring this forum will think so too.
Wrt RFC4646bis and ISO 639-6, I have been an active participant of
IETF-languages and subsequently LTRU for the past 4 years. See
http://www1.ietf.org/mail-archive/web/ltru/current/msg06477.html and follow
subsequent postings if you are interested in the latest discussion wrt
639-6.
However, I can see that my ideas for allocating Unicode code points to the
writing systems within ISO 639-6 do not meet the objectives of this
particular forum; sorry for muddying the waters but it is a bit of a passion
of mine which means it is quite hard to stop my fingers moving on the
keyboard sometimes :-)
I will take the time to read all the references in the hope that I can
actively participate in this forum in an informed manner - I hope you (and
others) will bear with me as I get up to speed with all this.
Thanks again.
Best wishes
Debbie
> One of the larger difficulties in many of the recent
> discussions of IDNs -- much more so around ICANN than here --
> is that people try to make both policy and technical
> decisions without a thorough understanding of the technology
> itself. I'd recommend, to you and others, a decent tutorial
> on what the DNS is about in terms of design, operations, and
> function [1]. One then needs to understand that IDNs are
> simply a set of conventions and overlay on the DNS itself
> and, at least in general, how that overlay works [2]. And,
> to understand this effort, one should probably start with the
> summaries of issues that have been found (or perceived) with
> the 2003 version of IDNA [3].
> Part of that understanding (but not a quick summary or
> substitute for the above) is that, while many of us are
> intensely interested in identifier and referencing mechanisms
> that are sensitive to language, orthography, and culture at a
> level as fine-grained as the user or applications designer
> thinks appropriate to his or her needs, the DNS is not a good
> vehicle for that sort of work.
>
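The "overlay" idea above can be sketched briefly. This is a minimal illustration, assuming Python's built-in "idna" codec (which implements the 2003 IDNA rules of RFC 3490, not any later revision); the helper names and the "example" domain are placeholders chosen for the sketch.

```python
# Sketch only: Python's built-in "idna" codec implements the 2003 IDNA
# rules (RFC 3490), not any later revision. "example" is a placeholder.

def to_ascii(name: str) -> bytes:
    """Convert each label to the ASCII-compatible form stored in the DNS."""
    return name.encode("idna")

def to_unicode(ace: bytes) -> str:
    """Convert the ASCII form back to the string shown to a user."""
    return ace.decode("idna")

ace = to_ascii("bücher.example")
print(ace)              # b'xn--bcher-kva.example'
print(to_unicode(ace))  # bücher.example
```

Note that the encoded form carries only characters: there is no field in which a language tag could travel with the label.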
> Because an application encountering a "DNS name" [4] has no
> way to obtain information about the language the registrant
> had in mind when registering the mnemonic string, the
> applicability of any language-based information is quite
> limited. We can use information informed by knowledge of a
> language to inform choices of scripts and characters to be
> included, but that use does not require either language
> tagging or a language taxonomy.
> Some registries can, and do, use language information to
> restrict the characters that they permit to occur together in
> a given label. Using language (or script) information that
> way has become a recommended practice, but it is optional,
> different registries can and do handle it differently, and
> the only use for language tagging in that context involves
> communication between registrant and registrar and between
> registrar and registry. There has been no demonstrated need
> for a single international standard in that area and, if
> there were such a need, it would be out of the scope of this
> effort.
>
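A registration-time check of the kind described above might look roughly like the following. It is a sketch under stated assumptions: the single-script policy and both function names are hypothetical, and the script of a character is approximated from the first word of its Unicode character name, since Python's standard library does not expose the Unicode Script property.

```python
import unicodedata

def scripts_in_label(label: str) -> set[str]:
    """Return the approximate scripts used by the letters in a label.

    Assumption: the first word of a character's Unicode name (for example
    "LATIN" or "CYRILLIC") stands in for its script.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():  # digits and hyphens are ignored
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def registry_accepts(label: str) -> bool:
    """Hypothetical policy: accept only labels drawn from a single script."""
    return len(scripts_in_label(label)) <= 1

print(registry_accepts("paypal"))        # True
print(registry_accepts("p\u0430ypal"))   # False: U+0430 is Cyrillic
```

A check like this runs only when a label is registered. Nothing in the DNS records its outcome, so a resolver or application sees the bare label and nothing more.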
> However, all of those uses occur at registration time; at the
> time of name resolution, or of presentation of information to
> the user, there is no language information available at all
> except by heuristic on the strings themselves. Because those
> strings are typically very short (or at least as short as
> registrants who recognize user distaste for typing long
> strings and the opportunities for bad behavior if there are
> typing errors can make them), heuristics that work very well
> with moderate-sized blocks of text will often not work well.
> And, interestingly, one of the heuristics that many people
> believe they can make into a firm and useful rule won't work
> at all in the general DNS case (see discussion in reference [1]).
>
> One final observation before I encourage you to stop reading
> this and start reading the references: A suggestion to base
> any of this work on ISO 639-6 runs into an extra problem that
> you will need to address. The IETF has adopted a system for
> language tagging that is based on ISO 639-1, 639-2, and 15924
> [5]. As you can probably appreciate, we smile at the old saw
> that the nice thing about standards is that there are so many
> of them, but generally try to avoid standardizing or relying
> on redundant, duplicative, or alternate approaches to work
> that is considered finished unless there are strong
> justifications for doing so. I suggest --with the
> understanding that this is just my personal opinion-- that,
> if you want to see 639-6 used in IETF-based protocols
> (presumably including but not limited to IDNA), your first
> step is to write up a set of discussion notes, in
> Internet-Draft form [6], that reviews the differences between
> an approach based on 639-6 and one based on a profile of RFC
> 4646 or its successor and that discusses the circumstances in
> which one would be more usefully applicable than the other.
>
> best wishes and happy reading,
> john
>
>
> -----------
>
> [1] A well-vetted and reasonably balanced tutorial, oriented
> toward policy makers rather than deep understanding of the
> technology, is a US National Research Council Report,
> _Signposts in Cyberspace: The Domain Name System and Internet
> Navigation_,
> http://books.nap.edu/catalog.php?record_id=11258. For a
> deeper understanding, the core DNS specifications themselves are
> RFCs 1034 and 1035. (RFCs can be obtained from a number of
> locations. The official location permits retrieving them by
> substituting the RFC number for NNNN in
> ftp://ftp.rfc-editor.org/in-notes/rfcNNNN.txt)
>
> [2] RFC 3490, 3491, 3492, and 3454. RFCs can be obtained
> from a number of locations. The official location permits
> retrieving them by substituting the RFC number for NNNN in
> ftp://ftp.rfc-editor.org/in-notes/rfcNNNN.txt
> There are also several tutorials floating around, but they
> tend to be addressed to a user-level understanding rather
> than the understanding needed to discuss the protocol issues
> intelligently. Slideware for one of them (now somewhat dated)
> is at http://ws.edu.isoc.org/workshops/2004/ICANN-KL/
>
> [3] RFC 4690 and
> http://www.ietf.org/internet-drafts/draft-klensin-idnabis-issues-01.txt.
> These two documents are complementary; neither can be
> adequately understood without the other. The second one is
> likely to be replaced in the next week or so with an updated
> version, which will have the same URL but with "-02"
> substituted for "-01".
>
> [4] As you might have noticed in my exchange with Gervase,
> I've concluded that the use of terms like "name" or "word"
> is just introducing more confusion. Many, perhaps most, DNS
> "names" are not "words" in the sense of obeying the
> orthographic or phonetic rules of any language; perhaps we
> can reduce the confusion we are causing ourselves by shifting
> to "mnemonic", which more closely describes the actual situation.
>
> [5] RFC 4646 and
> http://www.ietf.org/internet-drafts/draft-ietf-ltru-4646bis-06.txt.
> For many purposes, these documents are incomplete without
> "matching rules", discussed in RFC 4647.
>
> [6] See the discussion at http://www.ietf.org/ID and the
> links to information about format and tools leading from that page.
>