Message-Id: <6.1.2.0.2.20050216110348.04110630@mail.jefsey.com>
Date: Wed, 16 Feb 2005 12:04:30 +0100
To: idn@ops.ietf.org
From: "JFC (Jefsey) Morfin"
Subject: Re: [idn] homograph attacks
In-Reply-To: <4212EF44.4040208@v.loewis.de>
References: <4212EF44.4040208@v.loewis.de>

I will try to review some of the points with intergovernance consistency in mind (that is, we want the user to get the same service whatever the Registry).

On 07:12 16/02/2005, Michel Suignard said:
>No languages used in the former soviet union should require a mix of latin
>and cyrillic in a single dns label.

Recent Ukrainian comments on ietf-languages@alvestrand.no (the reference list for language tags) said they are back to Roman characters, so I suppose there are mixed trademarks (TMs) which need a mix of scripts. This shows that WIPO should be part of the debate.

>To answer another message in this thread, there is no definitive answer
>about which Unicode characters are allowed for a given languages.

For a domain name there is one: the IANA Table registered by the ccTLD Manager. For the domain name part of an IRI this should be the same. Beware, these are just ICANN recommendations to sovereign entities. What we should agree upon is that when there is a mix, the user could be warned one way or another.

>But in all languages that have a reasonable concept of 'words', you should
>never need to allow mixed script in a word, at least in the context of IDN
>label. There are exceptions to these rules, like in South and East Asia
>(Japanese comes to mind), but these languages can be detected reasonably
>using the Unicode script property.

As discussed above, this is debatable. But the mere fact that you accept an exception creates the need to support what happens in such cases.

At 07:50 16/02/2005, Martin v. Löwis wrote:
>I don't speak for Verisign, but I can answer some of the questions:

You comment on them. My questions to Pat Kane remain.

>>1. where do you maintain an ASCII list of your language tags?
>
>It is *maintained* probably on some development infrastructure inside
>Verisign, it is published atleast at
>
>http://www.verisign.com/static/002533.pdf

A PDF is hardly an ASCII list.

>>Should it not be supported on the IANA server and common to all the gTLDs?
>
>I personally don't see why it should be on a IANA server.
>As for "should it be common to all the gTLDs", I honestly believe
>the answer should be "no".

These opinions of yours are contradicted by the gTLD contracts with ICANN. There is an IANA registration for IDN Tables, even if it is very poor. The proposed "RFC 3066bis" Draft led to a debate in which the authors, who are discussing the language tags to be registered by IANA, described as outrageous the idea that they should consider IDNs and be consistent with them. Do you support that?

>>3. did you decide them by yourself, or did you gather a group of lingual
>>authorities to assist you. This would be very interesting.
>
>While working to address the issue of character variants, VeriSign has
>consulted and will continue to work with all interested stakeholders.

We co-created Eurolinc with Louis Pouzin and some others. We are members of the ccTLD WG-IDN. We have never been consulted by VRSN.

>>4. would there not be a way to register IDN in using their "xn--"
>>version? It would simplify international management by resellers?
>
>Why do you think the SRS currently does not use the xn-- version in
>registrations?

??? I am just saying that I want to be able to register a name by entering its ASCII xn-- string.

On 10:44 16/02/2005, Thomas Keller said:
>By design the IDNA processing happens inside the application and therefore in
>my thinking the applications are the right place for any security meassures
>as well. Talking about about security measures we have to think about what
>exactly we want to prevent from happening.

Incorrect. "Application" in the sense of a browser etc. is the wrong idea and will probably never fly. "Application" can realistically be understood as an application on the path.
This is the OPES concept. It has been documented for HTTP and is (slowly) being worked on for SMTP. However, IRI consistency/analysis could be of interest, should the IRI specs permit it.

>
> There are other languages that are listed within ISO 639-2 that today
> use a combination of Latin and Cyrillic as they were originally Latin
> based (Tajik was Arabic prior to being Latin based), migrated to Cyrillic
> during the Soviet era and today are migrating back to Latin. It is
> common to use Latin and Cyrillic characters in Tajik, from what I
> understand not being a native speaker. Granted there are not a lot of
> registrations in com net that are Tajik, but this is just the point of an IDN.

This is the basic problem of hybrid, two-keyboard IDN.ASCII domain names. The character set should be indicated by the TLD's own character set and registered table. The resulting security violations come from ICANN. The mixed Tajik table is legitimate for Tajik people, not for others. Let us understand that naming is going to be like spam: it needs technical and legal cooperation to protect users.

On 07:59 16/02/2005, Martin v. Löwis said:
>Michel Suignard wrote:
>>To answer another message in this thread, there is no definitive
>>answer about which Unicode characters are allowed for a given
>>languages.
>
>There is no definitive answer to that question.

There is an attempt at a confused answer, which we have to fight: Draft 3066bis, defining language tags.

>However, I sure
>hope there is a definitive answer to the question "which Unicode
>characters are allowed for a given language" *when used as
>a label in the .COM zone*. For another example, the definitive list
>of (additional) characters for a German label in the .DE zone is

There is an IANA section to gather such Tables. All that this shows is that there is a need for users to discuss technical issues and propose BCPs. This is not an uncommon problem, so it might be worth proposing a WG-USERS to the IADs?
The WG-IDN charter proposed by James Seng initially included a market study. All this would have been part of the IESG-approved/IAB-reviewed charter of the WG-IDN. But the market would most probably have opposed the IDN.ASCII strategy.

jfc
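
Two points raised in this message, warning the user when a label mixes scripts and handling a name through its ASCII "xn--" string, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, the first word of each Unicode character name stands in for the real Unicode script property (which the Python standard library does not expose), and the built-in "idna" codec follows the IDNA 2003 rules of RFC 3490. It is not how any registry or browser performs this check.

```python
import unicodedata


def scripts_in_label(label):
    """Return the set of script names seen in a label (rough heuristic)."""
    scripts = set()
    for ch in label:
        if ch.isdigit() or ch == "-":
            continue  # digits and hyphen are shared by all scripts
        # Assumption: the first word of the character name ("LATIN",
        # "CYRILLIC", ...) approximates the Unicode script property.
        name = unicodedata.name(ch, "UNKNOWN")
        scripts.add(name.split()[0])
    return scripts


def is_mixed_script(label):
    """True when the label draws its letters from more than one script."""
    return len(scripts_in_label(label)) > 1


def to_ascii(label):
    """Return the ASCII ("xn--") form of a label via the built-in idna codec."""
    return label.encode("idna").decode("ascii")


if __name__ == "__main__":
    # "p\u0430ypal" uses CYRILLIC SMALL LETTER A in place of the Latin "a".
    for label in ("paypal", "p\u0430ypal"):
        warning = "WARNING: mixed scripts" if is_mixed_script(label) else "ok"
        print(to_ascii(label), sorted(scripts_in_label(label)), warning)
```

A check this crude would also flag legitimate combinations such as Japanese kanji with kana, or a Tajik label mixing Latin and Cyrillic. Deciding which mixes are acceptable would have to come from the table registered for the TLD, as discussed above.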