Draft on IDN Tables in XML
james.mitchell at ausregistry.com.au
Wed Mar 7 02:47:36 CET 2012
I think this work should focus on identifying only:
1) The set of code points that can be used for registration
2) The set of code points (or sequences of code points) that are considered equivalent by the registry
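As a rough illustration of how little a consumer needs from these two items, here is a minimal Python sketch. The data is hypothetical (a tiny allowed set, and U+0627 and U+0625 treated as equivalent), the names ALLOWED, EQUIVALENT, is_allowed, canonical and same_name are mine, and for brevity it only handles single code points rather than sequences.

```python
# Hypothetical registry data, for illustration only.
# 1) Code points that can be used for registration.
ALLOWED = {0x0627, 0x0625, 0x0628}

# 2) Code points the registry considers equivalent.
# Each frozenset is one equivalence class, so membership is symmetric.
EQUIVALENT = [frozenset({0x0627, 0x0625})]


def is_allowed(label: str) -> bool:
    """True if every code point in the label is in the allowed set."""
    return all(ord(ch) in ALLOWED for ch in label)


def canonical(label: str) -> tuple:
    """Map each code point to the smallest member of its equivalence class."""
    out = []
    for ch in label:
        cp = ord(ch)
        cls = next((c for c in EQUIVALENT if cp in c), None)
        out.append(min(cls) if cls else cp)
    return tuple(out)


def same_name(a: str, b: str) -> bool:
    """Two labels are the same to the registry if their canonical forms match."""
    return canonical(a) == canonical(b)
```

With this data, same_name("\u0627\u0627", "\u0627\u0625") is true, and nothing in the table has to say whether either label may be activated.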
The table should not attempt to place rules on the use of code points within a label, as these rules are often non-trivial. One can easily tell whether a name is registered by performing a DNS lookup or a WHOIS query for the name. Alternatively, a registrar will be able to notify a potential registrant should a name be considered "invalid".
Further to the above, the table should not attempt to define which variants are activated/allowed/blocked. An active variant can be determined from a query to the DNS or WHOIS, and these protocols will have to be used anyway, considering a variant may have been activated post-registration. Additionally, the rules for determining whether a variant can be activated are non-trivial. Consider the example below.
And a registered name of "0627 0627". It is unclear from the definition above whether the label "0627 0625" is valid, because the definition does not say whether the substitution must be applied across the whole label or whether it can be applied to a single character. This is only a trivial example; I can provide many more complex rules.
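To make the ambiguity concrete, here is a small Python sketch of the two readings. It assumes a table that says U+0627 may be substituted by U+0625; VARIANTS and both function names are hypothetical, and labels are written as tuples of code points.

```python
from itertools import product

# Assumption: the table says U+0627 may be substituted by U+0625.
VARIANTS = {0x0627: [0x0625]}


def whole_label_variants(label):
    """Reading A: the substitution is applied to every occurrence at once."""
    # Uses only the first listed variant, which is enough for this example.
    swapped = tuple(VARIANTS.get(cp, [cp])[0] for cp in label)
    return {swapped} - {tuple(label)}


def per_character_variants(label):
    """Reading B: each position may be substituted independently."""
    choices = [[cp] + VARIANTS.get(cp, []) for cp in label]
    return set(product(*choices)) - {tuple(label)}


registered = (0x0627, 0x0627)
candidate = (0x0627, 0x0625)

print(candidate in whole_label_variants(registered))    # False
print(candidate in per_character_variants(registered))  # True
```

The same table and the same registered name give opposite answers for "0627 0625", depending on which reading an implementation picks.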
To avoid the somewhat common mistake of defining equivalence incorrectly, I suggest that equivalent sequences of code points be defined in one place. For example
<!-- whoops, forgot to identify 0627 as an equivalent character -->
should be expressed as
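Independently of the XML syntax, the difference between the two styles can be sketched in Python. This is only an illustration with hypothetical data and names (PER_CHAR, CLASSES, asymmetric_pairs, classes_to_per_char); it is not the draft's format.

```python
# Style 1: each code point lists its own variants (hypothetical data).
# The entry for 0x0625 forgets to point back at 0x0627.
PER_CHAR = {
    0x0627: {0x0625},
    0x0625: set(),
}

# Style 2: each equivalence class is stated exactly once.
CLASSES = [{0x0627, 0x0625}]


def asymmetric_pairs(per_char):
    """Return (a, b) pairs where a lists b as a variant but b does not list a."""
    return sorted(
        (a, b)
        for a, variants in per_char.items()
        for b in variants
        if a not in per_char.get(b, set())
    )


def classes_to_per_char(classes):
    """Expand classes into a per-code-point mapping; symmetric by construction."""
    return {cp: set(cls) - {cp} for cls in classes for cp in cls}
```

Here asymmetric_pairs(PER_CHAR) reports the forgotten back-reference, while a mapping produced by classes_to_per_char(CLASSES) cannot contain one.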
> -----Original Message-----
> From: idna-update-bounces at alvestrand.no [mailto:idna-update-
> bounces at alvestrand.no] On Behalf Of Kim Davies
> Sent: Friday, 2 March 2012 6:15 AM
> To: vip at icann.org; idna-update at alvestrand.no
> Subject: Draft on IDN Tables in XML
> I have posted a first draft regarding a format that could be used for
> representing IDN Tables in XML to the I-D Repository:
> After discussion with a number of folks that felt this would be good
> work to undertake, I've put together a first cut which is not
> comprehensive, but I think goes some way toward a potential format.
> Unless there is interest in this being a more formal activity, my
> assumption is to aim to publish the final result independently as an
> Informational RFC. However, the mechanism of publication is secondary
> to coming up with something useful that would benefit TLD registries
> and other implementors. A list of design goals, from the document, is
> as follows:
> * MUST be in a format that can be implemented in a reasonably
> straightforward manner in software;
> * The format SHOULD be able to be checked for formatting errors,
> such that common mistakes can be caught;
> * An IDN Table MUST be able to express the set of valid code
> points that are allowed for registration under a specific zone
> administrator's policies;
> * MUST be able to express computed alternatives to a given domain
> name based on a one-to-one, or one-to-many relationship. These computed
> alternatives are commonly known as "IDN variants";
> * IDN Variants SHOULD be able to be tagged with specific
> categories, such that the categories can be used to support registry
> policy (such as whether to list the computed variant in the zone, or to
> merely block it from registration);
> * IDN Variants MUST be able to be stipulated based on contextual
> information. For example, specific variants may only be applicable when
> they follow another specific code point, or when the code point is
> displayed in a specific presentation form;
> * The data contained within the table MUST be unambiguous, such
> that independent implementations that utilise the contents will arrive
> at the same results;
> * IDN Tables SHOULD be suitable for comparison and re-use, such
> that one could easily compare the contents of two or more to see the
> differences, to merge them, and so on.
> * As many existing IDN Tables as practicable SHOULD be able to
> be migrated to the new format with all applicable logic retained.
> It is explicitly NOT the goal of this format to:
> * Stipulate what code points should be listed in an IDN Table by
> a zone administrator. What registration policies are used for a
> particular zone is outside the scope of this memo.
> * Stipulate what a consumer of an IDN Table must do when they
> determine a particular domain is valid or invalid; or arrive at a set
> of computed IDN variants. IDN Tables are only used to describe rules
> for computing code points, but do not prescribe how registries and
> other parties utilise them.
> I'd appreciate any feedback.