U-labels, NFC, and symmetry
J-F C. Morfin
jfc at morfin.org
Fri Apr 15 21:23:30 CEST 2011
Peter,
Yours is an excellent example of the IUI role and of the ML-DNS.
These two notions are not yet documented in RFCs (due to personal
delays) but are the natural continuation, on the user side, of IDNA2008
on the Internet side.
As you may know, from the very beginning of the WG/IDNABis I raised
the question of its intended final purpose:
- was it to satisfy the Internet's needs,
- or the needs of innovation and of (lead) users? Lead users like you and me.
The response was clear enough: it was to address the needs of the
Internet layers only. Given that clear target, I promised to document
and build the response for the users, as a user application front-end
called the ML-DNS, where ML stands for "multi-layer". In a multilingual
environment it was to support every language at the same level as
English. The solution was to support a "domain name pile" where the
Internet domain name corresponds to the two bottom layers (A-label
and U-label), but where many layers can be added on top to support
things like:
- metadata that Unicode does not support (such as what is needed for
French majuscules),
- encrypted domain names,
- various normalization forms (such as the one you are
investigating), command lines, etc. each being supported by its own
layer's library or service. You could conceptually understand it as
"punyplus code", which will convert back and forth different layers
(such as yours) into A-labels.
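
As a rough illustration of converting a normalization layer to and from the A-label layer, the Python sketch below applies the raw Punycode codec from the standard library to the NFC and NFD forms of the same visible label. The helper names (to_alabel, from_alabel, is_ulabel_candidate) are hypothetical, and the sketch performs none of the IDNA2008 validity checks. It only shows that each form round-trips to itself, so an NFD string yields a different A-label and would need its own layer above the U-label.

```python
import unicodedata

ACE_PREFIX = "xn--"

def to_alabel(label: str) -> str:
    """Raw Punycode conversion plus ACE prefix (no IDNA validation)."""
    return ACE_PREFIX + label.encode("punycode").decode("ascii")

def from_alabel(alabel: str) -> str:
    """Inverse of to_alabel: strip the prefix and decode Punycode."""
    return alabel[len(ACE_PREFIX):].encode("ascii").decode("punycode")

def is_ulabel_candidate(label: str) -> bool:
    """NFC check plus round trip only; real U-labels obey more constraints."""
    is_nfc = unicodedata.normalize("NFC", label) == label
    return is_nfc and from_alabel(to_alabel(label)) == label

nfc = unicodedata.normalize("NFC", "caf\u00e9")  # precomposed e-acute
nfd = unicodedata.normalize("NFD", "caf\u00e9")  # "e" + combining acute

print(to_alabel(nfc))  # xn--caf-dma
print(to_alabel(nfd))  # a different ASCII string
print(is_ulabel_candidate(nfc), is_ulabel_candidate(nfd))  # True False
```

Both strings survive the Punycode round trip, but only the NFC one passes the normalization check, which is why the NFD form cannot be called a U-label as RFC 5890 defines the term.
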
NB. In my evaluation, written network exchanges need a dedicated,
extended, network-oriented normalization form based on graphical
premises plus (optionally) the language being used.
1. That way A-labels can be considered as pivotal among the different
UDN (User Domain Name) formats, including the Internet Class Domain
Names (IDN) made of the A-label/U-label duality. Just remember that a
class is a consistent particular vision of network protocols. If you can
use the basic Internet through converters like punycode, it will be
simpler for you. However, what the Internet can deliver, as
demonstrated by IDNA2008, is quite powerful and gradual. Before having
to consider a different class, you can consider your own
presentation. There was no presentation layer in Internet
technology; yet, to support the language diversity that one needs, a
presentation layer was introduced through the "xn--" prefix, which
represents the "xn" presentation.
So, it is up to you to decide what you need. This is just
acknowledging that the Internet is able to support the principle of
subsidiarity, which is necessary to support the users' diversity,
such as yours.
However, please remember RFC 3439: very large systems (and the
Internet is a very large system, multiplied into a very, very large
system by subsidiarity) must respect the principle of simplicity.
Keep it as simple as you can.
2. Now there is another problem. The Internet supports subsidiarity
without changing a single one of its bits: it is just a matter of
reading the RFCs from a different perspective. However, this is not the
case for IDNinA, the very initial IDN support concept, which does not
scale. If you have two applications on your machine that use two
different IDNA libraries, nothing guarantees that they will behave the
same way and resolve the same U-label to the same IP address.
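
A small Python sketch of this divergence follows. It uses only standard-library codecs as stand-ins for two IDNA libraries, which is an assumption made for illustration: the built-in "idna" codec follows the older IDNA2003 rules, while the hypothetical lib_b helper applies raw Punycode without any mapping, roughly as an IDNA2008-style conversion would for a label that is already valid.

```python
ACE_PREFIX = "xn--"

def lib_a(label: str) -> str:
    """Stand-in for library A: Python's built-in IDNA2003 codec."""
    return label.encode("idna").decode("ascii")

def lib_b(label: str) -> str:
    """Stand-in for library B: raw Punycode plus ACE prefix, no mapping."""
    return ACE_PREFIX + label.encode("punycode").decode("ascii")

label = "stra\u00dfe"  # German "strasse" spelled with sharp s

print(lib_a(label))  # strasse
print(lib_b(label))  # xn--strae-oqa
print(lib_a(label) == lib_b(label))  # False
```

The same user-visible label ends up as two different names in the DNS, so the two applications may well reach different hosts.
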
The consequence is that you need to have the ML-DNS in a single IDN
(A-label/U-label) management position, i.e. in a distinct layer that
all the applications will call. Actually, when you consider that new
middle layer, you quickly see that it can be the place for other
services to the user and that it is a new area of network architecture:
an Internet Use Interface (IUI). This IUI is actually made of the
Internet's extended intelligence, which we do not want in a robust
end-to-end network and which we have always placed at the fringe. The
IUI is at the Internet fringe. Its addition leads to a smart, extended
fringe-to-fringe use of the end-to-end stratum, which is itself a
smart use of the plug-to-plug service offered by the Telecom stratum.
Then you can quickly see that this IUI can be an extension of the
Internet, and also a location for intelligent user front-ends, as
the ML-DNS is, with front-ends to the Internet and to other technologies
that may share their ML-DNS, for example. It then becomes the
Intelligent Use Interface serving an entirely new world (different
technologies) and stratum (brain to brain semantics, what I call the
Intersem). When I raised the issue with the IESG and IAB (you can see
my appeals for clarification on their sites), it became apparent that
this is implicitly the emergence of an entirely new networking area.
Therefore, one needs a special SDO level to document and experiment
with it. I called it the "IUse" emergent community: users needing to use
the Internet as a basic common central commodity and to extend it to
suit their individual needs in their own areas, at a minimum
architectural and developmental cost based on incremental simplicity.
In turn, I proposed an IUTF liaising with the IETF through the
iucg at ietf.org (internet/IETF users contributing group) mailing list,
supported by http://iucg.org/wiki (I am currently rather behind
schedule for various personal and health reasons).
Best,
jfc
At 23:59 07/04/2011, Peter Saint-Andre wrote:
>RFC 5890 states:
>
> o A "U-label" is an IDNA-valid string of Unicode characters, in
> Normalization Form C (NFC) and including at least one non-ASCII
> character, expressed in a standard Unicode Encoding Form (such as
> UTF-8). It is also subject to the constraints about permitted
> characters that are specified in Section 4.2 of the Protocol
> document and the rules in the Sections 2 and 3 of the Tables
> document, the Bidi constraints in that document if it contains any
> character from scripts that are written right to left, and the
> symmetry constraint described immediately below. Conversions
> between U-labels and A-labels are performed according to the
> "Punycode" specification [RFC3492], adding or removing the ACE
> prefix as needed.
>
> To be valid, U-labels and A-labels must obey an important symmetry
> constraint. While that constraint may be tested in any of several
> ways, an A-label A1 must be capable of being produced by conversion
> from a U-label U1, and that U-label U1 must be capable of being
> produced by conversion from A-label A1. Among other things, this
> implies that both U-labels and A-labels must be strings in Unicode
> NFC [Unicode-UAX15] normalized form. These strings MUST contain only
> characters specified elsewhere in this document series, and only in
> the contexts indicated as appropriate.
>
>I'm updating the i18n handling in XMPP, and the XMPP community would
>like to use NFD on the wire for various reasons. Ideally we would like
>to do so without requiring a trip through NFC. However, it appears that
>we can do this only by using a term other than U-label, since that is
>tied to NFC. Indeed, it seems that a string in Unicode NFD normalized
>form is not an IDN label at all. This strikes me as unfortunate (I
>thought that normalization was handled only in RFC 5895 along with other
>such mapping issues), but probably because I do not understand how the
>symmetry requirement expressed in RFC 5890 necessitates the use of NFC.
>Would any of the i18n experts on this list care to enlighten me on the
>latter point?
>
>In the meantime, I shall pursue a way to specify XMPP domainparts
>independently of the term U-label.
>
>Peter
>
>--
>Peter Saint-Andre
>https://stpeter.im/
>
>_______________________________________________
>Idna-update mailing list
>Idna-update at alvestrand.no
>http://www.alvestrand.no/mailman/listinfo/idna-update