my comments on draft-ietf-idnabis-bidi-05

"Martin J. Dürst" duerst at
Thu Sep 10 04:43:59 CEST 2009

Hello Cary, Harald, others,

On 2009/09/09 20:48, Cary Karp wrote:
> Quoting Harald quoting Martin:
>>> 4.1, para 1: "This marking is obligatory, and both double vowels and
>>> syllable-final consonants are indicated by the marking of special
>>> unvoiced characters." ->  "This marking is obligatory, and syllable-final
>>> consonants are indicated by a special unvoiced character."
>>> (double (long) vowels are indicated in Unicode by their own combining
>>> mark, which is of course voiced. These are graphically in most cases
>>> just duplications of the single (short) vowels. The current text
>>> suggests a special "duplicate the preceding vowel" sign similar to the
>>> one (sukun) for consonants, but such a suggestion is wrong.)
>> I'll leave this to Cary....
> The term "double vowel" (as "ao") does not mean "doubled vowel" (as
> "oo"). But if it makes things clearer we can replace "...and both double
> vowels and syllable-final consonants are indicated..." with "...and two
> consecutive vowels as well as syllable-final consonants are indicated...".


>>> 4.2: This section could be shortened considerably. "Greater latitude
>>> here than ... Dhivehi." is irrelevant; as long as a significant part of
>>> a language's words cannot be used in IDN, there's a problem. The
>>> subsection is interesting for people interested in Yiddish, but the
>>> average reader of the spec will try to find something relevant for the
>>> algorithm, and mostly be more confused than enlightened.
> The relevance of including any tutorial material was questioned earlier
> on, but if memory serves, we agreed to leave it as it was. If a reader
> feels unnecessarily diverted into a peripheral discussion, we can
> probably assume safely that they'll skim forward without allowing their
> attention to the core text to be derailed.


>>> 4.3: "(with the 5 being considered right-to-left because of the leading
>>> ALEF)": No, the 5 itself is never right-to-left. Change to "(the overall
>>> directionality being right-to-left because of the leading ALEF)"
>>> 4.3: "but barring them both seems to require justification" ->  "but
>>> barring them both seems unnecessary" or "but barring them both turned
>>> out to be unnecessary"
>> I'll let Cary handle this one too. He's the Hebrew expert.
> I'm not sure I understand the problem here, but "turned out to be" is
> not an appropriate term for the result of armchair analysis. I'd be
> equally happy retaining the current wording or replacing it with "but
> barring them both seems unnecessary".

Writing "but barring them both seems to require justification" creates 
the expectation that there is justification, and that they were barred. 
But that's not the case. If Cary is fine with "but barring them both 
seems unnecessary", then let's go with that.
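
On the other 4.3 point (the 5 itself never being right-to-left), a quick check against the Unicode character database may help. This is only a sketch: it assumes the ALEF in the example is HEBREW LETTER ALEF (U+05D0), and label_direction is a hypothetical helper that looks at the first character only, which is a simplification and not the rule text from the draft.

```python
import unicodedata

ALEF = "\u05d0"      # assumption: HEBREW LETTER ALEF
label = ALEF + "5"   # hypothetical label: ALEF followed by DIGIT FIVE


def bidi_classes(text):
    """Return the Unicode bidi class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]


def label_direction(text):
    """Simplified: overall direction is taken from the first character only."""
    first = unicodedata.bidirectional(text[0])
    return "RTL" if first in ("R", "AL") else "LTR"


print(bidi_classes(label))     # the digit keeps its own class (EN), it is not R
print(label_direction(label))  # the label as a whole is treated as RTL
```

The digit stays a European Number whatever precedes it; only the overall directionality of the string follows the leading ALEF.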

>>> 6. "which might surprise someone expecting to see labels displayed in
>>> hierarchical order.": Please add that this may not be such a big problem
>>> to general users familiar with BIDI, because they are used to
>>> seeing/reading a sequence of RTL units (e.g. words) from right to left.
>>> (for wording alternatives, see
>>>, first para, *second
>>> para*, ...)
>> Cary mentioned that registrations under .museum show that this is not so
>> clear-cut...
> People who work in a BIDI environment have explicitly requested RtL
> labels to be displayed in 3LD.2LD.TLD order even when the 3LD and 2LD
> are pure RtL. We've had to register tandem pairs of labels as 3LD.2LD
> and 2LD.3LD to meet that expectation.

I know, you have reported this previously. However, I think very, very 
few registries will be in a position to do that.

> It may be fair to assume that this
> expectation will change when RtL strings begin to appear as TLD labels,
> but we can't shrug the problem off at this point.

I don't want to shrug off the problem, quite to the contrary. The text 
that we have currently pointing out the problem should stay in. But I 
want to give people some head start to show them how they can try to 
understand domain names displayed as 2LD.3LD.tld (where 2LD.3LD is RTL, 
and I put 'tld' in lower case to show that it's LTR). The fact that this 
display order is the same as one would have with ordinary words of the 
same directionality was originally pointed out by Mati, who guessed 
that while a display of 3LD.2LD.tld would look odd to people already 
very familiar with (ASCII) domain names, a display of 2LD.3LD.tld might 
not be that odd for people new to the Internet.
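
To make the display order concrete, here is a rough sketch of how consecutive RTL labels end up reordered in an LTR context. It is not the full Unicode bidirectional algorithm: labels are treated as opaque units, the dots are assumed to behave as neutrals that join adjacent RTL labels, and is_rtl and display_label_order are hypothetical helper names made up for this illustration.

```python
import unicodedata


def is_rtl(label):
    """Simplification: a label counts as RTL if it has any R or AL character."""
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in label)


def display_label_order(domain):
    """Return the labels in left-to-right visual order for an LTR context.

    Each maximal run of consecutive RTL labels is reversed as a whole,
    the same way a run of RTL words is laid out from right to left.
    """
    out, run = [], []
    for label in domain.split("."):
        if is_rtl(label):
            run.append(label)
        else:
            out.extend(reversed(run))  # flush the pending RTL run, reversed
            run = []
            out.append(label)
    out.extend(reversed(run))          # flush a trailing RTL run, if any
    return out


# Logical (hierarchical) order: 3LD.2LD.tld, with made-up RTL labels.
third, second = "\u05d0\u05d0", "\u05d1\u05d1"
visual = display_label_order(third + "." + second + ".tld")
names = {third: "3LD", second: "2LD", "tld": "tld"}
print(".".join(names[label] for label in visual))
```

With the two made-up RTL labels, the visual order comes out with the 2LD first, then the 3LD, then the LTR tld, which is the 2LD.3LD.tld display described above. A name consisting only of LTR labels is left in its logical order.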

So to repeat, I don't want to shrug off the problem, I just want to give 
some people some hints towards a potential solution.

Regards,    Martin.

#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-#   mailto:duerst at