Moving Right Along on the Inclusions Table...

Kenneth Whistler kenw at sybase.com
Thu Dec 21 20:45:24 CET 2006


Cary noted:

> Before anyone else points it out, the maqaf is not justifiable on the
> latter ground, either. It is functionally and graphically close enough
> to the hyphen that I wouldn't have argued separate need for it. On top
> of that, if a maqaf appears in a label together with an incorrectly
> rendered rafe, those two marks can easily be confused. 

O.k., based on that input, and to be consistent, I think MAQAF
should be pulled from SPInclusionAdd.txt.

> The essence of the argument about internationalizing the H in LDH is

In my opinion, "internationalizing" any syntactic element of
a formally-based syntax is simply a bad idea, period.

> whether the legacy hyphen should be taken to justify the inclusion of a
> functionally similar mark in other scripts to which the hyphen is alien,
> or whether the hyphen is simply something that may be used or not
> depending on the extent to which it can be shoehorned into an
> orthographic context in which it otherwise does not appear.

The latter. I see no way to justify adding any number of other
script hyphen analogs (and recall, this isn't limited to dashes,
but is also going to end up in arguments about middle dots and such)
merely because we have to grandfather in "-" from ASCII. If
people end up using "-" in inappropriate contexts, they will do
so, and we can't stop them, but I don't see the justification
for adding to the confusion on the basis of some supposed
"fairness to other scripts" issue.

> 
> Before leaving it to Ken to decide whether to remove the maqaf from the
> inclusion table (with thanks for his having given it the benefit of the
> doubt), it may be worth noting differences between its directional
> properties and those of the hyphen. 

The only reason I haven't suggested pulling MAQAF immediately
is the difference in bidirectional class (bc):

002D;HYPHEN-MINUS;Pd;0;ES;;;;;N;;;;;

i.e. bc=ES

05BE;HEBREW PUNCTUATION MAQAF;Po;0;R;;;;;N;;;;;

i.e. bc=R
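
For anyone who wants to check the two values themselves, here is a
minimal sketch in Python. It takes the two UnicodeData.txt records
quoted above, reads the fifth semicolon-separated field (Bidi_Class),
and compares it with what the interpreter's bundled Unicode data reports.
The helper name bidi_class_from_record is made up for this illustration.
Other fields of these records may differ between Unicode versions, so
only the bc field is compared.

```python
import unicodedata

# The two UnicodeData.txt records quoted above.
RECORDS = [
    "002D;HYPHEN-MINUS;Pd;0;ES;;;;;N;;;;;",
    "05BE;HEBREW PUNCTUATION MAQAF;Po;0;R;;;;;N;;;;;",
]


def bidi_class_from_record(record):
    """Return (character, Bidi_Class) for one UnicodeData.txt line.

    Field 0 is the code point in hex, field 4 is the Bidi_Class.
    """
    fields = record.split(";")
    return chr(int(fields[0], 16)), fields[4]


for record in RECORDS:
    ch, bc = bidi_class_from_record(record)
    # Cross-check against the Unicode data bundled with the interpreter.
    bundled = unicodedata.bidirectional(ch)
    print(f"U+{ord(ch):04X} record bc={bc} bundled bc={bundled}")
```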

If you only allow Hebrew script (and combining marks) in a label,
then using a hyphen-minus or a maqaf shouldn't result in
radically different layouts -- the only difference
would be whatever font difference there is between the design
of the maqaf and that of the hyphen-minus.

If you allow script mixing with Hebrew in a label, and
put a hyphen-minus or a maqaf between directional runs, you
could conceivably end up with layout differences that display
the hyphen-minus or the maqaf at different ends of a run.
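
To make the two cases concrete, here is a rough sketch in Python. It is
not an implementation of the Unicode Bidirectional Algorithm, only a
cut-down approximation under stated assumptions: a left-to-right
paragraph, labels made up of strong L and strong R characters plus
separators, and no digits, combining marks, or explicit directional
controls. Anything that is not strong is treated as a neutral that takes
the right-to-left level only when it sits between two right-to-left
characters. The function name visual_order is made up for this
illustration.

```python
import unicodedata


def visual_order(label):
    """Approximate display order of a label in a left-to-right paragraph.

    Simplifying assumptions: no digits, no combining marks, no explicit
    directional controls, and at most one level of right-to-left text.
    """
    n = len(label)
    levels = [None] * n
    for i, ch in enumerate(label):
        bc = unicodedata.bidirectional(ch)
        if bc == "L":
            levels[i] = 0
        elif bc == "R":
            levels[i] = 1

    # Resolve runs of non-strong characters from their strong neighbours.
    i = 0
    while i < n:
        if levels[i] is not None:
            i += 1
            continue
        j = i
        while j < n and levels[j] is None:
            j += 1
        before = levels[i - 1] if i > 0 else 0  # start of text counts as LTR
        after = levels[j] if j < n else 0       # end of text counts as LTR
        fill = 1 if before == 1 and after == 1 else 0
        for k in range(i, j):
            levels[k] = fill
        i = j

    # Reverse each right-to-left run for display.
    out = []
    i = 0
    while i < n:
        j = i
        while j < n and levels[j] == levels[i]:
            j += 1
        run = label[i:j]
        out.append(run[::-1] if levels[i] == 1 else run)
        i = j
    return "".join(out)


HEBREW = "\u05d0\u05d1\u05d2"  # alef, bet, gimel
for sep in ("-", "\u05be"):
    # Print escaped so the terminal's own reordering does not interfere.
    print(ascii(visual_order("abc" + sep + HEBREW)))
```

Under these assumptions, the hyphen-minus in the mixed label stays next
to the Latin run, while the maqaf joins the Hebrew run and is displayed
at its far end. In an all-Hebrew label such as
"\u05d0\u05d1" + sep + "\u05d2\u05d3", both separators end up in the
same position.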

If anything, however, I think this would militate *against*
including maqaf, because it could end up being one of those
subtle differences that someone might take advantage of to
bad ends, whereas it is hard to conceive of valid circumstances
where the placement of a hyphen at one end or other of
a visible run in a mixed-script bidirectional label is
an important thing to be trying to preserve in internet
identifiers.

> As John pointed out in a discussion
> triggered by a political tiff at the IGF, it might be reasonable to
> permit the use of alternate forms of hyphens if a condition can be
> imposed that does not permit the mixed use of the script-specific form
> with the common one.

But I don't think that is a reasonable condition to try to
impose in the IDNA protocol itself. Therefore, I would
argue that it is safer (as well as more consistent) to
simply omit MAQAF from the inclusions list in the first place.

--Ken



More information about the Idna-update mailing list