Combining mark vs combining character?

Vint Cerf vint at google.com
Wed Jan 5 15:17:37 CET 2011


Yes, having General Category M seems to encompass both "mark" and
"character", at least for IDNA2008 purposes.

v


On Wed, Jan 5, 2011 at 9:07 AM, Simon Josefsson <simon at josefsson.org> wrote:

> Vint Cerf <vint at google.com> writes:
>
> > Simon,
> >
> > I am pretty sure that the terms "combining mark" and "combining
> > character" as used in IDNA2008 mean the same thing.
> >
> > Neither is permitted as the initial character of a Unicode domain label.
>
> Thanks.  And the practical definition of a combining mark/character is
> that it has a General Category of M, as explained in section 3.6 of
> Unicode 5.0, quoted below?
>
> Note that this is different from having a non-zero Combining Class value.
>
> /Simon
>
> > vint
> >
> >
> >
> >
> > On Wed, Jan 5, 2011 at 5:06 AM, Simon Josefsson <simon at josefsson.org> wrote:
> >
> >> Hi,
> >>
> >> I need a clarification regarding this paragraph in section 4.2.3.2 of
> >> RFC 5891:
> >>
> >>   The Unicode string MUST NOT begin with a combining mark or combining
> >>   character (see The Unicode Standard, Section 2.11 [Unicode] for an
> >>   exact definition).
> >>
> >> And this in section 5.4:
> >>
> >>   Putative U-labels with any of the following characteristics MUST be
> >>   rejected prior to DNS lookup:
> >> ...
> >>   o  Labels whose first character is a combining mark (see The Unicode
> >>      Standard, Section 2.11 [Unicode]).
> >>
> >> The reference to [Unicode] is not normative, which would be a problem
> >> for any implementer.
> >>
> >> Section 2.11 of Unicode 5.0 discusses "combining character" but
> >> not "combining mark".
> >>
> >> There is a section 7.9 in Unicode 5.0 called "Combining Marks".
> >>
> >> One section that discusses both Combining Marks and Combining
> >> Characters is section 3.11 on "Canonical Ordering Behaviour".
> >>
> >> Section 3.6, "Combination", gives the precise definition of a
> >> "Combining character":
> >>
> >>   Combining character: A character with the General Category of
> >>   Combining Mark (M).
> >>
> >> Is this the definition of "combining character" intended by RFC 5891?
> >>
> >> Questions:
> >>
> >> 1) Does RFC 5891 refer to "combining mark" and "combining character" as
> >> the same thing?
> >>
> >> 2) Is there a significant difference between the requirement in 4.2.3.2
> >> and 5.4?  The latter section only mentions "combining mark" and not
> >> "combining character".
> >>
> >> 3) What is the precise definition of a "combining mark"?
> >>
> >> /Simon
> >> _______________________________________________
> >> Idna-update mailing list
> >> Idna-update at alvestrand.no
> >> http://www.alvestrand.no/mailman/listinfo/idna-update
> >>
>
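
The working definition agreed on in this thread (a combining mark or
combining character is a character whose General Category is M) can be
sketched in a few lines. This is only an illustration, not an IDNA2008
implementation: the function names are hypothetical, and Python's
unicodedata module uses whatever Unicode version ships with the
interpreter, not necessarily Unicode 5.0. The loop at the end shows the
point about Combining Class: U+0903 has General Category Mc but a
Combining Class of 0.

```python
import unicodedata


def is_combining_mark(ch: str) -> bool:
    """True if ch has General Category M (Mn, Mc, or Me).

    Hypothetical helper: it encodes the definition discussed in the
    thread, not text taken from RFC 5891 itself.
    """
    return unicodedata.category(ch).startswith("M")


def label_starts_with_combining_mark(label: str) -> bool:
    """True if the first character of a non-empty label is a combining mark."""
    return bool(label) and is_combining_mark(label[0])


# General Category M and a non-zero Combining Class are different tests:
#   U+0301 COMBINING ACUTE ACCENT   -> category Mn, combining class 230
#   U+0903 DEVANAGARI SIGN VISARGA  -> category Mc, combining class 0
for ch in ("\u0301", "\u0903", "a"):
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.combining(ch))
```

A check based on unicodedata.combining(ch) != 0 would therefore let a
label beginning with U+0903 through, while the General Category test
rejects it.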